Publications
The Archives Unleashed project has a body of published research.
Contents
- Archives Unleashed Project Materials
- Journal Articles
- Book
- Peer-Reviewed Conference Publications
- Peer-Reviewed Posters and Demonstrations
- Conference Presentations
- Invited Talks and Lectures
- Datasets
Archives Unleashed Project Materials
- Archives Unleashed, with content provided by Cohort researchers. “Archives Unleashed Cohort Program: Research with Web Archives and Digital Collections” [Cohort Project Summaries Report]. 2023. https://bit.ly/AUCohortProjects | PDF
- Archives Unleashed. 2023. “Creating JUXTA Collages with Web Archive Images”. https://github.com/archivesunleashed/Juxta-Collage
- Archives Unleashed. 2021. “Analyzing Web Archives with the Archives Unleashed Project.” November 2021. Video Tutorial | Slides (PDF)
- Archives Unleashed. 2020. “Archives Unleashed Community Report (2017–2020).” PDF
- Archives Unleashed. 2020. “Reflecting on Collaborations and Community at Archives Unleashed Datathons.” Post | PDF
- Archives Unleashed. “AU NY Datathon (Online).” Online video clip. YouTube, 26-27 March 2020. https://youtu.be/Io6RvhqHfe4
- Ian Milligan, Nick Ruest, Jimmy Lin, and Samantha Fritz. “Archives Unleashed New York Datathon: Introduction.” Archives Unleashed NY Datathon, March 2020, Online. Datathon Slides
Journal Articles
- Nick Ruest, Samantha Fritz, and Ian Milligan. “Creating order from the mess: web archive derivative datasets and notebooks.” Archives and Records, 2022. https://doi.org/10.1080/23257962.2022.2100336
- Samantha Fritz, Ian Milligan, Nick Ruest, and Jimmy Lin. “Fostering Community Engagement through Datathon Events: The Archives Unleashed Experience.” Digital Humanities Quarterly, Vol. 15, No. 1, 2021. http://digitalhumanities.org/dhq/vol/15/1/000536/000536.html
- Nick Ruest, Samantha Fritz, Ryan Deschamps, Jimmy Lin, and Ian Milligan. “From archive to analysis: accessing web archives at scale through a cloud-based interface.” International Journal of Digital Humanities, 2021. https://doi.org/10.1007/s42803-020-00029-6
- Samantha Fritz, Ian Milligan, Nick Ruest, and Jimmy Lin. “Building community at a distance: a datathon during COVID-19.” Digital Library Perspectives, 2020. https://doi.org/10.1108/DLP-04-2020-0024
- Jimmy Lin, Ian Milligan, Jeremy Wiebe, and Alice Zhou. “Warcbase : Scalable Analytics Infrastructure for Exploring Web Archives.” ACM Journal of Computing and Cultural Heritage, Vol. 10, Issue 4, July 2017. https://doi.org/10.1145/3097570
- Ian Milligan, Nick Ruest, and Anna St.Onge. “The Great WARC Adventure : Using SIPS, AIPS and DIPS to Document SLAAPs.” Digital Studies/Le champ numérique, Vol. 6, 2016. https://www.digitalstudies.org/articles/10.16995/dscn.18/
- Ian Milligan. “Lost in the Infinite Archive : The Promise and Pitfalls of Web Archives.” International Journal of Humanities and Arts Computing, Vol. 10, No. 1-2 (2016): 87–94. https://www.euppublishing.com/doi/full/10.3366/ijhac.2016.0161 | preprint
Book
- Ian Milligan, History in the Age of Abundance? How the Web is Transforming Historical Research. Montreal & Kingston: McGill-Queen’s University Press, 2019. amazon.ca | amazon.com | google books | publisher
Peer-Reviewed Conference Publications
- Helge Holzmann, Nick Ruest, Jefferson Bailey, Alex Dempsey, Samantha Fritz, Peggy Lee, and Ian Milligan. “ABCDEF: the 6 key features behind scalable, multi-tenant web archive processing with ARCH: archive, big data, concurrent, distributed, efficient, flexible.” In Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries (JCDL ’22). Association for Computing Machinery, New York, NY, USA, 1–11 (2022). link | preprint
- Nick Ruest, Jimmy Lin, Ian Milligan, and Samantha Fritz. “The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives.” In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20). Association for Computing Machinery, New York, NY, USA, 157–166 (2020). https://doi.org/10.1145/3383583.3398513 | preprint | presentation
- Jimmy Lin, Ian Milligan, Douglas Oard, Nick Ruest, and Katie Shilton. “We Could, but Should We? Ethical Considerations for Providing Access to GeoCities and Other Historical Digital Collections.” Proceedings of the Fifth ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR 2020). preprint
- Ryan Deschamps, Samantha Fritz, Jimmy Lin, Ian Milligan, and Nick Ruest. “The Cost of a WARC : Analyzing Web Archives in the Cloud.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 19 (2019). https://ieeexplore.ieee.org/document/8791179 | preprint
- Ian Milligan, Nathalie Casemajor, Samantha Fritz, Jimmy Lin, Nick Ruest, Matthew S. Weber, and Nicholas Worby. “Building Community and Tools for Analyzing Web Archives through Datathons.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 19 (2019). https://ieeexplore.ieee.org/document/8791131 | preprint
- Ian Milligan, Nick Ruest, and Jimmy Lin. “Content Selection and Curation for Web Archiving : The Gatekeepers vs. the Masses.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 16 (2016): 107–110. https://doi.org/10.1145/2910896.2910913 | preprint
- Andrew Jackson, Jimmy Lin, Ian Milligan, and Nick Ruest. “Desiderata for Exploratory Search Interfaces to Web Archives in Support of Scholarly Activities.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 16 (2016): 103–106. https://doi.org/10.1145/2910896.2910912 | preprint
- Jimmy Lin. “Scaling Down Distributed Infrastructure on Wimpy Machines for Personal Web Archiving.” Proceedings of the 24th International World Wide Web Conference Companion (WWW 2015), pages 1351-1355, May 2015, Florence, Italy. https://doi.org/10.1145/2740908.2741695
Peer-Reviewed Posters and Demonstrations
- Tobi Adewoye, Xiao Han, Nick Ruest, Ian Milligan, Samantha Fritz, and Jimmy Lin. “Content-Based Exploration of Archival Images Using Neural Networks.” In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20). Association for Computing Machinery, New York, NY, USA, 489–490 (2020). https://doi.org/10.1145/3383583.3398577 | preprint | poster demo | video summary
- Ryan Deschamps, Nick Ruest, Jimmy Lin, Samantha Fritz, and Ian Milligan. “The Archives Unleashed Notebook: Madlibs for Jumpstarting Scholarly Exploration of Web Archives.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 19 (2019). preprint
- Hsiu-Wei Yang, Linqing Liu, Ian Milligan, Nick Ruest, and Jimmy Lin. “Scalable Content-Based Analysis of Images in Web Archives with TensorFlow and the Archives Unleashed Toolkit.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 19 (2019). preprint
- Nick Ruest, Ian Milligan, and Jimmy Lin. “Warclight: A Rails Engine for Web Archive Discovery.” Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Vol. 19 (2019). preprint
Conference Presentations
- Ian Milligan, Helge Holzmann, Nick Ruest, Samantha Fritz, and Thomas Padilla. “Building the Next Generation of Web Archive Analysis Service.” Panel, RESAW Conference 2023, Marseille, France. Slides
- Samantha Fritz. “Through the ARCHway: Opportunities to Support Access, Exploration, and Engagement with Web Archives.” IIPC WAC 2023, Hilversum, Netherlands. Slides
- Samantha Fritz. “Community Engagement and Research Support as Solutions for Expanding Data Access and Use of Web Archives.” RDAP Summit 2023, Online. Slides
- Nick Ruest and Jefferson Bailey, “Supporting Computational Research on Large Digital Collections.” Coalition for Networked Information (CNI), December 2022, Washington, DC. Slides
- Helge Holzmann, Nick Ruest, Jefferson Bailey, Alex Dempsey, Samantha Fritz, Ian Milligan and Kody Willis, “Arch-It!.” Web Archiving and Digital Libraries (WADL) Workshop, June 2022, Online.
- Ian Milligan, Jefferson Bailey, Nick Ruest, Helge Holzmann, Samantha Fritz, Kody Willis, “Build: The Archives Research Compute Hub from Idea to Platform.” IIPC Web Archiving Conference, June 2022, Online. Slides
- Samantha Fritz, “Web Archives as Big Data: Building Tools and Community to Support Access and Use.” IFLA WLIC, August 2021, Online. Slides
- Samantha Fritz, “Accessible Web Archives: Rethinking and Designing Usable Infrastructure for Sustainable Research Platforms.” IIPC Web Archiving Conference, June 2021, Online. Slides
- Samantha Fritz, Ian Milligan, Nick Ruest, “Building Community through Archives Unleashed Datathons.” WAC, June 2021, Online. Slides
- Ian Milligan, Helge Holzmann, “Integrating Archives Unleashed with the Internet Archives.” International Internet Preservation Consortium/RESAW Conference, June 2021, Online. Slides
- Samantha Fritz and Sarah McTavish. “Analyzing Web Archives with the Archives Unleashed Project.” Continuing Education to Advance Web Archiving (CEDWARC), October, 2019, Washington, DC. Overview | Workshop Slides
- Samantha Fritz. “Web Archives: A Doorway to Access and Usability.” Access Conference, September 2019, Edmonton, Alberta. Slides
- Nick Ruest and Ian Milligan. “Lowering the Barrier to Access: The Archives Unleashed Cloud Project.” The Web That Was: Archives, Traces, Reflections, June 2019, Amsterdam, Netherlands. Slides
- Nick Ruest and Ian Milligan. “See a Little Warclight : Building an Open-Source Web Archive Portal with Project Blacklight.” International Internet Preservation Consortium Web Archiving Conference, June 2019, Zagreb, Croatia. Slides
- Nick Ruest and Ian Milligan. “Project Sustainability and Research Platforms : The Archives Unleashed Cloud Project.” International Internet Preservation Consortium Web Archiving Conference, June 2019, Zagreb, Croatia. Slides
- Nick Ruest, “Oh, I Get by with a Little Help from my Friends : Interdisciplinary Web Archive Collaboration.” The Fields Institute Workshop on Quantitative Analysis and the Digital Turn in Historical Studies, February 2019, Toronto, Ontario, Canada.
- Ian Milligan, “Opening up WARCs: The Archives Unleashed Toolkit and Cloud Projects.” International Internet Preservation Consortium Annual Meeting, November 2018, Wellington, New Zealand.
- Nick Ruest, “Make it WALK.” Archives Association of Ontario 2018, May 2018, Waterloo, Ontario, Canada.
- Ryan Deschamps, Jimmy Lin, Nick Ruest, Samantha Fritz, Ian Milligan. “Usability, Accessibility, and Performance: Striking the Right Balance with the Archives Unleashed Toolkit.” CSDH/SCHN Digital Humanities Conference 2018, May 2018, Regina, Saskatchewan, Canada.
- Ian Milligan, “Too Much Information: Transparency, Metadata, and Search in the Age of Web Archives.” American Historical Association Conference, January 2018, Washington, DC, USA.
- Nick Ruest, “Warclight.” Blacklight European Summit 2017, October 2017, Copenhagen, Denmark.
- Ziquan Wang, Borui Lin, Ian Milligan, and Jimmy Lin. “Topic Shifts Between Two US Presidential Administrations.” JCDL 2017 Workshop on Web Archiving and Digital Libraries, June 2017, Toronto, Ontario, Canada. paper draft here
- Nick Ruest and Ian Milligan, “Learning to WALK (Web Archives for Longitudinal Knowledge) : Building a National Web Archiving Collaborative Platform.” International Internet Preservation Consortium/RESAW Conference, June 2017, London, England.
- Ian Milligan and Nick Ruest, “Warcbase : Using Scalable Web Analytics to Analyze Canadian Collections En Masse.” National Symposium on Web Archiving, February 2017, San Francisco, California, USA.
- Ian Milligan, Jimmy Lin, Jeremy Wiebe, and Alice Zhou. “Exploring and Discovering Archive-It Collections with Warcbase.” Digital Humanities 2016, July 2016, Krakow, Poland. https://dh2016.adho.org/abstracts/271
- Ian Milligan and Nick Ruest, “Engaging the Public with Web Archives: Providing Access to 10 Years of Political History with WebArchives.ca.” Canadian Society of Digital Humanities/Société canadienne des humanités numériques Conference, May 2016, Calgary, Alberta, Canada.
- Ian Milligan and Nick Ruest, “Hands on with Warcbase.” International Internet Preservation Consortium Conference, April 2016, Reykjavik, Iceland.
Invited Talks and Lectures
- Samantha Fritz, “Archives Unleashed Cohort Program: Opportunities to Access, Explore, and Engage with Web Archives,” Archive-It Partner Meeting, November 2022. Slides | Video Recording
- Samantha Fritz, “Applications of Web Archive Research with the Archives Unleashed Cohort Program,” Internet Archive Library as Laboratory Series, March 2022, Online. Intro Slides | Video Recording
- Samantha Fritz, “Web Archives with the Archives Unleashed Project,” Canadian Web Archiving Coalition Web Archives Webinar Series, January 2021, Online. Slides | Video Recording
- Ian Milligan, “History in the Age of Abundance: Skills, Tools, and Methods for the 21st-Century Historian,” Invited Lecture at York University Department of History Historian’s Craft Series, January 2020, Toronto, Ontario, Canada.
- Ian Milligan, “Working with Cultural Heritage at Scale: Developing Tools and Platforms to Enable Historians to Explore History in the Age of Abundance.” ACL Special Interest Group on Language Technologies for Socio-Economic Science and Humanities, June 2019, Minneapolis, Minnesota, USA.
- Nick Ruest, “Hot Tips To Boost Your Interdisciplinary Web Archive Collaboration!” Lewis & Ruth Sherman Centre for Digital Scholarship Speaker Series, April 2018, Hamilton, Ontario, Canada.
- Nick Ruest, “Boosting Your Interdisciplinary Web Archive Collaboration.” BC Research Libraries Group Lecture Series, February 2018, Vancouver, British Columbia, Canada.
- Ian Milligan, “Big Data and History (Or How this Historian Learned to Stop Worrying and Love Big Data).” Love Your Data Week 2018, February 2018, Vancouver, British Columbia, Canada.
- Ian Milligan and Nick Ruest, “Twitter and Web Archive Analysis at Scale.” Data Love-In 2018, February 2018, Vancouver, British Columbia, Canada.
- Ian Milligan and Nick Ruest, “Capturing the Web Today for Tomorrow : Innovations in Capturing and Analyzing Social Media and Websites for the New Scholarly Record.” University Librarian’s Speaker Series on Emergent Research in Digital Scholarship, March 2017, Toronto, Ontario, Canada.
- Ian Milligan and Nick Ruest, “Walking the WALK : Facilitating Interdisciplinary Web Archive Collaboration.” University of Alberta, June 2016, Edmonton, Alberta, Canada.
Datasets
- Nick Ruest. “GeoCities Web Archive Collection Derivatives,” 2021, https://archive.org/details/geocities-webarchive-collection-derivatives
- Helge Holzmann. “Friendster Datasets,” 2021, https://archive.org/details/friendsterdatasets
- Helge Holzmann. “Early Web Datasets,” 2021, https://archive.org/details/earlywebdatasets
- Nick Ruest, Jocelyn Wilk, Alex Thurman. “University Archives web archive collection derivatives,” 2020, https://doi.org/10.5683/SP2/FONRZU
- Nick Ruest, Matthew C. Baker, and Alex Thurman. “Burke Library New York City Religions web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3701455
- Nick Ruest. “Rare Book and Manuscript Library web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3701593
- Nick Ruest and Samantha Abrams. “Contemporary Composers Web Archive (CCWA) web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3692559
- Nick Ruest, Carole Gagné, and Dave Mitchell. “Harvest Quebec Government Websites from December 2006 web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3688354
- Nick Ruest, Carole Gagné, and Dave Mitchell. “Quebec International Relation and Economy web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3688334
- Nick Ruest, Carole Gagné, and Dave Mitchell. “Sites of the Quebec Ministry of Immigration from 2012 to 2018 web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3687264
- Nick Ruest, Carole Gagné, and Dave Mitchell. “Ministry of Environment of Québec (2011-2014) web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3605525
- Nick Ruest, Carole Gagné, and Dave Mitchell. “Coalition Avenir Québec (CAQ) web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3687262
- Nick Ruest, Carole Gagné, and Dave Mitchell. “Quebec Ministry of Agriculture, Fisheries and Food from 2012-2018 web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3687256
- Nick Ruest, Christine Sala, and Samantha Abrams. “Collaborative Architecture, Urbanism, and Sustainability Web Archive (CAUSEWAY) collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3674173
- Nick Ruest, Christine Sala, and Alex Thurman. “Avery Library Historic Preservation and Urban Planning web archive collection derivatives,” 2020, https://doi.org/10.5683/SP2/Z68EVJ
- Nick Ruest, Amanda Bielskas, Brittany Wofford, Jane Quigley, Emily Wild, and Samantha Abrams. “Geologic Field Trip Guidebooks Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3666295
- Nick Ruest and Alex Thurman. “Resistance web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3660457
- Nick Ruest. “Freely Accessible eJournals web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633671
- Nick Ruest, Andrew S. Dolkart, and Alex Thurman. “Stonewall 50 Commemoration web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3631347
- Nick Ruest. “General web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633290
- Nick Ruest, Luo Zhou, Joshua Seufert, and Samantha Abrams. “Independent Documentary Filmmakers from China, Hong Kong, and Taiwan Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3632912
- Nick Ruest, Thomas Keenan, Robert Davis, Anna Arays, Erik Zitser, Alla Roylance, Bogdan Horbal, Samantha Abrams. “Eastern Europe and Former Soviet Union Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633031
- Nick Ruest, Lauris Olson, and Samantha Abrams. “Popline and K4Health Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633022
- Nick Ruest, Christine Sala, and Samantha Abrams. “Latin American and Caribbean Contemporary Art Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633118
- Nick Ruest. “Extreme Right Movements in Europe web archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633161
- Nick Ruest, Yoshie Yanagihara, Tetsuyuki Shida, Haruko Nakamura, and Samantha Abrams. “Queer Japan Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633284
- Nick Ruest, Karen Green, Sarah Wenzel, and Samantha Abrams. “Global Webcomics Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633737
- Nick Ruest, Talía Guzman-González, Sócrates Silva, Jill Baron, and Samantha Abrams. “Brazilian Presidential Transition (2018) Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3659692
- Nick Ruest, Kristina Williams, Keely Wilczek, Jeremy Darrington, Ryan Denniston, and Samantha Abrams. “State Elections Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3635634
- Nick Ruest. “Web Archive of Independent News Sites on Turkish Affairs derivatives,” 2020, http://doi.org/10.5281/zenodo.3633234
- Nick Ruest, Anna Rakityanskaya, Thomas Keenan, Robert Davis, Anna Arays, and Samantha Abrams. “Literary Authors from Europe and Eurasia Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3632728
- Nick Ruest, Chengzhi Wang, Xiao-He Ma, and Samantha Abrams. “#MeToo and the Women’s Rights Movement in China Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633681
- Nick Ruest, James Adams, Marcella Barnhart, Bobray Bordelon, Gwyneth Crowley, Joann Donatiello, and Samantha Abrams. “National Statistical Offices and Central Banks Web Archive collection derivatives,” 2020, http://doi.org/10.5281/zenodo.3633683
- Nick Ruest, Ian Milligan, Jimmy Lin, Ryan Deschamps, and Samantha Fritz, “Derivative Data for Web Archives for Longitudinal Knowledge (WALK),” 2018, https://dx.doi.org/10.20383/101.036
- Ian Milligan, Nick Ruest, and Ryan Deschamps, “Network Data for the Web Archives for Longitudinal Knowledge (WALK) Project,” 2016, https://hdl.handle.net/10864/12040
- Ian Milligan, Nick Ruest, Jimmy Lin, “Derivative data for the Canadian Political Parties and Interest Groups collection,” 2015, https://hdl.handle.net/10864/11301
- Nick Ruest, and Library and Archives Canada, “#elxn42 tweets (42nd Canadian Federal Election),” 2015, https://hdl.handle.net/10864/11311