Toronto Datathon



The World Wide Web has had a profound impact on how we research and understand the past. The sheer amount of cultural information that is generated and, crucially, preserved every day in electronic form presents exciting new opportunities for researchers. Much of this information is captured within web archives.

Web archives often contain hundreds of billions of web pages, ranging from individual homepages and social media posts to institutional websites. These archives offer tremendous potential for social scientists and humanists, and the questions researchers may pose stretch across a multitude of fields. Scholars broaching topics dating back to the mid-1990s will find their projects enhanced by web data. Moreover, scholars hoping to study the evolution of cultural and societal phenomena will find a treasure trove of data in web archives. In short, web archives offer the ability to reconstruct large-scale traces of the relatively recent past.

With the latest release of the Archives Unleashed Toolkit (AUT), the project team will be hosting a series of four datathons for archivists, researchers, librarians, computer scientists, and others to work collaboratively with web collections and explore cutting-edge research tools through hands-on experience.

The Archives Unleashed Project is pleased to announce that the first datathon will take place at the University of Toronto's Robarts Library, 26-27 April 2018.

This event will bring together a small group of approximately 15 participants to experiment with the newest release of the Archives Unleashed Toolkit (AUT) and the Archives Unleashed Cloud, and to kick off collaboratively inspired research projects. Participants will have access to analytics software and specialists, and will be exposed to the process of working with web archive files at scale. For more information on AUT and the Cloud, please visit

Team Projects

Team: BC Teachers’ Labor Dispute
  • Nich Worby, University of Toronto Libraries
  • Graeme Campbell, Queen’s University
  • Brandon Locke, Michigan State University
  • Katie Mackinnon, University of Toronto
  • View Presentation
Team: Make Tweets Great Again
  • Jacqueline Whyte Appleby, Scholars Portal/University of Toronto
  • Shawn Jones, Old Dominion University
  • Amanda Oliver, Western University Archives
  • View Presentation
Team: Pipeline
  • Corey Davis, University of Victoria/COPPUL
  • Creighton Barrett, Dalhousie University
  • Ben Goldman, Penn State University Libraries
  • Rebecca Dowson, Simon Fraser University
  • View Presentation
Team: Spamlinks
  • Justin Littman, George Washington University Libraries
  • Shawn Walker, Arizona State University
  • Russell White, Library and Archives Canada
  • Brian Griffin, Old Dominion University
  • Jayanthy Chengan, Scholars Portal/University of Toronto
  • View Presentation


  • Ian Milligan (University of Waterloo)
  • Nicholas Worby (University of Toronto)
  • Nick Ruest (York University)
  • Jimmy Lin (University of Waterloo)


This event is possible thanks to the generous support of the Andrew W. Mellon Foundation, the University of Toronto Libraries, the University of Waterloo’s Faculty of Arts, York University, Compute Canada, and Start Smart Labs.



University of Toronto, John P. Robarts Research Library, Blackburn Room (4036)


Have we got a schedule for you! We will be kicking off the event with some discussion of the project and the setup of AUT. After that, it's full steam ahead with team project work. A light breakfast and lunch will be provided on both days.

Please click the image below to check out the full Toronto Datathon schedule!