web and social media archiving

an introduction to user-friendly technologies for preserving web and social media material

why save web and social media information?

a significant amount of the information generated by and published for today's society is available only on the web. libraries, archives, and related organizations around the world have strong systems in place to preserve print materials, but comparable systems for digital materials are still under development. depending on your project and your needs, you may wish to use some of the tools linked on this page to support your research and scholarship.

  • web content is dynamic and can change or disappear with little notice. this affects scholarship and citations: various studies have found that a high percentage of the links in bibliographies and works-cited lists stop working within years of publication.
  • government information is increasingly published online only, and print copies are no longer sent to libraries. this information can easily be made inaccessible in response to political pressure.
  • web content lives on servers, which are subject to environmental disasters or acts of war, just as print libraries and archives have always been. most web hosting is managed by private companies, which are subject to financial pressures and are governed by terms of service rather than legislative or ethical mandates.
  • news and current events play out on the web. a policy statement or announcement may be made on twitter or in a live stream. records of these (and of the responses to them) can easily be deleted by their creators, even after they have had significant social impact.
  • social media and blogs are a rich pool of personal narratives and public opinion. an online debate can be transformed into a dataset for research. 

these tools can be useful for researchers; they also have applications for activism and for journalism. 

key resources for web archiving