Library of Congress Vows to Archive All Tweets
August 28, 2012
Andrew Phelps of the Nieman Journalism Lab recently reported on a huge undertaking by the Library of Congress in the article “The Plan to Archive Every Tweet in the Library of Congress? Definitely Still Happening.”
According to the article, the Library of Congress announced in 2010 its plan to preserve every public tweet for future generations. What it may not have anticipated at the time is the scale: there are now 400 million public tweets a day, and the number continues to grow. However, when Canada.com recently reported that the “LOC is quietly backing out of the commitment,” an LOC spokesperson replied that the project is very much still happening.
Library spokesperson Jennifer Gavin said:
“The process of how to serve it out to researchers is still being worked out, but we’re getting a lot closer,” Gavin told me. “I couldn’t give you a date specific of when we’ll be ready to make the announcement… We began receiving the material, portions of it, last year. We got that system down. Now we’re getting it almost daily. And of course, as I think is obvious to anyone who follows Twitter, it has ended up being a very large amount of material.”
Since the project is definitely under way, the real challenge is how this unstructured data will be organized and made searchable. I’m interested to see what they figure out.
Jasmine Ashton, August 28, 2012