Computer Automation Is Making Researchers Obsolete
January 26, 2013
In archives and libraries around the world, piles of historic documents sit gathering dust. One problem librarians and archivists have with these documents is that many carry no date, and there is no easy way to work out when they were written. Researchers may have a solution, according to the MIT Technology Review article “The Algorithms That Automatically Date Medieval Manuscripts.” Gelila Tilahun and colleagues at the University of Toronto have created algorithms that use language and common phrases to date the documents. Certain words and expressions can tie a document to a specific time period. It sounds easy, but according to the article it is a bit more complex:
“However, the statistical approach is much more rigorous than simply looking for common phrases. Tilahun and co’s computer search looks for patterns in the distribution of words occurring once, twice, three times and so on. “Our goal is to develop algorithms to help automate the process of estimating the dates of undated charters through purely computational means,” they say. This approach reveals various patterns that they then test by attempting to date individual documents in this set. They say the best approach is one known as the maximum prevalence technique. This is a statistical technique that gives a most probable date by comparing the set of words in the document with the distribution in the training set.”
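To make the idea in the quote a little more concrete, here is a toy sketch in Python. It is not the researchers' implementation; the function names, the Gaussian weighting, and the `bandwidth` parameter are all assumptions made for illustration. It scores each candidate year by how prevalent the undated document's words are among training documents dated near that year, then returns the highest-scoring year.

```python
import math
from collections import Counter


def words_in(text):
    """Return the set of distinct lowercase words in a text."""
    return set(text.lower().split())


def estimate_date(document, training_set, candidate_years, bandwidth=10.0):
    """Pick the candidate year whose nearby training documents best
    match the words of the undated document.

    training_set is a list of (year, text) pairs. Hypothetical helper:
    the scoring rule is an illustrative stand-in, not the published method.
    """
    doc_words = words_in(document)
    if not doc_words:
        return None

    best_year, best_score = None, float("-inf")
    for year in candidate_years:
        weighted_counts = Counter()
        total_weight = 0.0
        for train_year, text in training_set:
            # Training documents dated close to the candidate year count more.
            weight = math.exp(-0.5 * ((year - train_year) / bandwidth) ** 2)
            total_weight += weight
            for word in words_in(text):
                weighted_counts[word] += weight
        if total_weight == 0.0:
            continue  # no training evidence anywhere near this year

        # Average weighted share of nearby documents containing each word.
        score = sum(weighted_counts[w] / total_weight for w in doc_words)
        score /= len(doc_words)
        if score > best_score:
            best_year, best_score = year, score
    return best_year


if __name__ == "__main__":
    # Invented toy data, not real charters.
    training = [
        (1150, "know ye that i have given and granted"),
        (1160, "know ye that i have given"),
        (1250, "to all the faithful in christ greeting"),
        (1260, "to all the faithful in christ eternal greeting"),
    ]
    years = range(1150, 1261, 10)
    print(estimate_date("to all the faithful greeting", training, years))
```

With real charters the training set would hold thousands of dated documents, and the choice of phrase length and weighting would matter far more than it does in this toy example.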
Tilahun and the team want their algorithms used for more than dating old documents. They can also be used to find forgeries and verify authorship. The dating tool opens many more opportunities to explore history, but the downside is that research is getting more automated. Librarians and scholars may be kicked out and sent to work at Wal-Mart.
Whitney Grace, January 26, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search