Brown Dog Fetches Buried Data

February 25, 2016

Outdated file formats, particularly those with no metadata, are especially difficult to search and utilize. The National Science Foundation (NSF) reports on a new search engine designed to plumb the unstructured Web in, “Brown Dog: A Search Engine for the Other 99 Percent (ofData).” With the help of a $10 million award from the NSF, a team at the University of Illinois-based National Center for Supercomputing Application (NCSA) has developed two complementary services. Writer Aaron Dubrow explains:

“The first service, the Data Access Proxy (DAP), transforms unreadable files into readable ones by linking together a series of computing and translational operations behind the scenes. Similar to an Internet gateway, the configuration of the Data Access Proxy would be entered into a user’s machine settings and then forgotten. From then on, data requests over HTTP would first be examined by the proxy to determine if the native file format is readable on the client device. If not, the DAP would be called in the background to convert the file into the best possible format….

“The second tool, the Data Tilling Service (DTS), lets individuals search collections of data, possibly using an existing file to discover other similar files in the data. Once the machine and browser settings are configured, a search field will be appended to the browser where example files can be dropped in by the user. Doing so triggers the DTS to search the contents of all the files on a given site that are similar to the one provided by the use….  If the DTS encounters a file format it is unable to parse, it will use the Data Access Proxy to make the file accessible.”

See the article for more on these services, which NCSA’s Kenton McHenry likens to a DNS for data. Brown Dog conforms to NSF’s Data Infrastructure Building Blocks program, which supports development work that advances the field of data science.

 

Cynthia Murrell, February 25, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Comments

Comments are closed.

  • Archives

  • Recent Posts

  • Meta