October 27, 2013
If you need a relatively painless way to obtain metadata, DocumentCloud might be your solution. Every uploaded document is run through OpenCalais, giving users access to the people, places, and organizations mentioned in it. The service simplifies searching your documents for those entities and lets you plot documents on a timeline by the dates they mention, at whatever level of detail you choose.
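The timeline idea is easy to picture in code. Below is a minimal sketch, not DocumentCloud's actual API: the extraction records are invented, standing in for what an OpenCalais-style pass might return per document.

```python
from collections import defaultdict

# Hypothetical OpenCalais-style output: entities and dates extracted per document.
extracted = [
    {"doc": "memo-1.pdf", "entities": ["Jane Doe", "Acme Corp"], "dates": ["2013-06-01"]},
    {"doc": "filing-2.pdf", "entities": ["Acme Corp"], "dates": ["2013-06-01", "2013-09-15"]},
]

def timeline(records):
    """Group documents by the dates they mention, timeline-style."""
    by_date = defaultdict(list)
    for rec in records:
        for d in rec["dates"]:
            by_date[d].append(rec["doc"])
    return dict(sorted(by_date.items()))
```

A document that mentions several dates appears at several points on the timeline, which is exactly what makes the feature useful for reconstructing a paper trail.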
“Use our document viewer to embed documents on your own website and introduce your audience to the larger paper trail behind your story.
From our catalog, reporters and the public alike can find your documents and follow links back to your reporting. DocumentCloud contains court filings, hearing transcripts, testimony, legislation, reports, memos, meeting minutes, and correspondence. See what’s already in our catalog. Make your documents part of the cloud.”
If you prefer privacy, that is a built-in feature. If you prefer to publish, your documents become part of the landscape of primary sources in the DocumentCloud catalog. There is also a highlighting feature that accommodates both public annotations and more private organizational notes. Each note has its own URL, enabling users to point their readers to the exact information they need.
Chelsea Kerwin, October 27, 2013
August 6, 2013
The rise of metadata is here, but will companies be able to harness its value? Concept Searching suggests that, so far, ROI has been elusive across the board. A recent article, “Solving the Inadequacies and Failures in Enterprise Search,” admonishes the laissez-faire approach some companies take toward enterprise search. The author advocates, instead, for a hands-on information governance approach.
What the author calls a “metadata infrastructure framework” should be created, comprising automated intelligent metadata generation, auto-classification, and the use of goal- and mission-aligned taxonomies.
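To make the auto-classification piece concrete, here is a minimal sketch under our own assumptions; the taxonomy categories, terms, and sample text are invented, and real systems use far richer linguistic evidence than bare keyword overlap.

```python
# Invented taxonomy: category -> indicative terms.
TAXONOMY = {
    "legal": {"contract", "litigation", "compliance"},
    "finance": {"invoice", "budget", "revenue"},
}

def auto_classify(text):
    """Assign every taxonomy category whose terms appear in the document text."""
    words = set(text.lower().split())
    return sorted(cat for cat, terms in TAXONOMY.items() if terms & words)
```

The point of the framework is that this tagging happens automatically at ingest, so the metadata exists before anyone searches for the document.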
According to the article:
The need for organizations to access and fully exploit the use of their unstructured content won’t happen overnight. Organizations must incorporate an approach that addresses the lack of an intelligent metadata infrastructure, which is the fundamental problem. Intelligent search, a by-product of the infrastructure, must encourage, not hamper, the use and reuse of information and be rapidly extendable to address text mining, sentiment analysis, eDiscovery and litigation support. The additional components of auto-classification and taxonomies complete the core infrastructure to deploy intelligent metadata enabled solutions, including records management, data privacy, and migration.
We wholeheartedly agree that investing in infrastructure is a necessity in many areas, not just search. When it comes to a search infrastructure, however, we would be remiss not to mention security. Fortunately, solutions like the Cogito Intelligence API give risk-averse businesses confidence by embedding corporate security measures from the start.
Megan Feil, August 6, 2013
July 26, 2013
We regularly read ugly numbers about the amount of time wasted searching for documents. When we stumbled upon DocumentCloud, we could not help but wonder whether this type of service will help with the productivity and efficiency problems that are currently all too common.
The homepage walks potential users through what using DocumentCloud is like. First, users gain access to more information about their documents. Second, annotating and highlighting sections can be done with ease.
Finally, sharing work is possible:
“Everything you upload to DocumentCloud stays private until you’re ready to make it public, but once you decide to publish, your documents join thousands of other primary source documents in our public catalog. Use our document viewer to embed documents on your own website and introduce your audience to the larger paper trail behind your story. From our catalog, reporters and the public alike can find your documents and follow links back to your reporting. DocumentCloud contains court filings, hearing transcripts, testimony, legislation, reports, memos, meeting minutes, and correspondence.”
In summary, this is a service that will enable metadata to be produced for documents. If anyone needs us, we will be browsing the documents already in their catalog.
Megan Feil, July 26, 2013
July 26, 2013
The web of linked data is growing not only in volume but also in its set of vocabularies. We recently saw on the Open Knowledge Foundation’s site that Mondeca’s Linked Open Vocabularies (LOV), a collection of vocabulary spaces, has been updated.
Users are able to find vocabularies listed and individually described by metadata, classified by vocabulary spaces and interlinked using the dedicated vocabulary VOAF.
We learned more about what LOV is about:
“Most popular ones form now a core of Semantic Web standards de jure (SKOS, Dublin Core, FRBR …) or de facto (FOAF, Event Ontology …). But many more are published and used. Not only linked data leverage a growing set of vocabularies, but vocabularies themselves rely more and more on each other through reusing, refining or extending, stating equivalences, declaring metadata. LOV objective is to provide easy access methods to this ecosystem of vocabularies, and in particular by making explicit the ways they link to each other and providing metrics on how they are used in the linked data cloud, help to improve their understanding, visibility and usability, and overall quality.”
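The “metrics on how they are used” idea boils down to counting reuse links between vocabularies. Here is a toy sketch; the VOAF-style link pairs below are invented, and real LOV metrics are computed over the actual linked data cloud.

```python
from collections import Counter

# Toy VOAF-style interlinks: (vocabulary, vocabulary it reuses or extends).
links = [
    ("foaf", "dcterms"),
    ("skos", "dcterms"),
    ("event", "foaf"),
]

def reuse_counts(pairs):
    """Count how often each vocabulary is reused by others, LOV-metric style."""
    return Counter(target for _, target in pairs)
```

A vocabulary with a high reuse count (Dublin Core is the classic example) is effectively load-bearing for the whole ecosystem, which is why making these links explicit improves visibility and quality.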
There are myriad ways for the interested to feed their inner controlled-vocabulary demon. One is to suggest a new vocabulary to add to LOV.
Megan Feil, July 26, 2013
January 3, 2013
Information is the only global currency and it is by no means a limited resource. The National Information Exchange Model (NIEM) Resource Database sees this and was initially developed out of a desire for a government-wide, standards-based approach to exchanging information.
Twenty states found that there were too many bureaucratic policies involved in exchanging information across state and city government lines, and so began the effort that became known as the Global Justice Information Sharing Initiative, a precursor to NIEM.
The website continues on the background of this project and the Department of Homeland Security‘s connection:
“Parallel to the GJXDM effort was the stand up of the U.S. Department of Homeland Security. The mention of metadata in the president’s strategy for homeland security in the summer of 2002 galvanized the homeland security community to begin working towards standardization. These collaborative efforts by the justice and homeland security communities—to produce a set of common, well-defined data elements for data exchange development and harmonization—lead to the beginnings of NIEM.”
It is difficult not to find this interesting, but at the end of the day this is a government initiative in a time of severe financial challenges, and we cannot help but wonder whether budget pressures will hamper efforts to push it forward. For now, take a look at the resource database while you can.
Megan Feil, January 03, 2013
December 10, 2012
The healthcare world continues its creep into the twenty-first century, and now Mondeca is lending a hand with the process. The French company’s Web site announces, “Mondeca Helps to Bring Electronic Patient Record to Reality.” Tasked with implementing healthcare management systems across France, that country’s healthcare agency, ASIP Santé, has turned to Mondeca for help. The press release describes the challenge:
“The task is a daunting one since most healthcare providers use their own custom terminologies and medical codes. This is due to a number of issues with standard terminologies: 1) standard terminologies take too long to be updated with the latest terms; 2) significant internal data, systems, and expertise rely on the usage of legacy custom terminologies; and 3) a part of the business domain is not covered by a standard terminology.
“The only way forward was to align the local custom terminologies and codes with the standard ones. This way local data can be automatically converted into the standard representation, which will in turn allow to integrate it with the data coming from other healthcare providers.”
The process began by aligning the standard terminology Logical Observation Identifiers Names and Codes (LOINC) with the related terminology common in Paris hospitals. Mondeca helped the effort with its expertise in complex organizational and technical processes, such as setting up collaborative spaces and aligning and exporting terminology.
Our question: Will doctors use these systems without introducing more costs and errors in the push for cost efficiency? Let us hope so.
Established in 1999, Mondeca serves clients in Europe and North America with solutions for the management of advanced knowledge structures: ontologies, thesauri, taxonomies, terminologies, metadata repositories, knowledge bases, and linked open data. The firm is based in Paris, France.
Cynthia Murrell, December 10, 2012
November 9, 2012
Big data has held the media spotlight long enough to surpass any initial thought that it was a passing trend. Now the headlines trumpet how to benefit from the massive amounts of unstructured data flooding the internet and how to process it.
Computer Weekly’s article “How to Manage Unstructured Data for Business Benefit” explains how the next data evolution will be harnessing the benefits of both unstructured and structured data:
“There is as much value in unstructured data in terms of what customers are thinking on the web and what businesses can derive from other organizations’ data. It requires an understanding of the type of information the business is looking for and the kinds of insights business managers are hoping to draw from the data. The more considered the query, and the more focused the search, the better the results. This rule applies to both structured and unstructured data.”
Applying metadata to unstructured data opens up a profound new way to increase the findability of enterprise content, but the right solution is mandatory for success. Businesses looking for secure search and enterprise accessibility will find that IntraFind provides customized solutions that organize, tag, and ultimately reveal relevant information to users of its enterprise search offerings. Powerful tools like these provide flexible options for data processing that put the power to increase efficiency and ROI back in the hands of the user.
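Why does tagging increase findability? Because search can then match metadata as well as body text. Here is a minimal sketch of that idea with invented documents and tags; it is not IntraFind's implementation.

```python
# Invented corpus: each document carries body text plus metadata tags.
docs = {
    "report.docx": {"text": "quarterly results", "tags": ["finance", "2012"]},
    "notes.txt":   {"text": "meeting minutes",   "tags": ["operations"]},
}

def search(query, corpus):
    """Return documents whose body text or metadata tags match the query term."""
    q = query.lower()
    return sorted(
        name for name, d in corpus.items()
        if q in d["text"].lower() or any(q == t.lower() for t in d["tags"])
    )
```

A query for “finance” finds the report even though the word never appears in its text, which is precisely the findability gain the article is describing.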
Jennifer Shockley, November 9, 2012
November 8, 2012
In 2009, the W3C published SKOS-XL (SKOS eXtension for Labels). Now, Voyages of the Semantic Enterprise asks, “Who Needs SKOS-XL? Maybe No One.” What does writer Irene Polikoff have against the SKOS extension?
The post begins at the beginning, with an explanation of the Triangle of Reference and the concepts behind the open source SKOS. Polikoff goes on to describe the purpose of SKOS-XL: to allow concept labels to collect their own metadata. This, she says, unnecessarily complicates vocabulary management. She writes:
“Labels are not strings as in SKOS proper, but RDF resources with their own identity. Each label can have only one literal form; this is where the actual text string (the name) goes. The literal form is not one per Label per language as with SKOS’s constraint for assigning preferred labels, but one per Label. So, to accommodate different languages, different label resources must be created. At the same time, there can be multiple Label resources with the same literal form (for example, two different Label resources with the literal form ‘Mouse’). Even a simple SKOS-XL vocabulary is considerably bulkier than its SKOS alternative. Since SKOS-XL format takes far more space, storage, import/export and performance of search and query can become an issue for larger vocabularies.”
Labels can be linked to each other as well as to their concepts, the article notes, further increasing complexity. Also, the same text label may be applied to different entities, potentially leading to confusion. Furthermore, the write-up points to a couple of specific integrity clashes between SKOS and SKOS-XL. See the article for more details.
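The “bulkier” claim is easy to quantify. The sketch below builds the triples for one concept with an English and a French preferred label both ways; the concept and label names are invented, but the triple patterns follow the SKOS and SKOS-XL vocabularies as the quote describes them.

```python
def skos_triples(concept, labels):
    """Plain SKOS: one skos:prefLabel triple per language."""
    return [(concept, "skos:prefLabel", f'"{text}"@{lang}')
            for lang, text in labels.items()]

def skosxl_triples(concept, labels):
    """SKOS-XL: each label is its own resource with a type and a literal form."""
    triples = []
    for i, (lang, text) in enumerate(labels.items()):
        label = f"{concept}_label{i}"  # invented identifier scheme
        triples += [
            (concept, "skosxl:prefLabel", label),
            (label, "rdf:type", "skosxl:Label"),
            (label, "skosxl:literalForm", f'"{text}"@{lang}'),
        ]
    return triples
```

Two labels cost two triples in plain SKOS but six in SKOS-XL, before any of the label-to-label links the article mentions, so the storage and query overhead compounds quickly in large vocabularies.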
Polikoff closes by offering to help readers who think SKOS-XL is their only choice for vocabulary management to find a simpler solution. Will many users agree it is wise to do so?
Cynthia Murrell, November 08, 2012
October 18, 2012
The post goes on to elaborate on another study with similar results:
“Not enough for you? Seven years ago, an article ran in NewScientist. It highlights a study done at King’s College London, that showed in today’s business setting, marked by emails, smart phone connections – the connected 24×7 reality of today – the average IQ of an individual drops by about 10 points. The study went on to conclude, (and this is my favorite part), ‘Even smoking dope has less effect on your ability to concentrate on the task in hand.’”
Knowledge management is obviously powerful, but it requires one to step back and consider the available options and information. Enterprise search is a key ingredient of knowledge management, and IntraFind offers best-in-class practices for secure search, including semantic linking and intelligent tagging.
Andrea Hayden, October 18, 2012
September 29, 2012
BeyeNetwork suggests one reason metadata is not implemented comprehensively or well: “Lack of Metaprocess Information Impedes Ability to Collect Metadata.” Writer and database management expert Bill Inmon pins the lack of enterprise-wide metadata primarily on a lack of metaprocess information. Metaprocess covers high-level descriptive details about a process, like its name, the technology that houses it, its input and output, and algorithmic variables. It is pointless, Inmon insists, to attempt to understand a large organization’s information flow without this information.
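Inmon's metaprocess can be pictured as a small record type. The sketch below is our own reading of the fields he lists, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class Metaprocess:
    """High-level descriptive details about a process, per Inmon's list."""
    name: str                                   # what the process is called
    technology: str                             # where it runs (e.g. COBOL on a mainframe)
    inputs: list = field(default_factory=list)  # what the process consumes
    outputs: list = field(default_factory=list) # what the process produces
    variables: dict = field(default_factory=dict)  # algorithmic parameters
```

With records like this for every process, tracing an organization's information flow becomes a graph problem; without them, Inmon argues, it is guesswork.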
Why is metaprocess information so hard to come by? The article explains:
“It resides in the old legacy code. In COBOL. In assembler. In AS/400 modules. In PL/1. In technology that has not seen the light of day in decades. Once there were technicians that could be hired to read and go through the old code. Today those technicians have retired or have been promoted to management positions. In another generation, it won’t even be possible to find anyone who understands these older technologies. And by that time, SQL and C++ will be the old legacy technologies of the day.”
How does one solve a metaprocess problem? What is the meta-metaprocess? Inmon doesn’t really have an answer to that. He does suggest that, since legacy code is a form of text, someone may someday find a way to coax this information from a text editor. Anyone up for the challenge?
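Inmon's closing suggestion, treating legacy code as text to be mined, can at least be started with pattern matching. This is a speculative sketch: the COBOL fragment is invented, and real legacy code would need far more robust parsing than two regular expressions.

```python
import re

# Invented COBOL fragment standing in for undocumented legacy code.
COBOL = """\
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PAYROLL01.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMP-FILE ASSIGN TO 'EMP.DAT'.
           SELECT PAY-FILE ASSIGN TO 'PAY.DAT'.
"""

def extract_metaprocess(source):
    """Pull the process name and its file inputs/outputs from COBOL source text."""
    name = re.search(r"PROGRAM-ID\.\s+(\S+?)\.", source)
    files = re.findall(r"SELECT\s+(\S+)\s+ASSIGN", source)
    return {"process": name.group(1) if name else None, "files": files}
```

Even this crude pass recovers a process name and the files it touches, which is more metaprocess information than many organizations have on record today.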
Cynthia Murrell, September 29, 2012