Hakia Offline

October 22, 2014

In April 2014, I cited a report that suggested Hakia was moving forward. It now appears that the Hakia Web site has gone dark. Information about Hakia’s semantic system is available in this interview with Riza C. Berkan.

Stephen E Arnold, October 22, 2014

Hakia Down

September 18, 2014

We ran a check on the search and content processing vendors in our file. The Hakia.com site appears to be down.

Hakia was a developer of semantic search and offered several demonstrations of its technology. To learn about the company, read the interview with Riza C. Berkan in this Search Wizards Speak issue.

Stephen E Arnold, September 18, 2014

IHS Enterprise Search: Semantic Concept Lenses Are Here

July 29, 2014

I pointed out in http://bit.ly/X9d219 that IDC, a mid tier consulting firm that has marketed my information without permission on Amazon of all places, has rolled out a new report about content processing. The academic sounding title is “The Knowledge Quotient: Unlocking the Hidden Value of Information.” Conflating knowledge and information is not logically satisfying to me. But you may find the two words dusted with “value” just the ticket to career success.

I have not read the report, but I did see a list of the “sponsors” of the study. The list, as I pointed out, was an eclectic group, including huge firms struggling for credibility (HP and IBM) down to consulting firms offering push ups for indexers.

One company on my list caused me to go back through my archive of search information. The firm that sparked my interest is Information Handling Services or IHS or Information Handling Service. The company is publicly traded and turning a decent profit. The revenue of IHS has moved toward $2 billion. If the global economy perks up and the defense sector is funded at pre-drawdown levels, IHS could become a $2 billion company.

IHS is a company with an interesting history and extensive experience with structured and unstructured search. Few of those with whom I interacted when I was working full time considered IHS a competitor to the likes of Autonomy, Endeca, and Funnelback.

In the 2013 10-K on page 20, IHS presents its “cumulative total return” in this way:

[Image: chart of IHS cumulative total return, from the 2013 10-K]

The green line looks like money. Another slant on the company’s performance can be seen in a chart available from Google Finance.

The Google chart shows that revenue is moving upwards, but operating margins are drifting downward and operating income is suppressed. Like Amazon, the costs for operating an information centric company are difficult to control. Amazon seems to have thrown in the towel. IHS is managing like the Dickens to maintain a profit for its stakeholders. For stakeholders, is the hope that hefty profits will be forthcoming?

[Image: chart of IHS revenue, operating margin, and operating income]

Source: Google Finance

My initial reaction was, “Is IHS trying to find new ways to generate higher margin revenue?”

Like Thomson Reuters and Reed Elsevier, IHS required different types of content processing plumbing to deliver its commercial databases. Technical librarians and the competitive intelligence professionals monitoring the defense sector are likely to know about IHS’s different products. The company provides access to standards documents, regulatory information, and Jane’s military hardware information services. (Yep, Jane’s still has access to retired naval officers with mutton chop whiskers and interesting tweed outfits. I observed these experts when I visited the company in England prior to IHS’s purchase of the outfit.)

The standard descriptions of IHS peg the company’s roots with a trade magazine outfit called Rogers Publishing. My former boss at Booz, Allen & Hamilton loved some of the IHS technical services. He was, prior to joining Booz, Allen, the head of research at Martin Marietta, an IHS customer in the 1970s. Few remember that IHS was once tied in with Thyssen Bornemisza. (For those with an interest in history, there are some reports about the Baron that are difficult to believe. See http://bit.ly/1qIylne.)

Large professional publishing companies were early, if somewhat reluctant, supporters of SGML and XML. Running a query against a large collection of structured textual information could be painfully slow when one relied on traditional relational database management systems in the late 1980s. Without SGML/XML, repurposing content required humans. With scripts hammering on SGML/XML, creating new information products like directories and reports eliminated the expensive humans for the most part. Fewer expensive humans in the professional publishing business reduces costs…for a while at least.
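The repurposing idea can be illustrated with a minimal sketch. It assumes a tiny, invented XML record set (the element names, identifiers, and publishers are hypothetical, not IHS data) and shows how a short script can turn structured records into a new product, here a plain-text directory grouped by publisher.

```python
import xml.etree.ElementTree as ET

# Hypothetical records; element names and values are invented for illustration.
SOURCE = """
<standards>
  <standard id="S-100"><title>Fastener Torque Limits</title><publisher>Acme Standards Body</publisher><year>1987</year></standard>
  <standard id="S-205"><title>Valve Pressure Ratings</title><publisher>Example Institute</publisher><year>1989</year></standard>
  <standard id="S-310"><title>Bolt Thread Gauges</title><publisher>Acme Standards Body</publisher><year>1988</year></standard>
</standards>
"""


def build_directory(xml_text):
    """Group record titles by publisher and return a sorted plain-text directory."""
    root = ET.fromstring(xml_text)
    by_publisher = {}
    for record in root.iter("standard"):
        publisher = record.findtext("publisher")
        entry = f'{record.findtext("title")} ({record.get("id")}, {record.findtext("year")})'
        by_publisher.setdefault(publisher, []).append(entry)
    lines = []
    for publisher in sorted(by_publisher):
        lines.append(publisher)
        lines.extend(f"  {entry}" for entry in sorted(by_publisher[publisher]))
    return "\n".join(lines)


directory = build_directory(SOURCE)
print(directory)
```

The same tagged source could feed a report or an index with a different function, which is the cost argument: the markup carries the structure, so no human has to re-key it.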

IHS climbed on the SGML/XML diesel engine and began working to deliver snappy online search results. As profit margins for professional publishers were pressured by increasing marketing and technology costs, IHS followed the path of other information centric companies. IHS began buying content and services companies that, in theory, would give the professional publishing company a way to roll out new, higher margin products. Even secondary players in the professional publishing sector like Ebsco Electronic Publishing wanted to become billion dollar operations and then get even bigger. Rah, rah.

These growth dreams electrify many information companies’ executives. The thought that every professional publishing company and every search vendor are chasing finite or constrained markets does not get much attention. Moving from dreams to dollars is getting more difficult, particularly in professional publishing and content processing businesses.

My view is that packaging up IHS content and content processing technology got a boost when IHS purchased the Invention Machine in mid 2012.

Years ago I attended a briefing by the founders of the Invention Machine. The company demonstrated that an engineer looking for a way to solve a problem could use the Invention Machine search system to identify candidate systems and methods from the processed content. I recall that the original demonstration data set was US patents and patent applications. My thought was that an engineer looking for a way to implement a particular function for a system could, if the Invention Machine system worked as presented, generate a patent result set. That result set could be scanned to eliminate any patents still in force. The resulting set of patents might yield a procedure that the person looking for a method could implement without having to worry about an infringement allegation. The original demonstration was okay, but like most “new” search technologies, Invention Machine faced funding, marketing, and performance challenges. IHS acquired Invention Machine, its technologies, and its Eastern European developers, and embraced the tagging, searching, and reporting capabilities of the Invention Machine.

The Goldfire idea is that an IHS client can license certain IHS databases (called “knowledge collections”) and then use Goldfire / Invention Machine search and analytic tools to get the knowledge “nuggets” needed to procure a missile guidance component.

The jargon for this finding function is “semantic concept lenses.” If the licensee has content in a form supported by Goldfire, the licensee can search and analyze IHS information along with information the client has from its own sources. A bit more color is available at http://bit.ly/WLA2Dp.

The IHS search system is described in terms familiar to a librarian and a technical analyst; for example, here are the attributes of the Goldfire “cloud” from an IHS 2013 news release:

  • “Patented semantic search technology providing precise access to answers in documents. [Note: IHS has numerous patents but it is not clear what specific inventions or assigned inventions apply directly to the search and retrieval solution(s)]
  • Access to more than 90 million scientific and technical “must have” documents curated by IHS. This aggregated, pre-indexed collection spans patents, premium IHS content sources, trusted third-party content providers, and the Deep Web.
  • The ability to semantically index and research across any desired web-accessible information such as competitive or supplier websites, social media platforms and RSS feeds – turning these into strategic knowledge assets.
  • More than 70 concept lenses that promote rapid research, browsing and filtering of related results sets thus enabling engineers to explore a concept’s definitions, applications, advantages, disadvantages and more.
  • Insights into consumer sentiment giving strategy, product management and marketing teams the ability to recognize customer opinions, perceptions, attitudes, habits and expectations – relative to their own brands and to those of their partners’ and competitors’ – as expressed in social media and on the Web.”

Most of these will resonate with those familiar with the assertions of enterprise search and content processing vendors. The spin, which I find notable, is that IHS delivers both content and information retrieval. Most enterprise search vendors provide technology for finding and analyzing data. The licensee has to provide the content unless the enterprise search vendor crawls the Web or other sources, creates an archive or a basic index, and then provides an interface that is usually positioned as indexing “all content” for the user.

From Virtual Strategy Magazine (which presumably does not cover “real” strategy), I learned that US 8666730:

covers the semantic concept “lenses” that IHS Goldfire uses to accelerate research. The lenses correlate with the human knowledge system, organizing and presenting answers to engineers’ or scientists’ questions – even questions they did not think to ask. These lenses surface concepts in documents’ text, enabling users to rapidly explore a concept’s definitions, applications, advantages, disadvantages and more.

The key differentiator is claimed to move IHS Goldfire up a notch. The write up states:

Unlike today’s textual, question-answering technologies, which work as meta-search engines to search for text fragments by keyword and then try to extract answers similar to the text fragment, the IHS Goldfire approach is entirely unique – providing relevant answers, not lists of largely irrelevant documents. With IHS Goldfire, hundreds of different document types can be parsed by a semantic processor to extract semantic relationships like subject-action-object, cause-and-effect and dozens more. Answer-extraction patterns are then applied on top of the semantic data extracted from documents and answers are saved to a searchable database.
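To make the description concrete, here is a toy sketch of the general idea: pull subject-action-object triples out of sentences, save them, and answer a question from the saved triples rather than from a document list. It is not the Goldfire method. The verb list, the regular expression, the table layout, and the sample sentences are all assumptions made for illustration; a real semantic processor would rely on full linguistic parsing.

```python
import re
import sqlite3

# Hypothetical action verbs; a real system would use a parser, not a fixed list.
ACTIONS = ("reduces", "causes", "prevents", "increases")
SAO_PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<action>" + "|".join(ACTIONS) + r")\s+(?P<object>.+?)\.?$"
)


def extract_sao(sentence):
    """Return (subject, action, object) for a simple declarative sentence, else None."""
    match = SAO_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return (match["subject"].lower(), match["action"], match["object"].lower())


def build_answer_store(sentences):
    """Save extracted triples in an in-memory SQLite table for later querying."""
    store = sqlite3.connect(":memory:")
    store.execute("CREATE TABLE sao (subj TEXT, act TEXT, obj TEXT)")
    triples = [t for t in map(extract_sao, sentences) if t is not None]
    store.executemany("INSERT INTO sao VALUES (?, ?, ?)", triples)
    return store


def what_does(store, action, obj):
    """Answer 'what <action> <object>?' from the stored triples."""
    rows = store.execute(
        "SELECT subj FROM sao WHERE act = ? AND obj = ? ORDER BY subj",
        (action, obj),
    )
    return [row[0] for row in rows]


DOCUMENT = [
    "Zinc coating reduces corrosion.",
    "Salt spray causes corrosion.",
    "Cathodic protection reduces corrosion.",
    "The test ran for ten days.",
]

store = build_answer_store(DOCUMENT)
answers = what_does(store, "reduces", "corrosion")
print(answers)
```

The point of the sketch is the shape of the pipeline: extraction happens once at indexing time, and the query runs against relationships instead of keywords.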

According to Igor Sovpel, IHS Goldfire:

“Today’s engineers and technical professionals are underserved by traditional Internet and enterprise search applications, which help them find only the documents they already know exist,” said Igor Sovpel, chief scientist for IHS Goldfire. “With this patent, only IHS Goldfire gives users the ability to quickly synthesize optimal answers to a variety of complex challenges.”

Is IHS’ new marketing push in “knowledge” and related fields likely to have an immediate and direct impact on the enterprise search market? Perhaps.

There are several observations that occurred to me as I flipped through my archive of IHS, Thyssen, and Invention Machine information.

First, IHS has strong brand recognition in what I would call the librarian and technical analyst for engineering demographic. Outside of lucrative but quite niche markets for petrochemical information or silhouettes and specifications for the SU 35, IHS suffers the same problem as Thomson Reuters and Wolters Kluwer. Most senior managers are not familiar with the company or its many brands. Positioning Goldfire as an enterprise search or enterprise technical documentation/data analysis tool will require a heck of a lot of effective marketing. Will positioning IHS cheek by jowl with IBM and a consulting firm that teaches indexing address this visibility problem? The odds could be long.

Second, search engine optimization folks can seize on the name Goldfire and create some dissonance for IHS in the public Web search indexes. I know that companies like Attivio and Microsoft use the phrase “beyond search” to attract traffic to their Web sites. I can see the same thing happening. IHS competes with other professional publishing companies looking for a way to address their own marketing problems. A good SEO name like “Goldfire” could come under attack and quickly. I can envision lesser competitors usurping IHS’ value claims which may delay some sales or further confuse an already uncertain prospect.

Third, enterprise search and enterprise content analytics are proving to be difficult markets from which to wring profitable, sustainable revenue. If IHS is successful, the third party licensees of IHS data who resell that information to their online customers might take steps to renegotiate contracts for revenue sharing. IHS will then have to ramp up its enterprise search revenues to keep or outpace revenues from third party licensees. Addressing this problem can be interesting for those managers responsible for the negotiations.

Finally, enterprise search has a lot of companies planning on generating millions or billions from search. There can be only one prom queen and a small number of “close but no cigar” runners up. Which company will snatch the crown?

This IHS search initiative will be interesting to watch.

Stephen E Arnold, July 29, 2014

I2E Semantic Enrichment Unveiled by Linguamatics

July 21, 2014

The article titled Text Analytics Company Linguamatics Boosts Enterprise Search with Semantic Enrichment on MarketWatch discusses the launch of I2E Semantic Enrichment from Linguamatics. The new release allows for the mining of a variety of texts, from scientific literature to patents to social media. It promises faster, more relevant search for users. The article states,

“Enterprise search engines consume this enriched metadata to provide a faster, more effective search for users. I2E uses natural language processing (NLP) technology to find concepts in the right context, combined with a range of other strategies including application of ontologies, taxonomies, thesauri, rule-based pattern matching and disambiguation based on context. This allows enterprise search engines to gain a better understanding of documents in order to provide a richer search experience and increase findability, which enables users to spend less time on search.”
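A rough sketch of what “enriched metadata” can mean in practice follows. It is not the I2E implementation. It assumes a tiny invented thesaurus, invented concept identifiers, and one simple context rule for an ambiguous word; the output is a list of normalized concepts that a search engine could index alongside the raw text.

```python
import re

# Hypothetical mini-thesaurus: surface form -> (concept id, preferred label).
THESAURUS = {
    "aspirin": ("DRUG:001", "acetylsalicylic acid"),
    "acetylsalicylic acid": ("DRUG:001", "acetylsalicylic acid"),
    "heart attack": ("DISEASE:010", "myocardial infarction"),
    "myocardial infarction": ("DISEASE:010", "myocardial infarction"),
}

# Hypothetical context rule: "cold" counts as a disease only near these cue words.
AMBIGUOUS = {
    "cold": (("DISEASE:020", "common cold"), {"symptoms", "virus", "patients"}),
}


def enrich(text):
    """Return a sorted list of (concept id, preferred label) pairs found in the text."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    concepts = set()
    # Synonyms collapse to one concept, so either surface form yields the same tag.
    for term, concept in THESAURUS.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            concepts.add(concept)
    # Ambiguous terms are tagged only when a cue word appears in the same text.
    for term, (concept, cues) in AMBIGUOUS.items():
        if term in words and words & cues:
            concepts.add(concept)
    return sorted(concepts)


doc = {"id": "doc-1", "text": "Aspirin is given after a heart attack."}
doc["concepts"] = enrich(doc["text"])
print(doc["concepts"])
print(enrich("Keep the samples in cold storage."))
print(enrich("The cold virus spreads quickly."))
```

Because synonyms map to a single concept identifier, a query for “myocardial infarction” can match a document that only says “heart attack,” which is the findability gain the release describes.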

Whether they are spinning semantics for search, or if it is search spun for semantics, Linguamatics has made their technology available to tens of thousands of users of enterprise search. Representative John M. Brimacombe was straightforward in his comments about the disappointment surrounding enterprise search, but optimistic about I2E. It is currently being used by many top organizations, as well as the Food and Drug Administration.

Chelsea Kerwin, July 21, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Sindice Support Comes to an End

June 18, 2014

Another semantic system turns out the lights. SemanticWeb hosts a guest post from the founders of Sindice titled, “End of Support for the Sindice.com Search Engine: History, Lessons Learned, and Legacy.” The article delves into a wealth of technical details. It opens, however, with this modest introduction:

“Since 2007, Sindice.com has served as a specialized search engine that would do a crazy thing: throw away the text and just concentrate on the ‘markup’ of the web pages. Sindice would provide an advanced API to query RDF, RDFa, Microformats and Microdata found on web sites, together with a number of other services. Sindice turned useful, we guess, as approximately 1100 scientific works in the last few years refer to it in a way or another.”

The team decided to end support for the specialized search engine in order to focus on serving enterprise users. Besides, they say, their vision has been realized. They write:

“With the launch in 2012 of Schema.org, Google and others have effectively embraced the vision of the ‘Semantic Web.’ With the RDFa standard, and now even more with JSON-LD, richer markup is becoming more and more popular on websites. While there might not be public web data ‘search APIs,’ large collections of crawled data (pages and RDF) exist today which are made available on cloud computing platforms for easy analysis with your favorite big data paradigm.”
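The “throw away the text and just concentrate on the markup” idea is easy to sketch for JSON-LD, the format the founders single out. The example below is not Sindice code. It assumes an invented sample page and uses only the Python standard library to pull the embedded JSON-LD out of the HTML while ignoring the visible text.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Page text outside a JSON-LD script block is deliberately dropped.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


# Hypothetical page used only for this illustration.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body><p>Visible text that a markup-only crawler would skip.</p></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(PAGE)
print(extractor.items)
```

RDFa and Microdata live in element attributes rather than in a script block, so handling them would mean inspecting attributes in `handle_starttag` instead; the sketch leaves that out.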

The account begins at the beginning, with the team’s first goal of developing a simpler API, and ends with their transition to the startup SindiceTech. In between are interesting details, like a description of their 60-machine “Webstar” operations cluster and details on how they leveraged Hadoop for their RDF analytics. We may be sad to see support for Sindice.com go, but at least the team has shared some of their wisdom on the way out.

Cynthia Murrell, June 18, 2014

RSuite Incorporates Temis into Content Management Platform

May 8, 2014

RSuite content management users can now tap into TEMIS, we learn from “RSuite CMS Leverages TEMIS’s Content Enrichment Capabilities to Deliver a Powerful Semantic Solution.” The partnership makes TEMIS’s semantic enrichment capabilities available to RSuite’s customers in the publishing, government, and corporate arenas. The deal was announced at this year’s MarkLogic World conference, held April seventh in San Francisco; both companies are MarkLogic partners.

The press release elaborates:

“RSuite CMS provides an intuitive user interface that minimizes actions required to execute complex searches across an entire set of content. The solution can globally apply metadata, dynamically organize massive amounts of documents into collections, package and distribute content to licensing partners, and enables customers to meet their multi-channel publishing goals.

“By leveraging TEMIS’s Luxid® Content Enrichment Platform, RSuite CMS can enable customers to automatically enrich their content with domain-specific metadata directly within their publishing workflows. This enables faster and more scalable content indexing, improved metadata consistency and governance, more efficient authoring, and more powerful search and discovery features within customer applications and portals.”

With its focus on publishing and media, RSuite strives to meet today’s ever-evolving publication challenges. The company serves such big names as HarperCollins, Audible, and Oxford University Press. RSuite was launched in 2000 and is located in Audubon, Pennsylvania.

With its collaborative platform, TEMIS adds domain-specific metadata to clients’ data, allowing publishers to supply more relevant information to their own audiences. TEMIS maintains several offices across Europe and North America.

Cynthia Murrell, May 08, 2014

Actonomy Joins Belgian HR Group

May 1, 2014

Actonomy’s slogan is: “We simply search smarter!” The claim rests on the company’s semantic technology, which is meant to optimize human resources recruitment processes and findability. It is a big claim to make. If challenged, would Actonomy be able to back it up? The company’s most recent press release, “Actonomy Now Part Of A Larger HR Group,” suggests that its semantic search technology was one of the leading HR products in the European market.

As a result, Actonomy has joined a Belgian HR Group owned by the Peumans family. The group includes other HR software and service companies, including Cognsis, Prato, and SAP. Actonomy has been a star product for over seven years and it is one of the groundbreaking developers in matching technology and ontology based search. Joining the Belgian HR Group gives them the ability to increase their client list and extend their service offerings:

“Thanks to Actonomy’s technology, Prato can extend its service offering of HRM related processes and include in its service offering Actonomy’s semantic searching and matching technology. Actonomy on the other hand will be able to bring its software to perfection thanks to Prato’s broad know how allowing us to launch a suite of new services packaged on top of our core semantic technology. A win win situation for both companies!”

While these companies will remain separate, they will exchange their technologies to benefit each other. It kind of sounds like open source, except they are remaining proprietary companies.

Whitney Grace, May 01, 2014

Meet The Armadillo

April 3, 2014

Armadillos are not native to France, but the Armadillo digital resources management company is. If you are curious to learn more about the French company, peruse the “Company Overview” with a little assistance from Google Translate. Armadillo was founded in 1998 and has since acquired a very long and prestigious client list.

Armadillo’s products offer a range of services that include research and development of information technology, custom data solutions, and packages for various digital content. The products are, of course, advertised as a big data solution and can be customized for any data type, content, and organizational method.

The director describes his products as:

“Armadillo packages are integrated into the information systems of companies and other organizations to facilitate data exchange between former silos. This creates repositories of harmonized content, easily shared and guaranteed ‘up to date.’ Our solutions have a broad functional coverage with excellent performance for near-zero operating costs. Our technology is based on the latest innovations proposed by the Semantic Web and Big Data.”

It looks like another big data player peddling the usual solutions. However, the company has been around longer than other big data startups, so longevity and reliability are on its side.

Whitney Grace, April 03, 2014

Lingway Now Part of Something Bigger

March 23, 2014

What has happened to Lingway, purveyor of vertical semantic solutions for search and analysis? According to a press release on its Web site: “Lingway Chooses Toledo And The Castilla-La Mancha Region As Its Operating Base In Spain.” Lingway explains the move to Spain this way:

“The Spanish market is important for Lingway, but the fact that it will give us access to the markets of Latin America makes it even more valuable,” says Bernard Normier, Lingway’s CEO. “One of the main reasons we chose Castilla-La Mancha as our headquarters was that the local authorities were able to put us in touch with the other actors in the region (companies, consultants, universities and government organizations) and provide us with the assistance and support required for our project.”

While Lingway may be brushing up on their Spanish, it was also a foothold for another company. Lingway appears to be part of Eptica, as evidenced by the EpticaLingway blog post, “The Lingway Team Is Pleased To Join Eptica And Will Continue To Serve Its Customers.” Eptica acquired Lingway in 2012 as a way to expand into France, strengthen its research and development investments, and pursue further international growth.

Eptica has integrated Lingway’s technology to bolster its own products. Eptica has a SaaS offering to manage online reputation and another software product, LEA CV, for dedicated recruiting teams and recruitment companies. Moved and sold: technology companies change constantly.

Whitney Grace, March 23, 2014

Ontotext Offers Interesting Services

March 8, 2014

Ontotext delivers very interesting services to its clients. All of its products are associated with semantic technology and use big data to benefit users. On its Web site, the company describes itself this way:

“Ontotext develops a unique portfolio of core semantic technologies. Our RDF engine powers some of the biggest world-renowned media sites. Our text-mining solutions demonstrate unsurpassed accuracy across different domains – from sport news to macro-economic analysis, scientific articles and clinical trial reports. We enable the next generation web of data and we can efficiently extract information from today’s structured web – be it recipes, adverts or anything else.”

It offers services for job extraction, hybrid semantics, and semantic publishing for industries such as life sciences, government, recruitment, libraries, publishing, and media. Ontotext has a range of products to help people harness semantic technology. The most interesting to us is the Semantic Biomedical Tagger that is described as an extraction system that creates semantic annotations in biomedical texts. Ontotext also has the requisite search engine and semantic database. Its product line is fairly robust and we intend to keep an eye on its offerings.

Whitney Grace, March 08, 2014
