PolySpot Lands ERAMET Contract

October 13, 2010

PolySpot, an enterprise content processing vendor based in Paris, landed a contract with the French mining and metals group ERAMET. The outfit is a large producer of nickel and ferronickel alloy, essential for stainless steel. According to “ERAMET Fédère Sa Connaissance Grâce à PolySpot!”:

The system will be used to make technical and scientific literature search more widely available to ERAMET staff. The ability to search multiple databases adds flexibility to the product. The system will also provide access to information from encyclopedias, journals, articles, file systems and document management applications. Features of the system include support for simple and advanced searches, faceted navigation, dynamic mining of authors, tagging keywords, etc. The user can conduct research either on a specific source or on all sources. Beyond the search functions and navigation, the thesaurus function establishes a hierarchy and semantic equivalence among keywords to adapt to different cultural contexts and identify relevant concepts.
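The thesaurus idea in the passage above, a hierarchy plus equivalent keywords, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `THESAURUS` table, its entries, and the `expand_query` function are hypothetical and say nothing about how PolySpot actually implements the feature.

```python
# Hypothetical thesaurus: every term and field name here is illustrative,
# not taken from PolySpot's product.
THESAURUS = {
    "nickel": {"equivalents": ["Ni"], "narrower": ["ferronickel"]},
    "ferronickel": {"equivalents": ["FeNi"], "narrower": []},
}


def expand_query(term, thesaurus=THESAURUS):
    """Return the term, its equivalents, and all narrower terms, without duplicates."""
    expanded = []
    pending = [term.lower()]
    while pending:
        current = pending.pop(0)
        if current in expanded:
            continue
        expanded.append(current)
        entry = thesaurus.get(current, {})
        # Equivalent keywords are added as-is (for example, a chemical symbol).
        for equivalent in entry.get("equivalents", []):
            if equivalent not in expanded:
                expanded.append(equivalent)
        # Narrower terms are queued so their own equivalents get picked up too.
        pending.extend(entry.get("narrower", []))
    return expanded
```

A search for "Nickel" would then also look for the symbol and the narrower alloy term, which is roughly what "hierarchy and semantic equivalence" suggests.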

Congratulations to PolySpot.

Stephen E Arnold, October 13, 2010

Freebie

Linguamatics Joins Up with Accelrys

October 11, 2010

Linguamatics, a nifty content processing vendor in the UK, has formed a partnership for “streamlined, high performance text analytics” with Accelrys. Linguamatics will be giving a presentation at the Smart Content Conference in Manhattan later this month, so you can learn about the company first hand, or you can navigate to http://www.linguamatics.com/. The firm’s Web site has been refreshed and you can learn about the firm’s solutions directly.

Accelrys is a company that produces scientific informatics software. If you got a D in biology, you won’t be using Accelrys’ industrial strength analytics and visualization tools any time soon. Chemistry majors, engineers, and molecular biologists will be quite interested in the firm’s solutions.

What does the hook up mean?

According to “Linguamatics and Accelrys Announce Partnership for Streamlined, High-Performance Text Analytics,”

Mutual customers will benefit by embedding powerful natural language querying within more extensive informatics workflows including access via Accelrys web clients. Organizations continue to face the challenge of filtering ever-increasing volumes of text information to gain actionable knowledge. Linguamatics provides the ability to automate document indexing and querying within the I2E software platform in addition to its interactive information extraction capabilities. Embedding I2E within Pipeline Pilot workflows enables further streamlining of the process for high throughput text mining, and provides access to additional content processing, analytics and output display options.

I would not characterize the new capabilities as search or NLP. The companies are moving, like some others, into a data fusion space. Unlike search vendors who announce that they are now involved in Business Intelligence, Linguamatics and Accelrys have industrial strength technology in place to meet the needs of a specific market category. Just my opinion.

Stephen E Arnold, October 11, 2010

Freebie

Nstein in the News

October 11, 2010

I had a couple of comments about my not mentioning Nstein, now a unit of OpenText. Nstein has been an interesting company or unit of a bigger enterprise. Last year, one of Nstein’s executives set up a meeting with me and then did not show up. I pinged the fellow and learned that his plans had changed. Since then, my plans for covering Nstein changed as well. Seemed only fair.

To assuage the aggrieved reader, I took a quick look at the content sucked into my Overflight system about Nstein. One of the more interesting items appeared in a publication for which I write a for-fee column. I don’t cover search in that publication, but Archana Venkatraman wrote “Semantic Content Analytics Can Resolve Digital Information Problems.” I was surprised because a picture of me and links to my recent write ups about SAP appeared in the border for the Web version of Mr. Venkatraman’s article. I was flattered, but I was confused about the premise of the article; to wit, analytics resolving digital information problems. I think of analytics as causing problems, particularly with regard to the methods used to generate output. Data type and source, privacy, and latency – these topics cross the goose’s mind when he thinks about content analytics.

With regard to Nstein, the passage that caught my attention was information which is attributed, I assume, to an OpenText Nstein executive, Lubor Ptacek, vice president, product marketing:

Semantic Navigation first collects content through a crawling process. Then the content is automatically analyzed and tagged with relevant and insightful entities, topics, summaries and sentiments – the key to providing an engaging online experience. Next, content is served to users through intuitive navigation widgets that encourage audiences to discover the depth of available information or share it on social networks, such as Facebook and Twitter. From there, it supports placement of product and service offerings or advertising to convert page views into sales. Ptacek gives the example of a medical information professional is searching for the name of a disease, content analytics technology can provide him additional information such as the side effects of the illness the drugs used in the past and so on. “And this logic can be applied to other industries as well.” The solution comes after Open Text acquired Nstein Technologies, a content analytics company, six months ago. It acquired Nstein at a time when analysts were suggesting that such e-discovery solutions could provide sophisticated search and content navigation options that info pros are seeking.

I am hearing similar explanations of functionality from a number of companies. These include “sentiment specialists” like Attensity and Lexalytics and from certain mashup vendors such as Digital Reasoning and Kapow Technologies. I have heard the leaders in enterprise search like Autonomy and Exalead reference similar functions. I could toss in IBM, Google, and Microsoft, but I think you get the idea. Quite a few search vendors are morphing into solutions.

If you want more information about OpenText / Nstein, navigate to www.opentext.com. I would also suggest a look at the other vendors making similar assertions. I may have to start covering this new segment of search. Perhaps it warrants a separate Web log?

Stephen E Arnold, October 11, 2010

Freebie

Sophia Embraces Semantic and Contextual Search

October 8, 2010

We came across a search system that identifies and understands information based on its context, helping users find relevant information. This is Sophia Search, a product of Sophia, a Belfast-based innovation leader in contextual enterprise search solutions. According to the Marketwire.com press release “Update: Sophia Launches Sophia Search for Intelligent Enterprise Search and Contextual Discovery,” the product is already creating waves.

We have seen traditional search solutions that use taxonomies and ontologies, but this remarkable search uses a patented Contextual Discovery Engine (CDE), based on the linguistic model of Semiotics. Summarizing Sophia Search, the press release states, “The CDE platform automatically detects relationships and themes in unstructured content to enable organizations to seamlessly search, extract, deduplicate and eliminate redundancy of content to minimize risk and reduce the cost of retrieving, storing, and managing information.”

The semantic and contextual search will go a long way, and Sophia Search is surely a good pioneering start.

Harleena Singh, October 8, 2010

EasyAsk Embraces the NetSuite Cloud Platform

October 7, 2010

EasyAsk is looking to cloud computing to expand services to its clients, according to RedOrbit. “EasyAsk Integrates EasyAsk Business Edition With NetSuite’s Cloud Computing Platform” reports that EasyAsk is combining its EasyAsk Business Edition with the NetSuite cloud computing platform. EasyAsk Business Edition can be thought of as a search engine for the corporate world. This program allows users to search and explore corporate data through a user-friendly, Google-like interface. EasyAsk Business Edition changes business questions into back-end queries, retrieves the data and then delivers answers to the user. The application also employs semantic intelligence, which allows it to analyze user searches and provide helpful inquiries and suggestions in order to guide users. NetSuite’s SuiteCloud offers a variety of products, development tools and other services to help companies be more productive while also taking advantage of economic benefits. “EasyAsk Business Edition for NetSuite features rapid implementation and a superior user experience.” The dynamic duo of EasyAsk Business Edition and NetSuite’s SuiteCloud development platform gives corporate users access to valuable information that can help them serve customers better and increase overall productivity.

April Holmes, October 7, 2010

Freebie

The Google PSE as a Context Aggregator

October 6, 2010

I may be one of the few people outside of the Googleplex who pays much attention to Ramanathan Guha. I know that none of my neighbors in Harrod’s Creek pays much attention to the world outside of University of Kentucky football, the fall machine gun shoot, and squirrel stew.

If you want to keep an eye on the Google and its nifty programmable search engine, you may want to read US20100250513, “Aggregating Context Data for Programmable Search Engines.” That plural is, in my opinion, important. Here’s the official abstract:

Search results are generated using aggregated context data from two or more contexts. When two or more programmable search engines relate to a similar topic, context data associated with the programmable search engines are aggregated. The context is then applied to a query in order to present, in an integrated manner, relevant search results that make use of context intelligence from more than one programmable search engine.

I want to let you know that I realize patent applications may not be much more than the outputs of some idle engineers and attorneys. In fact, the systems and methods may not exist or even work in the real world. Nevertheless, the programmable search engine strikes me as a particularly interesting innovation. Work on it has been documented in the open source literature for several years.
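For readers who want the abstract in more concrete terms, here is a rough sketch of the general idea: pool the context from engines that share a topic, then apply the pooled context to a result list. Everything in it is an assumption made for illustration. The `ENGINES` data, the term weights, and the `aggregate_context` and `rank` functions are hypothetical and are not drawn from the patent application or from any Google system.

```python
from collections import defaultdict

# Hypothetical context data: each "programmable search engine" carries a topic
# label and term weights. None of this comes from the patent application.
ENGINES = [
    {"topic": "digital cameras", "boosts": {"review": 2.0, "lens": 1.0}},
    {"topic": "digital cameras", "boosts": {"review": 1.0, "sensor": 1.5}},
    {"topic": "gardening", "boosts": {"soil": 2.0}},
]


def aggregate_context(engines, topic):
    """Sum the term weights of every engine that shares the given topic."""
    combined = defaultdict(float)
    for engine in engines:
        if engine["topic"] == topic:
            for term, weight in engine["boosts"].items():
                combined[term] += weight
    return dict(combined)


def rank(results, context):
    """Order result titles by the total weight of the context terms they contain."""
    def score(title):
        words = title.lower().split()
        return sum(weight for term, weight in context.items() if term in words)

    return sorted(results, key=score, reverse=True)
```

In this toy version, two camera-related engines contribute to one shared context, and a result that matches terms from both rises to the top. Matching on an exact topic label is a simplification; the abstract only says the engines "relate to a similar topic."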

Stephen E Arnold, October 6, 2010

Freebie

Are Semantic Experts Losing It?

October 4, 2010

I read “USA Needs More Educated Workforce; Semantic Web Technologies May Help Higher Ed Spend IT Dollars More Wisely To Support Getting There” and wondered how this idea can get from A to B. Now I am a fan of semantic technologies, but I have said that semantic plumbing needs to be hidden behind nicely painted wallboard. I am baffled about the logic of the core argument in this semantic cheerleading write up. Here’s a passage that stumped me:

The latest development on this front is the public launch, set for later this month, at the EDUCAUSE conference of the EdUnify SOA Governance Framework Initiative. EdUnify is described as a shared, neutral, community-based Web services registry and suite of semantic web tools designed to reduce costs of integration and improve efficiency by providing a service-oriented architecture governance framework for education.

Conceptually I see that there are benefits from semantic technology applied to education. The reality is one that is going to leave semantic technology marginalized.

First, the present system is not working with large numbers of high school students abandoning their education. This means that the semantic payoff will be for the students who stay in school. My recollection of students who stay in school is that plain old teaching works reasonably well. Chasing technology fairy dust is interesting but not germane to getting reading, writing, and arithmetic in place and providing an environment in which to learn. How will semantic technology help those who drop out or fail to keep up with the bright kids?

Second, the financial situation is pretty grim. The notion of a top down semantic solution is fun to discuss, but the situation in schools is that some basics are now shifted from the school to the teacher. For example, at the start of the school year in Kentucky, students were asked to bring supplies that the school once provided. Hey, no one asked me to bring a ream of copy paper to the first day of school.

Third, the big top down, technology fixes have not worked. I remember going to the middle school where my wife taught for many years and found only one working PC in the computer lab. Sure, the presence of computers is a great idea, but the infrastructure to keep these gizmos working, training teachers in what to have students do with the computers, and the battle of wits between computer savvy kids and lagging teachers makes many such technology fixes a joke.

Semantic technology is plumbing. Where it can be integrated to improve content processing and information retrieval, great. Positioning semantic technology as part of a giant, top down program leaves me baffled. Thank goodness I am 66 and too old to have to worry about this sort of thinking. The write up shows some connection with reality, but the core notion strikes me as something wild and crazy.

Stephen E Arnold, October 4, 2010

Freebie

Lexalytics Finds Meaning in :-) and LOL

September 25, 2010

We received a news release from Lexalytics. “Lexalytics Unveils Sentiment Analysis of Emoticons, Acronyms; First OEM Engine to Examine Short Form Content for Sentiment Analysis” reveals that the vendor processes non-text emoticons such as :-0. We think this is a good use of available message cues in a short text message. According to the news release:

With the use of emoticons, abbreviations, and confusing “social speak” grammar, micro-blog services such as Twitter present a difficult task for natural language processing systems. These improvements come as part of the yearly software license for Salience, Lexalytics’ core text analytics engine.

Police, intelligence agencies, and Madison Avenue types are likely to give the new capabilities a spin. Will non-text characters illuminate terse, sometimes tokenized messages? We will keep our ear to the ground.

Interesting idea. One question: Has Lexalytics made a breakthrough no other vendor can emulate with a look up table? Do we process messages with these types of content payloads? Not so much.
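To show what the "look up table" approach in the question above might look like, here is a minimal sketch. The scores in `CUE_SCORES` and the `cue_sentiment` function are assumptions made up for this example; they are not Lexalytics' data or method, and a production engine would presumably do far more.

```python
# Illustrative scores only; these are assumptions, not Lexalytics' data or method.
CUE_SCORES = {":-)": 1.0, ":)": 1.0, "lol": 0.5, ":-(": -1.0, ":(": -1.0}


def cue_sentiment(message, scores=CUE_SCORES):
    """Average the scores of known emoticons and acronyms in a whitespace-split message."""
    hits = [scores[token.lower()] for token in message.split() if token.lower() in scores]
    # A message with no recognized cues is treated as neutral.
    return sum(hits) / len(hits) if hits else 0.0
```

A table like this handles the easy cases. It says nothing about sarcasm, negation, or cues glued to punctuation, which is where the harder work would be.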

Stephen E Arnold, September 25, 2010

Exclusive Interview with Steve Cohen, Basis Technology

September 21, 2010

The Lucene Revolution is a few weeks away. One of the featured speakers is Steve Cohen, the chief operating officer of Basis Technology. Long a leader in language technology, Basis Technology has ridden a rocket ship of growth in the last few years.

Steve Cohen, COO, Basis Technology

I spoke with Steve about his firm and its view of open source search technology on Monday, September 20, 2010. The full text of the interview appears below:

Why are you interested in open source search?

The open source search movement has brought great search technology to a much wider audience. The growing Lucene and Solr community provides us with a sophisticated set of potential customers, who understand the difference that high quality linguistics can make. Historically we have sold to commercial search engine customers, and now we’re able to connect with – and support – individual organizations who are implementing Solr for documents in many languages. This also provides us with the opportunity to get one step closer to the end user, which is where we get our best feedback.

What is your take on the community aspect of open source search?

Of course, open source only works if there is an active and diverse community. This is why the Apache Foundation has stringent rules regarding the community before they will accept a project. “Search” has migrated over the past 15 years from an adjunct capability plugged onto the side of database-based systems to a foundation around which high performance software can be created. This means that many products and organizations now depend on a great search core technology. Because they depend on it they need to support and improve it, which is what we see happening.

What’s your take on the commercial interest in open source?

Our take, as a mostly commercial software company, is that we absolutely want to embrace and support the open source community – we employ Apache committers and open source maintainers for non-Apache projects – while providing (selling) technology that enhances the open source products. We also plan to convert some of our core technology to open source projects over time.

What’s your view on the Oracle Google Java legal matter with regards to open source search?

The embedded Java situation is unique and I don’t think it applies to open source search technology. We’re not completely surprised, however, that Oracle would have a different opinion of how to manage an open source portfolio than Sun did. For the community at-large this is probably not a good thing.

What are the primary benefits of using open source search?

I’ll tell you what we hear from customers and users: the primary benefits are flexibility and avoiding vendor lock-in. There have been many changes in the commercial vendor landscape over the fifteen years we’ve been in this business, and customers feel like they’ve been hurt by changes in ownership and whole products and companies disappearing. Search, as we said earlier, is a core component that directly affects user experience, so customizing and tuning performance to their application is key. Customers want all of the other usual things as well: good price, high performance, support, etc.

When someone asks you why you don’t use a commercial search solution, what do you tell them?

We do partner with commercial search vendors as well, so we like to present the benefits of each approach and let the customer decide.

What about integration? That’s a killer for many vendors in my experience.

Our exposure to integration is on the “back end” of Lucene and Solr. Our technology plugs in to provide linguistic capabilities. Since we deliver a reliable connector between our technology and the search engine this hasn’t been much of a problem.

How does open source search fit into Basis’ product/service offerings?

Our product, Rosette, is a text analysis toolkit that plugs into search tools like Solr (or the Lucene index engine) to help make search work well in many languages. Rosette prepares tokens for the search index by segmenting the text (which is not easy in some languages, like Chinese and Japanese), using linguistic rules to normalize the terms to enhance recall, and providing enhanced search and navigation capabilities like entity extraction and fuzzy name matching.
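As a rough illustration of what "segmenting the text" and "normalizing the terms" can mean at the simplest level, here is a sketch that uses only the Python standard library. It is not Rosette and makes no claim about how Rosette works; the `normalize_tokens` function is a hypothetical stand-in that handles only whitespace-delimited scripts.

```python
import re
import unicodedata


def normalize_tokens(text):
    """Split on non-word characters, then casefold and strip accents from each token."""
    tokens = re.findall(r"\w+", text)
    normalized = []
    for token in tokens:
        # NFKD splits accented letters into a base letter plus combining marks.
        decomposed = unicodedata.normalize("NFKD", token.casefold())
        normalized.append(
            "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        )
    return normalized
```

Folding case and accents lets a query for "resume" match a document that says "Résumé," which is one small way normalization improves recall. A regular expression like this cannot segment Chinese or Japanese, where words are not separated by spaces, and that is the hard part the answer above refers to.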

How do people reach you?

Our Web site, at www.basistech.com, contains details on our various products and services, or people can write to info@basistech.com or call +1-617-386-2090.

Stephen E Arnold, September 21, 2010

Sponsored post

Search Industry Spot Changing: Risks and Rewards

September 20, 2010

I want to pick up a theme that has not been discussed from our angle in Harrod’s Creek. Marketers can change the language in news releases, on company blogs, and in PowerPoint pitches with a few keystrokes. For many companies, this is the preferred way to shift from one-size-fits-all search solutions described as a platform or framework into a product vendor. I don’t want to identify any specific companies, but you will be able to recognize them as these firms load up on Google AdWords, do pay-to-play presentations at traditional conferences, and output information about the new products. To see how this works, just turn off Google Instant and run the query “enterprise search”, “customer support”, or “business intelligence.” You can get some interesting clues from this exercise.

Image source: http://jason-thomas.tumblr.com/

Enterprise search, as a discipline, is now undergoing the type of transformation that hit suppliers to the US auto industry last year. There is consolidation, outright failure, and downsizing for survival. The auto industry needs suppliers to make cars. But when people don’t buy the US auto makers’ products, dominoes fall over.

What are the options available to a company with a brand based on the notion of “enterprise search” and wild generalizations such as “all your information at your fingertips”? As it turns out, the options are essentially those of the auto suppliers to the US auto industry:

  • The company can close its doors. A good example is Convera.
  • The search vendor can sell out, ideally at a very high price. A good example is Fast Search & Transfer SA.
  • The search vendor can focus on a specific solution; for example, indexing FAQs and other information for customer support. A good example is Open Text.
  • The vendor can dissolve back into an organization and emerge with a new spin on the technology. An example is Google and its Google Search Appliance.
  • The search vendor can just go quiet and chase work as a certified integrator to a giant outfit like Microsoft. Good examples are the firms who make “snap ins” for Microsoft SharePoint.
  • The search vendor can grab a market’s catchphrase like “business intelligence” and say me too. The search vendor can morph into open source and go for a giant infusion of venture funding. An example is Palantir.

Now there is nothing wrong with any of these approaches. I have worked on some projects and used many of the tactics identified above as rivets in an analysis.

What I learned is that saying enterprise search technology is now a solution has an upside and downside. I want to capture my thoughts about each before they slip away from me. My motivation is the acceleration in repositioning that I have noticed in the last two weeks. Search vendors are kicking into overdrive with some interesting moves, which we will document here. We are thinking about creating a separate news service to deal with some of the non-search aspects of what we think is a key point in the evolution of search, content processing and information retrieval.

The Upside of Repositioning One-Size-Fits-All-Search

Let me run down the facets of this view point.

First, repositioning, as I said above, is easy. No major changes have to be made except for the MBA-style and Madison Avenue type explanation of what the company is doing. I see more and more focused messages. A vendor explains that a solution can deliver an on point solution to a big problem. Good examples are the search vendors who are processing blogs and other social content for “meaning” that illuminates how a product or service is perceived. This is existing technology trimmed and focused on a specific body of content, specific outputs from specific inputs, and reports that a non-specialist can understand. No big surprise that search vendors are in the repositioning game as they try to pick up the scent of revenues like my neighbor’s hunting dog.
