May 26, 2012
I have had to look up the antecedents for InQuira again. I wanted to create this post to make it easy to reference these two firms which were combined to create InQuira. InQuira was acquired by Oracle Corp. in that company’s push to address its long-standing search and content processing issues. I have in my Overflight system the 2006 InQuira marketing collateral which, I noticed, provides a crib sheet for the many enterprise search vendors piling into the customer support segment. What’s interesting is that customer support is one of the sectors where open source search is getting some attention.
The antecedents of InQuira were:
- Answerfriend. The company had software which could understand text. In 2000, the company landed Accenture as a customer. Answerfriend pivoted on its natural language processing technology. Allegedly Answerfriend could handle both structured and unstructured data. Sound familiar in 2012?
- Electric Knowledge Inc. This too was an NLP shop. Its technology was based on computational linguistics. The company licensed its technology to Bank of America, an outfit with a long history of trying to find a search system which meets its requirements.
InQuira was created in 2002. The notion of hooking together two separate vendors to do the 1+1=3 thing has been used more recently by Lexalytics and Attensity.
At one time, InQuira was the answer system used by Yahoo’s customer support service. I encountered this when I tried to cancel a Yahoo service. The InQuira service was not too helpful to me. I just killed the credit card and solved the problem.
The marketing pitch of InQuira is as fresh today as it was in 2002. How much progress has there been in search and content processing in the last decade? Could the marketing collateral for a 2002 Oldsmobile be used without any changes? Probably not. Search has a limited supply of jargon, and it gets recycled endlessly in my opinion.
Stephen E Arnold, May 26, 2012
Sponsored by Polyspot
May 7, 2012
Social media is swarming with sound bites about social media. We recently came across a bit of information about Bitext’s recent SIG brainstorming meeting, which prompted further investigation into the company. As its name implies, Bitext is concerned with bits of text. Or, as we know it: unstructured content.
Their event was a big success, with attendance double what they expected. Social media and business strategies were discussed, particularly in relation to the company’s primary concern, semantics.
Among the company’s offerings, which span solutions, consulting services, and research and development, NaturalFinder stood out as having value on par with other semantically enriched search technology:
“NaturalFinder is the essential complement for any Internet or intranet search engine as it allows users to query in natural language (Spanish, English, French…) without using Booleans or wildcards. Thanks to its linguistic technology, users can focus on typing their queries in his/her own words as if he/she talked to another person. NaturalFinder will return all relevant documents and more documents than traditional search engines, which are based on keywords.”
It is clear here that technology is continuing to adapt to the larger trend of pervasive informal language. First, we saw unstructured content, as opposed to traditional structured content, utilized for business analytics. Now, we are creating tools that allow search engines to mimic human intelligence.
Megan Feil, May 7, 2012
Sponsored by Ikanow
March 20, 2012
WillQuitSmoking.com recently shared a video on a new analytics system at a healthcare facility in Austin, Texas, which is saving lives by managing large amounts of unstructured data.
According to the article, “Seton Healthcare Uses IBM Content and Predictive Analytics to Improve Care & Lower CHF Readmissions,” Seton Healthcare relies on IBM Content and Predictive Analytics to identify high-risk congestive heart failure (CHF) patients for interventive care and to avoid preventable readmissions.
The article states:
Natural language processing enables analysis of both structured (ie lab results) and unstructured data (ie physician notes, discharge summaries), opening the door to rich clinical and operational insights that were hidden in inaccessible free text files. Seton can now identify trends and patterns in patient care and outcomes, uncovering sometimes obscure correlations or disparities buried in years of medical records; these can dramatically improve diagnosis and treatment.
One of the reasons this article is really cool is that you can learn by watching a video, not by using a live, online demo of the technology. Yep, we think movies are much better than live systems. Are videos easier to control than a game show? Yep. Yep. Yep.
Jasmine Ashton, March 20, 2012
Sponsored by Pandia.com
January 4, 2012
As if to continue trying to prove that it can do anything, “IBM’s Watson to Help Doctors Diagnose, Treat Cancer,” reports eWeek. The AI supercomputer will be working with the Cedars-Sinai cancer center and insurance company WellPoint to evaluate cancer treatment options. Writer Brian T. Horowitz explains:
Using its data analytics and NLP [Natural Language Processing] capabilities, Watson would integrate data such as medical literature, patient histories, clinical trials, side effects and outcomes data to help doctors decide on courses of treatment. . . . Watson would also look at the characteristics of a patient’s cancer and make recommendations on cost-effective treatment that would lead to the best outcome.
Of course, this advice would not replace that of a doctor, but it could become a valuable tool. Other health care organizations have been turning to technology for solutions. For example, Dell just donated an entire cloud infrastructure to the Translational Genomics Research Institute for storing medical trial data on pediatric cancer.
Good to see technology being used for the good of humanity, right? We would like to see IBM put Watson up on a test corpus for the public to use. Wishful thinking, I suppose.
Cynthia Murrell, January 4, 2012
Sponsored by Pandia.com
November 30, 2011
I learned from one of my two or three readers that Barcelona was home to a natural language processing company. Several years ago, I spoke with a person familiar with the company Artificial Solutions. After a bit of fumbling around, I located a trade show at which the company was exhibiting. The company’s NLP system is called “Teneo.” The application which I recalled was the use of the NLP system for customer support. The company has expanded since I first learned about the firm. The technology has been applied to mobile devices, for example.
The company told me:
Teneo Mobile is a platform independent technology designed to enable companies, organizations, manufacturers and developers to create their own virtual assistant as a mobile app, regardless of platform, mobile device and even language. The Natural Language Interaction (NLI) engine is covered by patents. The system can currently be built in up to 21 different languages, including Mandarin and Russian.
The company, founded in 2001, is owned by its founders, the private equity fund Scope Growth II and some private investors. The company has tallied more than 200 projects in the public and private sector in 30 countries and 21 languages. In the telecommunications sector, the firm’s customers include:
The firm’s technology is based on the Teneo Interaction Engine. According to the firm, its system will:
reason like a human, using advanced linguistic and business rules to decide how best to respond to your customer’s request. Context comes into play here, such as time, date and place, as well as information picked up from previous conversations, customer data retrieved from your CRM system and transaction data from your ERP system. At this point, the Teneo will also eliminate any ambiguities from its initial analysis. Even one word can alter the meaning of a customer’s request. Teneo will instantly and dynamically re-assess content as the interaction develops, to understand what has changed and give the right answers. Natural language is full of subtle nuances, which Teneo is able to pick up and interpret. It understands idiom and slang, even dialect and SMS shorthand – and it’s also sympathetic to grammar, syntax or spelling mistakes.
For more information about the technology and its vertical applications, navigate to www.artificial-solutions.com.
Stephen E Arnold, November 30, 2011
Sponsored by Pandia.com
September 29, 2011
Editor’s Note: The Beyond Search team invited Craig Bassin, president of EasyAsk, a natural language processing specialist and search solution provider, to provide his view of the market for next generation search systems. For more information about EasyAsk, navigate to www.easyask.com.
This past February I watched, along with millions of others, IBM’s spectacular launch of Watson on Jeopardy! Watson was IBM’s crowning achievement in developing a Natural Language based solution finely tuned to compete, and win, on Jeopardy.
By IBM’s own estimates, the company invested between $1 billion and $2 billion to develop Watson. IBM ranks Watson as one of the three most difficult challenges in its long and successful history, alongside spectacular accomplishments such as the Deep Blue chess program and the Blue Gene human genome mapping project. Rarified air, indeed.
While many were watching to see if a computer could defeat human players, my interests were different. Watson was about to introduce natural language solutions to the broader public and show the world that such solutions are truly the wave of the future.
The results were historic. Watson soundly defeated the human competitors. On the marketing side, IBM continues to spend hundreds of millions of dollars to tell the world that the time for natural language is now.
IBM is not the only firm to bring natural language processing (NLP) into the application mainstream:
- Microsoft acquired Powerset, a small company with strong NLP technology, to create Bing and compete head-on with Google,
- Yahoo, one of the original Internet search companies, found Bing compelling enough to strike an OEM agreement with Microsoft and make Bing Yahoo’s search solution,
- Apple acquired a linguistic natural language interface tool called Siri, which is now being incorporated into the Mac and iPhone operating systems,
- Oracle Corporation bought Inquira for its NLP-based customer support solution,
- RightNow Technologies similarly acquired Q-Go, a Dutch company also providing NLP-based customer support solutions.
Many companies are now positioning themselves as natural language tools and have expanded the once tight definition of NLP to include things such as analyzing text to understand intent or sentiment. This is the impact of Watson: it has put natural language into the mainstream, and many organizations want to ride the marketing current driven by Watson regardless of how closely aligned their technology is with Watson.
But let’s also look at Watson for what it really is: one of the most expensive custom solutions ever built. Watson required an extremely large (and expensive) cluster of computers to run: 90 IBM Power 750 servers, totaling 2,880 processor cores. It also required a substantial R&D staff to build the analytics, content, and natural language processing software stack. In fact, IBM didn’t come to Jeopardy; Jeopardy came to IBM. They replicated the Jeopardy set at IBM labs, placing a great deal of horsepower underneath that stage.
The first foray of Watson into the real world will be in healthcare and the possibilities are exciting. Clearly IBM intends to focus Watson on some of the largest, most difficult challenges. But how does that help you run your business? You’re not going to see Watson running in your IT environment or on your preferred SaaS cloud anytime soon.
If Watson is focused on big problems, how can you use natural language solutions to better your business today? Perhaps you want to increase website customer conversion and improve user experience, better manage sales processes, deliver superior customer support, or, in general, make it easier for your workers to find the right information to do their jobs. So where do you go?
That’s where EasyAsk comes in.
September 26, 2011
Editor’s Note: This is an article written by Tim Estes, founder of Digital Reasoning, one of the world’s leading providers of technology for entity based analytics. You can learn more about Digital Reasoning at www.digitalreasoning.com.
Most university programming courses ignore entity extraction. Some professors talk about the challenges of identifying people, places, things, events, Social Security Numbers and leave the rest to the students. Other professors may have an assignment related to parsing text and detecting anomalies or bound phrases. But most of those emerging with a degree in computer science consign the challenge of entity extraction to the Miscellaneous file.
Entity extraction means processing text to identify, tag, and properly account for those elements that are names of persons, numbers, organizations, locations, and expressions such as a telephone number, among other items. An entity can consist of a single word like Cher or a bound sequence of words like White House. The challenge of figuring out names is a tough one for several reasons. Many names exist in richly varied forms. You can find interesting naming conventions in street addresses in Madrid, Spain, and for the owner of a falafel shop in Tripoli.
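The basic idea can be sketched in a few lines of Python. Pattern-matchable entity types like telephone numbers and Social Security Numbers are the easy end of the problem; the patterns and sample text below are invented for illustration and are not drawn from any vendor's system. Names, as noted above, are far harder than this sketch suggests.

```python
import re

# Toy patterns for two easy entity types. Real systems need statistical
# or linguistic models, because names vary far more than numbers do.
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def extract_entities(text):
    """Return (entity_type, matched_text) pairs found in the text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group()))
    return found

print(extract_entities("Call 555-867-5309; the SSN on file is 078-05-1120."))
# → [('PHONE', '555-867-5309'), ('SSN', '078-05-1120')]
```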
Entities, as information retrieval experts have learned since the first DARPA conference on the subject in 1987, are quite important to certain types of content analysis. Digital Reasoning has been working for more than 11 years on entity extraction and related content processing problems. Entity oriented analytics have become very important these days as companies deal with too much data, the need to understand the meaning, not just the statistics, of the data, and finally the need to understand entities in context, which is critical to understanding code terms and the like.
I want to highlight the six weaknesses of traditional entity extraction and contrast them with Digital Reasoning’s patented, fully automated method. Let’s look at the weaknesses.
1 Prior Knowledge
Traditional entity extraction systems assume that the system will “know” about the entities. This information has been obtained via training or specialized knowledge bases. The idea is that a system processes content similar to that which the system will process when fully operational. When the system is able to locate or a human “helps” the system locate an entity, the software will “remember” the entity. In effect, entity extraction assumes that the system either has a list of entities to identify and tag or a human will interact with various parsing methods to “teach” the system about the entities. The obvious problem is that when a new entity becomes available and is mentioned one time, the system may not identify the entity.
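The prior-knowledge weakness is easy to see in a minimal gazetteer-style sketch. This Python fragment is illustrative only; the entity list and the sample sentence are invented, and real systems use far more elaborate lookup and training machinery:

```python
# A gazetteer-style tagger: it can only find entities it already "knows".
KNOWN_ENTITIES = {
    "White House": "LOCATION",
    "Cher": "PERSON",
}

def tag_known(text):
    """Tag only the entities present in the knowledge base."""
    return [(name, label) for name, label in KNOWN_ENTITIES.items()
            if name in text]

# "Zornblatt", a name the system has never seen, goes untagged.
print(tag_known("Cher visited the White House with Zornblatt."))
# → [('White House', 'LOCATION'), ('Cher', 'PERSON')]
```

A new entity mentioned once, like the invented "Zornblatt" above, is invisible until a human or a retraining cycle adds it to the list.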
2 Human Inputs
I have already mentioned the need for a human to interact with the system. The approach is widely used, even in the sophisticated systems associated with firms such as Hewlett Packard Autonomy and Microsoft Fast Search. The problem with relying on humans is a time and cost equation. As the volume of data to be processed goes up, more human time is needed to make sure the system is identifying and tagging correctly. In our era of data doubling every four months, the cost of coping with massive data flows makes human intermediated entity identification impractical.
3 Slow Throughput
Most content processing systems talk about high performance, scalability, and massively parallel computing. The reality is that most of the subsystems required to manipulate content for the purpose of identifying, tagging, and performing other operations on entities are bottlenecks. What is the solution? Most vendors of entity extraction solutions push the problem back to the client. Most information technology managers solve performance problems by adding hardware to either an on premises or cloud-based solution. The problem is that adding hardware is at best a temporary fix. In the present era of big data, content volume will increase. The appetite for adding hardware lessens in a business climate characterized by financial constraints. Not surprisingly entity extraction systems are often “turned off” because the client cannot afford the infrastructure required to deal with the volume of data to be processed. A great system that is too expensive introduces some flaws in the analytic process.
September 20, 2011
Digital Reasoning empowers decision makers with timely, actionable intelligence by creating software that automatically makes sense of complex data.
Our flagship product, Synthesys®, solves the problem of achieving actionable intelligence out of massive amounts of unstructured and structured text . . . A typical customer might be trying to completely understand how to locate an individual within massive amounts of reports . . . Sifting through all this data to accurately develop this profile even among misspellings, aliases, code names, etc. is typically something that can only be done by reading. Our ability to automate understanding is critical to customers with concerns about time, accuracy, completeness, or even the ability to leverage the massive amount of data they have generated.
August 30, 2011
Microsoft is making a concerted effort to tackle natural language processing with its Redmond-based Natural Language Processing Group. The Microsoft page devoted to the group highlights current and older projects, downloads, and researchers involved.
The goal of the Natural Language Processing (NLP) group is to design and build software that will analyze, understand, and generate languages that humans use naturally, so that eventually you will be able to address your computer as though you were addressing another person. This goal is not easy to reach. “Understanding” language means, among other things, knowing what concepts a word or phrase stands for and knowing how to link those concepts together in a meaningful way.
Of particular interest are the recent publications authored by those in the group. Work includes everything from social media implementation, to multi-lingual Wikipedia content, to syntactic language modeling. The papers are well worth a read for anyone interested in the pressing field of natural language processing. Microsoft is definitely putting time and energy into the project, but it remains to be seen who of the tech giants will emerge the victor in the battle for natural language processing supremacy.
If you track NLP, including the newly minted azure chip consultants, you will want to monitor this aspect of Microsoft’s many, many search and text processing activities.
Emily Rae Aldridge, August 29, 2011
Sponsored by Pandia.com
August 29, 2011
Have you ever tried to find ink or toner for a not-so-new printer? The process can be confusing, and shoppers are unlikely to feel warm and fuzzy about any ink seller whose Web site only adds to the frustration.
One purveyor of ink and toner made a wise choice when it picked EasyAsk’s eCommerce Edition. EasyAsk asserts, “NetSuite Customer InkJet Superstore Jets Past Competitors Using EasyAsk Natural Language E-Commerce Search Software for SaaS.” The press release states,
Using EasyAsk eCommerce edition, InkJet Superstore has dramatically simplified finding the right printer cartridges and accessories, providing the easiest online experience for customers, increasing online orders and revenue. The news release said: “InkjetSuperstore.com sells toner and ink cartridges for virtually every make and model of printer, copier, and fax machine, with over 6,000 items. InkJet Superstore’s vision is clearly articulated on the company website: ‘To be the best, the easiest, the cheapest and friendliest place to buy printer accessories.’ To back this up, InkJet Superstore offers a 100% satisfaction guarantee, which includes paying for return shipping cost.
EasyAsk is helping InkJet Superstore deliver on its promises. Since the business implemented the solution, the site has had 80% fewer “no results” returns; increased order conversion rates by six percent; and decreased its phone calls and live chat requests, indicating that customers are more easily finding what they need.
The solution didn’t stop there. With its product catalog rapidly expanding, InkJet Superstore is taking advantage of EasyAsk’s auto-sync feature to assimilate new products into the Web site. Furthermore, rich analytics mine customer search terms for items that are in demand, suggesting potential new products.