Exclusive Interview: Paul Doscher, President of Lucid Imagination
April 16, 2012
The Search Wizards Speak features Paul Doscher, the new president of Lucid Imagination. Mr. Doscher joined Lucid Imagination in December 2011. He had been president of Dassault Exalead USA prior to assuming the top spot at fast-growing, customer- and community-centric Lucid Imagination.
I spoke with Mr. Doscher when he was working for the Dassault Exalead organization. When he shifted to Lucid Imagination, I spoke with him about his views of open source search. After that brief initial conversation, I met again with Mr. Doscher and probed into his views about the impact open source search is having on traditional for-fee, proprietary search systems.
When I asked about the shift from proprietary search systems to open source search, he told me:
Today organizations need the flexibility to adapt and make changes. A proprietary solution may not permit the licensee to make enhancements. If a change is made, the proprietary search vendor may “own” the fix and will add that innovation to its core product. The licensee who created the fix gets nothing and may have had to pay for the right to innovate. As corporate information technology struggles to keep up with escalating business information demands and an ever increasing mountain of growing content of all types, open source search provides a cost effective and efficient way to develop applications to address the challenges and opportunities in today’s enterprise.
Mr. Doscher has strong views about how licensees of enterprise search systems have learned about costs, the time required to deploy a system, and the effort needed to keep a search system up and running. I asked him about Lucid Imagination’s approach to a search engagement. He said:
Our approach to an engagement is to listen to what our customers need, prepare an action plan, and then deliver. In a sense, our approach is the type of involvement that many software companies have stepped away from. We have an enthusiastic group of engineers and professionals who work with clients to meet their needs.
The full text of the interview appears on the ArnoldIT.com Web site. For more information about Lucid Imagination’s open source search system, you will want to explore the company’s Web site and its blog. In addition, an interview with one of the founders of Lucid Imagination, Marc Krellenstein, and with Eric Gries, a former executive at Lucid Imagination, is available in the Beyond Search archives.
Stephen E Arnold, April 16, 2012
Sponsored by Pandia.com
Inteltrax: Top Stories, March 26 to March 30
April 2, 2012
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, the ways in which unstructured data is impacting the big data industry.
Our feature story this week, “Digital Reasoning Makes Major Move in Military,” shows how the leader in unstructured data wrangling is helping the military increase its reach.
“Unstructured Data Demands Right Tools” proves that not all unstructured data software is created equal. That’s not a bad thing; it’s just a shell game for users to find the right one for their needs.
“Governments Get Self Conscious with Analytics” showed how clever government agencies are clearing up inaccuracies and becoming more efficient by utilizing the massive collections of unstructured data lingering in their systems.
If you aren’t familiar with the term “unstructured data,” you will be. It’s the big horizon in the analytics world. We, fortunately, are well versed in the ephemeral stuff. It’s going to change the way the entire industry works and we’ll be following it every day.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax.
April 2, 2012
The Invisibility of Open Source Search
March 27, 2012
I was grinding through my files and I noticed something interesting. After I abandoned the Enterprise Search Report, I shifted my research from search and retrieval to text processing. With this blog, I tried to cover the main events in the post-search world. The coverage was more difficult than I anticipated, so we started Inteltrax, which focuses on systems, companies, and products which “find” meaning using numerical recipes. But that does not do enough, so we are contemplating two additional free information services about “findability.” I am not prepared to announce either of these at this time. We have set up a content production system with some talented professionals working on our particular approach to content. We are also producing some test articles.
Until we make the announcement, I want to reiterate a point I made in my talks in London in 2011 about open source search and content processing:
Most reports about enterprise search ignore open source search solution vendors. A quiet revolution is underway, and for many executives, the shift is all but invisible.
We think that the “invisible” nature of the open source search and content processing options is due to four factors:
Most of the poobahs, self appointed experts and former home economics majors have never installed, set up, or optimized an open source search system. Here at ArnoldIT we have that hands on experience. And we can say that open source search and content processing solutions are moving from the desks of Linux wizards to more mainstream business professionals.
Next, we see more companies embracing open source, contributing to the overall community with bug fixes and new features and functions. At the same time, the commercial enterprises are “wrapping” open source with proprietary, value-added features and functions. The leader in this movement is IBM. Yep, good old Big Blue is an adherent of open source software. Why? We will try to answer this in our new information services.
Third, we think the financial pressure on organizations is greater than ever. CNBC and the Murdoch outfitted Wall Street Journal are cheering for the new economic recovery. We think that most organizations are struggling to make sales, maintain margins, and generate new opportunities. Open source search and content solutions promise some operating efficiencies. We want to cover some of the business angles of the open source search and content processing shift. Yep, open source means money.
Finally, the big solutions vendors are under a unique type of pressure. Some of it comes from licensees who are not happy with the cost of “traditional” solutions. Other pressure comes from the data environment itself. Let’s face it. Certain search systems such as your old and dusty version of IBM STAIRS or Fulcrum won’t do the job in today’s data and information rich environment. New tools are needed. Why not solve a new information problem without dragging the costs, methods, and license restrictions of traditional enterprise software along for the ride? We think change is in the wind just like the smell of sweating horses a couple of months before the Kentucky Derby.
Our approach to information in our new services will be similar to that taken in Beyond Search. We want to provide pointers to useful write ups and offer some comments which put certain actions and events in a slightly different light. Will you agree with the information in our new services? We hope not.
Stephen E Arnold, March 27, 2012
Are We Ready for More Watson?
March 13, 2012
No, here at Beyond Search we are ready for a live online demo. We are not ready for more PR, however.
On the one-year anniversary of its creation, Fern Halper’s Data Makes the World Go ‘Round has provided inquisitive minds with a rundown of IBM’s supercomputer, Watson, in the article “Are You Ready for IBM Watson?”
The blog post is broken down into three informative sections, entitled “What is Watson and why is it unique?” “Can I buy Watson?” and finally, “Ready for Watson?”
For those who are unfamiliar with this device, Watson is a set of technologies that process and analyze massive amounts of both structured and unstructured data in a unique way. A noteworthy fact: Watson can analyze information from 200 million books in three seconds.
Consumers are unable to purchase Watson at this time, however:
“IBM recently rolled out its “Ready for Watson.” The idea is that a move to Watson might not be a linear progression. It depends on the business problem that companies are looking to solve. So IBM has tagged certain of its products as “ready” to be incorporated into a Watson solution. IBM Content and Predictive Analytics for Healthcare is one example of this. It combines IBM’s content analytics and predictive analytics solutions that are components of Watson.”
Some people are nervous while others are excited about what the mass production of Watson could bring. Are we ready for Watson? How about that live online demo?
Jasmine Ashton, March 13, 2012
IBM Watson Fights Cancer in 2012
January 31, 2012
All Things D recently reported on IBM’s new super computer Watson in the article “Seven Questions With IBM’s Manoj Saxena About Watson and Cancer.”
According to the article, IBM is planning on using Watson as a reference tool to assist human physicians in the treatment of breast, lung and colon cancer.
The write up provides readers with the text from an interview between All Things D and Manoj Saxena, general manager of the Watson program at IBM, to talk about what Watson will — and won’t — be doing in helping doctors treat humans with cancer, and what that might mean for the future of medicine.
Asked whether Watson will be directly involved in treatment, Saxena replied:
Watson doesn’t make the decisions. It’s a physician’s assistant. But before it becomes that, it has a lot to learn. Out of the box, Watson has the knowledge of a first-year medical resident. That is where it’s at today. With Cedars-Sinai and Wellpoint, we’re going to teach it all about cancer during the next six months.
Watson still has a lot to learn to be able to be utilized to solve the problem of cancer. After utilizing the super computer in the medical field, IBM plans to apply Watson to financial services. We look forward to more public relations about Watson, the smart search system. It would be interesting to have an online demonstration using the IBM patent corpus or Wikipedia content. Talk is one thing. Test queries are quite another.
Jasmine Ashton, January 31, 2012
So Where Are You? Come Here! Hello, Hello
January 24, 2012
Maybe it is I, but fatigue washes over me each time I read about Lucene-based Watson. Quite a bit of PR chatter and very little online demo.
All Things D recently reported on IBM’s new super computer Watson in the article “Seven Questions With IBM’s Manoj Saxena About Watson and Cancer.”
According to the article, IBM is planning on using Watson as a reference tool to assist human physicians in the treatment of breast, lung and colon cancer.
The write up provides readers with the text from an interview between All Things D and Manoj Saxena, general manager of the Watson program at IBM, to talk about what Watson will — and won’t — be doing in helping doctors treat humans with cancer, and what that might mean for the future of medicine.
Asked whether Watson will be directly involved in treatment, Saxena replied:
Watson doesn’t make the decisions. It’s a physician’s assistant. But before it becomes that, it has a lot to learn. Out of the box, Watson has the knowledge of a first-year medical resident. That is where it’s at today. With Cedars-Sinai and Wellpoint, we’re going to teach it all about cancer during the next six months.
A couple of observations:
- NLP on a medical corpus is a piece of cake compared to colloquial blog posts in Arabic
- Talk is not search; talk is baloney
- Wrapping layers of code around Lucene does not inspire confidence in high speed throughput on large content collections.
- Updates? And what about updates?
Watson still has a lot to learn to be able to be utilized to solve the problem of cancer. After utilizing the super computer in the medical field, IBM plans to apply Watson to financial services.
Jasmine Ashton, January 24, 2012
IBM Public Relations Chugs Away on Watson and Health Care
October 2, 2011
IBM continues to pitch the potential of its supercomputer Watson. IBM sees Watson as the health professional’s Florence Nightingale with a degree in library science and a PhD in health insurance statistics. Computerworld reported on this development with the article “IBM’s Watson to Diagnose Patients.”
The idea is that two is always better than one, and it’s not just IBM that is in on this idea: WellPoint, Blue Cross Blue Shield’s largest health plan, has agreed to create joint applications.
These applications are rooted in evidence-based medicine and have the capacity to use Watson’s search analysis to review all prior cases, clinical know-how, and medical literature to assist doctors in diagnosing and treating their patients.
In helping us understand what makes this possible, Computerworld outlines the specs on Watson in their article:
Watson consists of 90 IBM Power 750 Express servers powered by 8-core processors — four in each machine — for a total of 32 processors per machine. The servers are virtualized with a kernel-based virtual machine scheme, resulting in a server cluster with a total processing capacity of 80 teraflops. (A teraflop is one trillion operations per second.)
Our local hospital still uses big fat monitors and clunky monospaced input forms. No supercomputers in sight. In fact, the hospital riffed 500 health professionals with more cuts coming. Watson may be just what the doctor ordered, right?
Clinical pilots early next year will be the first step before Watson becomes the norm in your physician’s office. As IBM continues to hammer away at this project, they’re definitely not putting a dent in their reputation.
Sounds good. Now the cost, the infrastructure, and the upkeep? No problem, of course. My doctor wants a new Audi convertible. She didn’t mention Watson as a must-have the last time I spoke with her.
Megan Feil, October 2, 2011
Natural Language, a Solution Whose Time Has Come
September 29, 2011
Editor’s Note: The Beyond Search team invited Craig Bassin, president of EasyAsk, a natural language processing specialist and search solution provider, to provide his view of the market for next generation search systems. For more information about EasyAsk, navigate to www.easyask.com
This past February I watched, along with millions of others, IBM’s spectacular launch of Watson on Jeopardy! Watson was IBM’s crowning achievement in developing a Natural Language based solution finely tuned to compete, and win, on Jeopardy.
By IBM’s own estimates they invested between $1 and $2 billion to develop Watson. IBM ranks Watson as one of the 3 most difficult challenges in their long and successful history, along with spectacular accomplishments such as the Deep Blue chess program and the Blue Gene, Human Genome mapping program. Rarified air, indeed.
While many were watching to see if a computer could defeat human players my interests were different. Watson was about to introduce natural language solutions to the broader public and show the world that such solutions are truly the wave of the future.
The results were historic. Watson soundly defeated the human competitors. On the marketing side, IBM continues to spend hundreds of millions of dollars to tell the world that the time for natural language is now.
IBM is not the only firm to bring natural language processing (NLP) into the application mainstream:
- Microsoft acquired Powerset, a small company with strong NLP technology, to create Bing and compete head-on with Google,
- Yahoo, one of the original Internet search companies, found Bing compelling enough to strike an OEM agreement with Microsoft and make Bing Yahoo’s search solution,
- Apple acquired a linguistic natural language interface tool called Siri, which is now being incorporated into the Mac and iPhone operating systems,
- Oracle Corporation bought Inquira for its NLP-based customer support solution,
- RightNow Technologies similarly acquired Q-Go, a Dutch company also providing NLP-based customer support solutions.
Many companies are now positioning themselves as natural language tools and have expanded the once tight definition of NLP to include things such as analyzing text to understand intent or sentiment. This is the impact of Watson – it has put natural language into the mainstream and many organizations want to ride the marketing current driven by Watson regardless of how closely aligned their technology is with Watson.
But let’s also look at Watson for what it really is – one of the most expensive custom solutions ever built. Watson required an extremely large (and expensive) cluster of computers to run – 90 Power Server 750 systems, totaling 2,880 processor cores. It also required substantial R&D staff to build the analytics, content and natural language processing software stack. In fact, IBM didn’t come to Jeopardy, Jeopardy came to IBM. They replicated the Jeopardy set at IBM labs, placing a great deal of horsepower underneath that stage.
The first foray of Watson into the real world will be in healthcare and the possibilities are exciting. Clearly IBM intends to focus Watson on some of the largest, most difficult challenges. But how does that help you run your business? You’re not going to see Watson running in your IT environment or on your preferred SaaS cloud anytime soon.
If Watson is focused on big problems, how can I use natural language solutions to better my business today? Perhaps you want to increase website customer conversion and user experience, better manage sales processes, deliver superior customer support, or in general, make it easier for your workers to find the right information to do their job. So where do you go?
That’s where EasyAsk comes in.
IBM, Natural Language PR, and Television
September 13, 2011
I read “IBM and Jeopardy! Relive History with Encore Presentation of Jeopardy!: The IBM Challenge.” Frankly with TV in the summer slump, a reprise of the competition between IBM and humans is not likely to kick off the fall TV season with a bang. Reruns are, in fact, recycled information.
The idea is that on September 12, 13, and 14, 2011, I can watch humans match wits with IBM’s natural language search system, Watson.
Now Watson, based on what I have heard, is quite a lot of Lucene (an open source search system which IBM uses in its OmniFind 9.x product), and an extremely large database of analytics and content. To some degree it is not too different from having Wikipedia on your hard drive with IBM’s highly customized proprietary software.
To make Watson work, IBM needed three key ingredients: (a) very large systems – ninety IBM Power 750 servers with four 8-core processors each (2,880 cores total!), (b) numerous engineers from the IBM R&D Labs, and (c) an army of technicians to babysit the machine and database. Watson does not understand spoken speech, but like the computer on my desk, Watson can accept typed inputs. Watson also does not work from my iPad or my mobile phone.
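For readers who want to check the hardware arithmetic, here is a minimal sketch in Python. The three figures (90 servers, four processors per server, eight cores per processor) come from the paragraph above; the variable and function names are made up for illustration and have nothing to do with IBM's software.

```python
# Figures quoted in the post: 90 IBM Power 750 servers,
# four processors per server, eight cores per processor.
SERVERS = 90
PROCESSORS_PER_SERVER = 4
CORES_PER_PROCESSOR = 8


def total_cores(servers, processors_per_server, cores_per_processor):
    """Multiply out the hardware hierarchy to get the cluster-wide core count."""
    return servers * processors_per_server * cores_per_processor


# Cores inside a single server (hypothetical helper value for the check).
cores_per_server = PROCESSORS_PER_SERVER * CORES_PER_PROCESSOR

# Cores across the whole cluster.
watson_cores = total_cores(SERVERS, PROCESSORS_PER_SERVER, CORES_PER_PROCESSOR)

print(cores_per_server)  # 32
print(watson_cores)      # 2880
```

Each server carries 32 cores, and 90 such servers give the 2,880 cores cited above.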
While it is a solid achievement and nice step forward for Natural Language, the reality is that Watson is pretty much a raw demo from the R&D labs, and the Jeopardy! angle is an expensive and somewhat amusing marketing play. Not many Jeopardy! watchers are going to license IBM’s natural language processing technology. A better question is, “How many Jeopardy! watchers know what natural language processing is anyway?”
The problem for me is that television is not the real world. Reality shows are loosely scripted. When I see a commercial television production, the operative word is postproduction. The video wizards snip and segue to make the talent and the floor personnel, the writers, the sound team, the videographers, and the teleprompter operator fuse as a seamless whole.
But let’s look at Watson’s impact in the commercial software world. Jeopardy! is a show and the Watson system and content is highly customized to the show. In applying Watson to the real world, I personally have some doubts about how “smart” Watson is. Watson has potential but clearly needs much more work to prove it can be applied to everyday business problems due to the immaturity of the technology and the tremendously high cost of the systems and the databases involved.
Watson Is Back from Medical Leave to Your Enterprise?
September 13, 2011
You know Watson. The software system that wraps Lucene so it can answer questions on a scripted reality TV game show. “Lights. Camera. Action. Retake. Can it.” That’s reality TV when it meets super searching technology from IBM.
EWeek thinks that “IBM’s Watson Could Shake Up The Enterprise.” Craig Rhinehart, head of IBM’s enterprise content management (ECM) division, says using the technology behind the Jeopardy!-winning machine for ECM is a no-brainer. The wealth of unstructured data that companies have been collecting is the perfect target for Watson’s decision engine. Writer David Jamieson comments,
This unstructured data – in human language, not computer language – represents about 80 percent of the data hoarded by the enterprise, according to Rhinehart. And there is valuable information contained within that NLP [Natural Language Processing] can unlock, revealing trends and insights to be acted upon.
Rhinehart emphasizes that Watson’s technology is not a search tool, but a “decision support tool.” This means users have to refine their queries, but in exchange they get the advanced natural language algorithms Watson is famous for. With them, companies can make important correlations that lead to increased revenue and/or decreased losses. Areas that could particularly benefit are ones brimming with data, like healthcare and law.
Does “decision support” sound like Microsoft’s “decision engine” angle? It does to us.
IBM has much invested in content management and analytics, and the technology behind Watson is the prize. One may be tempted to dismiss the whole Jeopardy! appearance as a publicity gimmick. What’s the promotional hook for the enterprise? A free mainframe with Lotus Notes with every Watson licensed? If so, we want one. We can decide without asking Watson too.
Cynthia Murrell, September 13, 2011