NetBase and Content Intelligence

April 30, 2009

Vertical search is alive and well. Technology Review described NetBase’s Content Intelligence here. The story, written by Erica Naone, was “A Smarter Search for What Ails You”. Ms. Naone wrote:

organizes searchable content by analyzing sentence structure in a novel way. The company created a demonstration of the platform that searches through health-related information. When a user enters the name of a disease, he or she is most interested in common causes, symptoms, and treatments, and in finding doctors who specialize in treating it, says Netbase CEO and cofounder Jonathan Spier. So the company’s new software doesn’t simply return a list of documents that reference the disease, as most search engines would. Instead, it presents the user with answers to common questions. For example, it shows a list of treatments and excerpts from documents that discuss those treatments. The Content Intelligence platform is not intended as a stand-alone search engine, Spier explains. Instead, Netbase hopes to sell it to companies that want to enhance the quality of their results.

NetBase (formerly Accelovation) has developed a natural language processing system. Ms. Naone reported:

NetBase’s software focuses on recognizing phrases that describe the connections between important words. For example, when the system looks for treatments, it might search for phrases such as “reduce the risk of” instead of the name of a particular drug. Tellefson notes that this isn’t a matter of simply listing instances of this phrase, rather catching phrases with an equivalent meaning. Netbase’s system uses these phrases to understand the relationship between parts of the sentence.
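To make the idea in that passage concrete, here is a minimal sketch of phrase-based relation spotting. It is not NetBase's method, which I have not seen. The pattern list, the "treats" label, and the function name are my own assumptions, and a handful of regular expressions is a toy stand-in for real sentence analysis.

```python
import re

# Hypothetical relation patterns: each label maps to phrasings that carry
# roughly the same meaning. The list is illustrative, not NetBase's.
RELATION_PATTERNS = {
    "treats": [
        r"reduces? the risk of",
        r"lowers? the (?:risk|chance) of",
        r"is used to treat",
        r"helps? prevent",
    ],
}


def extract_relations(sentence):
    """Return (subject, relation, object) triples found in one sentence."""
    triples = []
    for label, phrasings in RELATION_PATTERNS.items():
        for phrasing in phrasings:
            match = re.search(phrasing, sentence, flags=re.IGNORECASE)
            if match:
                # Text before the phrase is treated as the subject,
                # text after it as the object.
                subject = sentence[:match.start()].strip(" ,.")
                obj = sentence[match.end():].strip(" ,.")
                if subject and obj:
                    triples.append((subject, label, obj))
                break  # one matching phrasing per label is enough
    return triples


print(extract_relations("Aspirin reduces the risk of heart attack."))
print(extract_relations("Daily exercise lowers the chance of stroke."))
```

Both sentences yield a "treats" triple even though the wording differs, which is the point of matching on the connecting phrase rather than on a drug name.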

At this point in the write up, I heard echoes of other vendors with NLP, semantics, bound phrase identification, etc. Elsevier has embraced the system for its illumin8 service. You can obtain more information about this Elsevier service here. Illumin8 asked me, “What if you could become an expert in any topic in a few minutes?” Wow!

The NetBase explanation of content intelligence is:

… understanding the actual “meaning” of sentences independent of custom lexicons. It is designed to handle myriads of syntactical sentence structures – even ungrammatical ones – and convert them to logical form. Content Intelligence creates structured semantic indexes from massive volumes of content (billions of web-pages and documents) used to power question-and-answer type of search experiences.

NetBase asserts:

Because NetBase doesn’t rely on custom taxonomies, manual annotations or coding, the solutions are fully automated, massively scalable and able to be rolled-out in weeks with a minimal amount of effort. NetBase’s semantic index is easy to keep up-to-date since no human editing or updates to controlled vocabulary are needed to capture and index new information – even when it includes new technical terms.

Let me offer several observations:

  • The application of NLP to content is not new and it imposes some computational burdens on the search system. To minimize those loads, NLP is often constrained to content that contains a restricted terminology; for example, medicine, engineering, etc. Even with a narrow focus, NLP remains interesting.
  • “Loose” NLP can squirm around some of the brute force challenges, but it is not yet clear if NLP methods are ready for center stage. Sophisticated content processing often works best out of sight, delivering to the user delightful, useful ways to obtain needed information.
  • A number of NLP systems are available today; for example, Hakia. Microsoft snapped up PowerSet. One can argue that some of the Inxight technology acquired first by Business Objects then by the software giant SAP is an NLP system. To my knowledge, none of these has scored a hat trick in revenue, customer uptake, and high volume content processing.

You can get more information about NetBase here. You can find demonstrations and screenshots. A good place to start is here. According to TechCrunch:

NetBase has been around for a while. Originally called Accelovation, it has raised $9 million in two rounds of venture funding over the past four years, has 30 employees…

In my files, I had noted that the funding sources included Altos Ventures and ThomVest, but these data may be stale or just plain wrong. I don’t have enough information about Netbase to offer substantive comments. NLP requires significant computing horsepower. I need to know more about the plumbing. Technology Review provided the sizzle. Now we need to know about the cow from which the prime rib comes.

Stephen Arnold, April 30, 2009

Digital Reef: A Similarity Search Engine

March 9, 2009

Straightaway, there are two “digital reefs”. One is an elearning company. The other–www.digitalreefinc.com–is a content processing company. In my notes, I described the company as offering an “unstructured data management platform.” The headline on the content processing company’s Web site here is “massively scalable”, which is a good thing. The company, according to my notes, was originally Auraria Networks. When an infusion of venture funding arrived, the Digital Reef name was adopted. I’m grateful. I didn’t know how to spell or pronounce “Auraria”. I filed the company under Aura, which was close enough for horseshoes.

Organizations are awash in data, and most are clueless about the nuggets within and about the potential risks the data contain. To get a peek under the hood, you will want to download the company’s white paper here. The document is 13 pages long. You can review it at your leisure. The company’s news release here said:

Digital Reef (www.digitalreefinc.com), one of Matrix Partners’ and Pilot House Ventures’ premier portfolio companies, today announces a new approach to discovering and managing unstructured and semi-structured data. The Digital Reef solution helps large enterprises deal with key business issues that cannot be properly addressed using traditional solutions. These issues include eDiscovery, data risk mitigation, knowledge reuse, and strategic storage initiatives—all of which stem from lack of control over unstructured data, and require a degree of scalability and performance that traditional solutions cannot provide.

The company’s system “was designed to rapidly address very large stores of unstructured data, without manual effort or disruption to data center or business activity.” With the company’s analysis and classification tools, a licensee can:

  • Locate specific kinds of data, including sensitive data like Social Security and credit card numbers
  • Identify regulated data for compliance
  • Pinpoint relevant documents for pending legal action
  • Find intellectual property that can be reused for competitive advantage.
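The first bullet is the easiest to picture in code. What follows is a rough sketch of pattern-based detection, not Digital Reef's implementation, about which I have no details. The regular expressions and function names are my own assumptions. The Luhn checksum is the standard check-digit test for payment card numbers, and it is used here only to cut down on false hits.

```python
import re

# Assumed formats: SSNs written as 123-45-6789; card numbers as 13 to 19
# digits, optionally separated by single spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text):
    """Return (kind, matched_text) pairs for SSN-like and card-like strings."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            hits.append(("card", m.group()))
    return hits


print(find_sensitive("SSN 123-45-6789, card 4111 1111 1111 1111."))
```

A scan like this is cheap per document. The hard part the vendor is selling is running it, and far more elaborate classification, across very large stores of files.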

The company’s Web log with posts from founder and president Steve Askers (a former Lucent executive) is here. Entries are sparse at this time.

Despite the lousy economy, new entrants continue to pursue the content processing sector. With each new system, I chuckle when I read about “simple” and “stabile” market conditions. Crazy. I don’t have screenshots in my files nor do I have pricing. On the surface, Digital Reef seems to offer tools that overlap with Inxight Software‘s and Megaputer‘s offerings. I will add the company to my watch list.

Stephen Arnold, March 9, 2009

Google: A Powerful Mental Eraser

October 23, 2008

Earlier today I learned that a person who listened to my 20 minute talk at a small conference in London, England, heard one thing only–Google. I won’t mention the name of this person, who has an advanced degree and is sufficiently motivated to attend a technical conference.

What amazed me were these points:

  1. The attendee thought I was selling Google’s eDiscovery services
  2. I did not explain that organizations require predictive services, not historical search services
  3. I failed to mention other products in my talk.

I looked at the PowerPoint deck I used to check my memory. At age 64, I have a tough time remembering where I parked my car. Here’s what I learned from my slide deck.

Mention Google and some people in the audience lose the ability to listen and “erase” any recollection of other companies mentioned or any suggestion that Google is not flawless. Source: http://i265.photobucket.com/albums/ii215/Katieluvr01/eraser-2.jpg.

First, I began with a chart created by an SAS Institute professional. I told the audience the source of the chart and pointed out the bright red portion of the chart. This segment of the chart identifies the emergence of the predictive analytics era. Yep, that’s the era we are now entering.

Second, I reviewed the excellent search enabled eDiscovery system from Clearwell Systems. I showed six screen shots of the service and its outputs. I pointed out that attorneys pay big sums for the Clearwell System because it creates an audit trail so queries can be rerun at any time. It generates an email thread so an attorney can see who wrote whom when and what was said. It creates outputs that can be submitted to a court without requiring a human to rekey data. In short, I gave Clearwell a grade of “A” and urged the audience to look at this system for competitive intelligence, not just eDiscovery. Oh, I pointed out that email comprises a larger percentage of content in eDiscovery than it has in previous years.

Read more

Powerset’s Approach to Search

October 6, 2008

Powerset was acquired by Microsoft for about $100 million in June 2008. I haven’t paid too much attention to what Microsoft has done or is doing with the Powerset semantic, natural language, latent semantic indexing, et al system it acquired. A reader sent me a link to Jon Udell’s Web log interview that focuses on Powerset. If you want to know more about how Microsoft will leverage the aging Xerox Parc technology, you will want to click here to get an introduction to the Perspectives interview conducted on September 30, 2008, with Scott Prevost. You will need to install Silverlight, or you can read the interview transcript here.

I can’t summarize the lengthy interview. For me, three points were of particular interest:

  1. The $100 million bought Powerset, but Microsoft had to then license the Xerox Parc technology. You can get some “inxight” into the functions of the technology by exploring the SAP/ Business Objects’ information here.
  2. The Powerset technology can be used with both structured and unstructured information.
  3. Microsoft will be doing more work to deliver “instant answers”.

A happy quack to the reader who sent me this link, and two quacks for Mr. Udell for getting some useful information from Scott Prevost. I am curious about the roles of Barney Pell (Powerset founder) and Ron Kaplan (Powerset CTO and former Xerox Parc wizard) in the new organization. If anyone can shed light on this, you too will warrant a happy quack.

Stephen Arnold, October

Another Extract from the Hartmann-Communicatie Interview

August 24, 2008

I received a couple of requests for additional extracts from my interview with Eric Hartmann, who is sponsoring a conference about content management and content processing in Utrecht in September 2008. You can obtain more information about the program here. Here are three more snippets from the interview. The question is in bold. My response is in normal weight.

Everybody who’s talking about search has Google on his mind. Is that good or bad?

I have written two detailed studies of Google, The Google Legacy in 2005 and Google Version 2.0 in 2007. Google is an important company because it legitimized an alternative to desktop applications and on premises enterprise solutions. Along the way, Google changed the Web search landscape, dominated online advertising, and pushed its snout into telephony, online payments, publishing, and several other major non-search market sectors.

Google now has 70 percent of search traffic in North America. In Denmark and Germany, Google’s share of the search market is over 90 percent.

There’s a lot of talk about Google, but there is not much understanding of how the company’s strategy of disruption works, its business model options, or its potential to move into non search markets without warning.

Google’s also important because innovators are learning from the Google model. People who quit Google to start a new company—what are called Xooglers—build on the ideas made concrete by Google. As a result, Google the company could go out of business. But Google the model will have a continuing impact for many years.

On the hot seat.

After several take-overs, the market of enterprise search parties has somewhat shrunk. What’s your view on the investment and revenue opportunities?

That’s a good question. On the surface, it looks as if search companies are selling out. For example, Lexalytics has fused with a UK company. Powerset sold out to Microsoft. Fast Search also accepted a Microsoft offer. SAS Institute bought Teragram. Business Objects (now part of SAP) purchased Inxight Software.

However, there’s investment as well. Intel and SAP pumped $14 million into Endeca. I have worked on a couple of investments in search and content processing systems not yet announced to the public.

In my files I have the names of more than 300 companies engaged in search, text analytics, and content processing. The search sector is quite active even in the present economic climate.

The reason is that many people think, “If Google did it, so can we”. I don’t see any let up in search activity for the foreseeable future. Most search systems are not so good; therefore, there’s a big payday in the enterprise market. There’s a growing suspicion that Google may not be everyone’s idea of “My Favorite Monopoly”.

The search space is still like two or three interacting magnetic fields. It’s dynamic, unpredictable, and exciting to some.

What can we expect from Google, Microsoft, Autonomy and other parties?

These companies are good at keeping secrets and each is willing to sue anyone who provides highly specific information about what’s next from their creative ovens. I can offer some high level opinions with the caveat that my hunches may not be what these outfits actually do.

Autonomy. This company is morphing from search into a different type of information solutions company. When I look at the range of products on offer, I see a mini solutions conglomerate, not a search or content processing company. For example, fraud detection may or may not involve words. Fraud detection focuses on patterns in data, not search. Another example is the company’s video solutions. Search plays a part, but Autonomy offers a more robust way for an organization to manipulate its rich content. On the strength of its non search businesses, Autonomy seems poised to grow to $300 million or more in revenue. This is a great achievement, but it is not a pure search play.

Google is a bit of a mystery to me. The company has some interesting patent documents and fascinating demonstration services. Google is content to collect billions from online advertising and sit on its hands as Amazon, Salesforce.com, and other companies push aggressively into cloud services. Google makes money from ads, but I am reluctant to say, “Google is a search company.” Google is an applications platform. Search and advertising are a couple of popular applications, not the whole company.

Microsoft is quite interesting to me. I think the fate of Microsoft will be to end up as an applications company, a game company, and a server company. Microsoft wants to have an online company like Google, but I don’t think it can achieve this unless it shatters itself and then starts online without the baggage from the past. In terms of search, Microsoft is a me-too squared company. Google is deeply duplicative of AltaVista.com, Overture.com, and Microsoft.com. Microsoft, oddly enough, is trying to duplicate Google which has duplicated part of Microsoft. Copies of copies get blurry, so Microsoft lacks focus in its search efforts across its very different business units. The Microsoft money comes from upgrades to operating systems and applications. I think the company has a struggle for the foreseeable future.

Stephen Arnold, August 24, 2008

Powerset as Antigen: Can Google Resist Microsoft’s New Threat

August 20, 2008

I found the write ups about Satya Nadella’s observations about Microsoft’s use of the Powerset technology in WebProNews, Webware.com, and Business Week magnetizing. Each of these write ups converged on a single key idea; namely, Microsoft will use the Powerset / Xerox PARC technology to exploit Google’s inability to tailor a search to deliver a better search experience to a user. The media attention directed at a conference focused on generating traffic to a Web site without regard to the content on that site, its provenance, or its accuracy is downright remarkable. Add together the assertion that Powerset will hobble the Google, and I may have to extend my anti-baloney shields another 5,000 kilometers.

Let’s tackle some realities:

  1. To kill Google, a company has to jump over, leap frog, or out innovate Google. Using technology that dates from the 1990s, poses scaling challenges, and must be “hooked” into the existing Microsoft infrastructure is a way to narrow a gap, but it’s not enough to do much to wound, impair, or kill Google. If you know something about the Xerox PARC technology that I’m missing, please, tell me. I profiled Inxight Software in one of my studies. Although different from Xerox PARC technology used by Powerset, it was close enough to identify some strengths and weaknesses. One issue is the computational load the system imposes. Maybe I’m wrong but scaling is a big deal when extending “context” to lots of users.
  2. Microsoft is slipping further behind Google. The company is paying users, and it is still losing market share. Read my short post on this subject here. Even if the data are off by an order of magnitude, Microsoft is not making headway in the Web search market share.
  3. Cost is a big deal. Microsoft appears to have unlimited resources. I’m not so sure. If Google’s $1 of infrastructure investment buys 4X the performance that a Microsoft $1 does, Microsoft has an infrastructure challenge that could cost more than even Microsoft can afford.

So, there are computational load issues. There are cost issues. There are innovation issues. There are market issues. I must be the only person on the planet who is willing to assert that small scale search tweaks will not have the large scale effects Microsoft needs.

Forget the assertion that Business Week offers when it says that Google is moving forward. Google is not moving forward; Google is morphing into a different type of company. “Moving forward” only tells part of the story. I wonder if I should extend my shields of protection to include filtering baloney about search emanating from a conference focused on tricking algorithms into putting a lousy site at the top of a results list.

Agree? Disagree? I’m willing to learn if my opinions are scrambled.

Stephen Arnold, August 20, 2008

Answering Questions: Holy Grail or Wholly Frustrating

July 2, 2008

The cat is out of the bag. Microsoft has acquired Powerset for $100 million. You can read the official announcement here. The most important part of the announcement to me was:

We know today that roughly a third of searches don’t get answered on the first search and first click…These problems exist because search engines today primarily match words in a search to words on a webpage [sic]. We can solve these problems by working to understand the intent behind each search and the concepts and meaning embedded in a webpage [sic]. Doing so, we can innovate in the quality of the search results, in the flexibility with which searchers can phrase their queries, and in the search user experience. We will use knowledge extracted from webpages [sic] to improve the result descriptions and provide new tools to help customers search better.

I agree. The problem is that delivering on these results is akin to an archaeologist finding the Holy Grail. In my experience, delivering “answers” and “better results” can be wholly frustrating. Don’t believe me? Just take a look at what happened to AskJeeves.com or any of the other semantic / natural language search systems. In fact, doubt is not evident in the dozens of posts about this topic on Techmeme.com this morning.

So, I’m going to offer a different view. I think the same problems will haunt Microsoft as it works to integrate Powerset technology into its various Live.com offerings.

Answering Questions: Circa 1996

In the mid 1990s, Ask Jeeves differentiated itself from the search leaders with its ability to answer questions. Well, some questions. The system worked for this query which I dredged from my files:

What’s the weather in Chicago, Illinois?

At the time, the approach was billed as natural language processing. Google does not maintain comprehensive historical records in its public-facing index. But you can find some information about the original system here or in the Wikipedia entry here.

How did a start up in the mid-1990s answer a user’s questions online? Computers were slow by today’s standards and expensive. Programming was time consuming. There were no tools comparable to Python or Web services. Bandwidth was expensive, and modems chugged along south of 56 kilobits per second, eagerly slowing down in the course of a dial up session.

I have no inside knowledge about AskJeeves.com’s technology, but over the years, I have pieced together some information that allows me to characterize how AskJeeves.com delivered NLP (natural language processing) magic.

Humans.

AskJeeves.com compiled a list of frequently asked questions. Humans wrote answers. Programmers put data into database tables. Scripts parsed the user’s query and matched it to the answers in the tables. The real magic, from my point of view, was that AskJeeves.com updated the weather table, so when the system received my query “What is the weather in Chicago, Illinois?”, the system would pull the data from the weather table and display an answer. The system also showed links to weather sites in case the answer part was incorrect or not what the user wanted.

Over time, AskJeeves.com monitored what questions users asked and added these to the system.

What happened when the system received a query that could not be matched to a canned answer in a data table? The system picked the closest question to what the user asked and displayed that answer. So a question such as “What is the square of aleph zero plus N?” generated an answer along the lines “The Cubs won the pennant in 1918?” or some equally crazy answer.
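Here is a minimal sketch of the mechanism as I have characterized it: a table of canned answers, a script that normalizes the query, and a fallback to the closest stored question. This is my reconstruction for illustration, not AskJeeves.com code. The table contents and function names are made up, and Python's standard difflib module stands in for whatever matching the original scripts used.

```python
import difflib

# Hypothetical answer table: normalized question -> canned answer.
ANSWER_TABLE = {
    "what is the weather in chicago illinois": "Current conditions pulled from the weather table",
    "who won the pennant in 1918": "Canned baseball answer",
}


def normalize(query):
    """Lowercase the query, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in query.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def answer(query, cutoff=0.0):
    """Return the canned answer for the closest stored question.

    With cutoff=0.0 the system always answers, however poor the match.
    A higher cutoff makes it return None instead of guessing.
    """
    q = normalize(query)
    if q in ANSWER_TABLE:
        return ANSWER_TABLE[q]
    closest = difflib.get_close_matches(q, list(ANSWER_TABLE), n=1, cutoff=cutoff)
    return ANSWER_TABLE[closest[0]] if closest else None


print(answer("What is the weather in Chicago, Illinois?"))
print(answer("What is the square of aleph zero plus N?"))
print(answer("What is the square of aleph zero plus N?", cutoff=0.9))
```

With no similarity threshold, the aleph zero question still gets one of the canned answers, which is the crazy-answer behavior described above. Raising the cutoff trades wrong answers for no answer at all.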

AskJeeves.com discovered several facts about its approach to natural language processing:

  1. Humans were expensive. AskJeeves.com burned cash. The company tried to apply its canned question answering system to customer support and ended up part of the Barry Diller empire. Humans can answer questions, but paying humans to craft templates, create answer tables, and code the system cost too much then and remains cash hungry today.
  2. Humans asked questions but did not really mean what they asked. Humans are perverse. A question like “What’s a good bar in San Francisco?” can go off the rails in many ways. For example, what type of bar does the user require? Biker, rock, blue collar? What’s San Francisco? Mission, Sunset, or Powell Street? The problem with answering questions, then, is that humans often have a tough time formulating the right question.
  3. Information changes. The answer today may not be the answer tomorrow. A system, therefore, has to have some way of knowing what the “right” answer is in the moment. As it turns out, the notion of “real time”–that is, accurate information at this moment–is an interesting challenge. In terms of stock prices, the “now quote” costs money. The quote from yesterday’s closing bell is free. Not only is it tricky to keep the index fresh, to have current information may impose additional costs.

This mini-case sheds light on two challenges in natural language processing.

Read more

Microsoft Powerset: Is There a Role for Amazon?

June 27, 2008

On May 10, 2008, I offered some thoughts about Microsoft’s alleged interest in Powerset. You can find this bit of goose quacking here.

In case you missed the flurry of articles, essays, and opinion pieces, more rumors of a Microsoft Powerset tie up are in the wind. Matt Marshall ignited this story with his write up “Microsoft to Buy Semantic Search Engine Powerset for $100 Million Plus”. You must read this here. The most interesting statement in the essay is:

Google has generally dismissed Powerset’s semantic, or “natural language” approach as being only marginally interesting, even though Google has hired some semantic specialists to work on that approach in limited fashion.

My research for BearStearns last year revealed that Google has more than “some specialists” working on semantic issues. Alas, that document “Google’s Semantic Web: the Radical Change Coming to Search and the Profound Implications to Yahoo! & Microsoft” is no longer easily available. There is some information about the work of Dr. Ramanathan Guha in my Google Version 2.0 study, but the publisher insists on charging people for the analysis of Dr. Guha’s five patent applications. Each of these comes at pieces of the semantic puzzle in quite innovative ways. If Dr. Guha’s name does not ring a bell, he worked on the documents that set forth the so-called Semantic Web.

So, Google is–according to this statement by Mr. Marshall–not too keen on Powerset-style semantics. I agree, and I will get to the reasons in the Observations section of this essay.

The story triggered a wave of comments. You can find very useful link trails at Techmeme.com and Megite.com. The one essay you will want to read is Michael Arrington’s “Microsoft to Buy Powerset? Not Just Yet.” By the time you read this belated write up, there will be more information available. I enjoy Mr. Arrington’s writing, and his point about the Powerset user interface is dead accurate. We must remember that users are creatures of habit, and the user community seems to like typing a couple of words, hitting the enter key, and accepting the first three or four Google results as pretty darn good.

Semantic technology is very important. Martin White and I are working on a new study, and at this point it appears that semantic technology is something that belongs out of sight. Semantic technology can improve the results, but like my late grandmother’s girdle and garters, the direct experience is appropriate only for a select few. Semantic technology seems to share some similarities with this type of best-left-unseen experience from my childhood.

An Amazon Connection?

My interest in a Microsoft Powerset deal pivots around some information that I believe to have a kernel of truth buried in it. Earlier this year, I learned that Microsoft had a keen interest in Amazon’s database technology. Actually, the interest was not in the Oracle database that sits, like a black widow spider in the center of a Web, but in the wrapper that Amazon allegedly used to prevent direct access to the Oracle tables from creating some technical problems.

Amazon had ventured into new territory, tapping graduate students from the Netherlands, open source, specialist vendors, and internal Amazon wizards to build its present infrastructure. Amazon has apparently succeeded in creating a Google-like infrastructure at a fraction of the cost of Google’s own infrastructure. Amazon also has fewer engineers and more commercial sense than Google.

In the last 18 months, Amazon has pushed into cloud computing, Amazon Web services, and jump starting a wide range of start ups needful of a sugar daddy. I recently wrote about Zoomii.com, one innovator surfing on the Amazon Web services “wave”. You can read that essay here.

Microsoft needs a NASCAR engine for its online business. Microsoft is building data centers. But compared to Amazon and Google, Microsoft’s data centers are a couple of steps behind, based on my research work.

At one meeting in Seattle, I heard that Microsoft was “quite involved” with Amazon. When I probed the speaker for details, the engineer quickly changed the subject.

Powerset–if my sources are correct (which I often doubt)–is using Amazon Web services for some of its processing. If true, we have an interesting possibility that Microsoft may be pulled into an even closer relationship with Amazon.

I am one of the people who thought that Microsoft would be better able to compete in the post-Google world if Microsoft bought Amazon. Now let me get to my thinking, and, as always, I invite comments. First, Microsoft would gain Amazon’s revenue and technical know how. Arguably these assets could provide a useful platform for a larger presence in the online world.

Second, Microsoft gains the cloud-based infrastructure that Amazon has up and running. From my point of view, this approach makes more sense than trying to whip Windows Server and SQL Server into shape. The Live.com services could run on Amazon or, alternatively, the whopping big Microsoft data centers could be used to provide more infrastructure for Amazon. An added benefit is that Microsoft’s engineers–despite the company’s spotty reputation for engineering–seem to me to be more disciplined than Amazon’s engineers. I have heard that Amazon pivots on teams that can be fed with a pizza. While good for the lone ranger programmers, the resulting code can be tough to troubleshoot. Each team can do what it needs to do to resolve a problem. The approach may be cheaper in the short run, but in my opinion, may create the risk of a cost time bomb. A problem can be tough to troubleshoot and then fix. Every minute of downtime translates to a loss in credibility or revenue.

Read more

Business Objects: Number One in Business Intelligence… for Now

June 26, 2008

Business intelligence–along with content management and enterprise search–is a mid-sized blob of marketing mercury. The big names in the US are SPSS and SAS Institute. Both work hard to get colleges and universities to teach eager math students how to make these proprietary systems make data walk on their hind legs, roll over, and sit on command. Business Objects, a sales-oriented company, has made inroads into the SPSS and SAS client base and now the Gartner Group has named Business Objects as the number one business intelligence outfit.

You can read SearchDataManagement.com’s summary of the Gartner research here. You can read the Business Objects news release here. Let’s get to the meat of the Gartner study. For me this was the key point:

Combined, SAP and Business Objects controlled 26.3% of the global BI platform market in 2007, nearly double their nearest competitors. IBM and Cognos held 14.7% market share, followed by the SAS Institute at 14.5%.

So, “combined” makes Business Objects number one. Chop out the SAP part and Business Objects posts nearly $1.0 billion in revenues. Will Business Objects be able to maintain its revenues? Will the company be able to make Inxight Software into more than a content utility? Will superplatforms such as IBM, Microsoft, and Oracle bundle business intelligence with higher value systems, sucking the air out of Business Objects’ growth?

For me, Business Objects means excellent sales management. Could its success come from its competitors’ lack of marketing and sales management expertise, not from its technology?

Stephen Arnold, June 26, 2008

Forbes on Powerset

June 19, 2008

Forbes Magazine has an interesting article about Powerset, Chris Taylor’s “The Next Search Frontier: Just Ask Your Question”. (I often have difficulty locating information on the Forbes’ Web site. Sometimes I grow frustrated with the pop up ads and page latency, so snag this article quickly.)

The key point in the article for me was this statement:

Powerset’s main asset is a partnership with PARC, the Palo Alto research center that incubated the computer mouse and the laser printer. In 2005, Pell discovered that PARC researchers had been working for 30 years on turning English into software code. Pell promptly licensed PARC’s research and hired the top scientists in the field, starting with Powerset co-founder Lorenzo Thione.

Xerox PARC (now simply PARC — it’s officially a subsidiary company of Xerox) has been an innovator for many years. But my experience has been that some of its better ideas are difficult to commercialize and convert into major revenue winners. Inxight Software, a PARC spin out, gained some market success and was acquired by Business Objects, which in turn was acquired by SAP. Powerset’s tie up with PARC will be another opportunity to convert ideas into revenue.

You can test drive Powerset here. Information about PARC is here.

I am accustomed to formulating queries with Boolean ANDs and NOTs. Typing questions is too much work for me. With the average query creeping up to 2.3 words on major public search engines, the idea that a well formed question will revolutionize search seems unlikely.

Natural language processing, like semantic and linguistics mechanisms, may be best suited for work behind the scenes, not in front of the user.

Stephen Arnold, June 19, 2008
