IBM Watson and NLP: Marketing or Solution?

August 27, 2011

Watson, the IBM supercomputer, caused quite a stir earlier this year when it swept through the Jeopardy playing field, winning round after round handily. On a more technical level, Watson may have greater implications for how unstructured data is tackled in search. Brian McKenna’s interview with Craig Rhinehart at IBM, “Watson’s natural language processing takes crack at unstructured data,” tells us more.

Craig Rhinehart said:

We think of it (Watson) as a breakthrough in computing. Unstructured information and communicating in natural language have not been well-adopted in IT terms. This technology will enable new ways to interact with computers, opening up new solutions. Natural language is very ambiguous, as opposed to data, where a five is always a five . . . In natural language, we speak in riddles, abbreviations, with pop culture references . . . But 80% plus of our information is unstructured, and we are expecting 44 times growth more in the next 10 years.

The problems encountered by natural language processing are numerous, and no one seems to have a perfect solution for tackling all of them at once. Watson itself made an embarrassing move once or twice on Jeopardy, when it seemed to misunderstand a question. Still, given Watson’s overall success in interpreting colloquial language, the system is a major breakthrough. Are we to believe that Watson is also responsible for introducing the concept of natural language processing to everyday Americans?

Emily Rae Aldridge, August 27, 2011

Sponsored by Pandia.com

IBM Replicates Brain Activity

August 26, 2011

The world of machines is one step closer to replicating human brain activity. BBC News covers the latest IBM development in “IBM Produces First Brain Chips.” We learned:

IBM has developed a microprocessor which it claims comes closer than ever to replicating the human brain . . . The SyNAPSE system uses two prototype “neurosynaptic computing chips”. Both have 256 computational cores, which the scientists described as the electronic equivalent of neurons. One chip has 262,144 programmable synapses, while the other contains 65,536 learning synapses.

Man versus machine is a common cultural expression, but “man machine” might soon enter the picture. The learning synapses included in the system are able to simulate the process of learning, that is, the strengthening of certain connections through exposure to certain experiences. While the system is a miraculous leap in the worlds of technology and engineering, experts are still cautious. Cognition is a process over and above that of computation. And while the SyNAPSE system might use high-level computation to simulate cognition, some wonders of the human brain remain to be solved.
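
To make the idea of a “learning synapse” concrete, here is a minimal Python sketch of a Hebbian-style update, in which a connection weight grows when the two units it links are active together. This is a textbook illustration, not IBM’s SyNAPSE design; the function names, the learning rate, and the activity values are all assumptions chosen for the example.

```python
def hebbian_update(weight, pre, post, rate=0.1, max_weight=1.0):
    """Strengthen a connection when both sides are active (illustrative only)."""
    # The weight grows in proportion to joint activity and is capped at max_weight.
    return min(max_weight, weight + rate * pre * post)


def train(exposures, weight=0.0):
    """Apply the update once per (pre, post) activity pair."""
    for pre, post in exposures:
        weight = hebbian_update(weight, pre, post)
    return weight


# Hypothetical exposures: five paired activations versus five unpaired ones.
paired = train([(1, 1)] * 5)
unpaired = train([(1, 0)] * 5)
print(paired > unpaired)
```

Repeated joint activity strengthens the connection, while activity on only one side leaves it unchanged, which is the “exposure to certain experiences” effect in miniature.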

Our view is that IBM public relations is active and we will have to wait to see if the innovation delivers. A public demo of Watson running on these new chips would be quite useful to us here in Harrod’s Creek.

Emily Rae Aldridge, August 26, 2011

Janya Releases Semantic Analysis Platform

August 26, 2011

Janya, Inc., a leader in natural language processing, recently announced its release of Semantex 5.0, the most powerful version of its semantic analysis platform yet. The San Francisco Chronicle covers the release in “Janya Announces Semantex™ 5.0 Multilingual Semantic Analysis Platform.”

Semantex has historically powered enterprise, SaaS social media analysis and government intelligence applications. The new enhancements reflected in Semantex™ 5.0 add significant value for existing customers and provide additional capabilities enabling a wide range of government, commercial, and academic uses. Semantex™ 5.0 can power solutions including market research, competitive intelligence, scientific, patent and medical data mining, e-discovery and compliance monitoring.

The new features mentioned include “improved language support, broader support for office document formats, new output formats, optimized pre-configured levels of processing, and highly scalable service oriented architecture support for Big Data deployments.”

Big data is a term hotly contested in tech circles, but we see the implication here. The more powerful and efficient the semantic analysis platform, the more meaning can be derived from an enormous pool of data. Whether or not companies are ready to put the effort into analyzing and using such data remains to be seen. Regardless, Janya has produced a product that seems to be both efficient and highly customizable.

Emily Rae Aldridge, August 26, 2011

MIIAtech Unveils NLP Processing Platform

August 26, 2011

MIIAtech, a new player in the world of search and analysis software, is making its market debut at CRM Evolution in New York City. The press release, “MIIAtech, a Search and Analysis Software Company, Unveils Enterprise Software Platform at CRM Evolution,” explains more.

Tautona broadly addresses the problem of information overload and inadequate searches that hamper large organizations . . . Tautona is able to search voluminous databases filled with structured and unstructured information by understanding the meaning of both the request and the stored information. The platform is fully cross-lingual, allowing questions to be asked in one language while the search is conducted in another or multi-language database (e.g. Chinese; Russian; and/or French). The answer is returned in the originating language.

The market has not had time to vet this new product, but if the price is right there will be customers who give it a try. The NLP market is getting more and more competitive, as firms and companies understand that the future of data is heading out of the structured realm and into the unknown. Stay tuned for the success of MIIAtech and Tautona.

Emily Rae Aldridge, August 26, 2011

Local Deals Appeal to the Google

August 26, 2011

Google seems to be interested in the mobile advertising business. First, they purchased AdMob for $750 million, and now they are combining local searches with mobile advertising. This strategy is discussed by Jason Spero, Google’s head of mobile for the Americas, in the TechCrunch article “Google’s U.S. Mobile Head Talks Local Intent, M-Commerce, Geo-Targeting and More”.

Spero says the move to combine local and mobile advertising was simply driven by consumer behavior. They found that one in three mobile interactions had local intent. So they have been exploring several different local ad formats, with a particular focus on click-to-call ads.

[T]hese ads perform well on phones because the natural path for a consumer is to want to engage the phone and get more info from a merchant or service. The key in these advertisements on mobile phones is a ‘call to action’ element, which can work for a retailer, florist, insurance company and many other smaller businesses. So far, he says that over 500,000 Google customers are running click-to-call campaigns on a mobile phone.

Google’s strategy seems to be right on target. I cannot count the times that I have used my phone to look up a local pizza joint or a retail store. This could lead to some major advertising dollars for Google, a significant profit increase for the advertiser and a happy customer with the quick and easy ad format. Seems like a win-win-win to me. What about existing local ad and deal vendors? Maybe a lose-lose?

Jennifer Wensink, August 26, 2011

Protected: SharePoint and the Search for Social Data

August 26, 2011

Quite an August 2011: Search and More in Transition

August 25, 2011

I have been watching the computer earthquakes for decades. I have survived the mainframe to mini revolution, the mini to desktop, and the other seismic events which transform the hills and plains of the technology world. My view on the summer 2011 season is easy to sum up: A New Era is upon us.

Major disruption. Source: http://www.theberkeleygraduate.com/2009/10/preparing-for-an-earthquake/

First, consider search. Of the blue chip enterprise search vendors I analyzed in the first edition of the Enterprise Search Report in 2004 (Autonomy, Convera, Endeca, Exalead, and Fast Search & Transfer), only Endeca remains independent, which raises the question, “Why?” The other four are either out of business (Convera) or absorbed into much larger firms and likely to be integrated into other enterprise solutions. Of the 2004 big five, only Exalead retains its technical leadership under the Dassault management system. (What then are the management systems of Hewlett Packard and Microsoft? I will let you answer the question.)

The search landscape is now chock-a-block with firms that will have an opportunity to expand their reach and market impact. For a listing of some firms I track, navigate to Overflight. The question is, “In the present double dip environment, will these promising companies have the money and time required to achieve the type of revenues associated with Autonomy, which was on track to break the $1.0 billion glass ceiling against which many search and content processing vendors have pressed their noses?” On a call with an investment firm this week, I said when asked about a data management vendor’s roll out of an enterprise search system, “Long shot.”

Second, what about Google? The disruptions triggered by Google in August are significant for several reasons. Google purchased Motorola Mobility and acquired additional litigation and intellectual property. Along with the purchase comes a hardware business which seems to be capable of creating friction with some of the companies in the Android ecosystem. More important to me is that the purchase of Motorola Mobility makes clear that the Google of old is no more. The commitment to mobile means that search has to monetize, not deliver on point results. I know that most users find Google the go-to search engine. My concern is that without significant growth in ad revenues, the bets are significant. How does one generate money from traffic? Sell access to that traffic is my answer. Search and Google are now similar to HP. The companies must move forward.

Social Media: Making the Personal Impersonal

August 25, 2011

Search engines are now using social media data to rank query results. As crazy as it sounds, your Tweets could now alter the way Google gives you information on such benign things as “George Washington” or “grilled cheese sandwiches.” eSchool News takes a look at how “New Web-Search Formulas Have Huge Implications for Students and Society.”

Search results now differ from person to person because algorithms have been altered to include social media data. Knowing that most people don’t go past the second page of results, search engines have tailored their ranking systems to consider links you have clicked on and to create a filter system based on those previous links. This isn’t groundbreaking, since Amazon and Netflix have been using the approach for years to recommend books and movies, but it is new to the major search engines.
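
A rough sense of how click history can reshape a results page comes from a small Python sketch. It is an assumption-laden toy, not any search engine’s actual formula: the `personalize` function, the `boost` value, and the example URLs are all hypothetical. Each result starts with a base relevance score and gains a bonus for every earlier click on the same site.

```python
from collections import Counter
from urllib.parse import urlparse


def personalize(results, click_history, boost=0.5):
    """Re-rank (url, base_score) pairs using sites the user clicked before."""
    # Count prior clicks per site (hostname).
    clicks = Counter(urlparse(url).netloc for url in click_history)

    def score(item):
        url, base = item
        # Hypothetical rule: each earlier click on the site adds a fixed bonus.
        return base + boost * clicks[urlparse(url).netloc]

    # sorted() is stable, so tied results keep their original order.
    return [url for url, _ in sorted(results, key=score, reverse=True)]


results = [("http://a.example/x", 1.0), ("http://b.example/y", 0.9)]
history = ["http://b.example/old"]
print(personalize(results, history))
```

With an empty history the order is unchanged; with one earlier click on `b.example`, that site jumps to the top. Two people typing the same query can therefore see different first pages, which is the effect the article describes.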

In his 2011 TED (Technology, Entertainment, Design) talk, Eli Pariser, the author of The Filter Bubble, shared his reservations about the “invisible algorithmic editing of the web.” He believes it only shows us what it thinks we want and not what we need to see.

[I]t was believed that the web would widen our connections with the world and expose us to new perspectives, Pariser said: Instead of being limited to the newspapers, books, and other writings available in our local communities, we would have access to information from all over the globe. But thanks to these new search-engine formulas, he said, the internet instead is coming to represent ‘a passing of the torch from human gatekeepers [of information] to algorithmic ones.’ Yet, algorithms don’t have the kind of embedded ethics that human editors have, he noted. If algorithms are going to curate the world for us, then ‘we need to make sure they’re not just keyed to [personal] relevance—they also should show us things that are important, or challenging, or uncomfortable.’

It seems that search engines may be focusing on personal factors, but are not personalizing the process. The user has no control over results. That customization is left to a rigid algorithm. If a restaurant says that they make burgers “made-to-order,” then I expect to be able to pick mustard and onions on one visit, and pick cheese and ketchup on the next visit. The server should not just look at my past orders and make an educated guess. There is nothing “personal” about that.

Could this lead some well-meaning people down an unintended and risky path to censorship-by-computer? Users must gain more control over these search formulas. There are certainly times when social media parameters are acceptable, but sometimes you want and need to see the other side. It depends on whether you are settling an argument between your friends over song lyrics or writing a thesis on communism. Until users are offered more liberal control, I think this “personal” ranking system will actually suppress and limit a lot of important information that users are seeking. The social impact on a search comes at a pretty high price.

Jennifer Wensink, August 25, 2011

SearchBlox: Targeting the Enterprise for Search

August 25, 2011

SearchBlox, a U.S.-based solutions corporation, is making waves worldwide. With more than 300 clients (including big names like IBM and Capital One), the company has grown exponentially since its inception in 2003. We will be adding SearchBlox to our Overflight service in September 2011.

SearchBlox is an open source search application built upon the Apache Lucene software. SearchBlox offers an innovative solution to conducting enterprise searches utilizing integrated crawlers. These can process multiple file formats over HTTP/HTTPS and on the file system, as well as news feeds. The system also supports a cloud deployment.
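
For readers wondering what a Lucene-style engine does with crawled content, the core structure is an inverted index: a map from each term to the documents that contain it. The Python sketch below is a toy illustration of that idea only. It does not use the Lucene or SearchBlox APIs, and the function names and sample documents are assumptions made for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    hits = [index.get(term, set()) for term in terms]
    return set.intersection(*hits)


# Hypothetical crawled documents, keyed by id.
docs = {1: "open source search", 2: "enterprise search platform"}
index = build_index(docs)
print(search(index, "open search"))
```

A production engine adds tokenization rules, relevance scoring, and on-disk storage, but a query is still answered by looking up terms rather than scanning every file.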

Because the system is built on Lucene, you can download it without charge. “Free” is a magnetic idea in today’s business climate. SearchBlox believes that those seeking Web site, Intranet and search solutions will find the SearchBlox approach of interest.

After releasing its new 6.4 version, the company has turned to new endeavors, including using its SearchBlox solution to search through files located in the cloud. Its latest assignment is tackling the Amazon cloud. We learned:

SearchBlox also provides an end user interface which allows you to setup your search and allow users to directly search them. Use Amazon’s Free Tier and setup your entire your store and search capability on the cloud. SearchBlox provides complete control over your index.

Redtopia.com software and design is saying good things about the results it is deriving from the software; for example:

My search [directed] me to SearchBlox, which is built on Lucene, the same search engine that Solr uses. It’s free, open source, and with no commercial license limitations and works on Windows, Unix, and Mac OSX. Within minutes, I had the package installed on my local machine and I immediately started crawling my local development sites. Promising.

The privately held company is growing at a rapid pace. SearchBlox has already proven that its free software can rival that of the best paid software in the U.S. and around the world. They’ve even trended a time or two on Twitter.

Leslie Radcliff, August 25, 2011

Will Open Source Software Disrupt Law Firm eDiscovery Vendors?

August 25, 2011

The International Legal Technology Association (ILTA) is about to gather in the heart of Opryland for its annual conference. As discussed in “ILTA Open Source Panel Focuses on Bates Stamping, E-Discovery,” the agenda includes a discussion about the growing trends in open source software (OSS) for law firms.

The Open Source Gurus panel will be weighing in on several legal applications, including their Bates Master software, which aims to replace the tedious, hand-operated page numbering system used by firms. They will also take a look at FreeEed, which is the only open source e-discovery software available to the public. Discovery, a fact-finding process leading up to trial, traditionally leads to voluminous documents being exchanged between the parties, but OSS is working to change that.
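
For readers unfamiliar with Bates stamping, the underlying task is simple: every page in a production receives a unique, sequential identifier, usually a short prefix followed by a zero-padded number. The Python sketch below shows that numbering scheme only. It is not how Bates Master works, and the function name, the `ABC` prefix, and the six-digit width are hypothetical choices for illustration.

```python
def bates_numbers(prefix, start, count, width=6):
    """Generate sequential page identifiers such as ABC000001."""
    # Each label is the prefix plus the page number, zero-padded to a fixed width.
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]


# Hypothetical example: label a three-page document starting at page 1.
for label in bates_numbers("ABC", 1, 3):
    print(label)
```

Generating the labels is the easy part; the tedium a tool removes is stamping them onto thousands of pages and keeping the sequence unbroken across documents.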

It seems that open source technology for the legal world is not a big business, but certainly could be a niche opportunity for software developers.

Software for managing unbundled legal services such as document assembly and limited-scope representation could be good fits for the open-source model, whether the users are small firms with limited IT budgets or large organizations that hire contract counsel.

I don’t see open source legal software being used for much more than numbering pages until trusted developers join the fray. The legal community is regimented, and until firms are confident in open source products, they will not change. So ILTA has its work cut out for it if it plans to sway firms, especially the big, traditional law firms, to the exciting world of free and open source software.

Jennifer Wensink, August 25, 2011
