July 9, 2012
Wikipedia is a go-to source for quick answers outside the classroom, but many don’t realize it is an ever-evolving information source. Geekosystem’s article “Wikistats Show You What Parts Of Wikipedia Are Changing” provides a visual way to see what is changing within Wikipedia.
The program was explained this way:
“Utilizing technology from Datasift, a social data platform with a specialization in real-time streams, Wikistats lists some clear, concise information you can use to see how Wikipedia is flowing and changing out from under you. Using Natural Language Processing, Wikistats is able to suss realtime trends and updates. In short, Wikistats will show you what pages are being updated the most right now, how many edits they get by how many unique users, and how many lines are being added vs. how many are being deleted.”
Actually viewing the chart below proved enlightening:
This program generates well-defined reports on Wikipedia’s traffic, and Wikipedia frequenters might find the above chart surprising. In this case, the report shows that Wikipedia is an overflowing pool of information.
We are not saying Wikipedia is unreliable, but one should never rely solely on a single information source. The chart simply provides a visual way to see what is changing within Wikipedia and helps users understand how data flows. This program’s potential for real-time use on other sites could be tremendous.
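The kind of aggregation Wikistats performs can be sketched in a few lines. Everything here is illustrative: the event tuples and field layout are invented for the example, not Wikistats’ or Wikipedia’s actual stream format.

```python
from collections import defaultdict

# Hypothetical edit events of the kind a recent-changes stream would emit:
# (page, user, lines_added, lines_deleted). Invented data, not a real feed.
events = [
    ("Python", "alice", 12, 3),
    ("Python", "bob", 0, 7),
    ("Jazz", "alice", 5, 1),
    ("Python", "alice", 2, 2),
]

stats = defaultdict(lambda: {"edits": 0, "users": set(), "added": 0, "deleted": 0})
for page, user, added, deleted in events:
    s = stats[page]
    s["edits"] += 1          # total edits per page
    s["users"].add(user)     # unique editors per page
    s["added"] += added      # lines added vs. deleted
    s["deleted"] += deleted

# Rank pages by edit volume, the way the chart orders the "hottest" pages.
for page, s in sorted(stats.items(), key=lambda kv: -kv[1]["edits"]):
    print(page, s["edits"], len(s["users"]), s["added"], s["deleted"])
```

Scaled up to a real-time stream, the same counters answer exactly the questions in the quote: which pages are being updated most, by how many unique users, and whether content is growing or shrinking.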
Jennifer Shockley, July 9, 2012
July 2, 2012
It is possible to teach an old dog new tricks, according to Semanticweb.com’s article “FirstRain Spotlights Semantics Across Domains.” Semantic approaches work well for a targeted domain because one can train the NLP engine to recognize the relevant key words. The downside is that today’s business world is vast, and training limited to specific domains cannot always scale.
FirstRain has opened a unique version of a semantic obedience school. The article explains:
“Affinity scoring must be a breakthrough for classes of information where there is a lot of ambiguity, and the cool thing about it is that you can actually apply it in a way to create a virtuous self-improving spiral that works across massively different information domains. When you set up the correct feedback loop of affinity scoring and don’t encode to different domains, but let it swing across those you are trying to match things to, you can create a self-learning system.”
The new system from FirstRain is capable of retraining the most stubborn of semantics. By creating adaptable semantics, the company has taught an already workable system to handle a variety of information even more efficiently. The semantic obedience school could very well be the next big thing in the business world if all goes as planned. The new routine seems feasible, so has FirstRain cracked the tough training nut of cross-domain semantics?
Jennifer Shockley, July 2, 2012
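For readers curious what a “virtuous self-improving spiral” might look like mechanically, here is a toy sketch. It is our own illustration, not FirstRain’s algorithm: high-confidence matches are fed back as evidence, so later, more ambiguous documents can still clear the match threshold without any per-domain retraining.

```python
from collections import Counter

def affinity(doc_terms, entity_terms, evidence):
    """Score a document/entity pair by term overlap plus accumulated evidence."""
    overlap = len(set(doc_terms) & set(entity_terms))
    return overlap + evidence[frozenset(entity_terms)]

evidence = Counter()
entity = ["acme", "earnings"]
docs = [
    ["acme", "earnings", "q2"],     # unambiguous match
    ["acme", "recall", "widgets"],  # ambiguous on its own
]

for doc in docs:
    if affinity(doc, entity, evidence) >= 2:  # confident match
        evidence[frozenset(entity)] += 1      # feed it back as training signal
```

The second document overlaps the entity on only one term, but the evidence banked from the first document pushes it over the threshold: the feedback loop, not domain-specific encoding, does the work.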
June 11, 2012
A new “cool” vendor has been announced in the Cool Vendors in Analytics and Business Intelligence, 2012 report by Gartner, Inc.
According to the article, “EasyAsk Named ‘Cool Vendor’ by Leading Analyst Firm,” EasyAsk’s Siri-like mobile app for corporate data is one to note. The app, named Quiri, combines voice and NLP to provide a usable, and apparently “cool,” user experience. A video demonstration of the product is available here. The article states:
“Quiri offers users Siri-like built-in speech recognition and natural language processing, allowing users to conveniently speak their business questions and get immediate answers to business questions. Users tap a microphone button, speak a request and Quiri retrieves the answer from existing corporate data.
EasyAsk eCommerce search and merchandising software – available on-premise or as a service (SaaS) – leads the industry in customer conversion by providing the right products on the first page, every time.”
We find this to be an interesting angle for a product spotlight. We aren’t sure if this is a pay-to-play write-up or an objective analysis. We also aren’t sure what “cool” means when referring to a product’s usability, but look forward to seeing more from EasyAsk.
Andrea Hayden, June 11, 2012
Sponsored by PolySpot
June 5, 2012
More explanations of how Google’s smart system becomes so intelligent; not much illumination on precision and recall, however. Google’s Research Blog hosts a post from a Google research team titled “From Words to Concepts and Back: Dictionaries for Linking Text, Entities and Ideas.” The researchers begin by laying out Google’s primary challenge:
“Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning — from turning search queries into relevant results to suggesting targeted keywords for advertisers — is also Google’s core competency, and important for many other tasks in information retrieval and natural language processing.”
Researchers Valentin Spitkovsky and Peter Norvig go on to detail some of the techniques they have used, including building on the traditional encyclopedia model, much like Wikipedia. They then get into some technical particulars like language strings and inverted indexes; see the article for more. Or for in-depth detail, see the team’s paper, “A Cross-Lingual Dictionary for English Wikipedia Concepts.”
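The core idea of such a dictionary can be illustrated with toy data: count how often each surface string is used as anchor text for each concept, then treat the ratios as probabilities. The rows below are invented for illustration; the real dictionary is built from billions of hyperlinks.

```python
from collections import defaultdict

# Invented anchor-text counts: (anchor string, target concept, count).
anchors = [
    ("jaguar", "Jaguar_Cars", 70),
    ("jaguar", "Jaguar_(animal)", 25),
    ("jaguar", "Jacksonville_Jaguars", 5),
]

totals = defaultdict(int)
table = defaultdict(dict)
for string, concept, count in anchors:
    totals[string] += count
    table[string][concept] = count

def p_concept_given_string(string, concept):
    """P(concept | string): how often this surface string links to this concept."""
    return table[string][concept] / totals[string]

print(p_concept_given_string("jaguar", "Jaguar_Cars"))  # 0.7
```

Resolving an ambiguous query term then amounts to looking up its concept distribution, which is the “words to concepts” direction; inverting the table gives “concepts back to words.”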
Cynthia Murrell, June 5, 2012
Sponsored by PolySpot
May 31, 2012
The current version of Semantic Knowledge’s Tropes is now available for download at no cost. This useful tool has been benefiting businesses for over a decade and has yet to outlive its usefulness.
Semantic-Knowledge has been in business since 1994, providing business consumers with the means to increase ROI with simplified Natural Language Processing software, including semantic search engine, text analysis, intelligent desktop search, text mining, and classification systems.
Tropes performs different types of text analysis, but the overall purpose is to assign, analyze, and examine text. A basic summary of the program:
“Content analysis consists in revealing the framework of a text, i.e. its meaning. This necessarily implies two things. First, there must be a theoretical conception of the text: this must describe both the textual organization of the things that are said and the structural organization of the thought-processes of the people who say them. Secondly, it implies the use of a tool derived from this theoretical conception and rigorously excludes the subjectivity of the investigator, at least until the analysis is finished.”
Tropes offers considerable time savings and enhances strategic data, so it can help businesses yield an exceptional return on investment. Since Tropes is no longer a commercial product, users can now experiment with this text-based program without the cost incurred during its initial release.
Jennifer Shockley, May 31, 2012
Sponsored by PolySpot
May 26, 2012
I have had to look up the antecedents for InQuira again. I wanted to create this post to make it easy to reference these two firms which were combined to create InQuira. InQuira was acquired by Oracle Corp. in that company’s push to address its long-standing search and content processing issues. I have in my Overflight system the 2006 InQuira marketing collateral which, I noticed, provides a crib sheet for the many enterprise search vendors piling into the customer support segment. What’s interesting is that customer support is one of the sectors where open source search is getting some attention.
The antecedents of InQuira were:
- Answerfriend. The company had software which could understand text. In 2000, the company landed Accenture as a customer. Answerfriend pivoted on its natural language processing technology. Allegedly Answerfriend could handle both structured and unstructured data. Sound familiar in 2012?
- Electric Knowledge Inc. This also was an NLP shop. The technology was based on computational linguistic technology. This company had licensed its technology to Bank of America, an outfit which has had a long history of trying to find a search system which meets its requirements.
InQuira was created in 2002. The notion of hooking together two separate vendors to do the 1+1=3 thing has been used more recently by Lexalytics and Attensity.
At one time, InQuira was the answer system used by Yahoo’s customer support service. I encountered this when I tried to cancel a Yahoo service. The InQuira service was not too helpful to me. I just killed the credit card and solved the problem.
The marketing pitch of InQuira is as fresh today as it was in 2002. How much progress has there been in search and content processing in the last decade? Could the marketing collateral for a 2002 Oldsmobile be used without any changes? Probably not. Search has a limited supply of jargon, and it gets recycled endlessly in my opinion.
Stephen E Arnold, May 26, 2012
Sponsored by PolySpot
May 7, 2012
Social media is swarming with sound bites about social media. We recently came across a bit of information about Bitext’s recent SIG brainstorming meeting, which prompted further investigation into the company. As the name implies, they are concerned with text bits. Or, as we know it: unstructured content.
Their event was a big success, with attendance turning out to be double what they expected. Social media and business strategies were discussed, particularly in relation to their primary concern, semantics.
Among the company’s several solutions, consulting services, and research and development efforts, NaturalFinder stood out as having value on par with other semantically enriched search technology:
“NaturalFinder is the essential complement for any Internet or intranet search engine as it allows users to query in natural language (Spanish, English, French…) without using Booleans or wildcards. Thanks to its linguistic technology, users can focus on typing their queries in his/hew own words as if he/she talked to another person. NaturalFinder will return all relevant documents and more documents than traditional search engines, which are based on keywords.”
It is clear here that technology is continuing to adapt to the larger trend of pervasive informal language. First, we saw unstructured content, as opposed to traditional structured content, utilized for business analytics. Now, we are creating tools that allow search engines to mimic human intelligence.
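A crude way to see what “no Booleans or wildcards” means in practice: strip function words from a conversational query and search on what remains. This is a naive, invented illustration of the general idea, nothing like Bitext’s actual linguistic technology.

```python
# Naive natural-language query handling: drop function words so a
# conversational request becomes plain keywords. Stopword list is invented.
STOPWORDS = {"show", "me", "the", "a", "in", "about", "documents", "find"}

def to_keywords(query):
    """Reduce a conversational query to content-bearing search terms."""
    return [w for w in query.lower().split() if w not in STOPWORDS]

print(to_keywords("Show me documents about jazz in Paris"))  # ['jazz', 'paris']
```

The real system goes much further, using linguistic analysis rather than a stopword list, but the contrast is the same: the user types a sentence, and the engine, not the user, produces the machine-friendly query.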
Megan Feil, May 7, 2012
Sponsored by Ikanow
March 20, 2012
WillQuitSmoking.com recently shared a video on a new analytics system at a healthcare facility in Austin, Texas, which is using the technology to save lives by managing large amounts of unstructured data.
According to the article, “Seton Healthcare Uses IBM Content and Predictive Analytics to Improve Care & Lower CHF Readmissions,” Seton Healthcare relies on IBM Content and Predictive Analytics to identify high-risk congestive heart failure (CHF) patients for interventive care and to avoid preventable readmissions.
The article states:
Natural language processing enables analysis of both structured (ie lab results) and unstructured data (ie physician notes, discharge summaries), opening the door to rich clinical and operational insights that were hidden in inaccessible free text files. Seton can now identify trends and patterns in patient care and outcomes, uncovering sometimes obscure correlations or disparities buried in years of medical records; these can dramatically improve diagnosis and treatment.
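The structured-plus-unstructured idea in the quote can be sketched simply: pull a numeric measurement out of a free-text note so it can sit alongside lab-result fields. This is a toy illustration, not IBM’s system; the note text, the field, and the risk threshold are all invented.

```python
import re

# Invented free-text clinical note; "EF" (ejection fraction) is buried in prose.
note = "Pt with CHF, EF 25%, discharged on ACE inhibitor."

# Extract the measurement into a structured field.
match = re.search(r"\bEF\s*(\d+)\s*%", note)
ejection_fraction = int(match.group(1)) if match else None

# Once structured, it can drive a (made-up) risk rule alongside lab data.
high_risk = ejection_fraction is not None and ejection_fraction < 40
```

Real clinical NLP is far more robust than a regular expression, but the payoff is the same: values that were “hidden in inaccessible free text files” become fields one can query, trend, and correlate.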
One of the reasons this article is really cool is that you can learn by watching a video, not a live, online demo of the technology. Yep, we think movies are much better than live systems. Are videos easier to control than a game show? Yep. Yep. Yep.
Jasmine Ashton, March 20, 2012
Sponsored by Pandia.com
January 4, 2012
As if to continue trying to prove that it can do anything, “IBM’s Watson to Help Doctors Diagnose, Treat Cancer,” reports eWeek. The AI supercomputer will be working with the Cedars-Sinai cancer center and insurance company WellPoint to evaluate cancer treatment options. Writer Brian T. Horowitz explains:
Using its data analytics and NLP [Natural Language Processing] capabilities, Watson would integrate data such as medical literature, patient histories, clinical trials, side effects and outcomes data to help doctors decide on courses of treatment. . . . Watson would also look at the characteristics of a patient’s cancer and make recommendations on cost-effective treatment that would lead to the best outcome.
Of course, this advice would not replace that of a doctor, but it could become a valuable tool. Other health care organizations have been turning to technology for solutions. For example, Dell just donated an entire cloud infrastructure to the Translational Genomics Research Institute for storing medical trial data on pediatric cancer.
Good to see technology being used for the good of humanity, right? We would like to see IBM put Watson up on a test corpus for the public to use. Wishful thinking, I suppose.
Cynthia Murrell, January 4, 2012
Sponsored by Pandia.com
November 30, 2011
I learned from one of my two or three readers that Barcelona was home to a natural language processing company. Several years ago, I spoke with a person familiar with the company Artificial Solutions. After a bit of fumbling around, I located a trade show at which the company was exhibiting. The company’s NLP system is called “Teneo.” The application which I recalled was the use of the NLP system for customer support. The company has expanded since I first learned about the firm. The technology has been applied to mobile devices, for example.
The company told me:
Teneo Mobile is a platform independent technology designed to enable companies, organizations, manufacturers and developers to create their own virtual assistant as a mobile app, regardless of platform, mobile device and even language. The Natural Language Interaction (NLI) engine is covered by patents. The system can currently be built in up to 21 different languages, including Mandarin and Russian.
The company, founded in 2001, is owned by its founders, the private equity fund Scope Growth II, and some private investors. The company has tallied more than 200 projects in the public and private sectors across 30 countries and 21 languages. The firm’s customers include players in the telecommunications sector.
The firm’s technology is based on the Teneo Interaction Engine. According to the firm, its system will:
reason like a human, using advanced linguistic and business rules to decide how best to respond to your customer’s request. Context comes into play here, such as time, date and place, as well as information picked up from previous conversations, customer data retrieved from your CRM system and transaction data from your ERP system. At this point, the Teneo will also eliminate any ambiguities from its initial analysis. Even one word can alter the meaning of a customer’s request. Teneo will instantly and dynamically re-assess content as the interaction develops, to understand what has changed and give the right answers. Natural language is full of subtle nuances, which Teneo is able to pick up and interpret. It understands idiom and slang, even dialect and SMS shorthand – and it’s also sympathetic to grammar, syntax or spelling mistakes.
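A toy dialog-state sketch shows the flavor of the context carry-over the quote describes. This is our own illustration, not Teneo’s engine; the vocabulary, rules, and state layout are invented.

```python
# Toy context carry-over: each turn is interpreted against state accumulated
# from earlier turns, so a bare "it" can still resolve to something concrete.
context = {}

def interpret(utterance, context):
    words = utterance.lower().split()
    if "flight" in words:
        context["topic"] = "flight"   # remember what we are talking about
    if "tomorrow" in words:
        context["date"] = "tomorrow"  # remember when
    if "cancel" in words and "topic" in context:
        return f"cancel {context['topic']}"  # "it" resolves via context
    return "clarify"

interpret("book a flight tomorrow", context)
print(interpret("cancel it", context))  # cancel flight
```

The second utterance carries no topic of its own; only the state banked from the first turn makes “cancel it” interpretable, which is the re-assessment-as-the-interaction-develops behavior the vendor describes.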
For more information about the technology and its vertical applications, navigate to www.artificial-solutions.com.
Stephen E Arnold, November 30, 2011
Sponsored by Pandia.com