December 30, 2014
Natural language processing is becoming a popular analytical tool as well as a quicker route to search and customer support. Dragon Nuance is on the tip of everyone’s tongue when NLP enters a conversation, but there are other products with their own benefits. Code Project recently reviewed three NLP products in “A Review Of Three Natural Language Processors, AlchemyAPI, OpenCalais, And Semantria.”
Rather than sticking readers with plain product reviews, Code Project explains what NLP is used for and how it accomplishes it. While NLP is used for vocal commands, it can do many other things: improve SEO, knowledge management, text mining, text analytics, content visualization and monetization, decision support, automatic classification, and regulatory compliance. NLP extracts entities aka proper nouns from content, then classifies, tags, and provides a sentiment score to give each entity a meaning.
In layman’s terms:
“…the primary purpose of an NLP is to extract the nouns, determine their types, and provide some “scoring” (relevance or sentiment) of the entity within the text. Using relevance, one can supposedly filter out entities to those that are most relevant in the document. Using sentiment analysis, one can determine the overall sentiment of an entity in the document, useful for determining the “tone” of the document with regards to an entity — for example, is the entity “sovereign debt” described negatively, neutrally, or positively in the document?”
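The relevance filtering and sentiment "tone" described in the quoted passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entity` fields, the score ranges, the thresholds, and the sample values are all hypothetical and do not reflect the actual output of AlchemyAPI, OpenCalais, or Semantria.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """One extracted entity; field names and score scales are assumptions."""
    text: str
    type: str
    relevance: float  # assumed scale: 0.0 (irrelevant) to 1.0 (central)
    sentiment: float  # assumed scale: -1.0 (negative) to 1.0 (positive)


def filter_relevant(entities, threshold=0.5):
    """Keep only the entities whose relevance meets the threshold."""
    return [e for e in entities if e.relevance >= threshold]


def tone(sentiment, neutral_band=0.1):
    """Map a sentiment score to a coarse negative/neutral/positive label."""
    if sentiment > neutral_band:
        return "positive"
    if sentiment < -neutral_band:
        return "negative"
    return "neutral"


# Made-up results standing in for what an NLP service might return.
entities = [
    Entity("sovereign debt", "FinancialTerm", relevance=0.9, sentiment=-0.6),
    Entity("Tuesday", "Date", relevance=0.2, sentiment=0.0),
]

relevant = filter_relevant(entities)
tones = {e.text: tone(e.sentiment) for e in relevant}
print(tones)
```

With these invented scores, only "sovereign debt" survives the relevance filter, and its tone is labeled negative.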
NLP categorizes the human element in content. Its usefulness will become more apparent in future years, especially as people rely more and more on electronic devices for communication, consumerism, and interaction.
December 18, 2014
A smaller big data sector that specializes in text analysis to generate content and reports is burgeoning with startups. Venture Beat takes a look at how one of these startups, Narrative Science, is gaining more attention in the enterprise software market: “Narrative Science Pulls In $10M To Analyze Corporate Data And Turn It Into Text-Based Reports.”
Narrative Science started out with software that created sports and basic earnings articles for newspaper filler. It has since grown to help businesses in different industries take their data by the digital horns and leverage it.
Narrative Science recently received $10 million in funding to further develop its software. Stuart Frankel, chief executive, is driven to help all industries save time and resources by better understanding their data.
“ ‘We really want to be a technology provider to those media organizations as opposed to a company that provides media content,’ Frankel said… ‘When humans do that work…it can take weeks. We can really get that down to a matter of seconds.’”
From making content to providing technology? It is quite a leap for Narrative Science. While the company appears to have a good product, what exactly does it do?
December 12, 2014
I need this in my office. I will dump my early 1940s French posters and go for logos.
Navigate to this link: http://bit.ly/1sdmBL0. You will be able to download a copy of an infographic (poster) that summarizes “The Current State of Machine Intelligence.” There are some interesting editorial decisions; for example, the cheery Google logo turns up in deep learning, predictive APIs, automotive, and personal assistant. I quite liked the inclusion of IBM Watson in artificial intelligence—recipes with tamarind and post-video editing game show champion. I found the listing of Palantir as one of the “intelligence tools” outfits. Three observations:
- I am not sure if the landscape captures what machine intelligence is.
- The categories, while brightly colored, do not make clear how a core technology can be speech recognition but not part of the “rethinking industries” category.
- Shouldn’t Google be in every category?
I am confident that mid tier consultants and reputation surfers like Dave Schubmehl will find the chart a source of inspiration. Does Digital Reasoning actually have a product? The company did not make the cut for the top 60 companies in NGIA systems. Hmmm. Live and learn.
Stephen E Arnold, December 12, 2014
December 12, 2014
Analytics outfit Lexalytics is going all-in on their European expansion. The write-up, “Lexalytics Expands International Presence: Launches Pain-Free Text Mining Customization” at Virtual-Strategy Magazine tells us that the company has boosted the language capacity of their recently acquired Semantria platform. The text-analytics and sentiment-analysis platform now includes Japanese, Arabic, Malay, and Russian in its supported-language list, which already included English, French, German, Chinese, Spanish, Portuguese, Italian, and Korean.
Lexalytics is also setting up servers in Europe. Because of upcoming changes to EU privacy law, we’re told companies will soon be prohibited from passing data into the U.S. Thanks to these new servers, European clients will be able to use Semantria’s cloud services without running afoul of the law.
Last summer, the company courted Europeans’ attention by becoming a sponsor of the 2014 Enterprise Hackathon in Prague. The press release tells us:
“All participants of the Hackathon were granted unlimited access and support to the Semantria API during the event. Nearly every team tried Semantria during the 36 hours they had to build a program that could crunch enough data to be used at the enterprise level. Redmore says, ‘We love innovative, quick development events, and are always looking for good events to support. Please contact us if you have a hackathon where you can use the power of our text mining solutions, and we’ll talk about hooking you up!’”
Lexalytics is proud to have been the first to offer sentiment analysis, auto theme detection, and Wikipedia integration. Designed to integrate with third-party applications, their text analysis software chugs along in the background at many data-related organizations. Founded in 2003, Lexalytics is headquartered in Amherst, Massachusetts.
Cynthia Murrell, December 12, 2014
December 8, 2014
YouTube informational videos are great. They are short, snappy, and often help people retain more information about a product than reading the “about” page on a Web site. Rocket Software has its own channel and the video “Rocket Enterprise Search And Text Analytics” packs a lot of details into 2.49 minutes. The video is described as:
“We provide an integrated search platform for gathering, indexing, and searching both structured and unstructured data, making the information that you depend on more accessible, useful, and intelligent.”
How does Rocket Software defend that statement? The video opens with a prediction that by 2020 data usage will have increased to forty trillion gigabytes. It explains that data is the new enterprise currency and that it needs to be kept organized, then it drops into a plug for the company’s software. The company compares itself to competitors by saying that Rocket Software makes enterprise search and text analytics as simple as a download, after which the system is up and running. Other enterprise search systems require custom coding, but Rocket Software says it offers these options out of the box. Plus, it is a cheaper product that does not sacrifice quality.
Software usage these days is about functionality and ease of use for powerful software. Rocket Software states it offers this. Try putting it to the test.
November 11, 2014
Through the News section of their website, eDigitalResearch announces a new partnership in, “eDigitalResearch Partner with Lexalytics on Real-Time Text Analytics Solution.” The two companies are integrating Lexalytics’ Salience analysis engine into eDigital’s HUB analysis and reporting interface. The write-up tells us:
“By utilising and integrating Lexalytics Salience text analysis engine into eDigitalResearch’s own HUB system, the partnership will provide clients with a real-time, secure solution for understanding what customers are saying across the globe. Able to analyse comments from survey responses to social media – in fact any form of free text – eDigitalResearch’s HUB Text Analytics will provide the power and platform to really delve deep into customer comments, monitor what is being said and alert brands and businesses of any emerging trends to help stay ahead of the competition.”
Based in Hampshire, U.K., eDigitalResearch likes to work closely with their clients to produce the best solution for each. The company began in 1999 with the launch of the eMysteryShopper, a novel concept at the time. As of this writing, eDigitalResearch is looking to hire a developer and senior developer (in case anyone here is interested).
Founded in 2003, Lexalytics is proud to have brought the first sentiment analysis engine to market. Designed to integrate with third-party applications, their text analysis software is chugging along in the background at many data-related companies. Lexalytics is headquartered in Amherst, Massachusetts.
Cynthia Murrell, November 11, 2014
November 6, 2014
Here’s an interesting development from the world of text-processing technology. GeekWire reports, “Microsoft and Amazon Vets Form Textio, a New Startup Looking to Discover Patterns in Documents.” The new company expects to release its first product next spring. Writer John Cook tells us:
“Kieran Snyder, a linguistics expert who previously worked at Amazon and Microsoft’s Bing unit, and Jensen Harris, who spent 16 years at Microsoft, including stints running the user experience team for Windows 8, have formed a new data visualization startup by the name of Textio.
“The Seattle company’s tagline: ‘Turn business text into insights.’ The emergence of the startup was first reported by Re/code, which noted that the Textio tool could be used by companies to scour job descriptions, performance reviews and other corporate HR documents to uncover unintended discrimination. In fact, Textio was formed after Snyder conducted research on gender bias in performance reviews in the tech industry.”
That is an interesting origin, especially amid the discussions about gender that currently suffuse the tech community. Textio sees much room for improvement in text analytics, and hopes to help clients reach insights beyond those competing platforms can divine. CEO Snyder’s doctorate and experience in linguistics and cognitive science should give the young company an edge in the competitive field.
Cynthia Murrell, November 06, 2014
July 21, 2014
The article titled Text Analytics Company Linguamatics Boosts Enterprise Search with Semantic Enrichment on MarketWatch discusses the launch of I2E Semantic Enrichment from Linguamatics. The new release allows for the mining of a variety of texts, from scientific literature to patents to social media. It promises faster, more relevant search for users. The article states,
“Enterprise search engines consume this enriched metadata to provide a faster, more effective search for users. I2E uses natural language processing (NLP) technology to find concepts in the right context, combined with a range of other strategies including application of ontologies, taxonomies, thesauri, rule-based pattern matching and disambiguation based on context. This allows enterprise search engines to gain a better understanding of documents in order to provide a richer search experience and increase findability, which enables users to spend less time on search.”
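Two of the strategies named in the quoted passage, thesaurus lookup and rule-based pattern matching, can be illustrated with a small Python sketch. This is not how I2E works internally; the thesaurus entries, the dosage rule, the `enrich` function, and the sample sentence are all assumptions made up for illustration, and context-based disambiguation is left out entirely.

```python
import re

# Hypothetical mini-thesaurus: surface form -> preferred concept label.
THESAURUS = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "aspirin": "aspirin",
    "acetylsalicylic acid": "aspirin",
}

# Hypothetical rule: a number followed by a unit is treated as a dosage.
DOSE_RULE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg|g|ml)\b", re.IGNORECASE)


def enrich(text):
    """Return metadata a search engine could index alongside the raw text."""
    lowered = text.lower()
    # Normalize synonyms to one preferred label so searches match either form.
    concepts = sorted({
        label
        for term, label in THESAURUS.items()
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    })
    # Apply the pattern rule to pull out structured dosage mentions.
    doses = [f"{amount} {unit.lower()}" for amount, unit in DOSE_RULE.findall(text)]
    return {"concepts": concepts, "doses": doses}


doc = "Patients took 75 mg of acetylsalicylic acid after a heart attack."
metadata = enrich(doc)
print(metadata)
```

A search for "aspirin" or "myocardial infarction" could then match this document through its metadata even though neither phrase appears in the text itself.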
Whether they are spinning semantics for search, or if it is search spun for semantics, Linguamatics has made their technology available to tens of thousands of users of enterprise search. Representative John M. Brimacombe was straightforward in his comments about the disappointment surrounding enterprise search, but optimistic about I2E. It is currently being used by many top organizations, as well as the Food and Drug Administration.
Chelsea Kerwin, July 21, 2014
May 5, 2014
SAS is a well-recognized player in the IT game as a purveyor of data, security, and analytics software. In modern terms, the company is a big player in big data, and we caught word that it has updated its Text Miner to beef up its offerings. SAS Text Miner is advertised as a way for users to harness information not only in legacy data, but also in Web sites, databases, and other text sources. The process can be used to discover new ideas and improve decision-making.
SAS Text Miner offers a variety of benefits that set it apart from the standard open source download. Not only do users receive the license and tech support, but Text Miner offers the ability to process and analyze knowledge in minutes, an interactive user interface, and predictive and data mining modeling techniques. The GUI is what will draw in developers:
“Interactive GUIs make it easy to identify relevance, modify algorithms, document assignments and group materials into meaningful aggregations. So you can guide machine-learning results with human insights. Extend text mining efforts beyond basic start-and-stop lists using custom entities and term trend discovery to refine automatically generated rules.”
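For readers unfamiliar with the "basic start-and-stop lists" the quote mentions, here is a rough Python sketch of the idea: a stop list removes terms from the analysis, while a start list restricts the analysis to a chosen set of terms. The function name, the word lists, and the sample comments are hypothetical, and nothing here represents SAS Text Miner's actual interface.

```python
import re
from collections import Counter

# Hypothetical stop list: common words to ignore when counting terms.
STOP_LIST = {"the", "a", "is", "and", "of", "to", "was"}


def term_counts(documents, stop_list=STOP_LIST, start_list=None):
    """Count terms across documents.

    Terms on the stop list are always dropped. If a start list is given,
    only terms on that list are kept.
    """
    counts = Counter()
    for doc in documents:
        for token in re.findall(r"[a-z']+", doc.lower()):
            if token in stop_list:
                continue
            if start_list is not None and token not in start_list:
                continue
            counts[token] += 1
    return counts


docs = [
    "The delivery was late and the box was damaged.",
    "Late delivery, damaged product.",
]

all_terms = term_counts(docs)
focused = term_counts(docs, start_list={"late", "damaged"})
print(all_terms.most_common())
print(dict(focused))
```

Custom entities and term trend discovery, as described in the quote, go beyond this kind of fixed-list filtering.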
The ability to customize proprietary software can make or break a deal these days. With multiple options for text mining software, being able to make it unique is what will sell it.
April 27, 2014
I read “Algorithm Distinguishes Memes from Ordinary Information.” The article reports that algorithms can pick out memes. A “meme”, according to Google, is “an element of a culture or system of behavior that may be considered to be passed from one individual to another by nongenetic means, especially imitation.” The passage that caught my attention is:
Having found the most important memes, Kuhn and co studied how they have evolved in the last hundred years or so. They say most seem to rise and fall in popularity very quickly. “As new scientific paradigms emerge, the old ones seem to quickly lose their appeal, and only a few memes manage to top the rankings over extended periods of time,” they say.
The factoid that reminded me how far smart software has yet to travel is:
To test whether these phrases are indeed interesting topics in physics, Kuhn and co asked a number of experts to pick out those that were interesting. The only ones they did not choose were: 12. Rashba, 14. ‘strange nonchaotic’ and 15. ‘in NbSe3’. Kuhn and co also checked Wikipedia, finding that about 40 per cent of these words and phrases have their own corresponding entries. Together this provides compelling evidence that the new method is indeed finding interesting and important ideas.
Systems produce outputs that are not yet spot on. I concluded that scientists, like marketers, like whizzy new phrases and ideas. Jargon, it seems, is an important part of specialist life.
Stephen E Arnold, April 27, 2014