New Tool Integrates with Text Analytics
March 21, 2013
Language and analytics are starting a new trend by coming together. According to the Destination CRM.com article “New SDL Machine Translation Tool Integrates with Text Analytics,” SDL has announced that its machine translation tool can now be integrated with text analytics solutions. SDL BeGlobal can translate both structured and unstructured information across more than 80 different language combinations. The translated information is then analyzed using text analytics solutions, giving users access to global customer insights as well as important business trends. Jean-Francois Damais, Deputy Managing Director of loyalty global clients solutions at Ipsos, had the following to say regarding SDL BeGlobal:
“With the growth in global business and the accessibility of online information, we now have a much greater need to access and analyze data from multiple languages. As a company focused on innovation and dedicated to our clients’ successes, we deployed SDL BeGlobal machine translation to further improve our research insights and bring new value to our customers.”
SDL BeGlobal has already caught on in the text analytics industry, and several well-known companies have jumped on the bandwagon. Raytheon BBN Technologies currently uses the technology for broadcast and Web content monitoring, and Expert Systems uses it for semantic intelligence. Language and analytics are not normally thought of together, but it seems SDL BeGlobal has a good thing going. Only time will tell if the new friendship between language and analytics will stand the test of time.
April Holmes, March 21, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Semantria Adds Value to Unstructured Data With Sentiment Analysis
March 19, 2013
We are constantly on the lookout for movers and shakers in the area of text analysis and sentiment analysis. So, I was intrigued when I recently came across the Web site of Semantria, a company claiming that its software makes text and sentiment analysis fast and easy. Given its claims of simplified costs and high-value capture, I had to research further.
The company was founded in 2011 as a software-as-a-service and services company, specializing in cloud-based text and sentiment analysis. The team draws on a foundation from text analytics provider Lexalytics, software developer Postindustria, and demand generation consultancy DemandGen.
The company page explains how its software can give insight into unstructured content:
“Semantria’s API helps organizations to extract meaning from large amounts of unstructured text. The value of the content can only be accessed if you see the trends and sentiments that are hidden within. Add sophisticated text analytics and sentiment analysis to your application: turn your unstructured content into actionable data.”
The Semantria API is powered by the Lexalytics Salience 5 analytics engine and is fully REST compliant. A processing demo is available at https://semantria.com/demo. We think it is well worth a look.
Andrea Hayden, March 19, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Come Here, Watson. I Want a Cusp of Commercialization
February 28, 2013
For a moment, I thought I was reading a sitcom script. You judge for yourself. Navigate to “And Now, from IBM, It’s Chef Watson.” If you have an environmentally unfriendly version of the New York Times, you can find the script—sorry, real news story—on page B1 of the February 28, 2013, edition.
Let me highlight several phrases and sentences which I found amusing and somewhat troubling for those trying to convince people to license next generation search systems. Keep in mind that the point of the story is Watson, IBM’s next generation Jeopardy winning search system. The peripatetic Watson has done education, insurance, and cancer cracking. Now, Watson and its formidable technical amalgamation of open source and proprietary code is prepping for the Food Network.
IBM’s Watson is hunting for revenues and finding publicity. Can a $100 billion entity find money in search, content processing, and analytics with a silicon Watson? Someday perhaps.
Here are the items I noted, highlighted in dark red and bold to make the words easy to spot:
First, this phrase, “…tries to expand its [IBM’s] artificial intelligence technology and turn Watson into something that actually makes commercial sense.” Reading this statement in the context of Hewlett Packard’s interesting commercial activities related to the write down of the spectacular $11 billion purchase of Autonomy is rife with irony, probably unintentional too.
Second, I found the phrase “on the cusp of commercialization.” Interesting. The Jeopardy show aired in early 2011. A “cusp,” according to one of the online dictionaries, is “A transitional point or time, as between two astrological signs.” Yep, I believe in astrology.
Exclusive Interview: Tom Reamy KAPS Group
February 27, 2013
Another Palantir Push: But Little Hard Financial Data. Why Not?
February 23, 2013
I was reading about the TED Conference’s yo-yo presentation. My eye drifted across an expanse of cellulose and landed on “The Humane Way to Crack Terrorists.” (This link will go dead so be aware that you may have to pay to read the item online.) The subtitle was one of those News Corp. Google things: “Big data may make enhanced interrogation obsolete.” The source? Some minor blog from America’s hinterland, Silicon Valley? Nope. The Wall Street Journal, February 23, 2013, page C 12.
What’s the subject – really? The answer, in my opinion, is Palantir. If you monitor the flagship, traditional media, Palantir has a solid track record of getting written about in print magazines. I suppose that the folks who have pumped about $150 million into the “big data” company read those magazines and the Wall Street Journal type publications each day. I know I do, and I am an addled goose in rural Kentucky, the high tech nerve center of the new industrial revolution. After February 28, 2013, I am not sure about the economy, however.
Here’s the passage I noted:
There’s a tellingly brief passage in “The Finish: The Killing of Osama bin Laden” by Mark Bowden. “The hunt for bin Laden and others eventually drew on an unfathomably rich database,” he writes. “Sifting through it required software capable of ranging deep and fast and with keen discernment—a problem the government itself proved less effective at solving than were teams of young software engineers in Silicon Valley. A startup called Palantir, for instance, came up with a program that elegantly accomplished what TIA [Terrorism Information Awareness program, set up in 2002] had set out to do.” When I met the chief executive and co-founder of Palantir, Alex Karp, recently, he was straightforward: “It is my personal belief that flawless data integration at any kind of scale, with a rigorous access control model, allows analysts to perform operations that are only intrusive on the data. They are not intrusive on human beings.” Obviously, Palantir doesn’t comment on classified work. But its technological phalanx—processing countless leads, from flight manifests to tapped phone calls, into one resource for people to interpret—is known to have been key in locating bin Laden. The company, founded in 2004, has large contracts across the intelligence community and is enterprise-wide at the FBI. Its first client was the CIA.
Nifty stuff. Palantir has high profile clients like intelligence and law enforcement outfits. But where is a hedge fund or a consumer products company? Allegedly the fancy math technology can work wonders. The implication is that outfits like Digital Reasoning, Recorded Future, and even Tibco are not in Palantir’s league. Oh, really? What about outfits like IBM and Oracle and SAS? Nah. Palantir seems to be where the good stuff happens in the context of this Wall Street Journal article.
In my view, the write up triggered several notes on my ubiquitous 4×6 paper note cards, just like the ones I used in high school debate competitions:
First, what about that legal dust up with i2 Group? Here’s a link to refresh one’s memory. I recall that there was also some disagreement, a few real media stories, and then a settlement regarding sector leader i2. Note: I did some work years ago for this outfit, which is now owned by IBM. Oh, and after the settlement, silence. Just what was that legal dispute about anyway? The Wall Street Journal story does not touch on that obviously trivial issue related to the legal matter. Why not? The space in the newspaper was probably needed to cover the yo-yo guy.
Second, can software emulate the motion picture approach to reality? In my experience, numerical recipes can be useful, but they can also provide some points which are subject to contention. A recent example is the gentleman’s disagreement about an electric vehicle. Data, analyses, and interpretations—muddled. Not like the motion pictures’ tidiness and quite final end point. “The end” solves a lot of fictional problems. Life is less clear, a lot less clear in my experience.
Third, how is Palantir doing as a business? After all, the story ran in the Wall Street Journal, which is about business. I appreciate the references to a motion picture, but I am curious about how Palantir is doing on its march to generate a billion or more in revenues. At some point, the investors are going to look at the money pumped into Palantir, the time spent developing the magical technology which warrants metaphorical juxtaposition to Hollywood outputs, and the profitability of the company’s sales. Why doesn’t the Wall Street Journal do the business thing? Revenue, commercial customers, and case studies which do not flaunt words which Bing and Google love to consume in their indexing systems?
It is Saturday, and I suppose there are lots of 20 somethings working at 0900 Eastern as I write this. They will fill the gap. I will have to wait. I wonder if the predictive algorithms from Palantir can tell me how long before hard facts become available?
One final question: If this Palantir type of system worked, why aren’t the firms in this Palantir-type software sector dominating in financial services, marketing, and consumer products? I wonder if the reason is that fancy math generates high expectations and then creates some situations in which reality does not work just like a cinema thriller?
Stephen E Arnold, February 23, 2013
IBM and Price Cuts: Is Watson a Factor?
February 17, 2013
I read “IBM Cuts Price of Watson Based Power Servers.” I have no clue if the story is correct, half current, or incorrect. What’s important is that CIOL.com thought the notion of a Watson related price cut newsworthy.
The Power7 based servers were hot stuff several years ago. CPU performance is no longer the gating factor as it was in the days of STAIRS III. Input/output, memory subsystems, and various types of latency make a system fast or not. Heck, careless programming can make Google’s zippy boxes howl with pain when their innards suffer a computational cramp.
The write up asserts:
IBM will roll out eight new Power Systems for entry level starting at $5,947. The new systems include Power Express 710, 720, 730 and 740 family of products…. IBM will also introduce two new PowerLinux Systems – 7R1 and 7R2 – optimized for IBM InfoSphere BigInsights and InfoSphere Streams big data analytics software. The company will also introduce two new Power Systems – 750 and 760 – for midsized and large enterprises.
The hot item in the story in my opinion is this reference:
The new systems are based on IBM’s Watson system and are powered by its Power7+ microprocessor technology. These will enable users to build and deploy infrastructure for private and hybrid clouds, as per a release.
The write up includes the now obligatory baloney about the big data, cloud and caching tactics for performance.
If the story is incorrect, no big deal. Any publicity is good, even for a dog movie like “Heaven’s Gate” and its expensive roller skates. If the story is half correct, why is Watson making an appearance in juxtaposition to “entry level”? Is the vaunted Jeopardy winning technology not generating sufficient revenue to pay back the development time and the sunk marketing costs? If the story is correct, I am interested in the fact that high end information technology has to be bundled at lower prices.
Years ago, I was told by an informed person that IBM knew what it was doing when it came to search and information retrieval. Maybe the company will come to dominate the enterprise market for big data, analytics, and smarter search. On the other hand, hasn’t IBM travelled this road before, and yet the journey continues?
Stay tuned to Jeopardy or monitor the cancer related news stream. Watson is with us along with a Power7 chip which may be experiencing some symptoms of rheumatism.
Stephen E Arnold, February 17, 2013
Sinequa France: Update 2013
February 14, 2013
My research team was winnowing our archive of information about European search vendors. Since Martin White’s article for eContent in 2011, a number of changes have swept through the search and content processing sector. Some changes were significant; for example, HP’s stunning acquisition of Autonomy. Others were more modest; for example, the steady progress of such companies as Sinequa and Spotter, among others.
The European technical grip on search is getting stronger. Google is the dominant player in Web search. But in enterprise content processing, some European firms are moving more rapidly than their North American or Pacific Rim counterparts.
The Sinequa tag cloud. See http://www.sinequa.com/en/page/solutions/category-1.aspx
One interesting example is Sinequa, based in Paris. The company, like other French technology firms, has a staff of capable engineers and managers. However, unlike some other companies, Sinequa has continued to establish a track record as a company innovating in technology and capturing some important accounts; for example, Siemens, the German industrial powerhouse.
Sinequa’s approach is to emphasize that enterprise search has moved to unified information access. A number of companies make similar claims. Sinequa has established that its technology can deliver the type of one-stop access to structured and unstructured content that almost every vendor claims to deliver. You can get a useful overview of the architecture of the Sinequa platform at http://www.sinequa.com/en/page/product/product.aspx.
A relatively recent addition to the Sinequa.com Web site is a set of case analysis videos. I find case examples extremely useful. The presentation of this type of information in rich media format makes it easier for me to get a sense of the value of the solution a vendor delivers. I found the Mercer video particularly interesting. You can find these testimonials at http://www.sinequa.com/en/page/clients/clients-video.aspx.
The trajectory of European search, content processing, and analytics vendors is difficult to plot in today’s uncertain economic climate. Sinequa warrants a close look for organizations seeking an integrated approach to their content assets. For more information about Sinequa’s current activities, tap into the firm’s blog at http://blog.sinequa.com/
Stephen E Arnold, February 14, 2013
Sponsored by EMRxNow, the information service which tracks automated indexing of electronic medical records
Change Comes to Attensity
February 14, 2013
Just as the demand for analytics is ascending, Attensity makes a management change. We learn the company recently named J. Kirsten Bay their head honcho in “Attensity Names New President/CEO,” posted at Destination CRM. The press release stresses the new CEO’s considerable credentials:
“Bay brings to Attensity nearly 20 years of strategic process and organizational policy experience derived from the information management, finance, and consumer product industries. She is an expert in advising both the public and private sector on the development of econometric policy models. Most recently, as vice president of commercial business with iSIGHT Partners, Bay provided strategic counsel to Fortune 500 companies on managing intelligence requirements and implementing customer and development programs to integrate intelligence into decision programs.”
The company’s flagship product Attensity Pipeline collects and semantically annotates data from social media and other online sources. From there, it passes to Attensity Analyze for text analytics and customer engagement suggestions.
Headquartered in Palo Alto, California, Attensity prides itself on the accuracy of its analytic engines and its intuitive reports. With roots in developing tools that serve the intelligence community, the company now provides semantic solutions to many Global 2000 companies and government agencies.
Cynthia Murrell, February 14, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
From Jeopardy to Cancer Treatment: An IBM Story
February 10, 2013
I read “IBM Supercomputer Watson to Help in Cancer Treatment.” I am burned out on the assertions of search, content processing, and analytics vendors. The algorithms predict, deliver actionable information, and answer tough questions. Okay, I will just believe these statements. Most of the folks with whom I interact either believe these statements or do not really care.
Watson, as you may know, takes open source goodness, layers on a knowledge base, and wraps the confection in layers of smart software. I am simplifying, but the reality is irrelevant given the marketing need.
Here’s the passage I noted:
A year ago, a team at Memorial Sloan-Kettering started working with an IBM and a WellPoint team to train Watson to help doctors choose therapies for breast and lung cancer patients. They continue to share their knowledge and expertise in oncology and information technology, beginning with hundreds of lung cancers, the aim being to help Watson learn as much as possible about cancer care and how oncologists use medical data, as well as their experiences in personalized cancer therapies. During this period, doctors and technology experts have spent thousands of hours helping Watson learn how to process, analyze and interpret the meaning of sophisticated clinical data using natural language processing; the aim being to achieve better health care quality and efficiency.
There you go. For the dozens of companies working to create next generation information retrieval systems which are affordable, actually work, and can be deployed without legions of engineers—game over. IBM Watson has won the search battle. Now for the optimists who continue to pump money into decade old search companies which have modest revenue growth, kiss those bucks goodbye. For the PhD students working on the revolutionary system which promises to transform findability, get a job at Kentucky Fried Chicken. And Google? Well, IBM knows your limits so stick to selling ads.
IBM is doing it all:
Manoj Saxena, IBM General Manager, Watson Solutions, said:
“IBM’s work with WellPoint and Memorial Sloan-Kettering Cancer Center represents a landmark collaboration in how technology and evidence based medicine can transform the way in which health care is practiced. These breakthrough capabilities bring forward the first in a series of Watson-based technologies, which exemplifies the value of applying big data and analytics and cognitive computing to tackle the industry’s most pressing challenges.”
How different is Watson from the HP Autonomy, Recommind, or even the DR LINK technology? Well, maybe the open source angle is the same. But IBM needs to do more than make assertions and buy analytics companies as the company recycles open source technology in my opinion. I thought IBM was a consulting firm? Here I am wrong again. Watson probably “knew” that after hours of training, tuning, and talking. But in the back of my mind, I ask, “What if those training data are inapplicable to the problem at hand? What if the journal articles are fiddled by tenure seekers or even pharmaceutical outfits or institutions trying to maximize insurance payouts or careless record keeping by medical staff?” Nah, irrelevant questions. IBM has this smart system nailed. Search solved. What’s next, IBM?
Stephen E Arnold, February 10, 2013
eDiscovery: A Source of Thrills and Reduced Costs?
February 2, 2013
When I hear the phrase “eDiscovery”, I don’t get chills. I suppose some folks do. I read after dinner last night (February 1, 2013) “Letter From LegalTech: The Thrills of E-Discovery.” The author addresses the use of search and content processing technology to figure out which documents are most germane to a legal matter. Once the subset has been identified, eDiscovery provides outputs which “real” attorneys (whether in Bangalore or Binghamton) can use to develop their “logical” arguments.
One interesting factoid bumps into my rather sharp assessment of the “size” of the enterprise search market generated by an azure chip outfit. The number was about $1.5 billion. In the eDiscovery write up, the author says:
Nobody seems to know how large the e-discovery market is — estimates range from 1.2 to 2.8 billion dollars — but everyone agree it’s not going anywhere. We’re never going back to sorting through those boxes of documents in that proverbial warehouse.
I like the categorical affirmative “nobody.” The point is that sizing any of the search and content processing markets is pretty much like asking Bernie Madoff type professionals, “How much in liquid assets do you have?” The answer is situational, enhanced by marketing, and believed without a moment’s hesitation.
I know the eDiscovery market is out there because I get lots of PR spam about various breakthroughs, revolutions, and inventions which promise to revolutionize figuring out which email will help a legal eagle win a case with his or her “logical” argument. I wanted to use the word “rational” in the manner of John Ralston Saul, but the rational attorneys are leaving the field and looking for work as novelists, bloggers, and fast food workers.
One company, an outfit called Catalyst Repository Systems, flooded me with PR email spam about its products. I called the company on January 31, 2013. I was treated in an offhand, suspicious manner by a tense, somewhat defensive young man named Mark, Monk, Matt, or Mump. At age 69, I have a tough time figuring out Denver accents. Mark, Monk, Matt, or Mump took my name and phone number. He assured me that his boss would call me back to answer my questions about PR spam and the product which struck me as a “me too.” I did learn that he had six years of marketing experience and that he just pushes “the send button.” When I suggested that he may want to know to whom he is sending messages multiple times, he said, “You are being too aggressive.” I pointed out that I was asking a question just like the lawyers who, one presumes, gobble up the Catalyst products. He took my name, did not ask how to spell it, wrote down my direct line and did not bother to repeat it back to me, and left me with the impression that I was out of bounds and annoying. That was amusing because I was trying hard to be a regular type caller.
A happy quack to Bitter Lawyer which has information about the pressures upon some in the legal profession. See http://www.bitterlawyer.com/i%E2%80%99m-unemployed-and-feel-ripped-off-by-my-ttt-law-school/
Mark, Monk, Matt, or Mump may have delivered the message and the Catalyst top dog was too busy to give me a jingle. Another possibility is that Mark, Monk, Matt, or Mump never took the note. He just wanted to get a person complaining about PR spam off the phone. Either way, Catalyst qualifies as an interesting example of what’s happening in eDiscovery. Desperation marketing has infected other subsectors of the information retrieval market. Maybe this is an attempt to hit real revenues of $1.5 billion?