Lexalytics: GUI and Wizard

June 12, 2015

What is one way to improve a user’s software navigation experience?  One of the best ways is to add a graphical user interface (GUI).  Software Development @ IT Business Net shares a press release titled “Lexalytics Unveils Industry’s First Wizard for Text Mining and Sentiment Analysis.”  Lexalytics is one of the leading providers of sentiment and analytics solutions, and, as the article’s title explains, it has scored an industry first by releasing a GUI and wizard for its Semantria SaaS platform and Excel plug-in.  The wizard and GUI (SWIZ) are part of the Semantria Online Configurator, SWEB 1.3, which also includes functionality updates and layout changes.

“In order to get the most value out of text and sentiment analysis technologies, customers need to be able to tune the service to match their content and business needs,” said Jeff Catlin, CEO, Lexalytics. “Just like Apple changed the game for consumers with its first Macintosh in 1984, making personal computing easy and fun through an innovative GUI, we want to improve the job of data analysts by making it just as fun, easy and intuitive with SWIZ.”

Lexalytics is dedicated to helping its clients enjoy an easier experience when it comes to data analytics.  The company wants its clients to get the answers they need by providing the tools to retrieve them without having to overthink the process.  While Lexalytics already provides robust and flexible solutions, the SWIZ release continues to prove it has the most tunable and configurable text mining technology.

Whitney Grace, June 12, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Sentiment Analysis: The Progeny of Big Data?

June 9, 2015

I read “Text Analytics: The Next Generation of Big Data.” The article provides a straightforward explanation of Big Data, embraces unstructured information like blog posts in various languages, email, and similar types of content, and then leaps to the notion of text analytics. The conclusion to the article is that we are experiencing “The Coming of Age of Text Analytics—The Next Generation of Big Data.”

The idea is good news for the vendors of text analytics aimed squarely at commercial enterprises, advertisers, and marketers. I am not sure the future will match up to the needs of the folks at the law enforcement and intelligence conference I had just left.

There are three reasons:

First, text analytics is not new, and the various systems and methods have been in use for decades. Notable examples include BAE Systems’ use of its home-brew tools and Autonomy’s technology in the 1990s, and i2’s (pre-IBM) efforts even earlier.

Second, the challenges of figuring out what structured and unstructured data mean require more than determining if a statement is positive or negative. Text analytics is, based on my experience, blind to such useful data as real time geospatial inputs and video streamed from mobile devices and surveillance devices. Text analytics, like key word search, makes a contribution, but it is in a supporting role, not the Beyoncé of content processing.

Third, the future points to the use of technologies like predictive analytics. Text analytics are components in these more robust systems whose outputs are designed to provide probability-based outputs from a range of input sources.

There was considerable consternation a year or so ago. I spoke with a team involved with text analytics at a major telecommunications company. The grousing was that the outputs of the system did not make sense and it was difficult for those reviewing the outputs to figure out what the data meant.

At the LE/intel conference, the focus was on systems which provide actionable information in real time. My point is that vendors have a tendency to see the solutions in terms of what is often a limited or supporting technology.

Sentiment analysis is a good example. Blog posts urging readers to join ISIS are positive to some and negative to others. The point is that the reader’s point of view determines whether a message is positive or negative.

The only way to move beyond this type of superficial and often misleading analysis is to deal with context, audio, video, intercept data, geolocation data, and other types of content. Text analytics is one component in a larger system, not the solution to the types of problems explored at the LE/intel conference in early June 2015. Marketing often clouds reality. In some businesses, no one knows that the outputs are not helpful. In other endeavors, the outputs carry far higher import. Consider a recruiting video with a moving nasheed underscoring the good guys dispatching the bad guys. Is it important to know that the video is happy or sad? In fact, it is silly to approach the content in this manner.

Stephen E Arnold, June 9, 2015

Datameer: Action, Not Talk, about Data Governance

June 5, 2015

A happy quack to Datameer. The company is providing tools to deal with issues related to data quality, compliance, and security. If you Hadoop, Datameer is taking action, not just talking, with regard to Hadoop crunching. With “end users” fooling around with analytics, outputs can be exciting. Some Statistics 101 students would be reluctant to turn these “reports” in at the end of the term. For MBAs, point-and-click analyses are quick and easy. Outputs? Hey, isn’t anything generated by a computer correct?

Navigate to “Datameer Adds Governance Tools for Hadoop Analytics.”

According to the write up:

New data-profiling tools, for example, let companies find and transparently fix issues like dirty, inconsistent or invalid data at any stage in a complex analytics pipeline. Datameer’s capabilities include data profiling, data statistics monitoring, metadata management and impact analysis. Datameer also supports secure data views and multi-stage analytics pipelines, and it provides LDAP/Active Directory integration, role-based access control, permissions and sharing, integration with Apache Sentry 1.4 and column and row anonymization functions.

The source is one of the IDC/IDG properties, so check with Datameer to make certain you are getting the straight scoop.
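To make one of those capabilities concrete, column anonymization boils down to replacing a sensitive field with an irreversible token before the data reaches the analytics pipeline. Here is a minimal sketch with invented file and field names; it is my illustration, not Datameer’s code:

    import csv
    import hashlib

    # Toy illustration of column anonymization: replace values in a sensitive
    # column with salted hashes so analysts can still join and count on the
    # field without seeing the raw data. File and field names are made up.
    SALT = b"replace-with-a-secret-salt"

    def anonymize(value: str) -> str:
        return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:16]

    with open("customers.csv", newline="") as src, \
         open("customers_anonymized.csv", "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            row["email"] = anonymize(row["email"])  # anonymize the sensitive column
            writer.writerow(row)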

Stephen E Arnold, June 5, 2015

JackBe May Have a New Owner

June 2, 2015

JackBe, a Maryland-based intelligence software company, was sold to Software AG in late 2013. If the information in “Aurea Software with Renewed Offer to Acquire All Update Software AG Shares” is accurate, the major shareholder of Software AG may acquire the German firm. The buyer could be Aurea Software. According to the firm’s Web site:

At Aurea, we constantly challenge ourselves to identify and engineer truly transformative customer experiences, and we look for innovative ways that software and processes can transform an average experience for your customers into a great one. Our customer experience platform helps over 1,500 companies worldwide build, execute, monitor and optimize the end-to-end customer journey for a diverse range of industries including Energy, Retail, Insurance, Travel & Hospitality and Life Sciences.

According to a BusinessWeek profile, Aurea is the new positioning of Progress Software. BusinessWeek says:

Aurea Software was formerly known as Progress Software Corp., Sonic, Savvion, Actional and DXSI. As a result of the acquisition of Progress Software Corp., Sonic, Savvion, Actional and DXSI by Trilogy Enterprises, Inc, Progress Software Corp., Sonic, Savvion, Actional and DXSI’s name was changed. Progress Software Corp., Sonic, Savvion, Actional and DXSI comprises four progress software businesses.

A document attributed to Progress Software provides an interesting profile of Aurea. You can find this document as of June 1, 2015, at this link. My recollection is that Progress (now Aurea) used to own the EasyAsk search technology. In May 2015, Aurea acquired Lyris. According to the Aurea-Lyris news release:

Lyris is a global leader of innovative email and digital marketing solutions that help companies reach customers at scale and create personalized value at every touch point. Lyris’ products and services empower marketers to design, automate, and optimize experiences that facilitate superior engagement, increase conversions, and deliver measurable business value.

JackBe’s screen scraping technology can be used to figure out what customers are saying on the Internet and on a licensee’s help desk system. JackBe did include query tools.  If the Reuters story is off base, we will update this post. One assumes that the new Progress will “empower” some significant progress.

Stephen E Arnold, June 2, 2015

Free Book from OpenText on Business in the Digital Age

May 27, 2015

This is interesting. OpenText advertises its free, downloadable book in a post titled, “Transform Your Business for a Digital-First World.” Our question is whether OpenText can transform its own business; it seems the company’s financial results have been flat and generally drifting down of late. I suppose this is a do-as-we-say-not-as-we-do situation.

The book may be worth looking into, though, especially since it passes along words of wisdom from leaders within multiple organizations. The description states:

“Digital technology is changing the rules of business with the promise of increased opportunity and innovation. The very nature of business is more fluid, social, global, accelerated, risky, and competitive. By 2020, profitable organizations will use digital channels to discover new customers, enter new markets and tap new streams of revenue. Those that don’t make the shift could fall to the wayside. In Digital: Disrupt or Die, a multi-year blueprint for success in 2020, OpenText CEO Mark Barrenechea and Chairman of the Board Tom Jenkins explore the relationship between products, services and Enterprise Information Management (EIM).”

Launched in 1991, OpenText offers tools for enterprise information management, business process management, and customer experience management. Based in Waterloo, Ontario, the company maintains offices around the world.

Cynthia Murrell, May 27, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Computing Power Up a Trillion Fold in 60 Years. Search Remains Unchanged.

May 25, 2015

I get the Moore’s Law thing. The question is, “Why aren’t search and content processing improving?”

Navigate to “Processing Power Has Increased by One Trillion-Fold over the Past Six Decades” and check out the infographic. There are FLOPs and examples of devices which deliver them. I focused on the technology equivalents; for example, the Tianhe-2 supercomputer is the equivalent of 18,400 PlayStation 4s.
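The console comparison is simple arithmetic. A quick sanity check, assuming the Tianhe-2’s roughly 33.86 petaFLOPS Linpack rating and the PlayStation 4’s roughly 1.84 teraFLOPS GPU peak (figures from public spec sheets, not from the infographic):

    # Rough sanity check of the "18,400 PlayStation 4s" equivalence.
    # Assumed figures (public spec sheets, not from the infographic):
    #   Tianhe-2 Linpack performance: ~33.86 petaFLOPS
    #   PlayStation 4 GPU peak:       ~1.84 teraFLOPS
    tianhe2_flops = 33.86e15
    ps4_flops = 1.84e12

    equivalent_consoles = tianhe2_flops / ps4_flops
    print(f"Tianhe-2 is roughly {equivalent_consoles:,.0f} PlayStation 4s")
    # Prints roughly 18,402, in line with the infographic's 18,400 figure.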

The problem is that search and content processing continue to bedevil users. Perhaps the limitations of the methods cannot be remediated by a bigger, faster assemblage of metal and circuits?

The improvement in graphics is evident. But allowing me to locate a single document in my multi-petabyte archive continues to be a challenge. I have more search systems than the average squirrel in Harrod’s Creek.

Findability is creeping along. After 60 years, the benefits of information access systems are very difficult to tie to better decisions, increased revenues, and more efficient human endeavors even when a “team of teams” approach is used.

Here is a wake-up call for the search industry: why not deliver some substantive improvements in information access which are not tied to advertising? Please, do not use the words metadata, semantics, analytics, and intelligence in your marketing. Just deliver something that provides me with the information I require without my having to guess keywords, figure out oddball clustering, or wait minutes or hours for a query to process.

I don’t want Hollywood graphics. I want on point information. In the last 60 years, my information access needs have not been met.

Stephen E Arnold, May 25, 2015

IBM Watson: 75 Industries Have Watson Apps. What about Revenue from Watson?

May 25, 2015

Was it just four years ago? How PR time flies. I read “Boyhood.” Now here’s the subtitle, which is definitely Google-licious:

Watson was just 4 years old when it beat the best human contestants on Jeopardy! As it grows up and goes out into the world, the question becomes: How afraid of it should we be?

I am not too afraid. If I were the president of IBM, I would be fearful. Watson was supposed to be well on its way north of $1 billion in revenue. If I were one of the top wizards responsible for Watson, I would be trepidatious. If I were a stakeholder in IBM, I would be terrified.

But Watson does not frighten me. Watson, in case you do not know, is built from:

  1. Open source search
  2. Acquired companies’ technology
  3. Home brew scripts
  4. IBM big iron

The mix is held together with massive hyperbole-infused marketing.

The problem is that the revenue is just not moving the needle for the Big Blue bean counters. Please, recall that IBM has reported dismal financial results for three years. IBM is buying back its stock. IBM is selling its assets. IBM is looking at the exhaust pipes of outfits like Amazon. IBM is in a pickle.

The write up ignores what I think are important factoids about IBM. The article asserts:

The machine began as the product of a long-shot corporate stunt, in which IBM engineers set out to build an artificial intelligence that could beat the greatest human champions at Jeopardy!, one that could master language’s subtleties: rhymes, allusions, puns….It has folded so seamlessly into the world that, according to IBM, the Watson program has been applied in 75 industries in 17 countries, and tens of thousands of people are using its applications in their own work. [Emphasis added]

How could I be skeptical? Molecular biology. A cook book. Jeopardy.

Now for some history:

Language is the “holy grail,” he said, “the reflection of how we think about the world.” He tapped his head. “It’s the path into here.”

And then the epiphany:

Watson was becoming something strange, and new — an expert that was only beginning to understand. One day, a young Watson engineer named Mike Barborak and his colleagues wrote something close to the simplest rule that he could imagine, which, translated from code to English, roughly meant: Things are related to things. They intended the rule as an instigation, an instruction to begin making a chain of inferences, each idea leaping to the next. Barborak presented a medical scenario, a few sentences from a patient note that described an older woman entering the doctor’s office with a tremor. He ran the program — things are related to things — and let Watson roam. In many ways, Watson’s truest expression is a graph, a concept map of clusters and connective lines that showed the leaps it was making. Barborak began to study its clusters — hundreds, maybe thousands of ideas that Watson had explored, many of them strange or obscure. “Just no way that a person would ever manually do those searches,” Barborak said. The inferences led it to a dense node that, when Barborak examined it, concerned a part of the brain…that becomes degraded by Parkinson’s disease. “Pretty amazing,” Barborak said. Watson didn’t really understand the woman’s suffering. But even so, it had done exactly what a doctor would do — pinpointed the relevant parts of the clinical report, discerned the disease, identified the biological cause. To make these leaps, all you needed was to read like a machine: voraciously and perfectly.
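Stripped of the breathless prose, “things are related to things” reads like plain graph traversal. Here is a toy sketch of that kind of inference chain; the relation data and function names are invented for illustration and have nothing to do with IBM’s code:

    from collections import deque

    # Toy "things are related to things" inference: breadth-first search over a
    # hand-made relation graph. The facts below are invented for illustration.
    relations = {
        "tremor": ["basal ganglia", "anxiety"],
        "basal ganglia": ["dopamine", "substantia nigra"],
        "substantia nigra": ["Parkinson's disease"],
        "dopamine": ["Parkinson's disease", "reward"],
    }

    def chains(start: str, goal: str, max_depth: int = 4):
        """Yield chains of related things linking start to goal."""
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                yield path
                continue
            if len(path) > max_depth:
                continue
            for nxt in relations.get(path[-1], []):
                queue.append(path + [nxt])

    for chain in chains("tremor", "Parkinson's disease"):
        print(" -> ".join(chain))
    # tremor -> basal ganglia -> dopamine -> Parkinson's disease
    # tremor -> basal ganglia -> substantia nigra -> Parkinson's disease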

I have to take a break. My heart is racing. How could this marvel of technology be used to save lives, improve the output of Burger King, and become the all-time big winner on The Price Is Right?

Now let’s give IBM a pat on the back for getting this 6,000-word write up in a magazine consumed by those who love the Big Apple without the New Yorker’s copy editors poking their human noses into the reportage.

From my point of view, Watson needs to deliver:

  1. Sustainable revenue
  2. Proof that the system can be affordable
  3. Operation without human intermediaries to babysit the system
  4. Content processing so that real-time outputs are usable by those needing “now” insights
  5. Outputs free of egregious errors which force a human using Watson to spend time shaping them or figuring out whether they will deliver what the user requires; for example, a cancer treatment regimen which helps the patient or a burrito a human can enjoy.

Hewlett Packard and IBM have managed to get themselves into the “search and content processing” bottle. It sure seems as if better information outputs will lead to billions in revenue. Unfortunately, the reality is that getting big bucks from search and content processing is very difficult to do. For verification, just run a query on Google News with these terms: Hewlett Packard Autonomy.

The search and content processing sector is a utility function. There are applications which can generate substantial revenue. And it is true that these vendors include search as a utility function.

But pitching smart software spitballs works when one is not being watched by stakeholders. Under scrutiny, the approach does not have much of a chance. Don’t believe me? Take your entire life savings and buy IBM stock. Let me know how that works out.

Stephen E Arnold, May 25, 2015

Lexmark Buys Kofax: Scanning and Intelligence Support

May 22, 2015

Short honk: I was fascinated when Lexmark purchased Brainware and ISYS Search Software a couple of years ago. Lexmark, based in Lexington, Kentucky, used to be an IBM unit. That, it seems, did not work out as planned. Now Lexmark is in the scanning and intelligence business. Kofax converts paper to digital images. Think health care and financial services, among other paper-centric operations. Kofax bought Kapow, a Danish outfit that provides connectors and ETL software and services. ETL means extract, transform, and load. Kapow is a go-to outfit in the intelligence and government services sector. You can read about Lexmark’s move in “Lexmark Completes Acquisition of Kofax, Announces Enterprise Software Leadership Change.”

According to the write up:

  • This was a billion dollar deal
  • The executive revolving door is spinning.

In my experience, it is easier to spend money than to make it. Will Lexmark be able to convert these content processing functions into billions of dollars in revenue? Another good question to watch the company try to answer in the next year or so. Printers are a tough business. Content processing may be even more challenging. But it is Kentucky. Long shots get some love until the race is run.

Stephen E Arnold, May 22, 2015

Expert System Connects with PwC

May 21, 2015

Short honk: Expert System has become a collaborator with PwC, the accounting and professional services firm (its consulting arm was sold to IBM in 2002). The article points out:

Expert System, a company active in the semantic technology for information management and big data listed on Aim Italy, announced the collaboration with PwC, global network of professional services and integrated audit, advisory, legal and tax, for Expo Business Matching, the virtual platform of business meetings promoted by Expo Milano 2015, the Chamber of Commerce of Milan, Promos, Fiera Milano and PwC.

Expert System is a content processing developer with offices in Italy and the United States.

Stephen E Arnold, May 21, 2015

Lexalytics Offers Tunable Text Mining

May 13, 2015

Want to do text mining without some of the technical hassles? If so, you will want to read about Lexalytics, which bills itself as offering “the industry’s most tunable and configurable text mining technology.” Navigate to “Lexalytics Unveils Industry’s First Wizard for Text Mining and Sentiment Analysis.” I learned that text mining can be “fun, easy, and intuitive.” I highlighted this quote from the news story as an indication that one does not need to understand exactly what’s going on in the text mining process:

“Before, our customers had to understand the meaning of things like ‘alpha-numeric content threshold’ and ‘entities confidence threshold,'” Jeff continued. “Lexalytics provides the most knobs to turn to get the results exactly as you want them, and now our customers don’t even have to think about them.”

Text mining, the old-fashioned way, required knowing what was needed, which procedures were appropriate, and how to edit or write scripts. There are other skills that used to be the entry fee to text mining. The modern world of interfaces allows anyone to text mine. Do users understand the outputs? Sure. Perfectly.
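For context, the “knobs” in such systems are configuration values set before documents are submitted for processing. The sketch below shows what that tuning might look like in the abstract; the endpoint, parameter names, and thresholds are invented for illustration and are not Semantria’s actual API:

    import requests  # assumes the requests package is installed

    # Hypothetical configuration for a text-mining service. The endpoint and
    # parameter names below are invented for illustration; consult the vendor's
    # API documentation for the real settings.
    config = {
        "name": "support-tickets",
        "language": "English",
        "alphanumeric_content_threshold": 80,   # skip docs that are mostly symbols/numbers
        "entities_confidence_threshold": 0.55,  # drop low-confidence entity matches
        "sentiment_phrases_limit": 10,
    }

    response = requests.post(
        "https://api.example.com/v1/configurations",  # placeholder URL
        json=config,
        timeout=30,
    )
    response.raise_for_status()
    print("Created configuration:", response.json())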

As I read the story, I recalled a statement in “A Review of Three Natural Language Processors, AlchemyAPI, OpenCalais, and Semantria.” Here is the quote I noted in that July 2014 write up by Marc Clifton:

I find the concept of Natural Language Processing intriguing and that it holds many possibilities for helping to filter and analyze the vast and growing amount of information out there on the web.  However, I’m not quite sure exactly how one uses the output of an NLP service in a productive way that goes beyond simple keyword matching.  Some people will of course be interested in whether the sentiment is positive or negative, and I think the idea of extracting concepts (AlchemyAPI) and topics (Semantria) are useful in extracting higher level abstractions regarding a document.  NLP is therefore an interesting field of study and I believe that the people who provide NLP services would benefit from the feedback of users to increase the value of their service.

Perhaps the feedback was, “Make this stuff easy to do.” Now the challenge is to impart understanding to what a text mining system outputs. That might be a bit more difficult.

Stephen E Arnold, May 13, 2015
