SAS Gets More Visual

March 31, 2012

Inxight (now owned by BusinessObjects, part of the SAP empire) is history at SAS, or almost history. Now the company is moving in a different direction.

Jaikumar Vijayan writes about a new visual analytics application recently unveiled by SAS in his article “SAS Promises Pervasive BI with New Tool.” Einstein is believed to have once said “computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.” We noted this passage from Mr. Vijayan’s write up:

Unlike many purely server-based enterprise analytics technologies, Visual Analytics gives business users a full range of data discovery, data visualization and querying capabilities from desktop and mobile client devices, the company said.

The initial version of the new tool allows iPad users to view reports and download information to their devices. Future versions will support other mobile devices as well, SAS added. The quote is actually a good description of the concept that underlies Visual Analysis. The process uses analytic reasoning to detect specific information in massive amounts of data. For example, a clothing manufacturer might use it to determine current trends in ladies’ fashions. The results are presented in charts and graphs to the users, who can fine-tune the parameters until their specific queries are answered.
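
To make the slice-chart-adjust loop concrete, here is a minimal sketch in Python with pandas and matplotlib. The data and column names are invented, and this is an illustration of the general workflow, not SAS Visual Analytics.

# A hedged, minimal sketch of the workflow described above: slice a sales
# table, re-aggregate, chart the result, then adjust the filter and look
# again. The data and column names are invented; this is not SAS software.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.DataFrame({
    "category": ["dresses", "dresses", "outerwear", "outerwear", "accessories"],
    "season":   ["spring",  "fall",    "spring",    "fall",      "spring"],
    "units":    [1200,      800,       300,         950,         450],
})

# First pass: which categories are moving this spring?
spring = sales[sales["season"] == "spring"]
spring.groupby("category")["units"].sum().plot(kind="bar", title="Spring units by category")
plt.show()

# "Fine-tune the parameters": switch the filter to fall and look again.
fall = sales[sales["season"] == "fall"]
fall.groupby("category")["units"].sum().plot(kind="bar", title="Fall units by category")
plt.show()

The point is the loop: filter, aggregate, look at the picture, change the filter, and look again until the question is answered.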

SAS is known for its statistical functionality, its programming language, and its need for SAS-savvy cowpokes to ride herd on the bits and bytes. Will SAS be able to react to the trend toward the consumerization of business intelligence?

While the technology is impressive, SAS may be a little late to the game. Palantir and Digital Reasoning have already introduced applications that offer clients powerful Visual Analysis capabilities. Time will tell if SAS is able to catch up with its competitors’ approaches. We are interested in Digital Reasoning, Ikanow and Quid.

Stephen E Arnold, March 31, 2012

Sponsored by Pandia.com

Law Firms Learn Staff Can Be Repurposed

March 30, 2012

I know there is considerable enthusiasm for smart software. Most of the eDiscovery vendors suggest that humans and whizzy new systems can coexist. Now, a new chapter in justifiable staff reductions may be upon us. Navigate to “A New View of Review: Predictive Coding Vows to Cut E-Discovery Drudgery” to learn that recently released research from an Ivory Tower-type says that a “predictive coding approach can do a better job of sifting through more than 800,000 documents than humans.”

For many law school graduates, scouring documents for material of value to a case has long been a secure if somewhat tedious means of entering the legal profession. This will no longer be true, however, if a new type of software lives up to its creator’s claims. Known as predictive coding, it can supposedly do the same job faster, cheaper, and as well as humans. But lawyers live to bill, so perhaps software may force law firms to get rid of staff and trust the algorithms.

We learn:

There has been a long-standing myth in the legal field that exhaustive manual review is the gold standard, or nearly perfect, but that has been shown to be a fallacy, according to Maura R. Grossman, a New York City attorney. Research has shown that, under the best circumstances, manual review will identify about 70 percent of the responsive documents in a large data collection. Some technology-assisted approaches have been shown to perform at least as well as that, if not better, at far less cost.

Attorneys, paralegals, unpaid interns, and experts in India will miss 30 percent of the pertinent documents. Smart software is the path to the future.
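
For readers who want a feel for what sits under the hood, here is a minimal, hedged sketch of the kind of supervised classifier that technology-assisted review relies on. The documents, labels, and library choice (scikit-learn) are ours, not Recommind’s; a production system adds sampling, validation, and iterative training with attorney feedback.

# Minimal sketch of "predictive coding": train a classifier on a small set
# of attorney-labeled documents, then rank the unreviewed collection by
# predicted responsiveness. Documents and labels below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_docs = [
    ("Email discussing the disputed supply contract terms", 1),
    ("Quarterly cafeteria menu and parking announcements", 0),
    ("Draft amendment to the supply agreement with attorney comments", 1),
    ("Company picnic photos and RSVP list", 0),
]
texts, labels = zip(*labeled_docs)

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

model = LogisticRegression()
model.fit(X, labels)

# Score unreviewed documents; humans review only the top-ranked ones.
unreviewed = [
    "Counsel's notes on supply contract negotiations",
    "Reminder: update your emergency contact information",
]
scores = model.predict_proba(vectorizer.transform(unreviewed))[:, 1]
for doc, score in sorted(zip(unreviewed, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {doc}")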

Some observers worry about the legal defensibility of predictive coding. But such concerns are unfounded, so long as both sides agree to its use. That’s according to Craig Carpenter, a marketer for Recommind, a software development firm focused on the legal and corporate market.

But even sophisticated programs don’t actually think. Without that capacity, they cannot understand the subtle nuances and informal connections that underlie written documents. It’s unlikely that predictive coding will live up to the sales hype surrounding it. But there is nothing new about search vendors’ marketing: reality is often different from Spock’s world on Star Trek.

Stephen E Arnold, March 30, 2012

Sponsored by Pandia.com

New Alert Feature for Clarabridge Social Media Analytics

March 28, 2012

Social functions are refining their role in the business intelligence niche. The BrainYard reports, “Clarabridge Adds Alerts to Social Media Analytics.”

So what do you do with all that information your business collects from social media? A backlog in the analysis of time-sensitive data could cost a company in lost opportunities. Clarabridge now addresses this problem with automatic alerts. Writer David F. Carr explains:

Clarabridge 5.0 provides tools for collaborating around an analysis. By configuring more proactive notifications, Clarabridge users might also configure the system to automatically send alerts to the correct regional manager–or product manager, or department head–making it more likely that the organization will take action immediately after detecting a specific problem or opportunity. . . . ‘If somebody just tweeted, “I went into Kohl’s and slipped and fell, so now I’m going to sue,” if you’re Kohl’s you want to know that,’ [Clarabridge VP Sidra] Berman said.

Collaboration is the focus of Clarabridge 5.0, formally released March 20, 2012. After all, much of this data points to challenges that require action from multiple departments. Though the alert function is a useful tool, it is important to remember that it will take skillful action to make the most of the new feature.
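
A toy sketch of the routing idea in the quote above shows how little machinery the concept requires. The rules, regions, and addresses are invented, and this is not Clarabridge’s API.

# Illustrative sketch only -- not Clarabridge's software. It mimics the
# idea in the quote: when incoming social content matches a problem
# pattern, alert the manager responsible for that region or product.
ALERT_RULES = [
    {"keywords": ["slipped", "fell", "sue", "injury"], "topic": "liability"},
    {"keywords": ["rude", "terrible service"], "topic": "service"},
]
REGIONAL_MANAGERS = {"midwest": "midwest.manager@example.com",
                     "northeast": "northeast.manager@example.com"}

def route_alert(post_text, region):
    """Return (recipient, topic) if the post trips a rule, else None."""
    lowered = post_text.lower()
    for rule in ALERT_RULES:
        if any(word in lowered for word in rule["keywords"]):
            return REGIONAL_MANAGERS.get(region), rule["topic"]
    return None

print(route_alert("I went into the store, slipped and fell, now I'm going to sue", "midwest"))
# ('midwest.manager@example.com', 'liability')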

Clarabridge aims to delve deeper into the meaning behind each piece of content than the competition. Having spent years developing its sentiment and text analytics technology, the company boasts that it is uniquely positioned to support enterprise-scale customer feedback initiatives.

Cynthia Murrell, March 28, 2012

Sponsored by Pandia.com

Connotate Acquires Fetch Technologies

March 27, 2012

I know, “Who? Bought what?”

Connotate is a data fusion company which uses software bots (agents) to harvest information. Fetch Technologies, founded more than a decade ago, processes structured data. The deal comes on the heels of some executive ballroom dancing. Connotate snagged a new CEO, Keith Cooper, according to New Jersey Tech Week. Fetch also uses agent technology.

Founded in 1999, Fetch Technologies enables organizations to extract, aggregate and use real-time information from Web sites. Fetch’s artificial intelligence-based technology allows precise data extraction from any Web site, including the so-called Deep Web, and transforms that data into a uniform format that can be integrated into any analytics or business intelligence software.

The company’s technology originated at the University of Southern California’s Information Sciences Institute. Fetch’s founders developed the core artificial intelligence algorithms behind the Fetch Agent Platform while they were faculty members in Computer Science at USC. Fetch’s artificial intelligence solutions were further refined through years of research funded by the Defense Advanced Research Projects Agency (DARPA), the National Science Foundation (NSF), the U.S. Air Force, and other U.S. Government agencies.
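
For readers new to the category, here is a bare-bones illustration of the general technique: pull structured records out of Web page markup and normalize them for an analytics tool. The HTML snippet and field names are hypothetical, and this is nothing like Fetch’s agent platform in sophistication.

# A minimal illustration of the general technique the announcement
# describes (extracting structured records from Web pages and normalizing
# them), not Fetch's AI-based extraction. HTML and fields are invented.
from bs4 import BeautifulSoup

page = """
<table id="products">
  <tr><td class="name">Widget A</td><td class="price">$19.99</td></tr>
  <tr><td class="name">Widget B</td><td class="price">$4.50</td></tr>
</table>
"""

soup = BeautifulSoup(page, "html.parser")
records = []
for row in soup.select("#products tr"):
    name = row.find("td", class_="name").get_text(strip=True)
    price = row.find("td", class_="price").get_text(strip=True)
    # Normalize into a uniform record ready for a BI or analytics tool.
    records.append({"name": name, "price_usd": float(price.lstrip("$"))})

print(records)
# [{'name': 'Widget A', 'price_usd': 19.99}, {'name': 'Widget B', 'price_usd': 4.5}]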

The Connotate news release said:

“Fetch is very excited to combine our information extraction, integration, and data analytics solution with Connotate’s monitoring, collection and analysis solution,” said Ryan Mullholland, Fetch’s former CEO and now President of Connotate. “Our similar product and business development histories, but differing go-to-market strategies creates an extraordinary opportunity to fast-track the creation of world-class proprietary ‘big data’ collection and management solutions.”

Okay, standard stuff. But here’s the paragraph that caught my attention:

“Big data, social media and cloud-based computing are major drivers of complexity for business operations in the 21st century,” said Keith Cooper, CEO of Connotate. “Connotate and Fetch are the only two companies to apply machine learning to web data extraction and can now take the best of both solutions to create a best-of-breed application that delivers inherent business value and real-time intelligence to companies of all sizes.”

I am not comfortable with the assertion of “only two companies to apply machine learning to Web data extraction.” In our coverage of the business intelligence and text mining market in Inteltrax.com, we have written about many companies which have applied such technologies and generated more market traction. Examples range from Digital Reasoning to Palantir, among others.

The deal is going to deliver on a “unified vision.” That may be true; however, saying and doing are two different tasks. As I write this, unification is the focus of activities from big dogs like Autonomy, now part of Hewlett Packard, to companies which have lower profiles than Connotate or Fetch.

We think that the pressure open source business intelligence and open source search are exerting will increase. With giants like IBM (Cognos, i2 Group, SPSS) and Oracle working to protect their revenues, more mergers like the Connotate-Fetch tie up are inevitable. You can read a July 14, 2010, interview with Xoogler Mike Horowitz, Fetch Technologies at this link.

Will the combined companies rock the agent and data fusion market? We hope so.

Stephen E Arnold, March 27, 2012

Sponsored by Pandia.com

The Invisibility of Open Source Search

March 27, 2012

I was grinding through my files and I noticed something interesting. After I abandoned the Enterprise Search Report, I shifted my research from search and retrieval to text processing. With this blog, I tried to cover the main events in the post-search world. The coverage was more difficult than I anticipated, so we started Inteltrax, which focuses on systems, companies, and products which “find” meaning using numerical recipes. But that does not do enough, so we are contemplating two additional free information services about “findability.” I am not prepared to announce either of these at this time. We have set up a content production system with some talented professionals working on our particular approach to content. We are also producing some test articles.


Until we make the announcement, I want to reiterate a point I made in my talks in London in 2011 about open source search and content processing:

Most reports about enterprise search ignore open source search solution vendors. A quiet revolution is underway, and for many executives, the shift is all but invisible.

We think that the “invisible” nature of the open source search and content processing options is due to four factors:

Most of the poobahs, self-appointed experts and former home economics majors have never installed, set up, or optimized an open source search system. Here at ArnoldIT we have that hands-on experience; a short code sketch after the fourth point below shows just how approachable a basic setup has become. And we can say that open source search and content processing solutions are moving from the desks of Linux wizards to more mainstream business professionals.

Next, we see more companies embracing open source, contributing to the overall community with bug fixes and new features and functions. At the same time, the commercial enterprises are “wrapping” open source with proprietary, value-added features and functions. The leader in this movement is IBM. Yep, good old Big Blue is an adherent of open source software. Why? We will try to answer this in our new information services.

Third, we think the financial pressure on organizations is greater than ever. CNBC and the Murdoch-outfitted Wall Street Journal are cheering for the new economic recovery. We think that most organizations are struggling to make sales, maintain margins, and generate new opportunities. Open source search and content solutions promise some operating efficiencies. We want to cover some of the business angles of the open source search and content processing shift. Yep, open source means money.

Finally, the big solutions vendors are under a unique type of pressure. Some of it comes from licensees who are not happy with the cost of “traditional” solutions. Other pressure comes from the data environment itself. Let’s face it: certain search systems, such as your old and dusty version of IBM STAIRS or Fulcrum, won’t do the job in today’s data- and information-rich environment. New tools are needed. Why not solve a new information problem without dragging the costs, methods, and license restrictions of traditional enterprise software along for the ride? We think change is in the wind, just like the smell of sweating horses a couple of months before the Kentucky Derby.
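
As promised above, here is that sketch: a small taste of how approachable the open source option has become. It uses Whoosh, a pure-Python search library; Lucene and Solr offer the same ideas at enterprise scale. The schema and documents are invented.

# Index two documents and run a query with Whoosh, an open source,
# pure-Python search library. Schema and documents are hypothetical.
import os
from whoosh.index import create_in
from whoosh.fields import Schema, TEXT, ID
from whoosh.qparser import QueryParser

schema = Schema(doc_id=ID(stored=True), body=TEXT(stored=True))
os.makedirs("indexdir", exist_ok=True)
ix = create_in("indexdir", schema)

writer = ix.writer()
writer.add_document(doc_id="1", body="Open source search moves into the mainstream.")
writer.add_document(doc_id="2", body="Proprietary license fees are under pressure.")
writer.commit()

with ix.searcher() as searcher:
    query = QueryParser("body", ix.schema).parse("open source")
    for hit in searcher.search(query):
        print(hit["doc_id"], hit["body"])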

Our approach to information in our new services will be similar to that taken in Beyond Search. We want to provide pointers to useful write ups and offer some comments which put certain actions and events in a slightly different light. Will you agree with the information in our new services? We hope not.

Stephen E Arnold, March 27, 2012

Sponsored by Pandia.com

Publishers Pose Threats to Text Mining Expansion

March 26, 2012

Text mining software is all the rage these days due to its ability to make significant connections by quickly scanning through thousands of documents. This software can recognize, extract and index scientific information from vast amounts of plain text, allowing computers to read and organize a body of knowledge that is expanding too fast for any human to keep up with. However, Nature.com recently reported on some issues that have developed in this growing industry in the article “Trouble at the Text Mine.”
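
A toy illustration of the “recognize, extract and index” step appears below, hedged heavily: the pattern and sample papers are invented, and real pipelines are far more sophisticated than a single regular expression.

# Pull DNA-like sequences out of article text and record where they occur.
# This is an invented, minimal example of extraction plus indexing, not
# any publisher's or researcher's actual pipeline.
import re

papers = {
    "paper_001": "We amplified the region using primer ACGTGGTTACCAGTACGT as described.",
    "paper_002": "No sequence data were reported in this methods section.",
}

DNA_PATTERN = re.compile(r"\b[ACGT]{12,}\b")  # runs of A/C/G/T, 12+ bases

index = {}
for paper_id, text in papers.items():
    for match in DNA_PATTERN.finditer(text):
        index.setdefault(match.group(), []).append(paper_id)

print(index)
# {'ACGTGGTTACCAGTACGT': ['paper_001']}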

According to the article, text mining programmers Max Haeussler and Casey Bergman have run into trouble trying to get science publishers to agree to let them mine their content.

The article asserts:

Many publishers say that they will allow their subscribers to text-mine, subject to contract and the text-miners’ intentions, and point to a number of successful agreements. But like many early advocates of the technology, Haeussler and Bergman complain that publishers are failing to cope with requests, and so are holding up the progress of research. What is more, they point out, as text-mining expands, it will be impractical for individual academic teams to spend years each working out bilateral agreements with every publisher.

While some publishers are getting on board the text mining train, many are still trying to work out how to take advantage of the commercial value before signing on. Too bad it takes more than a degree in English to make text mining deliver useful results. Bummer.

Jasmine Ashton, March 26, 2012

Sponsored by Pandia.com

Quid: Another Analytics Player

March 24, 2012

There’s a new player in content processing. Quid enters the market with big names and solid financing, too. The product description specifies:

Quid software is used by decision-makers running companies, NGOs, banks, and funds. It captures data, structures it, and enables people to visualize and interact with the information, to understand the global technology landscape. Teams can immerse themselves in and play with the data, optimizing decision-making about what to build and where to invest or partner. Quid software augments your ability to perceive this complex world.

Sounds like a valuable tool for those looking to invest in the next big thing. The software provides the ability to: map emerging technology sectors and identify rising stars; track tech R&D and breakthroughs; analyze white spaces for opportunities; and discern co-investment relationships in order to craft solid investment strategies.

We admire the company’s Origami-inspired way of explaining math and analytics. Very creative. Also, the “Life at Quid” page is well designed to entice potential employees.

Quid is one to watch as the company continues to move forward.

Cynthia Murrell, March 24, 2012

Sponsored by Pandia.com

Another Poobah Insight: Marketing Is an Opportunity

March 21, 2012

Please, read the entire write up “Marketing Is the Next Big Money Sector in Technology.” When you read it, you will want to forget the following factoids:

  • Google has been generating significant revenue from online ad services for about a decade
  • Facebook is working to monetize every single one of its 800 million plus users with a range of marketing services
  • Start ups in and around marketing are flourishing as the scrub brush search engine optimizers of yore bite the dust. A good example is the list of exhibitors at this conference.

The hook for the story is a quote from an azure chip consultancy. The idea is that as traditional marketing methods flame out, crash, and burn, digital marketing is the future. So the direct mail of the past will become the spam email of the future, I predict. Imagine.

Marketing will chew up an organization’s information technology budget. The way this works is that since “everyone” will have a mobile device, the digital pitches will know who, what, where, why, and how a prospect thinks, feels, and expects. The revolution is on its way, and there’s no one happier than a Madison Avenue executive who contemplates the riches from the intersection of technology, hapless prospects, and good old fashioned hucksterism. The future looks like a digital P.T. Barnum, I predict.


Text Analytics Gurus Discuss the State of the Industry

March 19, 2012

Text Analytics News recently reported on an interview with Seth Grimes, the president of Alta Plana Corporation, and Tom H.C. Anderson, managing partner of OdinText-Anderson Analytics, in the article “Infinite Possibilities of Text Analytics.”

According to the article, in preparation for the 8th Annual Text Analytics Summit East in Boston, Text Analytics News reached out to these influential thinkers in the text mining field and asked them some questions regarding the state of the industry.

In response to a question regarding the changes in the approach of analysis software for unstructured data, Grimes said:

The big changes in text analytics are the embrace of and by Big Data, the development of ever-more sophisticated algorithms, and a shift in the way users invoke the technologies. Enterprises understand that a high proportion of Big Data is unstructured: Variety is one of Big Data’s three “Vs.” Text analytics providers know they have to meet challenges presented by the other two “Vs:” Volume and Velocity.

Stephen E Arnold, publisher of Beyond Search, will discuss the implications of “near term, throw forward” algorithms. Mr. Arnold will describe how injections of content can distort the outputs of certain analytic methods. At the fall 2011 conference, Mr. Arnold’s presentation provided a reminder that “objective” outputs may not be.
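
A toy example of the distortion point: the same simple calculation gives a different answer once planted content enters the stream. The posts and the scoring method are invented for illustration only.

# An "objective" metric computed from a content stream can be skewed by
# injected content. Posts and the naive scoring are invented for the example.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"awful", "broken", "terrible"}

def score(post):
    words = set(post.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

organic = ["The new model is great", "Battery life is awful", "I love the screen"]
print(sum(score(p) for p in organic) / len(organic))   # modestly positive

# Inject a burst of planted posts and the same method tells a different story.
injected = organic + ["Terrible awful broken"] * 5
print(sum(score(p) for p in injected) / len(injected))  # swings negative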

This is an interesting interview that would be worth checking out for those who are interested in attending the conference or just finding out a little more information about how content is analyzed. For registration information visit the Text Analytics website.

Jasmine Ashton, March 19, 2012

Sponsored by Pandia.com

Google and Semantic Search

March 15, 2012

The Wall Street Journal certainly has a scoop if one has been ignoring Google’s actions over the last five or six years. For a traditional “real” news publication owned by News Corp., the newspaper knows how to generate what I call “faux excitement.” The for-fee version of the Wall Street Journal story is at http://goo.gl/DnRrP although the link may go dead in a New York minute.

You will want to snag a copy of the dead tree edition of the March 15, 2012, newspaper. Turn to Section B1 and read “Google Gives Search a Refresh.” If you don’t have an online subscription to Mr. Murdoch’s favorite newspaper, click here.

I found the write up bittersweet. An era has ended at the Google. Google is moving into the choppy waters of “smart” search. Others have been in the kayaks trying to navigate meaning for a long time. Perhaps the best known player is Autonomy, which is now the “baby tiger” at Hewlett Packard. Google wants to skip the baby tiger metaphor and jump to the semantic shark.

My research suggests that Google has been grinding away at semantic search for a while, at least a decade. There were signals about Google wanting to get beyond the “clever” linking method and the semantic techniques of Oingo (Applied Semantics) a decade ago. (Notice the word “semantics” in the company name?)

Then Google took a couple of steps forward when it landed the Transformics technologies and hired Dr. Ramanathan Guha. You can get the rundown on Dr. Guha’s semantic leanings when you work through the hits for this query on Google: Ramanathan Guha semantic Web. No quotes required. Dr. Guha is the wizard behind the Programmable Search Engine, which I described at some length in Google Version 2.0: The Calculating Predator, published by the UK outfit Infonortics five years ago. The monograph may still be in print, and if you can snag a copy, you will see how Google’s wizard explains a system and method to populate “fact tables” and perform other feats of semantic legerdemain. The Wall Street Journal focuses on Google’s acquisition of Metaweb Technologies which is more along the lines of a complementary content or fact generating system. Google has a tendency to “glue” technologies together, not toss the shark technologies out with the bathwater.
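
To give a flavor of what a fact-table approach buys, here is a toy sketch: assertions stored as subject-predicate-object triples and answered directly, instead of returning ten blue links. It is an illustration only, not Google’s or Metaweb’s implementation, and the facts are stand-in examples.

# A toy "fact table": subject-predicate-object assertions plus a lookup,
# to give a flavor of entity-oriented, semantic answering. Illustration
# only; not Google's or Metaweb's system.
FACTS = [
    ("Lake Tahoe", "located_in", "Sierra Nevada"),
    ("Lake Tahoe", "max_depth_m", "501"),
    ("Sierra Nevada", "type", "mountain range"),
]

def ask(subject, predicate):
    """Answer a question from the fact table instead of listing documents."""
    return [obj for s, p, obj in FACTS if s == subject and p == predicate]

print(ask("Lake Tahoe", "max_depth_m"))   # ['501']
print(ask("Lake Tahoe", "located_in"))    # ['Sierra Nevada']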

The write up is one of those fear-uncertainty-doubt maneuvers which technology companies enjoy. “Real” journalists are too savvy to fall for the shiny lures. The persistent reader will learn that there is no release date for the new Google search. This surprised me because I was sure I read and later heard that Google version 2.0 was Google Plus, not plain old search with some WolframAlpha.com-like touches and Blekko nuances stirred in for enhanced flavor. I must admit I was confused about a news story written in the present tense which is really about some search advances which will arrive at an indeterminate time in the future, maybe tomorrow, maybe in September when the leaves turn.

The story suggests that Google is making changes because of Microsoft Bing, Apple’s voice search, or Facebook, which has no search service of much consequence. My hunch is that Google is making changes to search for one reason: ad revenue via traditional browser based search is softening. This is bad news for anyone dependent on online advertising revenue to pay for airplanes, Davos visits, and massive television and print advertising. Forget the competitors, Google has to do something that works to pump up margins and generate massive revenue. After more than a decade of trying to diversify its revenue, Google is under the gun. If Google’s magic touch were actually working, then the company should be rolling in dough from multiple revenue streams. Where is the payoff from appliances, enterprise sales, and me-too services which have essentially zero impact on companies like Apple, Facebook, and Microsoft?

Google’s PR thrust to focus attention on how it will improve search comes too quickly after Google got “real” journalists to believe that Google 2.0 was the “social” services. Well, how has that worked out for Google? I wrote about James Whittaker’s explanation of “Why I Left Google”. If you haven’t read the Whittaker write up, click here. The passage I noted was:

I couldn’t even get my own teenage daughter to look at Google+ twice, “social isn’t a product,” she told me after I gave her a demo, “social is people and the people are on Facebook.” Google was the rich kid who, after having discovered he wasn’t invited to the party, built his own party in retaliation. The fact that no one came to Google’s party became the elephant in the room.

Net net: Google has been in the semantic game a long time. Semantic technology is now in operation at Google, but as plumbing. Now Google wants to expose the pipes and drains.

The reason?

Semantics, it is hoped, will give Google more hooks on which to hang advertising messages. Without something new, revenue growth at Google may degrade at a time when Apple, Facebook, and Microsoft continue to grow. The unthinkable? Nope, the reality.

Stephen E Arnold, March 15, 2012

Sponsored by Pandia.com
