Connotate Acquires Fetch Technologies

March 27, 2012

I know, “Who? Bought what?”

Connotate is a data fusion company which uses software bots (agents) to harvest information. Fetch Technologies, founded more than a decade ago, processes structured data. The deal comes on the heels of some executive ballroom dancing. Connotate snagged a new CEO, Keith Cooper, according to New Jersey Tech Week. Fetch also uses agent technology.

Founded in 1999, Fetch Technologies enables organizations to extract, aggregate and use real-time information from Web sites. Fetch’s artificial intelligence-based technology allows precise data extraction from any Web site, including the so-called Deep Web, and transforms that data into a uniform format that can be integrated into any analytics or business intelligence software.

The company’s technology originated at the University of Southern California’s Information Sciences Institute. Fetch’s founders developed the core artificial intelligence algorithms behind the Fetch Agent Platform while they were faculty members in Computer Science at USC. Fetch’s artificial intelligence solutions were further refined through years of research funded by the Defense Advanced Research Projects Agency (DARPA), the National Science Foundation (NSF), the U.S. Air Force, and other U.S. Government agencies.

The Connotate news release said:

“Fetch is very excited to combine our information extraction, integration, and data analytics solution with Connotate’s monitoring, collection and analysis solution,” said Ryan Mullholland, Fetch’s former CEO and now President of Connotate. “Our similar product and business development histories, but differing go-to-market strategies creates an extraordinary opportunity to fast-track the creation of world-class proprietary ‘big data’ collection and management solutions.”

Okay, standard stuff. But here’s the paragraph that caught my attention:

“Big data, social media and cloud-based computing are major drivers of complexity for business operations in the 21st century,” said Keith Cooper, CEO of Connotate. “Connotate and Fetch are the only two companies to apply machine learning to web data extraction and can now take the best of both solutions to create a best-of-breed application that delivers inherent business value and real-time intelligence to companies of all sizes.”

I am not comfortable with the assertion of “only two companies to apply machine learning to Web data extraction.” In our coverage of the business intelligence and text mining market for Inteltrax.com, we have written about many companies which have applied such technologies and generated more market traction, ranging from Digital Reasoning to Palantir.

The deal is going to deliver on a “unified vision.” That may be true; however, saying and doing are two different tasks. As I write this, unification is the focus of activities from big dogs like Autonomy, now part of Hewlett Packard, to companies which have lower profiles than Connotate or Fetch.

We think that the pressure open source business intelligence and open source search are exerting will increase. With giants like IBM (Cognos, i2 Group, SPSS) and Oracle working to protect their revenues, more mergers like the Connotate-Fetch tie-up are inevitable. You can read a July 14, 2010, interview with Xoogler Mike Horowitz of Fetch Technologies at this link.

Will the combined companies rock the agent and data fusion market? We hope so.

Stephen E Arnold, March 27, 2012

Sponsored by Pandia.com

Lexmark: Under Its Own Nose

March 20, 2012

I read “Lexmark Acquires Isys Search Software and Nolij” (knowledge, get it?). In 2008, Hewlett Packard acquired Lexington-based Exstream Software. HP paid $350 million for the company, leaving Lexmark wondering what its arch printing enemy was doing. Now, more than three years later, Lexmark is lurching through acquisitions.

On March 7, 2012, I reported that Lexmark purchased Brainware, a search, eDiscovery, and back office systems vendor. Brainware caught my attention because its finding method was based in part on tri-gram technology. I recall seeing patents on the method which were filed in 1999. I have a special report on Brainware if anyone is interested. Brainware has a rich history. Its technology stretches back to SER Solutions (see US6772164). SER was once part of SER Systems AG. The current owners bought the search technology and generated revenue from its back office capabilities, not the “pure” search technology. However, Brainware’s associative memory technology struck me as interesting because it partially addressed the limitations of trigram indexes. Brainware became part of Lexmark’s Perceptive Software unit.
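
For readers who have not met the technique, a trigram index is easy to sketch. The idea: store every overlapping three-character gram of each term, so a misspelled query still shares most of its grams with the intended word. The code below is a toy illustration with hypothetical names, not Brainware’s patented method:

```python
from collections import defaultdict

def trigrams(term):
    """Overlapping three-character grams, with edge padding so prefixes count."""
    padded = f"  {term.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

class TrigramIndex:
    """Toy inverted index mapping trigrams to the terms that contain them."""

    def __init__(self):
        self.index = defaultdict(set)

    def add(self, term):
        for gram in trigrams(term):
            self.index[gram].add(term)

    def lookup(self, query, threshold=0.5):
        """Return terms sharing at least `threshold` of the query's trigrams.
        A transposition ('retreival') still overlaps most grams of the
        intended term, which is the fuzzy-matching appeal of the method."""
        grams = trigrams(query)
        scores = defaultdict(int)
        for gram in grams:
            for term in self.index[gram]:
                scores[term] += 1
        return sorted(
            (t for t, s in scores.items() if s / len(grams) >= threshold),
            key=lambda t: -scores[t],
        )
```

A query like lookup("retreival") still finds “retrieval” because the two spellings share most of their grams. The limitation I mention above is the flip side: gram postings balloon the index, and short terms generate noisy matches, which is presumably why Brainware layered its associative memory approach on top.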

Now, a mere two weeks later, Lexmark snags another search and retrieval company. Isys Search was started by Iain Davies in 1988. Mr. Davies was an author and an independent consultant in IBM mainframe fourth generation languages. His vision was to provide an easy-to-use search system. When I visited with him in 2009, I learned that Isys had more than 12,000 licensees worldwide. However, in the US, Isys never got the revenue traction which Autonomy achieved. Even Endeca, which was roughly one-tenth the size of Autonomy, was larger than Isys. The company began licensing its connectors to third parties a couple of years ago, and I did not get too many requests for analyses of the company’s technology. Like Endeca, the system processes content and generates a list of entities and other “facets” which can help a user locate additional information for certain types of queries.

Now Lexmark, which allowed Exstream to go to HP, has purchased two companies with technology which is respectively 24 and 12 years old. I am okay with this approach to obtaining search and retrieval functionality, but I do wonder what Lexmark is going to do to leverage these technologies now that HP has Autonomy and Oracle has Endeca. Microsoft is moving forward with Fast Search and a boatload of third party search solutions from certified Microsoft partners. IBM does the Lucene Watson thing, and every math major from New York to San Francisco is jumping into the big data search and analytics sector.

Here’s a screenshot of the Isys Version 8 interface, which I have heard has been updated. You can see its principal features. I have an analysis of this system as well.

[Screenshot: Isys Version 8 interface]

What will Lexmark do with two search vendors?

Here’s the news release lingo:

“Our recent acquisitions enable Lexmark to offer customers a differentiated, integrated system of solutions that are unique, cost effective, and deliver a rapid return on investment,” said Paul Rooke, Lexmark’s chairman and CEO. “The methodical shift in our focus and investments has strengthened our managed print services offerings and added new content and process technologies, positioning Lexmark as a key solutions provider to businesses large and small.”

Perceptive Software is now in the search and content processing business. However, unlike Exstream, these two companies do not have a repository and cross media publishing capability. I think it is unlikely that Lexmark/Perceptive will be able to shoehorn either of these two systems’ technology into its printers. Printers make money because of ink sales, not because of the next generation technology that some companies think will make smart printers more useful. Neither Brainware nor Isys has technology which meshes with the big data and Hadoop craziness now swirling around.

True, Lexmark can invest in both companies, but the cash required to update code from 1988 and methods from 1999 might stretch the Lexmark pocketbook. Lexmark has been a dog paddler since the financial crisis of 2008.

[Chart: Lexmark stock performance]

Source: Google Finance

Here’s the Lane Report’s take on the deal:

Lexmark’s recent acquisitions have advanced its “capture/manage/access” strategy, enabling the company to intelligently capture content from hardcopy and electronic documents through a range of devices including the company’s award-winning smart multifunction products and mobile devices, while also managing and processing content through its enterprise content management and business process management technologies. These technologies, when combined with Lexmark’s managed print services capabilities, give the company the unique ability to help customers save time and money by managing their printing and imaging infrastructure while providing complementary and high value, end-to-end content and process management solutions.

I have a different view:

First, a more fleet-footed Lexmark would have snagged the Exstream company. It was close to home, generating revenue, and packaged as a solution. Exstream was not a box of Lego blocks. What Perceptive now has is an assembly job, not a product which can go head to head against Hewlett Packard. Maybe Lexmark will find a new market in Oracle installations, but Lexmark is a printer company, not a data management company.

Second, technology is moving quickly. Neither Brainware nor Isys has the components which allow the company to process content and output the type of results one gets from Digital Reasoning or Palantir. Innovative Ikanow is leagues ahead of both Brainware and Isys.

Neither Brainware nor Isys is open source centric. Based on my research and our forthcoming information services about open source technology, neither Brainware nor Isys is in that game. Because growth is exploding in the open source sector, how will Lexmark recover its modest expenditures for these two companies?

I think there may be more lift in the analytics sector than the search sector, but I live in Harrod’s Creek, not the intellectual capital of Kentucky where Lexmark is located.

Worth watching.

Stephen E Arnold, March 20, 2012

Sponsored by Pandia.com

Prediction Data Joins the Fight

January 12, 2012

It seems that prediction data could be joining the fight against terrorism. According to the Social Graph Paper article “Prediction Data as an API in 2012,” some companies are working on prediction models that can be applied to terror prevention. The article mentions Palantir: “they emphasize development of prediction models as applied to terror prevention, and consumed by non-technical field analysts.” Recorded Future is another company, but it relies on “creating a ‘temporal index’, a big data/semantic analysis problem, as a basis to predict future events.” Other companies that have been dabbling in big data/prediction modeling are Sense Networks, Digital Reasoning, BlueKai and Primal. The author theorizes: “There will be data-domain experts spanning the ability to make sense of unstructured data, aggregate from multiple sources, run prediction models on it, and make it available to various ‘application’ providers.” Using data to predict the future seems a little far-fetched, but the technology is still new and not totally understood. Everyone does need to join the fight against terrorism, but exactly how data prediction fits in remains to be seen.

April Holmes, January 12, 2012

Sponsored by Pandia.com

Predictions on Big Data Miss the Real Big Trend

December 18, 2011

Athena the goddess of wisdom does not spend much time in Harrod’s Creek, Kentucky. I don’t think she’s ever visited. However, I know that she is not hanging out at some of the “real journalists’” haunts. I zipped through “Big Data in 2012: Five Predictions.” These are lists which are often assembled over a lunchtime chat or a meeting with quite a few editorial issues on the agenda. At year’s end, the prediction lunch was a popular activity when I worked in New York City, which is different in mental zip from rural Kentucky.

The write up churns through some ideas that are evident when one skims blog posts or looks at the conference programs for “big data.” For example—are you sitting down?—the write up asserts: “Increased understanding of and demand for visualization.” There you go. I don’t know about you, but when I sit in on “intelligence” briefings in the government or business environment, I have been enjoying the sticky tarts of visualization for years. Nah, decades. Now visualization is a trend? Helpful, right?

Let me identify one trend which is, in my opinion, an actual big deal. Navigate to “The Maximal Information Coefficient.” You will see a link and a good summary of a statistical method which allows a person to process “big data” in order to determine if there are gems within. More important, the potential gems pop out of a list of correlations. Why is this important? Without MIC methods, the only way to “know” what may be useful within big data was to run the process. If you remember guys like Kolmogorov, the “we have to do it because it is already as small as it can be” issue is an annoying time consumer. To access the original paper, you will need to go to the AAAS and pay money.

The abstract for “Detecting Novel Associations in Large Data Sets” by David N. Reshef, Yakir A. Reshef, Hilary K. Finucane, Sharon R. Grossman, Gilean McVean, Peter Turnbaugh, Eric S. Lander, Michael Mitzenmacher, and Pardis C. Sabeti (Science, December 16, 2011) is:

Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R^2) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.

Stating a very interesting although admittedly complex numerical recipe in a simple way is difficult. I think this paragraph from “The Maximal Information Coefficient” does a very good job:

The authors [Reshef et al] go on showing that the MIC (which is based on “gridding” the correlation space at different resolutions, finding the grid partitioning with the largest mutual information at each resolution, normalizing the mutual information values, and choosing the maximum value among all considered resolutions as the MIC) fulfills this requirement, and works well when applied to several real world datasets. There is a MINE Website with more information and code on this algorithm, and a blog entry by Michael Mitzenmacher which might also link to more information on the paper in the future.

Another take on the MIC innovation appears in “Maximal Information Coefficient Teases Out Multiple Vast Data Sets”. Worth reading as well.
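
The gridding-and-normalizing recipe described above can be sketched in a few lines of Python. This is a deliberately crude approximation: it tries only equal-width grids rather than the authors’ full search over partitions, so treat it as an illustration of the idea, not the published MINE code:

```python
import math
from itertools import product

def mutual_information(pairs, x_bins, y_bins):
    """Bin (x, y) pairs onto an equal-width grid and compute the mutual
    information (in nats) of the resulting joint distribution."""
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)

    def cell(v, lo, hi, bins):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint, px, py = {}, {}, {}
    for x, y in pairs:
        i = cell(x, x_lo, x_hi, x_bins)
        j = cell(y, y_lo, y_hi, y_bins)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1

    mi = 0.0
    for (i, j), count in joint.items():
        p = count / n
        mi += p * math.log(p / ((px[i] / n) * (py[j] / n)))
    return mi

def mic_sketch(pairs, max_bins=8):
    """Try several grid resolutions, normalize each score by log(min(rows,
    cols)) so it lands in [0, 1], and keep the maximum, as the quoted
    paragraph describes. The cap on total grid cells follows Reshef et
    al.'s n^0.6 heuristic."""
    best = 0.0
    for bx, by in product(range(2, max_bins + 1), repeat=2):
        if bx * by > len(pairs) ** 0.6:
            continue
        best = max(best, mutual_information(pairs, bx, by) / math.log(min(bx, by)))
    return best
```

On a perfectly linear sample the sketch scores at the top of the scale; shuffle the y values and the score drops noticeably. That is the appeal: the statistic flags candidate relationships in a pile of variable pairs without anyone having to run a model first.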

Forbes will definitely catch up with this trend in a few years. For now, methods such as MIC point the way to making “big data” a more practical part of decision making. Yep, a trend. Why? There’s a lot of talk about “big data,” but most organizations lack the expertise and the computational know-how to perform meaningful analyses. Similar methods are available from Digital Reasoning and the Google love child Recorded Future. Palantir is more into the make-pictures world of analytics. For me, MIC and related methods are not just a trend; they are the harbinger of processes which make big data useful, not a public relations, marketing, or PowerPoint chunk of baloney. Honk.

Stephen E Arnold, December 18, 2011

Sponsored by Pandia.com, a company located where high school graduates actually can do math.

Search Silver Bullets, Elixirs, and Magic Potions: Thinking about Findability in 2012

November 10, 2011

I feel expansive today (November 9, 2011), generous even. My left eye seems to be working at 70 percent capacity. No babies are screaming in the airport waiting area. In fact, I am sitting in a not too sticky seat, enjoying the announcements about keeping pets in their cage and reporting suspicious packages to law enforcement by dialing 250.

I wonder if the mother who left a pink and white plastic bag with a small bunny and box of animal crackers is evil. Much in today’s society is crazy marketing hype and fear mongering.

Whilst thinking about pets in cages and animal crackers which may be laced with rat poison, and plump, fabric bunnies, my thoughts turned to the notion of instant fixes for horribly broken search and content processing systems.

I think it was the association of the failure of societal systems that determined passengers at the gate would allow a pet to run wild or that a stuffed bunny was a threat. My thoughts jumped to the world of search, its crazy marketing pitches, and the satraps who have promoted themselves to “expert in search.” I wanted to capture these ideas, conforming to the precepts of the About section of this free blog. Did I say “free”?

A happy quack to http://www.alchemywebsite.com/amcl_astronomical_material02.html for this image of the 21st century azure chip consultant, a self-appointed expert in search with a degree in English and a minor in home economics with an emphasis on finger sandwiches.

The Silver Bullets, Garlic Balls, and Eyes of Newts

First, let me list the instant fixes, the silver bullets, the magic potions, the faerie dust, and the alchemy which makes “enterprise search” work today. Fasten your alchemist’s robe, lift your chin, and grab your paper cone. I may rain on your magic potion. Here are 14 magic fixes for a lousy search system. Oh, one more caveat. I am not picking on any one company or approach. The key to this essay is the collection of pixie dust, not a single firm’s blend of baloney, owl feathers, and goat horn.

  1. Analytics (The kind of equations some of us wrangled and struggled with in Statistics 101 or the more complex predictive methods which, if you know how to make the numerical recipes work, will get you a job at Palantir, Recorded Future, SAS, or one of the other purveyors of wisdom based on big data number crunching)
  2. Cloud (Most companies in the magic elixir business invoke the cloud. Not even Macbeth’s witches do as good a job with the incantation of Hadoop the Loop as Cloudera, but there are many contenders in this pixie concoction. Amazon comes to mind, but A9 gives me a headache when I use A9 to locate a book for my trusty e-reader.)
  3. Clustering (Which I associate with Clustify and Vivisimo, but Vivisimo has morphed clustering into “information optimization” and gets a happy quack for this leap)
  4. Connectors (One cannot search unless one can acquire content. I like the Palantir approach which triggered some push back, but I find the morphing of ISYS Search Software a useful touchstone in this potion category)
  5. Discovery systems (My associative thought process offers up Clearwell Systems and Recommind. I like Recommind, however, because it is so similar to Autonomy’s method and it has been the pivot for the company’s flip flop from law firms to enterprise search and back to eDiscovery in the last 12 or 18 months)
  6. Federation (I like the approach of Deep Web Technologies and, for the record, the company does not position its method as a magical solution, but some federating vendors do, so I will mention this concept. Think mash ups and data fusion too)
  7. Natural language processing (My candidate for NLP wonder worker is Oracle which acquired InQuira. InQuira is a success story because it was formed from the components of two antecedent search companies, pitched NLP for customer support, and got acquired by Oracle. Happy stakeholders all.)
  8. Metatagging (Many candidates here. I nominate the Microsoft SharePoint technology as the silver bullet candidate. SharePoint search offers almost flawless implementation of finding a document by virtue of knowing who wrote it, when, and what file type it is. Amazing. A first of sorts because the method has spawned third party solutions from Austria to the United States.)
  9. Open source (Hands down I think about IBM. From Content Analytics to the wild and crazy Watson, IBM has open source tattooed over large expanses of its corporate hide. Free? Did I mention free? Think again. IBM did not hit $100 billion in revenue by giving software away.)
  10. Relationship maps (I have to go with the Inxight Software solution. Not only was the live map an inspiration to every business intelligence and social network analysis vendor, it was cool to drag objects around. Now Inxight is part of Business Objects, which is part of SAP, an interesting company occupied with reinventing itself while ignoring TREX, its own search engine)
  11. Semantics (I have to mention Google as the poster child for making software know what content is about. I stand by my praise of Ramanathan Guha’s programmable search engine and the somewhat complementary work of Dr. Alon Halevy, both happy Googlers as far as I know. Did I mention that Google has oodles of semantic methods, but the focus is on selling ads and Pandas, which are somewhat related.)
  12. Sentiment analysis (the winner in the sentiment analysis sector is up for grabs. In terms of reinventing and repositioning, I want to acknowledge Attensity. But when it comes to making lemonade from lemons, check out Lexalytics (now a unit of Infonics). I like the Newssift case, but that is not included in my free blog posts and information about this modest multi-vehicle accident on the UK information highway is harder and harder to find. Alas.)
  13. Taxonomies (I am a traditionalist, so I quite like the pioneering work of Access Innovations. But firms run by individuals who are not experts in controlled vocabularies, machine assisted indexing, and ANSI compliance have captured the attention of the azure chip, home economics, and self-appointed expert crowd. Access Innovations knows its stuff. Some of the boot camp crowd, maybe somewhat less? I read a blog post recently that said librarians are not necessary when one creates an enterprise taxonomy. My how interesting. When we did the ABI/INFORM and Business Dateline controlled vocabularies, we used “real” experts and quite a few librarians with experience conceptualizing, developing, refining, and ensuring logical consistency of our word lists. It worked because even the shadow of the original ABI/INFORM still uses some of our terms 30-plus years later. There are so many taxonomy vendors, I will not attempt to highlight others. Even Microsoft signed on with Cognition Technologies to beef up its methods.)
  14. XML (There are Google and MarkLogic again. XML is now a genuine silver bullet. I thought it was a markup language. Well, not any more, pal.)

Read more

A Coming Dust Up between Oracle and MarkLogic?

November 7, 2011

Is XML the solution to enterprise data management woes? Is XML a better silver bullet than taxonomy management? Will Oracle sit on the sidelines or joust with MarkLogic?

Last week, an outfit named AtomicPR sent me a flurry of news releases. I wrote a chipper note to the Atomic person, mentioning that I sell coverage and that I thought the three news releases looked a lot like spam to me. No answer, of course.

A couple of years ago, we did some work for MarkLogic, a company focused on Extensible Markup Language or XML. I suppose that means AtomicPR can nuke me with marketing fluff. At age 67, getting nuked is not my idea of fun via email or just by aches and pains.

Since August 2011, MarkLogic has been “messaging” me. The recent 2011 news releases explained that MarkLogic was hooking XML to the buzz word “big data.” I am not exactly sure what “big data” means, but that is neither here nor there.

In September 2011, I learned that MarkLogic had morphed into a search vendor. I was surprised. Maybe, amazed is a more appropriate word. See Information Today’s interview with Ken Bado, formerly an Autodesk employee. (Autodesk makes “proven 3D software that accelerates better design.” Autodesk was the former employer of Carol Bartz when Autodesk was an engineering and architectural design software company. I have a difficult time keeping up with information management firms’ positioning statements. I refer to this as “fancy dancing” or “floundering” even though an azure chip consultant insists I really should use the word “foundering”. I love it when azure chip consultants and self appointed experts input advice to my free blog.)

In a joust between Oracle and MarkLogic, which combatant will be on the wrong end of the pointy stick thing? When marketing goes off the rails, the horse could be killed. Is that one reason senior executives exit the field of battle? Is that one reason veterinarians haunt medieval re-enactments?

Trade Magazine Explains the New MarkLogic

I thought about MarkLogic when I read “MarkLogic Ties Its Database to Hadoop for Big Data Support.” The PCWorld story stated:

MarkLogic 5, which became generally available on Tuesday, includes a Hadoop connector that will allow customers to “aggregate data inside MarkLogic for richer analytics, while maintaining the advantages of MarkLogic indexes for performance and accuracy,” the company said.

A connector is a software widget that allows one system to access the information in another system. I know this is a vastly simplified explanation. Earlier this year, Palantir and i2 Group (now part of IBM) got into an interesting legal squabble over connectors. I believe I made the point in a private briefing that “connectors are a new battleground.” The MarkLogic story in PCWorld indicated that MarkLogic is chummy with Hadoop via connectors. I don’t think MarkLogic codes its own connectors. My recollection is that ISYS Search Software licenses some connectors to MarkLogic, but that deal may have gone south by now. And, MarkLogic is a privately held company funded, I believe, by Lehman Brothers, Sequoia Capital, and Tenaya Capital. I am not sure “open source” and these financial wizards are truly harmonized, but again I could be wrong, living in rural Kentucky and wasting my time in retirement writing blog posts.
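
For readers who have not wrestled with one, the basic shape of a connector can be sketched in a few lines. Everything here is hypothetical (the names Connector, Document, and InMemoryConnector are mine, not MarkLogic’s or ISYS’s); real connectors add authentication, incremental crawls, and security trimming:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator, List, Tuple

@dataclass
class Document:
    """Neutral form handed to the indexing or analytics pipeline."""
    doc_id: str
    title: str
    body: str

class Connector(ABC):
    """Minimal connector shape: open a session with a source system, then
    enumerate its content as Documents the consuming system understands."""

    @abstractmethod
    def connect(self) -> None: ...

    @abstractmethod
    def fetch(self) -> Iterator[Document]: ...

class InMemoryConnector(Connector):
    """Stand-in 'repository' so the sketch runs without a real back end."""

    def __init__(self, records: List[Tuple[str, str]]):
        self.records = records
        self.connected = False

    def connect(self) -> None:
        # A real connector would negotiate credentials and sessions here.
        self.connected = True

    def fetch(self) -> Iterator[Document]:
        if not self.connected:
            raise RuntimeError("call connect() before fetch()")
        for i, (title, body) in enumerate(self.records):
            yield Document(doc_id=str(i), title=title, body=body)
```

The battleground part is everything the sketch omits: each proprietary repository has its own API, permissions model, and change log, which is why vendors license connector libraries instead of writing their own.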

Read more

Inteltrax: Top Stories, October 10 to October 14

October 17, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how analytic technology depends so heavily on funding and what those dollars signify.

Our feature story this week, “Palantir Back From the Grave,” http://inteltrax.com/?p=2775 details how one BI company suffered some near-fatal blows, but has bounced back with new software and confidence, thanks to some new funding.

Another funding-centric tale was our story, “Opera and Xignite Make Waves by Raising Millions” http://inteltrax.com/?p=2573 that showed two smaller companies on the rise thanks to some big time investments.

We turned the tables with “Actuate Analytics Contest Gets Attention” http://inteltrax.com/?p=2541 to show how one company is supporting the next generation of analytic thinkers by offering its financial support.

Money makes the big data globe spin; it’s no secret. But funding carries a lot of meaning in this industry; usually it’s a sign of impending success. We’ll see if that theory holds true, as we follow these and other stories in the ever-expanding world of data analytics.

Follow the Inteltrax news stream by visiting

http://www.inteltrax.com/

Patrick Roland, Editor, Inteltrax.

Real Consultants and Real Analysts Take a Hit

September 12, 2011

The Washington Post must have had a bad experience with self-appointed experts. Read “The Investor’s Dilemma: Earnings, Valuation and What to Do Now.” As you work through the write up, think about Microsoft’s purchase of Fast Search for $1.2 billion, Oracle’s purchase of InQuira for an estimated $66 million, Palantir’s recent intake of another $68 million, and Hewlett Packard’s interesting $11 billion purchase price for Autonomy.

Now think about the write ups from the “real” consulting companies, the trade magazines with lists of “top” companies, and the speakers on some conference programs with three or more slots in a two day period. What’s going on? The write up in the Washington Post seems to have pinpointed an important change in “analyst” behavior. Here’s the snippet I noted:

… I suspect the error is about something else. Structural changes at Wall Street firms are just as likely to be the cause. Research analysts used to work for trading and asset management divisions of big Wall Street banks. Since the 1990s, they have mostly migrated to underwriting. That’s where all the money is made. This change has changed the job of the analyst. They do far less critical analysis and far more “cheerleading.” Robert Powell, editor of Retirement Weekly, confirms it: Regarding the stocks that make up the S&P1500, Powell noted that not a single one has a Wall Street consensus “sell” rating on it. This is pretty damning proof that forecasting errors may be because of inherent structural bias.

I have a simpler way of explaining what’s going on. First, in an effort to generate revenue, analysts are now in the “pay to play” business. But wait. Conferences are also selling speaking slots for booth / exhibit purchasers or sponsors who provide “bags” for giveaways, drinks at receptions, or logos for giant banners that identify who is silver, gold, or platinum. What about lists? These are hooked into speaking, ads, or the fraternity of the trade show.

Now keep in mind that I run content for clients. We even produce information services that explain the ins and outs of financial services, rocket science technology, and silliness about social media. When I give a talk, I get money, a free meal, and, if I am lucky, two nights at a hotel without stars.

The point is that I am an addled goose, dabbling in odds and ends. The folks touched upon in the Washington Post article try to generate an aura of analytic objectivity. None of these poobahs, satraps, failed Webmasters, and unemployed English majors would dare to suggest that their work is little more than clumsy payola, old style advertorial, or flat out fluff.

The disconnect between facts and value is fascinating. Can one believe anything from anyone in the advisory business? I hope so. I think I can filter the goose feathers from the giblets. My hunch is that others cannot, will not, or do not think goose feathers are anything but gold. Believe me, goose feathers are not gold. Goose feathers can absorb a hit. Worth having a few around if you are a “real” consultant.

Stephen E Arnold, September 12, 2011

Sponsored by Pandia.com

IBM Acquires i2 Ltd.

September 1, 2011

IBM purchased i2 Group. Founded in 1990 by Mike Hunter, i2 is a widely used content processing and case management system for law enforcement and intelligence professionals. The company received the EuroIntel Golden Candle Award for its contribution to the global intelligence community. The ArnoldIT team worked with some i2 products on several occasions over the years. The company has moved outside the somewhat narrow market for sophisticated intelligence analysis systems.

“IBM Acquiring I2 for Criminal Mastermind Software” reported:

IBM plans to fuse i2’s products with its own data collection, analysis and warehousing software. It will then offer packages based on this combination to organizations looking to spot suspicious behavior within vast collections of data.

Not surprisingly, there has been considerable confusion about the company. Part of the reason is that the name “i2” was used by a back office and supply chain company. The firm benefited from its acquisition by the low profile Silver Lake Sumeru. Silver Lake purchased i2 from ChoicePoint in 2008 for about $185 million. “IBM Bolsters Big Data Security Credentials with i2 Buy” opines that the deal was worth more than $500 million, a fraction of what UK vendor Autonomy commanded from Hewlett Packard in August 2011.

i2’s technology is not well understood by those without direct experience using the firm’s pace setting products. One example is the Analyst’s Notebook, a system which allows multiple case details to be processed, analyzed, and displayed in a manner immediately familiar to law enforcement and intelligence professionals. i2 acquired Coplink, developed at an academic institution in Arizona.

The core technology continues to be enhanced. i2 now provides its system to organizations with an interest in analyzing data across time, via relationships, and with specialized numerical recipes.
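The “relationships” part of that analysis is, at bottom, link analysis over a graph of entities. As a minimal sketch only (not i2’s actual method; the entities, the two-hop query, and all names here are invented for illustration), the idea can be shown with a toy breadth-first traversal:

```python
from collections import defaultdict, deque

# Toy relationship chart: pairs of connected entities, the kind of
# links a case analyst might draw. All names are invented.
links = [
    ("Suspect A", "Phone 555-0100"),
    ("Phone 555-0100", "Suspect B"),
    ("Suspect B", "Shell Co. Ltd."),
    ("Witness C", "Shell Co. Ltd."),
]

graph = defaultdict(set)
for a, b in links:
    graph[a].add(b)
    graph[b].add(a)

def within_hops(start, max_hops):
    """Return entities reachable from start within max_hops links,
    mapped to their distance in hops (breadth-first search)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    seen.pop(start)
    return seen

print(within_hops("Suspect A", 2))
# → {'Phone 555-0100': 1, 'Suspect B': 2}
```

A production system layers time windows, link typing, and scoring on top of this kind of traversal; the sketch shows only the connective skeleton.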

My position is that I am not going to dive into the specific features and functions of the i2 system. If you want to know more about i2’s technology, you can visit the firm’s Web site at http://www.i2group.com/us. The Wikipedia page and many of the news and trade write ups about i2 are either incorrect or off by 20 degrees or more.

What will IBM “do” with the i2 technology? My hunch is that IBM will maintain the present market trajectory of i2 and expose the firm’s technology to IBM clients and prospects with specific security needs. Please appreciate that the nature of the i2 technology is essentially the opposite of software available for more general purpose applications. My view is that IBM will probably continue to support the integration of the i2 Clarity component with the Microsoft SharePoint platform. Like the descriptions of Autonomy’s technology, some of the write ups about i2 may require further verification.

We have reported on the legal dust up about the i2 ANB file format and some friction between Palantir and i2 in Inteltrax. Most of the legal hassles appear to be worked out, but contention is certainly possible going forward.

I have been a fan of i2’s technology for many years. However, some firms have moved into different analytical approaches. In most cases, these new developments enhance the functionality of an i2 system. Today we are featuring an editorial by Tim Estes, founder of Digital Reasoning, a company that has moved “beyond i2.” You can read his views about the Autonomy deal in “Summer of Big Deals”. More information about Digital Reasoning is available at www.digitalreasoning.com. Digital Reasoning is a client of ArnoldIT, the publisher of this information service.

Stephen E Arnold, September 1, 2011

Sponsored by Pandia.com

Recommind and Predictive Coding

June 15, 2011

The different winners of the Kentucky Derby, Preakness, and Belmont horse races cast some doubt on predictive analytics. But search and content processing is not a horse race. The results are going to be more reliable and accurate, or that is the assumption. One thing is 100 percent certain: a battle is brewing over the phrase “predictive coding” in the marketing of math that appears in quite a few textbooks.

First, you will want to read US 7,933,859, “Systems and Methods for Predictive Coding.” You can get your copy via the outstanding online service at USPTO.gov. The patent was a zippy one, filed on May 25, 2010, and granted on April 26, 2011.

There were quite a few write ups about the patent. We noted “Recommind Patents Predictive Coding” from Recommind’s Web site. The company has a Web site focused on predictive coding with the tag line “Out predict. Out perform.” A quote from a lawyer at WilmerHale announces, “This is a game changer in eDiscovery.”

Why a game changer? The answer, according to the news release, is:

Recommind’s Predictive Coding™ technology and workflow have transformed the legal industry by accelerating the most expensive phase of eDiscovery, document review. Traditional eDiscovery software relies on linear review, a tedious, expensive and error-prone process . . . . Predictive Coding uses machine learning to categorize and prioritize any document set faster, more accurately and more defensibly than contract attorneys, no matter how much data is involved.
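Stripped of the trademark, what the release describes is supervised text classification: learn from documents a reviewer has already tagged, then score the rest. A minimal sketch, assuming a toy multinomial naive Bayes classifier with Laplace smoothing (all documents and labels below are invented, and this is not Recommind’s patented method):

```python
import math
from collections import Counter, defaultdict

# Tiny labeled review set: “responsive” vs. “not” responsive documents.
# All text is invented for illustration.
training = [
    ("merger agreement signed by the board", "responsive"),
    ("board approves merger terms", "responsive"),
    ("office picnic scheduled for friday", "not"),
    ("parking garage closed for repairs", "not"),
]

word_counts = defaultdict(Counter)   # per-class word frequencies
class_counts = Counter()             # documents per class
vocab = set()
for text, label in training:
    words = text.split()
    word_counts[label].update(words)
    class_counts[label] += 1
    vocab.update(words)

def classify(text):
    """Score each class with log P(class) + sum of log P(word|class),
    using Laplace (add-one) smoothing; return the best-scoring class."""
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for word in text.split():
            score += math.log(
                (word_counts[label][word] + 1) / (total + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("draft merger agreement"))  # → responsive
```

The math here is textbook Bayes; the commercial differentiation lies in the workflow, sampling, and defensibility claims wrapped around it, which is precisely what the patent fight is about.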

Some push back was evident in “Predictive Coding War Breaks Out in US eDiscovery Sector.” The point in this write up is that other vendors have been offering predictive functions in the legal market.

Our recollection is that a number of other outfits dabble in this technological farm yard as well. You can read the interviews with Google-funded Recorded Future and Digital Reasoning in my Search Wizards Speak series. I have noted in my talks that there seems to be some similarity between Recommind’s systems and methods and Autonomy’s, a company that is arguably one of the progenitors of probabilistic methods in the commercial search sector. Predecessors to Autonomy’s Integrated Data Operating Layer date back to math-crazed church men in ye merrie old England before steam engines really caught on. So, new? Well, that’s a matter for lawyers, I surmise.

With the legal dust up between i2 Ltd. and Palantir, two laborers on the margins of the predictive farm yard, legal fires can consume forests of money in a flash. You can learn more about data fusion and predictive analytics in my Inteltrax information service. Navigate to www.inteltrax.com.

Stephen E Arnold, June 15, 2011

Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion
