Exclusive Interview: Digital Reasoning

February 2, 2010

Tim Estes, the youthful founder and chief technologist of Digital Reasoning, a search and content processing company based in Tennessee, reveals the technology that is driving the company’s growth. Mr. Estes, a graduate of the University of Virginia, tackled the problem of information overload with a fresh approach. You can learn about Digital Reasoning’s approach, which delivers a system that “deeply, conceptually searches within unstructured data, analyzes it and presents dynamic visual results with minimal human intervention. It reads everything, forgets nothing and gets smarter as you use it.”

Mr. Estes explained:

Digital Reasoning’s core product offering is called “Synthesys.” It is designed to take an enterprise from disparate data silos (both structured and unstructured), ingest and understand the data at an entity level (down to the “who, what, and wheres” that are mentioned inside of documents), make it searchable, linkable, and provide back key statistics (BI type functionality). It can work in an online/real-time type fashion given its performance capabilities. Synthesys is unique because it does a really good job at entity resolution directly from unstructured data. Having the name “Umar Farouk Abdul Mutallab” misspelled somewhere in the data is not a big deal for us – because we create concepts based on the patterns of usage in the data and that’s pretty hard to hide. It is necessarily true that a word grounds its meaning to the things in the data that are of the same pattern of usage. If it wasn’t the case no receiving agent could understand it. We’ve figured out how to reverse engineer that mental process of “grounding” a word. So you can have Abdulmutallab ten different ways and it doesn’t matter. If the evidence links in any statistically significant way – we pull it together.
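To make the “patterns of usage” idea concrete, here is a minimal sketch in Python. It is not Digital Reasoning’s algorithm, which the interview does not spell out. It assumes a toy approach: each mention of a name is represented by the words that appear around it, and mentions whose contexts are similar enough are merged, however the name itself is spelled. The corpus, the `resolve` function, and the 0.5 threshold are all invented for illustration.

```python
from collections import Counter
from math import sqrt

# Toy corpus: (name as written, words seen near that mention). All data is invented.
MENTIONS = [
    ("Umar Farouk Abdul Mutallab", "flight detroit amsterdam passenger"),
    ("Abdulmutallab", "detroit flight passenger device"),
    ("Umar Abdul Mutalab", "amsterdam flight detroit"),
    ("Tim Estes", "founder synthesys tennessee"),
    ("T. Estes", "synthesys founder technologist"),
]


def context_vector(text):
    """Bag-of-words counts for the words around a mention."""
    return Counter(text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def resolve(mentions, threshold=0.5):
    """Greedy one-pass clustering: a mention joins the first cluster whose
    pooled context is similar enough; otherwise it starts a new cluster.
    The spelling of the name is never compared, only its usage."""
    clusters = []  # each cluster: {"names": set of spellings, "context": Counter}
    for name, text in mentions:
        vec = context_vector(text)
        for cluster in clusters:
            if cosine(vec, cluster["context"]) >= threshold:
                cluster["names"].add(name)
                cluster["context"] += vec
                break
        else:
            clusters.append({"names": {name}, "context": Counter(vec)})
    return [sorted(c["names"]) for c in clusters]


if __name__ == "__main__":
    for names in resolve(MENTIONS):
        print(names)
```

On this toy data the three spellings of the first name land in one cluster and the two spellings of the second in another. A production system would need far richer context features and a statistically grounded merge test; the sketch only shows why a misspelled name is “not a big deal” when usage, not spelling, drives the match.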

You can read the full text of this exclusive interview with Tim Estes on the ArnoldIT.com site in the Search Wizards Speak series. You can get more information about Digital Reasoning from the company’s Web site.

The Search Wizards Speak series provides the largest collection of free, detailed information about major enterprise search systems. Why pay the azure-chip consultants for sponsored listings, write ups prepared by consultants with little or no hands on experience, and services that “sell” advertorials? You hear in the developers’, founders’, and CEOs’ own words what a system does and how it solves content-related problems.

Stephen E Arnold, February 2, 2010

No one paid me to write about my own Web site. I will report this charitable act to the head of the Red Cross.

Autonomy Pops Up an Email Archiving Toaster

January 31, 2010

Autonomy is in the appliance business. You can get what The Orange Rag called “the Autonomy eDiscovery Appliance.” The idea is that the features of a Clearwell-type solution are combined with Autonomy’s smart software and connectors. The solution, according to The Orange Rag, “delivers a broad set of unique capabilities” and “meaning based computing”. Among the features embedded in the appliance are search, connectors to various content types, visualization, scalability, and reports. The appliance that has captured some loyal fans is Clearwell Systems’ “rocket docket” appliance. Clearwell now has a formidable competitor. I wonder if Clearwell’s value-added software, which generates a report that can be slapped in the hand of opposing counsel, and its nifty audit trail feature will be enough to deal with the steroid infused marketing of Autonomy. Should be interesting because Recommind has tried to broaden beyond the legal market in a bid to become an enterprise search vendor. Stratify has morphed several times in its eDiscovery journey. EMC bought Kazeon and may be getting ready to attack the legal eagles from the storage angle. I suppose this is what the azure chip crowd calls “search specialization”. I thought it was savvy product packaging, but what do I know. I am not young and inclined to perceive myself as infallible. I am an addled goose who forgets where he puts his pin feathers.

Stephen E Arnold, January 31, 2010

A freebie. I will report this unpleasant fact to the director of the US Postal Museum where old information methods are on display.

Autonomy and Precise Team Up

January 24, 2010

Autonomy continues to sniff trends and move before other players in the enterprise search and content processing space. I saw a short announcement on Sharecast (a service with more weird pop ups than most Web sites I visit) that said:

Search software firm Autonomy is teaming up with UK-based media intelligence outfit Precise to develop and market next-generation media intelligence services to the public relations and communications sectors.

Autonomy is well known to readers of this Web log. Precise may not be. Here’s a quick run down on that outfit:

  • The company is in the “media intelligence” business. This is somewhat similar to the old style Bacon’s clipping service put on steroids.
  • The company has more than 5,000 customers and a big chunk of them are in the financial services and information sector. The idea is that media monitoring provides open source information that Precise converts into intelligence about what a company will or may do. This is the enterprise version of government intelligence agency operations.
  • The chief information officer comes from the real time information side of the business. (This suggests to me that Autonomy is deep into the real time content processing spaghetti.)
  • The company’s description of its services sounds almost Googley: “Our Media Portal allows our clients to view and evaluate the impact of coverage from every media source – print, broadcast, online. In addition they can access forward planning data at the touch of a button.”

My take on this is that Autonomy will be nosing into other real time information sectors as well. Some of the incumbents may find that Autonomy’s marketing and its corporate clout will push them out of their comfortable positions. Who will be affected by Autonomy if it moves in this direction? That’s a good question.

Stephen E Arnold, January 24, 2010

A freebie. No one paid me to write about this tie up. I suppose I shall report this sad fact to MARAD, an outfit that knows about brown water tie ups.

Information Builders WebFOCUS Magnify

January 22, 2010

I wrote about Information Builders in my report for the Gilbane Group a couple of years ago. I was updating my files, and I wanted to pass along a bit of information about this product. Information Builders is a vendor of enterprise solutions. The company spans data management, business intelligence, and dashboards. You can get basic information about the firm’s products on the firm’s Web site. There are some diagrams showing how Information Builders’ software “snaps in” to most enterprise computing infrastructures.

Information Builders rolled out more than 100 changes to its WebFOCUS product line. According to TDWI, there have been changes to the dashboard and the predictive analytics. For search, TDWI says:

“WebFOCUS Magnify, a search navigation tool that dynamically categorizes search results and supplements them with analysis and reporting capabilities, has been updated with new features designed to enhance both usability and security. Highlights include:

    • Collections, enabling users to select a collection that narrows their search to a specific part of the index, prior to their submission
    • Magnify iWay Wizard, helping users to quickly set up a Magnify environment, instructing them how to handle each field when transforming a record into a search result
    • New security features including single sign-on integration, multiple credential support, the ability to hide entire results and parts of results, and present alternate-result rendering as well as a security API”.

The search product indexes content using what the company calls a “real-time transactional indexing method”. The results list provides relevance ranked output plus facets to allow point and click navigation. The screenshot below shows one possible results display. The system is fully customizable, so you can create the look and feel you want.

[Screenshot: Information Builders results display]
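For readers who want to see how “collections” and facets fit together, here is a rough sketch in Python. It is not WebFOCUS Magnify code and uses none of its APIs. It assumes a tiny in-memory index in which each record carries a collection label, ranks hits by simple term counts, and tallies facet values for the hits. The records, field names, and the `search` function are made up for the example.

```python
from collections import Counter

# Hypothetical records; the field names ("collection", "type", "region") are assumptions.
INDEX = [
    {"id": 1, "collection": "accounts", "type": "checking", "region": "east",
     "text": "bank account cash flow"},
    {"id": 2, "collection": "accounts", "type": "savings", "region": "west",
     "text": "bank account interest"},
    {"id": 3, "collection": "reports", "type": "quarterly", "region": "east",
     "text": "bank account summary report"},
    {"id": 4, "collection": "accounts", "type": "checking", "region": "west",
     "text": "loan application"},
]


def search(query, collection=None, facet_fields=("type", "region")):
    """Return (hits, facets). A collection, if given, narrows the search to
    one part of the index before any matching takes place."""
    terms = query.lower().split()
    candidates = [d for d in INDEX if collection is None or d["collection"] == collection]

    scored = []
    for doc in candidates:
        words = doc["text"].split()
        score = sum(words.count(t) for t in terms)  # crude relevance: term frequency
        if score:
            scored.append((score, doc))

    # Highest score first; ties broken by id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]["id"]))
    hits = [doc for _, doc in scored]

    # Facet counts drive the point and click navigation links.
    facets = {field: Counter(doc[field] for doc in hits) for field in facet_fields}
    return hits, facets


if __name__ == "__main__":
    hits, facets = search("bank account", collection="accounts")
    print([d["id"] for d in hits])
    print(facets)
```

Picking the collection before the query is submitted shrinks the candidate set up front, which is the behavior the TDWI description attributes to the Collections feature. The facet counts are computed only over the hits, so each navigation link reflects what the user would actually get by clicking it.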

If you are an Information Builders’ customer running the company’s business intelligence system, the search system (WebFOCUS Magnify) integrates with that Information Builders’ product. The company says:

A user’s inquiry does not end with search results. Those results will often lead to new questions, which, in order to be answered, require numbers analysis, aggregations, and value comparisons that can only be achieved through business intelligence. For example, a search for a bank account may lead the user to want to know more about its cash flow patterns, expenditures by time frame, and other transactions. Magnify allows a user to drill down directly from search results into the reporting system, so any natural language search can be instantly expanded to include numbers analysis. Search terms are fed to the WebFOCUS platform as parameters that automatically trigger the generation of reports or guided ad hoc forms that can be used to further refine report content.
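The drill down idea in that passage can be sketched in a few lines of Python. This is an illustration only, not the WebFOCUS interface. It assumes a hypothetical catalog that maps a result type to a report name and a parameter name, and it builds a report URL from a search result. The host `reports.example.com`, the catalog entries, and the `drill_down_url` function are placeholders I invented.

```python
from urllib.parse import urlencode

# Hypothetical catalog: which report to run for each kind of search result,
# and which parameter name carries the record key.
REPORTS = {
    "bank_account": {"report": "cash_flow_by_period", "param": "account_id"},
    "customer": {"report": "transactions_by_customer", "param": "customer_id"},
}


def drill_down_url(result, base="https://reports.example.com/run"):
    """Turn a search result into a parameterized report request.
    Returns None when no report is registered for the result type."""
    spec = REPORTS.get(result["type"])
    if spec is None:
        return None
    params = {
        "report": spec["report"],
        spec["param"]: result["key"],
        "q": result["query"],  # pass the original search terms along
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    hit = {"type": "bank_account", "key": "12345", "query": "acme checking"}
    print(drill_down_url(hit))
```

The point of the design is that the search result carries enough structure (a type and a key) for the reporting side to pick a report and fill in its parameters without the user retyping anything.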

The company added support for Web content three or four years ago. You can read about this aspect of WebFOCUS in “Information Builders Google-izes BI.” The company has been surfing on Google for a couple of years. You can get an overview of Information Builders’ Google components on the Google Solutions Marketplace.

I have never been able to get a firm grip on the Information Builders’ pricing for its search and content processing technology. I did learn that the WebFOCUS Intelligent Search component for the Google OneBox for Enterprise is free. You can learn more about this product from the Google Solutions Marketplace. You can read about this “surf on Google” play in the March 2006 Computerworld article “Information Builders, Google Build Corporate BI Search Tool.” I just haven’t heard much about the success of this software component.

Stephen E Arnold, January 22, 2010

Okay, here’s the scoop. A freebie. I will report this to the Bureau of Labor Statistics, an outfit that combines work and data analysis. That’s the oversight group for me.

Stratify Software India

January 13, 2010

Some interesting information about Stratify, a unit of Iron Mountain, surfaced in a job posting for an engineer in Bangalore. In India, Stratify does business as Stratify Software India Pvt Ltd. The part of the advert that caught my attention was this description of Stratify as a Software as a Service company. Here’s the snippet I found interesting:

Stratify is a Product company which provides electronic discovery or unstructured Data mining solutions through Software as a Service Model. We are a fully owned subsidiary of Iron Mountain, the world’s largest Data Storage, protection and Recovery company with $3 Billion revenue. We are market leaders in our space and have registered 25-30% growth last year and 70% per annum growth in the previous 4 years. We have mostly Fortune-1000 companies as our clients. Iron Mountain, our parent company, has more than 13,000 employees.

Stratify, originally Purple Yogi, came on my radar as a text and content processing company. Now the firm is a provider of electronic discovery or unstructured data mining solutions. I think the growth of Iron Mountain is a useful factoid as well.

Stephen E. Arnold, January 13, 2010

A freebie. I suppose this disclosure falls under the purview of the ExIm Bank to which I shall report the fact that I got no money for this item. Don’t you feel better knowing I wrote this because I have only a small pond in which to swim?

Are Google Users Ready to Step Up to Fusion Tables? Nah.

December 16, 2009

WolframAlpha and Google have a tiny challenge. Both firms’ rocket scientists and algorithm wranglers understand the importance of herding data. Take this simple test. Navigate first to WolframAlpha and enter a word pair. Try UK population. Now navigate to Google’s public facing Fusion table demo here. What did you get? How did it work? Do you know why the systems responded as they did? How do you improve your query?

My hunch is that few readers of this Web log can answer these questions. Agree? Disagree? Well, I am not running an academic class, so if you flunked, that’s okay with me. I think most people will flunk, including some of the lesser lights at the Google and at WolframAlpha.

Against this background, the Google rolled out an API for Fusion tables. You can get the Googley story in the write up “Google Fusion Tables API.” My view is that Google’s moves in structured data are quite important, generally unknown, and essentially incomprehensible to those who suffered through high school algebra.

My opinion is that this API will result in some applications that will make Google’s significant commitment and investment in structured data more understandable. If you are ahead of the curve, the Google is on the march. If you have no clue what this post means, maybe you should think about changing careers. Wal+Mart greeter is somewhat less challenging than the intricacies of Google’s context server technology.

Stephen E. Arnold, December 16, 2009

Okay, I rode by Google’s DC headquarters. No one waved. No one paid me. I suppose I should report this fact to the manager of the Union Station taxi dispatchers. Nah, those folks don’t care that this is a freebie either.

Connotate and Its Landing Page

December 15, 2009

Getting leads and making sales is the name of the game for enterprise search vendors. I think I found an example of a search vendor using Twitter and a landing page to get leads. Here’s the tweet that I saw from a person posting as dnapoleo.

[Image: Connotate tweet]

This bit.ly link pointed to this special landing page:

[Image: Connotate landing page]

I found this interesting. I wonder, however, if this type of marketing will deliver qualified leads. Making sales today requires a heck of a lot of work. The cost and complexity of enterprise search and content processing systems seem ill suited for Twitter. A quick look at my Overflight service reveals that a balanced marketing plan is the approach taken by Autonomy, Coveo, Exalead, and MarkLogic, for example.

In fact, making sales requires a motivated sales force, brand positioning, resellers, Web logs, media campaigns using every trick in the sales books at Barnes & Noble, and client champions. It is December and cold out there. Sales heat is needed.

Contrast the Connotate approach with Google’s use of a paper wrap around on the free commuter newspaper, Metro. Google was pitching its Chrome “consumer” Web browser.

Connotate’s effort warrants watching. Now that AOL has repositioned Relegence.com as Love.com, I think some market headroom may become available for Connotate.

Stephen E. Arnold, December 15, 2009

Oyez, oyez, I am disclosing that no one paid me to write about Connotate’s possible tweet campaign. Who’s on first? Oh, I know. I am reporting today to the Farm Credit Administration. Grow those revenues, people!

Search, Its Biggest Change, and Yawns

December 8, 2009

I try to steer clear of the search engine optimization crowd. A reader sent me a link to a write up called “Google’s Personalized Results: The “New Normal” That Deserves Extraordinary Attention”. The idea is that Google can personalize search results for every user in the world. Search Engine Land slaps the word “biggest” on this Google announcement. The idea is that users should be revved up, excited, concerned, involved, etc.

I suppose I should be excited, but the personalization can be turned off. I have noticed shaped search results for quite a while. The scale interests me. Personalization is one consequence of Google’s adaptive functions. Newly visible to users, not new.

Stephen Arnold, December 8, 2009

Oyez, oyez, I want to disclose to the Geological Survey (USGS) that this new world has been explored already. I did my write up without any payment. Tough to charge money to state the obvious.

MarkLogic and Its XML Briefing Draw Crowds at London Online

December 4, 2009

Usually I ignore the exhibit areas at trade shows. I don’t know anyone any longer, and the average age of most of the people in the booths is about one third of my 65 years. I did make a sweep through the Incisive International Online Show but I had my progress impeded yesterday. The reason was that the MarkLogic briefings given every hour or so created a mini-traffic jam.

[Image: MarkLogic briefing crowd]

Overflow crowds participated in the MarkLogic technical briefings at the International Online Show, December 1 to 3, 2009, in London, UK.

The briefings drew crowds that overflowed the space allocated for attendees. I asked one of the XML wizards, “What’s with the big crowd?” The MarkLogic wizard replied, “Our MarkLogic server briefing is selling like cold drinks at a football match.” MarkLogic knows its XML and its metaphors. The interest in XML MarkLogic style makes clear that where there is technical magnetism, there is a crowd.

Stephen Arnold, December 4, 2009

I want to disclose to the Food & Drug Administration that I was not paid by MarkLogic to write this article. I was not able to get a booth giveaway when I stopped to ask about the reason for the interest in the XML server lectures. I have to find a way to get some cash for my photographic expertise.

Oracle Feels Heat, Tries to Redefine Kitchen

December 3, 2009

I know when there is trouble on the off ramp that once ran directly to Sea World south of San Francisco. There is the deterioration of the road bed, a reminder of the problems aging infrastructure poses to drivers. In a way, cracked pavement and poorly marked off ramps are indicative of some enterprise technology solutions as well. You make a choice, expecting a smooth ride to Sea World, and what do you get? A jarring ride down a highway filled with bumps and pot holes. Why no improvements? Good question. I asked this when I was doing one of my periodic brush ups for the companies I track in the search, information management, and content processing sector that is my particular interest.

My research suggests that the giant database vendor Oracle faces a number of challenges. The company’s headquarters can be reached on the old Sea World highway, now named Oracle Way, I think.

First, the company cannot land its corporate jet or its founder’s jet fighter at the same airport as Google. Googlers can walk to their expensive toys. Oracle executives have to fight traffic on 101. Second, there are growing problems from data management upstarts like InfoBright and Aster Data. Third, there are the pesky French search based application vendors like Exalead. Fourth, the geriatric Codd database is getting left in the performance dust by the speedy Perfect Search vortex technology. Fifth, the Oracle Secure Enterprise Search remains an undercard opponent in the enterprise search wrestling matches that entertain me on a daily basis.

But Oracle asserts that it has not only addressed some of its weaknesses but the company has taken a leadership position in next generation data management.

For example, in November 2009, I read an interesting Oracle blog post called “Next Generation Data Warehouse Platforms”. The post made a number of assertions that suggested Oracle had overcome the problems of scaling, performance, and affordability that continue to plague the world’s largest database vendor. For example, that blog post pointed out these breakthroughs for Oracle 11 and I quote from the Oracle blog:

  • “Performance => Sun Oracle Database Machine. Yes, it really is fast!
  • In-memory processing => Oracle now has (11gR2) In-Memory Parallel Execution. More about this can be read in Maria’s excellent post here.
  • In-Database Analytics => As the report says in Exadata V2 and Oracle 11gR2 we are now offloading data mining model scoring to the storage side of the house, which allows us to embed mining models into more and more operational systems and get online (direct) feedback on transactions. We also have for years moved more and more OLAP and Stats functionality into the engine
  • Real-time data warehousing => First and foremost the read consistency model introduced in Oracle 4 (this is not a typo…) allowing readers to see consistent data during writes, secondly, the just completed acquisition of Golden Gate and the ETL capabilities (like streams) in Oracle allow for very nice real time data feeds. Oracle’s MAA architecture allows us to be up and running 24*7 on commodity hardware and deliver an online experience to all customers…
  • Cloud computing => see the in-database MapReduce post here.
  • Appliances => Sun Oracle Database Machine.”

If these statements are spot on, Oracle has cracked some technical and business challenges it has faced for many years. From Oracle’s position of strength, Oracle can crush its rivals by winning head to head competitions. Strength is manifested in client wins and revenues in my book. White papers nuking another tech vendor are not demonstrations of strength in my opinion.

Apparently companies in a position of strength find it appropriate to use rhetoric and disinformation to discipline an upstart. Let me give you an example.

I stumbled upon an Oracle white paper “Mark Logic XML Server 4.1”. You must download your own copy from this link. This paper, which shows a November 2009 date, is a fascinating window into Oracle. If I were teaching rhetoric, I would use this Oracle white paper as an example of disinformation. Your mileage may vary.

I asked myself, “Why would a multi-billion dollar outfit invest the time, money, and effort in a direct attack on a specialist company chugging quietly along, pretty much minding its own business?” The Oracle white paper purports to discuss technology of a company that “continues to rely on venture funding”. The white paper explores five alleged weaknesses of the Mark Logic XML Server 4.1. The implications of the Oracle analysis range from cost to complexity to proprietary technology to financial weakness. Mark Logic, according to the white paper “Mark Logic XML Server 4.1”, essentially cannot walk and chew gum at the same time.

My own experience with the Mark Logic technology is that Mark Logic can walk, chew gum, and compete in data gymnastics. Keep in mind that I have been fed cold tacos and compensated with a Mark Logic goodie bag at a recent Mark Logic meet up in Washington, DC.

I sat in the crowded meeting room with 225 other people and listened to my former colleagues at Booz, Allen & Hamilton explain their use of the Mark Logic technology. I trust blue chip consultants because the risk of screw ups is too great to deploy a solution that makes a very large client unhappy. I heard speakers representing the US government explain their use of the Mark Logic technology in war fighting and point out its benefits. I heard whiz kids explain that slicing and dicing information permitted clever mash ups of data without humans fiddling to deliver on the fly, low latency solutions for decision makers. The assertions and evidence in the Oracle anti-Mark Logic white paper were not in line with what I learned directly from Mark Logic users.

This raises the question, “So what’s with the direct attack on Mark Logic?” In my opinion, there are three factors operating:

First, Oracle finds itself in a position of playing catch up in next generation data management. For whatever reason, the Oracle sales engineers have found that organizations in a number of business sectors want a non Oracle solution.

Second, customers are struggling with a mushy economy. The notion of paying more money for Oracle licenses, more money for Oracle service, and more money for more hardware to get acceptable performance continues to lose appeal. Like SAP, Oracle finds itself facing customer resistance to the traditional enterprise software approach. Cost is not the only deal breaker. The perceived benefits of an Oracle RDBMS are losing magnetism.

Third, the petascale flows of data in some organizations are forcing a fundamental rethink of traditional data management and repurposing approaches in use since the late 1970s. Last generation technology is not appropriate for next generation data management problems. Not even the entrenched Oracle database administrator can get an aging RDBMS elephant to do the tricks it could in the good old days. A different data animal is needed in my opinion.

I suggest that you read the Oracle write ups yourself and draw your own conclusions. The analysis of Mark Logic underscores Oracle’s own technical Werner’s syndrome.

Stephen Arnold, December 4, 2009

Oyez, oyez, I wish to disclose to the Veterans’ Employment and Training Service that I have been fed tacos and given a goodie bag by Mark Logic’s official chief technology officer. He was nervous around the addled goose, and he watched where he stepped after I waddled away. Prudence is a positive.
