Dieselpoint 3.5 Available

May 7, 2010

Dieselpoint keeps a low profile. The company has an interesting technology and a year or so ago made a run at capturing some open source goodness. The company offers a search system that supports XML content, PDF files, structured data for eCommerce and business intelligence, and OEM search applications. Dieselpoint is written in Java and can run on most platforms.

“Dieselpoint Search Version 3.5 Released – All-Java Enterprise Search Software Adds New Features” describes the new release this way:

For those who track the search software industry, Dieselpoint Search is widely regarded as the most sophisticated solution on the market for applications that require full-text, navigational, and parametric search.

Okay. I think Dieselpoint is a capable product, and you can see it in action at http://hmv.com/hmvweb/home.do. HMV is a large music retailer in the UK. A number of companies offer content technology as good as or better than Dieselpoint’s, the description in Lifestyle notwithstanding. I include Dieselpoint in my free Overflight service, which you can check out here. You can easily compare the buzz for different vendors’ systems with a mouse click or two. The number of articles about a vendor in an Overflight provides a rough indication of interest in a particular company. Google, for example, warrants an Exalead search service plus five separate Overflights.

Lifestyle’s writer may find the Overflight service helpful when suggesting that “those who track the search software industry” will know that a particular product is “widely regarded as the most sophisticated solution on the market”. For a comparison test, check out Autonomy and Mark Logic on Overflight. I find the coverage comparison interesting and possibly suggestive.

Stephen E Arnold, May 7, 2010

Unsponsored post.

News Flash: Data Mining May Not Be an Information Cure All

May 7, 2010

Technology can work wonders. Technology is supposed to make it easier for downsized organizations to perform with agility and alacrity. I am “into” technology but I understand that the minimum wage workers at airline counters and financial institutions operate within systems assumed to work as intended. These systems, in my opinion, work neither at the level of answering a simple question like “Is the flight on time?” nor at the more sophisticated level of “Where did this wire transfer come from?”

Why is it a surprise that technology cannot do less familiar tasks without glitches or outright breakdowns? I was surprised to read “NY Plot Highlights Limitations of Data Mining.” There were three reasons:

  1. The writer for Network World expresses gentle surprise that predictive systems don’t work too well when applied to the actions of one person. Network World documents lots of system glitches, and the gentle surprise is not warranted.
  2. The story plants the seed that we have no choice but to rely on fancy content processing systems. Are there other options? None if you rely on this article’s analysis. In my experience there are indeed options, but these are conveniently nudged to the margins.
  3. The dancing around with data mining is specious. Text processing is one of those Rube Goldberg machines just built with software. Get the assumptions wrong, the inputs wrong, or the algorithms wrong to a slight degree and guess what? The outputs are likely to be wrong.

Here’s the passage I found interesting:

That fact is likely to provide more fodder for those who question the effectiveness of using data mining approaches to uncover and forecast terror plots. Since the terror attacks of Sept. 11, the federal government has spent tens of millions of dollars on data mining programs and behavioral surveillance technologies that are being used by several agencies to identify potential terrorists. The tools typically work by searching through mountains of data in large databases for unusual patterns of activity, which are then used to predict future behavior. The data is often culled from dozens of sources including commercial and government databases and meshed together to see what kind of patterns emerge.

In my experience, humans and text processing must work in an integrated way. Depend only on technology and the likelihood of getting actionable information that is immediately useful goes down. Even Google asks humans to improve on its machine translation outputs. Smart software may not be so smart.

Stephen E Arnold, May 7, 2010

Unsponsored post.

Kindle Tracks User Highlights

May 7, 2010

I am not sure how I like this tracking of what I highlight. I read “Amazon Tracking Most Highlighted Kindle Passage.” I do quite a bit of open source work. I mark up documents anyone can read. What my method yields is a passage, maybe a phrase, that yields a nugget of information. Many public sources contain substantive information. In my experience, some authors are not aware that their seemingly innocuous remarks reveal actionable intelligence. Knowing what I “mark” is taking my specialized type of analysis and having access to that quite valuable meta component. You may not care about this type of tracking. I do.

Here is the passage that annoyed me:

Amazon is now tracking passages and books most highlighted on the Kindle and displaying them on its website. While the Kindle allows its customers to highlight book passages that are meaningful to them, Amazon does display them anonymously.

Your mileage may vary.

Stephen E Arnold, May 7, 2010

Unsponsored post.

International eDiscovery

May 7, 2010

Short honk: Two outfits–Lineal Ltd in the UK and Complete Discovery Source in the US–have teamed to deliver global eDiscovery services. The US is one of the most litigious and legal-eagle-stuffed nations. Now eDiscovery is being exported just like motion pictures, rap, and Baywatch. The basics of the deal appear in “Lineal Partners with Complete Discovery Source to Provide International eDiscovery Services.” Is this a step forward or another example of a cultural export? Of the dozen vendors of eDiscovery, aren’t most selling internationally? Many are.

Legal eagles like to bill. For these professionals, international eDiscovery is good news. For others? The answer depends on who wins.

Stephen E Arnold, May 7, 2010

Unsponsored post.

Netezza Partners with Coveo for Customer Service

May 7, 2010

A leading provider of customer information access and enterprise search solutions, Coveo can now add Netezza, a leader in data warehouse, analytic and monitoring appliances, to its customer list.

According to a new press release, Netezza will implement Coveo’s Customer Information Access Solution for Contact Centers and Customer Self-Service. This will allow Netezza customers to quickly search and pinpoint information within its online support knowledgebase to solve their technical questions, ultimately deflecting calls into Netezza’s call centers. Netezza customers will also have access to the secure, online customer self-service console that offers infinite possibilities to retrieve, manipulate, and display data. However, what Coveo offers is much more than a search engine, or a dashboarding tool, or a reporting platform, though it can do all of these things well. This new client will give Coveo yet another opportunity to flex its muscles for the crowd that will surely be watching.

Melody K. Smith, May 5, 2010

Note: Post was sponsored.

Linguamatics I2E Used To Analyze Election Tweets

May 6, 2010

Linguamatics, the UK-based company specializing in natural language processing (NLP), deployed its I2E software during the final televised UK election debate to analyze tweets being sent both during and after the debate. “Linguamatics Reveals Instant Reactions on Twitter to Third Televised Election Debate” describes in detail the analytics that I2E captured, such as tweets in favor of a certain candidate, top issues for the twitterers, and “positive sentiments” towards each candidate, this last measure reflecting the narrowing gap between their performances. The text mining software analyzed over 180,000 tweets in a 1.5 hour span on the night of the debate and can identify the meaning of a tweet even if strangely worded. While Linguamatics’ software can be used to extract the meaning from tweets, its clients include many top-10 pharma companies that use its much heavier data analysis methods.

Samuel Hartman, May 6, 2010

Note: Post not sponsored.

Market Share, Google, and Fancy Dancing

May 6, 2010

What does a 71 percent market share mean? In search, lots of ad revenue. In other businesses like the old Standard Oil, breakup talk. “Google U.S. Search Share Tops 71% In April, Hitwise Reports” contains news that may create more hassles for Google. Microsoft’s Web search folks may be chagrined by Google’s market share but the data creep ever closer to the “m” word that made the old Ma Bell history. For me, the killer passage in the write up was:

Microsoft Bing (MSFT) had 9.43%, down from 9.62%.

The $65 billion Microsoft is losing share. Yahoo is also slumping. The Google keeps on growing.

So what does this mean for competitors? [a] Turn the lawyers and PR people loose on a monopoly campaign, [b] pretend that the gap can be closed, [c] give up, [d] update the resume, [e] two of the above.

Stephen E Arnold, May 5, 2010

No one paid me to write this.

Hardware Extends SQL Size Boundaries

May 6, 2010

I know. There is the SQL crowd. And there is the NoSQL crowd. Don’t forget the shotgun marriage segment which marries SQL and NoSQL. Where do you think IBM and Oracle fall in this range of options? Part of the answer for IBM appears in “Power7 Blades: The i/DB2 Combo Versus AIX/Oracle”. The idea is simple. Traditional database technology can handle the peta- and exa-scale data management tasks. Well, that’s what the assumption is. The article points out that “In many cases, the premium that IBM is charging i For Business shops for configured Power7 blade servers is reasonable compared to what it costs to configure AIX and an Oracle database on the same identical blades.” The write up explains the pricing for IBM’s newest hardware for database. The assumption is that I would use IBM hardware for data management. The interesting part is that this write up could be edited to apply to Oracle’s latest hardware line up. In short, there is not much difference between these two companies’ approach to data management. The cost for licenses gets really big really fast. Take 64 cores, buy hardware, pay for software licenses, rinse, repeat. Big numbers, fast. The write up is important because it provides performance and cost figures. I downloaded the story and tucked it in my pricing folder.

The challenge to IBM and Oracle is that the cost of SQL solutions is going to hockey stick. What licensees need to ask are such questions as:

  1. What do the NoSQL or hybrid solutions cost? When considering alternatives, what happens to those licensing costs?
  2. If SQL is assumed to be the solution to data management woes, why are NoSQL and hybrid solutions becoming the methods in use at some outfits, ranging from US government entities to commercial outfits?
  3. What are the additional costs for maintaining and tuning these hardware/software solutions from IBM and Oracle?
  4. My view is that SQL is assumed to be the right tool for today’s data management tasks. I am not so sure.

NoSQL specialists like Mark Logic have some interesting approaches. Hybrid outfits like Aster Data have interesting approaches. I find pure SQL solutions both less interesting and more trouble than the high price tags warrant. Just my opinion.

Stephen E Arnold, May 6, 2010

No one paid me to write this.

Digital Black-Snouted Flying Frog: Objective Search Results

May 6, 2010

Pick a free online Web search service. Run a query. Are the results you see in your laundry list presented without regard to payment, bias, or some other digital tilt?

Tough to tell. At the Search Engine Meeting in Boston on April 26th and 27th, Dr. David Evans and I had one of our note-passing moments. Thank goodness he and I were not in the same math class. The professor would have taken our slide rules away and maybe banished us to the gym.

The notes presented below tackle the issue of objective search results and their becoming an endangered species in today’s rough-and-tumble marketplace.

We sketched and annotated a chart that looked like this:

Search results chart

The up-swinging line shows online users’ technical capabilities rising; the down-swinging line shows objective search results losing value. The idea is that in the public Web search arena, objectivity may be losing ground to subjective selection and presentation of search results. The user *thinks* results are objective. The results may be subjective. If this supposition is true in a world of pay for placement, online advertising, sponsored results, and the chicanery of search engine optimization experts, there may be some implications in the world of Web search.

For instance, the type of search results from a service such as Delicious, Facebook, or StumbleUpon may be perceived as having more value. The idea is that if a person suggested a particular source of information and that person has some “connection” to the user, then the results may be more useful. Other possible descriptions of the results might be “trustworthy,” “accurate,” “non commercial”, or “reliable.”

The fact may be that these social results are as subject to commercial intent as the results in a Bing, Google, or Yahoo search list. That may not matter because there seems to be a flagging appetite for verification of information snagged from public Web sites. The demographic and social shift may be the prehistoric termites nibbling on the intellectual foundations.

The passages below come from the notes that Dr. Evans and I exchanged in the course of our note-passing moment:

Arnold: I wonder if the interest in social media is a change in how people think about finding information. I think the social angle in the US is different from what I have experienced in China and Japan. Surprisingly there was some resistance to social media in Slovenia which contrasts sharply with the texting frenzies of the Chinese and Japanese.

Evans: In the US, we’re skeptical about authority (and resist the temptation to appear to conform to someone else’s opinions). This is not the case in other places (like Japan).

Arnold: Social is the new security problem. Information validity is an issue and some information is subject to manipulation.

Evans: It’s the network of associations that permits individuals to “suspend skepticism” and conform, cooperate, join in, etc. A kind of democracy effect. One network effect I have observed is the “rule of two”. If two acquaintances agree on a position, we’re likely also to agree.

Arnold: The social trend in the US is able to make factually incorrect information into “accurate” information.

Evans: Is this an Anglo-centric phenomenon? That is, is it a “sea change” only because we are Americans? In Japan, France, Italy, India and many other countries, social collaboration is the norm.

Arnold: The potential for misinformation is ratcheting upward in the US. Information can be shaped and the consumers of that information are unaware. Think of Fox News, which is owned by Mr. Murdoch. The information pushes an agenda, and despite its approach, the content gets wide distribution and is sometimes indistinguishable from information that does not have a slant or a political angle.

Evans: It’s ironic that a technology–digital computers and networks–designed to overcome limitations in human memory and ability to calculate probabilities and ground facts–would become the vehicles for and licensers of socially grounded points of view.

Arnold: It’s tragic that many individuals cannot make informed judgments about the information used to “know” something. The lack of information literacy gives social media in the US considerable potential for disinformational activities.

Evans: The Web has introduced noise in the information channel. It’s hard to distinguish one result of a search from another. The results “look alike” in a search results list. One might be from a respected research institution, another from a blog post. The attitude (banal democracy) has become, “Who can tell which is more reliable?” We may be taking a huge step backwards.

Arnold: The digital Dark Ages? Figuring out which information is more accurate, reliable, or objective may be like finding a black-snouted flying frog. A long shot indeed.

Stephen E Arnold and Dr. David Evans, May 4, 2010

Autonomy and Its ROI Push

May 6, 2010

Autonomy seems to have opened a new front in the information access wars. Search and content processing have experienced Balkanization over the last 20 years. Flash back to 1980: how many ways were there to search digital information? My list includes the marvelous Inquire product (forward and rearward truncation), Stairs III, BRS, and a handful of other systems. Today I track more than 300 vendors, and I could expand that list by tossing in companies embedding enterprise search into applications.

In “British Airways Enhances Customer Engagement and Drives Up Conversion Rates with Autonomy Optimost” I could discern a return on investment push. The idea is that search delivers a payoff, a substantial benefit in today’s churning financial climate. Anyone planning a trip to Scotland or Greece this morning? Thought not.

The write up asserts:

“Autonomy Optimost enables businesses to depart from legacy approaches relying on ill-equipped metrics and guesswork, and empowers them to gain a true understanding of their customers’ preferences, intent and behavior,” said Andy Jenks, CEO of Autonomy Optimost. “Businesses are increasingly turning to Autonomy Optimost to democratize their marketing campaigns and design process and we are delighted to see British Airways achieve these fantastic results with Autonomy Optimost.”

Optimost, according to Autonomy:

delivers automated capabilities such as advanced analytics, pattern-matching, optimization, and targeting to optimize marketing across multiple channels to drive business growth. Marketers can now take a proactive and automated approach for identifying emerging customer segments and determining the most effective way to market to them, including the most optimal product recommendations, promotional offers, pricing strategies, and advertising placements.

Why is this important?

  1. Autonomy is a smart content platform, not just search.
  2. Autonomy has an uncanny knack for market positioning.
  3. Autonomy has morphed from search into a far broader software ecosystem with vertical technology that makes the company a threat in audio, video, data, and text processing.

Worth tracking in my opinion. Unlike Google or Microsoft, Autonomy sends a more consistent message about what its technology delivers. That’s the core of the argument, isn’t it? Return on investment. How will other vendors respond? Hopefully with substantive cost and payoff information. I am not holding my breath, however.

Stephen E Arnold, May 6, 2010

Unsponsored post.
