Actionable Search Jargon

May 17, 2009

I recall interviewing Ali Riaz in my Search Wizards Speak series here. He founded a company called Attivio, and as I recall, Attivio is derived from the Italian word “attivo”, which means “active”. Imagine my surprise when I read “Actionable Search – From What to Why?” here. The Microsoft Enterprise Search Blog has, in my opinion, inadvertently given Attivio a bit of a PR boost. Actionable search is not too far from what Attivio’s team has been selling for the last year or so.

The Microsoft spin on “actionable search” is interesting in three other ways.

First, the idea is that “no frills” search is not what folks want. Considering that two-thirds of the users of existing search systems are not too happy with those systems, I am not sure “no frills” is sufficiently broad to cover the wide range of use cases that the research for Martin White’s and my Successful Enterprise Search Management revealed. There are many types of search, and not all of them are “actionable”. Some are pretty important, if mind-numbingly dull, research tasks, such as looking through email for a smoking gun in an eDiscovery process.

Second, the notion of “directly” is interesting to me. For example, a client wanted me to provide some information about the growth of digital content in a typical organization. One company presenting at the MarkLogic user conference (which attracted about 430 people, compared to the 90 or so at the search summit held at the same time) experienced a 4X growth in digital information in a single year. In that presentation I learned that search and finding were hooked into many core processes; for example, one system automatically pulled information from another system to update its own records in case that data might be needed. I suppose one can stretch this notion of interprocess XML exchange to “direct”, but I prefer to think of these types of “search” functions as more fine grained. Rock hammers don’t do the job when it comes to electronic information.
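
To make the plumbing concrete, here is a minimal sketch of that sort of interprocess XML exchange. The feed URL, element names, and record layout are my own placeholder assumptions, not details from the MarkLogic presentation:

# One system publishes an XML feed; another pulls it and folds the records into
# its own store so the data is on hand if it is ever needed. Every name below
# (URL, element tags, fields) is a hypothetical placeholder.
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "http://inventory.example.internal/feed.xml"  # placeholder internal feed

def fetch_updates(url):
    """Fetch the XML feed and return its records as plain dictionaries."""
    with urllib.request.urlopen(url) as response:
        root = ET.parse(response).getroot()
    return [{"id": item.findtext("id"),
             "value": item.findtext("value"),
             "updated": item.findtext("updated")}
            for item in root.findall("record")]

def apply_updates(records, store):
    """Update the consuming system's local store, keyed by record id."""
    for rec in records:
        store[rec["id"]] = rec

if __name__ == "__main__":
    local_store = {}
    apply_updates(fetch_updates(FEED_URL), local_store)
    print(len(local_store), "records refreshed")

The “search” in that exchange is a narrow, machine-to-machine lookup, not a person typing a query, which is why I hesitate to call it “direct”.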

Third, the notion of actionable information by itself and without context is in my mind more closely linked to business intelligence reports. Search may be action free, like learning undertaken simply to be confident in one’s knowledge. I suppose language is sufficiently malleable to permit stretching a notion to embrace Wrigley Field, but that might not be too helpful to me. Sometimes I want to know something for its own sake, not for its utility; for example, Nero ordered his mother to be put to death, according to Arthur Weigall in Nero: The Singing Emperor of Rome (GP Putnam: London, 1930). Agrippina, Nero’s mother, was not too popular. The Senate sent congratulations to Nero for his deed. Not much use to me, but a neat anecdote about family harmony. See page 209.

In short, I struggle with the notion of simplifying and abstracting information retrieval and text processing. The need for precision has never been greater. In my view, the problem with most enterprise search deployments pivots on this notion of precision. Muddy thinking and the belief in silver bullets lead to multi-million-dollar costs and may, in today’s economic climate, contribute to the failure of an undertaking, not help ensure its success.

Simple is good. Simplicity without precision is not too useful to this addled goose.

Stephen Arnold, May 17, 2009

Wolfram Alpha and Niche Ville

May 17, 2009

I have been grabbing quick looks at the write-ups about the Wolfram Alpha search system. One of the more interesting essays is by Larry Dignan. You can read “Wolfram Alpha Launches: Can It Break Out of Niche Ville?” here. I liked the idea of a “Niche Ville”. The phrase connoted a small town, which may be interesting but is not likely to suck MBAs into its maw the way Manhattan does. Aside from the suggestion that Wolfram Alpha was a “hamburger and fries” type of search and content processing system, I found this comment quite suggestive:

Overall, Wolfram/Alpha reads like an encyclopedia. It’s handy at times, but the big question is whether the search engine can break out of niche-ville. Sure, geeks like the presentation and Wolfram/Alpha can be handy for deep dives, but the average person will want some sort of results every time. In that regard, Wolfram/Alpha may be a disappointment.

My thought was, “Wolfram Alpha is like other question answering systems. It’s a bit like an advanced search function because the user has to do more thinking than typing ‘pizza’ into a Google Map. As a result, only a small percentage of Web users may have the mental energy to tackle these systems.” Niche is a useful term.

Stephen Arnold, May 17, 2009

Google: Growth and Grimness

May 16, 2009

Search Engine Guide here has a video about Google’s market share. The video is a reaction to the Marketing Vox summary of Hitwise’s search market share data for the four weeks ending April 25, 2009. You can read “Google Creeps toward 73 percent of US Search in April” here. To highlight the risks of a monoculture, you will want to peruse the San Francisco Chronicle’s story “Frustration, Distress over Google Outage” here. It seems that as Google’s rivals fall farther behind, the Google’s technical weaknesses, what I call YAGG (yet another Google glitch), become more of an issue for lots of Google dependents. Little wonder that every new search engine is labeled “a Google killer”. Too late, lads and lassies. Too late.

Stephen Arnold, May 16, 2009

Wolfram Alpha Goes to R Systems for Horsepower

May 16, 2009

CNet’s Stephen Shankland wrote “Wolfram Alpha Gets Supercomputer Boost” here, and I think the story will super-boost the hopes that Wolfram Alpha’s search system will crush the Google. Rooting for a Google smasher is a bit of a pundit trend these days. With Wolfram Alpha supposed to be officially alive as I write this, I think the notion of the world’s 66th fastest supercomputer can give those Google haters an adrenaline rush. Mr. Shankland provides a link to the November 2008 list of supercomputers, and when I looked at the list I did not see the Google listed. The reason? Supercomputers are indeed fast, but by themselves they are not exactly what’s required to declaw Googzilla. Search and content processing is an interesting technical challenge, and raw speed does not frighten the opposition to death. I learned from Mr. Shankland:

The system, called R Smarr, has 4,608 processor cores using 576 quad-core “Harpertown” Xeon machines, 65,536GB of memory, and high-speed InfiniBand data-transfer connections, according to the Top500 site and a Dell case study on the system (PDF). It also uses both the Red Hat Enterprise Linux and Microsoft Windows HPC Server operating systems, according to the Dell paper. Alpha requests will be served from five co-location facilities, Wolfram Research said. There actually are two supercomputers in the project, with nearly 10,000 processor cores total and hundreds of terabytes of hard drives.

I wonder if this commercial for Dell servers and Intel CPUs will indeed humble the arrogant Googlers. I have to keep reminding myself that dear old Google has been chipping away at the technical problems that keep most competitors from chewing into Google’s share of the Web search market. So what is it now? A decade for Google’s search effort? Wolfram Alpha has been in the search game for what, a couple of years?

It should be interesting to see if the newcomer can do what Fast Search & Transfer, Microsoft, and Yahoo, among others, have been unable to do. If I were a betting man, I would say Wolfram Alpha comes out of the gate with long odds.

Why?

CNet’s Tom Krazit reported here “Wolfram Alpha’s Launch Delayed Amid Glitches”. Hmmm.

Stephen Arnold, May 16, 2009

Universal Answer Engine

May 16, 2009

In the gap created by Wolfram Alpha’s sort-of start tonight, I poked through my files. I came across “How to Build a Universal Answer Engine: Ten Vital Principles” here. After a week at a conference and dozens of conversations about whizzy new search systems, I must admit I am a bit jaded. True Knowledge, on the other hand, is excited about the idea of a Universal Answer Engine. This is a Web log post about True Knowledge’s computer system designed to answer users’ questions on any subject. I must admit that I am skeptical. The questions have to be in text. So much for equations? The questions must be in a language I speak, which may or may not make my questions intelligible. Enough of this skepticism. The write-up lays out “principles”, and I am not comfortable repeating each one. I can highlight two principles and offer a comment:

Principle 4 is “The only truly scalable way to learn everything is by allowing users to contribute.” I don’t really disagree, but I think only a small percentage of a user community contributes information. To get around this problem, True Knowledge taps into Wikipedia. I give True Knowledge credit for mentioning that some users are not to be trusted. The problem is that with only the motivated contributing, the system has to find some way to determine the likelihood that a particular contribution or item is “correct” or “trusted”. This is a pretty complicated task, and I think it is worth noting that the Google and others are beavering away on this problem.

Principle 8 is “All facts need sources and these need to be available to the user.” This sounds like provenance, which is related to principle 4. I saw a demo by a Stanford professor where provenance and uncertainty were query modes supported by the system. True Knowledge seems to be in step with this line of inquiry. Calculating these “values” will consume a chunk of computer time slices.
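
To show why those time slices add up, here is a toy sketch of the sort of calculation principles 4 and 8 imply: every fact carries its sources, and contributor votes are weighted by an assumed trust score. Nothing here comes from True Knowledge; the weights and the scoring rule are my own illustrative assumptions:

# Toy sketch only: contributor trust weights and the scoring rule are assumptions
# chosen for illustration, not anything True Knowledge has described.
from dataclasses import dataclass, field

@dataclass
class Fact:
    statement: str
    sources: list = field(default_factory=list)   # provenance: where the claim came from
    votes: list = field(default_factory=list)     # (contributor_trust, agrees) pairs

def confidence(fact):
    """Weight each contributor's vote by an assumed trust score between 0 and 1."""
    total = sum(trust for trust, _ in fact.votes)
    if total == 0:
        return 0.0
    weighted = sum(trust if agrees else -trust for trust, agrees in fact.votes)
    return max(0.0, weighted / total)

fact = Fact(
    statement="Agrippina was Nero's mother.",
    sources=["Wikipedia", "Weigall, Nero: The Singing Emperor of Rome, p. 209"],
    votes=[(0.9, True), (0.4, True), (0.2, False)],
)
print(fact.sources, round(confidence(fact), 2))

Run something like that over millions of facts, with trust scores that themselves have to be estimated, and the computing bill becomes clear.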

To wrap up, the principles are interesting. A number of companies are in the question answering business, and these organizations will need deep pockets to pull off a service that keeps me happy.

Stephen Arnold, May 16, 2009

Arnold’s Life Is Tweet Available

May 16, 2009

Short honk: one of those for-fee write-ups that contain real information, not the quacks of the addled goose, is now available in the May 2009 Information World Review. You can find details about the column and the opinion piece “Life Is Tweet for Real-Time Search”. My editor, Peter Williams, has done an excellent job with the article. If you can snag a copy of this UK publication, you can get some chunks of Tweet stew served up. The information will stick to your ribs, unlike the marketing honks of the contributors to this free Web log. Click here for more information. I wrote: “Twitter is a hybrid information service.” Want to know more? Subscribe to IWR.

Stephen Arnold, May 16, 2009

The Pain in Spain Is Tending to the Inane

May 15, 2009

Read “Recording Industry Tries To Shut Down Search Engine In Spain Without Allowing It To Defend Itself” here. If true, the Internet she be changin’. Search requires content processing. Robots index and point. Software becomes the problem.

Stephen Arnold, May 15, 2009

Videosurf Update

May 15, 2009

Since VideoSurf’s birth in mid-2008 (Beyond Search reviewed it at http://arnoldit.com/wordpress/2008/09/23/videosurf-video-metasearch/), it’s offered up a beta version (I wrote about it at http://arnoldit.com/wordpress/2008/10/18/videosurf-looking-for-wave-of-new-users/), and now it’s opened up the API so developers can install “visual search” on their own sites. VideoSurf promotes itself as the only video search engine that can search and “see” inside videos to index content rather than depending upon tags and descriptions that can produce spam. The API allows access to videos that can be selected to relate to site content; sites can also tailor the displays to their promotional needs. See more about the API availability at http://www.videosurf.com/blog/search-and-video-lookup-apis-1039/. This could be a smart choice for sites looking to rev up their content while keeping it relevant.
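
For readers who want a sense of what wiring such an API into a site might look like, here is a minimal sketch. The endpoint, parameters, and response fields below are placeholder assumptions, not the documented VideoSurf API; consult the blog post linked above for the real details:

# Hypothetical sketch only: the URL, parameters, and JSON fields are placeholders,
# not the actual VideoSurf API. The point is the shape of "pull videos related to
# this page's content, then render them your own way."
import json
import urllib.parse
import urllib.request

API_URL = "http://api.example.com/video-search"  # placeholder endpoint
API_KEY = "YOUR_KEY_HERE"                        # placeholder credential

def related_videos(page_topic, limit=5):
    """Ask the (hypothetical) service for videos related to a page's topic."""
    query = urllib.parse.urlencode({"q": page_topic, "limit": limit, "key": API_KEY})
    with urllib.request.urlopen(API_URL + "?" + query) as response:
        results = json.load(response)
    # Keep only the fields the site needs to build its own display.
    return [{"title": r.get("title"), "url": r.get("url"), "thumbnail": r.get("thumbnail")}
            for r in results.get("videos", [])]

if __name__ == "__main__":
    for video in related_videos("model railroading"):
        print(video["title"], video["url"])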

Jessica Bratcher, May 15, 2009

YAGG: Bytes of Clay

May 15, 2009

Googzilla stubbed its paw today. You can read the official explanation here. YAGG means yet another Google glitch. The once alleged online uptime champ turned chump. Millions inconvenienced. Enough said.

Stephen Arnold, May 15, 2009

Brands as Gravity

May 14, 2009

In the online world, there are Jupiters, suns, and asteroids. Traffic sorts itself out in ways that are gravitational. A big brand gets lots of traffic. The asteroids generate less customer pull. Why’s this important? Online sites without gravitational pull are not likely to attract traffic. Sad but true. Quality may be defined as lots of clicks. IT Pro here reported data that underscores the gravitational pull of 10 brands in the UK. The headline told the tale: “Top 10 Web Brands Get Half of UK Traffic.” The top three online brands were Facebook, Microsoft, and Google. The most interesting remark in the report was this:

Second-ranked MSN/Windows Live slid nearly a percentage point to 9.2 per cent, but added over a billion minutes in the past year. Third-ranked Google gained 0.4 per cent market share to 5.3 per cent, adding 950 million minutes.

Facebook appears to be the winner, which may have implications for the host of social challengers now available. In the UK, social seems to be the pull.

Stephen Arnold, May 14, 2009
