Real Time Search Systems, Part 1

June 21, 2010

Editor’s note: For those in the New Orleans real time search lecture and the Madrid semantic search talk, I promised to make available some of the information I discussed. Attendees are often hungry to have a take away, and I want to offer a refrigerator magnet, not the cruise ship gift shop. This post will provide a summary of the real time information services I mentioned. The group focuses on content processed from such services as Facebook, Twitter, blogs, and other geysers of digital confetti. A subsequent blog post will present the basics of my draft taxonomy of real time search. I know that most readers will kick the candy bar wrapper into the gutter. If you are one of the folks who picks up the taxonomy, a credit line would make the addled goose feel less like a down pillow and more like a Marie Antoinette pond ornament.

What’s Real Time Search?

Ah, gentle reader, real time search is marketing baloney. Life has latency. You call me on the phone and days, maybe weeks go by, and I don’t return the call. In the digital world, you get an SMS and you think it was rocketed to you by the ever vigilant telecommunications companies. Not exactly. In most cases, unless you conduct a laboratory test between mobiles on different systems, capturing the transmit time, the receiving time, and other data points such as time of day, geolocation, etc., you don’t have a clue what the latency between sending and receiving is. Isn’t it easier to assume that the message was sent instantly? When you delve into other types of information, you may discover that what you thought was real time is something quite different. The “check is in the mail” applies to digital information, index updating, query processing, system response time, and double talk from organizations too cheap or too disorganized to do much of anything quickly. Thus, real time is a slippery fish.
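For readers who want to try the laboratory test, here is a minimal sketch of the arithmetic, assuming you have logged a send time and a receive time for each message. The function names and the three sample timestamps are invented for illustration. They are not measurements from any carrier or search service.

```python
from datetime import datetime, timezone
from statistics import median


def latency_seconds(sent, received):
    """Return the gap between send and receive time, in seconds."""
    return (received - sent).total_seconds()


def summarize(pairs):
    """Summarize latency over a list of (sent, received) timestamp pairs."""
    latencies = sorted(latency_seconds(sent, received) for sent, received in pairs)
    return {
        "min": latencies[0],
        "median": median(latencies),
        "max": latencies[-1],
    }


# Invented sample data: three messages logged on both handsets (UTC).
samples = [
    (datetime(2010, 6, 14, 9, 0, 0, tzinfo=timezone.utc),
     datetime(2010, 6, 14, 9, 0, 4, tzinfo=timezone.utc)),
    (datetime(2010, 6, 14, 12, 30, 0, tzinfo=timezone.utc),
     datetime(2010, 6, 14, 12, 31, 35, tzinfo=timezone.utc)),
    (datetime(2010, 6, 14, 18, 15, 0, tzinfo=timezone.utc),
     datetime(2010, 6, 14, 18, 15, 12, tzinfo=timezone.utc)),
]

print(summarize(samples))
```

The same subtraction works for an index: swap the send time for the moment a document was published and the receive time for the moment it first turned up in a query result. The clocks on both ends must agree, or the numbers are more baloney.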

Real Time Search Systems

Why do I use the phrase “real time”? I don’t have a better phrase at hand. Vendors yap about real time and a very, very few explain exactly what their use of the phrase means. One outfit that deserves a pat on the head is Exalead. The company explains that in an organization, most information is available to an authorized user no more than 15 minutes after the Exalead system becomes aware of the data. That’s fast, and it beats the gym shorts of many other vendors. I would love to pinpoint the turtles, but my legal eagle cautions me that this type of sportiness will get me a yellow card. Figure it out for yourself is the sad consequence.

Here’s the list of the systems I identified in my lectures. I don’t work for any of these outfits, and I use different services depending on my specific information needs. You are, therefore, invited to run sample queries on these services or turn to one of the “real” journalists for their take. If you have spare cash and found yourself in the lower quartile of your math class, you may find that an azure chip consultant is just what you need to make it in the crazy world of online information.

In my lectures I made four points about these types of real-time search services.

First, each of these services did at the time of my talks deliver more useful and comprehensive results than the “real time search” services from the Big Gals in the Web search game; namely, Google, Microsoft Bing, and Yahoo. Yahoo, I pointed out, doesn’t do real time search itself. Yahoo has a deal with the OneRiot.com outfit. The service is useful and I suppose I could stick it in the list above, but I am just cutting and pasting from the PowerPoint decks I used as crutches and dogs in my lecture.

Second, these outfits come and go. Collecta.com, the company I explained in some detail in New Orleans, landed another dump truck of venture money. If a link goes dead, another victim has been run over by managers with more marketing expertise than paying customers.

Third, the outputs from these services have to be subjected to post processing. The question this statement begs is, “Well, how do I do that?” There are different methods which may require original coding, the use of tools from the goslings who peck at dirt near the goose pond, or the commercial vendors which I mentioned in the second set of use case examples. Raw outputs from these systems may be chock full of funny, fuzzy stuff. Think disinformation.

That’s it for this segment of my lectures delivered the week of June 14, 2010 in two cities before a combined audience of 700 people in venues 3,700 miles apart. I don’t think my doing the talks in person made much of an impact. Make your own decision. If you want to complain, use the comments section of this Web log. Someone who pretends to be me reads the comments and responds with a cheerful thank you. The goose is heading back to the pond filled with mine run off water. Better than the water from the oil spill in Louisiana.

Stephen E Arnold, June 21, 2010

I was paid to give these talks. Believe it or not.

Comments

2 Responses to “Real Time Search Systems, Part 1”

  1. Real Time Search Systems, Part 1 | Digital Asset Management on June 21st, 2010 1:50 am

    […] Real Time Search Systems, Part 1 : Beyond Search. […]

  2. julien on June 21st, 2010 8:49 am

    I believe there is a definition for realtime computing and it’s not about “timing”, but about constraints: a realtime system can guarantee delivery of the messages within a specific timeframe. Past it, it means that the message doesn’t exist 🙂

    What strikes me about realtime search is that most of the time it’s actually “search in recent data”. It doesn’t involve new algorithm or even use-cases, but it just reduces the bucket of searchable data to what’s been published in a “recent” timeframe.

    Finally, all these services still involve the user actually doing a search, over and over again (by opening a browser window). I’m looking forward to stuff like Google Alerts in realtime. I do the search once, and then, get results pushed to me, rather than pull them over and over and over again.

    That’s my 2 cents! I’d love to chat more if you have time! My email should be posted.
