Real Time Search Systems, Part 4

June 24, 2010

Editor’s note: In this final snippet from my June 15 and June 17, 2010, lectures, I want to relate the challenge of real-time content to the notion of “aboutness.” The term is an old bit of jargon, and I have appropriated it to embrace the semantic methods necessary to add context to information generated by individuals using such systems as blogging software, Facebook, and Twitter. These three content sources are representative only, and you can toss in any other ephemeral editorial engine you wish. The “aboutness” challenge is that a system must process both activity and content. “Activity” refers to who did what, when, and where. The circumstances are useful as well. “Content” refers to the message payload. Appreciate that some message payloads may be rich media, disinformation, or crazy stuff. Figuring out which digital chunk has value for a particular information need is a tough job. No one, to my knowledge, has it right. Heck, people don’t know what “real time” means. The more subtle aspects of the information objects are not on the radar for most of the people in the industry with whom I am acquainted.

Semantics

I hate defining terms. There is always a pedant or a frustrated PhD eager to set me straight. Here’s what I mean when I use the buzzword “semantic”. A numerical recipe figures out what something is about. Other points I try to remember to mention include:

  • Algorithms or humans or both looking at messages, trying to map content to concepts or synonyms
  • Numerical recipes that send content through a digital rendering plant in order to process words, sentences, and documents and add value to the information object
  • Figure out or use probabilities to take a stab at the context for an information object
  • Spit out Related Terms, or Use For Terms
  • Occupy PhD candidates, Googlers, and 20-something MBAs in search of the next big thing
  • A discussion topic for government committees nailing down the concept before heading out early on a Friday afternoon.
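
The first and fourth bullets (mapping content to concepts and spitting out related terms) can be sketched in a few lines. This is a toy illustration only, not the method from the lectures: the concept table, the `tag_message` and `related_terms` names, and the word lists are all hypothetical, and a production system would use statistical or linguistic methods rather than a hand-built lookup.

```python
import re

# Hypothetical concept table: each concept maps to terms that signal it.
CONCEPTS = {
    "music": {"song", "album", "concert", "singer"},
    "sports": {"goal", "match", "score", "team"},
}


def _words(text):
    """Lowercase the message and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def tag_message(text):
    """Return the concepts whose signal terms appear in the message."""
    words = _words(text)
    return sorted(c for c, terms in CONCEPTS.items() if words & terms)


def related_terms(concept, text):
    """Return the concept's terms that the message did not already use."""
    return sorted(CONCEPTS[concept] - _words(text))


message = "Lady Gaga concert tonight, new album soon"
print(tag_message(message))             # concepts the message is "about"
print(related_terms("music", message))  # candidate Related Terms
```

The point of the sketch is the shape of the output: the message itself never says “music,” yet the system attaches that concept and offers terms the author did not type.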

When semantics is figured out and applied, the meaning of Lady Gaga becomes apprehendable to a goose like me:

image

In order to tackle the semantics of a real time content object, two types of inputs are needed. One is activity, that is, monitoring who does what and when. The other is the information object itself. When the real time system converts digital pork into a high value wiener, the metadata and the content representation become more valuable than the individual content objects. This is an important concept, and I am not going to go into detail. I will show you the index / content representation diagram I used in my lectures:

image

The nifty thing is that when a system or a human beats on the index / content representation, the amount of real time information increases. The outputs become inputs to the index / content representation. The idea is that as the users beat on the index / content representation, the value of the metadata goes up.
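
The loop described above, in which outputs become inputs, can be made concrete with a small sketch. Everything here is an assumption made for illustration: the `Representation` class, the `Activity` fields (who, what, when, where), and the whitespace tokenizer are hypothetical stand-ins, not the design shown in the lecture diagram.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Activity:
    """Who did what, when, and where (hypothetical field names)."""
    who: str
    what: str
    when: str
    where: str


class Representation:
    """Toy index / content representation that logs its own usage."""

    def __init__(self):
        self.postings = defaultdict(set)   # term -> ids of information objects
        self.activity = defaultdict(list)  # object id -> activity records
        self.next_id = 0

    def add(self, text, activity):
        """Index an information object together with the activity that made it."""
        obj_id = self.next_id
        self.next_id += 1
        for term in text.lower().split():
            self.postings[term].add(obj_id)
        self.activity[obj_id].append(activity)
        return obj_id

    def search(self, term, activity):
        """Answer a query and record the query itself as new metadata."""
        hits = sorted(self.postings.get(term.lower(), set()))
        for obj_id in hits:
            self.activity[obj_id].append(activity)
        return hits


rep = Representation()
rep.add("Gaga tour dates", Activity("alice", "post", "2010-06-24T10:00", "web"))
rep.search("gaga", Activity("bob", "query", "2010-06-24T10:05", "mobile"))
print(len(rep.activity[0]))  # metadata for object 0 grows with each use
```

Each search leaves a trace on the objects it returns, so the metadata attached to an object grows every time someone beats on the representation, even though the object itself never changes.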

image

We have now entered the world of information physics. As you might anticipate, I am going to leave it to you, gentle reader of a free Web log, to link the quantum world with the real time information world. In my view, there is a great deal of money to be made in this interesting information niche.

To wrap up, I want to show you what happens when real time content is processed and the index / content representation is exercised.

image

We have entered the world of information creation. The volume of content is increasing, and as the shift to real time content processing gains traction, a revolution is, in my opinion, going to take place. Each time the dog chases its tail, the dog gets bigger and more powerful. Too bad I am an old goose.

Stephen E Arnold, June 24, 2010

Freebie
