Google Time

May 13, 2009

Searchology strikes me as a forum for Google to remind journalists, the faithful, unbelievers, and competitors that the GOOG is the big dog in search. You can read dozens of reports about Google’s search enhancements. A good roundup was “Google Unveils New Search Features” here. Don’t like AFP? Run this query on Google News and pick a more useful summary. For me, the key announcements had to do with time. The date of a document and the time of an event are important but different concepts. Time is a difficult problem, and Google’s announcements underscore the firm’s time expertise. Timelines? No problem. Date sort? No problem. What’s important to me is that this time prowess is the tiny tip of much deeper underlying technical capabilities. The Google has some muscles it is just starting to flex.

Stephen Arnold, May 13, 2009

Search and Predictive Math

May 9, 2009

Short honk: Curious about how predictive math will affect search and retrieval? Check out “How Your Search Queries Can Predict the Future” here. Queries are useful in interesting ways.
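The mechanics are easy to demonstrate. Below is a toy sketch, entirely my own and not anything from the article, that fits a least-squares trend line to made-up weekly query counts and projects the next week’s volume. Real predictive systems use far richer models, but the principle is the same: query volume can serve as a leading indicator.

```python
# Toy illustration of query-based prediction: treat weekly query volume for a
# term (say, "flu symptoms") as a leading indicator and extrapolate its trend.
# The counts are invented for illustration.

def fit_line(ys):
    """Ordinary least-squares fit of y = a + b*x for x = 0, 1, 2, ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys)) / \
        sum((x - mean_x) ** 2 for x in range(n))
    a = mean_y - b * mean_x
    return a, b

weekly_counts = [120, 135, 160, 190, 240, 310]  # hypothetical query volumes
a, b = fit_line(weekly_counts)
print(f"Projected next-week volume: {a + b * len(weekly_counts):.0f}")  # 322
```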

Stephen Arnold, May 9, 2009

Google Disclosed Time Function for Desktop Search

May 7, 2009

Time is important to Google. Date tags on documents are useful. As computer users become more comfortable with search, date and time stamps that one can trust are a useful way to manipulate retrieved content.

The ever-efficient USPTO published US 7,529,739, “Temporal Ranking Scheme for Desktop Searching,” on May 5, 2009. The patent is interesting because of the method disclosed, particularly the predictive twist. The abstract stated:

A system for searching an object environment includes harvesting and indexing applications to create a search database and one or more indexes into the database. A scoring application determines the relevance of the objects, and a querying application locates objects in the database according to a search term. One or more of the indexes may be implemented by a hash table or other suitable data structure, where algorithms provide for adding objects to the indexes and searching for objects in the indexes. A ranking scheme sorts searchable items according to an estimate of the frequency that the items will be used in the future. Multiple indexes enable a combined prefix title and full-text content search of the database, accessible from a single search interface.

You can snag a copy at http://www.freepatentsonline.com or you can brave the syntax at the USPTO here.
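What might an estimate of future use look like? Here is a minimal sketch, my own simplification and not the method actually claimed in US 7,529,739, which scores desktop items by exponentially decayed access counts so that recently and frequently touched items rank first:

```python
import math
import time

# Toy temporal ranking in the spirit of the abstract: estimate how often an
# item will be used in the future from its past access times, weighting
# recent accesses more heavily. The half-life figure is an assumption.

HALF_LIFE_DAYS = 14.0

def future_use_score(access_times, now=None):
    """Exponentially decayed access count: higher = likelier to be used again."""
    now = now if now is not None else time.time()
    decay = math.log(2) / (HALF_LIFE_DAYS * 86400)
    return sum(math.exp(-decay * (now - t)) for t in access_times)

# Rank two hypothetical desktop items by predicted future use.
now = time.time()
items = {
    "report.doc": [now - 3600, now - 86400, now - 2 * 86400],  # touched recently, often
    "old_memo.txt": [now - 200 * 86400, now - 300 * 86400],    # touched long ago
}
ranked = sorted(items, key=lambda k: future_use_score(items[k], now), reverse=True)
print(ranked)  # ['report.doc', 'old_memo.txt']
```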

Stephen Arnold, May 7, 2009

Visualization: Interesting and Mostly Misleading

May 7, 2009

I fund the Evvie award for excellence in search system analysis. This year’s number two winner was Martin Baumgartel for his paper “Advanced Visualization of Search Results: More Risks, or More Chances?”. You can read the full text here and a brief interview with him here. You will want to read Mr. Baumgartel’s paper and then the Fast Company article called “Is Information Visualization the Next Frontier for Design?” here by Michael Cannell.

These two write-ups point out the difference between a researcher’s approach to the role of visualization and a journalist’s approach to the subject. In a nutshell, visualization works in certain situations. In most information retrieval applications, visualization is not a benefit.

The Fast Company approach hits on the eye candy value of visualization. The title wisely references design. Visualization as illustration can enhance a page of text. Visualization as an aid to information analysis may not deliver the value desired.

Which side of the fence is right for your search system: selective use of visualization or eye candy? The addled goose favors selectivity. Most visualizations of data distract me, but your response may be different. Functionality, not arts and crafts, appeals to the addled goose.

Stephen Arnold, May 7, 2009

Open Text Vignette: Canadian Omelet

May 6, 2009

A happy quack to the reader who alerted me to this big-buck ($300 million plus) deal for Open Text to purchase the financially challenged content management vendor Vignette. You can read the Canadian press take on the announcement here. This story is hosted by Google, so it may disappear after a short time. I recall seeing a story by Matt Asay in August 2008 here that the two companies were talking. Well, the deal appears to be done. Open Text is a vendor with a collection of search systems, including Tim Bray’s SGML engine, Information Dimensions’ BASIS, the mainframe-centric BRS search, the Fulcrum system, and some built-in query systems that came with acquisitions. Vignette, on the other hand, is a complex and expensive content platform. The company has some who love it and some, like me, who think that it is a pretty messy bowl of clam linguini. The question is, “What will Open Text do with Vignette?” Autonomy snagged Interwoven, picked up some upsell prospects, and fattened its eDiscovery calf. Open Text has systems that can manage content. Can Open Text manage the money-losing Vignette? Autonomy, in my opinion, is pressuring Open Text. Open Text now has to manage the Vignette system and marshal its forces against the aggressive Autonomy. Joining me on the skepticism skateboard is ZDNet’s Stephen Powers, author of “Can Open Text Turn the Page on Vignette’s Recent History?” here. He wrote:

The other interesting question raised by this announcement: what to do about the Vignette brand? The press release states that Vignette will be run as a wholly-owned subsidiary. But will Open Text continue to invest in what some argue is a damaged brand? Or will they eventually go through a rebranding, as they did with their other ECM acquisitions, and retire the purple logo? Time will tell.

Mr. Powers is gentle. I think the complexity of the Vignette system, its money-losing 2008, and the pushback some of its licensees have directed at the company make this a difficult acquisition to digest. Does Open Text have the management skill and the resources to deal with integration, support, and product stability issues? Will Open Text customers grasp the difference between Open Text’s overlapping product lines?

My hunch is that this deal is viewed by Autonomy as an opportunity, not a barrier.

Stephen Arnold, May 6, 2009

Google vs Alpha: A Phantom Faceoff

May 6, 2009

Technology Review had an opportunity to put Google Public Data (not yet available) and Wolfram Alpha (not yet available) to the test. The expert crafting the queries and reporting the results of the phantom faceoff was David Talbot. You can read his multipart analysis here.

I found the screenshots interesting. The analysis, however, did not answer the questions that I had about the two services; for example:

  • How will these services identify sources with “credibility” scores? I have heard that Google calculates these types of values, and I assume that Wolfram Alpha will want to winnow the goose feathers from the giblets, as I try to do in this human-written Web log.
  • What is the scope of the data sets in these two “demo” or trial systems? I know firsthand how difficult it is to get some data in their current form for on-the-fly analyses. There are often quite a few data sets in the wild. The question is, which ones are normalized and included, and which ones are not? Small data sets can lead to interesting consequences for “the decider”.
  • What is the freshness of the data; that is, how often are the indexes refreshed? Charts can be flashy, but if the information is not fresh, the value can be affected. (A toy scoring sketch covering these three questions appears after this list.)
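For what it is worth, these three questions lend themselves to a crude scoring exercise. The sketch below is entirely my own invention; neither Google nor Wolfram has disclosed how, or whether, it computes anything of the sort. It folds source credibility, normalization status, and refresh age into a single suitability number:

```python
from datetime import date

# Crude composite score for a data set along the three axes raised above.
# The weights and scoring rules are invented for illustration only.

WEIGHTS = {"credibility": 0.5, "normalized": 0.2, "freshness": 0.3}

def freshness(last_refresh, half_life_days=90):
    """1.0 for just-refreshed data, decaying toward 0 as the data ages."""
    age = max((date.today() - last_refresh).days, 0)
    return half_life_days / (half_life_days + age)

def suitability(credibility, is_normalized, last_refresh):
    return (WEIGHTS["credibility"] * credibility
            + WEIGHTS["normalized"] * (1.0 if is_normalized else 0.0)
            + WEIGHTS["freshness"] * freshness(last_refresh))

# A credible but stale series vs. a fresh but un-normalized one.
print(round(suitability(0.9, True, date(2008, 1, 1)), 2))
print(round(suitability(0.6, False, date(2009, 4, 30)), 2))
```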

Technology Review is trying to cover search, and that’s good. Summarizing sample queries is interesting. Answering the questions that matter is much more difficult, even for Technology Review.

Stephen Arnold, May 6, 2009

IBM and Data Mashups

May 6, 2009

Google Public Data and Wolfram Alpha. Dozens of business intelligence vendors like Business Objects and Clarabridge. Content processing systems like Connotate and Northern Light. And now IBM. All of these companies want to grab a piece of the data transformation, analysis, and mashup business. In the pre-crash days, MBAs normalized data, figured out what these MBA brainiacs thought were valid relationships, and created snazzy charts and graphs. In the post-crash era, smart software is supposed to be able to do this MBA-type work without the human MBAs. IBM, already the owner of Web Fountain and other data crunching tools, bought Exeros, a privately held maker of computer programs that help companies analyze data across corporate databases. You can read one take on the story here.
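To make the Exeros pitch concrete: “analyzing data across corporate databases” usually begins with guessing which columns in different tables refer to the same entities. Here is a toy sketch of that flavor of work, assuming nothing about Exeros’s actual methods, which matches columns by value overlap (Jaccard similarity):

```python
# Toy "data relationship discovery": guess which columns in two tables refer
# to the same thing by measuring value overlap. Real products use many more
# signals; the tables and threshold here are invented for illustration.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

crm = {
    "cust_id": ["C001", "C002", "C003", "C004"],
    "region": ["east", "west", "east", "south"],
}
billing = {
    "customer": ["C002", "C003", "C004", "C005"],
    "amount": ["19.99", "5.00", "7.50", "12.00"],
}

# Score every column pairing; high overlap suggests a join key.
for c1, v1 in crm.items():
    for c2, v2 in billing.items():
        score = jaccard(v1, v2)
        if score > 0.3:
            print(f"crm.{c1} <-> billing.{c2}: {score:.2f}")  # cust_id <-> customer: 0.60
```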

If you want more information about Exeros, explore these links:

  • The official news release here
  • The architecture for transformation and other methods here
  • Data validation block diagram here.

How does Exeros differ from what’s available from other vendors? Easy. Exeros has enterprise partners and customers plus some nifty technology.

What I find interesting is that IBM pumps big bucks into its labs, allows engineers to invent data transformation systems and methods, and then has to look outside for a ready-to-sell bundle of products and services. Does this suggest that IBM would get a better return on its money by focusing on acquisitions and scaling back its R&D?

Will this acquisition allow IBM to leapfrog Google? Maybe, but I don’t think so. Google has had some of IBM Almaden’s wizards laboring in the Googleplex along with other “transformation” experts. Google is edging toward this enterprise opportunity with some exciting technology, which I describe in Google: The Digital Gutenberg here. IBM thinks a market opportunity exists, and it is willing to invest to have a chance to increase its share.

Stephen Arnold, May 6, 2009

Microsoft and Search: Interface Makes Search Disappear

May 5, 2009

The Microsoft Enterprise Search Blog here published the second part of an NUI (natural user interface) essay. The article, when I reviewed it on May 4, had three comments. I found one comment as interesting as the main body of the write-up. The author of the remark that caught my attention was Carl Lambrecht of Lexalytics, who commented:

The interface, and method of interaction, in searching for something which can be geographically represented could be quite different from searching for newspaper articles on a particular topic or looking up a phone number. As the user of a NUI, where is the starting point for your search? Should that differ depending on and be relevant to the ultimate object of your search? I think you make a very good point about not reverting to browser methods. That would be the easy way out and seem to defeat the point of having a fresh opportunity to consider a new user experience environment.

The Microsoft enterprise search Web log’s NUI series focuses on the interface. The focus is Microsoft Surface, which allows a user to interact with information by touching and pointing. A keyboard is optional, I assume. The idea is that a person can walk up to a display and obtain information. A map of a shopping center is the example that came to my mind. I want to “see” where a store is, tap the screen, and get additional information.
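That mall-map interaction reduces to hit-testing: map the coordinates of a tap to a store record and display its details. Here is a minimal sketch with made-up data; a real Surface application would, of course, receive contact events from the Surface SDK rather than a plain function call:

```python
# Hit-testing for a touch-screen mall map: a tap lands at (x, y); find which
# store's rectangle contains it and surface the details. All data invented.

STORES = [
    {"name": "Coffee Hut", "rect": (0, 0, 100, 80), "info": "Level 1, opens 7 am"},
    {"name": "Book Nook", "rect": (100, 0, 220, 80), "info": "Level 1, opens 9 am"},
]

def hit_test(x, y):
    """Return the store whose rectangle contains the tap, or None."""
    for store in STORES:
        left, top, right, bottom = store["rect"]
        if left <= x < right and top <= y < bottom:
            return store
    return None

tapped = hit_test(130, 40)  # user taps the map at (130, 40)
if tapped:
    print(f"{tapped['name']}: {tapped['info']}")  # Book Nook: Level 1, opens 9 am
```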

This blog post referenced the Fast Forward 2009 conference and its themes. There’s a reference to EMC’s interest in the technology. The article wraps up with a statement that a different phrase may be needed to describe the NUI (natural user interface), which I mistakenly pronounced like the word ennui.


Microsoft Surface. Image Source: http://psyne.net/blog4/wp-content/uploads/2007/09/microsoftsurface.jpg

Several thoughts:

First, I think that the interface is important, but the interface depends upon the underlying plumbing. A great interface sitting on top of lousy plumbing may not be able to deliver information quickly or, in some cases, present the information the user needs. I see this frequently when ad servers cannot deliver information. The user experience (UX) is degraded. I often give up and navigate elsewhere.


Evvie 2009 Winners: David Evans and Martin Baumgartel

May 4, 2009

Stephen E. Arnold of ArnoldIT.com, http://www.arnoldit.com, announced the Evvie “best paper award” for 2009 at Infonortics’ Boston Search Engine Meeting on April 28.

The 2009 Evvie Award went to Dr. David Evans of Just Systems Evans Research for “E-Discovery: A Signature Challenge for Search.” The paper explains the principal goals and challenges of E-Discovery techniques. The second place award went to Martin Baumgärtel of bioRASI for “Advanced Visualization of Search Results: More Risks or More Chances?”, which addressed the gap between breakthroughs in visualization and actual application of techniques.


Stephen Arnold (left) is pictured with Dr. David Evans, Just Systems Evans Research, on the right.

The Evvie is given in honor of Ev Brenner, one of the leaders in online information systems and functions. The award was established after Brenner’s death in 2006. Brenner had served on the program committee for the Boston Search Engine Meeting since its inception almost 20 years ago. Everett Brenner is generally regarded as one of the “fathers” of commercial online databases. He worked for the American Petroleum Institute and served as a mentor to many of the innovators who built commercial online.


Martin Baumgartel (left) and Dr. David Evans discuss their recognition at the 2009 Boston Search Engine Meeting.

Mr. Brenner had two characteristics that made his participation a signature feature of each year’s program: He was willing to tell a speaker or paper author to “add more content,” and after a presentation, he would ask a presenter one or more penetrating questions that helped make a complex subject more clear.

The Boston Search Engine Meeting, held each year in Boston, attracts search professionals, search vendors, and experts interested in content processing, text analysis, and search and retrieval. Ev, as he was known to his friends, demanded excellence in presentations about information processing.

Sponsored by Stephen E. Arnold (ArnoldIT.com), this award goes to the speaker who best exemplifies Ev’s standards of excellence. The selection committee consists of the program committee, assisted by Harry Collier (conference operator) and Stephen E. Arnold.

This year’s judges were Jill O’Neill, NFAIS, Sue Feldman, IDC Content Technologies Group, and Anne Girard, Infonortics Ltd.

Mr. Arnold said, “This award is one way for us to respect his contributions and support his lifelong commitment to excellence.”

The recipients receive a cash prize and an engraved plaque. Information about the conference is available on the Infonortics, Ltd. Web site at www.infonortics.com and here. More information about the award is here. Information about ArnoldIT.com is here.

The Beeb and Alpha

April 30, 2009

I am delighted that the BBC, the once non-commercial entity, has a new horse to ride. I must admit that when I think of the UK and a horse to ride, my mind echoes with the sound of Ms. Sperling saying, “Into the valley of death rode the 600”. The story (article) here carries a title worthy of the Google-phobic Guardian newspaper: “Web Tool As Important as Google.” The subject is the Wolfram Alpha information system, which is “the brainchild of British-born physicist Stephen Wolfram”.

Wolfram Alpha is a new content processing and information system that uses a “computational knowledge engine”. There are quite a few new search and information processing systems. In fact, I mentioned two of these in recent Web log posts: NetBase here and Veratect here.


Can Wolfram Alpha or another search start-up Taser the Google?

My reading of the BBC story turned up a hint that Wolfram Alpha may have a bit of “fluff” sticking to its ones and zeros. Nevertheless, I sensed a bit of glee that Google is likely to face a challenge from a math-centric system.

Now let’s step back:

First, I have no doubt that the Wolfram Alpha system will deliver useful results. Not only does Dr. Wolfram have impeccable credentials, he is letting math do the heavy lifting. The problem with most NLP and semantic systems is that humans are usually needed to figure out certain things regarding “meaning” of and in information. Like Google, Dr. Wolfram lets the software machines grind away.

Second, in order to pull off an upset of Google, Wolfram Alpha will need some ramp-up momentum. Think of the search system as a big airplane. The commercial version of the big airplane has to be built, made reliable, and then supported. Once that’s done, the beast has to taxi down a big runway, build up speed, and then get aloft. Once aloft, the airplane must operate and then get back to the ground for fuel, upgrades, etc. The Wolfram Alpha system is in its early stages.

Third, Google poses a practical problem to Wolfram Alpha and to Microsoft, Yahoo, and the others in the public search space. Google keeps doing new things. In fact, Google doesn’t have to do big things. Incremental changes are fine. Cumulatively these increase Google’s lead or its “magnetism”, if you will. So competitors are going to have to find a way to leapfrog Google. I don’t think any of the present systems have the legs for this jump, including Wolfram Alpha, because it is not yet a commercial grade offering. When it is, I will reassess my present view. What competitors are doing is repositioning themselves away from Google. Instead of getting sand kicked in one’s face on the beach, the competitors are swimming in the pool at the country club. Specialization makes it easier to avoid Googzilla’s hot breath.

To wrap up, I hope Wolfram Alpha goes commercial quickly. I want to have access to its functions and features. Before that happens, I think that the Beeb and other publishing outfits will be rooting for the next big thing in the hopes that one of these wizards can Taser the Google. For now, the Tasers are running on a partial charge. The GOOG does not feel them.

Stephen Arnold, May 1, 2009
