Analysis of Google Wave

June 8, 2009

Radovan Semančík’s Weblog published “Storm Alert”, a quite interesting discussion of Google Wave. You should read his write up here. Among the points I noted were:

  1. Lack of security
  2. Inconsistent nomenclature
  3. Assumptions about performance when there are large numbers of users

For me, the most telling comment in this article was:

Google Wave architecture does not adhere to architectural best practice. It is not minimal. The robots are described to communicate with Wave by HTTP/JSONRPC (robot is server), Client apparently communicates by HTTP (as AJAX application?) , while the wave federation protocol is described as XMPP-based. Why do we need so many protocols? Is there any reason why robot protocol and client-server protocol needs to be different? The non-minimalistic approach can be seen in the OT operations as well. The antidocumentelementstart and endantidocumentelementstart operations seems redundant to me. If they are not redundant, their existence should be explained in the architectural documents.

Highly recommended. In my opinion, Wave is a variant of the dataspace technologies. As with the first Searchology, I think Google built a demo and rushed it out the door in order to blunt the PR impact of Microsoft Bing.com’s roll out.

Stephen Arnold, June 8, 2009

MarkLogic: The Shift Beyond Search

June 5, 2009

Editor’s note: I gave a talk at a recent user group meeting. My actual remarks were extemporaneous, but I did prepare a narrative from which I derived my speech. I am reproducing my notes so I don’t lose track of the examples. I did not mention specific company names. The Successful Enterprise Search Management (SESM) reference is to the new study Martin White and I wrote for Galatea, a publishing company in the UK. MarkLogic paid me to show up and deliver a talk, and the addled goose wishes other companies would turn to Harrod’s Creek for similar enlightenment. MarkLogic is an interesting company because it goes “beyond search”. The firm addresses the thorny problem of information architecture. Once that issue is confronted, search, reports, repurposing, and other information transformations become much more useful to users. If you have comments or corrections to my opinions, use the comments feature for this Web log. The talk was given in early May 2009, and the Tyra Banks example is now a bit stale. Keep in mind this is my working draft, not my final talk.

Introduction

Thank you for inviting me to be at this conference. My topic is “Multi-Dimensional Content: Enabling Opportunities and Revenue.” A shorter title would be repurposing content to save and make money from information. That’s an important topic today. I want to make a reference to real time information, present two brief cases I researched, offer some observations, and then take questions.

Let me begin with a summary of an event that took place in Manhattan less than a month ago.

Real Time Information

America’s Next Top Model wanted to add some zest to its popular television reality program. The idea was to hold an audition for short models, not the lanky male and female prototypes with whom we are familiar.

The short models gathered in front of a hotel on Central Park South. In a matter of minutes, the crowd began to grow. A police cruiser stopped, and the two officers found themselves watching a full fledged mêlée in progress, complete with swinging shoulder bags, spike heels, and hair spray. Every combatant was five feet six inches tall or shorter.

The officers called for the SWAT team but the police were caught by surprise.

I learned in the course of the nine months’ research for the new study written by Martin White (a UK-based information governance expert) and me that a number of police and intelligence groups have embraced one of MarkLogic’s systems to prevent this type of surprise.

Real-time information flows from Twitter, Facebook, and other services are, at their core, publishing methods. The messages may be brief, no more than 140 characters or about 12 to 14 words, but they pack a wallop.


MarkLogic’s slicing and dicing capabilities open new revenue opportunities.

Here’s a screenshot of the product about which we heard quite positive comments. This is MarkMail, and it makes it possible to take content from real-time systems such as mail and messaging, process that content, and use the resulting information to create opportunities.

Intelligence professionals use the slicing and dicing capabilities to generate intelligence that can save lives and reduce to some extent the type of reactive situation in which the NYPD found itself with the short models disturbance.

Financial services and consulting firms can use MarkMail to produce high value knowledge products for their clients. Publishing companies may have similar opportunities to produce high grade materials from high volume, low quality source material.


Microsoft Fast ESP Revealed

June 3, 2009

Need an enterprise search system? Have four months? Microsoft Fast Enterprise 360 is for you. You can read a case study of a lightning fast (no pun intended) search implementation in a new Microsoft Fast white paper called “Enterprise Search 360: Achieving a Single Search Platform across the Enterprise”. I had a bit of trouble locating the document, but I am an addled goose, and you, if my Web log usage analysis system is working, are a 40 something, proud, confident, and an expert in search. If you have a user name and password for ZDAsia, you can download it here. If you get a 404, shave the url, register, and search for “enterprise 360”. If you are persistent, you can snag this one page write up. Here are some keepers from this remarkable “white paper”:

  1. “National Instruments was quickly attracted to the FAST platform’s versatility, flexibility, and capability to expand. Inside four months the FAST Enterprise Search Platform solution was fully implemented by National Instruments’ team without the assistance and added cost of professional services personnel for the installation.” My thought, “Wow.”
  2. “FAST’s impact was apparent from the get-go.” My thought, “Wow again.”
  3. “When customers do seek out customer support, the FAST-supported online support request functionality gives application engineers vital information about the customers and their needs before the support conversation even begins.” My thought, “ESP. Extra sensory perception. I knew the meaning of the acronym and thought it meant enterprise search platform.”

I downloaded this document. Much food for thought and analysis.

Stephen Arnold, June 3, 2009

Digital Reef Makes Microsoft Fast Work

May 25, 2009

I puzzled over “Digital Reef Partners with FAST, Helps Manage SharePoint Content” in CMSWire here. The article covers a number of content functions that I try to keep separate; for example, unstructured data, “out of the box support for eDiscovery, compliance, Office SharePoint Server management, data security, and storage initiatives”, and analytic tools. Oh, I almost omitted manipulation of structured data. Who provides this happy family of services? Digital Reef. You can read more about this company here. The company asserts that it handles these different functions. My view is that the company knows how to tame SharePoint and implement Fast Search’s ESP “out of the box”. In my experience, prior to the acquisition of Fast Search & Transfer, implementing Fast ESP as an “out of the box” solution was time consuming, difficult, and required a Fast engineer with email and phone access to senior Fast Search wizards in Boston and Oslo. Dark days may be ahead for third-party vendors of alternatives to Microsoft SharePoint services.

Stephen Arnold, May 25, 2009

SirsiDynix Search Plus Discovery for Libraries

May 24, 2009

Brainware landed a deal to provide search and discovery to SirsiDynix. After a bit of poking around, I learned that SirsiDynix wanted to move beyond keyword search and provide users of its library systems with discovery functions. “Discovery”, as used in this sense, refers to giving a person looking for information easy-to-use methods to find related information and suggested information also germane to the user’s query. Endeca hooked up with Ebsco to provide “guided navigation” to Ebsco customers. Most online public access catalogs and library-centric search systems match the users’ query terms or force the user to search by entering an author’s name. Change, at long last, seems to be coming to the library for search of an institution’s textual information. I wrote about some of the Brainware system’s capabilities in my 2008 study “Beyond Search” for the Gilbane Group here. I also did a short write up about Brainware in this Web log in early 2008 here.

A reader alerted me to an announcement here that SirsiDynix will roll out an enhanced enterprise search and discovery system to over 30 libraries. You can read that announcement here. The system includes such features as:

  • Trigram analysis, or “fuzzy logic” which evaluates each trigram in a word to allow for typos, diacritics and more: a first in the library search and discovery market
  • “Did you mean” suggestions which are based on terms in the catalog (rather than a generic third-party dictionary)
  • Dynamic search suggestions
  • Delivery of saved searches through an RSS web feed
  • Email and print options for search results
  • Built-in “Library Favorites”
  • The capability for libraries to define their own “Favorites”, profiles, languages and filters.
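The trigram approach in the first bullet can be sketched in a few lines. This is a generic illustration of trigram matching, not Brainware’s actual algorithm; the boundary padding and the Jaccard scoring are my assumptions.

```python
# Generic sketch of trigram-based "fuzzy" matching. A word is broken
# into overlapping three-character sequences; two words are similar if
# their trigram sets overlap heavily, which tolerates typos and
# spelling variants without an exact match.

def trigrams(word: str) -> set[str]:
    """Return the set of overlapping three-character slices of a word."""
    padded = f"  {word.lower()} "  # padding weights the word boundaries
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two words' trigram sets (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

print(similarity("catalog", "catalogue"))  # high overlap despite spelling
print(similarity("catalog", "zebra"))      # near zero for unrelated words
```

A real system would also normalize diacritics before computing trigrams, which is how accented and unaccented spellings can be made to match.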

You can test the Brainware-powered “Enterprise” service at the Wells County Public Library here.

The library market has been under severe price competition. This information sector is coming under more and more pressure from Google. The world’s largest search provider has been slowly expanding its services, including the controversial Google Books program. So far, specialized vendors of library information systems have been able to maintain their grip on today’s slippery economic one-lane highway. The impact of Google on this market will be interesting to observe.

Stephen Arnold, May 24, 2009

Autonomy Expands into Marketing

May 17, 2009

Attensity has been moving into marketing and marketing-related search applications. Autonomy, which has offered tools that provide insights into market behavior, announced at the eMetrics Marketing Optimization Summit a deal that indicates Autonomy is serious about this application of its search technology. Autonomy announced that Optimost Adaptive Targeting is now powered by Autonomy’s Meaning Based Marketing engine. Autonomy is showing agility in its leveraging of its Interwoven acquisition. The company said here:

Optimost Adaptive Targeting mines all major types of customer attributes to create customer segments, including context (how the visitor arrives at the Website, e.g. search keyword), geography, time of day, and demographic, behavior, and account profile information. Once customer segments are created, multivariate tests are conducted on an unlimited number of copy ideas, offers, and layouts to determine the best solution for each audience segment. By adding the Meaning Based Computing capabilities of IDOL to Adaptive Targeting, marketers gain a unique ability to understand and effectively serve their customers. By leveraging IDOL, Optimost Adaptive Targeting now includes unique keyword clustering capabilities that automatically identifies concepts and patterns as they are emerging on the web. For instance, an online pet store might discover that an unusually high number of “long-tail” searches relate to vacationing with pets. The solution could then automatically serve up more content and special offers around travel tote bags and kits.
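The per-segment multivariate testing the quotation describes boils down to tallying which variant converts best for each customer segment. Here is a minimal sketch of that selection step; the segments, variants, and numbers are entirely made up for illustration, and Autonomy’s actual engine is not public.

```python
# Hypothetical sketch of per-segment variant selection in a
# multivariate test. Each (segment, variant) pair maps to
# (impressions, conversions); we pick the highest-converting
# variant for each segment. All data below is invented.

from collections import defaultdict

results = {
    ("search:pet travel", "tote-bag offer"): (1000, 80),
    ("search:pet travel", "generic offer"):  (1000, 35),
    ("direct visit",      "tote-bag offer"): (1000, 20),
    ("direct visit",      "generic offer"):  (1000, 45),
}

def best_variant_per_segment(results):
    """Return the highest-converting variant for each segment."""
    by_segment = defaultdict(list)
    for (segment, variant), (imps, convs) in results.items():
        by_segment[segment].append((convs / imps, variant))
    return {seg: max(rates)[1] for seg, rates in by_segment.items()}

print(best_variant_per_segment(results))
# {'search:pet travel': 'tote-bag offer', 'direct visit': 'generic offer'}
```

The point of the example: the visitors who arrived via a pet-travel search get the travel offer, while direct visitors get a different winner, which is exactly the segment-by-segment behavior the press release claims.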

The addled goose predicts that other vendors of search and content processing technology will increase their efforts to blend search and content processing with online and traditional marketing functions. Google is active in online marketing, and could increase its presence in this sector quickly and without warning. A stampede may be forming on the search prairie.

Stephen Arnold, May 17, 2009

Wolfram Alpha and Niche Ville

May 17, 2009

I have been grabbing quick looks at the write ups about the Wolfram Alpha search system. One of the more interesting essays is by Larry Dignan. You can read “Wolfram Alpha Launches: Can It Break Out of Niche Ville?” here. I liked the idea of a “Niche Ville”. The phrase connoted a small town which may be interesting, but it is not likely to suck MBAs into its maw the way Manhattan does. Aside from the suggestion that Wolfram Alpha was a “hamburger and fries” type of search and content processing system, I found this comment quite suggestive:

Overall, Wolfram/Alpha reads like an encyclopedia. It’s handy at times, but the big question is whether the search engine can break out of niche-ville. Sure, geeks like the presentation and it Wolfram/Alpha can be handy for deep dives, but the average person will want some sort of results every time. In that regard, Wolfram/Alpha may be a disappointment.

My thought was, “Wolfram Alpha is like other question answering systems. It’s a bit like an advanced search function because the user has to do more thinking than typing ‘pizza’ into Google Maps. As a result, a small percentage of Web users may have the mental energy to tackle these systems.” Niche is a useful term.

Stephen Arnold, May 17, 2009

Text Analytics Data from Hurwitz and Associates

May 13, 2009

IT Analysis published Dr. Fern Halper’s “2009 Text Analytics Survey” here. The core of the essay was data from a longer Hurwitz and Associates study, which I have not seen. Based on the data in the article, you may want to get the full study. Two items jumped out at me.

First, customer and competitive intelligence were text analytics drivers. The also-ran? Compliance. Second, and more surprising, the common implementation model was software as a service.

Interesting data.

Stephen Arnold, May 13, 2009

Some Google in the White House

May 13, 2009

A month ago, I received a call from a journalist asking about the Obama White House’s uses of Google. I did not answer the question because big time journalists ask me questions, and I am not a public library reference desk worker any more.

One insight can be found here. Google said:

App Engine supports White House town hall meeting
In late March, the White House hosted an online town hall meeting, soliciting questions from concerned citizens directly through its website. To manage the large stream of questions and votes, the White House used Google Moderator, which runs on App Engine. At its peak, the application received 700 hits per second, and across the 48-hour voting window, accepted over 104,000 questions and 3,600,000 votes. Despite this traffic, App Engine continued to scale and none of the other 50,000 hosted applications were impacted. For more on this project, including a graph of the traffic and more details on how App Engine was able to cope with the load, see the Google Code blog.

How Googley is the Obama White House? Pretty Googley I hear.

Stephen Arnold, May 13, 2009


Google Time

May 13, 2009

Searchology strikes me as a forum for Google to remind journalists, the faithful, unbelievers, and competitors that the GOOG is the big dog in search. You can read dozens of reports about Google’s search enhancements. A good round up was “Google Unveils New Search Features” here. Don’t like AFP? Run this query on Google News and pick a more useful summary. For me, the key announcements had to do with time. The date of a document and the time of an event are important but different concepts. Time is a difficult problem, and Google’s announcements underscore the firm’s time expertise. Timelines? No problem. Date sort? No problem. For me what’s important is that time prowess is the tiny tip of much deeper underlying technical capabilities. The Google has some muscles it is just starting to flex.

Stephen Arnold, May 13, 2009
