Why Dataspaces Matter

August 30, 2008

My posts have been whipping super-wizards into action. I don’t want to disappoint anyone over the long American “end of summer” holiday. Let’s consider a problem in information retrieval and then answer in a very brief way why dataspaces matter. No, this is not a typographical error.

Set Up

A dataspace is somewhat different from a database. Databases can sit within a dataspace, but a dataspace can also encompass other information objects, garden variety metadata, and new types of metadata that I like to call meta metadata, among others. These are represented in an index. For our purpose, we don’t have to worry about the type of index. We’re going to look up something in any of the indexes that represent our dataspace. You can learn more about dataspaces in the IDC report #213562, published on August 28, 2008. It’s a for fee write up, and I don’t have a copy. I just contribute; I don’t own these analyses published by blue chip firms.

Now let’s consider an interesting problem. We want to index people, figure out what those people know about, and then generate results to a query such as “Who’s an expert on Google?” If you run this query on Google, you get a list of hits like this.

[Image: Google results for the expert query]

This is not what I want. I require a list of people who are experts on Google. Does Live.com deliver this type of output? Here’s the same query on the Microsoft system:

[Image: Live.com results for the same query]

Same problem.

Now let’s try the query on Cluuz.com, a system that I have written about a couple of times. Run the query “Jayant Madhavan” and I get this:

[Image: Cluuz.com results for “Jayant Madhavan”]

I don’t have an expert result list, but I have a wizard and direct links to people Dr. Madhavan knows. I can make the assumption that some of these people will be experts.

If I work in a company, the firm may have the Tacit system. This commercial vendor makes it possible to search for a person with expertise. I can get some of this functionality in the baked in search system provided with SharePoint. The Microsoft method relies on the number of documents a person known to the system writes on a topic, but that’s better than nothing. I could if I were working in a certain US government agency use the MITRE system that delivers a list of experts. The MITRE system is not one whose screen shots I can show, but if you have a friend in a certain government agency, maybe you can take a peek.
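The document counting approach attributed to SharePoint above can be sketched in a few lines. This is a minimal illustration, not Microsoft's implementation: the sample data, the `rank_experts` function, and the idea of one topic label per document are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample data: one (author, topic) pair per document.
documents = [
    ("alice", "google"),
    ("bob", "google"),
    ("alice", "google"),
    ("carol", "sharepoint"),
    ("bob", "sharepoint"),
    ("alice", "google"),
]


def rank_experts(docs, topic):
    """Return (author, document_count) pairs for a topic, most prolific first."""
    counts = Counter(author for author, doc_topic in docs if doc_topic == topic)
    # Sort by descending count, then by name so ties come out in a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


google_experts = rank_experts(documents, "google")
```

Counting documents rewards volume, not quality, which is why this kind of ranking is only "better than nothing."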

None of these systems really do what I want.

Enter Dataspaces

The idea for a dataspace is to process the available information. Some folks call this transformation, and it really helps to have systems and methods to transform, normalize, parse, tag, and crunch the source information. It also helps to monitor the message traffic for some of that meta metadata goodness. An example of meta metadata is an email. I want to index who received the email, who forwarded the email to whom and when, and any cutting or copying of the information in the email to which documents and the people who have access to said information. You get the idea. Meta metadata is where the rubber meets the road in determining what’s important regarding information in a dataspace.
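The email example can be made concrete with a small sketch. It assumes a toy in-memory index; the class names, method names, and sample identifiers are hypothetical and are not drawn from Google's dataspace work or the IDC report.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EmailEvents:
    """Meta metadata recorded for one email (field names are illustrative)."""
    received_by: set = field(default_factory=set)
    forwards: list = field(default_factory=list)   # (sender, recipient, when)
    copied_into: set = field(default_factory=set)  # documents holding pasted text


class MetaMetadataIndex:
    """Toy index of who touched an email and where its content travelled."""

    def __init__(self):
        self.emails = defaultdict(EmailEvents)
        self.document_readers = defaultdict(set)

    def record_receipt(self, email_id, person):
        self.emails[email_id].received_by.add(person)

    def record_forward(self, email_id, sender, recipient, when):
        events = self.emails[email_id]
        events.forwards.append((sender, recipient, when))
        events.received_by.add(recipient)  # a forward is also a receipt

    def record_copy(self, email_id, document):
        self.emails[email_id].copied_into.add(document)

    def grant_access(self, document, person):
        self.document_readers[document].add(person)

    def people_exposed(self, email_id):
        """Everyone who received the email or can read a document quoting it."""
        events = self.emails.get(email_id)
        if events is None:
            return set()
        exposed = set(events.received_by)
        for document in events.copied_into:
            exposed |= self.document_readers.get(document, set())
        return exposed


index = MetaMetadataIndex()
index.record_receipt("msg-1", "ann")
index.record_forward("msg-1", "ann", "ben", "2008-08-30T09:00")
index.record_copy("msg-1", "plan.doc")
index.grant_access("plan.doc", "cho")
exposed = index.people_exposed("msg-1")
```

Here `exposed` contains ann, ben, and cho: the original recipient, the person the email was forwarded to, and the reader of the document that quotes it. That reach of a single message is the kind of signal a plain document index never sees.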

The Enterprise Search Thrill Ride

August 29, 2008

Summer’s ending, and the search engine thrill ride is accelerating. Before you fire up your personal computer and send me an email asking for juicy details, appreciate that I can only comment in a broad way, making observations at a high level. If you have an appetite for more information, you will have to dip into your piggy bank and engage me to show up and discuss the state of the industry in a setting less chatty than this Web log.

Every amusement park has a thrill ride. Kids love these roller coasters, bungee jumps, and spinning barrels. Adults or people with an aversion to fear are generally content to watch. Once in a great while, a thrill ride goes wrong. The thrill seekers can be injured and once in a while killed.

Search and content processing companies are, in a sense, a thrill ride. The launch of a company is filled with anticipation. Then the company chugs along and usually gets a sale, and the process repeats itself. At the end of the ride, the company speeds along and in most cases the ride ends with the employees displaying big smiles. When a ride goes wrong, the employees aren’t so chipper, but the lawyers often show sly grins.

[Image: blurred roller coaster]

I am quite confident that the September to December 15, 2008, period will be quite exciting for me. First, the search and content processing sector of the enterprise software market is poised for change. Second, a number of companies will have to make their numbers or face the prospect of enduring the lash of venture capitalists’ whips, changing careers, or closing up for good. Third, the GOOG is beginning to move slowly forward in the enterprise sector. Even if Google’s management insists “We’re just running a beta test”, those “beta tests” will be disruptive for established search and content processing vendors. Fourth, newcomers to the North American market will make their presence felt to a greater degree than in the first six months of 2008. Newcomers often become irritants with their promise of better, faster, or cheaper. Of course, the customer may pick two of these claims, but incumbents have to waste time and money deflecting these competitive challenges. Finally, superplatforms (big enterprise software vendors) have to protect their turf. I expect significant pressure from these firms to add another variable to the search and content sector. After all, what can a company do when Microsoft bundles an incrementally improving search and retrieval system with a widely used server product like SharePoint?

Growth of Electronic Information

August 29, 2008

Larry Borsato, writing for the Industry Standard, presents some interesting information about the growth of electronic information. You can read his article “Information Overload on the Web, and Searching for the Right Sifting Tool” here. The most startling item was this statement:

IBM predicts that in the next couple of years, information will double every 11 hours [PDF].

The article runs down the problems encountered when looking for information using various search services. He’s right. Search is a problem. But that doubling of information every 11 hours underscores the opportunity that exists for a person or company with an information access solution.

Stephen Arnold, August 29, 2008

Google: Dashboard or Buzz Word

August 29, 2008

ZDNet’s “Google Apps Dashboard: Serious about the Enterprise?” does a good job of explaining that Google continues to push into the corporate market. The article, written by Michael Krigsman, summarizes a software component that allows Google Apps Premier licensees a way to check on the status of the services. For me, the most interesting point Mr. Krigsman made was:

Although Google may offer this service level to large accounts such as Cap Gemini, I doubt smaller customers will receive any personalized attention whatsoever. After all, Google isn’t known for providing stellar customer service; actually, the company’s customer care record sucks widgets. Only time will tell whether Google can successfully transition from its mass market consumer mentality to becoming a trusted, service oriented enterprise vendor.

I too have heard that Google does not return telephone calls, misses meetings, and ignores teleconference start times. But I have also heard that Google commissioned an expert to analyze the weaknesses of its sales approach and listened as the consultant explained that Google had to change its ways.

Google is a decade old, and it must give up some of its math club ethos, not just create software and spout buzz words. Will the company make the shift? I think we must wait and see.

Stephen Arnold, August 29, 2008

Dataspaces Analysis Available

August 29, 2008

IDC, the research giant near Boston, has issued for its paying customers “Google: A Push Beyond Databases”. The write up is part of the firm’s Technology Assessment series. Sue Feldman, the IDC search and content processing lead analyst and industry expert, is the lead author. I provided some background rowing. The result is a useful first look at a Google initiative that’s been rolling along since 2006. The 12-page document provides a brief definition of dataspaces, a timeline of key events, and several peeks into the technology and applications of this important technology. Ms. Feldman and I teamed to outline some of the implications that we identified. If you want a copy of this document, you will have to contact IDC for document #213562. If your company has an IDC account, you can obtain the document directly. If you wish to purchase a copy of this report, navigate to http://www.idc.com/ and click on the “Contact” link. As with my BearStearns’ Google analyses, I am not able to release these documents. I’m sure others know about dataspaces, but I find the topic somewhat fresh and quite suggestive.

This report is particularly significant in light of Google’s making its “golden oldie” technology MapReduce available to Aster Data and Greenplum. You can read about this here. Last year, I spoke with representatives of IBM and Oracle. I asked about their perceptions of Google in the database and data management business. Representatives of both companies assured me that Google was not interested in this business. Earlier this year, via a government client I learned that IBM’s senior managers see Google as a company that is fully understood by the top brass of the White Plains giant. My thought is that it must be wonderful to know so much about Google, its deal for MapReduce, and now the dataspace technology before anyone else learns of these innovations. The dataspace write up, therefore, will be of interest to those who lack the knowledge and insight of IBM and Oracle wizards.

Stephen Arnold, August 29, 2008

SurfRay: Has the Company Missed the Search Wave? Nope

August 29, 2008

Update: October 26, 2008

I have summarized several of the themes from my write ups and from the posts to the SurfRay thread. You can find this article at http://arnoldit.com/wordpress/2008/10/24/surfray-round-up/ or click here.

Update: August 29, 2008

SurfRay is alive and well. A phone glitch plus the unfortunate unanswered emails from me gave me the impression that this company was realigning. You can contact the company at this phone number, which is now working: +45 70 250 250. I’m tracking the company because I have been a long time fan of the Mondosoft SharePoint solution. (Why Microsoft ignored this system still baffles me.) And, as you may know, I have been a strong advocate of the Speed of Mind technology to cure the DB2 and Oracle performance headaches. So, don’t wait. Snag the solution the Vatican uses for its multilingual Web site here: www.surfray.com.

Original post querying for information:

In May 2007, I made a comment about the change in ownership of Mondosoft, the Danish search engine company. I speculated that management changes in 2006 and the company’s merging with SurfRay, a Copenhagen based technology and services company, left the future of Mondosoft in doubt. After my panel, an earnest Dane told me that SurfRay and Mondosoft were in business and servicing their 1,500 customers. One of these customers is the Vatican, and I concluded that the Swiss Guards would take strong action against me if I suggested that the Church’s search engine was an orphan.

A colleague in Europe alerted me on Tuesday, August 26, 2008, that the SurfRay telephone number is no longer being answered. I asked a colleague who speaks Danish to verify the number and talk to a person at SurfRay. No luck. My hope is that this is a telephone glitch, and not a more serious issue with the company.

The Mondosoft search system was one of the first “snap in” replacements for native SharePoint search-and-retrieval services. The company also was among the first search vendors to include Web site analytics as part of the company’s search system. I used screen shots of the reports that showed which pages attracted users and which triggered abandonment of a site. Mondosoft also acquired Ontolica, a specialist in taxonomy and content processing, to add additional indexing to SharePoint content.

SurfRay snapped up a company called Speed of Mind. In the first edition of the Enterprise Search Report, which I wrote for CMSWatch.com, I profiled this company. Speed of Mind’s technology used an innovation to accelerate access to information and data in tables generated by Oracle, IBM DB2 and Informix, and MySQL. I met the founders of Speed of Mind and was impressed with their unique approach to cracking the problem of making searchable the most recent updates to a database table in near real time.

SurfRay had other technology, but I focused on the search, content processing, and database access systems. The SurfRay Web site is still online at www.surfray.com. If anyone has additional information about the company, please, let me know. If the firm shutters its doors, a number of major accounts will be in the market for a replacement search engine. The changes that Microsoft continues to make in its SharePoint system make it tough to “freeze” a search system while the SharePoint environment is changing.

Stephen Arnold, August 29, 2008

Vivisimo Sells 38 Licenses in Six Months

August 28, 2008

Autonomy may have to check its rear view mirror to see if Vivisimo’s Velocity is catching up with the Cambridge, UK, search firm. Marketwatch here on August 27, 2008, reported:

 Demonstrating continued sales growth, Vivisimo sold a record 38 software licenses for its Velocity Search Platform(TM) during the first half of 2008, up 27 percent over the same period a year ago. In addition, deployments of Velocity through OEM relationships continued to accelerate in the first half of 2008, growing to more than 900 organizations using Vivisimo’s search platform. Vivisimo also signed seven new reselling and consulting partnerships. The growth continued Vivisimo’s expansion into international markets, with partnerships added in northern and southern Europe and Latin America. Among the new partnerships is Kisiwa Technologies SRL in Italy, which produced a number of Vivisimo search deployments within the Italian government. 

A number of search vendors seem to have found rough financial waters in the first half of 2008. For example, one mid sized European vendor may have folded its tent and retired from the field of battle. Another outfit has repositioned its search technology as a utility service. One vendor has been generating a flow of good news press releases. Autonomy reported strong quarterly earnings and now Vivisimo (a privately held firm) seems to have cracked the code on closing six deals a month, a remarkable number.

What’s Vivisimo’s secret? Marketwatch quotes Vivisimo saying:

We help our customers maximize the business value of their information by using sophisticated search and discovery to drive collaboration and innovation throughout their organizations.

I hope the Vivisimo powered USA.gov removes its limit on the number of images accessible from this service. I can’t get at Library of Congress images. The “business value” of this important service would be enhanced without this arbitrary limit.

Do you think there’s a secret ingredient at Vivisimo that the other vendors lack? Share your thoughts.

Stephen Arnold, August 28, 2008

Internet Explorer: A Sneak Attack on Google

August 28, 2008

I admire the wordsmithing at Forbes Magazine. The story “Microsoft’s Sneak Attack on Google” by Victoria Barrett takes a swing at Microsoft and misses, then throws a punch at Google and doesn’t come close. The core of this story is the addition of a search box to Internet Explorer 8 and icons that send the query to one of Microsoft’s best friends; for example, Amazon. Ms. Barrett points out that the crafty Microsofties display a Microsoft map when an IE 8 user highlights an address. Google, I learn, doesn’t have a good answer for Microsoft’s dominance of the browser market. To keep the interesting writing exercise balanced, Ms. Barrett reminds me that Microsoft’s buying traffic does not work too well and that the gap between Google and Microsoft remains wide.

What caught my attention is the characterization of Google as a foe which can be challenged only by a sneak attack. Furthermore, Microsoft comes across looking a bit like a mugger waiting to catch a victim unawares. Google doesn’t fare much better because the company, as I read the story, is indifferent to small ad markets, presumably because it is too preoccupied with loftier sales ambitions.

I find this an outstanding example of technical analysis. Just what the doctor ordered for managers who need a search box in a browser explained as the equivalent of a digital ninja stalking an indifferent Googzilla.

Stephen Arnold, August 28, 2008

Publish Magazine Raises Doubts about Google

August 28, 2008

I enjoy traditional publishers’ analyses of Google. In the last few months, criticizing Google has become a cottage industry. Navigate here to read Publish Magazine’s “Can Google Search Sustain All its Other Ventures?” by Clint Boulton. For me the most interesting point in the article was this statement:

Some industry observers wonder whether Google’s reliance on search to buoy its YouTube property and other investments, including Google Apps in the SAAS messaging and collaboration sector, is sustainable in the long run.

What’s interesting about this headline and quote is that there is nothing more to the article. Maybe Publish had a bad hair day. My hunch is that this type of story harms Google. Traditional publishers seem to be like deer in headlights. Could this negative headline and one sentence story constitute a cheap shot? Would a beleaguered traditional publishing firm take this action to generate traffic for a Web page? Let me know your thoughts. 

Stephen Arnold, August 28, 2008

Google: Another Legal Hassle

August 27, 2008

A number of sources, including ZDNet UK, are reporting that Google has been named in a legal action by Klausner Technologies. You can get more information from Reuters here. The issue concerns automatic notification of voice mail. Google declined to comment. For me, this type of legal action just adds one more task to Google’s lawyers’ to do list. I wonder if the wild and crazy world of software and business process patents was created for legal eagles to stimulate the sale of condos in Costa Rica.

Stephen Arnold, August 27, 2008
