Google Mini Signals Maxi Change
May 16, 2008
In San Francisco earlier this week, I spent some time with one of my tech pals. In the course of the conversation, we talked about the lousy margins on hardware, even the flashiest gear from HP, IBM, and Sun. He said, “Too much cost, not enough fast cash.” He also told me that Google was going to trim its line of Google Search Appliances.
Yesterday, TechCrunch–an information life support device for my aging self in rural Kentucky–said much the same thing. Mark Henderson’s “Rumor: Google to Launch Hosted Site Search, Ditch Mini” appeared on May 15, 2008. One point that jumped out at me was:
It’s not exactly clear what this decision means for the enterprise search industry, but it won’t be surprising if Google does indeed come out with a cloud-based solution.
The comments from my friend and Mr. Henderson’s blockbuster mesh with what I have learned from people using Google’s custom search. Custom search is a no-charge way to get Google search for your Web site. We use it as one search option for ArnoldIT.com’s Web log, “Beyond Search”. We’ve tested the function and found that it works with near-zero latency and spiders tirelessly, often picking up changes to test custom search pages in less than 15 minutes.
Why do we think this “mini” change signals a “maxi” shift? There are three reasons:
Google isn’t in the hardware business. Google’s wizards love hardware, and the company has patent applications that are stuffed with fans, racks, and other gizmos. Hardware equals support, and if there’s one thing less exciting to a Googler than attending a lecture on ancient Greek pottery, it’s dealing with a flesh-and-blood customer.
Google moves in surprisingly small, incremental steps for a giant company. Any shift to a cloud-based service is no casual decision. Folks, we have a signal.
The Google Search Appliance is a beast of burden. To get the most out of the OneBox API and deliver the functionality that customers are discovering is possible, more robust devices are needed. The “blue” Mini was a black sheep compared to the “yellow” GB (Google Box) siblings.
Companies that dismiss Google’s enterprise ambitions are certainly free to continue emulating the ostrich. The more strategically minded may want to increase their fly-bys of the Googleplex. The enterprise market with its billions appeals to Google’s financial officer.
Stephen Arnold, May 15, 2008
Data Harmony Update a Suite Release
May 16, 2008
Access Innovations Inc., a data management systems company, is releasing version 3.4 of its Data Harmony software suite, and it sounds like a sweet deal.
The five-component software is used to make and maintain taxonomies, thesauri, and indexing systems. Data Harmony focuses on accuracy, precision, and repeatability in its search results, an emphasis that receives a happy quack from the Arnold IT mascot.
The major updates include more than 30 new features and revised documentation (to keep you in tune). The company says current users will recognize the same look and feel of the program and appreciate “friendlier and more functional features.”
President and Chairman Marjorie M.K. Hlava said the upgrade comes courtesy of user requests and suggestions. It’s refreshing to find a tech company making such efforts to rework a good product and actually making it better. We like Ms. Hlava’s old-fashioned, hands-on, we-care approach at a time when software vendors do better PR than coding. The full list of the Data Harmony enhancements for 3.4 can be found here.
Jessica Bratcher, May 16, 2008
Google Translate
May 15, 2008
The Google Search Appliance is a pretty nifty gizmo when you know how to “pimp” your GSA with the OneBox API. On May 15, 2008, the GOOG confirmed what I heard at the Where 2.0 conference yesterday afternoon: Google Translate now handles another 10 languages. You can read the Googlers’ official announcement here.
Why mention this on Beyond Search with its narrow editorial scope? Well, in addition to cross-language searches, you get to play with the AJAX language API. For the clever kids at Adhere Solutions, you can use some of this translation goodness to allow a language-challenged person to read a document written in another language by a colleague halfway across the world.
These baby-step announcements can be overlooked unless you keep your ears attuned to the sound of big Googzilla paws advancing toward the enterprise. What will Google tell me about this “interesting” expansion of Google Translate? I’m persona non grata, so my queries fall on deaf but wealthy ears. I can hear those claws scraping across the pavement of Shoreline Drive.
Stephen Arnold, May 15, 2008
Google: A Brace of Media Analyzer Inventions
May 11, 2008
On May 8, 2008, the USPTO, an outstanding organization with a stellar search system, published two Google patent applications. US2008/0107337 is “Methods and Systems for Analyzing Data in Media Material Having Layout” and US2008/0107338 is “Media Material Analysis of Continuing Article Portions”. You can download these here.
Both inventions, for which Google is the assignee, pertain to figuring out what’s important and what’s not on Web pages. Companies that scan hard copy and convert those images to machine-readable ASCII use some tricks but a great deal of brute force to figure out what’s information and what’s advertising or other dross.
The inventions’ systems and methods can also be applied to other types of images converted to a machine-readable form; for example, a PDF that consists of the PDF wrapper and the TIFF image in the wrapper. I know that commercial database publishers are on top of Google’s innovations in content processing, so this is old news to the wizards at ProQuest, Reed Elsevier, and Thomson Reuters. But others in the less rarified atmosphere may find these disclosures interesting. Two patent documents stumbling through the USPTO’s hallowed halls are not an accident of fate.
Stephen Arnold, May 11, 2008
Let’s Assume Microsoft Acquires Powerset
May 10, 2008
I read Dan Farber’s most intriguing post “Is Microsoft Stalking Powerset’s Search Technology?”
I have Saturday chores to do, and I was sweeping the garage with the Microsoft-Powerset tie-up buzzing in my head. I dropped the broom and grabbed my notebook for this post. Please, navigate to the News.com site and snag this “Outside the Lines”, May 10, 2008, information.
Mr. Farber writes:
Powerset raises the bar on search based on a preview that I had of the service last month. Powerset differs from the Google in that it extracts and indexes concepts, relationships, and meaning, rather than keywords. It’s able to create connections and pivot in some cases in ways that elude Google’s proficient engine, which favors more of a statistical approach
I saw an interesting demonstration of the Powerset technology at the Bear Stearns (oh, the late, lamentable Bear Stearns) Internet Conference a year or two ago. I also received a link that allowed me to run some test queries on the system. Based on technology from Xerox PARC (Palo Alto Research Center), Powerset delivers some of the functionality I wrote about in my description of Cluuz.com here.
Quite a few companies are processing content, identifying relationships, and trying to move beyond key word search. I’m not going to revisit these points. My broom awaits, and I want to offer these ideas for comment:
- Assume Microsoft buys Powerset. Now the giant from Redmond has to figure out what to do with its various Live.com search functions, the Fast Search & Transfer Web search (which you can see here as AllTheWeb.com, branded as a Yahoo service but delivered using Fast Search & Transfer’s system), and the hybrid solution from Powerset (home-grown plus the third-party code from Xerox PARC).
- Powerset has undergone a lengthy gestation. I think the service is interesting, but Hakia, which beat Powerset to market, has a niche focus in health care and a growing appetite for enterprise deals. If I had to pick between Hakia and Powerset, I think I would lean toward the Hakia system for two reasons: [a] most, if not all, of the code is the product of the Hakia team, so there’s no pesky third party involved; and [b] the company, despite its hunger for capital, has pushed products out the door, not just demonstrated prototypes.
- Microsoft has to find a way to slow Googzilla, and I am not certain that buying search technologies is a way to throw some body punches at the mathematicians in Mountain View, California. For example, Google continues to build out a 21st-century version of the “pre-break up” AT&T infrastructure without much push back from anyone. Even IBM has a bad case of Google love. AT&T and Verizon along with Wall Street see Google as a one-trick pony, albeit a big, big pony. Loading up on search wizards is a good thing. Trying to integrate different search technologies into the existing Microsoft platform may be less good.
Okay, now I have to return to my garage cleanup duty. A happy quack from the Beyond Search goose to Mr. Farber for his interesting article and the respite he gave my tired wings.
Stephen Arnold, May 10, 2008
Semantic Web: Useful Links
May 10, 2008
Advancing Insights posted a list of useful links for “Web 3.0, RDF, and the Semantic Web”. A content goose squawk for Jim Wilde for the links. Clicking through these documents is instructive. If you follow Google’s activities in the semantic space, you can see why Google has pushed forward with its programmable search engine.
Invented by a former IBM Almaden scientist, the PSE or programmable search engine could, if deployed on a large scale by Google, make Google the de facto “hub” for semantic processing. You can download one of the Google PSE documents by navigating to the USPTO’s awesome Web site and searching for US2007 00386616, filed on April 10, 2005, and published on February 15, 2007.
Stephen Arnold, May 9, 2008
The AP Analyzes Microsoft’s Live Search Options
May 9, 2008
Out of the aether, I received Jessica Mintz’s story “With Microsoft Mum, Analysts Mull Next Moves for Live Search”. You can read the story here or here. As I often say, snag it quickly. The wild and wonderful world of the Associated Press’s online system can baffle even a skilled researcher.
I scanned the story, intrigued that “analysts mull” much of anything related to search, text processing, or information retrieval. The sector has a glass ceiling that kicks in at the $350 to $400 million level with most companies in the market trying to make the losses and voracious appetite for investment look like a great business.
Her analysis, which I’m confident is Grade A for the AP, arrived as I read the PCWorld story “Microsoft’s Answer to Google Sky to Launch at End of May”.
Microsoft has to do more than play me-too if Microsoft is going to hobble Googzilla. The GOOG isn’t very good at PR, marketing, or sales. At least, Microsoft pays attention. That’s a good thing, I suppose.
Ms. Mintz’s interesting essay is about Microsoft after Yahoo. I think her point is that without Yahoo, Microsoft has no easy, fast, cheap way to increase its search traffic and, hence, its online ad revenue. She writes:
Some analysts say Microsoft must increase its search traffic to attract advertisers. Others believe Microsoft should concede that market to Google Inc. and find success elsewhere — leapfrogging rivals in areas such as display and mobile advertising. All that is clear is Microsoft must come up with a Plan C soon, after acknowledging that its Plan A of going solo was troubled, forcing it to turn to Plan B of acquiring Yahoo. Part of the problem analysts face predicting Microsoft’s next moves is that the company has already tried the obvious tactics. It built its own search-ad platform from scratch and spent $6 billion to buy a major online advertising company, aQuantive. Microsoft overhauled its search engine technology, and most analysts agree that its results are at least as good as Google’s. It tweaked the design of its Live Search service to become more like Google.
Whoa, Nellie!
The most interesting information for me was that Ms. Mintz presents a series of action items. I’m not sure if these are Mr. Ballmer’s or if these have been constructed from the search experts Ms. Mintz interviewed for this story. Set aside provenance for a moment. Let’s look at each action item. For ease of comparison, I put Ms. Mintz’s suggestions in the column “Microsoft Tasks” and my comment in the column labeled “Beyond Search Comment”.
| Microsoft Tasks | Beyond Search Comment |
| --- | --- |
| Do the basics | Google’s been at the basics since 1998. Time to start, I guess |
| Innovate in “quick waves” to force Google to play catch up | “Quick” and “Microsoft” are an oxymoron |
| Change the basic experiences of communication and search | Microsoft needs to deliver search that people actually use |
| Gain scale | Good idea. Google’s been building plumbing for a decade. Microsoft’s just started |
I don’t think my research for Google Version 2.0 supports the idea that Microsoft can catch Google with these four actions, individually or collectively. Let me run through my reasoning based on the information available to me.
First, Google delivers a search experience that is increasing its market share. Google’s approach works. Microsoft’s approach hasn’t. What’s astounding to me is that, even with Live.com as Internet Explorer’s default search service, the canyon in market share is almost unbelievable. IE users are ignoring the default search box and consciously selecting Google. That’s just amazing. One bit of bad news. The market share data are not accurate. Google’s market share is in the 80 percent range. In countries like Denmark, Google’s share is over 90 percent.
Google: Content Management for YouTube
May 9, 2008
My hobby is reading Google’s opaque, jargon-filled, and disjointed patent documents. If you are following the $1 billion legal dispute between the GOOG and the media dinosaur Viacom or you upload video to Google, you will want to take a gander at US 20080109369, “Content Management System” by eight Googlers.
The invention is a control panel that shifts certain content tasks to the person posting content to the Google system. There are references to bits of Google technical magic that make the system smarter than the clunky content management systems that most organizations use.
In my opinion, this Google disclosure could shift the burden from Google to the person or software function posting content. You can download the document from the wonder system provided without charge by the US Patent & Trademark Office. I’m interested in your views of US 20080109369. The Viacom attorneys have undoubtedly gone over this invention with the legal acumen embodied in their sleek selves. I just read this stuff as I find it. This one’s worth a quick look if you are curious about one of Google’s systems for handling the more than one million video uploads pumped into the company every three or four weeks.
Keep in mind that the system and method in this patent document can be extended to other types of content. This invention could–note the could, please–make Google into a great big database publisher. For now, Google is just inventing, not doing, what the system and method assert. Patent applications aren’t products and services.
Stephen Arnold, May 9, 2008
Redmond Magazine: Brainware a Winner in Desktop Search; Google, the Loser
May 8, 2008
Redmond Magazine, an independent publication that tracks things Microsoft, featured a “bake off” among desktop search systems. The companies in the technology comparison are Brainware, dtSearch, Google Desktop Search, and Microsoft Windows Desktop Search. The winner? Brainware’s Globalbrain search system, which is based on patented technology. The company uses trigrams–three-letter sequences–to identify relevant documents.
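For readers curious about how three-letter sequences can point to relevant documents, here is a minimal sketch in Python. It is not Brainware’s patented method, which the bake off summary does not describe. It is a generic illustration that compares the sets of character trigrams in a query and in each document, and the function names are hypothetical.

```python
def trigrams(text):
    """Return the set of three-character sequences in a lowercased string
    whose runs of whitespace have been collapsed to single spaces."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_similarity(query, document):
    """Jaccard overlap of the two trigram sets, from 0.0 (nothing shared)
    to 1.0 (identical sets). This scoring choice is an assumption made
    for illustration only."""
    q, d = trigrams(query), trigrams(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


def rank(query, documents):
    """Sort documents from most to least similar to the query."""
    return sorted(documents,
                  key=lambda doc: trigram_similarity(query, doc),
                  reverse=True)


if __name__ == "__main__":
    docs = ["enterprise taxonomy tools", "desktop search software"]
    # The misspelled query still shares most of its trigrams with the
    # second document, so that document is ranked first.
    print(rank("desktop serch", docs))
```

One reason trigram matching is attractive for search is that a misspelled query still overlaps heavily with the correctly spelled text, so exact keyword matches are not required.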
You can read the summary of the bake off here. These types of reports can disappear or become hard to find in a blink, so click quickly.
The most interesting part of the analysis is that Globalbrain scored lower on features, but beat the other vendors’ systems on documentation, ease of installation, ease of use, and administration. dtSearch, however, matched Globalbrain on installation ease. In my own tests of these systems, I found some of Globalbrain’s terminology confusing, particularly with regard to selecting specific collections to index and search. Obviously the Redmond Magazine team didn’t have any problem with terminology.
Another key finding was that the lowest-rated search system was Google Desktop Search. In our tests, Google fared near the top of the stack, but it lagged behind both Coveo and ISYS Search Software. These two vendors’ products were not included in the Redmond Magazine analysis.
Microsoft Windows Desktop Search scored only slightly better than Google. In our tests of Windows Desktop Search, we encountered odd latency when the system processed certain queries. Again, our experience seems to be at variance with the Redmond Magazine results.
Bottom line: Redmond Magazine flagged Brainware’s Globalbrain as the “winner”; Google, the vendor with the desktop search system that needs the most work. One tip: if you run a query on Live.com or Yahoo.com to locate Windows desktop search, make certain you get the desktop system, not the SharePoint Search Express system. Two different animals, and the Redmond Magazine test looked at the desktop version, which does not require SharePoint.
Stephen Arnold, May 8, 2008
US Government Uses AdWords
May 6, 2008
By the time you read this, the estimable Financial Times will have renamed the file, moved it to a digital dungeon, and besieged you with advertisements. The headline that stopped me in my web-footed tracks is, “US Advertises on Google to Snare Surfers”. Click here for what I hope is the original FT link.
The idea is that traffic to a US government site–America.gov–needs to be goosed (no pun intended, dear logo). Do you think the government might use content? Do you think the US government might use backlinks from high-traffic Web sites? Do you think the government might use nifty Web 2.0 features? Keep in mind that this site’s tag line is, “Telling America’s story”.
The answer is, “No.” The US government bids for such zippy terms as terrorism. The person who clicks on an advertisement gains an insight into the American government’s psyche.
The FT story said:
In recent months the US administration has quietly been running the advertisements for its America.gov site, which is intended to give foreign audiences the Washington take on US foreign policy, culture and society.
I am not doing any government work at this time. I hope someday to meet the consultant who came up with this idea. I will try to get this wizard to take me to lunch. I have a hunch this consultant made some money on this project.
Stephen Arnold, May 6, 2008

