Why Microsoft Fears Google Sort of Revealed

July 7, 2008

My trusty news reader delivered this knowledge dumpling to me on July 5, 2008. I scanned it between pitches at the Louisville Bats baseball game. Like most commercial business writing, the essay is good, well-reasoned, and shaped to put major events into an understandable context. You can read “Why Microsoft Will Win Yahoo” here. It is by David Kirkpatrick, senior editor of Fortune, writing on CNN.com. This business essay has an A at Wharton written all over it: primary research in the form of quotes from a personal discussion with Microsoft’s Steve Ballmer, humor in the form of a left-handed description of Google as owner of the “world’s most powerful and profitable marketplace” with Google technology described as not “the best”, and an analysis pegged to Microsoft’s need to get its transmission in gear in terms of online advertising.

I agree with Mr. Kirkpatrick on most points, but one passage nibbled at me during the thrill-lacking Louisville Bats game:

Google (GOOG, Fortune 500) drives Microsoft crazy for two fundamental reasons. One, Google has developed the fastest-growing new pool of profit in technology with its ad-supported search business. And secondly, it has taken the mantle of “greatest and most powerful tech company” away from Microsoft, with all the associated benefits that go along with that, most notably a very high stock-market valuation.

Here’s why.

Recall that Google is a decade old. It is no start up, but some journalists fall for the crazy college guys charade with astounding consistency.

Today, the company is without a doubt the leader in search. With some inspired me-too borrowing, Google jumped into online advertising and now, after facing minimal competition in these areas, controls somewhere between 65 percent and 75 percent of the market for search in North America and Europe. Russia is lagging because of Yandex.

And Google discovered, by accident, that its search and ad platform could support a range of applications, services, and functions. Some of these are definitely not-searchy. For about nine years, Google’s engineers have built more than 80 services, created a Google Search Appliance, moved into mapping, probed online database technology, and offered products and services to the education market, commercial enterprises, and government agencies. The telecommunications industry discovered how annoying Google could be with its New Age approach to mobile communications. The video industry is relying on Viacom to put Google in a straitjacket. All told, Google is exploring six or seven industry sectors with products and services. Some of these are only variations of search technology; other thrusts such as online payments just run on the Google super computer.

Microsoft is now trying to “close the gap”, “catch up”, or more colloquially “kill Google” by taking these actions:

  • Creating Live.com, expanding cloud or network solutions, and making Google a focal point much as Inquisition worker bees focused on folks who did not follow the desired line of thought
  • Announcing investments in data centers and research labs. Both of these initiatives look quite a bit like Google’s approach to data centers and research labs, but Google got its ideas from AltaVista.com engineers and Bell Labs. The problem is that it takes time to get data centers up and running, particularly when you are using systems that require name brand hardware and technical baby sitting.
  • Buying search technology.

The leader in collecting search technology is Yahoo. So, Microsoft will buy that company and get licenses and technicians familiar with these search systems: [a] AllTheWeb.com which was originally based on Fast Search & Transfer technology now owned by Microsoft; [b] AltaVista.com, the company whose engineers provided Google with a turbo boost in the 1999 to 2002 period with some residual kick continuing even now; [c] Inktomi, one of the original Web indexing systems; [d] the Flickr search system; [e] Stata Labs, the email search system in use in Yahoo mail; [f] InQuira, the natural language search system used for Yahoo Help; [g] the various research search engines that range from the mothballed Mindset to the newer and somewhat flaky semantic Microsearch; [h] the mothballed Delicious.com search system; and a handful of others lost in the jumble room of my 64-year-old memory.

Add to this collection the SQL Server search technology, two types of SharePoint search, the search functions in Dynamics CRM, Fast Search & Transfer’s Web search and ESP (enterprise search platform), a couple of flavors of desktop search, the mind-bogglingly awful search in Outlook and Outlook Express, and the search function in the Xbox which at one time was provided by Mondosoft (now part of SurfRay in Denmark). Then there is the Powerset search system, built on Xerox PARC technology and partially running on Amazon’s Web services platform. Of course, I’m probably forgetting a few.

What I am driving at is that if and when Microsoft buys Yahoo’s search business, the Yahoo.com site really should be part of the bargain. Despite its many flaws and weird America Online approach to information, Yahoo.com gets traffic. And traffic is what Microsoft needs, not more wacky search technology.

Consider this statement by Mr. Kirkpatrick:

So even though Microsoft has with painstaking and expensive effort come near to par with Google on search technology, it still shows little likelihood of competing successfully with it as a search business.

“Par with Google on search technology” is not going to deal with the fact that Microsoft finds itself on the defensive not just in search but in these key areas:

  • Brand (generally positive)
  • Demographic hooks (moving into education where Apple and Microsoft have long ruled the hen house)
  • Business models (Advertisers pay, “pull” sales, not “push” sales the way Microsoft earns money, piece of the action, etc.)
  • Velocity of innovation (slowing but still pretty zippy)
  • Cheaper operating and infrastructure costs (not understood and overlooked as a competitive advantage by Wall Street analysts)
  • Time (Google has a head start and is still moving forward).

When I read business articles, I recognize the care that goes into them. I know from my days at Ziff Communications how much it costs to craft prose that flows.

My concern is that after a decade of Google being Google, no one recognizes that Google represents a fundamental change in the software, services, and systems business. Microsoft is now reacting but I think it is too late to buy aging portals, collect odd ball technologies, and try to use money to buy parity with Google.

The Fortune article does not and cannot tackle such issues in a short essay. I am looking forward to Fortune’s and other publications’ coverage of Google. I do hope to read something more substantive than Microsoft will keep trying to catch Google (pretty much an impossibility in my little Kentucky sphere of understanding), Google making people stupid (a concept I don’t understand because Google manifests a demographic; Google did not create the demographic’s behaviors), and Google is on the ropes in the telco business.

Google drifts above these particulars. The company does face a grave threat but from attorneys, not software companies or venture capitalists’ bets on “Google killers”.

Information on this perspective on Google is available, just hard to find.

Stephen Arnold, July 6, 2008

Not So Fast, Folks

July 6, 2008

My news reader alerted me to an essay “How Fast Is Attivio?” The author is Adrian Bloem, a contributing analyst to CMSWatch.com and its Enterprise Search Report. I try to keep track of ESR and its new shepherds.

Some readers may know I suffered a “heart event” and had to step away from Enterprise Search Report in February 2007. This was a tough decision because ESR was in a way my “baby.” I wrote the first three editions (2004 to 2006).

If there was a mistake in the math of the cost analysis, I made it. I was responsible.

But when I quit, I was no longer responsible. When someone asks me why the number of profiles has been reduced, what do I know? When I stepped away from ESR, I gave up any control over what CMSWatch.com includes or excludes from the ESR fourth edition.

Sure, I still get a pittance in royalties, but if there is a mistake, I sure as heck am not responsible because I quit, allowing others to take control. I tell people, “Take it up with CMSWatch.com, not me.” I think you follow my logic here.

How does this apply to Attivio, a start up?

Attivio: Moving Beyond Search

Some history: My interest in Attivio arose from the research I did for my April 2008 monograph Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work. When I was recovering from my little health problem, I did some hard thinking about the sameness of the enterprise search vendors, the many problems I had documented, and survey data that said, “60 percent of the users of enterprise search were dissatisfied with their enterprise search system.” 60 percent! That is a big number, now backed up by other survey findings. But in 2007 few knew what I knew about the problems with traditional enterprise search systems.

The data said to me, “Key word and legacy systems are not what users looking for information needed.”

My study Beyond Search explains these needs and some options, consuming about 300 pages to summarize my data. As I worked on the study, I became more vocal about the user dissatisfaction with enterprise search than some of the other speakers on the search conference circuit.

The root of the problem is the users’ needs and expectations. Most search systems were not changing fast enough or simply could not change.

A colleague alerted me to Attivio. I recall her saying, “Attivio is trying to leapfrog the many problems of traditional enterprise search.”

So, I played telephone tag and finally chased down the Attivio senior management team. I do have a tenacious streak. I used the same technique on executives from Silobreaker (Stockholm, Sweden), MarkLogic (San Carlos, California), Exalead (Paris), and 24 other companies with systems that move “beyond search”. A phase change is taking place, and I wanted first-hand intelligence about what was happening. You can read some of this information in my free Search Wizards Speak series on ArnoldIT.com.

When I read Mr. Bloem’s well-crafted essay, published on July 5, 2008, I was struck by the skepticism with regard to Attivio and, by extension, toward other companies with pre-alpha or early-stage systems; for example, Powerset (now part of Microsoft), Radar Networks (Twine), and Tigerlogic (ChunkIt), among dozens of others.

These are works in progress, and each of these companies is working hard to sign up customers and make sales.

My information about Attivio comes from my knowing a couple of the founders, several of whom worked at Fast Search’s offices near Boston, Massachusetts. Also, I know a bit about Fast Search technology because I had an engineering oversight role for the US Federal government on the Fast Search implementation for FirstGov.gov, a US government-wide index of citizen-facing content.

In my push forward way, I tracked down Attivio’s founder–Ali Riaz–and interviewed him about his new company and its technical approach.

Mr. Riaz made it clear to me that he and his former colleagues wanted to move “beyond search”; that is, these professionals knew that key word and legacy information retrieval systems were not what users wanted. Mr. Riaz wanted to follow another path, moving to next-generation information access methods. He said:

From all our accumulated experiences in the industry, we realized that search is simply not enough to solve the problems that search has been trying to solve. We realized that today’s search platforms–specifically enterprise search–have become legacy technologies. The market needed a fresh approach, and that’s why we created Attivio.

Please, read the complete interview with Mr. Riaz, which took place in May 2008, here.

In my primary research with Attivio’s management, I learned that Attivio is not a reseller of Fast Search & Transfer’s Enterprise Search Platform. I also learned that Attivio–like IBM, Siderean Software, Tesuji, and other information access companies–was using the open source search system Lucene as a base upon which to build.

The idea was, according to Mr. Riaz:

The Active Intelligence Engine, or AIE. Our AIE enables enterprises to blend their structured data and unstructured content without compromising the richness of either, offering the precision of SQL and the fuzziness of search by “mashing up” search and business intelligence data warehouse technologies.

Attivio, like six or seven of the companies profiled in my Beyond Search study for the Gilbane Group, was designed as a blend of components–some open source, others proprietary, and a lot of their own intellectual property.

Mr. Riaz told me that he was not reselling any vendors’ technology, preferring to “do his own thing”. No big surprise here. Mr. Riaz left Fast Search in mid-2006 and I was talking to him in May 2008.

What Gnaws at Me

The issue that gnaws at me is the implication, as I understand Mr. Bloem’s essay, that Attivio and its executives are tainted by the problems that have surfaced after they left Fast Search in 2006.

Mr. Riaz and the other Fast Search professionals were employees, reporting to the management team in Norway.

My research indicates that Fast Search’s problems came about because of a dearth of engineers who could install and customize the Fast Search system. I have written extensively on this subject in this Web log. The posts are here.

The core of my analysis pivots on the gap between the ability of the Fast Search sales team to close major deals and the difficulty Fast Search encountered hiring enough qualified engineers to implement these enterprise systems.

Without qualified engineers, it is tough to install any enterprise search system. Expertise in search and content processing remains in short supply just as it has been since the ascendance of Google.

In my experience, this type of staffing problem, like a bobsled racing down a run, begins slowly and then gains momentum over time when hiring is slowed due to a shortage of talent.

In the period from 2006 to the present, the engineering staff issue accelerated like a bobsled racing down a mountain.

Microsoft Hosted Exchange Security and Archiving

July 6, 2008

I lost track of Microsoft’s 2005 acquisition, FrontBridge. The company, as you may recall, was a provider of comprehensive secure messaging services. FrontBridge’s “Total Message Management” services ensure the security, compliance and continuity of electronic messages. The system provided managed services for email and instant message archiving, spam filtering, virus scanning, encrypted email, policy enforcement and disaster recovery.

I had in my files a schematic that shows the FrontBridge architecture, which remains largely intact within the hosted Exchange service.

[Diagram: FrontBridge architecture]

With search vendors morphing into eDiscovery, you may want to update your links to Microsoft’s Exchange Hosted Services, where FrontBridge plays a role. You can find the start page here. At the time of the acquisition, I understood that FrontBridge would be a Microsoft subsidiary. By 2006, FrontBridge became part of the Exchange product. The original 2006 pricing for email and message filtering was pegged at $1.75 per user, per month; Archiving at $17.25 per user, per month with an unlimited retention period and 3.6 gigabytes of storage; Continuity at $2.50 per user, per month; and Encryption at $1.90 per user, per month. I am not sure how the pricing operates as Microsoft evolves its hosted services.
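To put those per-user figures in perspective, here is a back-of-the-envelope sketch in Python. It simply adds up the four 2006 list prices quoted above; the 100-seat organization is a made-up example, and I am assuming, for illustration only, that the services are billed additively with no volume discount.

```python
from decimal import Decimal

# Per-user, per-month list prices quoted above (original 2006 pricing).
PRICES = {
    "filtering": Decimal("1.75"),
    "archiving": Decimal("17.25"),
    "continuity": Decimal("2.50"),
    "encryption": Decimal("1.90"),
}


def monthly_cost(users, services=PRICES):
    """Return the total monthly charge for `users` seats across `services`.

    Assumes simple additive pricing with no discounts (an assumption,
    not something the price list confirms).
    """
    per_user = sum(services.values(), Decimal("0"))
    return per_user * users


# Hypothetical 100-seat organization taking all four services.
print(monthly_cost(100))
```

At these rates archiving accounts for most of the bill, which is worth keeping in mind when comparing the hosted service with a third-party archive.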

As you try to determine the value of licensing a third-party secure messaging service or use the hosted Exchange solution, you may find the diagram useful in getting your bearings.

Stephen Arnold, July 6, 2008

SurfRay AB Update

July 6, 2008

In the first two editions of Enterprise Search Report, I profiled Mondosoft, a Microsoft-centric search system. I gave it a favorable review. As with most Microsoft-centric products, performance can become an issue unless the system is properly resourced. By the time I started work on the third edition of ESR in 2006, I had heard rumors of some changes underway at the firm. By late 2007, Mondosoft became part of SurfRay, a Danish search conglomerate. I found the search system implemented for the Vatican quite interesting. Hit boosting and multi-lingual support added zest to what could have been a sinfully bad (no pun intended) search experience. You can try it here.

In 2004, Mondosoft caught my attention because it was one of the first search vendors to offer analytics for licensees. Mondosoft, when deployed in a SharePoint environment, brought much needed usage data into the SharePoint picture. Instead of flying blind, Mondosoft gave the system administrator useful information about user actions. With Mondosoft’s analytics, SharePoint sites could be tuned to improve the user’s experience. Microsoft talked about SharePoint user experience; Mondosoft delivered technology that addressed user experience.

Mondosoft then acquired Ontolica, a company that made better use of SharePoint metadata and generated other useful tags. With Ontolica 3.2 installed and properly resourced, a SharePoint administrator could provide a useful set of hot links related to the user’s query. Microsoft delivered a blunt instrument; Ontolica provided a precision tool.

SurfRay’s product line includes an advanced, multi-lingual search engine suite with three components: [a] MondoSearch, [b] BehaviorTracking, and [c] InformationManager; SurfRay’s Speed Index search and retrieval system; and Ontolica Search for SharePoint, providing business intelligence on information creation, search, retrieval and use. SurfRay also owns technology that can speed up searches of traditional relational database tables. In addition, SurfRay provides consulting services to its licensees. Plus, the company offers SurfRay XP search for Xerox’s multifunction document systems.

SurfRay/Mondosoft customers include Bosch, Burger King Corporation, Coleman, Hilton Hotels, Honeywell Process Solutions, Microsoft, Overnight Transportation, People’s Bank, Shell Oil, Siemens, SimCorp, The Swiss Army, TDC, The Vatican Holy See and United Technologies. SurfRay’s CEO and founder is Martin Veise. The president of the company is Steffen Saxil.

SurfRay has offices in New York, Stockholm, Bangkok and Copenhagen. You can learn more about the company here.

Stephen Arnold, July 6, 2008

SeeWhy: Real Time Business Intelligence without Search

July 6, 2008

SeeWhy came on my radar with its “no search” marketing angle. I poked around and was, at first, confused. The company appeared to occupy a no-man’s-land between search engine optimization and business intelligence that I avoid. A quick look revealed that the company has a business event system with some interesting twists.

Real Time and My Concern with the Phrase

“Real time” has been promoted from technical impossibility to buzz word. The general notion of “real time” among computer scientists is that simultaneity across linked systems is impossible outside of the bizarre world of high-energy physics. No matter how minute, latencies exist even if measured in picoseconds. But to a marketer, “real time” connotes a softer, gentler world far from the “batch oriented” or human-intermediated world familiar to most professionals.

Now, real time is coming to the enterprise. Exegy, based in St. Louis, Missouri, offers an appliance that can ingest content by the megabyte per second and spit out processed content without much latency. To achieve this, Exegy has done some hardware engineering, but the gizmo works. When you shift to “real time” in the types of server environments found in a trucking company or a consulting company where capital investment is mostly out of the question, “real time” is not in Exegy’s league.

Let me be clear: to deliver near real time content processing Exegy style, you need specialized infrastructure. The average Dell server is not able to deliver no matter how insistent Bill Trucking Company’s information technology consultant becomes.

A number of text and content processing companies are asserting that their systems operate in “real time”. They don’t. Against this background, let’s look at one interesting company. I will not comment on this firm’s emphasis on real time processing, preferring to provide some basic information about this single firm and then offering, as a wrap up, a handful of generalized observations.

SeeWhy Software: Operational Business Intelligence

SeeWhy is one of the first “open source” real time Business Intelligence platforms for the event driven enterprise. SeeWhy continuously analyzes and interprets streams of individual business events, to alert you immediately to opportunities and risks and enable everyday decisions to be automated.

[Image: SeeWhy’s basic idea]

The marketing angle that snared my attention.

Incorporated in 2003 by BI industry veteran Charles Nicholls, SeeWhy is backed by several venture capital investors, including LogiSpring, Pentech Ventures, Delta Partners, and a handful of private folks. SeeWhy is headquartered in Windsor, United Kingdom.

Charles Nicholls, founder and CEO, said here:

I began to ponder on the Business Intelligence industry with all its unfulfilled promise, often long on vision and short on delivery. The more that you challenge the status quo, the faster that you can see the opportunities to make the world a better place. It was this process that started me on a journey that led inevitably to create SeeWhy.

The basic premise of the company is summarized in this diagram from “In Search of Insight,” a 43 page document from Mr. Nicholls:

[Diagram: SeeWhy’s basic premise, from “In Search of Insight”]

The Web 2.0 Angle

You can download a monograph “In Search of Insight” about the company’s approach to business intelligence here, no annoying registration, thank you, SeeWhy.

Email Analysis

July 5, 2008

This summer I have been asked about email analysis on two different occasions. In order to respond to these requests, I had to grind through my archive of email-related information. I wrote about Clearwell Systems and its approach earlier this year. You can read this essay here.

I cannot reproduce the information my paying customers received. I can take a representative company–in this case, Stratify, a unit of Iron Mountain–and show you two different screen shots. These layouts and representations are the property of Stratify, and I am including them in this essay for two reasons:

  1. Stratify has been one of the early players in text analytics. First as Purple Yogi and then as Stratify, the company was engaged in the difficult missionary marketing needed to make non believers into believers
  2. The company has gained some traction in the legal market, which in the US, is a booming sector. The problems of the economy translate into a harvest of riches for some legal firms. Email is a big deal in discovery, and few have the resources to get a human to read all the baloney that zooms around an organization involved in a legal matter.

The Problem

You know the problem. Email was once ASCII shot between two people on Arpanet. Today email is the bane of the knowledge worker. The volume is high. The storage systems antiquated. The attachments madden the sane. The people using email forget that the messages live on different servers and can, in the process of discovery, be copied to a storage device and delivered to the attorney or attorneys who have to find something germane to the legal matter in the terabytes of digital data.

To summarize the challenges:

  • Email volume (lots of it, maybe a billion messages in a mid-sized organization every year)
  • Email attachments (tough to find the “right” one)
  • Email crashes (restores don’t always work, which you probably know first hand)
  • Email sent as if it were a one-time, secret communication
  • Email with recipients who, by definition, have some relationship.

For a lawyer, email is good and bad. It’s good if one finds a smoking gun or better yet a gun in the act of shooting. It’s bad if the bullets are coming at the opposing side’s legal eagles, worse if the bullet shoots a legal eagle out of the sky with a slug through the brain.

Ergo: email is a big, big deal in the information world of litigation.

The Solution

The fix is obvious–search. Actually to be precise, the conundrums of email invite text processing, text analytics, link analysis, relationship extraction, entity extraction, and other nifty methods.

The basics of email analysis are actually simple on the surface, more complicated under the hood and out of sight of non-technical types like lawyers: [a] copy email to a storage device that is fast, [b] tell the email analysis program to index the email, [c] key word search or browse outputs, [d] make notes, print out email, and read individual documents of interest, [e] repeat, taking care to bill for the time. (That’s the best part of email analysis. It’s quicker than manual methods, but the systems have to have a baby sitter. Those operating these systems can bill without working up too much of a mental headache. Automated processes do make some legal thinking less painful. The best part is billing for this less stressful time.)
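For readers who want to see what steps [b] and [c] amount to in the simplest possible case, here is a toy Python sketch of an inverted index over a few messages with an all-terms key word search. This is not how Stratify or any commercial system works; the message fields, the sample messages, and the function names are all invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(messages):
    """Map each token to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, msg in messages.items():
        for token in tokenize(msg["subject"] + " " + msg["body"]):
            index[token].add(msg_id)
    return index


def search(index, query):
    """Return ids of messages containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical messages standing in for a copied mail store.
MESSAGES = {
    1: {"subject": "Q3 contract", "body": "Please review the contract draft."},
    2: {"subject": "Lunch", "body": "Sandwiches at noon?"},
    3: {"subject": "Re: Q3 contract", "body": "Draft approved, send it out."},
}

index = build_index(MESSAGES)
print(sorted(search(index, "contract draft")))
```

A real collection adds attachments, threading, de-duplication, and terabytes of volume, which is where the under-the-hood complexity comes in.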

What do these systems show the user? The illustration below shows a Stratify search screen. Since I obtained this screen shot, Stratify has probably updated the interface. The main features are what interest us. Take a look at what the Stratify system user sees when analyzing processed email:

[Screenshot: Stratify email analytics]

Stratify’s email visualization

The principal features of this display are:

  1. Simplicity. You don’t want to confuse attorneys
  2. A picture showing people and their relationships as discerned by the system. Remember, an email can be sent to a person unrelated to a subject either by accident or for some other reason such as a “this is what I am doing” courtesy
  3. Links on the right hand panel to make it easy for the user to poke around by sender, topic, etc.
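The relationship picture in item 2 rests on a simple idea: count who sends mail to whom. The Python sketch below shows the bare bones of that counting. It is my own illustration, not Stratify’s method, and the names and messages are hypothetical.

```python
from collections import Counter


def link_counts(messages):
    """Count messages for each (sender, recipient) pair."""
    counts = Counter()
    for msg in messages:
        for recipient in msg["to"]:
            counts[(msg["from"], recipient)] += 1
    return counts


# Hypothetical header data pulled from a processed mail collection.
MAIL = [
    {"from": "alice", "to": ["bob", "carol"]},
    {"from": "alice", "to": ["bob"]},
    {"from": "bob", "to": ["alice"]},
]

# Print the busiest links first.
for (sender, recipient), n in link_counts(MAIL).most_common():
    print(sender, "->", recipient, n)
```

Raw counts cannot tell a courtesy copy from a real working relationship, which is why commercial systems layer topic and entity analysis on top.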

Let’s assume that the email is one part of a discovered collection of information. Stratify provides a richer interface. This one includes the bells and whistles that warrant the Stratify system price tag, which is in six figures in case you want to license the system.

Fast Cash, Faster Crash

July 4, 2008

On July 3, 2008, Erick Schonfeld summarized the continuing saga of Fast Search & Transfer’s fastest move ever. The story “Did the Enron of Norway Pull a Fast One on Microsoft? More Details about the Mess at Fast Search & Transfer” is here.

The story is quite thorough, according to my sources in Norway, and there is little I can add to the TechCrunch write up.

I would like to highlight one point, provide the links to my analysis of the Fast Search saga, and offer several observations about the nature of enterprise search. Before I start, take a look at this graphic because this is the wild bobsled ride that many vendors are queued to take:

[Graphic: the enterprise search sales bobsled run]

Once a vendor starts down the sales bobsled run, it is tough to stop. The vendor has to ride to the bottom of the hill, hoping that he will not crash, risking serious injury and maybe death.

The Key Point for Me

After reading the TechCrunch essay, one segment gnawed at me; specifically:

…It [Microsoft’s paying $1.2 billion for Fast Search & Transfer] does point to a certain blindness on the part of Microsoft, or at least a willingness to look the other way, in its obsessive quest to become a player in search (see Yahoo and Powerset). It also raises questions about Fast’s underlying search technology. If Fast was having trouble closing deals for its products, how good can its technology really be?

Yes, this is the key question. The Fast Search & Transfer core technology was purpose built to index static Web sites. At the time Google started operations, AltaVista.com was an orphan, quickly losing its leadership position due to the voracious appetite for resources that public Web search engines have. The mantra is “Feed me computing resources or I die”.

Fast Search offered a Web site called AllTheWeb.com, and it was pretty good. At the time of 9/11, the AllTheWeb.com news indexing system was among the first to have reasonably timely information. Fast Search made a fateful decision in 2002 which led to Fast Search & Transfer’s exiting the Web indexing business. Fast Search sold its Web indexing business to Overture for $70 million with more money promised if certain goals were achieved. Fast Search took the money and focused on enterprise search.

The reasoning behind the decision, as I recall from my conversations with Fast Search & Transfer executives when I was involved in the Fast Search deployment for a government project, was that enterprise search was a great opportunity. Fast Search’s executives suggested to me that the company could move quickly to dominate the search market. At the time, there was little reason to doubt the confidence of the Fast Search team. A Fortune 50 company was backing the Fast Search system in the government-wide indexing program. In the 2002-2003 time period, there were not too many systems that could demonstrate an index of 40 million documents. Even today, licensees of search systems do not grasp the hurdles that indexing large amounts of text puts in front of an organization. I have written extensively about this elsewhere, and I have little to add about the ignorance of search scaling that continues to plague organizations.

Google Working on Dynamic Runtime

July 3, 2008

A colleague called to my attention Microsoft wizard James Hamilton’s post about a possible Google initiative. You can read the full note here. For me the most interesting point in the note was:

…The popular speculation is that Google will be announcing a dynamic language runtime with support for Python, JavaScript, and Java. A language runtime running on both server-side and client-side with support for a broad range of client devices including mobile phones would be pretty interesting.

Why is this important? More flexibility for developers. Google’s programming innovations continue to percolate.

Stephen Arnold, July 3, 2008

Yahoo’s Semantic Search Still Available

July 3, 2008

In the firestorm of publicity burning through blogland, Yahoo’s semantic search system has been marginalized. I admit, the url is not the easiest to remember: http://www.yr-bcn.es/demos/microsearch/. The moniker Microsearch seems to be intended to tell the astute user that Yahoo processes microformat information. A microformat is a Web-based data formatting approach that seeks to re-use existing content as metadata.

The site is labeled a demonstration, and the Yahoo logo is visible in a funereal black, which I quite like. The service is called Microsearch. The system supports RDFa marked-up pages plus some other semantic formats. Yahoo says:

Microsearch is a richer search experience combining traditional search results with metadata extracted from web [sic] pages. At the moment your Yahoo! Search is enriched in three ways: [a] by showing ‘smart’ snippets that summarize the metadata inside the page and allow to take action without actually visiting the page; [b] by showing map and timeline views that aggregate metadata from various pages, [c] by showing pages related to the current result.
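If “metadata extracted from web pages” sounds abstract, the following Python sketch shows the general idea using RDFa-style property attributes. It is emphatically not Yahoo’s code. The sample page, the PropertyExtractor class, and the rule of taking the first chunk of text inside a tagged element are my own simplifications; only the standard library html.parser module is used.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from RDFa-style `property` attributes.

    Simplification: the value is the `content` attribute when present,
    otherwise the first non-blank text found after the opening tag.
    """

    def __init__(self):
        super().__init__()
        self.found = []
        self._open = None  # property name still waiting for its text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            self.found.append((prop, attrs["content"]))
        else:
            self._open = prop

    def handle_data(self, data):
        if self._open is not None and data.strip():
            self.found.append((self._open, data.strip()))
            self._open = None


def extract(html):
    """Parse an HTML string and return the list of (property, value) pairs."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical page fragment marked up with Dublin Core properties.
PAGE = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h1 property="dc:title">Enterprise Search Notes</h1>
  <span property="dc:creator">A. Writer</span>
  <meta property="dc:date" content="2008-07-03" />
</div>
"""

print(extract(PAGE))
```

Once a title, an author, and a date are available as data rather than as prose, a search system can build the snippets, maps, and timelines Yahoo describes.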

I had to dig a bit to find the explicit connection with the Semantic Web, but the site offers a version of semantic search. Yahoo includes a link to the Semantic Web page at the World Wide Web Consortium.

Let’s look at the system. Yahoo provides some suggested queries, but I prefer my own.

My first query was “enterprise search”. The system returned the following result page:

[Screenshot: Microsearch results page for the query “enterprise search”]

The map was visually arresting, but it was irrelevant to the query and the result set. I looked at the results and was surprised to find Microsoft was the number two result. The other results were okay. The same query on Google returned more Microsoft links. My conclusion was that the “semantic” feature on Yahoo worked about as well as regular Google. The other conclusion I drew was that Microsoft is working hard to come up at the top of the results list for the word pair “enterprise search”. Too bad I don’t think of Microsoft as a sector leader in enterprise search.

My second query was for the phrase “Michael Lynch Autonomy”. Here’s what Microsearch displayed:

[Screenshot: Microsearch results page for the query “Michael Lynch Autonomy”]

For this query, the map did not render. I assumed that the system would show me the location of Autonomy’s headquarters in the United Kingdom. Sigh. Microsearch is at version 1.4 on July 3, 2008, and whizzy features should be working. The results were stale. The top ranked hit was a 2006 interview. My recollection is that the Financial Times ran an essay by Mr. Lynch a few days ago. Alas, the system seems unable to factor time into its results ranking. News stories often carry time and date data, and News XML includes explicit tags for these data. I ran the same query on standard Google. Google returned the results set more quickly than Yahoo. Google’s results were poor. The first hit was to someone other than Autonomy’s Mike Lynch. The other hits were more stale than Yahoo’s. Autonomy may want to emulate Microsoft’s search engine optimization push.
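To make the point about time concrete, here is a rough sketch of one way a ranking could factor in publication dates: discount each hit's relevance by its age using a half-life. This is an assumption-laden illustration, not how Yahoo or Google rank results. The function name rank_with_recency, the half_life_days parameter, and the relevance scores and dates in HITS are all made up for the example.

```python
from datetime import date


def rank_with_recency(results, today, half_life_days=180.0):
    """Sort (title, relevance, published) tuples by age-discounted relevance.

    A hit loses half of its relevance score for every half_life_days
    that have passed since it was published.
    """

    def score(item):
        _title, relevance, published = item
        age_days = max((today - published).days, 0)
        return relevance * 0.5 ** (age_days / half_life_days)

    return sorted(results, key=score, reverse=True)


# Hypothetical hits: an older item with a higher raw relevance score
# and a recent item with a lower one. Scores and dates are invented.
HITS = [
    ("2006 interview", 0.9, date(2006, 7, 3)),
    ("Recent essay", 0.6, date(2008, 6, 30)),
]

ranked = rank_with_recency(HITS, today=date(2008, 7, 3))
for title, _relevance, published in ranked:
    print(title, published.isoformat())
```

With a 180-day half-life, the recent item outranks the two-year-old one despite its lower raw score. A very long half-life would restore the pure relevance ordering.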

Observations

The semantic features of Microsearch did not appear front and center. The mapping function did not work. Yahoo performed as well as market leader Google. To be fair, Google’s results were not too good and Yahoo hit that benchmark.

Agree? Disagree? Let me know.

Stephen Arnold, July 3, 2008

Texas: A Clever Twist on Computer Consulting

July 2, 2008

Working as an expert witness, I was in a big shot Houston, Texas, law firm. One of the legal eagles had screwed up his laptop. He asked me if I could resolve the problem. I looked at the machine, checked the size of his Outlook PST file (the cause of the problem), did a little nerd magic, and pronounced the machine battle ready.

According to an essay posted at Institute for Justice: Litigating for Liberty, “Magnum PC? New Texas Law Limits Computer Repair to Licensed Private Investigators”, I would have been guilty of a crime. You can read the story here.

The most interesting point in the write up for me is:

The law also criminalizes consumers who knowingly use an unlicensed company to perform any repair that constitutes an investigation in the eyes of the government.  Consumers are subject to the same harsh penalties as the repair shops they use: criminal penalties of up to one year in jail and a $4,000 fine, and civil penalties of up to $10,000—just for having their computer repaired by an unlicensed technician.

So, not only was I a bad guy, the lawyer was a bad guy too. I am not sure if this is a hoax or if it is one more example of how interesting the legal system is. A number of scenarios are buzzing through my little mind now. I wonder if consultants working for Booz, Allen & Hamilton involved in systems work will have to be licensed. Somehow a consultant licensed as a private investigator and being paid to root through a client’s computer tickles my funny bone. Texas will need to clarify its consultant monitoring policies, I suppose. The State can’t allow an unlicensed technical SWAT team to fix a computer without the right paperwork.

Next time I am in Texas, I won’t fix your MacBook, Windows notebook, or AS/400, not even your mobile phone with email access. I wonder how much a private investigator’s license is in Texas? Will I have to pass a physical?

Stephen Arnold, July 2, 2008
