The Goose Quacks: Arnold Endnote at Enterprise Search Summit

October 4, 2008

Editor’s Note: This is a file with a number of screen shots. If you are on a slow connection, skip this document.

Once again I was batting last. I arrived the day before my talk from Europe, and I wasn’t sure what time it was or what day it was. In short, the addled goose was more off kilter than I had been in the Netherlands for my keynote at the Hartmann Utrecht conference and my meetings in Paris squished around the Utrecht gig.

I poked my head into about half of the sessions. I heard about managing search, taxonomies, business intelligence, and product pitches disguised as analyses. I’m going to be 65; I was tired; and I had heard similar talks a few days earlier in Europe. The challenges facing those involved with search are reaching a boiling point.

After dipping into the presentations, including the remarkable Ahead in the Clouds talk by Dr. Werner Vogels, top technical gun at Amazon, and some business process management razzle dazzle, I went back to the drawing board for my talk. I had just reviewed usage data that revealed that Google’s lead in Web search was nosing towards 70 percent of the search traffic. I also had some earlier cuts at the traffic data for the Top 50 Web sites. In the two hours before my talk, I fiddled with these data and produced an interesting graph of the Web usage. I did not use it in my talk, sticking with my big images snagged from Flickr. I don’t put many words on PowerPoint slides. In fact, I use them because conference organizers want a “paper”. I just send them the PowerPoint deck and give my talk using a note card which I hold in my hand or put on the podium in front of me. I hate PowerPoints.

Here’s the chart I made to see how the GOOG was doing relative to Microsoft and Yahoo.

[Chart: Web traffic for the top 50 sites, August 2008. Source: http://blogs.zdnet.com/ITFacts/]

The top six sites are where the action is. The other 44 sites are in the “long tail”. In this case, the sites outside the top 50 have few options for getting traffic. The 44 sites accounted in August 2008 for a big chunk of the calculated traffic, but no single site is likely to make it into the top six quickly. Google sits on top of the pile and seems to be increasing its traffic each month. Google monetizes its traffic reasonably well, so it has generated $18 billion or so in the last 12 months.

In the enterprise search arena, I have only “off the record” sources. These ghostly people tell me that Google has:

  • Shipped 24,600 Google Search Appliances. For comparison, Fast Search & Transfer, prior to its purchase by Microsoft, had somewhere in the neighborhood of 2,500 enterprise search platform licensees. Now, of course, Fast Search has access to the 100 million happy SharePoint customers. Who knows what the Fast Search customer count is now? Not me.
  • Become the standard for mapping in numerous government agencies, including those that don’t have signs on their buildings.
  • Been signing up as many as 3,000 Google Docs users per day, excluding the 1.5 million school children who will be using Google services in New South Wales, Australia.

I debated about how to spin these data. I decided to declare, “Google has won the search battle in 2008 and probably in 2009.” Not surprisingly, the audience was disturbed by my assertion. Remember, I did not parade these data. I used pictures like this one to make my point. This illustration shows a frustrated enterprise search customer setting fire to the vendor’s software disks, documentation, and one surly consultant:

How did I build up to the conclusion that Google has won the 2008-2009 search season? Here are the main points and some of the illustrations I used in my talk.


Microsoft: Frequent Searcher Points and Free Enterprise Search

October 1, 2008

Ina Fried’s “Microsoft Still Paying People to Search” is a useful reminder that Microsoft is “still paying people to search”. You can read various wizards’ comments at Search Engine Journal, LiveSide, and others. I quite liked the approach taken by Nathania Johnson, Search Engine Watch, in her “Microsoft Launches SearchPerks; Like Credit Card Rewards, Except for Search” here. For me, the most interesting point in her write up was this passage:

Microsoft’s Frederick Savoye, senior director at Live Search, assured me that this is an incentive program that fits into their three overall pillars of search: [a] Delivering the best search results [b] Simplifying key tasks such as booking airline travel, shopping, finding user opinions, etc. [c] Innovating the business model. (Note: I did a bit of format tweaking to keep the passage from becoming hard to read.)

The announcement comes hard on the heels of the news that Microsoft will be hurt by the financial problems sweeping through the US and threatening the European markets (more information here) and that Microsoft will make Oslo, Norway, the pivot point for its search research (more information on that here).

I wanted to offer several observations before my addled goose brain forgets them.

  1. Frequent flier blues. Earning points for search is a good idea. I have quite a few air miles, but the airlines change the threshold for an award or retire the miles before I can use them. I am, therefore, deeply indifferent to usage credits because of how other customer reward programs have tricked me.
  2. Can’t buy me love. I am a rental. I sell time. When someone buys my time, I love them. When that someone doesn’t pay me, I don’t love them. As long as the pay is commensurate with the work, I go along with my rent-my-time approach to business. I don’t think the dough on offer is enough to change my habits, my automated scripts, or the free Google crawls I run every couple of hours. As for a critical mass of Web users, I am skeptical. I don’t think payola will work in search, though it worked for a while in radio, someone told me.
  3. Business model silliness. Google’s business model is that someone pays Google to give away services. Users of Google expect free or low cost services to avoid the ads. Giving away free services without a third party paying or just paying people to use a service is not a business model. The tactic is marketing. I see these ploys as a type of discount coupon for tires, “Buy three and get one free”. The cost of the fourth tire is covered in the markup on the first three tires, the extra charge for balancing, or the labor cost to undo the lug nuts and put the new tires on.

In the consumer Web space, Google maintains and may be incrementally increasing its market share. I think that some of the research outfits tracking Web search share report that Google is north of 65 percent of the search traffic now. I have some first hand and anecdotal data that indicate the 65 percent figure may be low. From where I sit in my Kentucky hollow with my geese, Google’s market share in Web search is close enough to two-thirds for me. With Ask.com, Microsoft, and Yahoo chopping up the remainder, paying users probably won’t have a significant impact. Users choose what to search. Once habits form online, those habits can be tough to change. The malarkey about search being a one click easy decision does not reflect the fact that “habits, like a soft bed, are easy to fall into and hard to get out of.” That’s a quote from Miss Costello’s sixth grade classroom poster. Miss Costello was my teacher in the 1950s. Pretty accurate statement for Web search, I believe, even 50 years after I first read the message.

A quick horizon scan reveals that in enterprise search, the “give away” approach to market share is keeping Microsoft in the enterprise search game. Yet vendors tell me that their SharePoint search plug ins continue to sell. Which vendors are reaping the rewards of the SharePoint search opportunity? I can’t include the dozens who play in this space, but Coveo, Endeca, ISYS Search Software, and Vivisimo have told me or hinted that SharePoint represents a good market. In fact, one vendor told me that the SharePoint market is stronger since Microsoft rolled out free SharePoint, enhanced MOSS, and bought the complex Fast Search & Transfer Enterprise Search System. In one engagement, the vendor was hired quickly, replacing the incumbent Microsoft system with minimal red tape. In the enterprise search sector, where user annoyances are commonplace, Microsoft is not yet paying people to use its search system, but Microsoft may have to take more aggressive steps to keep third party vendors out of the SharePoint members-only club.

To sum up, the Web search game and the enterprise search game are quite different. Microsoft will have to find a way to leapfrog in both markets. That will take some doing. I am excited to learn what Redmond will do on both search fronts. Google, of course, has a more integrated approach to search, which I think may present both technical and cost challenges to Microsoft.

Stephen Arnold, October 1, 2008

An Exceptional Rumor: MSFT to Buy Yahoo AOL Combo

September 26, 2008

I saw this post on Venture Beat here. Then I saw a follow on story in Peter Kafka’s write up for Silicon Alley Insider here. I am delighted to point out that these write ups do not a done deal make. I find the notion fascinating, and I hope it comes to pass. Google will probably buy another dinosaur skeleton, reinstate day care, and design more lavish housing for the NASA Moffett Field Google Housing Units to celebrate. Please, read these two posts. The plan, as I understand this speculation, is that Yahoo gobbles up the wheezing AOL. I presume Yahoo will be able to work its technical magic on AOL’s infrastructure just as it did Delicious.com’s. Yahoo took two years to rewrite Delicious.com’s code, thus allowing other social sites and bookmarking services to flourish. Once the dust settles from that MBA fueled explosion, the Bain consultants will shape the package so that Microsoft can swoop in and snap up two hot properties, solve its search and portal problems, and catch up with Googzilla and chop off its tail.

When I worked at Booz, Allen & Hamilton, we called the Bain consultants Bainies. I can’t recall if we used this as a term of affection or derision. I like Bain and the work it did for Guinness just about 20 years ago. You can refresh your memory of that project here.

Let’s walk through the search and content processing implications of this hypothetical deal. I promise that I will not comment about SharePoint search, Live.com’s search, Outlook search, SQL Server search, Powerset search, or Fast Search & Transfer search.

  1. AOL has search plus some special sauce. At one time Fast Search & Transfer was laboring in the AOL vineyards. Teragram, prior to its acquisition by SAS, was also a vendor. Two vendors are enough for Yahoo to rationalize. Heck, Yahoo is relying on Fast Search technology for its AllTheWeb.com service, last I heard. The Teragram technology might be a stretch, but the Yahoo technical team will be up to the challenge. The notion of becoming part of Microsoft will put a fire in the engineers’ bellies.
  2. AOL has its portal services. Granted these overlap with Yahoo’s. There’s the issue of AOL mail, AOL messenger, and AOL’s ad deals with various third parties. Google may still have a claw in the AOL operation as well. I haven’t followed Google’s tie up with AOL since word came to me that Google thought it made a bad decision when it pumped a billion into the company.
  3. AOL has a cracker jack customer service operation. Yahoo has a pretty interesting customer service operation as well. I am not sure how one might merge the two units and bring both of them under the Yahoo natural language search system, which doesn’t seem to know how to provide guidance when I want to cancel one of my very few Yahoo for fee services. Give this a try on your own and let me know how you navigate the system.

I am delighted that I don’t have to figure out how to mesh Yahoo and AOL and then integrate the Yahoo AOL entity with Microsoft. Overlapping services are trivial for these three firms’ engineers. No big deal. If the fix is to operate each much as they now are, I anticipate some cost control problems. Economies of scale are tough to achieve operating three separate systems and their overlapping features.

I think that when I read the stories in my newsreader on Monday, September 29, 2008, I will know more about this rumor. I am still struggling with how disparate systems and the number of search systems can be made to work better, faster, and cheaper. Maybe the owner of the Yahoo AOL property will outsource search to Google. Google is relatively homogeneous, and it works pretty well for quite a few Web users, Web advertisers, and Web watchers. Watch this Web log for clarification of this rumor. For now, the word that comes to mind is a Vista “wow”.

Stephen Arnold, September 26, 2008

Virtual Servers: It Is Recrawl and Reindex Time

September 22, 2008

The malarkey about virtualization has many information technology professionals courting chimeras. Some virtualization is good. For example, we have a couple of quad core, four gigabyte servers that are four to five times faster on our benchmark tests than the aged NetFinity 5500s we retired. The new servers have the moxie to run virtualization software. No problems so far. In fact, chopping boxes into separate virtual servers makes sense and is tame compared to some of the technologies that arrive at our office door.

Virtual storage, however, is another kettle of fish. Our experience has been that complex directory structures such as those spawned by SharePoint and certain enterprise applications are complicated. When these complex structures are mixed with virtual storage, we have encountered some excitement. We test software, so our trashed files provide us with useful data, not long weekends and sleepless nights.

InfoWorld on September 19, 2008, here called attention to some of the issues virtual storage drags along with the snappy marketing messages and rah rahs for cheaper administration. “Virtual Server Backups Prone to Failure, Survey Finds” makes clear that virtual solutions are not without some problems. The InfoWorld write up reports on a survey that asserts more than half the virtual server backups don’t restore. The article has some other data but I want to focus only on the backups not restoring.

Here’s the problem. Search is a storage intensive application. The indexes can be big. If an index doesn’t start out big, in a matter of months the index gets big. Logs get big. When a search or content processing system crashes or an index update corrupts the master index, an administrator turns to the backup system. If the search system is using a whizzy new virtual storage system, the backup may not restore. The problem is that rebuilding the index is not always a five minute or even a five hour job.
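To put some rough numbers behind that claim, here is a back of the envelope sketch in Python. The document counts and throughput figures are hypothetical, not measurements from any vendor’s system, so treat the output as an order of magnitude guess only.

```python
# Rough estimate of how long a full recrawl and reindex takes.
# All numbers below are hypothetical; plug in your own measurements.

def reindex_hours(documents, docs_per_second, enrichment_overhead=2.0):
    """Hours to rebuild an index from scratch.

    documents: number of items to reprocess
    docs_per_second: sustained indexing throughput of the system
    enrichment_overhead: multiplier for entity extraction, metadata
        generation, and other content processing beyond basic indexing
    """
    seconds = (documents / docs_per_second) * enrichment_overhead
    return seconds / 3600.0

# A modest 10 million document collection at 200 documents per second,
# with heavy content processing, lands near a full day, not five minutes.
print(f"{reindex_hours(10_000_000, 200):.1f} hours")
```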

Recrawling and reindexing can be tricky. Systems that perform significant content processing can crunch for a day, maybe more, generating metadata. Our suggestion is to skip virtual storage for search and content processing systems. Already have one? You may want to devirtualize, and quickly.

Stephen Arnold, September 22, 2008

Autonomy: Quicker than Microsoft Fast Search Yet Again

September 19, 2008

Autonomy continues to outthink Microsoft Fast Search. The nimbleness of Autonomy cannot be overlooked by the Redmond giant. Microsoft has revenues of $65 billion or so. Autonomy weighs in with $400 million or $500 million in revenues. Microsoft spent $1.2 billion for an enterprise search vendor which stunned the content processing world with a Web part to integrate SharePoint (a hugely complex content management system) with Fast ESP (an equally complex content processing system). Now Autonomy rolls out “its information processing technology [that] extends Microsoft Office SharePoint Server (MOSS) to meet customer requirements for scalability, connectivity and conceptual search.” You can read the details of this new Autonomy product here.

Now if I were involved with Microsoft Fast Search, I would be tempted to say, “Those Autonomy folks have hit on a very good idea.” I might even be tempted to suggest that we buy Autonomy just to get the company’s marketing team. When you have billions in the bank and are fighting to out-Google Google, why not buy Autonomy? It makes more sense than trying to weld together Powerset and Ciao.com technology in my opinion.

I don’t know who is running the Microsoft enterprise search operation. There have been too many executive changes and too few substantive announcements to hold this addled goose’s short attention span. What’s clear is that Autonomy is able to pinpoint cracks in the Microsoft Fast Search armor and exploit them. Anyone who has any hands on experience with SharePoint knows that it’s easy to get a finger crushed in SharePoint’s moving parts. So Autonomy asserts:

Autonomy further extends global MOSS scalability through its distributed, brokered architecture and “geo-efficient” design which allows data to be automatically replicated in the most sensible location based on bandwidth, lag time, availability and demand. This enables high performance and gives users aggregated access to all enterprise information in a unified view in globally dispersed environments while reducing bandwidth overhead. Because IDOL creates a stub, or shortcut, to the data and supports tiered storage rather than requiring that data be stored in SQL Server, organizations using IDOL with SharePoint can further benefit from dramatically reduced SQL Server licenses and associated scalability limitations.

Microsoft may want to pay close attention to how Autonomy deftly points out the finger mashing gears and levers in SharePoint. Next Microsoft may want to put safety covers on the more dangerous bits. If SharePoint and Fast Search continue to dog paddle along, Autonomy and maybe other vendors will find the 100 million SharePoint users easy pickings.

A happy quack to Autonomy for this deft marketing move. A goose gift for the Redmond behemoth, which seems unable to organize its parade and get it marching toward the objective of delivering a solution to the scaling problems in SharePoint. Agree? Disagree? Help keep me informed via the Comments function on this Web log.

Stephen Arnold, September 19, 2008

Why Dataspaces Matter

August 30, 2008

My posts have been whipping super-wizards into action. I don’t want to disappoint anyone over the long American “end of summer” holiday. Let’s consider a problem in information retrieval and then answer in a very brief way why dataspaces matter. No, this is not a typographical error.

Set Up

A dataspace is somewhat different from a database. Databases can sit within a dataspace, but a dataspace also encompasses other information objects, garden variety metadata, and new types of metadata which I like to call meta metadata, among other things. These are represented in an index. For our purpose, we don’t have to worry about the type of index. We’re going to look up something in any of the indexes that represent our dataspace. You can learn more about dataspaces in the IDC report #213562, published on August 28, 2008. It’s a for fee write up, and I don’t have a copy. I just contribute; I don’t own these analyses published by blue chip firms.

Now let’s consider an interesting problem. We want to index people, figure out what those people know about, and then generate results to a query such as “Who’s an expert on Google?” If you run this query on Google, you get a list of hits like this.

[Screen shot: Google results for the query “Google expert”]

This is not what I want. I require a list of people who are experts on Google. Does Live.com deliver this type of output? Here’s the same query on the Microsoft system:

[Screen shot: Live.com results for the same query]

Same problem.

Now let’s try the query on Cluuz.com, a system that I have written about a couple of times. Run the query “Jayant Madhavan” and I get this:

[Screen shot: Cluuz.com results for the query “Jayant Madhavan”]

I don’t have an expert result list, but I have a wizard and direct links to people Dr. Madhavan knows. I can make the assumption that some of these people will be experts.

If I work in a company, the firm may have the Tacit system. This commercial vendor makes it possible to search for a person with expertise. I can get some of this functionality in the baked in search system provided with SharePoint. The Microsoft method relies on the number of documents a person known to the system writes on a topic, but that’s better than nothing. I could, if I were working in a certain US government agency, use the MITRE system that delivers a list of experts. The MITRE system is not one whose screen shots I can show, but if you have a friend in a certain government agency, maybe you can take a peek.
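To make the document counting method concrete, here is a minimal sketch in Python of the kind of scoring such a system implies. The corpus, author names, and topic are invented for the example; a production system layers relevance ranking, security trimming, and other signals on top of this crude count.

```python
from collections import Counter

# Hypothetical corpus: (author, document text) pairs.
documents = [
    ("j.smith", "Google Search Appliance deployment notes"),
    ("j.smith", "Scaling the Google index for the intranet"),
    ("a.jones", "Quarterly budget spreadsheet"),
    ("a.jones", "Google Apps rollout plan"),
]

def experts_on(topic, docs):
    """Rank authors by how many of their documents mention the topic.

    This is the crude 'count the documents a person wrote on a subject'
    heuristic; it surfaces prolific writers, not necessarily real experts.
    """
    counts = Counter(
        author for author, text in docs if topic.lower() in text.lower()
    )
    return counts.most_common()

print(experts_on("google", documents))
# [('j.smith', 2), ('a.jones', 1)]
```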

None of these systems really do what I want.

Enter Dataspaces

The idea for a dataspace is to process the available information. Some folks call this transformation, and it really helps to have systems and methods to transform, normalize, parse, tag, and crunch the source information. It also helps to monitor the message traffic for some of that meta metadata goodness. An example of meta metadata is an email. I want to index who received the email, who forwarded the email to whom and when, and any cutting or copying of the information in the email into other documents, along with the people who have access to said information. You get the idea. Meta metadata is where the rubber meets the road in determining what’s important regarding information in a dataspace.
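Here is a minimal sketch, in Python, of what meta metadata for a single email message might look like. The field names and structure are my own invention for illustration, not a published schema from any vendor.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record types; field names are illustrative only.

@dataclass
class ForwardEvent:
    forwarded_by: str
    forwarded_to: List[str]
    timestamp: str           # ISO 8601, e.g. "2008-08-30T14:05:00Z"

@dataclass
class ReuseEvent:
    copied_to_document: str  # where the email text was pasted
    readers: List[str]       # who has access to that document

@dataclass
class EmailMetaMetadata:
    message_id: str
    recipients: List[str]
    forwards: List[ForwardEvent] = field(default_factory=list)
    reuse: List[ReuseEvent] = field(default_factory=list)

# A dataspace index would store these events alongside the message body
# so a query can ask "who touched this information, and where did it go?"
record = EmailMetaMetadata(
    message_id="msg-001",
    recipients=["analyst@example.com"],
    forwards=[ForwardEvent("analyst@example.com", ["manager@example.com"],
                           "2008-08-30T14:05:00Z")],
    reuse=[ReuseEvent("briefing.doc",
                      ["manager@example.com", "legal@example.com"])],
)
print(record.forwards[0].forwarded_to)
```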


The Future of Search Layer Cake

August 14, 2008

Yesterday I contributed a short essay about the future of search. I thought I was being realistic for the readers of AltSearchEngines.com, a darn good Web log in my opinion. I wanted to be more frisky than the contributions from SearchEngineLand.com and Hakia.com too. I’m not an academic, and I’m not in the search engine business. I do competitive technical analysis for a living. Search is a side interest, and prior to my writing the Enterprise Search Report, no one had taken a comprehensive look at a couple dozen of the major vendors. I now have profiles on 52 companies, and I’m adding a new one in the next few days. I don’t pay much attention to the university information retrieval community because I’m not smart enough to figure out the equations any more.

From the number of positive and negative responses that have flowed to me, I know I wasn’t clear about my focus on behind the firewall search and Google’s enterprise activities. This short post is designed to put my “layer cake” image into context. If you want to read the original essay on AltSearchEngines.com, click here. To refresh your memory, here’s the diagram, which in one form or another I have been using in my lectures for more than a decade. I’m a lousy teacher, and I make mistakes. But I have a wealth of hands on experience, and I have the research under my belt from creating and maintaining the 52 profiles of companies that are engaged in commercial search, content processing, and text analytics.

[Diagram: the search “layer cake”]

I’ve been through many search revolutions, and this diagram explains how I perceive those innovations. Furthermore, the diagram makes clear a point that many people do not fully understand until the bills come in the mail. Over time search gets more expensive. A lot more expensive. The reason is that each “layer” is not necessarily a system from a single vendor. The layers show that an organization rarely rips and replaces existing search technology. So, no matter how lousy a system, there will be two or three or maybe a thousand people who love the old system. But there may be one person or 10,000 who want different functionality. The easy path for most organizations is to buy another search solution or buy an “add in” or “add on” that in theory brings the old system closer to the needs of new users or different business needs.


Google Search Appliance: Showing Some Fangs

August 6, 2008

Assorted wizards have hit the replay button for Google’s official description of the Google Search Appliance (GSA).

If you missed the official highlights film, here’s a recap:

  • $30,000 starting price, good for two years, “support” and 500,000 document capacity. The bigger gizmos each can handle 10 million documents. These work like Christmas tree lights. When you need more, just buy more GSAs and plug them in. This is the same type of connectivity “big Google” enjoys when it scales.
  • Group personalization; for example, marketing wizards see brochures-type information and engineers see documents with equations
  • Metadata extraction so you can search by author, department, and other discovered index points.

If you want to jump right into Google’s official description, just click here. You can even watch a video about Universal Search, which is Google’s way of dancing away from the far more significant semantic functionality that will be described in a forthcoming white paper from a big consulting firm. This forthcoming report, alas, costs money, and it even contains my name in very small type as a contributor. Universal Search was the PR flash created for Google’s rushed Searchology conference not long after an investment bank published a detailed report of a far larger technical search initiative (Programmable Search Engine) within the Googleplex. For true Google watchers, you will enjoy Google’s analysis of complexity. The title of the video is a bit of Googley humor because when it comes to enterprise or behind the firewall search, complexity is really not that helpful. Somewhere between 50 and 75 percent of the users of a search system are dissatisfied with the search system. Complexity is one of the “problems” that Google wants to resolve with its GSA.

When you buy the upscale versions of the GSA, you can implement fail over to another GSA. GSAs can be distributed geographically as well. The GSA comes with support for various repositories such as EMC Documentum. This means that the GSA can index the Documentum content without custom coding. The GSAs support the OneBox API, which is an important component in Google’s enterprise strategy. With the GSA, a clever programmer can create Vivisimo-style federated search results, display live data from a Microsoft Exchange server so a “hit” on a person shows that person’s calendar, integrate Web and third-party commercial content with the behind-the-firewall information, and perform other important content processing tasks.
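For readers who want to see what a metadata constrained query against the appliance looks like, here is a rough Python sketch. The appliance host name is invented, and the parameter names reflect my reading of the GSA XML search protocol; check them against the documentation for your appliance’s version before relying on them.

```python
import urllib.parse
import urllib.request

# Hypothetical appliance host; parameter names follow my understanding of
# the GSA XML search protocol and may differ by appliance version.
GSA_HOST = "http://gsa.example.com"

def gsa_search(terms, author=None, department=None):
    """Query the appliance for XML results, optionally filtered on
    extracted metadata such as author or department."""
    required = ".".join(
        f"{name}:{value}"
        for name, value in (("author", author), ("department", department))
        if value
    )
    params = {
        "q": terms,
        "output": "xml_no_dtd",
        "site": "default_collection",
        "client": "default_frontend",
    }
    if required:
        params["requiredfields"] = required
    url = f"{GSA_HOST}/search?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as response:
        return response.read()  # raw XML to parse downstream

# Example: documents about budgets written by a particular author.
# xml = gsa_search("budget forecast", author="smith")
```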

Google happily names some of its larger customers, including Adobe Systems, Kimberly-Clark, and Sunnybrook Health. The company does not, however, mention the deep penetration of the GSA into government agencies, police organizations, and universities.

Good “run the game plan” write ups are available from CNet here, my favorite TechCrunch with Eric Schonfeld’s readable touch here, and the “still hanging in there” eWeek write up here.

[Screen shot: splash page for the Google enterprise videos]

After registering for the enterprise videos, you will see this splash page. You can get more information about the upgrade to Version 5 of the GSA.

My Take

Now, here’s my take on this upgrade:

First, Google is responding to demands for better connectivity, more administrative control, and better security. With each upgrade to the GSA, Google has added features that have been available for a quarter century from outfits like Verity (now part of the Autonomy holdings). The changes are important because Google is often bad mouthed for offering a poor enterprise search solution. With this release, I am not so sure that the negatives competitors heap on these cheerful yellow boxes are warranted. This version of the GSA is better than most of the enterprise search appliances with which I am familiar and a worthy competitor where administrative and engineering resources are scarce.


Microsoft: What Now for Search?

July 24, 2008

Googzilla twitches its tail and Microsoft goes into convulsions. When I was in the management consulting game, my boss, Dr. William Sommers, talked about “hyper-actions”. The idea was that a single event or a minor event would trigger excessive reactions.

[Image: brain scan of a person undergoing excessive “excitement” and “over reaction”]

When I read the flows-like-water prose of Kara Swisher’s “Microsoft’s Latest Web Stumble: Kevin Johnson Out” and then her brief introduction to Mr. Steve Ballmer’s “Full Memo to the Troops about New Reorg”, I thought about Dr. Sommers’s “hyper-action” neologism. In my opinion, we are watching the twitch in Mountain View triggering via management string theory the convulsions in Redmond.

First, let me identify for you the points that jumped from screen to neurons in Ms. Swisher’s write ups.

  1. Ms. Swisher reports that Mr. Kevin Johnson was the architect behind the Yahoo buy out. I thought that the idea was cooked in Mr. Chris Liddell’s lamb-roasting pit. Obviously my sources were off base. Mr. Johnson moves to Juniper and Mr. Liddell continues to get a Microsoft paycheck. Mr. Liddell’s remarks at the March 2008 Morgan Stanley Technology Conference left me with the impression that he was being “systematic” in his analysis. Here’s one take on his remarks.
  2. Ms. Swisher’s run down of Microsoft’s actions so far in 2008 is excellent, and she reminded me that Microsoft bought aQuantive, a fact which had slipped off my radar. What has happened to aQuantive, for which Microsoft paid $6 billion, more than Microsoft paid for Fast Search & Transfer and Powerset combined? Her mentioning aQuantive reminded me of those wealthy car collectors on the Speed Channel’s exotic automobile auctions. What do you do with a $1.2 million Corvette? You put it in a garage. You don’t run down to the Speedway in Harrods Creek, Kentucky, to buy a pack of chewing tobacco.
  3. Ms. Swisher turns a great phrase; specifically, “Microsoft has succeeded in burnishing its image as a Web also-ran and still has an uncertain path to change that.” I quite like the notion that a large company takes one action and succeeds in producing an opposite reaction. I think the Google folks would peg that as one of the Laws of Google Dynamics applied to Microsoft. For every action, there is a greater, opposite reaction that persists through time. (Ms. Swisher’s statement that Yahoo looks stable brought a smile to my face as well.)

Next, let me comment on the Mr. Steve Ballmer reorg memo, which will be a classic in business schools for years to come. The opening line will probably read, “Mr. Steve Ballmer, firmly in control of Microsoft, sat at his desk and looked across the Microsoft campus. He knew a bold strategic action was needed to deal with the increasing threat of Google, etc. etc.”

After the razzle dazzle about goals, the memo gets down to business:

We will out-innovate Google in key areas—we’re already seeing this in our maps and news search. Third, we are going to reinvent the search category through user experience and business model innovation. We’ll introduce new approaches that move beyond a white page with 10 blue links to provide customers with a customized view of their world. This is a long-term battle for our company—and it’s one we’ll continue to fight with persistence and tenacity.


Enterprise Search: It’s Easy but Work Is Never Done

July 17, 2008

The Burton Group caught my attention with its report describing Microsoft a couple of years ago as a superplatform. I liked the term, but the report struck me as overly enthusiastic in favor of Microsoft’s server products.

I was surprised when I saw part one of Margie Semilof’s interview with two Burton Group consultants, Guy Creese and Larry Cannell. These folks were described as experts in content management, a discipline with a somewhat checkered history in the pantheon of enterprise software applications. You can read the first part of the interview here. The interview carries a July 15, 2008, date, and I am capturing my personal thoughts on July 16, 2008. That’s my mode of operation, a euro short and a day late. Also, I am not enthusiastic about CMS experts making the jump to enterprise search expertise. The leap can be made, but it’s like jumping from the frying pan into the fire.

The interview contains a rich vein of intellectual gold or what appears to me to be sort of gold. I jotted down two points made by the Burton experts, and I wanted to offer some color around selected points. When you read the interview, your conclusions and takeaways will probably differ from mine. I am an opinionated goose, so if that bothers you, quit reading now.

Let me address two points.

First, this question and answer surprised me:

Question: How much development work is required with search technology?

Answer by Guy Creese, Burton Group expert in content management: It’s pretty easy… Usually a company is up and running and can see most of its documents without trouble.

Yikes. Enterprise search dissatisfies anywhere from half to two thirds of a system’s users. Enterprise search systems are among the most troublesome enterprise applications to set up, optimize, and maintain. Even the Google Search Appliance, one of the most toaster-like search solutions, takes some effort to get into fighting shape. Customization requires expertise with the OneBox API. “Seeing documents” and finding information are two quite different functions in my experience.

Second, this question and answer ran counter to the research I conducted for the first three editions of Enterprise Search Report (2004-2006) and my most recent study Beyond Search (2008).

Question: Search technology has some care and feeding involved. How do companies organize the various tasks?

Answer by Guy Creese, Burton Group expert in content management: This is not onerous. Companies don’t have huge armies [to do this work], but someone has to know the formats, whether to index, how quickly they refresh. If no one worries about this, then search becomes less effective. So beyond the eye candy, you have to know how to maintain and adjust your search.

“Not onerous” runs counter to the data I have gathered in surveys and focus groups. “Formats” invoke transformation. Transformation can be difficult and expensive. Hooking search into work processes requires analysis and then customization of search functions. Search that processes content in content management systems often requires specialized set up, particularly when the search system indexes duplicate or versioned documents. Rich text processing, a highly desirable function, can wander off the beaten path unless customization and tuning are performed.
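As one small example of why the care and feeding is not trivial, here is a hedged Python sketch of the duplicate detection step that content pulled from a CMS often needs before indexing. Real deployments typically use fuzzier, shingle based matching; this exact hash version is just to show why versioned documents complicate the job.

```python
import hashlib

def content_fingerprint(text):
    """Hash the normalized body text so trivially different copies
    (extra whitespace, case changes) collapse to the same fingerprint."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

def deduplicate(documents):
    """Keep one representative per fingerprint.

    documents: iterable of (doc_id, text) pairs pulled from the CMS.
    Returns the doc_ids to send to the indexer. A versioned document
    that differs by even one word still passes through, which is
    exactly why real systems need fuzzier matching than this.
    """
    seen = {}
    for doc_id, text in documents:
        seen.setdefault(content_fingerprint(text), doc_id)
    return list(seen.values())

docs = [
    ("v1", "Enterprise search deployment plan"),
    ("v1-copy", "Enterprise  search deployment plan "),   # exact duplicate
    ("v2", "Enterprise search deployment plan, revised"), # a new version
]
print(deduplicate(docs))   # ['v1', 'v2']
```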

Observations

There are a handful of people who have a solid understanding of enterprise search. Miles Kehoe, one of the Verity wizards, is the subject of a Search Wizards Speak interview that will be published on ArnoldIT.com on July 21, 2008. His company, New Idea Engineering, has considerable expertise in search, and you can read his views on what must be done to ensure a satisfactory deployment. Another expert is my son, Erik Arnold, whose company, Adhere Solutions, specializes in customizing and integrating the Google Search Appliance into enterprise environments. To my knowledge, neither Mr. Kehoe nor Mr. Arnold characterizes search as a “pretty easy” task. In fact, I can’t recall anyone in my circle of professional acquaintances describing enterprise search as “pretty easy.”

Second, I am concerned that content management systems are expanding into applications and functions that are not germane to these systems’ capabilities. For example, CMS needs search. Interwoven has struck a deal with Vivisimo to provide search that “just works” to Interwoven customers. Vivisimo has worked hard to create a seamless experience, but, based on my sources, the initial work was not “pretty easy”. In fact, Interwoven had a mixed track record in delivering search before hooking up with Vivisimo. But CMS vendors are also asserting that their systems are social. Well, CMS allows different people to index a document. I think that’s a social and collaborative function. But social software to me suggests Digg, Twitter, and Mahalo type functionality. Implementing these technologies in a Broadvision (if it is still paddling upstream) or Vignette might take some doing.

Third, SharePoint (a favorite of Burton if I recall the superplatform document) is a polymorphic software system. Once it was a CMS. Now it is a collaboration platform just like Exchange. I think these are marketing words slapped on servers which are positioned to make sales, not solve problems. SharePoint includes a search function, which is improving. But deploying a robust search system within SharePoint is hard in my experience. I prefer using third-party software from companies such as ISYS Search Software. ISYS, along with Coveo, offers systems that are indeed much easier to deploy, configure, and maintain than SharePoint. But planning and experience with SharePoint are necessary.

I look forward to the second part of this interesting interview with CMS experts about enterprise search. Agree? Disagree? Quack back.

Stephen Arnold, July 17, 2008
