Summer Search Rumor Round Up

July 26, 2010

The addled goose has been preoccupied with some new projects. In the course of running around and honking, he has heard some rumors. The goose wants to be clear. He is not sure if these rumors are 100 percent rock solid. He does want to capture them before the mushy information slips away:

image

Source: http://oneyearbibleimages.com/rumors.gif

First, the goose heard that there will be some turnover at Microsoft Fast. The author of some of the posts in the Microsoft Enterprise Search Blog may be leaving for greener pastures. You can check out the blog at this link. What does this tell the goose? More flip flopping at Microsoft? Not sure. Any outfit that pays $1.2 billion for software that comes with its own police investigation is probably an outfit that would scare the addled goose to death. The blog is updated irregularly with such write ups as “Crawling Case Sensitive Repositories Using SharePoint Server 2010” and “SharePoint 2010 Search ‘Dogfood’ Part 3 – Query Performance Optimization.” Ah, the new problem of upper and lower case and the ever present dog food regarding performance. I thought Windows’ most recent software ran as fast as a jack rabbit. Guess not.

Second, a number of traditional search vendors are poking around for semantic technology. The notion that key words don’t work particularly well seems to be gaining traction. The problem is that some of the high profile outfits have been snapped up. For example, Powerset fell into the Microsoft maw and Radar Networks was gobbled by Paul Allen’s love child, Evri. Now the stampede is on. The problem is that the pickings seem to be slim, a bit like the T shirts after a sale at the Wal-Mart up the road from the goose pond here in Harrod’s Creek. For some lucky semantic startups, Christmas could come early this year. Anyone hear a sound like “hack, hack”? Oh, that must be short for Hakia. You never know.

Third, performance may have forced a change at HMV.co.uk in merrie olde England. Dieselpoint was the incumbent. I heard that Dieselpoint is on the look out for partners and investors. The addled goose tried to interview the founder of the company but a clever PR person sidelined the goose and shunted him to the drainage ditch that runs through Blue Island, Illinois. Will Dieselpoint land the big bucks as Palantir did?

Fourth, the goose heard that a trio of Microsoft certified partners with snap in SharePoint search components were looking for greener pastures. What seems to be happening is that the easy sales have dried up since Microsoft started its current round of partner cheerleading. The words are there, but the sales are not. Microsoft seems to want the money to flow to itself and not its partners. Who is affected? The goose cannot name names without invoking the wrath of Redmond and a pride of PR people who insist that their clients are knocking the socks off the competition. However, does the enterprise need a half dozen companies pitching metatagging to SharePoint licensees? I think not. If sales don’t pick up, the search engine death watch list will pick up a few new entries before the leaves fall. Vendors in the US, Denmark, Germany, Austria, and Canada are likely to be watching Beyond Search’s death watch list. Remember Convera? It spawned Search Technologies. Remember the pre Microsoft Fast? It spawned Comperio. When a search engine goes away, the azurini flower.

Fifth, what’s happened to the Oracle killers? I lost track of Speed of Mind years ago. There was a start up with a whiz bang method of indexing databases. I haven’t heard much about killing Oracle lately. In fact, stodgy old Oracle is once again poking around for search and content processing technology according to one highly unreliable source. With SES11g now available to Oracle database administrators, perhaps the time is right to put some wood behind a 21st century search solution.

If you want to complain about one of these rumors, use the comments section of this blog. Alternatively, contact one of the azurini outfits and get “real” verification. Some of their consultants use this blog as training material for the consultants whom you compensate. No rumor this. Fact.

Stephen E Arnold, July 26, 2010

Freebie

Index Engines Polishes Platform

July 23, 2010

Index Engines recently announced enhancements to its 3.2 platform that allow the system to index multiple streams of data from backup tapes. With these changes, a significantly larger amount of tape data can be processed in tight time frames.

Up to six streams of data can be processed now at a speed of one terabyte per hour. The process can save a company millions in storage costs and the stockpiling of these tapes can be a liability according to a company spokesman. We did not test the new system, so you may want to run some benchmarks on your own before whipping out your American Express Platinum card.
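For a rough sense of what the quoted rate means in practice, the arithmetic can be sketched as follows. The announcement as summarized here does not say whether one terabyte per hour is the total across six streams or a per-stream figure, so the sketch computes both readings. The 50 terabyte backlog and the function name `hours_to_index` are made up for illustration; this is not a benchmark of the Index Engines system.

```python
def hours_to_index(backlog_tb: float, tb_per_hour: float, streams: int = 1) -> float:
    """Hours needed to index a tape backlog at a given rate per stream."""
    if tb_per_hour <= 0 or streams <= 0:
        raise ValueError("rate and stream count must be positive")
    return backlog_tb / (tb_per_hour * streams)


BACKLOG_TB = 50.0  # hypothetical backlog, not a figure from the announcement

# Reading 1: one terabyte per hour is the total across all streams.
aggregate = hours_to_index(BACKLOG_TB, 1.0)
# Reading 2: each of six streams runs at one terabyte per hour.
per_stream = hours_to_index(BACKLOG_TB, 1.0, streams=6)

print(f"aggregate reading:  {aggregate:.1f} hours")
print(f"per-stream reading: {per_stream:.1f} hours")
```

The gap between the two readings is a factor of six, which is one more reason to run your own benchmarks.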

This is, according to Index Engines, the only product of its type on the market that directly indexes stored data. Index Engines is involved with enterprise discovery solutions. The company was founded in 2003, and its mission is to organize enterprise data assets, making them immediately accessible, searchable and easy to manage.

Rob Starr, July 23, 2010

Vivisimo Chases Call Center Sales

July 22, 2010

One of the most frustrating things for a call center agent is not having the information that a customer needs right at hand. Every business knows it can lose customers when agents fumble through applications looking for answers, and few businesses have the resources to keep this kind of information constantly updated.

Sometimes the solutions come from unlikely sources. Vivisimo started by supplying applications for the military and academia but is now tackling the more practical problems that call centers face with Velocity. Here’s a real company on the move, and it swears by this new information platform, which it says optimizes fragmented information with an easy to use interface.

Vivisimo’s history begins with an on-the-fly clustering function, veers into Web indexing, jumps to enterprise search, embraces integration, and now flirts with call center search. Agility or chasing revenue? The goslings and I are not sure.

Now is most definitely the time for some of the world’s best companies to apply their knowledge to practical economic solutions.

Vivisimo may have to show some Autonomy-style innovation to make a quantum leap in revenue in my opinion.

Stephen E Arnold, July 22, 2010

A Factoid from Dell Computer

July 13, 2010

“Dell: 90% of Data Is Never Read Again” appeared on PC Pro, a UK Web site. The article presented data from Dell Computer that asserted “90% of company data is written once and never read again.” The write up contains some azure chip stuff; for example:

It’s an odd statistic. How is that data measured? 90% of all documents? 90% of stored bytes? When they said “ever again” did they mean explicitly retrieved by name, or should we include free text searches in that statistic? How long an interval needs to pass before some piece of data is clearly identified as belonging to the 90%, so that steps can be taken to reflect its reduced importance?
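The quoted question about documents versus bytes matters more than it may seem. A minimal sketch, using invented numbers rather than anything from Dell or PC Pro, shows how the same set of files can yield very different "never read again" percentages depending on what is counted. The function name and the sample data are assumptions for illustration only.

```python
def never_read_share(files):
    """Return (share by file count, share by bytes) of files with zero reads.

    `files` is a non-empty list of (size_in_bytes, reads_since_written) pairs.
    """
    if not files:
        raise ValueError("need at least one file")
    total_bytes = sum(size for size, _ in files)
    cold = [(size, reads) for size, reads in files if reads == 0]
    by_count = len(cold) / len(files)
    by_bytes = sum(size for size, _ in cold) / total_bytes
    return by_count, by_bytes


# Invented data: nine small files nobody opens again, one large file in regular use.
files = [(1_000, 0)] * 9 + [(91_000, 5)]
by_count, by_bytes = never_read_share(files)
print(f"{by_count:.0%} of files, {by_bytes:.0%} of bytes never read again")
```

With this toy data the "90 percent" holds by file count but shrinks to single digits by bytes, so the headline number means little until the unit is stated.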

Anyone hear about offline storage, near line storage, and online storage? Certainly not at Dell, an outfit trying to boost its storage revenues and its knowledge of what companies do with their data.

One of the challenges of enterprise search is to index information and deliver relevant results. Popularity based systems—like the method used in the original Google Search Appliance—don’t work in organizations. Google figured this out and adapted its system. Specialized vendors, including Index Engines, built their business around the fact that once data are archived no one knows what’s there or how to find it.

Modern search and content processing systems are tough to configure for many reasons. One of them is the fact that most information tucked on an organization’s computers is lost. Only a handful of systems deliver what an employee needs to make a business decision. That information is usually relatively recent data. The write up descends into the weeds of which storage systems are going to ring the journalists’ and consultants’ chimes.

The topic I wanted to see addressed was ignored: search, indexing cycles, relevance, and other trivial questions. Buying hardware is more important I suppose.

Stephen E Arnold, July 13, 2010

Freebie

Sophia Search Lands Venture Funding

July 9, 2010

Wisdom is a good name for a search and content processing system. If you live in rural Kentucky, the Greek becomes “Sophia”, which denotes wisdom. (Gentle reader, “wisdom” is not highly prized in Harrod’s Creek.)

The news that Sophia Search (founded in 2007) landed $1.2 million in seed money reached me via Marketwire. The investors include Volcano, based in Belfast, and Javelin Ventures in London. The story’s title was effective in arresting my attention: “Sophia Search Secures Largest Angel Investment in Northern Ireland to Address Global Demand for Next-Generation Enterprise Search and Discovery.” The news item said:

Sophia’s technology is purpose built on the company’s unique, patented, Contextual Discovery Engine (CDE) based on the linguistical model of Semiotics, the science behind how humans understand the meaning of information in context. The CDE platform automatically detects relationships and themes in unstructured content to enable organizations to seamlessly search, extract, deduplicate and eliminate redundancy of content to minimize risk and reduce the cost of retrieving, storing and managing enterprise information.

The news story revealed that Sophia is built on a patented, next-generation search engine platform. The system can “automatically discover relationships and themes in unstructured content.”

The company, according to my notes, is a spin out from University of Ulster and Saint Petersburg State University. Sophia Search was one of the companies recognized by the PricewaterhouseCoopers entrepreneur competition. (Keep in mind that I do work for the outfit that helps PricewaterhouseCoopers conduct these entrepreneur competitions.)

A quick trip to our Overflight system yielded some useful nuggets about this company. The Sophia Search white paper, dated January 2009, pointed out that the method is “fundamentally different to [sic] any other search tool.” The white paper continued:

These tools are based on ideas & principles drawn from disciplines such as Signal Processing or Mathematics. These ideas are ‘borrowed’ from these disciplines and applied to text retrieval to provide search. In Sophia we believe that in order to retrieve useful information for users we must first understand its meaning and as such we build Sophia upon the recognised linguistical model of Semiotics.

The system “understands” the context in which a word or phrase is used. The white paper said: “In order to understand the meaning of a word it must be taken within the context of other words around it.” We agree. Key word indexing is one reason why most search systems drive users to distraction.
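The general idea of reading a word against its neighbors can be shown with a toy sketch. To be clear, this is not Sophia’s patented Contextual Discovery Engine, whose internals the white paper does not spell out. It is a plain co-occurrence count, and the function name, the window size, and the sample sentences are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def context_profiles(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it."""
    profiles = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the word itself
                    profiles[word][tokens[j]] += 1
    return profiles


# Two made-up sentences in which "bank" carries different senses.
sentences = [
    "the river bank flooded after the storm",
    "the bank raised interest rates",
]
profiles = context_profiles(sentences)
print(profiles["bank"])
```

A key word index sees the same string “bank” in both sentences. The neighbor counts (“river” and “flooded” versus “raised” and “interest”) are the sort of raw signal a context-aware system could use to tell the two senses apart.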

The white paper introduces the idea of “intertextuality”. Here’s what the Sophia white paper says:

All texts are rehashes of previously existing ones and in order to understand them properly they must be read within the context of all information available that is related to them.

Many search engines remain ignorant of what has been previously processed. Google’s programmable search engine includes a context server which addresses this problem in the context of Ramanathan Guha’s method. But Google does not as far as I know offer its context server technology to third parties. Sophia’s engineers are heading down an interesting path in my opinion.

The system processes content, picks out key themes, and then clusters the pointers into “themes”. The idea is that a search returns content which is “topically similar”. According to the write up in the University of Ulster’s U2B newsletter (Winter 2007), Dr. David Patterson, one of the founders of the company, revealed:

Sophia just doesn’t find relevant information for customers, it also empowers them with an understanding of the meaning of the information returned. Using conventional search is akin to using a torch in a dark room (the torch represents the search engine and the room, an organisation’s information). Only the parts of the room that have the beam of light focussed on them can be seen at any one time, with limited understanding of the information in view. Using SOPHIA is like flicking the switch for a bright ceiling light. The whole room can be seen and all information understood at once.

If you are into technical papers, you can get a feel for the system’s method in “Sophia: An Interactive Cluster-Based Retrieval System for the OHSUMED Collection,” published in 2005.

With some search systems fading, new entrants often find eager audiences. Will Sophia become a break out solution? We wish the Sophia team the best.

Stephen E Arnold, July 9, 2010

Freebie

Fwix and the Local Web

July 7, 2010

I read a tweet about some load balancing problems at Fwix.com today. Before the craziness of my whirlwind trip to Spain and the 4th of July weekend, I received a link to a story in EarthTimes. “Fwix Begins Indexing the Local Web” reported:

SAN FRANCISCO, CA — 06/03/10 — Fwix, the local information company, today launched the first-ever hyper local and local information index. Fwix is the fastest-growing local news network, as well as a top-rated iPhone and iPad news app for more than two hundred markets in the English-speaking world. With today’s launch, Fwix expands its content to include not only real-time local and hyper local news, but crime information, real estate, tweets, check-ins, deals and more.

The service operates in near real time. The content consists of blog content and other news from various sources. In June 2010 PaidContent.org said that Fwix was moving beyond news. For me, the key passage in the write up “Fwix Moves Beyond News, Indexing Everything Local” was:

In addition to Twitter and Foursquare, Fwix’s local index will come from a variety of sites, such as Flickr, Yelp, Gowalla, Trulia, Groupon, Citysearch, Oodle and other sources, including local governments and police blotters with information tied to 30,000-plus neighborhoods.

What’s interesting is that one of the founders worked at Xoom.com, where the chief gosling labored along with the chief goose’s partner, Chris Kitze.

In a browser, the service sniffs the user’s location and displays news for that city. You can select from more than 120 cities in the US and a growing number of cities in Canada and the UK. The service is expanding to Australia and New Zealand as well.

image

The ads on Fwix on July 6, 2010, were provided by Oneriot. Ads were not obtrusive.

The basic Louisville splash page provides a search box, ads, and hot links to recent stories, top stories, the weather, and stories by category. A user can register. When we tried to sign up today, we received a “server unavailable” message.

The service is available on the iPad. If you are running around a city with your iPad connected to the network, the service provides a swipeable display. Here’s what that listing in the App Store shows for the interface:

fwix photo

The service has more than 130 US cities and some major cities in the UK, Australia, New Zealand, and Ireland. Canada listings include the major metro areas.

The company has raised more than $6.0 million in two rounds of financing, and the New York Times has signed on to syndicate some of its content via Fwix.

Worth a look.

Stephen E Arnold, July 7, 2010

Freebie

Google from Won to Can Win

July 2, 2010

Nope, I am not writing about Google and China. Tired of that about face, iterative approach to governments with armies, secret agencies, and weaponized bureaucracies. Silly.

I want to point to the article “From Search to Share: How Google Can Win in the Social Age” which highlights the change in Google’s fortunes. It is heretical to suggest that a company that kicked Viacom to the curb, trampled hapless Yahoo, and drove Microsoft into a state of apoplexy could fail. But failure is the point of “From Search to Share.” Google dominates search. Google is not repeating its previous success with social content. The article from Smart Data Collective said:

But Google’s revenue model is under threat in the age of Social Media. Facebook recently surpassed Google in the US to become the most visited website… But this strategy of indexing web content and displaying most relevant search results will no longer work in Social Age as users rely more on Social Media channels and less on search engines to get relevant information. Google should shift its strategy from Search to Share, to continue its dominance in the Social Age as people are more likely to pay attention to content from those they trust rather than ones suggested by search engines.

When I completed work on Google Version 2.0, there were early signals of some diffusion. Google was moving slowly with some of the innovations I summarized in that monograph. The company’s context server is one example. In 2007, Facebook was growing but certainly was not a big deal in comparison with the Google revenues.

Today, Google is a company described with the phrase “How Google Can Win.” I must admit I have been surprised that Google seems to have caught Microsoftitus. I don’t think it is fatal, but Facebook at this time seems to be like one of those healthy, happy people carting bodies from central London during the Black Death.

Stephen E Arnold, July 2, 2010

Freebie

EMC Beefs Up Its Content Processing

June 27, 2010

Data collection agency EMC, http://www.emc.com, has moved to build a platform for expanding business in the future, thanks to a recent partnership inked with low-profile legal discovery company Applied Discovery. Rumor has it that EMC learned about search via a marriage and divorce with the Fast Search & Transfer technology. The most recent move is to create a comprehensive service by blending SourceOne eDiscovery-Kazeon with the case discovery review power of Applied Discovery’s process and review engine. EMC started out as a large storage vendor, and it bought Kazeon. Will the result be a complete solution for indexing and searching large data stores? EMC hopes this is the findability fix.

Patrick Roland, June 27, 2010

Freebie

Search Vendors Try New Sales Hooks

June 25, 2010

Forget the surveys that companies run to make clear the problems in information access. Anyone who looks for information today knows that pinpointing information to answer a business question is not exactly bulletproof. Recommind, once a vendor anchored in the legal market, stretched its wings into the enterprise. My recollection is that some of the company’s technology reminded me of Autonomy’s original approach. Now Recommind seems to be pushing into a different space, one that combines indexing, risk management, some MBA speak, and a dash of legal lingo. Navigate to “Disconnect Between Legal and IT Getting Worse, Recommind Survey Reveals.”

In my experience, information technology organizations are definitely disconnected from most of the corporate functions. I don’t think IT is at fault. IT departments are trying to protect themselves from what I call “requests from the clueless.” I know business managers are under pressure. CFOs are wild eyed in their efforts to cut costs and maximize returns. The top executives are scrambling to find ways to buy their private island, get a new BMW, and create a life without BP scale risks, bloggers, and 20somethings who want to make their bones on the corpses of today’s market leaders. Many managers see a demo or chat with pals at the country club and come to the office on Monday with requests that are essentially impossible for an IT department to meet with available resources.

What does the Recommind survey purport to tell me? IT and legal eagles are operating on different wave lengths. I need a survey to tell me this? I don’t even operate on the same wave length as my two attorneys, and I pay these guys to try to help me. For me, here’s a quote that reveals more about client management and vendors than about IT departments:

At a time when e-Discovery and regulatory issues are gaining momentum, these results don’t exactly instill confidence across the enterprise.

Here’s my view of the situation:

  1. Certain vendors of search technology have to find a way to make sales to keep the money pipe full. The options are market like the devil or go to Satan’s spawn and get more funding. Which path would you take? I vote for marketing. I think these types of surveys are marketing efforts and when the results are released, I know the data are viewed by the survey sponsor as a way to generate sales leads.
  2. Obviously plain vanilla search is not a hot ticket. I think I was one of the first people to explain that search was dead in my Searcher article for Barbara Quint four or five years ago. No search vendor is going to bridge the gap between IT and the many over stressed units in an organization. Successful vendors find ways to solve problems, not tackle the management tensions that are human centric organizational issues.
  3. The new lingo does not convince me that content processing software can address deeper issues with management and governance.

You may have a different view, so read the survey results. Many search vendors have marketed themselves into a corner. Now organizations have to find solutions to information access problems. I don’t think there is much margin for error. Sure, some assert the economy is improving. That’s wonderful. But the glory days of search marketing are behind us, and I think more than catch phrases, house surveys, sponsored white papers, and fawning azure chip consultants will be needed.

Here’s my checklist for starters:

  1. Demonstrations that solve a problem
  2. Clear statements of what a findability-centric software system can and cannot do
  3. Avoidance of MBA crazy talk, jargon, unsupported assertions, and faux case analyses
  4. Partnerships that give a prospect confidence that the system can be made to work at a reasonable cost in a reasonable period of time
  5. Focus on solutions. Search and content processing vendors are not blue chip management consultants, never will be and probably cannot afford the ministrations of Bain, Booz, Boston Consulting, or McKinsey and, therefore, have little first hand information of what is required to tackle management challenges in an organization.

Many search vendors are scrambling for a new sales hook. What approach will work? No clue have I.

Stephen E Arnold, June 25, 2010

Freebie

Facebook Goes for Graph Curated Web Search

June 25, 2010

I whipped up the phrase to explain what Facebook is trying to do. Indexing the “Web” is an expensive proposition. Brute force indexing takes time. Google’s 1998 vision was clever, no Kleinbergian pun intended. Now Google continues to brute force index. The company has worked and invested to deal with the flood of new content and the changes to content already indexed. The real time content just adds to Google’s indexing challenge.

Is there a better way? Yep. Let people point out what pages they like and use these as a seed list. Then take a peek at what behaviors users evidence. Toss in some relationship analyses. Shake well. Serve to members. This is the recipe that Facebook hopes will put a needle in the steadily inflating Google search usage.
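The recipe can be sketched in a few lines. This is a guess at the general shape of a like-driven seed list, not Facebook’s implementation, which the write up does not describe. The function name, the friend weighting, the member names, and the URLs are all hypothetical.

```python
from collections import Counter


def seed_list(likes, friends, member, top_n=3, friend_weight=2.0):
    """Rank liked URLs for `member`.

    Every like counts 1.0; a like from one of the member's friends counts
    `friend_weight` instead (the "relationship analysis" step, much simplified).
    """
    circle = friends.get(member, set())
    scores = Counter()
    for user, urls in likes.items():
        weight = friend_weight if user in circle else 1.0
        for url in urls:
            scores[url] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [url for url, _ in ranked[:top_n]]


# Invented members and pages.
likes = {
    "ann": {"example.com/a", "example.com/b"},
    "bob": {"example.com/b", "example.com/c"},
    "cat": {"example.com/c"},
    "dan": {"example.com/a"},
}
friends = {"ann": {"bob", "cat"}}

print(seed_list(likes, friends, "ann"))
```

The point of the sketch is the cost structure: the crawler only needs to visit pages that appear in `likes`, and the ranking comes from member behavior rather than from a brute force pass over the whole Web.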

You can get the details in “Facebook Unleashes Open Graph Search Engine, Declares War On Google.” The write up has a dose of buzzwords. My take is easy to express.

First, cheaper. Facebook can index the content that members “like” and then add other content sources based on usage analysis and other number crunching. Google pays to index everything. Facebook indexes what users like and what the math suggests will satisfy user information needs.

Second, advertising. With this approach to curated indexing, Facebook has a story to tell advertisers. Like Apple’s ad strategy, Facebook may be able to demand more dough because of the member behavior angle. Google may not be a Wal-Mart. Think Dollar General if Apple and Facebook can make their approach seduce advertisers.

Third, social. Google was in the running until Orkut got a flat tire. Google has not yet been able to close the gap between it and Facebook. Facebook, regardless of its weaknesses, seems to have figured out the hippy dippy social networking thing. I don’t get it, but other companies don’t either. Facebook has bolted search onto its juggernaut.

Interesting play. Will it work? I know that Microsoft will be rooting for Facebook and hoping its investment in the outfit pays off big time. A wounded Google would make some Microsofties happy, very happy.

Stephen E Arnold, June 24, 2010

Freebie.
