February 7, 2011
We learned from one of our readers that Kartoo has turned out its lights. According to Wikipedia, the company shut down after a nine-year run. Kartoo relied on Flash to display search results. Novel? Yes. Useful? For some types of queries, yes.
If you are interested in visual search, you can check out Yometa.com. This is a federating search system which taps results from Bing, Google, and Yahoo. A query for “Stephen E Arnold” returned this display.
Yometa displays the most relevant search results based on a combination of the three search engines’ rankings, as determined by the Yometa algorithm.
The company developed its approach based on research showing that 97 percent of the results from the three search engines (Google, Yahoo, and Bing) are different and that there is only three percent overlap. The visual interface allows users to see the results of Google, Bing, and Yahoo individually or in any combination on one screen. The search results are displayed in a Venn diagram; the results closer to the middle are more relevant.
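The Venn diagram idea can be sketched with plain set arithmetic. This is a minimal illustration, not Yometa's algorithm, which is not public. The function names `venn_regions` and `overlap_share` and the sample URLs are made up for the example.

```python
def venn_regions(results):
    """Map each group of engines to the URLs returned by exactly that group."""
    regions = {}
    all_urls = set().union(*results.values())
    for url in all_urls:
        # The region key is the set of engines that returned this URL.
        key = frozenset(name for name, urls in results.items() if url in urls)
        regions.setdefault(key, set()).add(url)
    return regions


def overlap_share(results):
    """Fraction of distinct URLs that every engine returned."""
    all_urls = set().union(*results.values())
    common = set.intersection(*results.values())
    return len(common) / len(all_urls) if all_urls else 0.0


# Made-up result sets, for illustration only.
sample = {
    "google": {"a.example", "b.example", "c.example"},
    "bing": {"a.example", "d.example"},
    "yahoo": {"a.example", "b.example", "e.example"},
}

regions = venn_regions(sample)
center = regions[frozenset(sample)]  # hits all three engines agree on
share = overlap_share(sample)        # size of the center relative to all hits
```

The center of the diagram is the region keyed by all three engine names. A small `share` value is what the "three percent overlap" claim describes.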
For more information navigate to www.yometa.com/about/ .
Stephen E Arnold, February 7, 2011
August 11, 2010
Yippy, Inc. has good reason to rejoice. In “Yippy Releases Family Friendly Search For Nintendo Wii” (http://www.tmcnet.com/usubmit/2010/07/28/4925824.htm), VP Emily Parker says, “the Yippee Wii search has been optimized for use with Nintendo Wii game controls and features Yippy content-blocking protocols.” The report also tells of a soon-to-be-released Yippee Wii Browser with cloud-based content management platforms.
Let’s not get ahead of ourselves. A family friendly search was the focus of The Point (Top 5% of the Internet), developed by Beyond Search’s Stephen E. Arnold, his son, Erik S. Arnold, and business partner, Chris Kitze in 1993. The Point service was sold to Lycos in 1996, and, alas, Lycos lost its way. Now, a 17-year-old idea is back, proving The Point was right on target almost two decades ago.
Brett Quinn, August 11, 2010
February 14, 2010
Abe Lederman (one of the founders of Verity) alerted me this morning that his company, Deep Web Technology, signed a deal and partnership agreement with SWETS. This Netherlands-based company is one of the world’s leading subscription services. SWETS helps government agencies and companies with subscriptions and related services. The firm has clients in over 160 countries and describes itself as “a long-tail powerhouse.”
Deep Web Technology provides the software and systems that fuel Science.gov, a US government search and retrieval project. Science.gov taps into a wide range of data and information related to science and technology. The invention of the Deep Web method was an outgrowth of Dr. Lederman’s experience in providing a user with access to a broad range of structured and unstructured data. In my various reports on enterprise and special purpose search, I have given Dr. Lederman’s method high marks, and I even let him buy me a taco in a restaurant in Santa Fe, after I finished a lecture at Los Alamos. Dr. Lederman contributed at Los Alamos prior to founding Deep Web as I recall.
The deal brings Dr. Lederman’s federation technology to the SwetsWise Searcher. This service will be powered by Deep Web Technology. SwetsWise is designed to help librarians and their users meet the challenge of searching and finding relevant results from the ever-increasing catalog of content available online. The search system simplifies access to an organization’s diverse and valuable resources, along with the open Web content users are accustomed to searching. SWETS will deliver search results through the Deep Web ranking engine, providing incremental results for fast response times, scalability and flexibility. SwetsWise Searcher performs a rapid parallel search of all available sources or selected sources in real-time, ensuring fresh information and that documents are retrieved the minute they are published into a collection’s database. A simple search box to cover all sources can be integrated into any web page, blog or Intranet homepage.
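The “rapid parallel search” with “incremental results” described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Deep Web Technology's engine. The connector functions `catalog`, `journals`, and `broken` are hypothetical stand-ins for real sources.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def federated_search(query, sources, max_workers=8):
    """Query every source in parallel; yield (source_name, hits) as each finishes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                yield name, future.result()
            except Exception:
                # A failing source returns no hits instead of blocking the rest.
                yield name, []


# Hypothetical connectors, for illustration only.
def catalog(query):
    return [f"catalog record about {query}"]


def journals(query):
    return [f"journal article about {query}", f"review of {query}"]


def broken(query):
    raise RuntimeError("source unavailable")


SOURCES = {"catalog": catalog, "journals": journals, "broken": broken}

# Results arrive in completion order, so a page can render them incrementally.
collected = dict(federated_search("deep web", SOURCES))
```

Yielding results as each source completes is what lets a results page fill in quickly even when one source is slow or down.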
A happy quack to Deep Web Technology. No more tacos in Santa Fe. I want a nuked burrito, a nod to our friends up the road.
Stephen E Arnold, February 14, 2010
No one paid me to write this. I do have a promise of a taco in Santa Fe, which I have just rejected. I will report this to the Food & Drug Administration.
October 13, 2009
I have found the Kartoo.com service useful and innovative. I learned today that the company has rolled out a new interface and links that make it easier to locate the company’s other content processing technology. The new interface provides thumbnails of the top hits. You can explore other results by clicking on the links on the page. The default interface for the query “text mining” appears below:
Other new features include:
- E-reputation tools
- Metasearch functions
- Support for anonymous search
- Support for French, English and Dutch languages.
If you have not explored the Kartoo service, give it a whirl.
Stephen Arnold, October 13, 2009, published because I like the French
October 8, 2009
Vivisimo, http://www.vivisimo.com, a company that works with email archiving, eDiscovery, and information management solutions, just released a revved-up version of the Velocity Enterprise Search Platform, which builds search-centric programs. The platform focuses on extensibility, scalability and performance; Vivisimo is using it to accelerate into OEM and reseller markets. Those programs are designed to add value to existing applications and develop new solutions for sorting information assets. For example, the platform supports searching 1 billion emails on a single server. Vivisimo also says, “With Velocity 7.5, new traceable accuracy metrics can accurately prove and defend that all data has been crawled and identify any documents that were not indexed due to corrupt file types.” This can be a big plus for companies dealing with growing regulation. A happy quack for Vivisimo (tagline: “Search Done Right!”). Any progress that can help enterprise business advance search and make sense of unstructured data is a good thing.
Jessica Bratcher, October 8, 2009
October 1, 2009
I have been using Coveo’s products for years. I remember the first time I fired up the original desktop search program. I found the interface intuitive and the features in line with how I looked for information. I learned from the company yesterday (September 30, 2009) that a new version of the product is now available. I noticed that the company has added several new features to its Enterprise Desktop Search application; for example:
- Search of content on my netbook, my Outlook mail store, and other applications running in my Harrod’s Creek data center.
- A centralized index of all enterprise information, including the formerly risky and elusive, cross-enterprise PC and laptop content, which is useful when I am in a meeting and need a coding gosling to locate a particular item of information that I tucked away without telling anyone its location.
- Enhanced monitoring functions.
After installing the application, you will want to check out the built in connectors, the faceted “point and click” search function, and the support for access from a BlackBerry device. Nifty indeed because RIM’s search function is not too useful in my opinion.
The president and founder Laurent Simoneau told me:
“With our roots dating to the early days of Copernic, a global leader in consumer desktop search, we were committed to build the cross-enterprise capability to index and provide unified access for employees to their desktop content, including their email,” said Coveo CEO and President Laurent Simoneau, who prior to founding Coveo in 2005 was COO of Copernic. “What we’ve done is elevate that access to a higher level, with unified search of not only their individual PCs and laptops, but of contextually relevant knowledge and information residing in any enterprise system, based on IT permissions. In so doing, we’ve placed control over cross-enterprise desktop content indexing, with complete security and access permissions, in the hands of IT.”
The benefits of the new system struck me as reducing the time spent hunting for email. Larger organizations will be able to reduce costs and risks as well.
The Coveo Enterprise Desktop Search application is powered by the Coveo Enterprise Search 6.0 platform, which is scalable from hundreds of thousands to billions of documents, and requires approximately 20 percent of the server footprint of legacy enterprise search solutions. Our tests show that Coveo is one of the more modular and scalable enterprise search solutions. It ranks as one of the easiest to install and configure search solutions we have tested. Worth a look. Fill out the form and give it a spin.
Stephen Arnold, October 1, 2009
September 28, 2009
I thought I made Google’s intent clear in Google Version 2.0. The company provides a user with access to content within the Google index. The inventions reviewed briefly in The Google Legacy and in greater detail in Google Version 2.0 explain that information within the Google data management system can be sliced, diced, remixed, and output as new information objects. The analogy is similar to what an MBA does at Booz, McKinsey, or any other rental firm for semi-wizards. Intakes become high value outputs. I was delighted to read Erick Schonfeld’s “With Google Places, Concerns Rise that Google Just Wants to Link to Its Own Content.” The story makes clear that folks are now beginning to see that Google is a digital Gutenberg and is a different type of information company. Mr. Schonfeld wrote:
The concerns arise, however, back on Google’s main search page, where Google is indexing these Places pages. Since Google controls its own search index, it can push Google Places more prominently if it so desires. There isn’t a heck of a lot of evidence that Google is doing this yet, but the mere fact that Google is indexing these Places pages has the SEO world in a tizzy. And Google is indexing them, despite assurances to the contrary. If you do a search for the Burdick Chocolate Cafe in Boston, for instance, the Google Places page is the sixth result, above results from Yelp, Yahoo Travel, and New York Times Travel. This wouldn’t be so bad if Google wasn’t already linking to itself in the top “one Box” result, which shows a detail from Google Maps. So within the top ten results, two of them link back to Google content.
Directories are variants of vertical search. Google is much more than rich directory listings.
Let me give one example, and you are welcome to snag a copy of my three Google monographs for more examples.
Consider a deal between Google and a mobile telephone company. The users of the mobile telco’s service run a query. The deal makes it possible for the telco to use the content in the Google system. No query goes into the “world beyond Google”. The reason is that Google and the telco gain control over latency, content, and advertising. This makes sense. Let’s assume that this is a deal that Google crafts with an outfit like T Mobile. Remember: this is a hypothetical example. When I use my T Mobile device to get access to the T Mobile Internet service, the content comes from Google with its caches, distributed data centers, and proprietary methods for speeding results to a device. In this example, as a user, I just want fast access to content that is pretty routine; for example, traffic, weather, flight schedules. I don’t do much heavy lifting from my flakey BlackBerry or old person hostile iPhone / iTouch device. Google uses its magical ability to predict, slice, and dice to put what I want in my personal queue so it is ready before I know I need the info. Think “I am feeling doubly lucky”, a “real” patent application by the way. T Mobile wins. The user wins. The Google wins. The stuff not in the Google system loses.
Interesting? I think so. But the system goes well beyond directory listings. I have been writing about Dr. Guha, Simon Tong, Jeff Dean, and the Halevy team for a while. The inventions, systems and methods from this group have revolutionized information access in ways that reach well beyond local directory listings.
The Google has been pecking away for 11 years and I am pleased that some influential journalists / analysts are beginning to see the shape of the world’s first trans national information access company. Google is the digital Gutenberg and well into the process of moving info and data into a hyper state. Google is becoming the Internet. If one is not “in” Google, one may not exist for a certain sector of the Google user community. Googleo ergo sum.
Stephen Arnold, September 28, 2009
September 23, 2009
TechFlash reported an interesting article called “Windows Live Lost $560 Million in FY2009”. With revenues of $520 million, the loss chewed through about $64,000 an hour or roughly $1,065 a minute 24×7 for 365 days. With Microsoft’s revenue in the $58 billion range, a $560 million loss is not such a big deal. In my opinion, profligate spending might work in the short term, but I wonder if the tactic will work over a longer haul on the information highway.
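For readers who want to check the burn-rate arithmetic, here is a quick sketch. It assumes a 365-day year and uses the $560 million loss and $58 billion revenue figures cited above. The variable names are mine.

```python
LOSS_USD = 560_000_000            # reported Windows Live loss
MICROSOFT_REVENUE_USD = 58_000_000_000  # approximate company revenue

HOURS_PER_YEAR = 24 * 365
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60

per_hour = LOSS_USD / HOURS_PER_YEAR        # just under $64,000
per_minute = LOSS_USD / MINUTES_PER_YEAR    # a bit over $1,065
loss_share = LOSS_USD / MICROSOFT_REVENUE_USD  # under one percent of revenue

print(f"${per_hour:,.0f} per hour, ${per_minute:,.0f} per minute, "
      f"{loss_share:.2%} of revenue")
```

The loss comes to just under one percent of the company's revenue, which is why it reads as pocket change at Microsoft's scale.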
Stephen Arnold, September 23, 2009
September 7, 2009
When a company offers multiple software products to perform a similar function, I get confused. For example, I have a difficult time explaining to my 88 year old father the differences among Notepad, WordPad, Microsoft Works’ word processing, Microsoft Word word processing, and the Microsoft Live Writer he watched me use to create this Web log post. I think it is an approach like the one the genius at Ragu spaghetti sauce used to boost sales of that condiment. When my wife sends me to the store to get a jar of Ragu spaghetti sauce, I have to invest many minutes figuring out what the heck is the one I need. Am I the only male who cannot differentiate between Sweet Tomato Basil and Margherita? I think Microsoft has taken a different angle of attack because when I acquired a Toshiba netbook, the machine had installed Notepad, WordPad, and Microsoft Works. I added a version of Office and also the Live Writer blog tool. Some of these were “free” and other products came with my MSDN subscription.
Now the same problem has surfaced with basic search. I read “FAST ESP versus MOSS 2007 / Microsoft Search Server” with interest. Frankly I could not recall if I had read this material before, but quite a bit seemed repetitive. I suppose when trying to explain the differences among word processors, the listener hears a lot of redundant information as well.
The write up begins:
It took me some time but i figured out some differences between Microsoft Search Server / MOSS 2007 and Microsoft FAST ESP. These differences are not coming from Microsoft or the FAST company. But it came to my notice that Microsoft and FAST will announce a complete and correct list with these differences between the two products at the conference in Las Vegas next week. These differences will help me and you to make the right decisions at our customers for implementing search and are based on business requirements.
Ah, what’s different is that this is a preview of the “real” list of differences. Given the fact that the search systems available for SharePoint choke and gasp when the magic number of 50 million documents is reached, I hope that the Fast ESP system can handle the volume of information objects that many organizations have on their systems at this time.
The list in the Bloggix post numbers 14. Three interested me:
- Scalability
- Faceted navigation
- Advanced federation.
First, scalability is an issue with most search systems. Some companies have made significant technical breakthroughs to make adding gizmos painless and reasonably economical. Other companies have made the process expensive, time consuming, and impossible for the average IT manager to perform. I heard about EMC’s purchase of Kazeon. I thought I heard that someone familiar with the matter pointed to problems with the Fast ESP architecture as one challenge for EMC. In order to address the issue, EMC bought Kazeon. I hope the words about “scalability” are backed up with the plumbing required to deliver. Scaling search is a tough problem, and throwing hardware at hot spots is, at best, a very costly dab of Neosporin.
Second, faceted navigation exists within existing MOSS implementations. I think I included screenshots of faceted navigation in the last edition of the Enterprise Search Report I wrote in 2006 and 2007. There was a blue interface and a green interface. Both of these made it possible to slice and dice results by clicking on an “expert” identified by counting the number of documents a person wrote with a certain word in them. There were other facets available as well, although most were more sophisticated than the “expert” function. I hope that the “new” Fast ESP implements a more useful approach for users of Fast ESP. Of course, identifying, tagging, and linking facets across processed content requires appropriate computing resources. That brings us back to scaling, doesn’t it? Sorry.
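The “expert” facet boils down to a count per author. Here is a minimal sketch of that counting approach, not the MOSS implementation. The document fields `author` and `text` and the function name `expert_facet` are assumptions made for the example.

```python
from collections import Counter


def expert_facet(documents, term):
    """Count, per author, the documents whose text contains the term."""
    term = term.lower()
    counts = Counter(
        doc["author"]
        for doc in documents
        if term in doc["text"].lower().split()
    )
    # Authors with the most matching documents come first.
    return counts.most_common()


# Made-up documents, for illustration only.
docs = [
    {"author": "alice", "text": "Scaling search is hard"},
    {"author": "alice", "text": "More notes on scaling"},
    {"author": "bob", "text": "Scaling hardware budgets"},
    {"author": "carol", "text": "Facets for navigation"},
]

experts = expert_facet(docs, "scaling")
```

Even this crude version must touch every document in the result set. That is one reason facets and scaling are hard to separate.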
Third, federation is a buzz word that means many different things because vendors define the term in quite distinctive ways. For example, Vivisimo federates, and it is or was at one time a metasearch system. The query went to different indexing services, brought back the results, deduplicated them, put the results in folders on the fly, and generated a results list. Another type of federation surfaces in the descriptions of business intelligence systems offered by SAP. The system blends structured and unstructured data within the SAP “environment”. Others are floating around as well, including the repository solutions from TeraText which federates disparate content into one XML repository. What I find interesting is that Microsoft is not delivering “federation” which is undefined. Microsoft is, according to the Bloggix post, on the trail of “advanced federation”. What the heck does that mean? The explanation is:
FAST ESP supports advanced federation including sending queries to various web search APIs, mixing results, and shallow navigation. MOSS only supports federation without mixing of results from different sources and navigation components, but showing them separately.
Okay, Vivisimo and SAP style for Fast ESP; basic tagging for MOSS. Hmm.
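The difference between showing sources separately and mixing them can be sketched as follows. This is a simplified illustration of metasearch-style federation, not the Fast ESP or Vivisimo code. The function names, the rank-interleaving rule, and the first-word “folder” label are assumptions made for the example.

```python
def federate_separately(results_by_source):
    """Basic federation: keep each source's hit list in its own block."""
    return {source: list(hits) for source, hits in results_by_source.items()}


def federate_mixed(results_by_source):
    """Mixed federation: interleave hits by rank and drop duplicate URLs."""
    ranked = [
        (rank, hit)
        for hits in results_by_source.values()
        for rank, hit in enumerate(hits)
    ]
    merged, seen = [], set()
    # sorted() is stable, so hits with equal rank keep their source order.
    for _, hit in sorted(ranked, key=lambda item: item[0]):
        if hit["url"] not in seen:
            seen.add(hit["url"])
            merged.append(hit)
    return merged


def folders(hits):
    """Group hits into folders on the fly, keyed by the first title word."""
    grouped = {}
    for hit in hits:
        label = hit["title"].split()[0].lower()
        grouped.setdefault(label, []).append(hit)
    return grouped


# Made-up results from two hypothetical engines.
raw = {
    "engine_a": [
        {"url": "https://example.com/1", "title": "Search scaling"},
        {"url": "https://example.com/2", "title": "Facets explained"},
    ],
    "engine_b": [
        {"url": "https://example.com/2", "title": "Facets explained"},
        {"url": "https://example.com/3", "title": "Search federation"},
    ],
}

separate = federate_separately(raw)
mixed = federate_mixed(raw)
by_folder = folders(mixed)
```

In the separate view, the duplicate hit shows up once per source. In the mixed view it appears once, which is the practical payoff of mixing results.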
To close, I think that the Fast ESP product is going to add a dose of complexity to the SharePoint environment. Despite Google’s clumsy marketing, the Google Search Appliance continues to gain traction in many organizations. Google’s solution is not cheap. People want it. I think Fast ESP is going to find itself in a tough battle for three reasons:
- Google is a hot brand, even within SharePoint shops
- Microsoft certified search solutions are better than Fast ESP based on my testing of search systems over the past decade
- The cost savings pitch is only going to go so far. CFOs eventually will see the bills for staff time, consulting services, upgrades, and search related scaling. In a lousy financial environment, money will be a weak point.
I look forward to the official announcement about Fast ESP. The $1.2 billion Microsoft spent for this company is now going to have to deliver. I find it unfortunate that the police investigation of alleged impropriety at Fast Search & Transfer has not been resolved. If a product is as good as Fast ESP was advertised to be, what went wrong with the company, its technology, and its customer relations prior to the Microsoft buy out? I guess I have to wait for more information on these matters. When you have a lot of different products with overlapping and similar services, the message I get is more like the Ragu marketing model, not the solving of customer problems in a clear, straightforward way. Sigh. Marketing, not technology, fuels enterprise search these days I fear.
Stephen Arnold, September 7, 2009
August 25, 2009
I was exploring usage patterns via Alexa. I wanted to see how Silobreaker, a service developed by some savvy Scandinavians, was performing against the brand name business intelligence companies. Silobreaker is one of the next generation information services that processes a range of content, automatically indexing and filtering the stream, and making the information available in “dossiers”. A number of companies have attempted to deliver usable “at a glance” services. Silobreaker has been one of the systems I have relied upon for a number of client engagements.
I compared the daily reach of LexisNexis (a unit of the Anglo Dutch outfit Reed Elsevier), Factiva (originally a Reuters Dow Jones “joint” effort in content and value added indexing now rolled back into the Dow Jones mothership), Ebsco (the online arm of the EB Stephens Co. subscription agency), and Dialog (a unit of the privately held database roll up company Cambridge Scientific Abstracts / ProQuest and some investors). Keep in mind that Silobreaker is a next generation system and I was comparing it to the online equivalent of the Smithsonian’s computer exhibit with the Univac and IBM key punch machine sitting side by side:
Silobreaker is the blue line which is chugging right along despite the challenging financial climate. I ran the same query on Compete.com, and that data showed LexisNexis showing a growth uptick and more traffic in June 2009. Your mileage may vary. These types of traffic estimates are indicative, not definitive. But Silobreaker is performing and growing. One could ask, “Why aren’t the big names showing stronger buzz?”
A better question may be, “Why haven’t the museum pieces performed?” I think there are three reasons. First, the commercial online services have not been able to bridge the gap between their older technical roots and the new technologies. When I poked under the hood in Silobreaker’s UK facility, I was impressed with the company’s use of next generation Web services technology. I challenged the R&D team regarding performance, and I was shown a clever architecture that delivers better performance than the museum piece services against which Silobreaker competes. I am quick to admit that performance and scaling remain problems for most online content processing companies, but I came away convinced that Silobreaker’s engineering was among the best I had examined in the real time content sector.
Second, I think the museum pieces – I could mention any of the services against which I compared Silobreaker – have yet to figure out how to deal with the gap between the old business model for online and the newer business models that exist. My hunch is that the museum pieces are reluctant to move quickly to embrace some new approaches because of the fear of [a] cannibalization of their for fee revenues from a handful of deep pocket customers like law firms and government agencies and [b] looking silly when their next generation efforts are compared to newer, slicker services from Yfrog.com, Collecta.com, Surchur.com, and, of course, Silobreaker.com.
Third, I think the established content processing companies are not in step with what users want. For example, when I visit the Dialog Web site here, I don’t have a way to get a relationship map. I like nifty methods of providing me with an overview of information. Who has the time or patience to handcraft a Boolean query and then pay money whether the dataset contains useful information or not? I just won’t play that “pay us to learn there is a null set” game any more. Here’s the Dialog splash page. Not too useful to me because it is brochureware, almost a 1998 approach to an online service. The search function only returns hits from the site itself. There is no compelling reason for me to dig deeper into this service. I don’t want a dialog; I want answers. What’s a ProQuest? Even the name leaves me puzzled.
I wanted to make sure that I was not too harsh on the established “players” in the commercial content processing sector. I tracked down Mats Bjore, one of the founders of Silobreaker. I interviewed him as part of my Search Wizards Speak series in 2008, and you may find that information helpful in understanding the new concepts in the Silobreaker service.
What are some of the changes that have taken place since we spoke in June 2008?
Mats Bjore: There are several new things and plenty more in the pipeline. The layout and design of Silobreaker.com have been redesigned to improve usability; we have added an Energy section to provide a more vertically focused service around both fossil fuels and alternative energy; we have released Widgets and an API that enable anyone to embed Silobreaker functionality in their own web sites; and we have improved our enterprise software to offer corporate and government customers “local” customizable Silobreaker installations, as well as a technical platform for publishers who’d like to “silobreak” their existing or new offerings with our technology. Industry-wise, the recent statements by media moguls like Rupert Murdoch make it clear that the big guys want to monetize their information. The problem is that charging for information does not solve the problem of a professional already drowning in information. This is like trying to charge a man who has fallen overboard for water instead of offering a life jacket. Wrong solution. The marginal loss of losing a few news sources is really minimal for the reader, as there are thousands to choose from anyways, so unless you are a “must-have” publication, I think you’ll find out very quickly that reader loyalty can be fickle or short-lived or both. Add to that that news reporting itself has changed dramatically. Blogs and other types of social media are already favoured before many newspapers and we saw Twitter’s role during the election demonstrations in Iran. Citizen journalism of that kind, immediate, straight from the action and free, is extremely powerful. But whether old or new media, Silobreaker remains focused on providing sense-making tools.
What is it going to be, free information or for fee information?
Mats Bjore: I think there will be free, for fee, and blended information just like Starbuck’s coffee. The differentiators will be “smart software” like Silobreaker and some of the Google technology I have heard you describe. However, the future is not just lots of results. The services that generate value for the user will have multiple ways to make money. License fees, customization, and special processing services, to name just three, will differentiate what I can find on your Web log and what I can get from a Silobreaker “report”.
What can the museum pieces like Dialog and Ebsco do to get out of their present financial swamp?
Mats Bjore: That is a tough question. I also run a management consultancy, so let me put on my consultant hat for a moment. If I were Reed Elsevier, Dow Jones/Factiva, Dialog, Ebsco or owned a large publishing house, I would have to realize that I have to think out of the box. It is clear that these organizations define technology in a way that is different from many of the hot new information companies. Big information companies still define technology in terms of printing, publishing or other traditional processes. The newer companies define technology in terms of solving a user’s problem. The quick fix, therefore, ought to be to start working with new technology firms and see how they can add value for these big dragons today, not tomorrow.
What does Silobreaker offer a museum piece company?
Mats Bjore: The Silobreaker platform delivers access and answers without traditional searching. Users can spot what is hot and relevant. I would seriously look at solutions such as Silobreaker as a front end to create a better reach to new customers, capture revenues from ad-sponsored free content, and reach a wider audience that can click through to premium content. (Most of us are unaware of the premium content that is out there, since the legacy contractual types only reach big companies and organizations.) I am surprised that Google, Microsoft, and Yahoo have not moved more aggressively to deliver more than a laundry list of results with some pictures.
Is the US intelligence community moving more purposefully with access and analysis?
Mats Bjore: The interest in open source is rising. However, there is quite a bit of inertia when it comes to having one set of smart software pull information from multiple sources. I think there is a significant opportunity to improve the use of information with smart software like Silobreaker’s.
Stephen Arnold, August 25, 2009