February 23, 2011
We have noted a number of management changes in the search and content sector.
Now X1 Technologies has appointed a new leader for its eDiscovery division. In “X1 Technologies Appoints John Patzakis as President of eDiscovery,” the company cites his extensive background in eDiscovery and corporate compliance as well as his knowledge of the law.
“I am pleased to welcome someone as accomplished as John to the X1 team,” said John Waller, CEO of X1 Technologies. “John’s background as a senior software executive coupled with his deep understanding of compliance and discovery law make him a perfect fit to lead our efforts in the eDiscovery market.”
X1’s eDiscovery Search Suite allows users to search data stored in over 500 different file types and applications. This allows for quick retrieval of electronically stored information (ESI) for early case assessment. X1’s support of social media applications will be released this quarter. In Patzakis, X1 has found a leader with the experience and skill to push it forward in the eDiscovery sector.
Emily Rae Aldridge, February 23, 2011
February 13, 2011
So Google can be fooled. It’s not nice to fool Mother Google. The inverse, however, is not accurate. Mother Google can take some liberties. Any indexing system can. Objectivity is in the eye of the beholder or the person who pays for results.
Judging from the torrent of posts from “experts”, the big guns of search are saying, “We told you so.” The trigger for this outburst of criticism is the New York Times’s write up about JC Penney. You can try this link, but I expect that it and its SEO crunchy headline will go dark shortly. (Yep, the NYT is in the SEO game too.)
I am not sure how many years ago I wrote the “search sucks” article for Searcher Magazine. My position was clear long before the JC Penney affair and the slowly growing awareness that search is anything BUT objective.
In the good old days, database bias was set forth in the editorial policies for online files. You could disagree with what we selected for ABI/INFORM, but we made an effort to explain what we selected, why we selected certain items for the file, and how the decision affected assignment of index terms and classification codes. The point was that we were explaining the mechanism for making a database which we hoped would be useful. We were successful, and we tried to avoid the silliness of claiming comprehensive coverage. We had an editorial policy, and we shaped our work to that policy. Most people in 1980 did not know much about online. I am willing to risk this statement: I don’t think too many people in 2011 know about online and Web indexing. In the absence of knowledge, some remarkable actions occur.
You don’t know what you don’t know or the unknown unknowns. Source: http://dealbreaker.com/donald-rumsfeld/
Flash forward to the Web. Most users assume incorrectly that a search engine is objective. Baloney. Just as we set an editorial policy for ABI/INFORM, each crawler and content processing system has similar decisions beneath it.
The difference is that at ABI/INFORM we explained our bias. The modern Web and enterprise search engines don’t. If a system tries to explain what it does, most of the failed Web masters, English majors working as consultants, and unemployed lawyers turned search experts just don’t care.
Search and content processing are complicated businesses, and the gory details about certain issues are of zero interest to most professionals. Here’s a quick list of “decisions” that must be made for a basic search engine:
- How deep will we crawl? Most engines set a limit. No one, not even Google, has the time or money to follow every link.
- How frequently will we update? Most search engines have to allocate resources in order to get a reasonable index refresh. Sites that get zero traffic don’t get updated too often. Sites that are sprawling and deep may get three or four levels of indexing. The rest? Forget it.
- What will we index? Most people perceive the various Web search systems as indexing the entire Web. Baloney. Bing.com makes decisions about what to index and when, and I find that it favors certain verticals and trendy topics. Google does a bit better, but there are bluebirds, canaries, and sparrows. Bluebirds get indexed thoroughly and frequently. See Google News for an example. For Google’s Uncle Sam, a different schedule applies. In between, there are lots of sites and lots of factors at play, not the least of which is money.
- What is on the stop list? Yep, a list can kill index pointers, making the site invisible.
- When will we revisit a site with slow response time?
- What actions do we take when a site is owned by a key stakeholder?
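Two of the decisions above, crawl depth and the stop list, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Google, Bing, or any real engine works: the link graph, the page names, and the `crawl` function are all hypothetical.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "home": ["news", "about", "spam-farm"],
    "news": ["story-1", "story-2"],
    "about": ["team"],
    "story-1": ["archive"],
    "archive": ["deep-page"],
    "spam-farm": ["spam-1"],
}


def crawl(start, max_depth, stop_list):
    """Breadth-first crawl that honors a depth limit and a stop list."""
    indexed = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if page in stop_list:
            continue  # stop-listed pages are never indexed or followed
        indexed.append(page)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return indexed


indexed_pages = crawl("home", max_depth=2, stop_list={"spam-farm"})
print(indexed_pages)
```

With a depth limit of two, “archive” and “deep-page” never reach the index, and everything behind the stop-listed page is invisible. The user of the index sees none of these choices.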
February 7, 2011
We learned from one of our readers that Kartoo has turned out its lights. According to Wikipedia, the company shut down after a nine year run. Kartoo relied on Flash to display search results. Novel? Yes. Useful? For some types of queries, yes.
If you are interested in visual search, you can check out Yometa.com. This is a federating search system which taps results from Bing, Google, and Yahoo. A query for “Stephen E Arnold” returned this display.
Yometa displays the most relevant search results based on a combination of the three search engines’ rankings, as determined by the Yometa algorithm.
The company developed its approach based on research showing that 97 percent of the results from the three search engines (Google, Yahoo, and Bing) are different and that only three percent overlap. The visual interface lets users see the results of Google, Bing, and Yahoo individually or in any combination on one screen. The results are displayed in a Venn diagram; results closer to the middle are more relevant.
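The Venn diagram idea is easy to illustrate with set arithmetic. The sketch below is not Yometa’s algorithm, which the company does not describe here. The result lists are made-up letters standing in for URLs, and `venn_regions` and `overlap_share` are hypothetical names.

```python
def venn_regions(google, bing, yahoo):
    """Split three result lists into the seven regions of a Venn diagram."""
    g, b, y = set(google), set(bing), set(yahoo)
    return {
        "all three": g & b & y,
        "google+bing only": (g & b) - y,
        "google+yahoo only": (g & y) - b,
        "bing+yahoo only": (b & y) - g,
        "google only": g - b - y,
        "bing only": b - g - y,
        "yahoo only": y - g - b,
    }


def overlap_share(google, bing, yahoo):
    """Fraction of all distinct results that every engine returned."""
    g, b, y = set(google), set(bing), set(yahoo)
    union = g | b | y
    return len(g & b & y) / len(union) if union else 0.0


# Made-up result lists; each letter stands in for a URL.
google_hits = ["a", "b", "c", "d"]
bing_hits = ["a", "c", "e", "f"]
yahoo_hits = ["a", "f", "g", "h"]

regions = venn_regions(google_hits, bing_hits, yahoo_hits)
share = overlap_share(google_hits, bing_hits, yahoo_hits)
print(regions["all three"], share)
```

Results in the “all three” region sit in the middle of the diagram. In this toy data that is one result out of eight distinct ones.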
For more information navigate to www.yometa.com/about/ .
Stephen E Arnold, February 7, 2011
August 11, 2010
Yippy, Inc. has good reason to rejoice. In “Yippy Releases Family Friendly Search For Nintendo Wii” (http://www.tmcnet.com/usubmit/2010/07/28/4925824.htm), VP Emily Parker says “the Yippee Wii search has been optimized for use with Nintendo Wii game controls and features Yippy content-blocking protocols.” The report also tells of a soon-to-be-released Yippee Wii Browser with cloud-based content management platforms.
Let’s not get ahead of ourselves. A family friendly search was the focus of The Point (Top 5% of the Internet), developed by Beyond Search’s Stephen E. Arnold, his son, Erik S. Arnold, and business partner, Chris Kitze in 1993. The Point service was sold to Lycos in 1996, and, alas, Lycos lost its way. Now, a 17-year-old idea is back, proving The Point was right on target almost two decades ago.
Brett Quinn, August 11, 2010
February 14, 2010
Abe Lederman (one of the founders of Verity) alerted me this morning that his company, Deep Web Technology, signed a deal and partnership agreement with SWETS. This Netherlands-based company is one of the world’s leading subscription services. SWETS helps government agencies and companies with subscriptions and related services. The firm has clients in over 160 countries and describes itself as “a long-tail powerhouse.”
Deep Web Technology provides the software and systems that fuel Science.gov, a US government search and retrieval project. Science.gov taps into a wide range of data and information related to science and technology. The invention of the Deep Web method was an outgrowth of Dr. Lederman’s experience in providing a user with access to a broad range of structured and unstructured data. In my various reports on enterprise and special purpose search, I have given Dr. Lederman’s method high marks, and I even let him buy me a taco in a restaurant in Santa Fe, after I finished a lecture at Los Alamos. Dr. Lederman contributed at Los Alamos prior to founding Deep Web as I recall.
The deal brings Dr. Lederman’s federation technology to the SwetsWise Searcher. This service will be powered by Deep Web Technology. SwetsWise is designed to help librarians and their users meet the challenge of searching and finding relevant results from the ever-increasing catalog of content available online. The search system simplifies access to an organization’s diverse and valuable resources, along with the open Web content users are accustomed to searching. SWETS will deliver search results through the Deep Web ranking engine, providing incremental results for fast response times, scalability and flexibility. SwetsWise Searcher performs a rapid parallel search of all available sources or selected sources in real-time, ensuring fresh information and that documents are retrieved the minute they are published into a collection’s database. A simple search box to cover all sources can be integrated into any web page, blog or Intranet homepage.
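The “parallel search with incremental results” pattern described above can be sketched with Python’s standard library. This is an illustration only, not Deep Web Technology’s engine. The sources, the titles, and the function names are hypothetical, and each source is an in-memory list rather than a remote collection.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical sources; a real federated system would query remote collections.
SOURCES = {
    "catalog": ["Text mining primer", "Library catalog basics"],
    "journals": ["Text mining in chemistry"],
    "open_web": ["Intro to text mining", "Cooking with tacos"],
}


def search_source(name, query):
    """Return the titles in one source that contain the query."""
    hits = [title for title in SOURCES[name] if query.lower() in title.lower()]
    return name, hits


def federated_search(query, source_names):
    """Query every source in parallel and yield each result set as it arrives."""
    with ThreadPoolExecutor(max_workers=len(source_names)) as pool:
        futures = [pool.submit(search_source, name, query) for name in source_names]
        for future in as_completed(futures):
            yield future.result()


results = dict(federated_search("text mining", ["catalog", "journals", "open_web"]))
print(results)
```

Because `as_completed` hands back whichever source answers first, a slow collection does not hold up the display of results from the fast ones.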
A happy quack to Deep Web Technology. No more tacos in Santa Fe. I want a nuked burrito, a nod to our friends up the road.
Stephen E Arnold, February 14, 2010
No one paid me to write this. I do have a promise of a taco in Santa Fe, which I have just rejected. I will report this to the Food & Drug Administration.
October 13, 2009
I have found the Kartoo.com service useful and innovative. I learned today that the company has rolled out a new interface and links that make it easier to locate the company’s other content processing technology. The new interface provides thumbnails of the top hits. You can explore other results by clicking on the links on the page. The default interface for the query “text mining” appears below:
Other new features include:
- E-reputation tools
- Metasearch functions
- Support for anonymous search
- Support for the French, English, and Dutch languages.
If you have not explored the Kartoo service, give it a whirl.
Stephen Arnold, October 13, 2009, published because I like the French
October 8, 2009
Vivisimo, http://www.vivismo.com, a company that works with email archiving, eDiscovery, and information management solutions, just released a revved-up version of the Velocity Enterprise Search Platform, which builds search-centric programs. The platform focuses on extensibility, scalability and performance; Vivisimo is using it to accelerate into OEM and reseller markets. Those programs are designed to add value to existing applications and develop new solutions for sorting information assets; for example, Velocity supports searching 1 billion emails on a single server. Vivisimo also says, “With Velocity 7.5, new traceable accuracy metrics can accurately prove and defend that all data has been crawled and identify any documents that were not indexed due to corrupt file types.” This can be a big plus for companies dealing with growing regulation. A happy quack for Vivisimo (tagline: “Search Done Right!”). Any progress that can help enterprise businesses advance search and make sense of unstructured data is a good thing.
Jessica Bratcher, October 8, 2009
October 1, 2009
I have been using Coveo’s products for years. I remember the first time I fired up the original desktop search program. I found the interface intuitive and the features in line with how I looked for information. I learned from the company yesterday (September 30, 2009) that a new version of the product is now available. I noticed that the company has added several new features to its Enterprise Desktop Search application; for example:
- Search of content on my netbook, my Outlook mail store, and other applications running in my Harrod’s Creek data center.
- A centralized index of all enterprise information, including the formerly risky and elusive, cross-enterprise PC and laptop content, which is useful when I am in a meeting and need a coding gosling to locate a particular item of information that I tucked away without telling anyone its location
- Enhanced monitoring functions.
After installing the application, you will want to check out the built in connectors, the faceted “point and click” search function, and the support for access from a BlackBerry device. Nifty indeed because RIM’s search function is not too useful in my opinion.
The president and founder Laurent Simoneau told me:
“With our roots dating to the early days of Copernic, a global leader in consumer desktop search, we were committed to build the cross-enterprise capability to index and provide unified access for employees to their desktop content, including their email,” said Coveo CEO and President Laurent Simoneau, who prior to founding Coveo in 2005 was COO of Copernic. “What we’ve done is elevate that access to a higher level, with unified search of not only their individual PCs and laptops, but of contextually relevant knowledge and information residing in any enterprise system, based on IT permissions. In so doing, we’ve placed control over cross-enterprise desktop content indexing, with complete security and access permissions, in the hands of IT.”
The benefits of the new system struck me as reducing the time spent hunting for email. Larger organizations will be able to reduce costs and risks as well.
The Coveo Enterprise Desktop Search application is powered by the Coveo Enterprise Search 6.0 platform, which is scalable from hundreds of thousands to billions of documents, and requires approximately 20 percent of the server footprint of legacy enterprise search solutions. Our tests show that Coveo is one of the more modular and scalable enterprise search solutions. It ranks as one of the easiest to install and configure search solutions we have tested. Worth a look. Fill out the form and give it a spin.
Stephen Arnold, October 1, 2009
September 28, 2009
I thought I made Google’s intent clear in Google Version 2.0. The company provides a user with access to content within the Google index. The inventions reviewed briefly in The Google Legacy and in greater detail in Google Version 2.0 explain that information within the Google data management system can be sliced, diced, remixed, and output as new information objects. The analogy is similar to what an MBA does at Booz, McKinsey, or any other rental firm for semi-wizards. Intakes become high value outputs. I was delighted to read Erick Schonfeld’s “With Google Places, Concerns Rise that Google Just Wants to Link to Its Own Content.” The story makes clear that folks are now beginning to see that Google is a digital Gutenberg and is a different type of information company. Mr. Schonfeld wrote:
The concerns arise, however, back on Google’s main search page, where Google is indexing these Places pages. Since Google controls its own search index, it can push Google Places more prominently if it so desires. There isn’t a heck of a lot of evidence that Google is doing this yet, but the mere fact that Google is indexing these Places pages has the SEO world in a tizzy. And Google is indexing them, despite assurances to the contrary. If you do a search for the Burdick Chocolate Cafe in Boston, for instance, the Google Places page is the sixth result, above results from Yelp, Yahoo Travel, and New York Times Travel. This wouldn’t be so bad if Google wasn’t already linking to itself in the top “one Box” result, which shows a detail from Google Maps. So within the top ten results, two of them link back to Google content.
Directories are variants of vertical search. Google is much more than rich directory listings.
Let me give one example, and you are welcome to snag a copy of my three Google monographs for more examples.
Consider a deal between Google and a mobile telephone company. The users of the mobile telco’s service run a query. The deal makes it possible for the telco to use the content in the Google system. No query goes into the “world beyond Google”. The reason is that Google and the telco gain control over latency, content, and advertising. This makes sense. Let’s assume that this is a deal that Google crafts with an outfit like T Mobile. Remember: this is a hypothetical example. When I use my T Mobile device to get access to the T Mobile Internet service, the content comes from Google with its caches, distributed data centers, and proprietary methods for speeding results to a device. In this example, as a user, I just want fast access to content that is pretty routine; for example, traffic, weather, flight schedules. I don’t do much heavy lifting from my flakey BlackBerry or old person hostile iPhone / iTouch device. Google uses its magical ability to predict, slice, and dice to put what I want in my personal queue so it is ready before I know I need the info. Think “I am feeling doubly lucky”, a “real” patent application by the way. T Mobile wins. The user wins. The Google wins. The stuff not in the Google system loses.
Interesting? I think so. But the system goes well beyond directory listings. I have been writing about Dr. Guha, Simon Tong, Jeff Dean, and the Halevy team for a while. The inventions, systems and methods from this group have revolutionized information access in ways that reach well beyond local directory listings.
The Google has been pecking away for 11 years and I am pleased that some influential journalists / analysts are beginning to see the shape of the world’s first trans national information access company. Google is the digital Gutenberg and well into the process of moving info and data into a hyper state. Google is becoming the Internet. If one is not “in” Google, one may not exist for a certain sector of the Google user community. Googleo ergo sum.
Stephen Arnold, September 28, 2009
September 23, 2009
TechFlash reported an interesting article called “Windows Live Lost $560 Million in FY2009”. With revenues of $520 million, the loss chewed through about $64,000 an hour or roughly $1,065 a minute 24×7 for 365 days. With Microsoft’s revenue in the $58 billion range, a $560 million loss is not such a big deal. In my opinion, profligate spending might work in the short term, but I wonder if the tactic will work over a longer haul on the information highway.
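A quick check of the burn rate, assuming a 365-day year and the $560 million loss from the headline. The variable names are mine.

```python
LOSS_DOLLARS = 560_000_000   # annual loss from the TechFlash headline
HOURS_PER_YEAR = 24 * 365    # 8,760 hours, assuming a non-leap year

per_hour = LOSS_DOLLARS / HOURS_PER_YEAR
per_minute = per_hour / 60

print(f"${per_hour:,.0f} per hour, ${per_minute:,.0f} per minute")
```

The hourly figure rounds to about $64,000. Dividing that by 60 gives a little over $1,000 a minute.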
Stephen Arnold, September 23, 2009