<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Beyond Search &#187; Federated search</title>
	<atom:link href="http://arnoldit.com/wordpress/category/federated-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://arnoldit.com/wordpress</link>
	<description>by Stephen E. Arnold</description>
	<lastBuildDate>Sat, 26 May 2012 04:04:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>IBM Buys Vivisimo Allegedly for Its Big Data Prowess</title>
		<link>http://arnoldit.com/wordpress/2012/04/25/ibm-buys-vivisimo-allegedly-for-its-big-data-prowess/</link>
		<comments>http://arnoldit.com/wordpress/2012/04/25/ibm-buys-vivisimo-allegedly-for-its-big-data-prowess/#comments</comments>
		<pubDate>Wed, 25 Apr 2012 17:02:56 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Acquisition]]></category>
		<category><![CDATA[Feature]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=25852</guid>
		<description><![CDATA[Big data. Wow. That’s an angle only a public relations person with a degree in 20th century American literature could craft. Vivisimo is many things, but a big data system? News to me for sure. IBM has been a strong consumer and integrator of open source search solutions. Watson, the game show winner, used Lucene [...]]]></description>
			<content:encoded><![CDATA[<p>Big data. Wow. That’s an angle only a public relations person with a degree in 20th century American literature could craft. Vivisimo is many things, but a big data system? News to me for sure.</p>
<p>IBM has been a strong consumer and integrator of open source search solutions. Watson, the game show winner, used Lucene with IBM wrapper software to keep the folks in Jeopardy post production on their toes.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2012/04/vivisimosearch.png"><img style="display: inline; border: 0px;" title="vivisimo search" src="http://arnoldit.com/wordpress/wp-content/uploads/2012/04/vivisimosearch_thumb.png" alt="vivisimo search" width="244" height="150" border="0" /></a></p>
<p><span style="color: #800000; font-size: x-small;">A screen shot of the Vivisimo Velocity system displaying search results for the RAND organization. Notice the folders in the left hand panel. The interface reveals Vivisimo’s roots in traditional search and retrieval. The federating function operates behind the scenes. The newest versions of Velocity permit a user to annotate a search hit so the system will boost it in subsequent queries if the comment is positive. A negative rating on a result suppresses that result. </span></p>
<p>I learned that IBM allegedly purchased <a href="http://www.vivisimo.com" target="_blank">Vivisimo</a>, a company which I have covered in my various monographs about search and content processing. Forbes ran a story which was at odds with my understanding of what the Vivisimo technology actually does. Here’s the Forbes title: “<a href="http://www.forbes.com/sites/ericsavitz/2012/04/25/ibm-to-buy-vivisimo-expands-bet-on-big-data-analytics/" target="_blank">IBM To Buy Vivisimo; Expands Bet On Big Data Analytics</a>.” Notice the phrase “big data analytics.”</p>
<p>Why do I point out the “big data” buzzword? The reasons include:</p>
<ul>
<li>Vivisimo has a clustering method which takes search results and groups them, placing similar results identified by the method in “folders”</li>
<li>Vivisimo has a federating method which, like <a href="http://www.brightplanet.com" target="_blank">Bright Planet’s</a> and <a href="http://www.deepwebtech.com" target="_blank">Deep Web Technologies’</a>, takes a user’s query and sends the query to two or more indexing systems, retrieves the results, and displays them to the user</li>
<li>Vivisimo has a clever de-duplication method which makes the results list present a duplicated item only once. This is important when one encounters a news story which appears on multiple Web sites.</li>
</ul>
<p>According to the write up in Forbes, a “real” news outfit:</p>
<blockquote><p><a href="http://finapps.forbes.com/finapps/jsp/finance/compinfo/CIAtAGlance.jsp?tkr=ibm">IBM</a> this morning <a href="http://finance.yahoo.com/news/ibm-advances-big-data-analytics-130000269.html">said</a> it has agreed to acquire <a href="http://vivisimo.com/">Vivisimo</a>, a Pittsburgh-based provider of big data access and analysis tools.</p></blockquote>
<p>Okay, but in <em>Beyond Search</em> we have documented that Vivisimo followed this trajectory in its sales and marketing efforts since the company opened for business in 2000. In fact, the <a href="http://en.wikipedia.org/wiki/Vivisimo" target="_blank">Wikipedia</a> write up about Vivisimo says this:</p>
<blockquote><p>Vivisimo is a privately held enterprise search software company in Pittsburgh that develops and sells software products to improve search on the web and in enterprises. The focus of Vivisimo&#8217;s research thus far has been the concept of <a href="http://en.wikipedia.org/wiki/Data_clustering">clustering</a> search results based on topic: for example, dividing the results of a search for &#8220;cell&#8221; into groups like &#8220;biology,&#8221; &#8220;battery,&#8221; and &#8220;prison.&#8221; This process allows users to intuitively narrow their search results to a particular category or browse through related fields of information, and seeks to avoid the &#8220;overload&#8221; problem of sorting through too many results.</p></blockquote>
<p><span id="more-25852"></span></p>
<p>Recently Vivisimo was pushing the concept of “information optimization.” I am not sure what this phrase means. Perhaps the phrase is a synonym for “big data.” Seems a stretch to me. Vivisimo also, like MarkLogic, morphed into an enterprise search vendor and then, like Coveo, changed instantly into a vendor of customer support systems. Along the way, Vivisimo presented itself to some US government agencies as a systems integrator. A couple of the large scale deployments of the Vivisimo system lit up my radar as darned exciting. Sorry, no details in a free blog post.</p>
<p>Now let’s think about what IBM may be doing with a clustering and federating system. I think IBM could fix up the lousy clustering functions in products such as FileNet. I think IBM could use Vivisimo’s technology to improve the search function on the IBM.com Web site. I think IBM could use Vivisimo technology to refine the outputs of the Lucene search system which is deployed in a number of IBM products.</p>
<p>However, I do not get the “big data” angle. No, not at all. Search systems which make sense of flows of data from social media, intelligence monitoring systems, and various types of specialized unstructured information systems come from outfits like <a href="http://www.digitalreasoning.com" target="_blank">Digital Reasoning</a>, <a href="http://www.ikanow.com" target="_blank">Ikanow</a>, Palantir, and a handful of other firms.</p>
<p>My view is that IBM is getting back into the walled garden business in an effort to sell specialized functions to customers who are not happy with IBM’s existing line up of solutions. It is possible that IBM can bolt Vivisimo to Cognos, but SPSS already contains some text processing capabilities. Anyone remember Clementine? The i2 Group’s technology is not really “big data.”</p>
<p>I have someone monitoring news of this deal. If anything substantive turns up, we will update this post. For now, the IBM Vivisimo deal seems to be about “big data.” If true, the assertion is one that does not match what I know about Vivisimo’s systems and methods. We are often wide of the mark here in Harrod’s Creek, but I am struggling to see how Vivisimo contributes in the same way that <a href="http://www.digitalreasoning.com" target="_blank">Digital Reasoning</a> or even <a href="http://www.datastax.com" target="_blank">DataStax</a> does in the search sector.</p>
<p>The BMW dealers in Pittsburgh will be happy. Hopefully one of the founders will be relieved of manual script editing duties now that IBM is the boss.</p>
<p><a href="http://www.arnoldit.com/sitemap.html" target="_blank">Stephen E Arnold</a>, April 25, 2012</p>
<p>Sponsored by <a href="http://www.polyspot.com" target="_blank">PolySpot</a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2012/04/25/ibm-buys-vivisimo-allegedly-for-its-big-data-prowess/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Access Control and Enterprise Search Capabilities</title>
		<link>http://arnoldit.com/wordpress/2011/11/29/spotlight-access-control-and-enterprise-search-capabilities/</link>
		<comments>http://arnoldit.com/wordpress/2011/11/29/spotlight-access-control-and-enterprise-search-capabilities/#comments</comments>
		<pubDate>Tue, 29 Nov 2011 05:14:14 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=21222</guid>
		<description><![CDATA[Nuances of enterprise search and the challenges some searchers face are discussed in “Why is Enterprise Search more complex than web or desktop search?” “Access control to the data is a big difference between Enterprise search and the other 2 search types.  On the Web, everybody is allowed to see the data. On your desktop you [...]]]></description>
			<content:encoded><![CDATA[<p>Nuances of enterprise search and the challenges some searchers face are discussed in <a href="http://www.keepitsimpleandfast.com/2009/01/why-is-enteprise-search-more-complex.html" target="_blank">“Why is Enterprise Search more complex than web or desktop search?”</a></p>
<blockquote><p><em>“Access control to the data is a big difference between Enterprise search and the other 2 search types.  On the Web, everybody is allowed to see the data. On your desktop you are allowed to see all data, because you are the owner. Web and desktop search can index all the data without to take access control into account.”</em></p></blockquote>
<p>In an enterprise, access control is very important. But we prefer to spend more time finding than searching. To get the results you want, you need the right solution and the right search structure and support.</p>
<p>Access control is not an obstacle for <a href="http://www.mindbreeze.com/solutions/enterprise/more-than-a-search.html" target="_blank">Mindbreeze</a>. Their search technology maintains user rights while searching all company-relevant information within the enterprise and in the cloud.</p>
<p>Sara Wood, November 29, 2011</p>
<div>Sponsored by <a href="http://pandia.com/enterprise-search/" target="_blank">Pandia.com</a></div>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2011/11/29/spotlight-access-control-and-enterprise-search-capabilities/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Vivisimo Rolls Out Cross-domain Search with Enhanced Security</title>
		<link>http://arnoldit.com/wordpress/2011/05/12/vivisimo-rolls-out-cross-domain-search-with-enhanced-security/</link>
		<comments>http://arnoldit.com/wordpress/2011/05/12/vivisimo-rolls-out-cross-domain-search-with-enhanced-security/#comments</comments>
		<pubDate>Thu, 12 May 2011 05:33:53 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=17551</guid>
		<description><![CDATA[Top Hosting Service Information reveals that “Vivisimo Showcases Secure, Cross-domain Intelligence Solutions” at this week’s DoDIIS Worldwide conference in Detroit. Since Vivisimo serves the federal government, including the defense community, this is a welcome development. The defense and intelligence communities recognize the need to improve information sharing as a way to achieve true all-source analysis [...]]]></description>
			<content:encoded><![CDATA[<p>Top Hosting Service Information reveals that “<a href="http://hosting.servers-blog.com/domain-search-vivisimo-showcases-secure-cross-domain-intelligence-solutions/">Vivisimo Showcases Secure, Cross-domain Intelligence Solutions</a>” at this week’s <a href="http://www.ncsi.com/Home/Default.aspx">DoDIIS Worldwide</a> conference in Detroit. Since <a href="http://www.vivisimo.com" target="_blank">Vivisimo</a> serves the federal government, including the defense community, this is a welcome development.</p>
<blockquote><p>‘The defense and intelligence communities recognize the need to improve information sharing as a way to achieve true all-source analysis and deliver timely, objective, and actionable intelligence to our senior decision makers and war fighters,’ says Bob Carter, vice president and general manager, federal, of Vivisimo. ‘In an era where spending cuts are being made to improve efficiencies, Vivisimo helps streamline operations and ultimately costs by allowing analysts significantly better access, processing and sharing of critical data necessary to the defense of the U.S.’</p></blockquote>
<p>Assembling the myriad of data gathered from around the globe into useful information is one of today’s biggest challenges for the intelligence community. Though the government often travels behind the curve in tech fields, it seems to be stepping up in this area.</p>
<p>Cynthia Murrell, May 12, 2011</p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2011/05/12/vivisimo-rolls-out-cross-domain-search-with-enhanced-security/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>X1 Management Change</title>
		<link>http://arnoldit.com/wordpress/2011/02/23/x1-management-change/</link>
		<comments>http://arnoldit.com/wordpress/2011/02/23/x1-management-change/#comments</comments>
		<pubDate>Wed, 23 Feb 2011 06:25:51 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Business strategy]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[Management]]></category>
		<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=16066</guid>
		<description><![CDATA[We have noted a number of management changes in the search and content sector. Now X1 Technologies has appointed a new leader for their eDiscovery division. X1 Technologies Appoints John Patzakis as President of eDiscovery, citing his extensive background in eDiscovery and corporate compliance as well as his knowledge of the law. &#8220;I am pleased [...]]]></description>
			<content:encoded><![CDATA[<p>We have noted a number of management changes in the search and content sector.</p>
<p>Now X1 Technologies has appointed a new leader for their eDiscovery division. <a href="http://vps.x1.com/news/x1-technologies-appoints-john-patzakis-as-president-of-ediscovery">X1 Technologies Appoints John Patzakis as President of eDiscovery</a>, citing his extensive background in eDiscovery and corporate compliance as well as his knowledge of the law.</p>
<blockquote><p>&#8220;I am pleased to welcome someone as accomplished as John to the X1 team,&#8221; said John Waller, CEO of X1 Technologies. &#8220;John&#8217;s background as a senior software executive coupled with his deep understanding of compliance and discovery law make him a perfect fit to lead our efforts in the eDiscovery market.&#8221;</p></blockquote>
<p>X1’s eDiscovery Search Suite allows users to search data stored in over 500 different file types and applications. This allows for quick retrieval of electronically stored information (ESI) for early case assessment. X1’s support of social media applications will be released this quarter. In Patzakis, <a href="http://www.x1.com" target="_blank">X1</a> has found a leader with the experience and skill to push them forward in the eDiscovery sector.</p>
<p>Emily Rae Aldridge, February 23, 2011</p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2011/02/23/x1-management-change/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Wages of SEO Sin</title>
		<link>http://arnoldit.com/wordpress/2011/02/13/the-wages-of-seo-sin/</link>
		<comments>http://arnoldit.com/wordpress/2011/02/13/the-wages-of-seo-sin/#comments</comments>
		<pubDate>Sun, 13 Feb 2011 16:18:47 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Business strategy]]></category>
		<category><![CDATA[Feature]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Real time search]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Text processing]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=16020</guid>
		<description><![CDATA[So Google can be fooled. It’s not nice to fool Mother Google. The inverse, however, is not accurate. Mother Google can take some liberties. Any indexing system can. Objectivity is in the eye of the beholder or the person who pays for results. Judging from the torrent of posts from “experts”, the big guns of [...]]]></description>
			<content:encoded><![CDATA[<p>So Google can be fooled. It’s not nice to fool Mother Google. The inverse, however, is not accurate. Mother Google can take some liberties. Any indexing system can. Objectivity is in the eye of the beholder or the person who pays for results.</p>
<p>Judging from the torrent of posts from “experts”, the big guns of search are saying, “We told you so.” The trigger for this outburst of criticism is the New York Times’s write up about JC Penney. You can try <a href="http://www.nytimes.com/2011/02/13/business/13search.html?_r=1&amp;partner=rss&amp;emc=rss" target="_blank">this link</a>, but I expect that it and its SEO crunchy headline will go dark shortly. (Yep, the NYT is in the SEO game too.)</p>
<p>Everyone from <a href="http://techcrunch.com/2011/02/12/search-still-sucks/" target="_blank">AOL news</a> to <a href="http://searchengineland.com/new-york-times-exposes-j-c-penney-link-scheme-that-causes-plummeting-rankings-in-google-64529" target="_blank">blog-o-rama wizards</a> is reviling Google for not figuring out how to stop folks from gaming the system. Sigh.</p>
<p>I am not sure how many years ago I wrote the “search sucks” article for <em>Searcher Magazine</em>. My position was clear long before the JC Penney affair and the slowly growing awareness that search is anything BUT objective.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2011/02/image3.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2011/02/image_thumb3.png" border="0" alt="image" width="173" height="148" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Source: </span><a title="http://www.brianjamesnyc.com/blog/?p=157" href="http://www.brianjamesnyc.com/blog/?p=157"><span style="color: #800000; font-size: xx-small;">http://www.brianjamesnyc.com/blog/?p=157</span></a></p>
<p>In the good old days, database bias was set forth in the editorial policies for online files. You could disagree with what we selected for ABI/INFORM, but we made an effort to explain what we selected, why we selected certain items for the file, and how the decision affected assignment of index terms and classification codes. The point was that we were explaining the mechanism for making a database which we hoped would be useful. We were successful, and we tried to avoid the silliness of claiming comprehensive coverage. We had an editorial policy, and we shaped our work to that policy. Most people in 1980 did not know much about online. I am willing to risk this statement: I don’t think too many people in 2011 know about online and Web indexing. In the absence of knowledge, some remarkable actions occur.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2011/02/image4.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2011/02/image_thumb4.png" border="0" alt="image" width="168" height="181" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">You don’t know what you don’t know or the unknown unknowns. Source: </span><a title="http://dealbreaker.com/donald-rumsfeld/" href="http://dealbreaker.com/donald-rumsfeld/"><span style="color: #800000; font-size: xx-small;">http://dealbreaker.com/donald-rumsfeld/</span></a></p>
<p>Flash forward to the Web. Most users assume incorrectly that a search engine is objective. Baloney. Just as we set an editorial policy for ABI/INFORM, each crawler and content processing system has similar decisions beneath it.</p>
<p>The difference is that at ABI/INFORM we explained our bias. The modern Web and enterprise search engines don’t. If a system tries to explain what it does, most of the failed Web masters, English majors working as consultants, and unemployed lawyers turned search experts just don’t care.</p>
<p>Search and content processing are complicated businesses, and the gory details about certain issues are of zero interest to most professionals. Here’s a quick list of “decisions” that must be made for a basic search engine:</p>
<ul>
<li>How deep will we crawl? Most engines set a limit. No one, not even Google, has the time or money to follow every link.</li>
<li>How frequently will we update? Most search engines have to allocate resources in order to get a reasonable index refresh. Sites that get zero traffic don’t get updated too often. Sites that are sprawling and deep may get three or four levels of indexing. The rest? Forget it.</li>
<li>What will we index? Most people perceive the various Web search systems as indexing the entire Web. Baloney. Bing.com makes decisions about what to index and when, and I find that it favors certain verticals and trendy topics. Google does a bit better, but there are bluebirds, canaries, and sparrows. Bluebirds get indexed thoroughly and frequently. See Google News for an example. For Google’s Uncle Sam, a different schedule applies. In between, there are lots of sites and lots of factors at play, not the least of which is money.</li>
<li>What is on the stop list? Yep, a list can kill index pointers, making the site invisible.</li>
<li>When will we revisit a site with slow response time?</li>
<li>What actions do we take when a site is owned by a key stakeholder?</li>
</ul>
<p><span id="more-16020"></span></p>
<p>Is it possible to spoof Google? Sure. The JC Penney example is a good one, but I find examples of Google’s bumbling every day. I get auto generated pages of baloney. I get links to 404 errors at the Health &amp; Human Services Web site. I find examples of content in the index and not in the cache. I find totally useless links in most results lists.</p>
<p>Run a query for “information optimization”. What do you get for this meaningless phrase? You get a link to Vivisimo which uses “information optimization” instead of “search done right”, its 2007 catchphrase. You get links to Hewlett Packard, a blog about information optimization, and baloney about search engine optimization. The problem is that the phrase is essentially meaningless. I think it has been crafted to make it easier to locate outfits in the fuzz business.</p>
<p>Google falls for this joyfully. Google even lists the mind bogglingly expensive Google Search Appliance as associated with “information optimization” via another meaningless phrase, “knowledge management.”</p>
<p>The fact of the matter is that as the Web content diffuses and becomes more voluminous, the opportunities to play tricks increase. And the Web search engines continue to make their decisions behind the scenes. Google is proud of the fact that it keeps its method secret. In The Google Legacy, I summarized about 100 factors in use in 2004 and 2005. Each “factor” is a form of editorial policy.</p>
<p>The content of indexes is never objective. Whether one looks at a commercial database or a Web index, decisions inform the scope, depth, and approach of what’s available to a user.</p>
<p>SEO or search engine optimization is just a variant of indexing. I personally find SEO in general and SEO experts in particular annoying at best. I prefer that Web sites have content about a subject. That content can be casual like the information in this <a href="http://www.arnoldit.com/wordpress" target="_blank">Beyond Search</a> blog. It can be weaponized like the information on some government and political sites. It can be wacky like the humor sites. SEO injects links and words that are designed to fool the Web indexing systems.</p>
<p>So what we have is bias in the indexing systems. We have bias in the content. We have bias in the tagging and linking.</p>
<p>So now everyone is horrified that free Web search systems are not “objective”.</p>
<p>Give me a break.</p>
<p>The whole search sector is not objective. The algorithms execute but no one knows the weights, thresholds, or tweaks under the hood. Believe me. There is a lot of fiddling that must be done. In the first index of the US Federal government using the Inktomi system, considerable time was spent removing certain content from the index. No one paid much attention in Year 2000, and I don’t think too many people pay much attention to the contents and content scrubbing in most indexes. Where did that education policy go on the company Intranet site? Answer: the technical team was told to delete the pointers. Good bye information. Few notice. Filtering, cleaning, and scrubbing are routine in each of the components of a search system. One vice president wanted his Web site’s content updated in near real time. No problem. The Railway Retirement Board? Well, every six months will probably do the trick.</p>
<p>Two different users of the same search system can derive different results via their own search methods. The “smart” systems can display different search results for different users. Web content creators can manipulate the Web indexing systems. Heck, the Web indexing systems can manipulate the results to their benefit or the benefit of their bottom line.</p>
<p>There is no free lunch for those who don’t know what they don’t know.</p>
<p>Fact: There are no objective search results. Just my view from Harrod’s Creek. Don’t believe me? Track down a person with a master’s degree in library science and ask that individual to run sample queries for you and then analyze the results.</p>
<p>Run the same queries on commercial systems and on free Web search systems. No two systems have content congruence. No relevance method delivers exactly the same results unless a human intervenes. To get information, one must use multiple search systems, multiple sources, and talk to humans. The notion of “getting an answer” is popular. The problem is that the answer may be wrong, biased, or an ad. Most Web users are happy with whatever pops up. Intellectual laziness? Too much work? Convenience? Who knows.</p>
<p>Degrees in information science exist because there is a need to understand provenance, precision, recall, editorial policies, and indexing.</p>
<p>English majors, lawyers, lousy Web masters, and money crazed SEO experts are different in many interesting ways. Their view of information often contrasts sharply with that of a person with deep experience in library and information science.</p>
<p>Web search has to generate revenue and only be “good enough.” Forget the superlatives and deal with the bias inherent in search, content processing, and indexing.</p>
<p>Stephen E Arnold, February 13, 2011</p>
<p><em>Freebie and not lunch</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2011/02/13/the-wages-of-seo-sin/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Kartoo Closes and Opens the Door for Yometa</title>
		<link>http://arnoldit.com/wordpress/2011/02/07/kartoo-closes-and-opens-the-door-for-yometa/</link>
		<comments>http://arnoldit.com/wordpress/2011/02/07/kartoo-closes-and-opens-the-door-for-yometa/#comments</comments>
		<pubDate>Mon, 07 Feb 2011 06:22:27 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Business strategy]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=15866</guid>
		<description><![CDATA[We learned from one of our readers that Kartoo has turned out its lights. According to Wikipedia, the company shut down after a nine year run. Kartoo relied on Flash to display search results. Novel? Yes. Useful? For some types of queries, yes. If you are interested in visual search, you can check out Yometa.com. [...]]]></description>
			<content:encoded><![CDATA[<p>We learned from one of our readers that Kartoo has turned out its lights. According to <a href="http://en.wikipedia.org/wiki/Kartoo" target="_blank">Wikipedia</a>, the company shut down after a nine year run. Kartoo relied on Flash to display search results. Novel? Yes. Useful? For some types of queries, yes.</p>
<p>If you are interested in visual search, you can check out <a href="http://www.yometa.com" target="_blank">Yometa.com</a>. This is a federating search system which taps results from Bing, Google, and Yahoo. A query for “Stephen E Arnold” returned this display.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2011/02/yometa.jpg"><img style="display: inline; border: 0px;" title="yometa" src="http://arnoldit.com/wordpress/wp-content/uploads/2011/02/yometa_thumb.jpg" border="0" alt="yometa" width="244" height="150" /></a></p>
<p><a href="http://www.yometa.com"><span style="color: #800000; font-size: xx-small;">http://www.yometa.com</span></a></p>
<p>Yometa displays the most relevant search results based on a combination of the three search engines’ rankings, as determined by the Yometa algorithm.</p>
<p>The company developed its approach based on research showing that 97 percent of the results returned by the three search engines (Google, Yahoo and Bing) are different; the overlap is only three percent. The visual interface allows users to see the results of Google, Bing and Yahoo individually and in any combination on one screen. The search results are displayed in a Venn diagram; results closer to the middle are more relevant.</p>
<p>For more information navigate to <a href="http://www.yometa.com/about/">www.yometa.com/about/</a> .</p>
<p>Stephen E Arnold, February 7, 2011</p>
<p><em>Freebie</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2011/02/07/kartoo-closes-and-opens-the-door-for-yometa/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Yippee Shouts Wii</title>
		<link>http://arnoldit.com/wordpress/2010/08/11/yippee-shouts-wii/</link>
		<comments>http://arnoldit.com/wordpress/2010/08/11/yippee-shouts-wii/#comments</comments>
		<pubDate>Wed, 11 Aug 2010 05:44:54 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Business strategy]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[Marketing]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=13458</guid>
		<description><![CDATA[Yippy, Inc. has good reason to rejoice. In “Yippy Releases Family Friendly Search For Nintendo Wii” http://www.tmcnet.com/usubmit/2010/07/28/4925824.htm VP Emily Parker says “the Yippee Wii search has been optimized for use with Nintendo Wii game controls and features Yippy content-blocking protocols.” The report also tells of a soon-to-be-released Yippee Wii Browser with cloud-based content management platforms. [...]]]></description>
			<content:encoded><![CDATA[<p>Yippy, Inc. has good reason to rejoice. In “Yippy Releases Family Friendly Search For Nintendo Wii” <a href="http://www.tmcnet.com/usubmit/2010/07/28/4925824.htm">http://www.tmcnet.com/usubmit/2010/07/28/4925824.htm</a> VP Emily Parker says “the Yippee Wii search has been optimized for use with Nintendo Wii game controls and features Yippy content-blocking protocols.” The report also tells of a soon-to-be-released Yippee Wii Browser with cloud-based content management platforms.</p>
<p>Let’s not get ahead of ourselves. A family friendly search was the focus of The Point (Top 5% of the Internet), developed in 1993 by Beyond Search’s Stephen E. Arnold, his son, Erik S. Arnold, and business partner Chris Kitze. The Point service was sold to Lycos in 1996, and, alas, Lycos lost its way. Now, a 17-year-old idea is back, proving The Point was right on target almost two decades ago.</p>
<p>Brett Quinn, August 11, 2010</p>
<p><em>Freebie</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/08/11/yippee-shouts-wii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deep Web Technology Nails Deal with SWETS</title>
		<link>http://arnoldit.com/wordpress/2010/02/14/deep-web-technology-nails-deal-with-swets/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/14/deep-web-technology-nails-deal-with-swets/#comments</comments>
		<pubDate>Sun, 14 Feb 2010 07:04:32 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Federated search]]></category>
		<category><![CDATA[Financial]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Online (general)]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Text processing]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10842</guid>
		<description><![CDATA[Abe Lederman (one of the founders of Verity) alerted me this morning that his company, Deep Web Technology, signed a deal and partnership agreement with SWETS. This Netherlands-based company is one of the world’s leading subscription services. SWETS helps government agencies and companies with subscriptions and related services. The firm has clients in over 160 [...]]]></description>
			<content:encoded><![CDATA[<p>Abe Lederman (one of the founders of <a href="http://www.autonomy.com" target="_blank">Verity</a>) alerted me this morning that his company, <a href="http://www.deepwebtech.com" target="_blank">Deep Web Technology</a>, signed a deal and partnership agreement with SWETS. This Netherlands-based company is one of the world’s leading subscription services. <a href="http://www.swets.com/" target="_blank">SWETS</a> helps government agencies and companies with subscriptions and related services. The firm has clients in over 160 countries and describes itself as “a long-tail powerhouse.”</p>
<p>Deep Web Technology provides the software and systems that fuel <a href="http://www.science.gov/" target="_blank">Science.gov</a>, a US government search and retrieval project. Science.gov taps into a wide range of data and information related to science and technology. The invention of the Deep Web method was an outgrowth of Dr. Lederman’s experience in providing a user with access to a broad range of structured and unstructured data. In my various reports on enterprise and special purpose search, I have given Dr. Lederman’s method high marks, and I even let him buy me a taco in a restaurant in Santa Fe after I finished a lecture at Los Alamos. As I recall, Dr. Lederman contributed at Los Alamos before founding Deep Web.</p>
<p>The deal brings Dr. Lederman’s federation technology to the SwetsWise Searcher. This service will be powered by Deep Web Technology. SwetsWise is designed to help librarians and their users meet the challenge of searching and finding relevant results from the ever-increasing catalog of content available online. The search system simplifies access to an organization’s diverse and valuable resources, along with the open Web content users are accustomed to searching. SWETS will deliver search results through the Deep Web ranking engine, providing incremental results for fast response times, scalability, and flexibility. SwetsWise Searcher performs a rapid parallel search of all available sources or selected sources in real time, ensuring that information is fresh and that documents can be retrieved the minute they are published into a collection’s database. A simple search box that covers all sources can be integrated into any Web page, blog, or intranet home page.</p>
<p>A happy quack to Deep Web Technology. No more tacos in Santa Fe. I want a nuked burrito, a nod to our friends up the road.</p>
<p>Stephen E Arnold, February 14, 2010</p>
<p><em>No one paid me to write this. I do have a promise of a taco in Santa Fe, which I have just rejected. I will report this to the Food &amp; Drug Administration.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/14/deep-web-technology-nails-deal-with-swets/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Kartoo Tweaks Its Interface</title>
		<link>http://arnoldit.com/wordpress/2009/10/13/kartoo-tweaks-its-interface/</link>
		<comments>http://arnoldit.com/wordpress/2009/10/13/kartoo-tweaks-its-interface/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 09:05:15 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Federated search]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Text processing]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=8924</guid>
		<description><![CDATA[I have found the Kartoo.com service useful and innovative. I learned today that the company has rolled out a new interface and links that make it easier to locate the company’s other content processing technology. The new interface provides thumbnails of the top hits. You can explore other results by clicking on the links on [...]]]></description>
			<content:encoded><![CDATA[<p>I have found the <a href="http://www.kartoo.com" target="_blank">Kartoo.com</a> service useful and innovative. I learned today that the company has rolled out a new interface and links that make it easier to locate the company’s other content processing technology. The new interface provides thumbnails of the top hits. You can explore other results by clicking on the links on the page. The default interface for the query “text mining” appears below:</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2009/10/kartoonewinterface.jpg"><img style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" title="kartoo new interface" src="http://arnoldit.com/wordpress/wp-content/uploads/2009/10/kartoonewinterface_thumb.jpg" border="0" alt="kartoo new interface" width="244" height="158" /></a></p>
<p>Other new features include:</p>
<ul>
<li>E-reputation tools</li>
<li>Metasearch functions</li>
<li>Support for anonymous search</li>
<li>Support for the French, English, and Dutch languages</li>
</ul>
<p>If you have not explored the Kartoo service, give it a whirl.</p>
<p>Stephen Arnold, October 13, 2009, published because I like the French</p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2009/10/13/kartoo-tweaks-its-interface/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Vivisimo Issues Point Upgrade</title>
		<link>http://arnoldit.com/wordpress/2009/10/08/vivisimo-issues-point-upgrade/</link>
		<comments>http://arnoldit.com/wordpress/2009/10/08/vivisimo-issues-point-upgrade/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 08:04:33 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Business strategy]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[Federated search]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=8864</guid>
		<description><![CDATA[Vivisimo, http://www.vivisimo.com, a company that works with email archiving, eDiscovery, and information management solutions, just released a revved-up version of the Velocity Enterprise Search Platform, which builds search-centric programs. The platform focuses on extensibility, scalability and performance; Vivisimo is using it to accelerate into OEM and reseller markets. Those programs are designed to add value [...]]]></description>
			<content:encoded><![CDATA[<p>Vivisimo, <a href="http://www.vivisimo.com">http://www.vivisimo.com</a>, a company that works with email archiving, eDiscovery, and information management solutions, just released a revved-up version of the Velocity Enterprise Search Platform, which builds search-centric programs. The platform focuses on extensibility, scalability, and performance; Vivisimo is using it to accelerate into OEM and reseller markets. Those programs are designed to add value to existing applications and develop new solutions for sorting information assets. For example, the platform supports searching 1 billion emails on a single server. Vivisimo also says, &#8220;With Velocity 7.5, new traceable accuracy metrics can accurately prove and defend that all data has been crawled and identify any documents that were not indexed due to corrupt file types.&#8221; This can be a big plus for companies dealing with growing regulation. A happy quack for Vivisimo (tagline: &#8220;Search Done Right!&#8221;). Any progress that can help enterprise businesses advance search and make sense of unstructured data is a good thing.</p>
<p>Jessica Bratcher, October 8, 2009</p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2009/10/08/vivisimo-issues-point-upgrade/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

