<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Beyond Search &#187; feature</title>
	<atom:link href="http://arnoldit.com/wordpress/category/feature/feed/" rel="self" type="application/rss+xml" />
	<link>http://arnoldit.com/wordpress</link>
	<description>by Stephen E. Arnold</description>
	<lastBuildDate>Mon, 22 Mar 2010 07:18:17 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The Discovery Hoax: Commercial Databases Make Big Promises</title>
		<link>http://arnoldit.com/wordpress/2010/03/08/the-discovery-hoax-commercial-databases-make-big-promises/</link>
		<comments>http://arnoldit.com/wordpress/2010/03/08/the-discovery-hoax-commercial-databases-make-big-promises/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 06:04:17 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[feature]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[online (general)]]></category>
		<category><![CDATA[publishing]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=11176</guid>
		<description><![CDATA[I was given a box lunch and a can of Pepsi as compensation for my one hour talk at a conference last week. I had an interesting conversation with a former big wheel in commercial database publishing. I thought the wizard was a retired poobah. I was wrong. The fellow had his shoulder pads on, [...]]]></description>
			<content:encoded><![CDATA[<p>I was given a box lunch and a can of Pepsi as compensation for my one hour talk at a conference last week. I had an interesting conversation with a former big wheel in commercial database publishing. I thought the wizard was a retired poobah. I was wrong. The fellow had his shoulder pads on, a sweatband, and Gucci cleats. He’s back on a commercial company’s publishing team. I am an old, cowardly goose, and it is with trepidation that I get too close to big people garbed for quasi-military re-enactments related to electronic information.</p>
<p>I asked the industry titan what his new gig involved. I recall one word, which he repeated several times to me, the addled goose. The word? “Discovery.” I thought I was having a <em>The Graduate</em> moment. In 2010, plastic was a loser. The winner? Discovery.</p>
<p>Yep, the lingo of the search and content processing market has reached the world of professional publishing and for-fee database access.</p>
<p>The idea, as I understood it, is that this commercial company will allow a user to enter a keyword; for example, <em>employee stock ownership</em>. The system will crunch away and present:</p>
<ol>
<li>Results from the firm’s for fee databases. Not just anyone can run a search. The user has to have access to an institutional account or sign up and pay. There is some free stuff, but this is a real, live make-money-or-die operation.</li>
<li>The system will also “discover” possibly related content and list that information in the form of links. I think the idea the titan was communicating is what <a href="http://www.endeca.com" target="_blank">Endeca</a> called “Guided Navigation” back in 1999! Not exactly yesterday! To see the Endeca system in action just go to <a href="http://www.officefurniture.com" target="_blank">OfficeFurniture.com</a>.</li>
<li>Content from the public Web.</li>
</ol>
<p>The idea is that a person using a commercial system will enter a search string and then see links to related content. This works for buying office furniture. I am not sure how a computational chemist would react to a suggestion she read a blog post about a meth lab that blows up.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/03/image2.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/03/image_thumb2.png" border="0" alt="image" width="244" height="184" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Yep, our professional grade service needs those custom chrome wheels. Image source: </span><a title="http://www.up.ac.za/organizations/movup/images/minefun/indian_haul_truck.jpg" href="http://www.up.ac.za/organizations/movup/images/minefun/indian_haul_truck.jpg"><span style="color: #800000; font-size: xx-small;">http://www.up.ac.za/organizations/movup/images/minefun/indian_haul_truck.jpg</span></a></p>
<p>I asked what happened if I used one of the company’s business databases and entered the search term “management.” I got a bit of double talk and the titan backed up, trying to get away from me. The reason I asked about this type of search is that I know from hands-on experience that the use of a general controlled term in his firm’s databases does not generate a usable results list. Thus, any “discovered” information is likely to be wide of the mark. Broad queries don’t often work too well in the for-fee, quite specific content in certain commercial systems. A single word like “management” in a Google search box generates what is highly ranked by clueless millions like a link to the <a href="http://en.wikipedia.org/wiki/Management" target="_blank">Wikipedia entry</a>.</p>
<p><span id="more-11176"></span></p>
<p>The commercial databases operate on a quite different premise. When a biochemist searches for a specific compound like 7-chloro-1,3-dihydro-1-methyl-5-phenyl-2H-1,4-benzodiazepin-2-one, the biochemist pretty much wants that compound, so suggestions from the Web sound good to someone who is not a biochemist but may not rev the motor of the researcher. Highly specialized services such as Chemical Abstracts can point to other content in which the compound is mentioned. But the difference between a fuzzy word like “management” and a chemical word like “methyl-5” is significant.</p>
<p>I next asked the titan how his discovery system differed from a federated search. The titan’s eyes narrowed, and he said, “Hey, I thought we were friends. This is not a debate.”</p>
<p>Wrong. When I converse with a person, I want that person to challenge me, and I want to challenge that person. I am not too interested in one’s golf game, killing deer, or the challenges facing Conan O’Brien.</p>
<p>I agreed but I pointed out that tossing around marketing buzzwords when describing the features of a professional information retrieval system catches the addled goose’s attention. Here’s why:</p>
<ol>
<li>Commercial systems are under significant pressure from demographic shifts. One example is that the nature of research is changing. Some commercial database publishers are having to charge more money to keep their revenue up. When there are fewer clients (young lawyers or young researchers) who have the means to pay for information, the few paying customers have to pay more. Disguising price hikes with features that may be of little or no interest to a user is a marketing play. The reason a person uses a specialized information service is to get on-point information. Fluff is easy. The hard core information is the purpose of the user’s search. Marketing won’t win the lawyer or biologist who won’t or can’t pay for the for-fee service. Put the effort into the unique information, not lights in the wheel wells and 20 inch chrome wheels.</li>
<li>As I pointed out in my March column for Information World Review, putting junk like real time search results in a Google or Bing result set makes me do extra work. If I want specialized types of content, I want to use a specialized system for that information. I search a system to get the best that system has to offer. A bunch of results from different services does not help me. I suspect that if I saw a bunch of odds and ends when I was searching a for fee service, I would be really angry. Google is free. I don’t care what that outfit does. When I pay for a commercial service’s content, spare me the crap from unvetted Web sites. I can do the research myself, and I don’t want training wheels.</li>
<li>The notion of displaying related content is one of those short cuts that people under 30 really like. Government managers like software that tells folks what they need to know. The idea is that software should eliminate the need for browsing, reading, and filtering. Yep, that’s a great recipe for success. So-called experts don’t know the history of online and give talks that describe stuff that won’t work as planned. The wheel gets reinvented again and again. The same errors are repeated with numbing regularity.</li>
</ol>
<p>The addled goose ventures out less and less. The world of commercial online is too frightening. The buzzwords strike me and make me even more addled. The titans scare me. The future is going to be shaped by people who mix up marketing and truly useful professional information. Yikes.</p>
<p>Stephen E Arnold, March 7, 2010</p>
<p>So I got paid for the talk and I got some food. This is a compensated post. Notice, however, that I did not include any links. Why? Mention these companies pushing deck chairs around the deck of the Steamship <a href="http://tenement-museum.blogspot.com/2010/01/tragedy-of-steamship-general-slocum.html" target="_blank">General Slocum.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/03/08/the-discovery-hoax-commercial-databases-make-big-promises/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When Domains Collide</title>
		<link>http://arnoldit.com/wordpress/2010/03/01/when-domains-collide/</link>
		<comments>http://arnoldit.com/wordpress/2010/03/01/when-domains-collide/#comments</comments>
		<pubDate>Mon, 01 Mar 2010 05:01:29 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[business strategy]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[online (general)]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=11092</guid>
		<description><![CDATA[Editor’s Note: This is a modified version of the lecture that Stephen E Arnold, ArnoldIT.com, delivered in Philadelphia, March 1, 2010. The actual presentation was an extemporaneous talk based on this preliminary set of notes.
I want to thank NFAIS for inviting me to address the members of this professional organization. The world of bibliography, abstracting, [...]]]></description>
			<content:encoded><![CDATA[<h5><em><span style="color: #800000;">Editor’s Note: This is a modified version of the lecture that Stephen E Arnold, <a href="http://www.arnoldit.com" target="_blank">ArnoldIT.com</a>, delivered in Philadelphia, March 1, 2010. The actual presentation was an extemporaneous talk based on this preliminary set of notes.</span></em></h5>
<p>I want to thank NFAIS for inviting me to address the members of this professional organization. The world of bibliography, abstracting, indexing, professional publishing and academic research has been shaken to its foundations in the last three or four years. The Richter scale measuring the waves pulsing through the bedrock of information access is being stretched. I find it difficult to talk about what is happening and what information professionals can do about those pulses.</p>
<p>This morning I want to put the pulses into a context. I am cautiously optimistic about a finding my research has revealed. Specifically, the shocks are coming from the integration of formerly separate disciplines into new services. In short, the traditional methods are being put into software and hardware modules and used to build new, more efficient, and more flexible services. Complete information businesses are now a commodity component that a clever engineer can use like a building block. Good news for engineers skilled in integration. Not such good news for experts in a hand-craft like Linotype operation. By snapping together modules, domains collide and are reinvented.</p>
<p>That’s today’s world of information.</p>
<h4><strong>Where We Are</strong></h4>
<p>Today we live in a world in which a number of global, possibly monopolistic online research services stand alongside literally a hundred million or more citizen journalists creating blogs and tweets.</p>
<p>Until recently, say about 1979 or 1980, a scholar transported from the 11<sup>th</sup> century scriptorium would have become familiar quickly with the hard copy research books painstakingly documented by Constance Winchell. But move that person to today’s world and the mental shift would be more difficult, perhaps impossible.</p>
<p>Since the advent of online (anyone remember <a href="http://en.wikipedia.org/wiki/NLS_%28computer_system%29" target="_blank">NLS</a>?), information is just “out there”. Today information is “here” when it appears on a screen. The display of information is evanescent until it is “written” (that is, copied) to a storage device which may be located “out there”. It is possible to print an item of information, but the digital instance is the “real information.” This is a significant conceptual shift since online became our common information currency.</p>
<p>In fact, I cannot begin work until I “find” the particular electronic instance on which I am to work. Without search and retrieval, I am a cooked goose.</p>
<p>And just finding a particular document can be difficult even with the many search systems available. If our time traveling 11th century researcher can print a document, the information needed may be surrounded by unwanted images and advertisements. Without the ability to recognize the “real” information, our 11th century scholar would be hard pressed to use today’s information retrieval systems. The monk comes from another time, and that time has its own domain of information. The domain includes ways to create information, ways to access information, and ways to reference other information. The monk might be squashed when his domain collided with the domain of 2010 information access. When domains collide, methods are crushed, recycled, and remade. This is deeply disturbing to people who cling to specific ways of doing such things as research.</p>
<p>The implications of domain collision are important in my opinion. Economics, human behavior, work processes, and speed are defined by domains. Let’s run down a handful of the challenges domain collisions ignite. The good news is that domains that touch create a boundary condition in which opportunities can flourish.</p>
<h4><strong>Challenges of Domain Collisions</strong></h4>
<p>If you have a business school degree, you have studied the touchstone buggy whip reference in Theodore Levitt’s “Marketing Myopia” that appeared in the <em>Harvard Business Review</em> in 1960. The idea is that a buggy whip manufacturer who anticipated the advent of the automobile could have expanded the product line to include a leather steering wheel wrap or automobile interiors.</p>
<p>Thus, the problem is that each domain has a certain way of perceiving phenomena. I won’t dwell on phenomenological existentialism, but I think it has quite a bit to teach us about what we can see when something “new” this way comes. We are, in the telling phrase of William James, stricken with “a certain blindness”. We simply cannot see beyond our domain. When domains collide, not only is our vision impaired, but we must also deal with processes and methods that have been transformed by the forces involved.</p>
<p>Not surprisingly, the problems of apprehending have triggered a cascade of challenges. Vocabulary is an issue. One example is the use of abbreviated spelling and neologisms to communicate in Twitter “tweets” or short messages via a mobile device. Messages such as <strong><em>ru w/me</em></strong> grate on some. To those in the domain, the message is clear and appropriate.</p>
<p>Other phenomena I have observed include:</p>
<ol>
<li>Work methods crafted for one domain such as copying a manuscript by hand on animal skin do not transfer to another domain such as copying information to a storage device. An entire lifetime of learning is irrelevant in the new domain.</li>
<li>The time required to assemble a document is measured by manual tasks that are often organized in a sequential manner. The digital domain allows many tasks to be handled quickly and, in some cases, in parallel.</li>
<li>The costs for manual, serialized work processes can be problematic. When software can be used to eliminate certain work previously done by humans, the economics change.</li>
</ol>
<p>I think you can see from these examples that our time traveling researcher from Mont St Michel in the Middle Ages would have a steep learning curve.</p>
<p>I have given quite a bit of thought to the implications of this type of domain collision. I know when I look at banking, retail, manufacturing, and finding the right person to marry that domain collisions are one of the defining attributes of today’s world.</p>
<h4><strong>Publishing</strong></h4>
<p>I want to comment about publishing because most NFAIS members are involved in the creation, selection, and dissemination of information. The domain collision began with the advent of the online search systems for the NASA RECON project, the work of Dr. Gerard Salton (Cornell University), and the non-linear increase in the capabilities of hardware and software.</p>
<p>What is interesting to me is that since this revolution began, arguably in the 1970s, publishing has been eager to embrace certain technologies yet reluctant to get too close to other technologies.</p>
<p>Let me give you an example. When I worked at the Courier Journal &amp; Louisville Times Co., we operated a rotogravure press and we printed the <em>New York Times</em> <em>Sunday Magazine</em>. We embraced traditional rotogravure printing technology and then we adopted technology that chopped the manual plate making process out of the work flow. We used computers, fancy software, and numerically-controlled presses as early as the early 1980s.</p>
<p>The Courier-Journal Board of Directors understood the importance of electronic information and created a separate business unit to build digital products. I was lucky to participate in the development of a profitable online business with ABI/INFORM, Business Dateline, Pharmaceutical News Index, and the core technical databases that were the foundation of today’s Cambridge Scientific Abstracts. This work took place in the early 1980s and relied on traditional mainframes and timesharing businesses like <a href="http://en.wikipedia.org/wiki/Tymnet" target="_blank">Tymnet</a> and <a href="http://en.wikipedia.org/wiki/Dialcom" target="_blank">Dialcom</a> as service bureaus.</p>
<p>I know from first-hand experience that those who managed the technologies steeped in the domain of traditional newspaper production believed <em>their</em> unit of the company was in the thick of technological change. The electronic publishing technology was a radical and strange undertaking. The people running the state-of-the-art four color printing presses did not see how electronic information could be a viable business.</p>
<p>We know now that the electronic publishing technology has emerged as one of the key technologies for information companies today. In fact, the brutal struggles between Macmillan and Amazon, Apple and Sony, and Google and book publishers are anchored in the technology that was a second-class citizen in the 1980s.</p>
<p>What’s interesting is that within publishing the domain of the traditional products like books, music, motion pictures, and television programming is now colliding with the domain of the network computing infrastructure. Complete businesses and their nested processes are now a Web service. One can download an electronic publishing system as open source software. The key point is that anyone anywhere in the world can become a digital newsroom with a Web site, newsfeed, and a community.</p>
<p>What’s even more interesting is that the agents of change are the children of many publishing executives and in some cases, the former employees of established publishing and rich media companies.</p>
<p>Another interesting point is that the new domain of content production is surrounding the traditional information industry, which Paul Zurkowski tried to capture in this diagram from the Information Industry Association in the mid-1980s. In my opinion, the diagram nicely summarizes what we now know as the Petri dish for Amazon, Apple, and Google, among other firms.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0021.jpg"><img style="display: inline; border: 0px;" title="clip_image002" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image002_thumb1.jpg" border="0" alt="clip_image002" hspace="12" width="243" height="244" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">This is a diagram created by the “old” Information Industry Association. Created in the mid 1980s, it is an attempt to show how the information world at that time was beginning to develop. What’s interesting is that the success of Amazon, Apple, and Google, among other companies, is dependent to some degree on combining several of these “old” segments in one service.</span></p>
<p>When I look at this diagram, I can see that the success of <a href="http://www.amazon.com" target="_blank">Amazon</a>, <a href="http://www.apple.com" target="_blank">Apple</a>, and <a href="http://www.google.com" target="_blank">Google</a> in information comes from taking the building blocks from this 20-year-old diagram and combining pieces into new constructions. Keep in mind that these firms are not in the strict sense traditional publishing companies. These are technology-centric companies whose engineering uses information as a catalyst to create new functions.</p>
<p><span id="more-11092"></span></p>
<p>Let me identify and comment briefly on three examples and then provide some concluding observations. This is a rich area and, if we had more time, it would warrant a more in-depth discussion.</p>
<p>First, this diagram makes it easy to see that Apple fused hardware, information, and software. Once Apple saw how the iPod fused with the iTunes online store and the monetization of digitally-protected content worked, Apple was quick to focus on supporting the Windows platform. None of the competitors in MP3 players hit upon the particular chemical-like combination Apple found. Like a pharmaceutical company, Apple came up with the correct formulation and none of the competitors have been able to crack the Apple formulation. Apple created a new domain and has been able to use product commoditization and different value and quality pairs to build a de facto monopoly. Apple has recently used its disruptive power to change the Amazon subsidizing approach to eBook sales. Amazon, it seems to me, is in a defensive position in online music, eBooks, and now in the hardware business itself. Amazon’s core competency and its domain expertise are not likely to be sufficient to deal with Apple.</p>
<p>Second, in the case of Google versus Microsoft in Web search we see that Google has expanded beyond simple indexing into technology infrastructure. The firm’s investment in next-generation computer hardware and software has created a “mini pre-Judge Greene AT&amp;T”-style monopoly. With the infrastructure in place, Google has been able to diffuse (perhaps a better word is seep) into other business sectors. This tactic is difficult for incumbents to blunt because Google has an infrastructure in place and its services can be changed with a few lines of software code. Incumbents have to figure out what Google is doing, make sure the infrastructure is in place, and then respond to Google. The problem is that once an incumbent realizes Google is a threat, it is too late. Google now has scale and scale cannot be duplicated quickly, cheaply, and robustly. Scale becomes a competitive advantage just as it was for the “old” AT&amp;T.</p>
<p>Third, we can see that Amazon has moved from the online sale of fungible products like books into a very different space by setting up an online store and then selling tangible and intangible products. Of most interest to me is Amazon’s video rental business and its music business. Amazon has a striking array of free, unprotected MP3 files available. Amazon has blended Content Services, Facilitation Services, and Content Packages. Amazon gets involved in setting up storefronts for other vendors. Amazon is also a publisher because I can take my quite dull monographs and sell them there.</p>
<h4><strong>Some Lessons</strong></h4>
<p>What do these examples tell us about domains that brush one another? The IIA diagram shows quite separate segments of the information industry in 1980. What we now know is that the boundaries between once-distinct information sectors have touched. The interaction or collision has produced opportunities and three companies along with Facebook, Twitter, and a handful of others have trumped the incumbents in revenue, customer pull, and products.</p>
<p>First, I see significant opportunities for those who can look at the original IIA diagram and weave together solutions that cut across the boundaries the IIA identified 20 years ago. Instead of problems, I think quite a bit of opportunity exists by looking at ways to bridge domains.</p>
<p>Second, I think the examples of Amazon, Apple, and Google are not isolated. Each has a high profile and attracts a great deal of attention because of the economic clout to which there seems to be no easy rejoinder from a company lacking a cross-domain presence. When a company with a competence in a specific domain runs into a cross domain player, the buggy whip problem cracks once more.</p>
<p>Third, I believe lower cost and more efficient business processes will encroach and eventually supplant less efficient and more expensive business processes. Some information sectors may persist as the equivalent of couture fashion and luxury goods like the Rolls Royce, but for the majority of information functions, the software-centric systems will become dominant.</p>
<p>To conclude, let me make several observations about opportunities. One is that the success of the Apple iPod/iPhone/iPad App Store provides some ammunition for developing information services specifically for these customer segments. The idea that a publishing company can surf on Amazon, Apple, or Google is one that should be given serious thought. Pick one or pick all three. Ignore them all at your peril.</p>
<p>Another is that the notion of stopping the diffusion of technology into established domains has to be put aside. An attempt to stop lower cost and more efficient businesses from entering a market will not work. The reason is that both money and demographics are forcing the change. In today’s economic climate, investors will fund entities that use technology to lower costs, increase efficiency, and gain agility. Entities lacking these attributes will face significant challenges.</p>
<p>Finally, many information companies are adapting to the collision of the traditional information domain and the newer, software-enabled approaches. The popular press fixates on big name companies, but there are firms mounting significant challenges across the information spectrum. Examples range from YouTube.com distributing Sundance Festival films to Demand Media’s creating content at low cost for traditional publishers who do not know how to “manufacture” low-cost information. Facebook, not Google, is now one of the largest sources of rich media for its members. And Facebook is becoming a news referrer and a news generator of note. Not even Google has an answer to Facebook’s social content solutions.</p>
<p>When domains collide there is opportunity. Obviously great effort is needed to look objectively at existing business methods and processes, make changes, and then create new solutions. A changing of the guard is underway, and the building blocks of this revolution have been known for decades.</p>
<p>Now, time is short and purposeful, informed action is needed. If you want to talk about our work in identifying new opportunities, just contact me at sa at arnoldit dot com.</p>
<p>Stephen E Arnold, March 1, 2010</p>
<p><em>Yep, I was paid to write this. My opinion, though.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/03/01/when-domains-collide/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Is Content Management a Digital Titanic?</title>
		<link>http://arnoldit.com/wordpress/2010/02/25/is-content-management-a-digital-titanic/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/25/is-content-management-a-digital-titanic/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 06:05:14 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[eDiscovery]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[financial]]></category>
		<category><![CDATA[text processing]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=11045</guid>
		<description><![CDATA[Content management is a moving target. Unlike search, CMS is supposed to generate a Web page or some other type of content product. The “leaders” in content management systems or CMS seem to be disappearing into larger organizations. Surprising. If CMS were healthy, why aren’t these technology outfits growing like crazy and spinning off tons of [...]]]></description>
			<content:encoded><![CDATA[<p>Content management is a moving target. Unlike search, CMS is supposed to generate a Web page or some other type of content product. The “leaders” in content management systems or CMS seem to be disappearing into larger organizations. Surprising. If CMS were healthy, why aren’t these technology outfits growing like crazy and spinning off tons of cash?</p>
<p>I am no expert in CMS. In fact, I am not an expert in anything unlike the azure chip consultants, poobahs, and pundits who profess deep knowing at the press of a mouse button. In my experience, CMS emerged from people not having an easy way to produce HTML pages that could be displayed in a browser.</p>
<p>If HTML was too tough for some people, imagine the pickle barrel in which these folks find themselves today. In order to create a Web site, more than HTML is required. The crowd who relied on Microsoft’s FrontPage find themselves struggling with the need to make Web pages work as applications or bundles of applications with some static brochureware thrown in for good measure.</p>
<p>To make a Web site today, technical know-how is an absolute must. Even the very good point-and-click services from <a href="http://www.squarespace.com" target="_blank">SquareSpace.com</a> and <a href="http://www.weebly.com" target="_blank">Weebly.com</a> can baffle some people.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image25.png"><img style="display: inline; border-width: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb25.png" border="0" alt="image" width="244" height="158" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">The azure chip consultants, the mavens, and the poobahs want to be in the lifeboats. Women and children to the rear. Source: </span><a title="http://www.ronnestam.com/wp-content/uploads/2009/02/lifeboat_change_advertising_sinking.jpg" href="http://www.ronnestam.com/wp-content/uploads/2009/02/lifeboat_change_advertising_sinking.jpg"><span style="color: #800000; font-size: xx-small;">http://www.ronnestam.com/wp-content/uploads/2009/02/lifeboat_change_advertising_sinking.jpg</span></a></p>
<p>Move the need for a dynamic Web site into a big organization that is not good at technology, and you have a recipe for disaster. In fact, the wreckage created by some content management vendors, pundits, and integrators is of significant magnitude. There’s the big hassle in Australia over a blue chip CMS implementation that does not work. The US Senate went after the bluest of the blue chip integrators because a CMS could not generate a single Web page. Sigh.</p>
<p><span id="more-11045"></span></p>
<p>So why are big companies buying CMS vendors? In my experience, CMS systems don’t work when licensees want to implement some new features and functions. Sure, the plumbing can be reinvented but that is difficult, time consuming, and expensive.</p>
<p>My hunch is that the companies buying CMS outfits want to cash in on these fee opportunities. After all, how tough can making some Web pages be? (Tip: Making Web pages can be tough.)</p>
<p>I have run across some situations where the CMS vendor created trouble for search vendors. The procurement teams and some of the scuttling CMS gurus are turning to search and content processing technology to make lemonade out of very expensive lemons. I think in the white goods business, this is called bait-and-switch or selling damaged goods. Something along those lines I think.</p>
<p>Some vendors can rise to the challenge. I have been impressed with the performance of the engineers from <a href="http://www.exalead.com" target="_blank">Exalead</a> in Paris, for example. But other search vendors are pulling the old “let’s sell it and then code it” solution.</p>
<p>I keep wondering why companies with CMS systems keep buying more CMS systems. This question applies to a number of companies. For example, <a href="http://www.opentext.com" target="_blank">OpenText</a> showed me a LiveLink system years ago that was a CMS with collaboration. Now OpenText owns and has to support software with similar functions from RedDot, Vignette, and Nstein. Interesting financial and technical challenge I think.</p>
<p>Other big outfits have grabbed CMS companies too. These include <a href="http://www.oracle.com" target="_blank">Oracle</a> which acquired Stellent. I don’t hear much about Stellent anymore. That may be a reflection of Oracle’s focus on making everything into a problem that Oracle database can solve. What happened to Triple Hop? What happened to SES10g? With Sun Microsystems, I think Oracle will offer hardware as a fix for certain performance related issues. EMC is playing the software and storage angle. EMC bought Documentum and recently Kazeon, a company focusing on eDiscovery. Maybe software and consulting can drive sales of storage devices? Iron Mountain has followed a similar path, first with Stratify and then with <a href="http://www.ironmountain.com/mimosa/" target="_blank">Mimosa</a>. <a href="http://www.autonomy.com" target="_blank">Autonomy</a> snagged Interwoven.</p>
<p>I spoke with a company a couple of days ago with an interest in open source CMS systems. As part of a research project about open source software in 2009, I had to grind through the listings in the OpenSourceCMS service. You can find that information at the <a href="http://php.opensourcecms.com/general/ratings.php" target="_blank">OpenSourceCMS Ratings page</a>. After working through the ratings, it was clear that these software products were the equivalent of a model airplane kit with the pieces outlined in purple ink on balsa wood. To build the airplane, one had to cut out each piece, consult the schematics, assemble the plane, cover it with tissue, paint it, and then fly it. The knock against most CMS vendors is that the products are complex, tough to scale, and expensive to customize. Upstarts like <a href="http://www.wordpress.org" target="_blank">Wordpress</a>, <a href="http://www.squarespace.com" target="_blank">SquareSpace</a>, and <a href="http://www.weebly.com" target="_blank">Weebly</a> have won my affection.</p>
<p><a href="http://www.microsoft.com" target="_blank">Microsoft</a> seems intent on commoditizing further its SharePoint system. Also, in the mid-market are the open source “industrial strength systems.” At each layer of the market, there is significant change. The big guys, sure of their resources and management expertise, are confident that the problems of the Vignette-, Documentum-, and Stellent-type systems are no big deal. (I think the problems are a big deal.) In the middle are the open source vendors who depend on license fees. At the bottom and pushing upwards are the drag-and-drop, do-it-yourself crowd. (Great stuff from Wordpress, SquareSpace, and Weebly by the way.)</p>
<p>Some CMS can be made to work. The clients are happy. Other clients can get the CMS to work but are unhappy with the costs of achieving that goal. But most CMS installations have one common characteristic: architectural changes are very difficult indeed. With SharePoint, one can look complexity straight in the eye.</p>
<p>For some organizations, I don’t see an easy fix for CMS woes.</p>
<p>When <a href="http://dev.w3.org/html5/spec/Overview.html" target="_blank">HTML5</a> arrives, I think the big boys and the folks in the middle are going to face some major challenges. My bet is that the commercial outfits like Wordpress, SquareSpace, and Weebly will become more of a factor even in large companies. Heck, this stuff works and it is a cloud service. Someone else keeps the lights on and fixes the roof when it springs a leak.</p>
<p>What happens to the customers?</p>
<p>I think customers will be like the unfortunate women and children pushed out of the life boats by the bigger people. The CMS consultants will scramble. Some will reposition themselves and declare themselves experts in some other field. A few may invent a field and run boot camps to teach folks to hire them to learn about this “field”. What’s the fix? Easy-to-use cloud services from upstarts most likely.</p>
<p>Stephen E Arnold, February 26, 2010</p>
<p><em>No one paid me to write this article. Because I reference visually a sinking ship, I will report writing for free to the Maritime Administration and also to the Coast Guard.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/25/is-content-management-a-digital-titanic/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Buzz Search: Defaults Do Not Fly</title>
		<link>http://arnoldit.com/wordpress/2010/02/22/buzz-search-defaults-do-not-fly/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/22/buzz-search-defaults-do-not-fly/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 05:01:31 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[feature]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[real time search]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[social]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10987</guid>
		<description><![CDATA[Editor’s Note: Constance Ard, the Answer Maven, is one of the goslings. She wrote an overview of Google Buzz search functionality. Ms. Ard is active in the Special Libraries Association, heads up the legal interest group, and has an MLS with an emphasis on online search, taxonomies, and content processing. 
With the release of Buzz [...]]]></description>
			<content:encoded><![CDATA[<p><em><span style="color: #000080;">Editor’s Note: Constance Ard, the Answer Maven, is one of the goslings. She wrote an overview of Google Buzz search functionality. Ms. Ard is active in the Special Libraries Association, heads up the legal interest group, and has an MLS with an emphasis on online search, taxonomies, and content processing. </span></em></p>
<p>With the release of Buzz flapping everyone’s wings over the last Internet half-life, it’s time to consider some practical application for Buzz. Danny Sullivan at Search Engine Land has laid the <a href="http://searchengineland.com/how-to-search-google-buzz-36366">groundwork</a> for searching Buzz.</p>
<p>For the record, the “type it in the box and trust the search results” approach isn’t enough with this service from Google. You can see below that Buzz, a social media tool that gets food from Twitter, Google Reader, Friend Feed, and SMS, displays results from a typical box search that are surprisingly old in the real-time scheme of things.</p>
<p>These results are for a search done at approximately 8 p.m. EST on February 17, 2010, through the Buzz search box with the term: <em>Olympics</em>. The first result is time-stamped 4:50 p.m. The last result was stamped 9:41 a.m. and the second was stamped 8:23 a.m. These are not exactly real-time results and not even reverse chronological in display.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image002.jpg"><img style="display: inline; border: 0px;" title="clip_image002" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image002_thumb.jpg" border="0" alt="clip_image002" width="244" height="49" /></a></p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0024.jpg"><img style="display: inline; border: 0px;" title="clip_image002[4]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0024_thumb.jpg" border="0" alt="clip_image002[4]" width="244" height="24" /></a></p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0026.jpg"><img style="display: inline; border: 0px;" title="clip_image002[6]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0026_thumb.jpg" border="0" alt="clip_image002[6]" width="244" height="44" /></a></p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0028.jpg"><img style="display: inline; border: 0px;" title="clip_image002[8]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image0028_thumb.jpg" border="0" alt="clip_image002[8]" width="244" height="56" /></a></p>
<p>The same search on <a href="http://www.buzzzy.com/">Buzzzy.com</a> (selected results shown below) done at the same approximate time provides even more irritating displays. Has anyone heard of time, date stamps? I understand that in real-time search hours count but in search, pinpointing an accurate date and time is essential.</p>
<p><span id="more-10987"></span></p>
<p>As you look at the results, notice the first result was posted 7 hours prior to the search and the second hit was posted 5 days prior. How does this make sense?</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00210.jpg"><img style="display: inline; border: 0px;" title="clip_image002[10]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00210_thumb.jpg" border="0" alt="clip_image002[10]" width="244" height="111" /></a></p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00212.jpg"><img style="display: inline; border: 0px;" title="clip_image002[12]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00212_thumb.jpg" border="0" alt="clip_image002[12]" width="275" height="244" /></a></p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00214.jpg"><img style="display: inline; border: 0px;" title="clip_image002[14]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00214_thumb.jpg" border="0" alt="clip_image002[14]" width="287" height="136" /></a></p>
<p>On a positive note, the filter provided by Buzzzy for hour (seen below) actually does get down to real-time results with the first hit being an item posted from Twitter 1 minute prior to search execution. Disappointing is the fact that nowhere on the page of results is a real date that can be used to capture data and show a point in time result.</p>
<p>Now perhaps it’s the law librarian in me that makes this lack of specific date and time particularly discouraging but even Buzz will find its way into the courtroom somehow, someway.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00216.jpg"><img style="display: inline; border: 0px;" title="clip_image002[16]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00216_thumb.jpg" border="0" alt="clip_image002[16]" width="244" height="129" /></a></p>
<p>Delving a bit deeper into Mr. Sullivan’s tutorial on searching Buzz, I found that he offers some advanced search features. I tried out Mr. Sullivan’s tip on searching Buzz for <strong>“has:link”</strong>. This search gave me a bit of confirmation that the search engine company that is synonymous with finding information is actually behind Buzz. Every result on the first long page of results included links. <em>Note: This is an advanced feature and advanced features are not used by the majority of searchers.</em></p>
<p>The author search seems to work well although I’m slightly confused by one thing. A result that was retrieved when I searched for an author name and was designated as “uploaded by nickname” was not retrieved by a search for that same nickname. The commenter advanced search also seemed to work well and provided relevant results.</p>
<p>Buzzzy.com does not appear to offer up the same advanced search features as are provided by Buzz. It was interesting to note that the results for a name search that I ran which retrieved about a dozen results in Buzz, retrieved only 2 hits in Buzzzy.com and none of them were the same. Also the time stamp for one of those results was displayed:</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00218.jpg"><img style="display: inline; border: 0px;" title="clip_image002[18]" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/clip_image00218_thumb.jpg" border="0" alt="clip_image002[18]" width="244" height="31" /></a></p>
<p>The actual posting was made on February 12. This was neither an intuitive nor an accurate date/time stamp.</p>
<p>Mr. Sullivan also discusses the activity of Buzz vs. Twitter. He doesn’t give real numbers because that’s virtually impossible. What he does offer is the summation that Twitter still has more activity than Buzz, which is a pretty safe summation considering the age difference between the two.</p>
<p>Here are my key take-outs for both Buzz and Buzzzy.com searching.</p>
<p><strong>Google Buzz</strong></p>
<ul>
<li>For accurate results for “updates,” follow Mr. Sullivan’s advice on using Google’s real-time search for the Google.com domain.</li>
<li>Advanced features for author, commenter, link/image/video work well. Author’s Note: I am not Buzzing so the email search for “is” that Mr. Sullivan recommends was not tested.</li>
<li>A fatal flaw is the date, time display.  I’d rather have the results offered up with a true date and time than hours, minutes, seconds. Ideally both displays would be provided.</li>
<li>Relevance vs. chronological sorting is not apparent.  Neither seems to be the default.  Pick one and make it so.</li>
<li>“Googling” Buzz at this point in time does not mean users will “find” the answer.</li>
</ul>
<p><strong>Buzzzy.com</strong></p>
<ul>
<li>First out of the gate as a third party service but with some serious flaws.</li>
<li>Filter features are good and seem to work well.</li>
<li>Relevance vs. chronological sorting is an issue for this search engine too.</li>
<li>No good date time stamp here either.  And no date on the page for search results to pinpoint a starting place.</li>
<li>Test searches demonstrate missing content.</li>
</ul>
<p>I’m sure some will say that the whole point of Buzzing is the real-time nature of social media. I would answer that with: a) don’t provide search if updates are all that’s important and b) real-time obviously puts a premium on time so fix the date of origin.</p>
<p>Constance Ard, February 22, 2010</p>
<p><em>ArnoldIT.com paid Ms. Ard to write her views for this Web log.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/22/buzz-search-defaults-do-not-fly/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Jargon Means Shields Up for Consultants</title>
		<link>http://arnoldit.com/wordpress/2010/02/21/jargon-means-shields-up-for-consultants/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/21/jargon-means-shields-up-for-consultants/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 12:04:23 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[business strategy]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10947</guid>
		<description><![CDATA[I just read “Computer Jargon Baffles Users, Hinders Security.” This is a Thomson Reuters’ news story, and I don’t know if the wild and crazy url will work when you read this. Not my fault. Email Thomson Reuters, whose customer support crew is ready to help you.
The news story is one that runs every few [...]]]></description>
			<content:encoded><![CDATA[<p>I just read “<a href="http://www.reuters.com/article/idUSTRE61I2OB20100219?feedType=RSS&amp;feedName=internetNews&amp;utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+Reuters%2FInternetNews+%28News+%2F+US+%2F+Internet+News%29" target="_blank">Computer Jargon Baffles Users, Hinders Security</a>.” This is a Thomson Reuters’ news story, and I don’t know if the wild and crazy url will work when you read this. Not my fault. Email Thomson Reuters, whose customer support crew is ready to help you.</p>
<p>The news story is one that runs every few months. The idea is that jargon is pretty much impossible for the average person to figure out. The argument in the Thomson Reuters’ story pivots on security, but the journalist could have picked on search, business intelligence, or any other common enterprise application. Jargon is a defense mechanism. Magic.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image18.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb18.png" border="0" alt="image" width="223" height="244" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Source: </span><a title="http://s.bebo.com/app-image/7979726037/5411656627/PROFILE/i.quizzaz.com/img/q/u/08/04/08/Force_Field.jpg" href="http://s.bebo.com/app-image/7979726037/5411656627/PROFILE/i.quizzaz.com/img/q/u/08/04/08/Force_Field.jpg"><span style="color: #800000; font-size: xx-small;">http://s.bebo.com/app-image/7979726037/5411656627/PROFILE/i.quizzaz.com/img/q/u/08/04/08/Force_Field.jpg</span></a></p>
<p>For me, the key passage in the Thomson Reuters’ story was:</p>
<blockquote><p>&#8220;The malicious and criminal use of cyberspace today is stunning in its scope and innovation,&#8221; said Dell Services President Peter Altabef. One problem is that computer &#8220;geeks&#8221; use jargon to cloak their work in scholarly mystique, resulting in a lack of clarity in everything from instruction manuals and systems design to professional training, the experts said. &#8220;If you don&#8217;t demystify security, people become anxious about it and don&#8217;t want to do it,&#8221; former U.S. Homeland Security Secretary Michael Chertoff told Reuters on the sidelines of the EastWest Institute security meeting in Brussels.</p></blockquote>
<p>I had a conversation with a big wheel from a blue chip consulting firm. I really want to reveal which firm, but my legal eagle squawks when I provide certain information in this Web log. The guts of the conversation are easy to summarize.</p>
<p><span id="more-10947"></span></p>
<p>The blue-chip outfit needed my input regarding a certain high profile search vendor. I made my point about the firm. My recollection is that I said:</p>
<blockquote><p>System does not work as advertised. The firm has generated revenues by buying other companies. The revenue looks great but there has not been enough will power nor money to integrate the different acquisitions’ technology. In short, lots of difficulty configuring, tuning, customizing, and scaling this system.</p></blockquote>
<p>The blue-chip woman told me that she had never heard this in her previous interviews. She called some self-appointed experts, a couple of azure chip outfits, and read Web logs. That’s how she located me, she added.</p>
<p>I pointed out the following things I had learned in the last three or four decades:</p>
<p>First, it is easy to be positive when you have no hands on experience with search, content processing, indexing, and repurposing content. Everyone knows how to run a query on Google. Few know or care how Google generates the results or how a Google results list presents only one company’s index of content. Other sources have to be consulted because, contrary to the average Web searcher’s perception, Google does not have “everything”. Maybe someday. Just not now.</p>
<p>Second, vendors could use the Excalibur Technologies / Convera marketing collateral from 1981, and most experts and procurement teams would not be able to explain whether the system described worked or not. Even when given a demo, the majority of the tire kickers have little experience determining how long it takes to update an index, why users cannot locate information in the system, or how to convince the boss to provide the money to scale the system so users don’t wait several minutes for results. In short, a lack of knowledge contributes to the search problems many organizations face. Licensees don’t know what they are licensing. Some search vendor marketers don’t explain the buzz words. Result: errors, confusion, and cost overruns.</p>
<p>Third, users cannot explain what they want in terms that can be mapped to the actual functions of most search systems. In the last big search procurement the goslings handled for a non US company’s government wide system, the most common interview responses were “make it like Google” or “I need a report, not a laundry list.” When users veer between Google’s paternalistic and arrogant approach and stuff in science fiction movies, a knowledge gap is evident.</p>
<p>Again I don’t want to focus on specific consultants or experts, but I can characterize why the information about search and content processing often is misleading to enterprise procurement teams. You may find my comments at odds with the “feel good” cheerleading that passes as objective analysis of search and content processing vendors. That’s okay. I hear this and then when the system goes off the rails, I get an opportunity to work with wiser people. A massive foul up often has a focusing effect, not always, of course.</p>
<ol>
<li>The people writing about search, content processing, information management, or a related field may not have “deep craft”, a phrase originating with W. Brian Arthur. In short, there is some knowledge, just not enough. The analogy is to a person who can use an ATM machine without knowing how the bank authenticates the person’s debit card. Search or any information-related system requires “deep craft”.</li>
<li>The so called experts may be journalists who have found a new career, failed programmers who can talk better than they can code, or an art history major who landed a writing job and finds herself an expert by virtue of reading articles about information. Yep, that works really well until the “expert” has to make something work. One CMS expert installed Vignette to manage a bunch of links. That worked really well after a few million bucks were dumped on the problem.</li>
<li>The expert may know something—for example, which company in Seattle fired a specific vendor and then sent out an email to other vendors for a price quote only if the vendors’ system worked—but cannot reveal details for fear of litigation. The result is baloney like the information in this blog. I know stuff, but my legal eagle suggests I maintain radio silence. When a news item appears, like the one about the Middle Eastern company’s unhappiness with a certain vendor’s search system, then I comment. So, the “law” masks some potentially useful information by accident.</li>
<li>The vendors recycle lingo without thinking about its jargon quotient. I wrote about the drifting away of Convera. Take a moment and read what Convera says about its search system. A marketing person at another company could recycle that lingo and it would work just fine. The words mean zero to most readers. I think that repeating jargon works like a mother’s heart beat. The baby is hard wired to a pattern. Run the pattern on a tape and the baby is a happy camper. Same with procurement teams in my opinion.</li>
</ol>
<p>On Thursday I had a long conference call about a certain vendor’s method of categorizing vendors as stars, contenders, dogs, and feral pigs. I pointed out that without some metrics or a repeatable method, the categorization was pretty much meaningless. One of the people on the call wanted to know if the consulting firm sold slots in its graphic. I have no idea, but I know that when the backlog goes to zero and the red ink flows, individuals make some darned interesting decisions. Remember Bernie Madoff?</p>
<p>Can you expect the “real story” from any azure chip consultant, poobah, self appointed expert, art history major turned information expert, or journalist going for the big payday as a poobah? Nope.</p>
<p>Direct talk, deep craft, and Malcolm Gladwell’s 10,000 hours are less plentiful than marketing baloney. It is much easier to use jargon, make the sale, and move on. Life in the 21st century is indeed amusing.</p>
<p>Just my opinion.</p>
<p>Stephen E Arnold, February 21, 2010</p>
<p><em>No one paid me to write this. Because I mention procurement, I will report my miserable condition that comes from writing for new money to the GSA. The GSA just spent $70 million to improve its information technology systems. Great project.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/21/jargon-means-shields-up-for-consultants/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Search Engine Convera Drifts Off</title>
		<link>http://arnoldit.com/wordpress/2010/02/16/search-engine-convera-drifts-off/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/16/search-engine-convera-drifts-off/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 07:05:38 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[business strategy]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[financial]]></category>
		<category><![CDATA[online (general)]]></category>
		<category><![CDATA[semantic]]></category>
		<category><![CDATA[technology]]></category>
		<category><![CDATA[text processing]]></category>
		<category><![CDATA[vertical search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10878</guid>
		<description><![CDATA[The journey was a long one, beginning with scanning marketing brochures in the 1990s. Convera has filed for a certificate of dissolution. I think this means that Convera has moved from the search engine death watch to the list which contains Delphes, Entopia, and other firms.

Convera splash page on February 15, 2010
You can read the official [...]]]></description>
			<content:encoded><![CDATA[<p>The journey was a long one, beginning with scanning marketing brochures in the 1990s. Convera has filed for a certificate of dissolution. I think this means that Convera has moved from the search engine death watch to the list which contains Delphes, Entopia, and other firms.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/converasplash.jpg"><img style="display: inline; border: 0px;" title="convera splash" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/converasplash_thumb.jpg" border="0" alt="convera splash" width="244" height="195" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Convera splash page on February 15, 2010</span></p>
<p>You can read the official statement for a few more days on the PRNewswire site. The title of the announcement is / was, “<a href="http://www.prnewswire.com/news-releases/convera-corporation-files-certificate-of-dissolution-trading-of-common-stock-to-cease-after-february-8-2010-payment-date-set-83848737.html" target="_blank">Convera Corporation Files Certificate of Dissolution, Trading of Common Stock to Cease after February 8, 2010 Payment Date Set</a>.” I am no attorney so maybe my lay understanding of “dissolution” is flawed, and Convera under another name will come roaring back. For the purposes of this round up of my thoughts, I am going to assume that <a href="http://www.convera.com" target="_blank">Convera</a> is comatose. I hope it bounces back with one of those miracles of search science. I am crossing my wings, even though each has a dusting of snow this morning. Harrod’s Creek has become a mid south version of Nord Kap.</p>
<p>For me, the key passage in the write up was:</p>
<blockquote><p>Convera Corporation announced today that it filed its Certificate of Dissolution with the Delaware Secretary of State on February 8, 2010, in accordance with its previously announced plan of complete dissolution and liquidation.  As a result of such filing, the company has closed its stock transfer books and will discontinue recording transfers of its common stock, except by will, intestate succession or operation of law.  Accordingly, and as previously announced, trading of the company&#8217;s stock on the NASDAQ Stock Market will cease after the close of business on February 8, 2010.</p></blockquote>
<p>My <a href="http://www.arnoldit.com/overflight" target="_blank">Overflight</a> search archive suggested that Excalibur Technologies was around in the 1980s. The founder was Jim Dowe, who was interested in neural networks. The notion of pattern matching was a good one. The technology has been successfully exploited by a number of vendors ranging from <a href="http://www.autonomy.com" target="_blank">Autonomy</a> to <a href="http://www.autonomy.com" target="_blank">Verity</a>. <a href="http://www.brainware.com" target="_blank">Brainware’s</a> approach to search owes a tip of its Prince Heinrich hat to the early content snow plowing at Excalibur. Excalibur used most of the buzzwords and catchphrases that bedevil me today, including “semantic technology.”</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image10.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb10.png" border="0" alt="image" width="244" height="184" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Sample of a category search on the Retrieval Ware system. The idea is that you would click a category.</span></p>
<p>One of my former Booz, Allen &amp; Hamilton colleagues made some dough by selling his ConQuest Software search-related technology to Excalibur Technologies. The reason was that the original Excalibur search system did not work too well. Excalibur, according to my Overflight archive, described itself as “leading provider of knowledge and media asset management solutions.”</p>
<p><span id="more-10878"></span></p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image11.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb11.png" border="0" alt="image" width="244" height="184" /></a></p>
<p>Retrieval Ware search result, which you could see via a key word query, a “natural language query,” or by clicking one of the categories in the category “view” interface.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image12.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb12.png" border="0" alt="image" width="244" height="184" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">This is a “report style” interface.</span></p>
<p>The ConQuest search system was essentially a manual approach to query expansion. If you had a document about “trucks” and there was a surge in “SUVs”, then a person had to hook the words together. Excalibur’s management team thought this approach was the cat’s pajamas. Manual indexing is a deal breaker unless you use the hybrid method supplemented with semi-autonomous agents like our pals in Mountain View do. (Sorry. No footnote because this is in my new Google study.) My recollection is that the ConQuest deal was in the $30 million range, maybe as high as $33 million. Not bad for a system that relied on manual indexing of word lists.</p>
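<p>To make the “hook the words together” point concrete, here is a minimal sketch of manual query expansion. It assumes a hand-maintained synonym table; the table, the function names, and the sample documents are hypothetical illustrations, not ConQuest or Excalibur code.</p>

```python
# Hypothetical, hand-maintained synonym table (an assumption for illustration,
# not ConQuest's actual data or format). A person has to add every link.
SYNONYMS = {
    "trucks": ["suvs", "pickups"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms plus any synonyms a person has linked to them."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search(query, documents):
    """Return ids of documents that contain any of the expanded terms."""
    terms = set(expand_query(query))
    return [doc_id for doc_id, text in documents.items()
            if terms.intersection(text.lower().split())]


if __name__ == "__main__":
    docs = {1: "Sales of SUVs surged", 2: "Quarterly report on sedans"}
    print(search("trucks", docs))
```

<p>The query “trucks” finds the SUV document only because someone typed the link into the table. Each new term needs another manual entry, which is why the method gets expensive as content volume grows.</p>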
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image13.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb13.png" border="0" alt="image" width="244" height="179" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Example of a Convera “visual search”.</span></p>
<p>The name Convera emerged from a tie up between Intel Corporation and Excalibur Technologies in 2000. The idea was for Intel to build big data centers, according to my notes from that time. The Intel crowd looked at a number of search systems, including one developed by Seymour Rubinstein (founder of WordStar) and one from Inktomi, but decided on the Excalibur approach, its manual synonym expansion, and its buzzwords. (I think that note to myself was humorous, but it was a decade ago.)</p>
<p>The deal moved forward and then about 36 months later became a financial event. Convera reported a loss of about $18 million on its 2003 revenues of about $30 million. By 2004, Convera, according to my notes to myself, had a net operating loss north of $150 million.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image14.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb14.png" border="0" alt="image" width="244" height="190" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Convera as an eCommerce search system. This is the one example I had in my files. The customer was </span><a href="http://www.eruggallery.com"><span style="color: #800000; font-size: xx-small;">www.eruggallery.com</span></a><span style="color: #800000; font-size: xx-small;">. The double g’s and the double el’s are tricky.</span></p>
<p>Okay. That’s no surprise. Search is a difficult business and the loss provided me with some supplemental information about the cost of servicing search and retrieval systems, the cost of coding to eliminate deal breaker bugs, the cost of marketing, the cost of customer support, and the cost of research and development.</p>
<p>Convera offered one of the first rich media indexing services. The company cut a deal with the National Basketball Association, and it was a bit of a challenge to the Convera management team and the company’s engineers. I noted that Convera could market but it could not deliver an affordable system.</p>
<p><strong><em>The Core Search System</em></strong></p>
<p>I am in the mood for a walk down memory lane. Let’s consider what Retrieval Ware offered a licensee. We need to keep in mind the financial scorched earth that the system seemed to create since 2001 or so. The idea was a “kitchen sink” approach to search and content processing. Convera’s technologies could do any information process a customer needed. Even today I can name a couple of vendors who make the same assertions to their prospects and customers. In my experience, this kitchen sink approach makes sense to the poobahs, satraps, and azure chip consultants but not to me. Frankly I think that the idea of buying one system, having it handle the different content processing tasks in an organization, presenting the results so busy managers can get answers, and generating ad hoc reports is a big, expensive job for all but a handful of outfits.</p>
<p>The core of the Convera system, based on my experience, was a series of numerical recipes that worked as long as there were sufficient system resources. Without the infrastructure, the Convera system was sluggish, so functions had to be turned off. Chopping down functions and choking content produces a number of challenges for the APRP or Adaptive Pattern Recognition Processing foundation. In fact, my work with Convera provided me with the view that semantic methods should be plumbing and kept out of sight. Furthermore, these methods should be used <em>only</em> when there were sufficient resources; otherwise, other parts of the system would either not work or not work in a useful manner.</p>
<p>I have a note about APRP that reminded me, “Why can’t these Convera guys just tell their customers about the bandwidth, hardware and storage demands before signing a deal?”</p>
<p>Convera offered a “distributed process architecture,” a phrase which reminds me of Google’s plumbing.</p>
<p>Convera was a layered search and content processing system. At the base were the methods of the original pattern recognition and neural network technologies. Then there was the ConQuest system. I think Convera’s engineers coded additional components and may have licensed technology from some other vendors.</p>
<p>The result was a system that might have had satisfactory price/performance numbers if the computing infrastructure available today had been available in the 2001 to 2005 period. As it turned out, the costs were too great and there was the financial meltdown.</p>
<p>Convera kept making deals in the US Federal government as the Convera investors and management team tried to find a money making approach. One angle was to index the Web for “vertical search”. I never understood this but the company whipped up a Web indexing system and invested in servers and storage to index the Web. The technology was okay, but it was the cost analysis that was the problem.</p>
<p>Unlike Google, Convera had to charge customers for the service. Google figured out how to charge advertisers to reach a user base. Convera never figured out a business model that would work. Convera caught the eye of smart money firm Allen &amp; Co., and the rest of the Convera story is one that is anchored in financial activities, not search technology.</p>
<p>I used the phrase “Convera approach” to refer to describing great functionality and then struggling to deliver what the sales professionals and marketing collateral described. This is a common problem, and the approach is one that is used even today.</p>
<p><strong><em>Retrieval Ware</em></strong></p>
<p>It is entertaining to think about how Convera&#8217;s Retrieval Ware solution worked.</p>
<p>The system was described as having modules or subsystems. For example, you, as a Retrieval Ware customer, could tap into:</p>
<p><em>Classification of Processed Content</em></p>
<p>The idea is that a licensee would shove content into the Retrieval Ware system and the Retrieval Ware software would assign consistent, standardized index terms to the document or other content object. Keep in mind that Convera pitched its video indexing capabilities to the NBA, and the NBA became a non-customer after a period of time.</p>
<p><em>Profiling</em></p>
<p>The idea is that Retrieval Ware would perform Selective Dissemination of Information (SDI). The thought was that a manager could plug in the word or phrase germane to his work and then the Retrieval Ware system would pass documents matching that “profile” to the manager. Today we know that this PointCast, BackWeb, Desktop Data approach is problematic. But in the early 2000s, this pitch sounded great to folks who had a problem but did not know what they did not know.</p>
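<p>The profile idea can be sketched in a few lines. This is a minimal illustration that assumes plain substring matching against stored phrases; the profile table and the function name are hypothetical and say nothing about how Retrieval Ware implemented its profiling.</p>

```python
# Hypothetical profiles: each manager lists the words or phrases germane to
# his or her work. The names and phrases are made up for this sketch.
PROFILES = {
    "alice": ["neural network", "pattern matching"],
    "bob": ["basketball"],
}


def route(document, profiles=PROFILES):
    """Return the sorted list of users whose profile matches the document."""
    text = document.lower()
    return sorted(user for user, phrases in profiles.items()
                  if any(phrase in text for phrase in phrases))


if __name__ == "__main__":
    print(route("New video index for basketball highlights"))
```

<p>Literal matching like this passes along every document that mentions a phrase, relevant or not, and misses documents that use different wording. That may be one reason the push approach proved problematic.</p>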
<p><em>A Workbench</em></p>
<p>This was the ConQuest component which a licensee would use to fiddle with the indexing dictionaries. The idea was that long before the azure chip crowd discovered taxonomies as a consulting business, Convera provided its licensees with a way to stick a subject matter expert into the system. It doesn’t take much thinking to understand that a human intermediated approach is not much more consistent than automated classification and more humans are needed when content volume goes up.</p>
<p><em>Cartridges</em></p>
<p>The idea for a cartridge, according to my notes, came from the database crowd. The idea is that a clunky Codd database could be given some new life by creating code that would add functionality to the row and column framework. Convera offered canned dictionaries for certain disciplines such as law enforcement, defense, etc. These were described as cartridges, which I think appealed to some of the procurement people at the Department of Defense.</p>
<p><em>Synchronizers</em></p>
<p>Convera said that it had technology that would access different repositories of content. This is now called federated search or indexing via connectors (close enough for horseshoes). The idea is that Retrieval Ware could “read” the documents in different types of file systems, file formats, and third party systems. When you ran a query, the Retrieval Ware system would deliver results from all these sources. The idea sounded great in 2001 and it still sounds great today. Some companies can deliver this functionality; in 2001 no company could deliver this functionality. (Marketing collateral is easier to create than affordable, stable systems I have learned.)</p>
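<p>One way to picture the “one query, many repositories” claim is the query-time sketch below. The two source functions are hypothetical stand-ins for connectors to different repositories; nothing here reflects Convera’s Synchronizers, and a production system would also have to handle file formats, security, and relevance ranking.</p>

```python
# Hypothetical stand-ins for connectors to two different repositories.
def file_share_source(query):
    docs = ["budget memo", "search vendor contract"]
    return [("file_share", doc) for doc in docs if query in doc]


def email_source(query):
    docs = ["re: search vendor demo", "lunch plans"]
    return [("email", doc) for doc in docs if query in doc]


def federated_search(query, sources):
    """Send one query to every source and return the combined results."""
    results = []
    for source in sources:
        try:
            results.extend(source(query.lower()))
        except Exception:
            # One unreachable repository should not sink the whole query.
            continue
    return results


if __name__ == "__main__":
    for origin, doc in federated_search("search", [file_share_source, email_source]):
        print(origin, doc)
```

<p>The loop is the easy part. The cost sits in building and maintaining a reliable connector for each repository, which is where marketing collateral and delivered systems tend to part company.</p>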
<p><em>Web Spider</em></p>
<p>Convera had a Web spider, but not just a spider that could index text. Nope. The Convera spider, according to my notes, was able to index audio, video, and images. (I found this assertion specious in 2001, and I am today able to point to a couple of outfits who can deliver this functionality; for example, <a href="http://www.google.com" target="_blank">Google</a> and <a href="http://www.exalead.com" target="_blank">Exalead</a>.)</p>
<p><em>File Room</em></p>
<p>Convera offered a component that would be a document repository. A user could view a document processed by Convera. <a href="http://www.ibm.com" target="_blank">FileNet</a>, <a href="http://www.teratext.com" target="_blank">Teratext</a>, and <a href="http://www-01.ibm.com/software/data/enterprise-search/omnifind-discovery/" target="_blank">iPhrase</a> were among companies offering a similar repository function when Convera was making this assertion. Today this is a big business so Convera knew about an opportunity, but it could not nail it.</p>
<p>The Convera Web site is still up, and I would not be surprised if the company continues in some form in the months ahead. The Web site lists “success stories” and offers a free trial. The spin, however, is not search but Web traffic. I think that Convera is mostly a system for generating clicks, not a “leading provider of knowledge and media asset management solutions.”</p>
<p><em>Screening Room</em></p>
<p>Convera said it could process video with proprietary functions for capturing video such as TV news programs, index the rich media, provide a search and retrieval function, and serve the video. It is a great idea, and Google and other companies are still working on this functionality. I don’t recall if the search system was called Visual Retrieval Ware or something else, but the note I made to myself was, “What’s the cost of supporting multiple search systems?” This is a question that I could pose to OpenText and Yahoo today. I wonder if these firms have analyzed the Convera case?</p>
<p><em>Retrieval Ware SDK</em></p>
<p>Convera offered a software development kit. The idea was that licensees and developers could build new functionality and tap into the Convera ecosystem. In my files, I have lists of buzzwords attributed to the SDK but like much of the company’s descriptions, the words are surprisingly fresh in 2010 and still in use by some search vendors. (If you look at the write up for a certain azure chip firm’s four square graph of the search industry, you can find the Convera language passed as fact today.)</p>
<p><strong><em>Observations</em></strong></p>
<p>I want to capture some of my thoughts after my romp down memory lane.</p>
<p>First, Convera is a great case study about the risks of over promising and then hitting the customer with the costs of building an infrastructure, hiring the system administrators, and paying for the engineering services to make the marketing literature sync with reality. The Convera case shares some elements with the Fast Search &amp; Transfer ASA business. My position is that serious management skill, technical expertise, and a lot of money will be needed to keep any search vendor afloat who follows the Convera marketing and sales method that does not make clear the resources required to get the system working as advertised.</p>
<p>Second, the Convera case underscores that those people who are confident in their knowledge of search have essentially zero expertise that bears on search and content processing success. Clients continued, if I understand the Convera history, to license the firm’s technology despite the yellow flashing lights that I saw a decade ago. I find it amusing that self appointed search experts do not understand the basics of figuring out what works and what does not work when it comes to search and content processing. The Convera case reminds us of the weakness of certain procurement processes and the failure of basic financial analyses prior to licensing a system.</p>
<p>Third, Convera provides a useful library of marketing buzzwords. In fact, when I flipped through the Convera information in my Overflight system, I was impressed with the wording and the timeliness of some of the phrases. In fact, I thought I was reading about one of today’s search and content processing systems, not one that was in “dissolution”, whatever that term really means.</p>
<p>To wrap up, if you know more about Convera, its support for its licensees, or the status of its code base, please use the comments section of this Web log to update me. I have a complete work up on Convera, and if you want to talk about a for-fee rundown, write me at seaky2000 at yahoo dot com. The briefing addresses the question, “Is there something to buy in the Convera property?”</p>
<p>Stephen E Arnold, February 16, 2010</p>
<p><em>No one paid me to write this. I did have to review Convera’s technology years ago, and I recall objecting in one meeting to the Intel decision to go with Convera. But no money for this opinion piece. I will report this miserable reality to the FAA where high flying is the core focus of that Federal agency.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/16/search-engine-convera-drifts-off/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Microsoft Fast on Linux and Unix Innovation</title>
		<link>http://arnoldit.com/wordpress/2010/02/15/microsoft-fast-on-linux-and-unix-innovation/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/15/microsoft-fast-on-linux-and-unix-innovation/#comments</comments>
		<pubDate>Mon, 15 Feb 2010 05:01:22 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Cost]]></category>
		<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[business strategy]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10849</guid>
		<description><![CDATA[It’s Valentine’s Day. I feel quite a bit of affection for the system professionals who have licensed Fast Search ESP, and I hope each finds search love. I think there will be a “tough” element to this love. And like other types of love, there will be ups and downs. Microsoft practiced some “tough love” [...]]]></description>
			<content:encoded><![CDATA[<p>It’s Valentine’s Day. I feel quite a bit of affection for the system professionals who have licensed Fast Search ESP, and I hope each finds search love. I think there will be a “tough” element to this love. And like other types of love, there will be ups and downs. Microsoft practiced some “tough love” for licensees of the Linux and Unix versions of Fast Search &amp; Transfer’s Enterprise Search Platform recently. I am in a discursive frame of mind, and I will share my opinion about the “tough love” for the Linux and Unix licensees of the 1997 technology that comprises some of Fast Search &amp; Transfer’s system.</p>
<p>The not-too-surprising announcement that Microsoft would stop supporting Fast Search &amp; Transfer’s Linux and Unix customers surprised some folks. I think a handful of resellers were delighted because customers with non-Windows versions of Fast Search cannot change horses in the middle of the Tigris River, as Alexander the Great discovered in 331 BCE. Some poobahs pointed out that open source search would become a hot ticket for Fast Search Linux and Unix licensees. Others took a more balanced view of figuring out whether to rip and replace or supplement the aging Fast Search system with one of the more specialized solutions now available; for example, Exalead’s system could be snapped in without much hassle, based on my research for <em><a href="http://www.exalead.com" target="_blank">Successful Enterprise Search Management</a></em>, published by Galatea in the UK last year. (Martin White was my co-author.)</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image8.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb8.png" border="0" alt="image" width="244" height="184" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Source: </span><a title="http://www.zastavki.com/pictures/1024x768/2008/Saint_Valentines_Day_St.Valentine_004959_.jpg" href="http://www.zastavki.com/pictures/1024x768/2008/Saint_Valentines_Day_St.Valentine_004959_.jpg"><span style="color: #800000; font-size: xx-small;">http://www.zastavki.com/pictures/1024&#215;768/2008/Saint_Valentines_Day_St.Valentine_004959_.jpg</span></a></p>
<p>What I found interesting is that the Microsoft Enterprise Search blog contained some information from Bjørn Olstad, CTO, FAST and Distinguished Engineer, Microsoft. The write up’s title is “<a href="http://blogs.msdn.com/enterprisesearch/" target="_blank">Innovation on Linux and Unix,</a>” and it appeared on February 4, 2010.</p>
<p>Mr. Olstad wrote:</p>
<blockquote><p>When we announced the acquisition two years ago, we said that we were committed to cross-platform innovation—that we’d “continue to offer stand-alone versions of ESP that run on Linux and UNIX,” and that we would provide updates to these versions to address customer concerns and add new features.  Over the last two years, we’ve done just that.</p></blockquote>
<p>The deal was consummated in April 2008. In October 2008, the Norwegian authorities seized some company information, but there has not been much news about the investigation into the pre-acquisition Fast Search &amp; Transfer’s activities. In any event, it is now February 2010, so Microsoft has been operating Fast Search for the period between April 2008 and February 2010. That’s not quite two years, which is a nit, but software works when details are correct. What’s clear is that Fast Search and its Enterprise Search Platform or ESP is pared down and focused on the Windows platform.</p>
<p>I also noted this passage:</p>
<blockquote><p>When we announced the acquisition two years ago, we said that we were committed to cross-platform innovation—that we’d “continue to offer stand-alone versions of ESP that run on Linux and UNIX,” and that we would provide updates to these versions to address customer concerns and add new features.  Over the last two years, we’ve done just that.</p></blockquote>
<p><span id="more-10849"></span></p>
<p>Another detail. The categorical affirmative “always” does not match with the shift in the direction of the Fast Search &amp; Transfer technology. The “always”, in my opinion, means “10 years.” Based on my experience in enterprise search, I am not certain that the present version of Fast Search for Linux and Unix will have many users in 2020. The core is almost as old as Google’s. Unlike Google, Fast Search has added features by using multiple methods. Google, despite its management methods, has engineered enhancements into the plumbing of Google. As a result, the framework is more cohesive, modular, and in my opinion, more in tune with state of the art search and content processing methods. You may, of course, disagree. In the same league with Google is Exalead because both firms have their knowledge tentacles looped into the learnings of the original Alta Vista team. Frankly Fast Search has not kept pace with either Google or Exalead and I can name a number of other companies who have also outpaced Microsoft Fast in the last 18 months.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image9.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb9.png" border="0" alt="image" width="244" height="178" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Image source: </span><a title="http://www.wordseye.com/sl/webpage-db/2006-12-24/7860.jpg" href="http://www.wordseye.com/sl/webpage-db/2006-12-24/7860.jpg"><span style="color: #800000; font-size: xx-small;">http://www.wordseye.com/sl/webpage-db/2006-12-24/7860.jpg</span></a></p>
<p>I marked this segment on my hard copy too:</p>
<blockquote><p>When FAST was founded back in 1997, we were told that it was too late to start a search company. The prevailing wisdom back then was that search was already a commodity: Verity had won the enterprise and AltaVista had won the web.  More than ten years later, it’s clear that we’re just getting started.</p></blockquote>
<p>I would respectfully submit that the Web search game is Google’s at the present time. Now Google is seeping into the enterprise, and I think Microsoft’s dependence on Windows 7’s revenues underscores that there is a fragility in the Microsoft balance sheet. I am no financial analyst, but if Google opens a vein in the enterprise and in the Windows 7-type market, Microsoft could suffer and suffer in a painful way. I would also point out that organizations have a number of options when it comes to search and content processing. I track more than 300 vendors in my Overflight service, and I think that there are several risks Microsoft and its Fast unit will have to ameliorate:</p>
<ol>
<li>Giving away a free or low cost search system to SharePoint customers will provide a short term payoff but, over time, the costs of stabilizing and scaling Fast ESP may open the door to competitors. Believe me, there are lots of snap ins available for SharePoint. There’s BA-Insight, Coveo, Exalead, Mindbreeze, and more. You can turn to Bitext which has a utility that makes SharePoint search better quickly. And more options will be coming in the months ahead. Proliferating a complex system like Fast ESP will be the equivalent of teaching customers to look for third party solutions to get users happy and control costs. That’s my opinion and you are welcome to disagree. Just bring facts to your push back.</li>
<li>Fast ESP is complex. I have mentioned the 20 page white paper from Microsoft that details the settings that must be configured to deliver results tailored to a specific search implementation. I am not going to rehash that list. Keep in mind that there is a 300 page technical document that I had to work through to deal with some of those Fast ESP settings. Maybe that document is no longer needed. I sure needed it, and I thought I was pretty good at search systems.</li>
<li>Google is targeting Microsoft, but in its own Googley way. My hunch is that as Google pushes out more enterprise solutions, the pressure on Microsoft will increase very gradually. The analogy I use in my client presentations is that of a scuba diver who goes too deep. Everything is great until the diver surfaces and discovers that the wrong air mixture was used. Microsoft is in the euphoric state of having a new Fast release in the next few months. The resignations and annoyance of the Linux and Unix crowd are essentially irrelevant. But Google, in my opinion, knows the diver’s air mixture is wrong. Google will just let Microsoft Fast ESP happen. Google will be there for those who want to use <a href="http://www.google.com/apps/intl/en/business/savecosts.html#utm_campaign=en&amp;utm_source=en-ha-na-us-bk&amp;utm_medium=ha&amp;utm_term=google%20apps" target="_blank">Google Apps</a>, <a href="http://earth.google.com/" target="_blank">Maps</a>, whatever.</li>
</ol>
<p>Please, read the February 4, 2010, blog post. I may be wrong. I hope I am wrong because quite a few information technology professionals have identified themselves closely with Microsoft SharePoint and Microsoft Fast search. If the search system chews through user satisfaction and produces unexpected cost overruns, the consequences could reshape the search landscape. <a href="http://www.autonomy.com" target="_blank">Autonomy</a>, <a href="http://www.coveo.com" target="_blank">Coveo</a>, <a href="http://www.endeca.com" target="_blank">Endeca</a>, <a href="http://www.exalead.com" target="_blank">Exalead</a>, <a href="http://www.mindbreeze.com" target="_blank">Fabasoft Mindbreeze</a>, <a href="http://www.surfray.com" target="_blank">Ontolica</a>, and others don’t have to do anything.</p>
<p>Just wait. This time next year it will be interesting to review the enterprise search landscape. By the way, I will be covering enterprise search from my new tie up with Informed Market Intelligence in London. You can find the Web site at <a title="http://www.globaletm.com/" href="http://www.globaletm.com/">http://www.globaletm.com/</a>. More details will appear in Beyond Search when I get the URL to the search section to which I will contribute.</p>
<p>Stephen E Arnold, February 14, 2010</p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/15/microsoft-fast-on-linux-and-unix-innovation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Free Pass for Open Source Search?</title>
		<link>http://arnoldit.com/wordpress/2010/02/11/a-free-pass-for-open-source-search/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/11/a-free-pass-for-open-source-search/#comments</comments>
		<pubDate>Thu, 11 Feb 2010 06:02:17 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[eDiscovery]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10801</guid>
		<description><![CDATA[Dateline: Harrod’s Creek, February 11, 2010
I read Gavin Clarke’s “Microsoft Drops Open Source Birthday Gift with Fast Lucidly Imaginative?” I think that the point of the story, “a free pass” to “open source search providers like Lucid Imagination,” is interesting. However, I am not willing to accept “free pass”, a variant of the “free [...]]]></description>
			<content:encoded><![CDATA[<p><em>Dateline: Harrod’s Creek, February 11, 2010</em></p>
<p>I read Gavin Clarke’s “<a href="http://www.theregister.co.uk/2010/02/10/fast_microsoft_lucid/" target="_blank">Microsoft Drops Open Source Birthday Gift with Fast Lucidly Imaginative</a>?” I think that the point of the story, “a free pass” to “open source search providers like Lucid Imagination,” is interesting. However, I am not willing to accept “free pass”, a variant of the “free lunch” in my opinion.</p>
<p>Here’s my view from the pleasant clime of snowy Harrod’s Creek.</p>
<p>First, in my opinion, most of the Fast Search &amp; Transfer licensees bought into the “one size fits all” approach to search: facets, reports, access to structured and unstructured data, etc. As many of these licensees discovered, the <strong>cost</strong> of making Fast’s search technology deliver on the marketing PowerPoints was high. Furthermore, some like me learned how difficult it was for certain licensees to get the moving parts in sync quickly. Fast ESP consisted, prior to the Microsoft buy out, of keyword search, semantics from a team in Germany, third-party magic from companies like <a href="http://www.lexalytics.com" target="_blank">Lexalytics</a>, home brew code from Norwegian wizards, and outright acquisitions for publishing and content management functionality. Wisely, many search vendors have learned to steer clear of the path that Fast Search &amp; Transfer chopped through the sales wilderness. This means that orphaned Fast Search licensees may be looking at procurements that narrow the scope of search and content processing systems. In fact, there are only a handful of vendors who are now pitching the “kitchen sink” approach to search.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/nofreelunchcopycopy.jpg"><img style="display: inline; border: 0px;" title="no free lunch copy copy" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/nofreelunchcopycopy_thumb.jpg" border="0" alt="no free lunch copy copy" width="241" height="244" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Source: </span><a title="http://www.graceforlife.com/uploaded_images/no_free_lunch-772769.jpg" href="http://www.graceforlife.com/uploaded_images/no_free_lunch-772769.jpg"><span style="color: #800000; font-size: xx-small;">http://www.graceforlife.com/uploaded_images/no_free_lunch-772769.jpg</span></a></p>
<p>Second, open source search solutions are not created equal. Some are tool kits; others are ready-to-run systems. <a href="http://www.lucidimagination.com" target="_blank">Lucid Imagination</a> has a good public relations presence in certain places; for example, San Francisco. For those who monitor the search space, there are some other open source vendors that may provide some options. I particularly like the open source version of <a href="http://lucene.apache.org/java/docs/" target="_blank">Lucene</a> available from <a href="http://www.tesuji.eu/" target="_blank">Tesuji.eu</a>. Ah, never heard of the outfit, right? I also find the <a href="http://www.flax.co.uk/index.shtml" target="_blank">FLAX system</a> available from <a href="http://www.lemurconsulting.com" target="_blank">Lemur Consulting</a> useful as well. I think the issues with Fast Search &amp; Transfer are not going to be resolved by ringing up a single vendor and saying, “We’re ready to go with your open source solution.” The more prudent approach is going to be understanding what the differences among various open source search solutions are and then determining if an organization’s specific requirements match up to one of these firms’ service offerings. Open source, therefore, requires some work and I don’t think a knee jerk reaction or a sweeping statement that the Microsoft announcement will deliver a “free pass” is accurate.</p>
<p><span id="more-10801"></span></p>
<p>Third, a number of competent search and content processing vendors offer solutions that can be swapped for an existing Fast Search &amp; Transfer installation. In the first three editions of the Enterprise Search Report which I wrote, I took care to point out which vendors offered Unix and Linux versions of their systems. The PR and marketing noise about Microsoft SharePoint is making it difficult for some organizations to see Linux-compatible search systems as acceptable alternatives to the Microsoft SharePoint approach. Fortunately some firms like <a href="http://www.exalead.com" target="_blank">Exalead</a> in France have taken a fact-based approach to their marketing. I know of one procurement where Exalead insisted that the customer run head-to-head comparisons among various enterprise search solutions. Exalead won the competition because the evaluation was based on analysis and facts, not marketing baloney. In search, some vendors sell a job and then return to headquarters to tell the programmers what they have to create. Based on the limited information I have about this Exalead contract win, Lucene did not scale gracefully; that is, the scaling was possible but the time and money required were not in line with the value delivered by the Exalead solution.</p>
<p>Fourth, for more focused search and findability problems, there are more choices than at any other time in my last 15 years of paying attention to search and content processing. Examples range from the baked in search solutions that are available from <a href="http://www.ironmountain.com" target="_blank">Iron Mountain</a> (a document management firm) to the specialists in eDiscovery like <a href="http://www.clearwellsystems.com" target="_blank">Clearwell Systems</a>. Clearwell is interesting because it bundles a rocket docket function with an audit trail so that an attorney can print out a report of what steps were followed to identify a particular document or documents. Are these search systems? Well, it depends on which vantage point one takes. A records management company that licenses Fast Search for its storage hardware ended up looking at Lucene and then purchased a company that had search and clustering technology. The companies jumping into eDiscovery are discovering that specialist expertise is needed in order to keep corporate attorneys happy. Generalists run aground in special purpose search situations.</p>
<p>To wrap up, I think that the Microsoft decision to dump Unix and Linux versions of Fast ESP makes business sense within Microsoft’s context. The Microsoft organization needs a search solution that works right now to stop the loss of sales to search vendors who have a “snap in” solution to the problems of SharePoint search.</p>
<p>My hunch is that Microsoft did not understand the architecture and its implications for the Fast ESP platform. Microsoft now knows quite a bit about Fast ESP. In my opinion, Microsoft has to contain costs and reduce time to market by making tough decisions and using expedient methods.</p>
<p>The search space is undergoing considerable change at this time, and you can look at any of the azure chip consultants’ reports about search to see the rationalizing and scrambling underway.</p>
<p>My position is that there is no valid way to simplify the complexity of information retrieval. One helpful step is to narrow the procurement to specific requirements and then run bake offs to determine which system delivers for the client.</p>
<p>And what about Google?</p>
<p>The Google continues to waddle forward, but that company seems content to let bright young sparklers discover the Google functionality. Google will not be significantly affected by Microsoft’s actions in enterprise search but Microsoft will be affected by Google’s meandering into the enterprise.</p>
<p>In short, there is no free lunch for search and there is no free pass for vendors whether delivering open source or proprietary findability solutions.</p>
<p>Just my opinion.</p>
<p>Stephen E Arnold, February 11, 2010</p>
<p><em>No one paid me to write this article. Who would? I am old. I will report this to the Social Security Administration, which may need me to keep working if I understand the Republican financial analyses of Social Security funding.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/11/a-free-pass-for-open-source-search/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Online Pricing: Disruption Is the Game</title>
		<link>http://arnoldit.com/wordpress/2010/02/08/online-pricing-disruption-is-the-game/</link>
		<comments>http://arnoldit.com/wordpress/2010/02/08/online-pricing-disruption-is-the-game/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 05:01:18 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[business strategy]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[financial]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[online (general)]]></category>
		<category><![CDATA[publishing]]></category>
		<category><![CDATA[vertical search]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10732</guid>
		<description><![CDATA[It’s Monday morning. The Super Bowl is over, but the world football ecosystem is unfazed. The same cannot be said of for-fee content. I want to point out two seemingly unrelated developments and link them to one of the keystones of doing business in an online, Web-centric world. I am working on a couple of [...]]]></description>
			<content:encoded><![CDATA[<p>It’s Monday morning. The Super Bowl is over, but the world football ecosystem is unfazed. The same cannot be said of for-fee content. I want to point out two seemingly unrelated developments and link them to one of the keystones of doing business in an online, Web-centric world. I am working on a couple of oh-so-secret write ups, and I will make oblique references to research findings by the goslings here in Harrod’s Creek that will be more widely known in the spring.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image2.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/02/image_thumb2.png" border="0" alt="image" width="240" height="244" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">When worlds collide. The boundary is the exciting spot in my opinion. Image source: </span><a title="http://www.sciencedaily.com/images/2008/01/080112152249-large.jpg" href="http://www.sciencedaily.com/images/2008/01/080112152249-large.jpg"><span style="color: #800000; font-size: xx-small;">http://www.sciencedaily.com/images/2008/01/080112152249-large.jpg</span></a></p>
<p>First, consider the plight of Google Books. Suddenly the Department of Justice is showing some moxie. That’s a good thing, but I think the reality of derailing Google Books is likely to have some interesting repercussions going forward. For now, the big story is that Google Books has become the poster child of Google being Google. You can get the received wisdom in the UK newspaper The Telegraph and its write up “<a href="http://www.telegraph.co.uk/technology/google/7164224/Justice-Department-criticises-Google-Books-Settlement.html" target="_blank">Justice Department Criticises Google Books Settlement</a>.” The glee is evident to me in this write up, but perhaps I am jaded and worn down by the approach certain publications take to Google. The company is essentially the first example of what will be a growing line up of firms that use technology to alter business processes. I will be talking about this in my NFAIS speech on March 1, 2010. I am the luncheon speaker, and I think some of those in the room will get indigestion. The reason is that Google comes from a domain that people within 20 years of my age of 65 don’t fully understand. The Telegraph doesn’t get it either, and I think this passage highlights that generational divide:</p>
<blockquote><p>The ruling is a blow to Google and authors&#8217; groups who had supported the search giant&#8217;s ambitious plan to create a vast online library of digitised books. The controversial Google Book Search project attracted fierce criticism from authors, who believed their rights were being eroded, while winning praise from other quarters for helping to widen access to classic, rare or useful works of literature.</p></blockquote>
<p>Too bad the writer, a real journalist, omitted the word “goodie”. My hunch is that since national libraries have not shown any interest in creating digital collections, students and researchers will be doing their work the way John Milton and Andrew Marvell did. Great for those who have the time, money, and cursive writing skills. Not so great for those who need to sift through lots of content quickly. With library budgets shrinking and librarians forced to decide which books to keep, which to store, and which to trash, I think the failure of national libraries is evident. Google made a Googley and somewhat immature attempt to step into the breach and look what has resulted? A bureaucratic, legal eagle snarl. Books are an intellectual resource and I keep asking, “If not Google who?” Reed Elsevier? The British government? The National Library of China? A consortium of publishers? The answer is, in my opinion, now clear, “No one.” Maybe Google will keep going with this project. Hard to tell. Life might be easier to shift gears, go directly to authors, and cut specific deals for their future work. In a decade or so, end of problem. Also, end of traditional publishing. If Google actually talked to me, I would offer this advice, “Go for it, dudes.”</p>
<p><span id="more-10732"></span></p>
<p>The second development is the dust up over the pricing of electronic books or eBooks. You can get a run down of the matter in “<a href="http://gizmodo.com/5464742/the-999-ebook-is-dead-third-major-publisher-hachette-dumps-on-amazon" target="_blank">The $9.99 Ebook Is Dead: Third Major Publisher Hachette Dumps on Amazon</a>.” I think that Amazon, like Google, is a different domain. My hunch is that the closed world of Amazon thought it was being clever by offering books at a lower price than anyone else. You know the discount stuff that a Dartmouth professor has been explaining for the last 15 years. Well, the folks at Apple figured out that with a little cash sweetener, publishers would go for the iPad and then all by themselves crush Amazon’s $9.99 deal. The problem with this type of pricing battle is that the publishing sector is dead no matter what Amazon or Apple do. In fact, toss in Google. I don’t think it makes much of a difference either. Here’s why. Whoever gets the most eyeballs can go directly to authors and sign them up. The deals can be exclusive, embargoed, or something that makes these new “publishers” and their content providers happy. Assume that Amazon, Apple, and Google become the de facto publishers of books and content segments in their respective client environments. As an author, I would shop for the best deal. Period. I don’t really care much anymore who publishes my books. I just want to get whatever revenue is possible. I write a free Web log because it is too much of a hassle to monetize content, so I just use this as a marketing vehicle. So what happens to the traditional publishers? Domains collide and some of the companies get crushed in the contact. Game over.</p>
<p>These two developments illustrate that more is at stake than an old business model. New methods and business processes are in the process of self assembly. Remember disintermediation? It is a word I used in an Online Magazine article I did when Jeff Pemberton owned the property. I took a lot of heat from librarians who objected to my assertion that online would squeeze the information specialist. End users would just do the research themselves. Ignorance is bliss. Hello, Enron. Bad decisions result from misunderstanding information. Not much I could do in the 1980s and there’s not much I can do now.</p>
<p>So these two trends mean disintermediation. Who will be disintermediated? The companies that cannot adapt to the new domains that are emerging. In this context, the Google Books hassle and the Amazon pricing dust up are single, important data points on a map that is going to fill in quickly.</p>
<p>Stephen E Arnold, February 8, 2010</p>
<p><em>No one paid me to write this. I thought I made this clear in the text above. But in the interest of disclosure I herewith report to the Senate’s Sergeant at Arms that I received no money for this write up.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/02/08/online-pricing-disruption-is-the-game/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Microsoft and Mikojo Trigger Semantic Winds across Search Landscape</title>
		<link>http://arnoldit.com/wordpress/2010/01/28/microsoft-and-mikojo-trigger-semantic-winds-across-search-landscape/</link>
		<comments>http://arnoldit.com/wordpress/2010/01/28/microsoft-and-mikojo-trigger-semantic-winds-across-search-landscape/#comments</comments>
		<pubDate>Thu, 28 Jan 2010 05:01:28 +0000</pubDate>
		<dc:creator>Stephen E. Arnold</dc:creator>
				<category><![CDATA[business strategy]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantic]]></category>
		<category><![CDATA[technology]]></category>
		<category><![CDATA[text processing]]></category>

		<guid isPermaLink="false">http://arnoldit.com/wordpress/?p=10601</guid>
		<description><![CDATA[Semantic technology is blowing across the search landscape again. The word “semantic” and its use in phrases like “semantic technology” has a certain trendiness. When I see the word, I think of smart software that understands information in the way a human does. I also think of computationally sluggish processes and the complexity of language, [...]]]></description>
			<content:encoded><![CDATA[<p>Semantic technology is blowing across the search landscape again. The word “semantic” and its use in phrases like “semantic technology” has a certain trendiness. When I see the word, I think of smart software that understands information in the way a human does. I also think of computationally sluggish processes and the complexity of language, particularly in synthetic languages like English. Google has considerable investment in semantic technology, but the company wisely tucks it away within larger systems and avoids the technical battles that rage among different semantic technology factions. You can see Google’s semantic operations tucked within the Ramanathan Guha inventions disclosed in February 2007. Pay attention to the discussion of the system and method for “context”.</p>
<p><a href="http://arnoldit.com/wordpress/wp-content/uploads/2010/01/image15.png"><img style="display: inline; border: 0px;" title="image" src="http://arnoldit.com/wordpress/wp-content/uploads/2010/01/image_thumb15.png" border="0" alt="image" width="244" height="152" /></a></p>
<p><span style="color: #800000; font-size: xx-small;">Gale force winds from semantic technology advocates. Image source: </span><a title="http://www.smh.com.au/ffximage/2008/11/08/paloma_wideweb__470x289,0.jpg" href="http://www.smh.com.au/ffximage/2008/11/08/paloma_wideweb__470x289,0.jpg"><span style="color: #800000; font-size: xx-small;">http://www.smh.com.au/ffximage/2008/11/08/paloma_wideweb__470x289,0.jpg</span></a></p>
<p><em><strong>Microsoft’s Semantic Puff</strong></em></p>
<p>Other companies are pushing the semantic shock troops forward. Yesterday I read Network World’s “<a href="http://www.networkworld.com/news/2010/012710-microsoft-talks-up-semantic-search.html" target="_blank">Microsoft Talks Up Semantic Search Ambitions</a>.” The article reminded me that Fast Search &amp; Transfer SA offered some semantic functionality which I summarized in the 2006 version of the original <em>Enterprise Search Report</em> (the one with real beef, not tofu inside). Microsoft also purchased Powerset, a company that used some of Xerox PARC’s technology and its own wizardry to “understand” queries and create a rich index. The Network World story reported:</p>
<blockquote><p>With semantic technologies, which also are being referred to as Web 3.0, computers have a greater understanding of relationships between different information, rather than just forwarding links based on keyword searches. The end game for semantic search is &#8220;better, faster, cheaper, essentially,&#8221; said Prevost, who came over to Microsoft in the company&#8217;s 2008 acquisition of search engine vendor <a href="http://www.infoworld.com/d/developer-world/update-microsoft-and-powerset-confirm-deal-801">Powerset</a>. Prevost is still general manager of Powerset. Semantic capabilities get users more relevant information and help them accomplish tasks and make decisions, said Prevost.</p></blockquote>
<p>The payoff is that software understands humans. Sounds good, but it does little to alter the startling dominance of Google in general Web search and the rocket like rise of social search systems like Facebook. In a social context humans tell “friends” about meaning or better yet offer an answer or a relevant link. No search required.</p>
<p>I reported about the complexities of configuring the enterprise search system that Microsoft offers for SharePoint in an <a href="http://arnoldit.com/wordpress/2010/01/17/sharepoint-sunday-a-calm-week/" target="_blank">earlier Web log post</a>. The challenge is complexity and the time and money required to make a “smart” software system perform to an acceptable level in terms of throughput in content processing and for the user. Users often prefer to ask someone or just use what appears in the top of a search results list.</p>
<p><span id="more-10601"></span></p>
<p><strong><em>Mikojo’s Financing Gale</em></strong></p>
<p>I also learned that this Internet search engine company secured additional lines of credit. The figure is in the $23 million range, which is healthy indeed. Mikojo, according to the <a href="http://finance.yahoo.com/news/Mikojo-Inc-Secures-23-Million-bw-991806948.html?x=0&amp;.v=1" target="_blank">news story</a> on Yahoo:</p>
<blockquote><p><a href="http://www.mikojo.com/" target="_blank">Mikojo</a> is a provider of Internet search services and technology. Mikojo is implementing technology that it believes will improve the search experience for consumers by providing mechanisms for users to specify search queries that integrate multiple data sources on the Internet. Based in Foster City, California, with offices also located in Los Angeles and Australia, Mikojo has a history of technology innovation in data integration and Internet search. The Company is incorporated in Delaware and trades on the OTC Bulletin Board under the symbol MKJI.</p></blockquote>
<p>Google’s link to Mikojo references a “semantic search engine”, and the company’s splash page has the tag line “the intelligent search engine.” My hunch is that Mikojo is also indicating that it uses technology that understands content. Here’s what the company says on its About Us page:</p>
<blockquote><p>Mikojo is focused on users who have a specific purpose or decision in mind when they utilize Internet search. For such users, Mikojo provides ways to search for criteria across multiple data sources by correlating and integrating these data sources together. Results are presented as tables, allowing users to understand the information more easily and render decisions rapidly. The company is dedicated to setting the standard for search technology and how people find information. As the Web becomes larger and more complex, finding relevant information efficiently has become increasingly critical to Internet users. Mikojo uses the virtual relation platform called TriggerWare as the foundation for its search technology. By innovating the proven TriggerWare© search technology and adapting it to the complex problem of Internet search, Mikojo helps users find what they need quickly and intuitively. The new search facilities are fully integrated into Mikojo&#8217;s standard search engine, thereby helping users refine their searches and quickly access the most pertinent and useful ways to gain the information that drives their insight and decision-making.</p></blockquote>
<p><a href="http://www.altsearchengines.com/2009/12/11/" target="_blank">AltSearchEngines</a> comes right out and says:</p>
<blockquote><p>The TriggerWare™ query language, similar to SQL (Structured Query Language), provides a more semantically-based paradigm for search than a simple keyword search. As queries involve different tables from different web pages, answers to queries require semantic processing of information from different web pages. The company believes that no current search engine can integrate information from multiple web pages in a semantically significant manner. TriggerWare™ provides smart alerts over complex queries, enabling users to see the answer to their queries, and be notified when such information changes.</p></blockquote>
<p><strong><em>The Goose’s View</em></strong></p>
<p>My hunch is that the semantic winds are picking up. Microsoft is mentioning semantics and investors have voted with their business savvy on Mikojo’s method.</p>
<p>Several thoughts:</p>
<ol>
<li>Semantics is a term that needs to be defined and anchored in a specific technical context within a software system and method. The word alone has limited meaning for me. Others may be different in their understanding of “semantic”.</li>
<li>Semantic systems can be computationally intensive. The explanation of semantic technology, therefore, requires some indication of content throughput on specific hardware, index latency, and scaling methodologies. “It scales, no problem” is not enough for me.</li>
<li>Search is undergoing significant mutation. On one end of the spectrum are “one size fits all” systems. As I heard in Europe last week, many European companies prefer an SAP type approach with responsibility fixed on one company for enterprise software solutions. On the other end of the spectrum, azure chip consultants are in a tizzy over the use of search technology in specific problem solving use cases; for example, eDiscovery or customer support applications. The problem is that a two dimensional spectrum does not embrace the environmental changes that companies like Google, Microsoft, and Exalead are making.</li>
</ol>
<p>I wonder if the semantic winds will pick up or die down or just come and go. Buzzwords come and go as do trends in search. Put on your ear muffs. Some winds can howl and chill one’s bones. Do not wear a jacket stuffed with goose feathers, please.</p>
<p>Stephen E Arnold, January 28, 2010</p>
<p><em>A freebie. I will report the inclement weather and the non compensation mode of this article to NOAA, an outfit that controls many things just not howling winds.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://arnoldit.com/wordpress/2010/01/28/microsoft-and-mikojo-trigger-semantic-winds-across-search-landscape/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
