Dominoes Circa 2010
March 24, 2010
I was a college student when the “domino theory” was the firewood for many heated conversations. My memory is dim, but I recall that the idea was that if one country fell to a non-democratic, non-market based system, then other adjacent countries would go the same way.
Source: http://community.middlebury.edu/~scs/maps/oilnames.gif
The metaphor is that a line of dominos can be converted into a brief, but somewhat entertaining, event. I never played dominos so I did not relate to the metaphor.
But when I read “Remaining Google Units Exposed: Analysts”, I had a flashback, saw an image of Robert McNamara (Ford executive and strategist par excellence), and a row of dominos set up by a bright 10 year old on the kitchen table. Weird how the mind makes associations that defy time and logic.
The article appeared in The Globe and Mail, which is a pretty good newspaper but not available in hard copy in Louisville. Online reading is hard on my 65 year old eyes but I worked through this article. The most important segment in my opinion was:
Other stakeholders exposed to Google’s actions include cell phone makers like Dell and Lenovo, which are both developing Android-based phones for China, as well as the hundreds of people who independently sell ads and develop software for Google’s products. Spokeswomen at Lenovo and China Mobile, which is planning to offer the Dell Android phones on its network, had no immediate comment. Meantime, other search sites operators stand ready to benefit most from Google’s withdrawal, most notably Baidu – which has 60 per cent of China’s search market – and others such as fast-growing Tencent, analysts said.
What happens if other and unanticipated interactions among Google, China, its partners, its suppliers, and its customers take place?
Source: http://www.insidesocal.com/clippers/dominoeffect.jpg
The Discovery Hoax: Commercial Databases Make Big Promises
March 8, 2010
I was given a box lunch and a can of Pepsi as compensation for my one hour talk at a conference last week. I had an interesting conversation with a former big wheel in commercial database publishing. I thought the wizard was a retired poobah. I was wrong. The fellow had his shoulder pads on, a sweatband, and Gucci cleats. He’s back on a commercial company’s publishing team. I am an old, cowardly goose, and it is with trepidation that I get too close to big people garbed for quasi-military re-enactments related to electronic information.
I asked the industry titan what his new gig involved. I recall one word, which he repeated several times to me, the addled goose. The word? “Discovery.” I thought I was having a The Graduate moment. In 2010, plastic was a loser. The winner? Discovery.
Yep, the lingo of the search and content processing market has reached the world of professional publishing and for-fee database access.
The idea, as I understood it, is that this commercial company will allow a user to enter a keyword; for example, employee stock ownership. The system will crunch away and present:
- Results from the firm’s for fee databases. Not just anyone can run a search. The user has to have access to an institutional account or sign up and pay. There is some free stuff, but this is a real, live make-money-or-die operation.
- The system will also “discover” possibly related content and list that information in the form of links. I think the idea the titan was communicating is what Endeca called “Guided Navigation” in 1999! Not exactly yesterday! To see the Endeca system in action just go to OfficeFurniture.com.
- Content from the public Web.
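The three-part results page described in this list can be sketched in a few lines of Python. This is a toy illustration, not the vendor's system: the Document record, the discover function, the source labels, and the sample records are all hypothetical, and "related" is assumed to mean a record that shares some but not all of the query words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    source: str       # hypothetical labels: "for_fee" or "public_web"
    terms: frozenset  # index terms or keywords attached to the record


def discover(query, collection, has_account):
    """Group results into for-fee hits, 'discovered' related links, and public Web hits."""
    wanted = frozenset(query.lower().split())
    # Direct hits: every query word appears among the record's terms.
    hits = [d for d in collection if wanted <= d.terms]
    # "Discovered" items: records sharing some, but not all, of the query words.
    related = [d for d in collection if wanted & d.terms and d not in hits]
    return {
        # For-fee results are shown only to users with an account.
        "for_fee": [d.title for d in hits if d.source == "for_fee"] if has_account else [],
        "related": [d.title for d in related],
        "public_web": [d.title for d in hits if d.source == "public_web"],
    }


# Made-up sample records, for illustration only.
SAMPLE = [
    Document("ESOP valuation study", "for_fee",
             frozenset({"employee", "stock", "ownership"})),
    Document("Stock option basics", "for_fee",
             frozenset({"stock", "options"})),
    Document("Blog post on employee stock ownership", "public_web",
             frozenset({"employee", "stock", "ownership", "blog"})),
]

results = discover("employee stock ownership", SAMPLE, has_account=True)
```

Even in this toy version, the "related" bucket fills up with anything that shares a single word with the query, which is the weakness a broad search term exposes.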
The idea is that a person using a commercial system will enter a search string and then see links to related content. This works for buying office furniture. I am not sure how a computational chemist would react to a suggestion she read a blog post about a meth lab that blows up.
Yep, our professional grade service needs those custom chrome wheels. Image source: http://www.up.ac.za/organizations/movup/images/minefun/indian_haul_truck.jpg
I asked what happened if I used one of the company’s business databases and entered the search term “management.” I got a bit of double talk and the titan backed up, trying to get away from me. The reason I asked about this type of search is that I know from hands-on experience that the use of a general controlled term in his firm’s databases does not generate a usable results list. Thus, any “discovered” information is likely to be wide of the mark. Broad queries don’t often work too well in the for-fee, quite specific content in certain commercial systems. A single word like “management” in a Google search box generates what is highly ranked by clueless millions like a link to the Wikipedia entry.
When Domains Collide
March 1, 2010
Editor’s Note: This is a modified version of the lecture that Stephen E Arnold, ArnoldIT.com, delivered in Philadelphia, March 1, 2010. The actual presentation was an extemporaneous talk based on this preliminary set of notes.
I want to thank NFAIS for inviting me to address the members of this professional organization. The world of bibliography, abstracting, indexing, professional publishing and academic research has been shaken to its foundations in the last three or four years. The Richter scale measuring the waves pulsing through the bedrock of information access is being stretched. I find it difficult to talk about what is happening and what information professionals can do about those pulses.
This morning I want to put the pulses into a context. I am cautiously optimistic about a finding my research has revealed. Specifically, the shocks are coming from the integration of formerly separate disciplines into new services. In short, the traditional methods are being put into software and hardware modules and used to build new, more efficient, and more flexible services. Complete information businesses are now a commodity component that a clever engineer can use like a building block. Good news for engineers skilled in integration. Not such good news for experts in a hand-craft like Linotype operation. By snapping together modules, domains collide and are reinvented.
That’s today’s world of information.
Where We Are
Today we live in a world where a number of global, possibly monopolistic online research services stand alongside literally a hundred million or more citizen journalists creating blogs and tweets.
Until recently, say about 1979 or 1980, a scholar transported from the 11th century scriptorium would have become familiar quickly with the hard copy research books painstakingly documented by Constance Winchell. But move that person to today’s world and the mental shift would be more difficult, perhaps impossible.
Bring that 11th century researcher to today’s world, and I think adjustment would be difficult. Since the advent of online (anyone remember NLS?), information is just “out there”. Today information is “here” when it appears on a screen. The display of information is evanescent until it is “written”—that is, copied—to a storage device which may be located “out there”. It is possible to print an item of information, but the digital instance is the “real information.” This is a significant conceptual shift since online became our common information currency.
In fact, I cannot begin work until I “find” the particular electronic instance on which I am to work. Without search and retrieval, I am a cooked goose.
And just finding a particular document can be difficult even with the many search systems available. If our time traveling 11th century researcher can print a document, the information needed may be surrounded by unwanted images and advertisements. Without the ability to recognize the “real” information our 11th century scholar would be hard pressed to use today’s information retrieval systems. The monk comes from another time, and that time has its own domain of information. The domain includes ways to create information, ways to access information, and ways to reference other information. The monk might be squashed when his domain collided with the domain of 2010 information access. When domains collide, methods are crushed, recycled, and remade. This is deeply disturbing to people who cling to specific ways of doing such things as research.
The implications of domain collision are important in my opinion. Economics, human behavior, work processes, and speed are defined by domains. Let’s run down a handful of the challenges domain collisions ignite. The good news is that domains that touch create a boundary condition in which opportunities can flourish.
Challenges of Domain Collisions
If you have a business school degree, you have studied the touchstone buggy whip reference in Theodore Levitt’s “Marketing Myopia” that appeared in the Harvard Business Review in 1960. The idea is that a buggy whip manufacturer who anticipated the advent of the automobile could have expanded the product line to include a leather steering wheel wrap or automobile interiors.
Thus, the problem is that each domain has a certain way of perceiving phenomena. I won’t dwell on phenomenological existentialism, but I think it has quite a bit to teach us about what we can see when something “new” this way comes. We are, in the telling phrase of William James, stricken with “a certain blindness”. We simply cannot see beyond our domain. When domains collide, not only is our vision impaired, but we must also deal with processes and methods that have been transformed by the forces involved.
Not surprisingly, the problems of apprehending have triggered a cascade of challenges. Vocabulary is an issue. One example is the use of abbreviated spelling and neologisms to communicate in Twitter “tweets” or short messages via a mobile device. Messages such as ru w/me grate on some. To those in the domain, the message is clear and appropriate.
Other phenomena I have observed include:
- Work methods crafted for one domain such as copying a manuscript by hand on animal skin do not transfer to another domain such as copying information to a storage device. An entire lifetime of learning is irrelevant in the new domain.
- The time required to assemble a document is measured by manual tasks that are often organized in a sequential manner. The digital domain allows many tasks to be handled quickly and, in some cases, in parallel.
- The costs for manual, serialized work processes can be problematic. When software can be used to eliminate certain work previously done by humans, the economics change.
I think you can see from these examples that our time traveling researcher from Mont St Michel in the Middle Ages would have a steep learning curve.
I have given quite a bit of thought to the implications of this type of domain collision. I know when I look at banking, retail, manufacturing, and finding the right person to marry that domain collisions are one of the defining attributes of today’s world.
Publishing
I want to comment about publishing because most NFAIS members are involved in the creation, selection, and dissemination of information. The domain collision began with the advent of the online search systems for the NASA RECON project, the work of Dr. Gerard Salton (Cornell University), and the non-linear increase in the capabilities of hardware and software.
What is interesting to me is that since this revolution began, arguably in the 1970s, publishing has been eager to embrace certain technologies yet reluctant to get too close to other technologies.
Let me give you an example. When I worked at the Courier Journal & Louisville Times Co., we operated a rotogravure press and we printed the New York Times Sunday Magazine. We embraced traditional rotogravure printing technology and then we adopted technology that chopped the manual plate making process out of the work flow. We used computers, fancy software, and numerically-controlled presses as early as the early 1980s.
The Courier-Journal Board of Directors understood the importance of electronic information and created a separate business unit to build digital products. I was lucky to participate in the development of a profitable online business with ABI/INFORM, Business Dateline, Pharmaceutical News Index, and the core technical databases that were the foundation of today’s Cambridge Scientific Abstracts. This work took place in the early 1980s and relied on traditional mainframes and timesharing businesses like Tymnet and Dialcom as service bureaus.
I know from first-hand experience that those who managed the technologies steeped in the domain of traditional newspaper production believed their unit of the company was in the thick of technological change. The electronic publishing technology was a radical and strange undertaking. The people running the state-of-the-art four color printing presses did not see how electronic information could be a viable business.
We know now that the electronic publishing technology has emerged as one of the key technologies for information companies today. In fact, the brutal struggles between Macmillan and Amazon, Apple and Sony, and Google and book publishers are anchored in the technology that was a second-class citizen in the 1980s.
What’s interesting is that within publishing the domain of the traditional products like books, music, motion pictures, and television programming is now colliding with the domain of the network computing infrastructure. Complete businesses and their nested processes are now a Web service. One can download an electronic publishing system as open source software. The key point is that anyone anywhere in the world can become a digital newsroom with a Web site, newsfeed, and a community.
What’s even more interesting is that the agents of change are the children of many publishing executives and in some cases, the former employees of established publishing and rich media companies.
Another interesting point is that the new domain of content production is surrounding the traditional information industry. Paul Zurkowski tried to capture that industry in this diagram from the Information Industry Association in the mid-1980s. The diagram, in my opinion, nicely summarizes what we now know as the Petri dish for Amazon, Apple, and Google, among other firms.
This is a diagram created by the “old” Information Industry Association. Created in the mid 1980s, it is an attempt to show how the information world at that time was beginning to develop. What’s interesting is that the successes of Amazon, Apple, and Google, among other companies, are dependent to some degree on combining several of these “old” segments in one service.
When I look at this diagram, I can see that the success of Amazon, Apple, and Google in information comes from taking the building blocks from this 20-year-old diagram and combining pieces into new constructions. Keep in mind that these firms are not in the strict sense traditional publishing companies. These are technology-centric companies whose engineering uses information as a catalyst to create new functions.
Is Content Management a Digital Titanic?
February 25, 2010
Content management is a moving target. Unlike search, CMS is supposed to generate a Web page or some other type of content product. The “leaders” in content management systems or CMS seem to be disappearing into larger organizations. Surprising. If CMS were healthy, why aren’t these technology outfits growing like crazy and spinning off tons of cash?
I am no expert in CMS. In fact, I am not an expert in anything unlike the azure chip consultants, poobahs, and pundits who profess deep knowing at the press of a mouse button. In my experience, CMS emerged from people not having an easy way to produce HTML pages that could be displayed in a browser.
If HTML was too tough for some people, imagine the pickle barrel in which these folks find themselves today. In order to create a Web site, more than HTML is required. The crowd who relied on Microsoft’s FrontPage find themselves struggling with the need to make Web pages work as applications or bundles of applications with some static brochureware thrown in for good measure.
To make a Web site today, technical know how is an absolute must. Even the very good point-and-click services from SquareSpace.com and Weebly.com can baffle some people.
The azure chip consultants, the mavens, and the poobahs want to be in the lifeboats. Women and children to the rear. Source: http://www.ronnestam.com/wp-content/uploads/2009/02/lifeboat_change_advertising_sinking.jpg
Move the need for a dynamic Web site into a big organization that is not good at technology, and you have a recipe for disaster. In fact, the wreckage created by some content management vendors, pundits, and integrators is of significant magnitude. There’s the big hassle in Australia over a blue chip CMS implementation that does not work. The US Senate went after the bluest of the blue chip integrators because a CMS could not generate a single Web page. Sigh.
Buzz Search: Defaults Do Not Fly
February 22, 2010
Editor’s Note: Constance Ard, the Answer Maven, is one of the goslings. She wrote an overview of Google Buzz search functionality. Ms. Ard is active in the Special Libraries Association, heads up the legal interest group, and has an MLS with an emphasis on online search, taxonomies, and content processing.
With the release of Buzz flapping everyone’s wings over the last Internet half-life, it’s time to consider some practical application for Buzz. Danny Sullivan at Search Engine Land has laid the groundwork for searching Buzz.
For the record, the type-it-in-the-box-and-trust-the-results approach isn’t enough with this service from Google. You can see below that Buzz, a social media tool that gets food from Twitter, Google Reader, FriendFeed, and SMS, displays results from a typical box search that are surprisingly old in the real-time scheme of things.
These results are for a search done at approximately 8 p.m. EST on February 17, 2010, through the Buzz search box with the term: Olympics. The first result is time-stamped 4:50 p.m. The last result was stamped 9:41 a.m. and the second was stamped 8:23 a.m. These are not exactly real-time results and not even reverse chronological in display.
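The ordering complaint is easy to check mechanically. The sketch below uses only the three time stamps quoted above, in the order they were displayed. Two things are assumptions for illustration: that all three stamps fall on February 17, 2010, and that no other results sat between them. The function names are made up.

```python
from datetime import datetime

# Time stamps as reported above, in the order Buzz displayed them.
# Assumption: all three are treated as falling on February 17, 2010.
displayed = ["4:50 p.m.", "8:23 a.m.", "9:41 a.m."]


def parse_stamp(stamp, day="2010-02-17"):
    """Turn a stamp such as '4:50 p.m.' into a datetime on the assumed day."""
    clock = stamp.replace(".", "").upper()  # '4:50 p.m.' -> '4:50 PM'
    return datetime.strptime(f"{day} {clock}", "%Y-%m-%d %I:%M %p")


def is_reverse_chronological(stamps):
    """True when every result is at least as recent as the one listed after it."""
    times = [parse_stamp(s) for s in stamps]
    return all(current >= following for current, following in zip(times, times[1:]))


in_order = is_reverse_chronological(displayed)

# What a newest-first display of the same three results would look like.
newest_first = sorted(displayed, key=parse_stamp, reverse=True)
```

With these three stamps the check fails, because the 8:23 a.m. item is listed ahead of the 9:41 a.m. item.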
The same search on Buzzzy.com (selected results shown below) done at the same approximate time provides even more irritating displays. Has anyone heard of time, date stamps? I understand that in real-time search hours count but in search, pinpointing an accurate date and time is essential.
Jargon Means Shields Up for Consultants
February 21, 2010
I just read “Computer Jargon Baffles Users, Hinders Security.” This is a Thomson Reuters’ news story, and I don’t know if the wild and crazy url will work when you read this. Not my fault. Email Thomson Reuters, whose customer support crew is ready to help you.
The news story is one that runs every few months. The idea is that jargon is pretty much impossible for the average person to figure out. The argument in the Thomson Reuters’ story pivots on security, but the journalist could have picked on search, business intelligence, or any other common enterprise application. Jargon is a defense mechanism. Magic.
For me, the key passage in the Thomson Reuters’ story was:
“The malicious and criminal use of cyberspace today is stunning in its scope and innovation,” said Dell Services President Peter Altabef. One problem is that computer “geeks” use jargon to cloak their work in scholarly mystique, resulting in a lack of clarity in everything from instruction manuals and systems design to professional training, the experts said. “If you don’t demystify security, people become anxious about it and don’t want to do it,” former U.S. Homeland Security Secretary Michael Chertoff told Reuters on the sidelines of the EastWest Institute security meeting in Brussels.
I had a conversation with a big wheel from a blue chip consulting firm. I really want to reveal which firm, but my legal eagle squawks when I provide certain information in this Web log. The guts of the conversation are easy to summarize.
Search Engine Convera Drifts Off
February 16, 2010
The journey was a long one, beginning with scanning marketing brochures in the 1990s. Convera has now filed a certificate of dissolution. I think this means that Convera has moved from the search engine death watch to the list which contains Delphes, Entopia, and other firms.
Convera splash page on February 15, 2010
You can read the official statement for a few more days on the PRNewswire site. The title of the announcement is / was, “Convera Corporation Files Certificate of Dissolution, Trading of Common Stock to Cease after February 8, 2010 Payment Date Set.” I am no attorney so maybe my lay understanding of “dissolution” is flawed, and Convera under another name will come roaring back. For the purposes of this round up of my thoughts, I am going to assume that Convera is comatose. I hope it bounces back with one of those miracles of search science. I am crossing my wings, even though each has a dusting of snow this morning. Harrod’s Creek has become a mid south version of Nord Kap.
For me, the key passage in the write up was:
Convera Corporation announced today that it filed its Certificate of Dissolution with the Delaware Secretary of State on February 8, 2010, in accordance with its previously announced plan of complete dissolution and liquidation. As a result of such filing, the company has closed its stock transfer books and will discontinue recording transfers of its common stock, except by will, intestate succession or operation of law. Accordingly, and as previously announced, trading of the company’s stock on the NASDAQ Stock Market will cease after the close of business on February 8, 2010.
My Overflight search archive suggested that Excalibur Technologies was around in the 1980s. The founder was Jim Dowe, who was interested in neural networks. The notion of pattern matching was a good one. The technology has been successfully exploited by a number of vendors ranging from Autonomy to Verity. Brainware’s approach to search owes a tip of its Prince Heinrich hat to the early content snow plowing at Excalibur. Excalibur used most of the buzzwords and catchphrases that bedevil me today, including “semantic technology.”
Sample of a category search on the RetrievalWare system. The idea is that you would click a category.
One of my former Booz, Allen & Hamilton colleagues made some dough by selling his ConQuest Software search-related technology to Excalibur Technologies. The reason was that the original Excalibur search system did not work too well. Excalibur, according to my Overflight archive, described itself as “leading provider of knowledge and media asset management solutions.”
Microsoft Fast on Linux and Unix Innovation
February 15, 2010
It’s Valentine’s Day. I feel quite a bit of affection for the system professionals who have licensed Fast Search ESP, and I hope each finds search love. I think there will be a “tough” element to this love. And like other types of love, there will be ups and downs. Microsoft practiced some “tough love” for licensees of the Linux and Unix versions of Fast Search & Transfer’s Enterprise Search Platform recently. I am in a discursive frame of mind, and I will share my opinion about the “tough love” for the Linux and Unix licensees of the 1997 technology that comprises some of Fast Search & Transfer’s system.
The not-too-surprising announcement that Microsoft would stop supporting Fast Search & Transfer’s Linux and Unix customers surprised some folks. I think a handful of resellers were delighted because customers with non-Windows versions of Fast Search cannot change horses in the middle of the Tigris River, as Alexander the Great discovered in 331 BCE. Some poobahs pointed out that open source search would become a hot ticket for Fast Search Linux and Unix licensees. Others took a more balanced view of figuring out whether to rip and replace or supplement the aging Fast Search system with one of the more specialized solutions now available; for example, Exalead’s system could be snapped in without much hassle, based on my research for Successful Enterprise Search Management, published by Galatea in the UK last year. (Martin White was my co-author.)
Source: http://www.zastavki.com/pictures/1024×768/2008/Saint_Valentines_Day_St.Valentine_004959_.jpg
What I found interesting is that the Microsoft Enterprise Search blog contained some information from Bjørn Olstad, CTO, FAST and Distinguished Engineer, Microsoft. The write up’s title is “Innovation on Linux and Unix,” and it appeared on February 4, 2010.
Mr. Olstad wrote:
When we announced the acquisition two years ago, we said that we were committed to cross-platform innovation—that we’d “continue to offer stand-alone versions of ESP that run on Linux and UNIX,” and that we would provide updates to these versions to address customer concerns and add new features. Over the last two years, we’ve done just that.
The deal was consummated in April 2008. In October 2008, the Norwegian authorities seized some company information, but there has not been much news about the investigation into the pre-acquisition Fast Search & Transfer’s activities. In any event, it is now February 2010, so Microsoft has been operating Fast Search for the period between April 2008 and February 2010. That’s not quite two years, which is a nit, but software works when details are correct. What’s clear is that Fast Search and its Enterprise Search Platform or ESP is pared down and focused on the Windows platform.
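For anyone who wants to check the nit, the arithmetic is short. Only the months are given above, so this sketch assumes the first day of each month as a stand-in for the exact dates. The variable names are made up.

```python
from datetime import date

# Assumption: exact days are not stated, so the first of each month stands in.
deal_closed = date(2008, 4, 1)
blog_post = date(2010, 2, 1)

# Whole months between the two dates.
months_elapsed = (blog_post.year - deal_closed.year) * 12 + (blog_post.month - deal_closed.month)

# Split into full years and leftover months.
years, months = divmod(months_elapsed, 12)
```

That works out to 22 months, or one year and ten months, which is two months short of two years.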
A Free Pass for Open Source Search?
February 11, 2010
Dateline: Harrod’s Creek, February 11, 2010
I read Gavin Clarke’s “Microsoft Drops Open Source Birthday Gift with Fast Lucidly Imaginative?” I think the point of the story, that Microsoft has handed “a free pass” to “open source search providers like Lucid Imagination,” is interesting. However, I am not willing to accept “free pass”, a variant of the “free lunch” in my opinion.
Here’s my view from the pleasant clime of snowy Harrod’s Creek.
First, in my opinion, most of the Fast Search & Transfer licensees bought into the “one size fits all” approach to search: facets, reports, access to structured and unstructured data, etc. As many of these licensees discovered, the cost of making Fast’s search technology deliver on the marketing PowerPoints was high. Furthermore, some like me learned how difficult it was for certain licensees to get the moving parts in sync quickly. Fast ESP consisted, prior to the Microsoft buy out, of keyword search, semantics from a team in Germany, third-party magic from companies like Lexalytics, home brew code from Norwegian wizards, and outright acquisitions for publishing and content management functionality. Wisely, many search vendors have learned to steer clear of the path that Fast Search & Transfer chopped through the sales wilderness. This means that orphaned Fast Search licensees may be looking at procurements that narrow the scope of search and content processing systems. In fact, there are only a handful of vendors who are now pitching the “kitchen sink” approach to search.
Source: http://www.graceforlife.com/uploaded_images/no_free_lunch-772769.jpg
Second, open source search solutions are not created equal. Some are tool kits; others are ready-to-run systems. Lucid Imagination has a good public relations presence in certain places; for example, San Francisco. For those who monitor the search space, there are some other open source vendors that may provide some options. I particularly like the open source version of Lucene available from Tesuji.eu. Ah, never heard of the outfit, right? I also find the FLAX system available from Lemur Consulting useful as well. I think the issues with Fast Search & Transfer are not going to be resolved by ringing up a single vendor and saying, “We’re ready to go with your open source solution.” The more prudent approach is going to be understanding what the differences among various open source search solutions are and then determining if an organization’s specific requirements match up to one of these firms’ service offerings. Open source, therefore, requires some work and I don’t think a knee jerk reaction or a sweeping statement that the Microsoft announcement will deliver a “free pass” is accurate.
Online Pricing: Disruption Is the Game
February 8, 2010
It’s Monday morning. The Super Bowl is over, but the world football ecosystem is unfazed. The same cannot be said of for-fee content. I want to point out two seemingly unrelated developments and link them to one of the keystones of doing business in an online, Web-centric world. I am working on a couple of oh-so-secret write ups, and I will make oblique references to research findings by the goslings here in Harrod’s Creek that will be more widely known in the spring.
When worlds collide. The boundary is the exciting spot in my opinion. Image source: http://www.sciencedaily.com/images/2008/01/080112152249-large.jpg
First, consider the plight of Google Books. Suddenly the Department of Justice is showing some moxie. That’s a good thing, but I think the reality of derailing Google Books is likely to have some interesting repercussions going forward. For now, the big story is that Google Books has become the poster child of Google being Google. You can get the received wisdom in the UK newspaper The Telegraph and its write up “Justice Department Criticises Google Books Settlement.” The glee is evident to me in this write up, but perhaps I am jaded and worn down by the approach certain publications take to Google. The company is essentially the first example of what will be a growing line up of firms that use technology to alter business processes. I will be talking about this in my NFAIS speech on March 1, 2010. I am the luncheon speaker, and I think some of those in the room will get indigestion. The reason is that Google comes from a domain that people within 20 years of my age of 65 don’t fully understand. The Telegraph doesn’t get it either, and I think this passage highlights that generational divide:
The ruling is a blow to Google and authors’ groups who had supported the search giant’s ambitious plan to create a vast online library of digitised books. The controversial Google Book Search project attracted fierce criticism from authors, who believed their rights were being eroded, while winning praise from other quarters for helping to widen access to classic, rare or useful works of literature.
Too bad the writer, a real journalist, omitted the word “goodie”. My hunch is that since national libraries have not shown any interest in creating digital collections, students and researchers will be doing their work the way John Milton and Andrew Marvell did. Great for those who have the time, money, and cursive writing skills. Not so great for those who need to sift through lots of content quickly. With library budgets shrinking and librarians forced to decide which books to keep, which to store, and which to trash, I think the failure of national libraries is evident. Google made a Googley and somewhat immature attempt to step into the breach and look what has resulted? A bureaucratic, legal eagle snarl. Books are an intellectual resource and I keep asking, “If not Google who?” Reed Elsevier? The British government? The National Library of China? A consortium of publishers? The answer is, in my opinion, now clear, “No one.” Maybe Google will keep going with this project. Hard to tell. Life might be easier if Google shifted gears, went directly to authors, and cut specific deals for their future work. In a decade or so, end of problem. Also, end of traditional publishing. If Google actually talked to me, I would offer this advice, “Go for it, dudes.”