Featured
Collective Intelligence Anthology Available
The Arnoldit.com mascot admires the new collection of essays edited by Mark Tovey. Collective Intelligence: Creating a Prosperous World at Peace, published by the Earth Intelligence Network in Oakton, Virginia (ISBN-13: 978-0-9715661-6-3), contains more than 50 essays by analysts, consultants, and intelligence practitioners. You can obtain a copy from the publisher, Amazon, or your bookseller.
The ArnoldIT mascot completed reading the 600-page book with remarkable alacrity for a duck.
The collection of essays is likely to find many readers among those interested in the social phenomena of networks. Many of the essays, including the one I contributed, talk about information retrieval in our increasingly interconnected world.
This essay will provide a synopsis of my contribution, “Search–Panacea or Play. Can Collective Intelligence Improve Findability”, which I wrote shortly before completing Beyond Search: What to Do When Your Search System Doesn’t Work. My essay begins on page 375.
Social Search
The dominance of Google forces other vendors to look for a way over, under, around, or through its grip on Web search. The vendor landscape now offers search and content processing systems that arguably do a better job of manipulating XML (Extensible Markup Language) content, figuring out who knows whom (the social graph initiative), and determining the “real” meaning of content (semantic search). More than 100 vendors have technology that offers, if one believes the marketing collateral and conference presentations, a way to squeeze more information from information.
Social search is the name given to an information retrieval system that incorporates one or more of these functions:
- Users can suggest useful sites. Examples: Delicious.com and StumbleUpon.com
- The system discovers relationships between and among processed documents and links. Examples: Powerset.com and Kartoo Visu
- The system analyzes information, extracts entities, and identifies individuals and their relationships. Examples: i2 Ltd (now part of ChoicePoint) and Cluuz.com
- The system monitors user behavior and uses the data to guide relevance, spidering, and other system functions. Examples: public Web indexing companies
There are other types of social functions, but these provide sufficient salt and pepper for this information side dish. The reason I say side dish is that social functions are not going to displace the traditional functions on which they are based. Social search has been in the mainstream from the moment i2 Ltd. introduced its workbench product to the intelligence community more than a decade ago. “Social” functions, then, are a recent add-on to the main diet in information retrieval.
Old Statistics and Cheap, Powerful Computers
What’s overlooked in the rush to find a Google “killer” is that the new companies are using some well-known technologies. For example, the inner workings of Autonomy’s “black box” are somewhat dependent on the work of a slightly unusual Englishman, Thomas Bayes. Mr. Bayes left the world a couple of centuries ago, but his math has been a staple in college statistics courses for many years. How to deploy Bayesian techniques on a large scale is, therefore, not exactly a secret to the thousands of mathematicians who followed his proofs in pursuit of their baccalaureate.
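For readers who skipped that statistics course, here is a minimal sketch of Bayes' rule applied to a toy relevance question. The function name and the probabilities are made up for illustration, and nothing here describes how Autonomy's system actually works.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E).

    prior             -- P(H), e.g. the chance a document is relevant
    likelihood        -- P(E | H), chance of seeing the evidence if relevant
    likelihood_if_not -- P(E | not H), chance of seeing it if not relevant
    """
    # Total probability of the evidence across both cases.
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 10 percent of documents are relevant; a query term
# appears in 80 percent of relevant documents and 20 percent of the rest.
p = posterior(prior=0.10, likelihood=0.80, likelihood_if_not=0.20)
print(round(p, 3))  # about 0.308
```

Seeing the term lifts the estimated chance of relevance from 10 percent to roughly 31 percent. The math fits in three lines; the engineering challenge is running it over millions of documents.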
Interviews
Former Clandestine Operative Says Automated Systems Not Good Enough
Editor’s Note: Robert Steele, former Marine Corps officer and intelligence operative, was one of the first, if not the first, intelligence professional since World War II to question the relative value of secret sources and technologies in relation to open sources and technologies. Mr. Steele agreed to meet me near his office in suburban Washington, D.C. The full text of the interview appears below. After we spoke, Mr. Steele provided me with illustrations he referenced in our conversation. I have included these in the transcript at the point where Mr. Steele references them. You can read more about Mr. Steele at his Web site, OSS.Net.
How did you get interested in using information that’s readily available to anyone in a library, in newspapers, and online as a source of useful intelligence?
I went into the international spy program at CIA with a Master’s in International Relations, and knew quite a bit about citation analysis and primary research. What I was not expecting over the course of my clandestine career was the obsession with stealing secrets to the exclusion of all that could be known from open sources.
Robert D. Steele
The clandestine officers also refused to interact with the analysts—before leaving for my first overseas assignment, the Chief of Station took me to the analysis side of the house, and on my way there he said something along the lines of “these folks know nothing useful, and we tell them nothing.”
When the Marine Corps asked me to leave CIA to create the Marine Corps Intelligence Center in 1988, I promptly did what I thought the government wanted; that is, I spent $20 million on a codeword analysis center, including a Special Intelligence Communications (SPINTCOM) workstation. I thought it would do everything except kill the terrorist.
Was I in for a shock. I had put a PC with Internet access in an isolated room, not connected to any government network. The PC had a modem. I was curious about online and bulletin board systems. In a short time, analysts were leaving their supercharged workstations to stand in line to use the PC. These professionals were looking for information that was not in the government system and not known to our officers in the field (including diplomats and commercial or defense attachés).
What a wake-up call.
That is when I learned that expensive systems are only as good as their sources. Narrowcasting into the secret world made much of our multibillion-dollar technology virtually worthless. Analysts using the PC showed me that 80 to 90 percent of the information we needed could be obtained using the PC and public information, including direct calls to overt human experts. I also learned that useful information was available in 183 other languages no one in the US Government can speak or understand. Even today, a large number of Washington officials don’t understand the intelligence value of open sources of information including commercial imagery, foreign-language broadcasts that must be accessed locally, and gray literature, such as university yearbooks for a photo of a terrorist. Washington is completely out of touch with human experts who are not US citizens eligible for a secret clearance: the spies don’t want them unless they agree to commit treason, and paranoid, ignorant security officials do not allow the analysts to talk to them.
Almost every vendor asserts that their systems can “do” business or competitive intelligence. In your experience, is this accurate?
Look. BI and CI are not really intelligence.
BI or business intelligence is commonly used as a descriptor for what is nothing more than internal knowledge management, spiced up with a point-and-click graphics dashboard. Not only are most of these systems non-interoperable with everything else, they are only as smart or as stupid as the digital data they can access.
The reality of information in most organizations is that most of what is really valuable is not digital. And, most CEOs have zero idea what intelligence (decision support) actually means.
CI or competitive intelligence focuses on competitors. What I practice, Commercial Intelligence, focuses on:
- External information
- Collaborative work
- Knowledge management
- Organizational intelligence.
Commercial intelligence leverages what can be drawn from the human social networks interacting with an organization and the other sources of information. External information is not information about competitors. It includes such factors as “true cost” of goods and next-generation “cradle to cradle” opportunities. You have to factor in the art and science of retaining Organizational Intelligence. I will send you a diagram that shows my view of this commercial intelligence space.
In my experience, today’s systems are edging toward failure. The systems aren’t very good, useful, or usable. As the Gartner Group recently said about Windows, it is untenable. I like Microsoft for its cash flow—they need to dump the legacy and launch an open source network with shared call centers and Blue Cube power processing.
Profiles
Apple Going Its Own Way in Search
On May 6, 2008, the USPTO granted US 7,369,987 to Apple Inc. In my research for Beyond Search, one source told me that Apple was having some “difficulties” with its search-and-retrieval system for iTunes and OS X. I dismissed the comment because I had no corroboration. Apple is paranoid about what it does and how it does it. I was, therefore, intrigued by the invention disclosed as a “Multi-Language Document Search and Retrieval System”.
I’m no attorney, so you will need to download the document from the wonderful search system provided without charge by the US Patent & Trademark Office. Please, pay close attention to the syntax the USPTO’s outstanding search system requires. Google-style queries won’t work on this puppy.
Apple’s invention, according to US 7,369,987 is:
A multi-lingual indexing and search system … that performs tokenization and stemming in a manner which is independent of whether index entries and search terms appear as words in a dictionary.
The disclosures in this document make it clear that Apple, like Google and Microsoft, is poking around in similar algorithmic gardens. The claims put Apple in the search game. The document makes for interesting reading if you like legalese and information retrieval jargon. Maybe iTunes’ search system will be juiced. I’m pretty happy with the built-in search function on my trusty Mac.
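One well-known way to index text without consulting a dictionary is to break it into overlapping character n-grams. The sketch below illustrates that general idea only. It is my assumption for illustration, not the method claimed in US 7,369,987, and the function names are hypothetical.

```python
def char_ngrams(text, n=3):
    """Return overlapping character n-grams for each whitespace-separated token.

    Tokens no longer than n characters are kept whole. No word list is
    consulted, so the same code runs on any language or on made-up words.
    """
    grams = []
    for token in text.lower().split():
        if len(token) <= n:
            grams.append(token)
        else:
            grams.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return grams


def overlap(query, document, n=3):
    """Fraction of the query's distinct n-grams that also occur in the document."""
    q = set(char_ngrams(query, n))
    d = set(char_ngrams(document, n))
    return len(q & d) / len(q) if q else 0.0


print(char_ngrams("Search"))  # ['sea', 'ear', 'arc', 'rch']
print(overlap("searching", "search systems"))
```

Because “searching” and “search” share most of their trigrams, the query matches the document without any stemming rules or dictionary lookups.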
Stephen Arnold, May 8, 2008
Latest News
The Library of Congress and Semantic Search
The buzz about semantic search is rising. Powerset’s demonstration using Wikipedia data has triggered interest in searching in more intuitive ways. I received...
Sybase Jumps into the Content Processing Appliance Fray
Sybase announced on May 12, 2008, the roll out of its Sybase Analytic Appliance. The hardware is an IBM Power System preconfigured with Sybase IQ, Sybase PowerDesigner,...
Commercial Intelligence: A Better Way to Do Competitive Intelligence
Business intelligence and competitive intelligence are “not really intelligence”, asserts Robert D. Steele, well-known advocate of open source information...
Intelligenx Discloses Referrals Fuel Rapid Growth
In an exclusive interview, Iqbal and Zubair Talib, senior managers of Intelligenx, reveal that referrals have fueled the company’s rapid growth. Intelligenx...
Powerset Available
Navigate to Powerset.com and try out the much-publicized Web search system. Using proprietary technology plus third-party components, Powerset is a semantic search...
Google: A Brace of Media Analyzer Inventions
On May 8, 2008, the USPTO, an outstanding organization with a stellar search system, published two Google patent applications. US2008/0107337 is “Methods and...
Let’s Assume Microsoft Acquires Powerset
I read Dan Farber’s most intriguing post “Is Microsoft Stalking Powerset’s Search Technology?” I have Saturday chores to do, and I was sweeping...

