Featured
Video Search Bragging Rights: Blinkx Says It Is Bigger Than Google Video
For those stuck in northbound traffic on the slow-moving river that is Highway 101, there is a quite large billboard. It told me that “Blinkx is the world’s largest video search engine.” In mid-May 2008, a rumor swirled across the Internet that News Corp. was kicking Blinkx’s tires. Was an acquisition in the wind? Was this billboard part of an acquisition campaign? Was it a reminder to Silicon Valley that Google’s span of control did not include video search?
I was sensitive to digitized video for two reasons. The Auto Channel told me that it has thousands of hours of automotive-related video. One interesting aspect of this is that when a video gets “hot”, it gets a great deal of traffic. What’s mystifying, if I understood what The Auto Channel told me, is that it’s very hard to predict what will strike the user’s fancy.
The other reason is that I spoke with a programmer who once did a bit of work for a couple of the large European video services. I can’t reveal the name of the project this person worked on, but it rhymes with “goosed”. The point was that video is flooding the Internet, and it is difficult to generate enough revenue to keep up with the research, development, programming, and bandwidth charges. Video on a metered line is important to many users, but, if I understood his comments, those users don’t pay. Advertisers want “tight” demographics, and the usage data aren’t compelling enough to allow some video sites to generate enough cash to stay alive at this time.
I am not sure how much video Blinkx has indexed. I heard from one of my sources that Google receives more than 1.2 million video uploads per month. I recall reading that the GOOG accounts for more than 60 percent of video search traffic, but since the ComScore traffic flap, it’s tough to know just how much traffic Google has. Could be 70 percent, maybe more. A few days ago, ComScore said Google was the number one Web site on earth. Maybe? Maybe not? Google knows because it does not have to estimate its traffic. My sources tell me that Google just counts traffic, so there is no sampling to skew the data.
The Blinkx tag line is “Over 26 million hours of video. Search it all.” Their system appears to have a slather of patent documents in place. I tallied more than 100 when I stopped counting. Its conceptual search uses speech recognition, neural networks, and machine learning to create text transcripts. That text is then searched.
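For readers who want to see the "transcript first, then search" idea in miniature, here is a rough Python sketch. It is not Blinkx's method. The transcripts, the clip identifiers, and the function names `tokenize`, `build_index`, and `search` are all made up for illustration, and the speech recognition step is assumed to have already produced timestamped text.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(transcripts):
    """Map each token to the (video_id, start_seconds) segments that contain it."""
    index = defaultdict(set)
    for video_id, segments in transcripts.items():
        for start_seconds, text in segments:
            for token in tokenize(text):
                index[token].add((video_id, start_seconds))
    return index


def search(index, query):
    """Return the segments that contain every token in the query, sorted."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical transcripts standing in for speech recognition output.
transcripts = {
    "clip-001": [(0, "Welcome to the auto show"), (12, "This hybrid engine is new")],
    "clip-002": [(0, "The engine failed on lap three")],
}
index = build_index(transcripts)
print(search(index, "engine"))
print(search(index, "hybrid engine"))
```

Because each hit carries a start time, a result can point into the middle of a clip rather than at the clip as a whole.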
Interviews
Former Clandestine Operative Says Automated Systems Not Good Enough
Editor’s Note: Robert Steele, former Marine Corps officer and intelligence operative, was one of the first, if not the first, intelligence professionals since World War II to question the relative value of secret sources and technologies in relation to open sources and technologies. Mr. Steele agreed to meet me near his office in suburban Washington, D.C. The full text of the interview appears below. After we spoke, Mr. Steele provided me with illustrations he referenced in our conversation. I have included these in the transcript at the point where Mr. Steele references them. You can read more about Mr. Steele at his Web site, OSS.Net.
How did you get interested in using information that’s readily available to anyone in a library, in newspapers, and online as a source of useful intelligence?
I went into the international spy program at CIA with a Master’s in International Relations, and knew quite a bit about citation analysis and primary research. What I was not expecting over the course of my clandestine career was the obsession with stealing secrets to the exclusion of all that could be known from open sources.
Robert D. Steele
The clandestine officers also refused to interact with the analysts—before leaving for my first overseas assignment, the Chief of Station took me to the analysis side of the house, and on my way there he said something along the lines of “these folks know nothing useful, and we tell them nothing.”
When the Marine Corps asked me to leave CIA to create the Marine Corps Intelligence Center in 1988, I promptly did what I thought the government wanted; that is, I spent $20 million on a codeword analysis center, including a Special Intelligence Communications (SPINTCOM) work station. I thought it would do everything except kill the terrorist.
Was I in for a shock. I had put a PC with Internet access in an isolated room, not connected to any government network. The PC had a modem. I was curious about online and bulletin board systems. In a short time, analysts were leaving their supercharged workstations to stand in line to use the PC. These professionals were looking for information that was not in the government system and not known to our officers in the field (including diplomats and commercial or defense attachés).
What a wake-up call.
That is when I learned that expensive systems are only as good as their sources. Narrow casting into the secret world made much of our multi-billion-dollar technology virtually worthless. Analysts using the PC showed me that 80 to 90 percent of the information we needed could be obtained using the PC and public information, including direct calls to overt human experts. I also learned that useful information was available in 183 other languages no one in the US Government can speak or understand. Even today, a large number of Washington officials don’t understand the intelligence value of open sources of information including commercial imagery, foreign-language broadcasts that must be accessed locally, and gray literature, such as university yearbooks for a photo of a terrorist. Washington is completely out of touch with human experts who are not US citizens eligible for a secret clearance. The spies don’t want them unless they agree to commit treason, and paranoid, ignorant security officials do not allow the analysts to talk to them.
Almost every vendor asserts that their systems can “do” business or competitive intelligence. In your experience is this accurate?
Look. BI and CI are not really intelligence.
BI or business intelligence is commonly used as a descriptor for what is nothing more than internal knowledge management, spiced up with a point-and-click graphics dashboard. Not only are most of these systems non-interoperable with everything else, they are only as smart or as stupid as the digital data they can access.
The reality of information in most organizations is that most of what is really valuable is not digital. And, most CEOs have zero idea what intelligence (decision support) actually means.
CI or competitive intelligence focuses on competitors. What I practice, Commercial Intelligence, focuses on:
- External information
- Collaborative work
- Knowledge management
- Organizational intelligence.
Commercial intelligence leverages what can be drawn from the human social networks interacting with an organization and the other sources of information. External information is not information about competitors. It includes such factors as “true cost” of goods and next-generation “cradle to cradle” opportunities. You have to factor in the art and science of retaining Organizational Intelligence. I will send you a diagram that shows my view of this commercial intelligence space.
In my experience, today’s systems are edging toward failure. The systems aren’t very good, useful, or usable. As the Gartner Group recently said about Windows, it is untenable. I like Microsoft for its cash flow—they need to dump the legacy and launch an open source network with shared call centers and Blue Cube power processing.
Profiles
Apple Going Its Own Way in Search
On May 6, 2008, the USPTO granted US 7,369,987 to Apple Inc. In my research for Beyond Search, one source told me that Apple was having some “difficulties” with its search-and-retrieval system for iTunes and OS X. I dismissed the comment because I had no corroboration. Apple is paranoid about what it does and how it does it. I was, therefore, intrigued by the invention disclosed as a “Multi-Language Document Search and Retrieval System”.
I’m no attorney, so you will need to download the document from the wonderful search system provided without charge by the US Patent & Trademark Office. Please, pay close attention to the syntax the USPTO’s outstanding search system requires. Google-style queries won’t work on this puppy.
Apple’s invention, according to US 7,369,987, is:
A multi-lingual indexing and search system … that performs tokenization and stemming in a manner which is independent of whether index entries and search terms appear as words in a dictionary.
The disclosures in this document make it clear that Apple, like Google and Microsoft, is poking around in similar algorithmic gardens. The claims put Apple in the search game. The document makes for interesting reading if you like legalese and information retrieval jargon. Maybe the iTunes search system will be juiced. I’m pretty happy with the built-in search function on my trusty Mac.
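To make "tokenization and stemming independent of a dictionary" a bit more concrete, here is a rough Python sketch of one way to stem without a word list: strip common suffixes by rule. This is not the method disclosed in US 7,369,987, which I have not reproduced here. The `SUFFIXES` tuple, the `min_stem` cutoff, and the function names are my own assumptions, and the suffix list covers only a few English endings.

```python
import re

# Hypothetical, English-only suffix list; longer suffixes are tried first.
SUFFIXES = ("ingly", "edly", "ing", "ed", "es", "ly", "s")


def tokenize(text):
    """Split text into lowercase word tokens without consulting a dictionary."""
    return re.findall(r"\w+", text.lower())


def stem(token, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Produce the stemmed terms used for both indexing and querying."""
    return [stem(token) for token in tokenize(text)]


print(index_terms("Searching searched searches"))
print(index_terms("Blinkx"))
```

The point of the rule-based approach is that a made-up or brand name such as "Blinkx" still yields a usable term, because nothing depends on the word appearing in a dictionary. The trade-off is that blunt rules will sometimes over-strip, which matters less when documents and queries pass through the same function.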
Stephen Arnold, May 8, 2008
Latest News
Love SharePoint Search, Hate What It Can Index
There are upwards of 85 million Microsoft SharePoint licensees. Some of them rely on SharePoint search. A bit of fiddling with the plain vanilla SharePoint search...
LTU: Challenging the Thomson Reuters Trademark Fortress
LTU Technologies in France is putting its well-regarded image search technology to work in a proprietary trademark database. The LTU system compares a submitted digital...
Google Mini Signals Maxi Change
In San Francisco earlier this week, I spent some time with one of my tech pals. In the course of the conversation, we talked about the lousy margins on hardware,...
Data Harmony Update a Suite Release
Access Innovations Inc., a data management systems company, is releasing version 3.4 of its Data Harmony software suite, and it sounds like a sweet deal. The five-component...
Google Translate
The Google Search Appliance is a pretty nifty gizmo when you know how to “pimp” your GSA with the One Box API. On May 15, 2008, the GOOG confirmed what...
New Contract for Clarabridge
Clarabridge, a “customer experience management vendor,” recently scored a posh client in Gaylord Hotels, who wants to utilize text analysis to review customer...
Semantra and Conversational Analytics
Semantra asserts that it is a “pioneering developer of conversational analystics software”, or so it says in the news release a helpful person sent me. The...

