Content Risks and Rewards
July 28, 2010
My field is open source intelligence. I can’t reveal my sources, but I have heard that an intelligence unit can duplicate anywhere from 80 to 90 percent of its classified information from open sources. The trick of course is to know what is important. Most people can look at an open source document, dismiss it, and go about their day unaware of the key item of information that was right in front of them.
For that reason, this blog and my other blogs are open source. I use my Overflight system to suck in publicly accessible content. I look at what the system spits out and I highlight the important stuff. The magic in the system is neither the software nor the writers whom I pay to create most of the content in Beyond Search and my other writings. I am sufficiently confident in my method that when I talk with a so-called expert or an executive from a company, I am skeptical about what that person asserts. In most cases, experts lack the ability to put their information in context. Without context, even good information is useless.
When I read about Wikileaks publishing allegedly classified information, I wondered about the approach. Point your browser at “Next Step for Wikileaks: Crowdsourcing Classified Data” and learn what is ahead for information dissemination. The idea is that lots of people will contribute secrets.
Baloney.
The more stuff that is described as secret and sensitive, the more difficult it will be to figure out what is on the money and what is not. I have some nifty software, but I know from my tests that when information is weaponized, neither humans nor software can pinpoint where the train went off the tracks.
In my view, folks publishing allegedly classified information are looking for some rough sledding. Furthermore, the more baloney that gets pumped into the system, the greater the likelihood for disinformation.
If these documents had become known to me, I would have kept the puppies to myself. I would have used my Overflight system to verify points that my method identified as important. I would not accept any assertion, fact, or argument as valid until some more work was done.
Wikileaks is now famous, and sometimes fame can be tough. Just ask John Belushi if you can find him. People ask me why I don’t provide some color for some of my remarks. Well, that is because some information is not appropriate for a free blog. This is a lesson that I think some folks are going to learn in the School of Hard Knocks.
Stephen E Arnold, July 29, 2010
Freebie and open source
Summer Search Rumor Round Up
July 26, 2010
The addled goose has been preoccupied with some new projects. In the course of running around and honking, he has heard some rumors. The goose wants to be clear. He is not sure if these rumors are 100 percent rock solid. He does want to capture them before the mushy information slips away:
Source: http://oneyearbibleimages.com/rumors.gif
First, the goose heard that there will be some turnover at Microsoft Fast. The author of some of the posts in the Microsoft Enterprise Search Blog may be leaving for greener pastures. You can check out the blog at this link. What does this tell the goose? More flip flopping at Microsoft? Not sure. Any outfit that pays $1.2 billion for software that comes with its own police investigation is probably an outfit that would scare the addled goose to death. The blog is updated irregularly with such write ups as “Crawling Case Sensitive Repositories Using SharePoint Server 2010” and “SharePoint 2010 Search ‘Dogfood’ Part 3 – Query Performance Optimization.” Ah, the new problem of upper and lower case and the ever present dog food regarding performance. I thought Windows most recent software ran as fast as a jack rabbit. Guess not.
Second, a number of traditional search vendors are poking around for semantic technology. The notion that key words don’t work particularly well seems to be gaining traction. The problem is that some of the high profile outfits have been snapped up. For example, Powerset fell into the Microsoft maw and Radar Networks was gobbled by Paul Allen’s love child, Evri. Now the stampede is on. The problem is that the pickings seem to be slim, a bit like the t shirts after a sale at the Wal-Mart up the road from the goose pond here in Harrod’s Creek. For some lucky semantic startups, Christmas could come early this year. Anyone hear a sound like “hack, hack”? Oh, that must be short for Hakia. You never know.
Third, performance may have forced a change at HMV.co.uk in merrie olde England. Dieselpoint was the incumbent. I heard that Dieselpoint is on the look out for partners and investors. The addled goose tried to interview the founder of the company but a clever PR person sidelined the goose and shunted him to the drainage ditch that runs through Blue Island, Illinois. Will Dieselpoint land the big bucks as Palantir did?
Fourth, the goose heard that a trio of Microsoft certified partners with snap in SharePoint search components were looking for greener pastures. What seems to be happening is that the easy sales have dried up since Microsoft started its current round of partner cheerleading. The words are there, but the sales are not. Microsoft seems to want the money to flow to itself and not its partners. Who is affected? The goose cannot name names without invoking the wrath of Redmond and a pride of PR people who insist that their clients are knocking the socks off the competition. However, does the enterprise need a half dozen companies pitching metatagging to SharePoint licensees? I think not. If sales don’t pick up, the search engine death watch list will pick up a few new entries before the leaves fall. Vendors in the US, Denmark, Germany, Austria, and Canada are likely to be watching Beyond Search’s death watch list. Remember Convera? It spawned Search Technologies. Remember the pre Microsoft Fast? It spawned Comperio. When a search engine goes away, the azurini flower.
Fifth, what’s happened to the Oracle killers? I lost track of Speed of Mind years ago. There was a start up with a whiz bang method of indexing databases. I haven’t heard much about killing Oracle lately. In fact, stodgy old Oracle is once again poking around for search and content processing technology according to one highly unreliable source. With SES11g now available to Oracle database administrators, perhaps the time is right to put some wood behind a 21st century search solution.
If you want to complain about one of these rumors, use the comments section of this blog. Alternatively, contact one of the azurini outfits and get “real” verification. Some of their consultants use this blog as training material for the consultants whom you compensate. No rumor this. Fact.
Stephen E Arnold, July 26, 2010
Freebie
iPad and Enterprise IT
July 26, 2010
CIO Magazine ran a story that evoked the irony of a sophomore world literature class’s discussion of “Death in Venice”. On the surface, the old dude is trying to ease into the coffin. Below the surface, the tensions of northern and southern Europe create a flurry of post pubescent analyses.
Navigate to “Global CIO: Top 10 Reasons Steve Jobs & Apple Are The Future Of IT”. You can zip through the 10 reasons and understand that Apple’s iPad is not a toy for lean back content consumption. Nope. The iPad is the future of information technology. CIO Magazine has spoken.
A moment’s reflection reveals that *if* CIO Magazine is correct, CIO Magazine and its readers will be out of a job. No pun intended. The iPad limits the damage a user can do. Crashes are rare. Even a clueless tyro can locate content. The notion of docking to the big Apple itself reduces the likelihood of losing data. Installing software does not require a degree from MIT. Even the most conceptually challenged MBA can figure out how to work most of the device’s functions. What’s the argument for an expensive, often cranky information technology specialist? For that matter, why is a magazine needed to explain why information technology is so important to an organization? Most CEOs whom I know see IT as one big reason the company is not making headway in tough economic seas.
Consider these reasons offered up by CIO Magazine and its editorial engine sitting around struggling for a feature:
- Virtualization in general and VMWare specifically. Wow. I never would have thought of the iPad’s importance gated by VMWare. Fresh idea and one that underscores why CEOs want to be rid of information technology pundits.
- The iPad is a hot product. Yep, but what’s that say about the hostility to the clunky information technology solutions foisted on BMW crazed MBAs for many years? I think it says that complexity has made a toaster style computer the next big thing.
- The Apple desktop computers are selling. No kidding. The systems generally work as advertised. I don’t have space to explain the craziness of the Windows 7 desktop. Let me say that USB support is less than outstanding.
But what do the iPad and the CIO list mean for search?
Four points in my opinion:
First, search vendors have to come to grips with complexity and quick. Push back regarding the Rube Goldberg systems can do them in.
Second, the price point becomes an issue. When complexity is kicked to the curb, commoditization may grab the brass ring. Google had this idea years ago but has not been able to capitalize. Now it may be Lucene/Solr that gets the prize.
Third, users go their own way just as they did when bootlegging PCs into companies in the 1980s. I heard on a conference call that Google’s success is due to its opening Pandora’s Box, not from its brilliant marketing efforts.
Fourth, management becomes impotent. I have examples of senior managers who can no longer manage. The evidence is everywhere. Can you name a big company that has lost its sense of direction and the confidence of its shareholders? Need a hint?
Will CIO Magazine survive as a gadget publication? Probably not. Will traditional IT survive? In some outfits, the deck chairs floated when the Titanic sank. Outlook for those with buoyancy is good. Ah, irony of death in Venice digital style.
Stephen E Arnold, July 24, 2010
Freebie
White House Tech Chiefs Set Strategy?
July 24, 2010
The title, Bringing Government Up to Data, might sound promising but we’re not convinced that the technology that drove Barack Obama’s campaign will translate well into the White House.
Vivek Kundra, Jeffrey Zients and Aneesh Chopra have been tasked with making the president’s vision of a data driven White House a reality and making government websites that look as cool and appealing as an Apple app store, but the job is larger than it looks and not all that easy. Still, they believe that technology may be the answer when it comes to the trail of paper work that the new health care system and other things will bring.
Good luck. The government can’t even budget or control costs offline. Computers cannot help bad management techniques. In fact, computerizing flawed processes produces more errors more quickly. With the tech strategy shifts coming fast and furious, one wonders how the silos of technology within the US government will do much more than boost costs and increase inefficiency.
Rob Starr, July 24, 2010
Freebie
Google Metaweb Deal Points to Possible Engineering Issue
July 19, 2010
Years ago, I wrote a BearStearns’ white paper “Google’s Semantic Web: the Radical Change Coming to Search and the Profound Implications to Yahoo & Microsoft,” May 16, 2007, about the work of Epinions’ founder, Dr. Ramanathan Guha. Dr. Guha bounced from big outfit to big outfit, landing at Google after a stint at IBM Almaden. My BearStearns’ report focused on an interesting series of patent applications filed in February 2007. The five patent applications were published on the same day. These are now popping out of the ever efficient USPTO as granted patents.
A close reading of the Guha February 2007 patent applications and other Google technical papers makes clear that Google had a keen interest in semantic methods. The company’s acquisition of Transformics at about the same time as Dr. Guha’s jump to the Google was another out-of-spectrum signal for most Google watchers.
With Dr. Guha’s Programmable Search Engine inventions and Dr. Alon Halevy’s dataspace methods, Google seemed poised to take over the floundering semantic Web movement. I recall seeing Google classification methods applied in a recipe demo, a headache demo, and a real estate demo. Some of these demos made use of entities; for example, “skin cancer” and “chicken soup”.
Has Google become a one trick pony? The buy-technology trick? Can the Google pony learn the diversify and grow new revenue tricks before it’s time for the glue factory?
In 2006, signals I saw flashed green, and it sure looked as if Google could speed down the Information Highway 101 in its semantic supercar.
Is Metaweb a Turning Point for Google Technology?
What happened?
As we know from the cartwheels Web wizards are turning, Google purchased computer Zen master Danny Hillis’ Metaweb business. Metaweb, known mostly to the information retrieval and semantic Web crowd, produced a giant controlled term list of people, places, and things. The Freebase knowledgebase is a next generation open source term list. You can get some useful technical details from the 2007 “On Danny Hillis, eLearning, Freebase, Metaweb, Semantic Web and Web 3.0” and from the Wikipedia Metaweb entry here.
What has been missing in the extensive commentary available to me in my Overflight service is some thinking about what went right or wrong with Google’s investments and research in closely adjacent technologies. Please, keep in mind that the addled goose is offering his observations based on his research for his three Google monographs, The Google Legacy, Google Version 2.0, and Google: the Digital Gutenberg. If you want to honk back, use the comments section of this Web log.
First, Google should be in a position to tap its existing metadata and classification systems such as the Guha context server and the Halevy dataspace method for entities. Failing these methods, Google has its user input methods like Knol and its hugely informative search query usage logs to generate a list of entities. Heck, there is even the disambiguation system to make sense of misspellings of people like Britney Spears. I heard a Googler give a talk that included the factoid that hundreds of variants of Ms. Spears’s name were “known” to the Google system and properly substituted automagically when the user goofed. The fact that Google bought Metaweb makes clear that something is still missing.
Partner News from BA-Insight
July 18, 2010
I received a link to a news story about BA-Insight, Microsoft SharePoint, and the Fast search system. You can read the material at this link. What interested me is not the endorsement of BA Insight by Microsoft. BA Insight, like other vendors, is a “partner” of Microsoft. Love is expected in these tie ups. What surprised me was that the page carrying the story about BA Insight as a partner ran videos featuring a pirate flag, a trip to the commode, a tour of ESA’s Mars500, and turtle hatchings. I was confused because of the welter of distracting audio and video messages running live and via a link to a webinar. Interesting content-based marketing approach.
Stephen E Arnold, July 18, 2010
Freebie
Autonomy: A Real Success. CMSWatch: Maybe Another Real Miss?
July 12, 2010
In Harrod’s Creek, I can easily spot the real squirrel hunters. They have food. Mostly laconic, these hunters have a big pile of dead squirrels as proof of their competence. There is also the smell of fresh burgoo wafting from their log cabins. I can smell ability from my goose pond.
Lousy hunters have empty gun belts and squirrels shot when snacking on store bought food used to lure the critters. That’s a real danger: cheap tricks or just shooting wildly, often putting bird shot in an innocent’s backside or face, as in the 2006 incident between Vice President Dick Cheney and Texas lawyer Harry Whittington. Some faux hunters have just shot themselves in the foot. Ouch!
“Azure chip consultant” is a synonym for “bad hunter” in my opinion. Source: http://api.ning.com/files/LCP2NCaWo-ptCqGncB3hGsX8vuh8dnDzSJ0iLnkibas_/18holeinhandG.jpg
One of my two or three readers sent me a link to a write up called “Don’t Ogle Search If You Really Want Content Management”. In my opinion, the write up relies on insinuation, not facts. (I think that some folks are immune to facts, but I find facts useful.) In the article’s headline, the word “ogle”, for example, is one I don’t associate with information retrieval. (The publisher of this “ogle” opinion piece caught my attention in July 2008 with its similar assault on Attivio. My response to that misleading article is here.)
Yet another example of factless criticism of a vendor appears in this segment of the “ogle” write up about Autonomy, one of a very small number of search and content processing vendors with a consistent track record of technical breadth, sales, revenue, and profit:
From an initial focus on enterprise search tools, Autonomy has become a roll-up vendor after acquiring a variety of other information management suppliers such as Interwoven. As a financial strategy this can be successful, and investors seem to cotton to Autonomy. As a technology strategy, vendor roll-ups are problematic. Autonomy’s technology strategy is to rip legacy search subsystems from acquired products, replace them with some pieces from its own IDOL toolset, and then promote its particular approach to search as a distinct advantage for you. Specifically, Autonomy will try to sell you on the value of “meaning-based computing.” Even if you can get your mind around what meaning-based means, you should remain skeptical that Autonomy has technically spectacular or original services here. More importantly, you risk getting sidetracked from your original goal of, say, creating a user-friendly repository for your 50,000 Office documents.
These statements are presented without verifiable foundation to support the allegations in my opinion.
Autonomy is on track to hit $1.0 billion by the end of calendar 2010. The company has a proven track record of improving the performance of the companies it acquires. Autonomy’s management has demonstrated its ability to integrate quickly its acquired products with IDOL (the firm’s integrated data operating layer). The result is Autonomy’s knack of transforming the acquired companies’ position in their markets.
But there are other data that shed light on Autonomy’s track record. I have documented Autonomy’s technology in my writings such as Beyond Search (Gilbane, 2009), the Enterprise Search Report (CMSWatch.com, 2004-2006), and Successful Enterprise Search Management (Galatea, 2009). Here are three points that must not be overlooked:
- Autonomy has 20,000 plus customers plus around 1,000 licensees of its technologies for use in other enterprise software and systems
- Autonomy has made intelligent acquisitions that have given the firm a strong presence in eDiscovery, rich media, and fraud detection. Autonomy has recently pushed into online marketing using capabilities from Interwoven and its IDOL framework. My research reveals that Autonomy has acquired companies to bring its technology to new markets so more content can be understood.
- Autonomy has grown its revenues and generated a profit, making it possible for other UK based technology companies to ride the Autonomy horse in the race for government and venture funding.
In December a year or so ago, at the International Online Conference, in my for-fee, end note debate, I challenged Andrew Kanter (Autonomy), Charlie Hull (Lemur Consulting), and Dr. Charles Oppenheim (Loughborough University) about their views of search, content processing, and related fields. In front of an audience of about 300 search professionals, I pointed out that key word search was dead. I pointed out that most search systems did not understand the meaning of processed information. Autonomy’s Andrew Kanter strongly and politely disagreed with me. As I recall, he said to the audience and me:
Autonomy IDOL is the only product in the market that can understand the meaning and concepts of all information in any language, including audio and video. This has big implications for the content management market as no other vendor can do this.
I demanded some concrete examples to support his position. Mr. Kanter without missing a beat gave me four concrete examples drawn from Autonomy’s work in intelligence, search enabled applications, fraud detection, and rich media.
What did I do?
Gizmos and Concentration: The Odd Couple
July 8, 2010
So I am in a convenience store near Harrod’s Creek. There are two people in line in front of me. One of my neighbors is paying the stupidity tax by snagging a fist full of lottery chances. The other person in full yuppie boating regalia is buying a 24 can carton of beer from the giant beer cooler. I have a lousy bottle of chocolate milk. The clerk. A high school junior. Tomorrow’s leader here today.
The clerk is talking on the phone, yapping at motorists trying to get the 1950 vintage gasoline pumps to work, thumbing through the mind boggling number of lottery tickets, and casting furtive glances at the front door. Robbers who dropped out of grade school think that Harrods Creek convenience stores have bags of cash ready to hand out on demand.
The clerk gave the person paying the stupidity tax the wrong tickets. The guy with the beer put the carton on the floor and left. I stood there waiting with exact change. I have no idea if the guy beating the gas pump with the nozzle was trying to get the pump to work or venting rage because the clerk did not turn on the pump. When my turn came, the clerk did not interrupt his conversation, took my $1.45, and turned his attention to the water running in the sink. It was overflowing.
Get the picture? One convenience store guy unable to do one thing at a time and do one thing correctly.
Too much anecdote and not enough data. Point your browser at “Excessive ‘Screen Time’ Said to Affect Children’s Attention Span: Report.” Here is a keeper of a finding:
Researcher Edward Swing, a graduate student at Iowa State University, along with his colleagues assessed 1,323 children in the third, fourth and fifth grades over a 13-month time period. Swing said: “Those who exceeded the AAP recommendation were about 1.6 times to 2.2 times more likely to have greater than average attention problems.” What’s interesting is the study also included a one-time survey of 210 college students. The middle school students, he reported, were slightly less likely than the college students to have attention problems.
Maybe the real life experiences on a college campus are different from what I witness every day. It is good to know that the future convenience store clerks will exemplify unfocused behavior. Is that not special? No wonder search systems are giving online users what the numerical recipes determine the user wants. Hey, it’s fast and correct because it is from a computer. For sure – UX.
Stephen E Arnold, July 8, 2010
Freebie
Who Will Publishers Blame Next?
July 6, 2010
Google has been the target for some media companies for years. Most of the hostility has been ignited because Google indexes content. Users want to find information. Anyone who uses a computer wants to shorten the distance between A (what is needed) and B (where the information is). Simple and a constant problem for Google’s engineers to understand.
Now publishers have to face some painful facts.
First, Yahoo – after years of inattention – has figured out that clicks yield valuable information. According to the New York Times, Yahoo will use these data to deliver “news.” (Note that this link will go dead because the New York Times is trying to cope with online. Helpful, right?) Gee, I do that in this lousy blog. I look at usage reports from Blossom, AWStats, and other analytics sources. I write about what gets clicks. If the addled goose, aged 65, figured this out years ago, what took Yahoo so long? Interesting how those young wizards overlook the message of the purloined letter?
Second, the US Postal Service is going to raise its rates. (Same deal. The link will be dead in a nonce.) The USPS has been the print publishers’ pal for decades. My grandfather, after World War I let him out of the trenches, delivered mail. He complained long and loud about the crap he delivered at bargain basement rates. In Harrods Creek, the postmistress and I talk about the volume of junk that flows through the system. One conference promoter sends me dozens of fliers at bulk rate prices. Last year, I gathered up these fliers and mailed them to the company president. Guess what? No change. It is cheaper to pump baloney through the USPS than clean the mailing list. How long will this outfit be in business?
Hopefully the media titans will direct their ire at Yahoo and the US government. Why not blame others instead of oneself? Isn’t that the modern MBA way?
Stephen E Arnold, July 6, 2010
Freebie
MondayNote Spelling Out Some Truths
July 6, 2010
Many Americans, confident of their high school French, confront French culture in a food crisis. Perhaps it is the cloak of invisibility that shrouds the American in a cafe opposite Opéra? Maybe it is the two scoop ice cream confrontation? Whatever it is, Americans find themselves baffled by the French. French American business relations are often as confusing. The French executives explain how the deal will work. The Americans look confused. The French shrug, sometimes offering a little smile, and walk away. The Americans look like they have lost their iPhone or car keys. The software or business method that could have helped the US outfit has walked out the door. Too much of a hill to climb.
If you think you understand the French or Europeans in general, you will not benefit so much from “The Poison of Arrogance.” That’s too bad. I think the write up is accurate and underscores a potentially fatal weakness in how American companies and executives deal with other cultures.
I am the quintessential ugly American. From my clumsy interaction with Brazilians when I was in grade school to my inept behavior in dozens of countries, I have no illusions about myself. I am putting myself in the barrel that M. Filloux has assembled from the facts and observations in his argument. The question is, “Will others take heed?”
I don’t want to summarize the well reasoned article. I do want to highlight one key passage and offer a comment. First, consider this passage:
Many American companies suffer from vision impairment: they consider the Rest of the World as an aggregation of second-class people. What I called in a previous column the “Burundi Syndrome”, leads to zero delegation of authority. This leads to terrible results. Each attempt from a European subsidiary to adapt company policies to its local market conditions hits a wall of a soviet-like centralization, this time epicentered on the West Coast of the United States.
As painful as it is to accept criticism, I agree with this statement. You can consult the original write up to read about the companies that M. Filloux uses as examples. You will know them well. I have worked for some of them as you may have.
My observation is simple: Change is needed. Companies that evoke regulatory officials’ ire are doing something to create a fire storm, not put the fire storm out. Companies that tell other countries how to run their railroads are not going to get first class tickets under any circumstances.
My concern is that time has run out for some companies, and I think the push back and confrontations are likely to escalate. Unnecessary and perhaps unfixable.
Stephen E Arnold, July 6, 2010
Freebie