December 3, 2013
Before I lose the thought, I want to capture one of the important lessons from the Topsy sale. You can get the basic story at “Apple Acquires A Social Media Analytics Company For ~$200 Million.” None of the write ups emphasized the important shift at Topsy that made the deal possible. The company abandoned its Web log and social media index and focused on Twitter. Once the change was made, Topsy had something to sell; namely, an easy to use system that made it simple to figure out what was hot and what was not on Twitter. With the shift, an important search and retrieval resource was lost to people like me. For the investors in Topsy, the shift delivered 200 million big ones.
Was it technology? Nope.
Was it better search? Nope.
Was it spiffier analytics? Nope.
It was positioning. As I learned at a recent conference, the same old Topsy is still there covered up with the Twitter baked on enamel and clear coat. And Apple bit.
Moral: Figure out the positioning. That seems to be one key to big paydays. In short, words matter.
With one less resource to use, Google’s control of “information” grows stronger. I will review my initial thoughts in a few months to see if I was right or wrong. In the meantime, party on, Topsy.
Stephen E Arnold, December 3, 2013
December 3, 2013
A new profile is available on the Xenky site today. SchemaLogic is a controlled vocabulary management system. The system combines traditional vocabulary management with an organization wide content management system specifically for indexing words and phrases. The analysis provides some insight into how a subsystem can easily boost the cost of a basic search system’s staff and infrastructure.
Taxonomy became a chrome trimmed buzzword almost a decade ago. Indexing has been around a long time, and indexing has a complete body of practices and standards for the practitioner to use when indexing content objects.
Just what an organization needs to make sense of its text, images, videos, and other digital information/data. At a commercial database publishing company, more than a dozen people can be involved in managing a controlled term list and classification coding scheme. When a term is misapplied, finding a content object can be quite a challenge. If audio or video is misindexed, the content object may require a human to open, review, and close files until the required image or video can be located. Indexing is important, but many MBAs do not understand the cost of indexing until a needed content object cannot be found; for example, in a legal discovery process related to a patent matter. A happy quack to http://swissen.in/swictingsys.php for the example of a single segment of a much larger organization centric taxonomy. Consider managing a controlled term list with more than 20,000 terms and a 400 node taxonomy across a Fortune 500 company or for the information stored in your laptop computer.
Even early birds in the search and content processing sector like Fulcrum Technologies and Verity embraced controlled vocabularies. A controlled term list contains forms of words and phrases and often the classification categories into which individual documents can be tagged.
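A controlled term list is easier to picture with a toy example. The sketch below is only an illustration, not how Fulcrum, Verity, or SchemaLogic implemented the function. The three terms, their variant forms, the category names, and the helper names `build_lookup` and `index_document` are all made up for the example. Each variant form maps to one preferred term and one classification category, and a document is tagged with whatever preferred terms its text triggers.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    preferred: str   # the single approved form of the term
    category: str    # classification node the term is filed under
    variants: tuple  # other forms that map to the preferred term


# Hypothetical three-entry list; production lists can hold 20,000+ terms.
TERM_LIST = [
    Term("collaboration", "Business/Teamwork", ("discussion", "group work")),
    Term("enterprise search", "Technology/Information Retrieval", ("intranet search",)),
    Term("taxonomy", "Technology/Indexing", ("classification scheme",)),
]


def build_lookup(terms):
    """Map every known form (preferred or variant) to its Term record."""
    lookup = {}
    for term in terms:
        for form in (term.preferred, *term.variants):
            lookup[form.lower()] = term
    return lookup


def index_document(text, lookup):
    """Return sorted (preferred term, category) pairs found in the text."""
    # Lowercase and strip punctuation so that forms match on whole words.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    padded = f" {normalized} "
    found = set()
    for form, term in lookup.items():
        if f" {form} " in padded:
            found.add((term.preferred, term.category))
    return sorted(found)


if __name__ == "__main__":
    lookup = build_lookup(TERM_LIST)
    print(index_document("The discussion class covered intranet search.", lookup))
    # A word the list does not know about produces no tags at all.
    print(index_document("The team tried a new teaming exercise.", lookup))
```

The second call returns nothing because “teaming” is not in the list. Someone has to notice the new word and add it, and that upkeep is where the staff cost comes from.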
The problem was that lists of words had to be maintained. Clever poobahs and mavens created new words to describe allegedly new concepts. Scientists, engineers, and other tech types whipped up new words and phrases to help explain their insights. And humans, often loosey goosey with language, shifted meanings. For example, when I was in college a half century ago, there was a class in “discussion.” Today that class might be called “collaboration.” Software often struggles with these language realities.
What happens when “old school” search and content processing systems try to index documents?
The systems can “discover” terms and apply them. Vendors with “smart software” use a range of statistical and linguistic techniques to figure out entities, bound phrases, and concepts. Other approaches include sucking in dictionaries and encyclopedias. The combination of a “knowledgebase” like Wikipedia and other methods works reasonably well.
November 30, 2013
I read “Google’s Growing Patent Stockpile.” There is nothing like a search of commercial databases for patent information. The write up points out that Google is filing more patents. Only outfits like IBM and Microsoft are doing more filing and inventing, or is it inventing and filing?
Tucked into the article was this paragraph which is a quote to note in my opinion:
Gregory Aharonian, a technical analyst who works with lawyers to overturn patents, says that Google, like other big companies, knows that if it swamps the overworked patent office with applications, it will win patents, even if its ideas aren’t necessarily that novel. “The general rule is, the more patents a company has, the more closely the quality of their patent portfolio approaches the quality of all patents, which is to say the majority of all of these patents are invalid,” says Aharonian.
Good point. Google patents are useful for many reasons. One function for me is to gauge how quickly Google is becoming more like IBM and Microsoft. Is that a good thing? Just look at search. Google’s search innovations are redefining relevance and bringing a new connotation to precision and recall.
Progress is evident.
Stephen E Arnold, November 30, 2013
November 29, 2013
Orson Scott Card turned up in a Hacker News item this morning. I followed the link because I was curious about a sci-fi novelist’s views about the death of software companies. As I scanned the short essay, I realized that Mr. Card had touched on several points germane to search, content processing, and analytics companies. I recommend you read the essay that is available on a Carnegie Mellon Web site.
The main idea is:
The environment that nurtures creative programmers kills management and marketing types – and vice versa.
The essay concludes with a passage that is particularly thought provoking:
He [the programmer] suddenly finds that alien creatures control his life. Meetings, Schedules, Reports. And now someone demands that he PLAN all his programming and then stick to the plan, never improving, never tweaking, and never, never touching some other team’s code. The lousy young programmer who once worshiped him is now his tyrannical boss, a position he got because he played golf with some sphincter in a suit. The hive has been ruined. The best coders leave. And the marketers, comfortable now because they’re surrounded by power neckties and they have things under control, are baffled that each new iteration of their software loses market share as the code bloats and the bugs proliferate.
The essay edges up to one of the characteristics of search, content processing, and analytics companies. A talented individual may have a great idea and there may be one, two, or a handful of others who convert capability into creation.
My team and I are reviewing profiles (actually case studies) of search and content processing vendors we have assembled over the last 20 or so years. Most of the vendors begin with a passion to solve a particular problem. For example, consider the Fulcrum Technologies company in Ottawa, Ontario. A group of innovators left one company, set up another, and then proceeded to bedevil the then market leader Verity.
Fulcrum opened in 1983. The product appears to be available today as a component in an OpenText enterprise solution. But who thinks about Fulcrum today? I am not sure if many, if any, of the original programmers are still working at the company 30 years later. Who runs the Fulcrum unit? What are the innovations it offers? Where are the tension and excitement of the Fulcrum – Verity face offs? Verity has been absorbed into Autonomy and Autonomy has been gobbled by Hewlett Packard. Fulcrum Ful/Text and the WAIS-based SearchServer migrated through an Italian outfit, PC Docs, then Hummingbird, and finally into OpenText.
Image from Cranberry Township. http://bit.ly/180M1il
- The vendors offering search and content processing solutions seem to have a very distinct trajectory that follows Mr. Card’s essay. I can’t think of many search and content processing vendors that have avoided some type of Gravity’s Rainbow trajectory.
- Search technology seems to be resistant to innovation. The assertions of Fulcrum and Verity are as fresh and buzzwordy today as they were decades ago.
November 28, 2013
I read “Are Big Data Vendors Forgetting History?” I worked through five observations about Big Data and realized that history is essentially irrelevant to Big Data vendors and to some pundits.
I was encouraged by the opening paragraph; to wit:
With any new hot trend comes a truckload of missteps, bad ideas and outright failures. I should probably create a template for this sort of article, one in which I could pull out a term like “cloud” or “BYOD” and simply plug in “social media” or “Big Data.”
My confusion mounted as I worked through the five “history lessons” Datamation sought to teach me:
- Little failures “portend” sometimes big failures
- Fuzzy terminology can “poison the well”
- Details can sidetrack a project
- Technical details are important
- Big Data matters.
Okay, let me address items 3 and 4, the paradox of “details matter” and “details don’t matter.” I am not sure how to resolve these opposites. In my experience, the result, particularly in technology, depends on details. But the details have to fit into some “frame.” A random detail lacks context. Perhaps the lesson is to balance the “vision” with the “execution.” Get one wrong and the other is dragged down. Big Data requires trimming; that is, chopping the data down so that a question can be answered. Once the data set is created and conforms to textbook statistical tests, then a cascade of details takes center stage. Big Data often lacks this organic flow between the two opposites.
With regard to item 1, failure on any scale predicting the future, I am not sure what history teaches. Napoleon hoofed it to Moscow and then a German military leader followed in Napoleon’s footsteps. Er, winter. Food. Resupply. History, like the stock market, does not do much to make prediction a dead certain process. Do technology managers learn from the “past”? In my experience, technology managers do what is necessary to keep their job and make money. Excellence is not as high on this list as one would hope. Tomorrow is like today. “Progress” based on reading tea leaves may be difficult to achieve.
I think that fuzzy terminology, item 2, is an emergent function in technology. Making up words and coining buzzwords performs three jobs. First, it creates an air of specialty or I know something you need to know. Second, it allows an in crowd to form so that outsiders have a tough time getting in the club. Third, marketers can hook vague promises of value to a with it term to close a deal. In the last five years, the technical innovations have been more like refinements than breakthroughs.
Item 5, which suggests that anyone who questions the value of Big Data is taking the easy path forward, is interesting. Big Data, in my view, has been a constant issue. What’s new is the number of companies using the term to describe what have been standard functions. Sure, the aging Hadoop “revolution” eliminates some of the hassles and costs associated with a Codd database. The reality is that most organizations lack the staff, the resources, and the time to convert Big Data into meaningful business activities. (Meaningful means “revenue producing.”)
In short, I find the list interesting, but I don’t think there are many history lessons for me. The write up is more of an apologia for a buzzword that is teaching some people that making sense of available information is dog work, expensive, and often tough to connect to a specific payback.
The reason? Big Data requires trained professionals with expertise in math, statistics, and business processes. Last time I checked, individuals with these capabilities were in short supply. Big Data just gets bigger when there are too few sculptors to chop down the ever growing mountain of bits and bytes.
Stephen E Arnold, November 28, 2013
November 27, 2013
I read “HP’s Meg Whitman Ordered to Face Autonomy Charges.” Hard on the heels of Hewlett Packard’s quarterly results, the company has to explain to one disgruntled shareholder why the Autonomy deal went south.
The write up states:
In the latest $1 billion (£647m) lawsuit, HP shareholders accused HP’s management team of ignoring warnings before it bought Autonomy for $11.3 billion (£7.3bn) in 2011 and that the company’s financial numbers had been exaggerated. It is also claimed that HP tried to get out of the deal before it closed. The company later took a nearly $9 billion write-down largely connected with the purchase.
The deal put a burr under some digital cowpokes’ saddles. HP paid $11 billion for Autonomy. At the time of the deal, Autonomy was an $800 to $900 million a year company. Some months after the deal closed, the canny HP management took an $8 billion write down on the Autonomy deal.
According to the Tech Week Europe article:
The investors allege that HP’s management was negligent because of the $8.8 billion (£5.7bn) write-down on the deal HP announced in November 2012. HP officials blamed ‘accounting irregularities’ by Autonomy executives in the months leading up to the deal. The investors allege that the resulting drop in HP’s stock price effectively wiped billions of dollars from the company’s market value. The FBI are said to be investigating the allegations, as is the UK’s Serious Fraud Office (SFO).
In the meantime, the HP deal has not generated the big time payoff that someone at HP assumed would result from the deal. HP, like many other search vendor buyers, seems to be learning that:
- Search is an expensive business to fund. Those marketing, research, and support costs are brutal. Most of the failed search vendors ran into financial trouble despite the ministrations of different CEOs. Maybe Autonomy was managed better? Interesting question.
- Search, by itself, is not a compelling product or service to many potential customers. As a result, search is no longer search. Search embraces dozens of functions from text mining to the ubiquitous and fuzzy Big Data. HP is now trying to market lots of search related products and services. My hunch is that this is a bigger job than trying to sell $11 billion worth of key word search licenses.
- Companies that are not really software centric do not understand the oddities of the enterprise search sector. My view is that MBAs at outfits like HP assume that their Swiss Army knife budgeting and managing skills are going to “fix up” an outfit like Autonomy. Billions will flow as a result of the MBA approach. Who needs a PhD with an aptitude for math to run a mere search company? HP is coming to grips with its own shortcomings in the vision and motivation departments of Autonomy.
An ironic twist to the tale is that HP licensed the hugely complex, expensive, and cumbersome Verity system. With the purchase of Autonomy, HP became the owner of Verity’s technology. The six figure license deal for Verity is now free when viewed one way. On the other hand, that Verity technology cost HP billions of dollars.
And what about the founder of Autonomy? Dr. Michael Lynch has set up an investment company called Invoke Capital. The company took an interest in Darktrace, a security firm. Dr. Lynch, according to the Financial Times,
…is also a defendant in a suit by HP’s shareholders relating to the acquisition. A court in San Francisco this month gave HP a deadline of January to complete an internal audit, a decision welcomed by Mr Lynch.
The year 2014 may hold more fodder for business school case studies about Hewlett Packard and Autonomy. I am eager.
Stephen E Arnold, November 27, 2013
November 21, 2013
I get a Yahoo Alert. My single Alert topic is “enterprise search.” I want a bound phrase match. Like the other alert services I use, there are usually some obvious “false hits.” A “false hit” is an off topic story. The problem with key word alerts is that words have different meanings. A story about the “search” for a new president often turns up alongside a story about Oracle’s Secure Enterprise Search system. Most of these “false hits” are easily ignored. Another problem is that some “experts” want a user to see something, so the query is relaxed. That’s a problem for me. For you, maybe not. For spammers, relaxation means more content baloney whether generated by an azure chip consultant, search engine optimization maven, or an organization desperate for visibility. In case you have not noticed, traffic to most Web sites is undergoing quite a change. One Web site owner told me, “We averaged 250,000 uniques a month in 2012. This year we are down to 48,000. What am I going to do?”
Go out of business? Change your Web site? Get a different job?
Perhaps the answer is, “Anything.”
Desperation generates some darned interesting business actions in my experience.
There is another problem, particularly with the word “search.” I am interested in enterprise search, and I want to learn about new, substantive information related to information retrieval. The poor word “search” has been sucked dry of meaning. The wispy husk carries zero meaning. For most people search means Google or taking what an app delivers.
I noticed in my Yahoo Alert this morning these two items listed as the number one and number two most relevant stories for me:
Both of these are about an outfit that delivers search engine optimization services. The problem is that this sense of the word “search” is of little interest to me.
What is more interesting is that the outfit generating these items for Yahoo is called PRWeb. I don’t know much about PRWeb. My hunch is that one of the PR professionals I have used over the years knows about this firm.
I wanted to capture several thoughts about what I call “alert corruption.”
Lost and desperate for relevance. Those in the woods are probably evil. See Canto One of the Divine Comedy.
First, Yahoo is not doing a particularly good job providing me with new information about enterprise search. Today I saw items related to OpenText, an outfit that owns a number of search engines. The story, however, talks about enterprise information management. I do not know what that phrase means. There was a story about Imprezzo, a company that purports to “overcome the problem of traditional text based search.” Well, maybe that is worth a look. Of the five items sent me, one was possibly of interest. Does a score of 20 percent warrant a pass or a fail?
Second, four of the items in the Yahoo Alert were from the PRWeb outfit. One thing is certain. PRWeb can get its clients’ content into the Yahoo system. The problem is that two of these stories are about practices that strike me like tight shoes. I suppose the shoes look okay, but I am uncomfortable. SEO outfits and those who assist them have the same effect on me. A buck is a buck, but content manipulation is like wearing small shoes that are damp.
Third, after 40 or 50 years of search innovation, endless surveys from outfits like azure chip consultants and morphing vendors like BA Insight, Smartlogic, and LucidWorks, I am not sure if significant information retrieval progress is evident. One would think that Yahoo would tap some super sophisticated new technology to filter out baloney, deliver on point alerts, and work with vendors who exercise some judgment about what passes for search related content.
My hunch is that PR is in a bit of a sticky wicket. It joins content management, governance, search, and Big Data. These disciplines have to find some way to call attention to themselves. Perhaps these “legitimate” disciplines should emulate the search engine optimization crowd. Visibility without a thought about precision and recall is their game.
I would like to receive alerts that actually match the string “enterprise search.” I think that is just too much for those who think that a user absolutely must have a “hit” whether that item is relevant or not.
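The difference between a bound phrase match and a relaxed query fits in a few lines. What follows is a minimal sketch, not a description of how Yahoo or any alert service works. The headlines are invented stand-ins for the kinds of stories described above, and the function names `phrase_match`, `relaxed_match`, and `precision` are my own. The precision figure is the usual ratio of relevant items to delivered items, the same arithmetic as one useful item out of five.

```python
import re


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(text, phrase):
    """True only when the phrase's words appear adjacent and in order."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def relaxed_match(text, phrase):
    """True when any single word of the phrase appears anywhere in the text."""
    return bool(set(tokens(text)) & set(tokens(phrase)))


def precision(relevant, delivered):
    """Share of delivered items that were actually relevant."""
    return relevant / delivered if delivered else 0.0


if __name__ == "__main__":
    # Invented headlines for illustration only.
    headlines = [
        "Oracle updates Secure Enterprise Search",
        "University board begins search for a new president",
        "Agency offers search engine optimization packages",
        "Enterprise information management suite announced",
    ]
    query = "enterprise search"
    for headline in headlines:
        print(phrase_match(headline, query), relaxed_match(headline, query), headline)
    print(precision(1, 5))
```

The relaxed matcher passes all four invented headlines. The bound phrase matcher passes only the first. The cost of strictness is the occasional empty alert, which is the outcome the “must have a hit” crowd wants to avoid.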
Search and marketing may be a match made in heaven. Those who are interested in precision and recall occupy one of Dante’s less salubrious regions.
Stephen E Arnold, November 21, 2013
November 14, 2013
I have not heard much about Hewlett Packard Autonomy search. In the pre-buy out deals, I was seeing announcements about IDOL. The flow of information about Autonomy search has slowed. For example, I navigated to NewsNow.co.uk and ran a query for HP Autonomy. Here is the list of “hits” displayed for me for the last four weeks:
Autonomy was one of the most ferocious marketers of its search technology. What jumps out at me is that Hewlett Packard is pitching jargon for customer support, education, and marketing. Augmented reality, perhaps Google Glass envy, makes an appearance as well.
I don’t know how HP will generate sufficient revenue from these products to pay off the $10 billion Autonomy purchase price. I find it interesting that search seems to be a second or third class citizen in the new HP world. I assume HP has its vision. Too bad that search has been marginalized in the stream of “news” in the last four weeks.
Stephen E Arnold, November 14, 2013
November 8, 2013
I read “America’s Media Guzzling Ways.” Good word “guzzling” or “guzzlin” as it is pronounced in rural Kentucky. The write up contained a factoid that I find difficult to grasp; to wit:
The amount of media data, measured in printed text, that Americans consumed last year. That’s 6.9 zettabytes—6.9 million-million gigabytes—to be exact.
Let’s assume that the figure is dead accurate or a couple of zettabytes, plus or minus. According to the article, each person in the US spends 15 hours a day checking Facebook, watching videos, and tapping screens.
My reaction is that the consumption of media contributes to these observed events yesterday:
- A sponsored event at a trade show was attended by about 15 people. None of those at the hoe down were employees of the company. I suppose the guzzling of digital content was more important than showing up and pretending to be thrilled that potential customers were eating free snacks and drinking no name beverages. YouTube cannot wait, people.
- A conference program that did not include information about one of the speakers. Heck, it was an oversight even though that speaker was paid to attend, received a free hotel room, and a free registration. Facebook posts take priority with this outfit I surmise.
- A sign at the National Press Club that contained a misspelling. It is the spelling checker’s fault was one explanation. SMS spelling is the way to go. LOL
- I asked a bus driver for directions: “Where is 999 9th Street, NW?” The professional driver replied, “Dude, my iPhone is not connecting. Ask someone else.” The bus driver did not meet my gaze. He was frantically scanning the street for a mobile phone shop.
The article helps me understand why information presented on a mobile device is perceived as accurate, complete, and current. The grazing public has neither the time nor the grit to do much reading, writing, or arithmetic I fear. Oh, as the National Press Club sign maker would have it: Readin, writin, rithmetic, and guzzlin.
One person looked for Cuba Libre Restaurant using Google Maps. No joy. The system displayed four choices, none of which was the desired restaurant. The smart system made it impossible for the iPhone user to locate the destination. Fascinatin’.
Stephen E Arnold, November 8, 2013
November 4, 2013
A unified information access solutions vendor, BA Insight, appointed a new CEO recently. We learned more about the person stepping into this role in the article, “BA Insight Appoints Massood Zarrabian as Chief Executive Officer.” Massood Zarrabian is a software industry veteran with particular experience in growing venture funded software companies.
Zarrabian formerly served as President and CEO of OutStart, a provider of knowledge and management solutions (which was acquired by Kenexa). The article repeatedly emphasizes his ability to scale.
The article shares a quote from Zarrabian himself:
“I am honored to have the opportunity to work with an extraordinary group of experts who have shaped the industry, and have a successful track record of delivering valuable solutions to our customers. Our customers have implemented powerful search-driven applications, rapidly, at a fraction of the cost, time and risk of traditional alternatives; anytime you can save buyers cost and shrink time-to-value, you have defensible competitive advantage.”
This is another example of a company undergoing a period of management shuffling. Only time will tell how these new placements and positions will pan out. Will we see venture capitalists taking action? We will keep our eyes peeled.
Megan Feil, November 04, 2013