March 9, 2014
Figure skating, anyone? You can do a Salchow jump. The skater has some options. Falling is not one of them. The idea is to leap from one foot to another. The Axel jump tosses in some spinning; for example, a triple Axel is 3.5 revolutions. Want creativity? The skater can flip, bunny hop, and Mazurka.
But the ice has to be right. Skating requires a Zamboni. Search requires information retrieval that works.
One should not confuse a Zamboni with an ageing ice skater.
Fast Search & Transfer has just come back from an extended training period and is ready to perform. The founder may be retiring after an unfavorable court decision. The Fast Search Linux and Unix customers have been blown off. But, according to Fortune/CNN, Microsoft has made enterprise search better. Give the skater a three for that jump called Office 365.
Navigate to “Can Microsoft Make Enterprise Search Better?” The subtitle is ripe with promise: “Updates to its Office 365 suite show benefits from a 2008 acquisition.” There you go. Technology from the late 1990s, a withdrawal from Web search, a run at unseating Autonomy as the leading provider of enterprise content processing, and allegations of financial wrongdoing, and you have a heck of a base from which to “make enterprise search better.”
At one time, Fast Search offered an alternative to Google’s Web search system. The senior management of Fast Search decided to cede Web search to Google and pursue dominance in the enterprise search market. Well, how did that work out? The shift from the Web to the enterprise worked for a while, but the costs of customer support, sales, and implementation put the company in a bind. The result was a crash to the ice.
Microsoft bought the sliding Fast Search operation and embarked on a journey to make content in SharePoint findable. The effort was a boon to second tier search vendors who offered SharePoint licensees a search and retrieval system. Most of these vendors are all but unknown outside of the 150 million SharePoint license base. Others have added new jumps to their search routines and have skated to customer support and business intelligence.
January 25, 2014
The ArnoldIT Overflight snagged this headline on January 16, 2014: ‘Chocolate Toothpaste’ Prevents Tooth Decay. Unusual news can be entertaining. Then on January 24, 2014, I spotted this headline: P&G’s Chocolate Toothpaste: Innovation or Desperation? The source of this story was not a secondary information source. The Innovation or Desperation item appeared in Bloomberg Businessweek.
Here’s the quote to note:
The line, which P&G (PG) promises to start selling soon, comprises three flavors: “Mint Chocolate Trek,” “Lime Spearmint Zest,” and “Vanilla Mint Spark.” Here’s how the Crest marketing team describes the new paste: “It’s a whole new world of deliciousness for toothbrushes everywhere.”
The hook for me was not “chocolate toothpaste prevents tooth decay.” This assertion reminds me of marketers who assert that a particular search system delivers value or understands human discourse. The problem is that the association of “chocolate AND tooth decay” is easier for me to grasp than “chocolate PREVENTS tooth decay.”
With search, value is difficult to demonstrate. Search costs money, generates more work because documents have to be opened and read, or creates a willingness among busy users to assume that a search result is a correct result.
The Businessweek story connects “innovation” and “desperation.” Marketers have hit upon a product innovation that will make some influencers go for the chocolate toothpaste.
Search and content processing vendors have been following this path for many years. Not only is the uptake of new jargon standard operating procedure for search vendors, the consultants and experts working in search are turbochargers of constant exploration of ways to make search have sizzle. Search is analytics, taxonomy, knowledge, Big Data, etc.
I recall that in one of my university’s required classes, one professor insisted that ancient people cleaned their teeth with twigs. I also know that search methods from the years before “smart software” worked as well.
Progress in search and retrieval, like the chocolate toothpaste innovation, is not “innovation and desperation.” The juxtaposition of attributes is another indicator that the disconnect between expectations and reality is a characteristic of business today.
Will chocolate toothpaste work better than “regular” toothpaste? Will the new search system work better than “regular” search systems? Under specific test conditions, it is possible to “prove” efficacy. But in the real world, toothpaste like search has a baseline of performance. Wordsmithing, odd juxtapositions, and cleverness cannot be confused with Daliesque novelty.
Stephen E Arnold, January 25, 2014
January 15, 2014
I read “A Search Engine That ‘Makes’ Data-Driven Business Decisions.” The angle of the article was making decisions without old-fashioned search.
In the article/interview, Attivio (a user of open source software) positions the company in this way:
Attivio is focused on unifying information. The end goal of big data is to make some insight that is actionable.
I understand. Actionable information. The article explains:
With search engines, there is no concept that one page is linked to another. So, we added a graph engine, a mathematical graph, where there are nodes and links between them. We use the graph to link the results in query A to all possible results in query B. So, it is incredibly fast.
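As a rough sketch of the idea in the quote, here is a minimal graph layer in Python. Everything here is hypothetical (the document IDs, the function name, the all-pairs linking policy); the point is only that linking the results of query A to the results of query B produces edges that can be traversed instead of re-searched.

```python
from collections import defaultdict

def link_results(graph, results_a, results_b):
    """Add an edge from each result of query A to each result of query B.

    `graph` maps a document ID to the set of document IDs it links to.
    This all-pairs policy is an illustrative assumption, not Attivio's
    actual algorithm.
    """
    for doc_a in results_a:
        for doc_b in results_b:
            if doc_b != doc_a:
                graph[doc_a].add(doc_b)
    return graph

graph = defaultdict(set)
link_results(graph, ["doc1", "doc2"], ["doc3"])
print(sorted(graph["doc1"]))  # ['doc3']
```

Once the edges exist, answering “what relates to doc1?” is a dictionary lookup rather than a second query, which is one plausible reading of the “incredibly fast” claim.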
Does this remind anyone other than me of Autonomy’s embedded link invention from six or seven years ago?
Then I learn about the magic:
They put them in front of this interface. When a ticket comes in, they automatically identify the related content across all sources. Now, the sysadmins are happier, the company is happier, and these folks are learning all about this environment that they don’t really need to be trained on.
In short, the system goes beyond search just like IBM Watson, HP Autonomy, Palantir, and dozens of other content processing vendors.
Will Attivio become the next $800 million in revenue search vendor? There are some heavy hitters chasing the same brass ring. So far most search vendors get stuck under the $100 million glass ceiling. With open source software offering a lower cost option, how will the dozens of information retrieval cum business intelligence systems fare in 2014? Good question. No answers yet.
Stephen E Arnold, January 15, 2014
January 14, 2014
I marveled at the buzzword-to-English ratio in “Your 2014 Heat Map for Enterprise Technology.” You must read the article yourself. My focus is upon the jargon in the InfoWorld article. I must admit I don’t have the faintest idea what some of the terms mean, but I would bet 25 cents that most of the azure chip consultants, unemployed middle school teachers, and the recent spate of unemployed grads with JD degrees don’t know either. You, gentle reader, are in full command of Baloneyglish. You will have no problems using and defining these terms. The alphabetical selected list of 2014 hoohah is:
back ends for cloud services
cloud data integration
cloud operating systems
cloud scale hairball
cloud test infrastructure
consumerization of IT
core application code base
data layer technologies
deep layer technologies
dynamic enterprise systems
event stream processing
ground zero of enterprise innovation
hydra-headed personal computers
hyper connected cloud
mobile app lifecycle
mobile back end as a service
mobile computing
notifications
one large distributed cache
orchestrating data centers
platform as a service or PaaS
semi structured data
SDx (You may want to think of CxO)
server side storage caching
software defined infrastructure
software defined storage
software defined data center
switches into drones
systems of engagement
systems of record
the third platform
virtualized infrastructure resources
What’s my favorite?
Cloud scale hairball.
I might even be able to define that concept with a few references to HealthCare.gov and the UK Ministry of Defense’s Recruitment Partnering Project.
Stephen E Arnold, January 14, 2014
January 10, 2014
Fast Company published “IBM’s Watson For Business: The $1 Billion Siri Slayer.” The write-up offers some nuggets of information that convert Watson from a search system into the next Apple or Google. Frankly, I find this notion somewhat amusing.
The story reports this interesting assertion, “IBM wants to transform Watson into a Siri for business.” Quite an analogy.
I also noted these items:
- Stephen Gold is the vice president of IBM Watson Solutions
- Watson Discovery Advisor will be a product/service for publishing, education, and health care
- Watson Analytics Advisor appears to be an interactive analytics solution
- An ecosystem will be built around the Watson Application Programming Interface and “the Watson headquarters will also include space for a tech incubator for startups building Watson-based apps”
- Watson will be deployed on Softlayer, an IBM cloud computing service. Apparently some eager Watson prospects have an appetite for Softlayer-delivered Watson.
I marked a quote to note from Mr. Gold and Fast Company:
Watson for Business is “one of the top innovations in IBM’s history” and it could even be the biggest IBM innovation since the IBM PC.
IBM seems to have made a different executive available to the Wall Street Journal and the New York Times. My hunch is that the cheerleading will continue for a while.
Meanwhile where’s the online demonstration of Watson’s functionality? I want to see how the system compares to Hewlett Packard’s Autonomy technology, check out the visualizations to see if they are different from IBM i2’s, and figure out if the analytics are recycled SPSS functions or something different.
Stephen E Arnold, January 10, 2014
January 6, 2014
I follow two or three LinkedIn groups. Believe me. The process is painful. On the plus side, LinkedIn’s discussions of “enterprise search” reveal the broken ribs in the body of information retrieval. On the surface, enterprise search and content processing appear to be fit and trim. The LinkedIn discussion X-ray reveals some painful and potentially life-threatening injuries. Whether it is marketing professionals at search vendors or individuals with zero background in information retrieval, the discussions often give me a piercing headache.
The eruption of digital information posed a challenge to UK firms in Autonomy’s “Information Black Holes” report. © Autonomy, 1999
One of the “gaps” in the enterprise search sector is a lack of historical perspective. Moderators and participants see only the “now” of their search work. When looking down the information highway, the LinkedIn search group participants strain to see bright white lines. Anyone who has driven on the roads in Kentucky knows that lines are neither bright nor white. Most are faded, mere suggestions of where the traffic should flow.
In 1999, I picked up a printed document called “Information Black Holes.” The subtitle was this question, “Will the Evolution of EIPs Save British Business £17 Billion per Year?” The author of the report was an azure chip consulting firm doing business as “Continental Research.” The company sponsoring the research was Autonomy. Autonomy as a concept relates to “automatic,” “automation,” and “autonomous.” This connotation is a powerful one. Think “automation” and the mind accepts an initial investment followed by significant cost reductions. Autonomy had a name and brand advantage from its inception. Who remembers Cambridge Neurodynamics? Not many of the 20 somethings flogging search and content processing systems in 2014, I would wager.
As you may know, Hewlett Packard purchased Autonomy in 2011. I doubt that HP has a copy of this document, and I know that most of the LinkedIn enterprise search group members have not read the report. I understand, because 15-year-old marketing collateral (unlike Kentucky bourbon) does not often improve with age. But “Information Black Holes” is an important document. Unwittingly, today’s enterprise search vendors are addressing many of the topics set forth in the 1999 Autonomy publication.
December 20, 2013
One of the ArnoldIT goslings called to my attention a 2011 PDF white paper with the title (I kid you not):
Human inFormation (sic): Cloud, pan enterprise search, automation, video search, audio search, discovery, infrastructure platform, Big Data, business process management, mobile search, OEMs, and advanced analytics.
I checked on December 19, 2013, and this PDF was available at http://bit.ly/19Vwkqg.
That covers a lot of ground even for HP with or without Autonomy. The analysis includes some “factoids”; for example:
- Unstructured data represents 85% of all information, but structured information is growing at 22% CAGR
- Unstructured information is growing at 62% CAGR.
- Users upload 35 hours of video every minute
- Unstructured data will grow to over 35 zettabytes by 2020
- Videos on YouTube were viewed 2 billion times per day, 20 times more than in 2006.
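The CAGR claims above are easy to sanity check with a few lines of arithmetic. The numbers below are simple compound-growth projections from the report's stated rates, not figures from the HP white paper itself.

```python
def growth(initial, rate, years):
    """Project a quantity growing at a compound annual rate.

    Compound annual growth rate (CAGR) means the quantity is
    multiplied by (1 + rate) every year.
    """
    return initial * (1 + rate) ** years

# At 62% CAGR, unstructured data grows more than 11x in five years:
print(growth(1.0, 0.62, 5))  # ≈ 11.2
# At 22% CAGR, structured data grows only about 2.7x in the same span:
print(growth(1.0, 0.22, 5))  # ≈ 2.7
```

Run side by side, the two rates show why the white paper frames unstructured content as the problem: the gap between the two curves widens every year.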
You get the idea. With lots of data, information is a problem. I need to pause a moment and catch my breath.
Well, “it’s not just about search.” Again, I must pause. One Mississippi, two Mississippi, and three Mississippi. Okay.
Fundamentally, the ability to understand meaning and automatically process information is all about distance, probabilities, relativeness (sic), definitions, slang, and more. It is an overwhelming and continually growing problem that requires advanced technology to solve.
One technique is to use structured data methods to solve the unstructured problem. (Wasn’t this the approach taken by Fulcrum Technologies, what, 25 or 30 years ago?) I just read a profile of Fulcrum that suggested Fulcrum did this first and continues chugging along within the OpenText product line up, which competes directly with HP in information archiving.
HP points out, “People are Lazy.” More interesting is this observation, “People are stupid.” I thought about HP’s write off of billions after owning a company for a couple of years, but I assume that HP means “other people” are stupid, not HP people.
December 15, 2013
If you are interested in “artificial intelligence” or “artificial general intelligence”, you will want to read “Creative Blocks: The Very Laws of Physics Imply That Artificial Intelligence Must Be Possible. What’s Holding Us Up?” Artificial General Intelligence is a discipline that seeks to replicate the human brain in a computing device.
Dr. Deutsch asserts:
I cannot think of any other significant field of knowledge in which the prevailing wisdom, not only in society at large but also among experts, is so beset with entrenched, overlapping, fundamental errors. Yet it has also been one of the most self-confident fields in prophesying that it will soon achieve the ultimate breakthrough.
The question of making a machine’s brain work like a human’s has, says Dr. Deutsch:
split the intellectual world into two camps, one insisting that AGI was none the less impossible, and the other that it was imminent. Both were mistaken. The first, initially predominant, camp cited a plethora of reasons ranging from the supernatural to the incoherent. All shared the basic mistake that they did not understand what computational universality implies about the physical world, and about human brains in particular. But it is the other camp’s basic mistake that is responsible for the lack of progress. It was a failure to recognize that what distinguishes human brains from all other physical systems is qualitatively different from all other functionalities, and cannot be specified in the way that all other attributes of computer programs can be. It cannot be programmed by any of the techniques that suffice for writing any other type of program. Nor can it be achieved merely by improving their performance at tasks that they currently do perform, no matter by how much.
One of the examples Dr. Deutsch invokes is IBM’s game show “winning” computer Watson. He explains:
Nowadays, an accelerating stream of marvelous and useful functionalities for computers are coming into use, some of them sooner than had been foreseen even quite recently. But what is neither marvelous nor useful is the argument that often greets these developments, that they are reaching the frontiers of AGI. An especially severe outbreak of this occurred recently when a search engine called Watson, developed by IBM, defeated the best human player of a word-association database-searching game called Jeopardy. ‘Smartest machine on Earth’, the PBS documentary series Nova called it, and characterized its function as ‘mimicking the human thought process with software.’ But that is precisely what it does not do. The thing is, playing Jeopardy — like every one of the computational functionalities at which we rightly marvel today — is firmly among the functionalities that can be specified in the standard, behaviorist way that I discussed above. No Jeopardy answer will ever be published in a journal of new discoveries. The fact that humans perform that task less well by using creativity to generate the underlying guesses is not a sign that the program has near-human cognitive abilities. The exact opposite is true, for the two methods are utterly different from the ground up.
IBM surfaces again with regard to playing chess, a trick IBM demonstrated years ago:
Likewise, when a computer program beats a grandmaster at chess, the two are not using even remotely similar algorithms. The grandmaster can explain why it seemed worth sacrificing the knight for strategic advantage and can write an exciting book on the subject. The program can only prove that the sacrifice does not force a checkmate, and cannot write a book because it has no clue even what the objective of a chess game is. Programming AGI is not the same sort of problem as programming Jeopardy or chess.
After I read Dr. Deutsch’s essay, I refreshed my memory about Dr. Ray Kurzweil’s view. You can find an interesting essay by this now-Googler in “The Real Reasons We Don’t Have AGI Yet.” The key assertions are:
The real reasons we don’t have AGI yet, I believe, have nothing to do with Popperian philosophy, and everything to do with:
- The weakness of current computer hardware (rapidly being remedied via exponential technological growth!)
- The relatively minimal funding allocated to AGI research (which, I agree with Deutsch, should be distinguished from “narrow AI” research on highly purpose-specific AI systems like IBM’s Jeopardy!-playing AI or Google’s self-driving cars).
- The integration bottleneck: the difficulty of integrating multiple complex components together to make a complex dynamical software system, in cases where the behavior of the integrated system depends sensitively on every one of the components.
Dr. Kurzweil concludes:
The difference between Deutsch’s perspective and my own is not a purely abstract matter; it does have practical consequence. If Deutsch’s perspective is correct, the best way for society to work toward AGI would be to give lots of funding to philosophers of mind. If my view is correct, on the other hand, most AGI funding should go to folks designing and building large-scale integrated AGI systems.
These discussions are going to be quite important in 2014. As search systems do more thinking for the human user, disagreements that appear to be theoretical will have a significant impact on what information is displayed for a user.
Do users know that search results are shaped by algorithms that “think” they are smarter than humans? Good question.
Stephen E Arnold, December 15, 2013
December 13, 2013
I have been working through some of the archives in my personal file about search vendors. I came across a wonderfully amusing article from DM Review titled “The Problem with Unstructured Data.”
Here’s the part I circled in 2003, a decade ago, about the next big thing:
Content intelligence is maturing into an essential enterprise technology, comparable to the relational database. The technology comes in several flavors, namely: search, classification and discovery. In most cases, however, enterprises will want to integrate this technology with one or more of their existing enterprise systems to derive greater value from the embedded unstructured data. Many organizations have identified high-value, content intelligence-centric applications that can now be constructed using platforms from leading vendors. What will make content intelligence the next big trend is how this not-so-new set of technologies will be used to uncover new issues and trends and to answer specific business questions, akin to business intelligence. When this happens, unstructured data will be a source of actionable, time-critical business intelligence.
I can see this paragraph appearing without much of a change in any one of a number of today’s vendors’ marketing collateral.
I just finished an article about the lack of innovation in search and content processing. My focus in that essay was from 2007 to the present. I will keep my eyes open for examples of jargon and high-flying buzzwords that reach even deeper into the forgotten past of search and retrieval.
The chit chat on LinkedIn about “best” search system is a little disappointing but almost as amusing as this quote from DM Review. Yep, “content intelligence” was the next big thing a decade ago. I suppose that “maturing” process is like the one used for Kentucky bourbon. No oak barrels, just hyperbole, for the search mavens.
Stephen E Arnold, December 13, 2013
December 12, 2013
Short honk. I came across an interesting marketing concept in “Diffbot and Semantria Join to Find and Parse the Important Text on the ‘Net (Exclusive).”
Semantria (a company that offers sentiment analysis as a service) participated in a hackathon in San Francisco. The article explains:
To make the Semantria service work quickly, even for text-mining novices, Rogynskyy’s team decided to build a plugin for Microsoft’s popular Excel spreadsheet program. The data in a spreadsheet goes to the cloud for processing, and Semantria sends back analysis in Excel format.
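The round trip described in the quote (spreadsheet cells go out, sentiment labels come back) can be sketched in a few lines. This is a toy word-list scorer standing in for the cloud call; the word lists, function name, and scoring rule are all hypothetical illustrations, not Semantria's actual service or API.

```python
# Hypothetical stand-in for a cloud sentiment service: each spreadsheet
# cell is scored by counting positive and negative words it contains.
POSITIVE = {"great", "love", "happy"}
NEGATIVE = {"bad", "hate", "angry"}

def score_cell(text):
    """Return a sentiment label for one cell of text."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

rows = ["I love this product", "bad support, very angry", "arrived Tuesday"]
print([score_cell(r) for r in rows])  # ['positive', 'negative', 'neutral']
```

A real service would replace `score_cell` with an HTTP call to the vendor’s API, but the plugin’s job is the same: ship the cells out, write the labels back into the spreadsheet.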
Semantria sponsored a prize for the best app. Diffbot won:
A Diffbot developer built a simple plugin for Google’s Chrome browser that changes the background color of messages on Facebook and Twitter based on sentiment — red for negative, green for positive. The concept won a prize from Semantria, Rogynskyy said. A Diffbot executive was on hand at the hackathon, and Rogynskyy started talking with him about how the two companies could work together.
I like the “sponsor,” “winner,” and “team up” approach. The payoff, according to the article, is: “While Semantria and Diffbot technologies continue to be available separately, they can now be used together.”
Sentiment analysis is one of the search submarkets that caught fire and then, based on the churning at firms like Attensity, may be losing some momentum. Marketing innovation may be a goal for other firms offering this functionality in 2014.
Stephen E Arnold, December 12, 2013