
Another Categorical Affirmative: Nobody Wants to Invest in Search

October 8, 2015

Gentle readers, I read “Autonomy Poisoned the Well for Businesses Seeking VC Cash.” Keep in mind that I am capturing information which appeared in a UK publication. I find this type of essay interesting and entertaining. Will you? Beats me. One thing is certain. This topic will not be fodder for the LinkedIn discussion groups, the marketers hawking search and retrieval at conferences to several dozen fellow travelers, or the consultant reports promoting the almost unknown laborers in the information access vineyards.

Why not?

The problem with search reaches back a few years, but I will add a bit of historical commentary after I highlight what strikes me as the main point of the write up:

Nobody wants to invest in enterprise search, says startup head. Patrick White, Synata

Many enterprise search systems are a bit like the SS United States, once the slickest ocean liner in the world. The ship looks like a ship, but the effort involved in making it seaworthy is going to be a project with a hefty price tag. Implementing an enterprise search solution is similar to this type of ocean-going effort.

There you go. “Nobody.” A categorical in the “category” of logic like “All men are mortal.” Remarkable because outfits like Attivio, Coveo, and Digital Reasoning, among others, have received hefty injections of venture capital in recent memory.

The write up makes this interesting point:

“I think Autonomy really messed up [the space]”, and when investors hear ‘enterprise search for the cloud’ it “scares the crap out of them”, he added. “Autonomy has poisoned the well for search companies.” However, White added that Autonomy was just the most high profile example of cases that have scared off investors. “It is unfair just to blame Autonomy. Most VCs have at least one enterprise search in their portfolio. So VCs tend to be skittish about it,” he added.

I am not sure I agree. Before there was Autonomy, there was Fulcrum Technologies. The company’s marketing literature is as fresh today as it was in the 1990s. The company was up, down, bought, and merged. The story of Fulcrum, at least up to 2009 or so, is available at this link.

The hot and cold nature of search and content processing may be traced through the adventures of Convera (formerly Excalibur Technologies) and its relationships with Intel and the NBA, Delphes (a Canadian flame out), Entopia (a we can do it all), and, of course, Fast Search & Transfer.

Now Fast Search, like most old school search technology, is very much with us. For a dose of excitement one can have Search Technologies (founded by some Convera wizards) implement Fast Search (now owned by Microsoft).

Where Are the Former Big Six in Enterprise Search Vendors: 2004 and 2015

Autonomy, now owned by HP and mired in litigation over allegations of financial fraud

Convera, after struggles with Intel and NBA engagements, portions of the company were sold off. Essentially out of business. Alums are consultants.

Endeca, owned by Oracle and sold as an eCommerce and business intelligence service. Oracle gives away its own enterprise search system.

Exalead, owned by Dassault Systèmes and now marketed as a product component system. No visibility in the US.

Fast Search, owned by Microsoft and still available as a utility for SharePoint. The technology dates from the late 1990s. Brand is essentially low profiled at this time.

Verity, Autonomy purchased Verity and used its customer list for upsales and used the K2 technology as part of the sprawling IDOL suite.

Fast Search reported revenues which after an investigation and court procedure were found to be a bit enthusiastic. The founder of Fast Search was the subject of the Norwegian authorities’ attention. You can check out the news reports about the prohibition on work and the sentence handed down for the issues the authorities concluded warranted a slap on the wrist and a tap on the head.

The story of enterprise search has been efforts, sometimes Herculean, to sell information access companies. When a company like Vivisimo sells for about one year’s revenues, or an estimated $20 million, there is a sense of getting that mythic task accomplished. IBM, like most of the other acquirers of search technology, tries valiantly to convert a utility into something with revenue lift. As I watch the evolution of the lucky exits, my overall impression is that the purchasers realize that search is a utility function. Search can generate consulting and engineering fees, but the customers want more.

That realization leads to the wild and crazy hyper marketing for products like Hewlett Packard’s cloud version of Autonomy’s IDOL and DRE technology or IBM’s embrace of open source search and the wisdom of wrapping that core with functions.

Enterprise search, therefore, is alive and well within applications or solutions that are more directly related to something that speaks to senior managers; namely, making sales and reducing costs.

What’s the cost of making sure the controls for an enterprise search system are working and doing the job the licensee wants done?

The problem is the credit card debt load which Googlers explained quite clearly. Technology outfits, particularly information access players, need more money than it is possible for most firms to generate. This contributes to the crazy flips from search to police analysis, from looking up an entry in a data base to an assertion that customer support is enabled, hunting for an article in this blog is now real time, active business intelligence, or indexing by proper noun like White House morphs into natural language understanding of unstructured text.

Investments are flowing to firms which could be easily positioned as old school search and retrieval operations. Consider Lexmark, a former unit of IBM, and an employer of note not far from my pond filled with mine run off in Kentucky. The company, like Hewlett Packard, wants to find a way to replace its traditional business which was not working as planned as a unit of IBM. Lexmark bought Brainware, a company with patents on trigram methods and a good business for processing content related to legal matters. Lexmark is doing its best to make that into a Trump scale back office content processing business. Lexmark then bought a technology dating from the 1980s (ISYS Search Software once officed in Crow’s Nest I believe) and has made search a cornerstone of the Lexmark next generation health care money spinning machine. Oracle has a number of search properties. Most of these are unknown to Oracle DBAs; for example, Artificial Linguistics, TripleHop, InQuira’s shotgun NLP technology, etc. The point is that the “brands” have not had enough magnetism to pull revenues on a stand alone basis.

Success measured in investment dollars is not revenue. Palantir is, in effect, a search and retrieval outfit packaged as a super stealthy smart intelligence system. Recorded Future, funded by Google and In-Q-Tel, is doing a bang up job with specialized content processing. These are, remember, search and retrieval companies.

The money in search appears to be made in these plays:

  • The Fast Search model. Short cuts until an investigator puts a stop to the activities.
  • Creating a company and then selling it to a larger firm with a firm conviction that it can turn search into a big time money machine
  • Buying a search vendor to get its customers and opportunities to sell other enterprise software to those customers
  • Creating a super technology play and going after venture funding until a convenient time arrives to cash out
  • Pursue a dream for intelligent software and survive on research grants.

This list does not exhaust what is possible. There are me-too plays. There are mobile niche plays. There are apps which are thinly disguised selective dissemination of information services.

The point is that Autonomy is a member of the search and retrieval club. The company’s revenues came from two principal sources:

  1. Autonomy bought companies like Verity and video indexing and management vendor Virage and then sold other products to these firms’ clients and incorporated some of the acquired technology into products and services which allowed Autonomy to enter a new market. Remember Autonomy and enhanced video ads?
  2. Autonomy managed well. If one takes the time to speak with former Autonomy sales professionals, the message is that life was demanding. Sales professionals including partners had to produce revenue or some face time with the delightful Dr. Michael Lynch or other senior Autonomy executives was arranged.

That’s it. Upselling and intense management for revenues. Hewlett Packard was surprised at the simplicity of the Autonomy model and apparently uncomfortable with the management policies and procedures that Autonomy had been using in highly visible activities for more than a decade as a publicly traded company.

Perhaps some sources of funding will disagree with my view of Autonomy. That is definitely okay. I am retired. My house is paid for. I have no charming children in a private school or university.

The focus should be on what the method for generating revenue is. The technology is of secondary importance. When IBM uses “good enough” open source search, there is a message there, gentle reader. Why reinvent the wheel?

The trick is to ask the right questions. If one does not ask the right questions, the person doing the querying is likely to draw incorrect conclusions and make mistakes. Where does the responsibility rest? When one makes a bad decision?

The other point of interest should be making sales. Stated in different terms, the key question for a search vendor, regardless of camouflage, is, “What problem are you solving?” Then ask, “Will people pay money for this solution?”

If the search vendor cannot or will not answer these questions and provide data to be verified, the questioner runs the risk of taking the SS United States for a cruise as soon as the questioner has refurbed the ship, made it seaworthy, and hired a crew.

The enterprise search sector is guilty of making a utility function appear to be a solution to business uncertainty. Why? To make sales. Caveat emptor.

Stephen E Arnold, October 8, 2015

Fighting the Academic Publishers Gets You Fired

September 11, 2015

Academic publishers, such as Springer and Elsevier, have a monopoly on academic publishing and they do not want to lose their grasp.  In the Slashdot science forum, a report from The Guardian was posted “Paywalled Science Journals Under Fire Again” describing how the academic publishers won a battle in Australia.

The Medical Journal of Australia (MJA) fired their editor Professor Stephen Leeder, when he expressed his displeasure over the journal outsourcing its functions to Elsevier.  Leeder might have lost his job, but he will speak at a symposium at the State Library of NSW about ways academic communities can fight against the commoditization of knowledge.

What is concerning is that academic publishers are more interested in turning a profit than expanding humanity’s knowledge base:

“Alex Holcombe, an associate professor of psychology who will also be presenting at the symposium, said the business model of some of the major academic publishers was more profitable than owning a gold mine. Some of the 1,600 titles published by Elsevier charged institutions more than $19,000 for an annual subscription to just one journal. The Springer group, which publishes more than 2,000 titles, charges more than $21,000 for access to some of its titles. ‘The mining giant Rio Tinto has a profit margin of about 23%,’ Holcombe said. ‘Elsevier consistently comes in at around 37%. Open access publishing is catching on, but it requires researchers to pay up to $3000 to get a single open access article published.’”

Where does the pursuit of knowledge actually take place if researchers are at the mercy of academic publishers?  One might say that researchers could publish their work for free on the Web, but remember that anyone can do that.  Being published under a reputable banner adds to a study’s authenticity and also helps it get used to support other research.  The problem lies in the fact that big academic publishers limit access to their content to subscription holders, and often those subscriptions are too expensive for the average researcher to afford on their own.  Researchers want to have access to more academic content, but it is being locked down.

Whitney Grace, September 11, 2015
Sponsored by, publisher of the CyberOSINT monograph

Algorithms Still Need Oversight

September 8, 2015

Many have pondered what might happen when artificial intelligence systems go off the rails. While not spectacular enough for Hollywood, some very real consequences have been observed; the BBC examines “The Bad Things that Happen When Algorithms Run Online Shops.”

The article begins by relating the tragic tale of an online T-shirt vendor who just wanted to capitalize on the “Keep Calm and Carry On” trend. He set up an algorithm to place random terms into the second half of that oft-copied phrase and generate suggested products. Unfortunately, the list of phrases was not sufficiently vetted, resulting in a truly regrettable slogan virtually printed on virtual examples. Despite the fact that the phrase appeared only on the website, not on any actual shirts, the business never recovered its reputation and closed shortly thereafter. Reporter Chris Baraniuk writes:

“But that’s the trouble with algorithms. All sorts of unexpected results can occur. Sometimes these are costly, but in other cases they have benefited businesses to the tune of millions of pounds. What’s the real impact of the machinations of machines? And what else do they do?”

Well, one other thing is to control prices. Baraniuk reports that software designed to set online prices competitively, based on what other sites are doing, can cause prices to fluctuate day-to-day, sometimes hour-to-hour. Without human oversight, results can quickly become extreme at either end of the scale. For example, for a short time last December, prices of thousands of products sold through Amazon were set to just one penny each. Amazon itself probably weathered the unintended near-giveaways just fine, but smaller merchants selling through the site were not so well-positioned; some closed as a direct result of the error. On the other hand, vendors trying to keep their prices as high as feasible can make the opposite mistake; the article points to the time a blogger found an out-of-print textbook about flies priced at more than $23 million, the result of two sellers’ dueling algorithms.
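How can two pricing programs talk each other up into the millions? Here is a minimal sketch of the dueling-algorithm effect. The starting prices, the two multipliers, and the function name are my own assumptions for illustration; the BBC article does not give the sellers' actual rules. The idea is simply that one seller prices a hair under the rival while the rival prices well above, so each round multiplies both prices by more than one. The optional ceiling stands in for the human oversight the write-up calls for.

```python
def simulate_repricing(price_a, price_b, factor_a, factor_b, rounds, ceiling=None):
    """Alternate two repricing rules and return the price history.

    Each round, seller A sets its price to factor_a times seller B's price,
    then seller B sets its price to factor_b times seller A's new price.
    If a ceiling is given, neither seller may list above it.
    """
    history = [(price_a, price_b)]
    for _ in range(rounds):
        price_a = round(factor_a * price_b, 2)
        price_b = round(factor_b * price_a, 2)
        if ceiling is not None:
            # A human-set sanity limit applied after the automated rules.
            price_a = min(price_a, ceiling)
            price_b = min(price_b, ceiling)
        history.append((price_a, price_b))
    return history


if __name__ == "__main__":
    # Hypothetical starting prices and multipliers, chosen only for illustration.
    unchecked = simulate_repricing(35.00, 40.00, factor_a=0.998, factor_b=1.27, rounds=60)
    guarded = simulate_repricing(35.00, 40.00, factor_a=0.998, factor_b=1.27, rounds=60,
                                 ceiling=100.00)
    for step in (0, 10, 30, 60):
        a, b = unchecked[step]
        ga, gb = guarded[step]
        print(f"round {step:2d}: unchecked A ${a:,.2f} B ${b:,.2f} | "
              f"guarded A ${ga:,.2f} B ${gb:,.2f}")
```

Because the product of the two multipliers is greater than one, the unchecked prices grow geometrically; with a product below one, the same loop would race toward the bottom instead.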

Such observations clearly mean that consumers should be very wary about online prices. The bigger takeaway, though, is that we’re far from ready to hand algorithms the reins of our world without sufficient human oversight. Not yet.

Cynthia Murrell, September 8, 2015


Elasticsearch is the Jack of All Trades at Goldman Sachs

August 25, 2015

The article titled Goldman Sachs Puts Elasticsearch to Work on Information Week discusses how programmers at Goldman Sachs are using Elasticsearch. Programmers there are working on applications to exploit both its data retrieval capabilities and its facility with unstructured data. The article explains,

“Elasticsearch and its co-products — Logstash, Elastic’s server log data retrieval system, and Kibana, a dashboard reporting system — are written in Java and behave as core Java systems. This gives them an edge with enterprise developers who quickly recognize how to integrate them into applications. Logstash has plug-ins that draw data from the log files of 165 different information systems. It works natively with Elasticsearch and Kibana to feed them data for downstream analytics, said Elastic’s Jeff Yoshimura, global marketing leader.”

The article provides detailed examples of how Elastic is being used in legal, finance, and engineering departments within Goldman Sachs. For example, rather than hiring a “platoon of lawyers” to comb through Goldman’s legal contracts, a single software engineer was able to build a system that digitized everything and flagged contract documents that needed revision. With over 9,000 employees, Goldman currently has several thousand using Elasticsearch. The role of search has expanded, and it is important that companies recognize the many functions it can provide.

Chelsea Kerwin, August 25, 2015



Researchers Glean Audio from Video

July 10, 2015

Now, this is fascinating. Scary, but fascinating. MIT News explains how a team of researchers from MIT, Microsoft, and Adobe are “Extracting Audio from Visual Information.” The article includes a video in which one can clearly hear the poem “Mary Had a Little Lamb” as extrapolated from video of a potato chip bag’s vibrations filmed through soundproof glass, among other amazing feats. I highly recommend you take four-and-a-half minutes to watch the video.

Writer Larry Hardesty lists some other surfaces from which the team was able to reproduce audio by filming vibrations: aluminum foil, water, and plant leaves. The researchers plan to present a paper on their results at this year’s Siggraph computer graphics conference. See the article for some details on the research, including camera specs and algorithm development.

So, will this tech have any non-spying related applications? Hardesty cites MIT grad student, and first author on the team’s paper, Abe Davis as he writes:

“The researchers’ technique has obvious applications in law enforcement and forensics, but Davis is more enthusiastic about the possibility of what he describes as a ‘new kind of imaging.’

“‘We’re recovering sounds from objects,’ he says. ‘That gives us a lot of information about the sound that’s going on around the object, but it also gives us a lot of information about the object itself, because different objects are going to respond to sound in different ways.’ In ongoing work, the researchers have begun trying to determine material and structural properties of objects from their visible response to short bursts of sound.”

That’s one idea. Researchers are confident other uses will emerge, ones no one has thought of yet. This is a technology to keep tabs on, and not just to decide when to start holding all private conversations in windowless rooms.

Cynthia Murrell, July 10, 2015


Digital Reasoning a Self-Described Cognitive Computing Company

June 26, 2015

The article titled Spy Tools Come to the Cloud on Enterprise Tech shows how Amazon’s work with analytics companies on behalf of the government has produced platforms like “GovCloud,” with increased security. The presumed reason for such platforms is the gathering of intelligence and threat analysis on the big data scale. The article explains,

“The Digital Reasoning cognitive computing tool is designed to generate “knowledge graphs of connected objects” gleaned from structured and unstructured data. These “nodes” (profiles of persons or things of interest) and “edges” (the relationships between them) are graphed, “and then being able to take this and put it into time and space,” explained Bill DiPietro, vice president of product management at Digital Reasoning. The partners noted that the elastic computing capability… is allowing customers to bring together much larger datasets.”

For former CIA staff officer DiPietro it logically follows that bigger questions can be answered by the data with tools like the AWS GovCloud and subsequent Hadoop ecosystems. He cites the ability to quickly spotlight and identify someone on a watch list out of the haystack of people as the challenge set to overcome. They call it “cluster on demand,” the process that allows them to manage and bring together data.
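For readers who want to see what “nodes” and “edges” placed in time and space might look like, here is a minimal sketch. It is not Digital Reasoning’s software or API; the class names, fields, and sample entities are all hypothetical, and the structure is just one plain way to hold profiles and dated, located relationships.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Edge:
    """A relationship between two nodes, pinned to a date and a place."""
    source: str
    target: str
    relation: str
    when: date
    where: str


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)  # name -> attribute dict (the "profile")
    edges: list = field(default_factory=list)

    def add_node(self, name, **attributes):
        self.nodes.setdefault(name, {}).update(attributes)

    def add_edge(self, source, target, relation, when, where):
        # Make sure both endpoints exist as nodes before linking them.
        self.add_node(source)
        self.add_node(target)
        self.edges.append(Edge(source, target, relation, when, where))

    def neighbors(self, name, start=None, end=None):
        """Return names linked to `name`, optionally limited to a date window."""
        found = set()
        for edge in self.edges:
            if start is not None and edge.when < start:
                continue
            if end is not None and edge.when > end:
                continue
            if edge.source == name:
                found.add(edge.target)
            elif edge.target == name:
                found.add(edge.source)
        return found


if __name__ == "__main__":
    # Made-up entities, purely for illustration.
    graph = KnowledgeGraph()
    graph.add_node("Person A", kind="person")
    graph.add_edge("Person A", "Company X", "employed_by", date(2014, 3, 1), "City 1")
    graph.add_edge("Person B", "Person A", "met_with", date(2015, 6, 1), "City 2")
    print(sorted(graph.neighbors("Person A")))
    print(sorted(graph.neighbors("Person A", start=date(2015, 1, 1))))
```

The date window on `neighbors` is the small-scale version of putting the graph “into time”; a real system would also index the place field and spread the data across a cluster.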

Chelsea Kerwin, June 26, 2015


Twitter Gets a Search Facelift

June 25, 2015

Twitter has been experimenting with improving its search results and according to TechCrunch the upgrade comes via a new search results interface: “Twitter’s New Search Results Interface Expands To All Users.”  The new search results interface is one of the largest updates Twitter has made in 2015.  It is supposed to increase ease of use with a cleaner look and better filtering options.  Users will now be able to filter search results by live tweets, photos, videos, news, accounts, and more.

Twitter made the update to help people better understand how to use the message service and to take a more active approach to using it, rather than passively reading other people’s tweets.  The update is specifically targeted at new Twitter users.

The tweaked search interface will return tweets related to the search phrase or keyword, but that does not mean that the most popular tweets are returned:

“In some cases, the top search result isn’t necessarily the one with the higher metrics associated with it – but one that better matches what Twitter believes to be the searcher’s “intent.” For example, a search for “Steve Jobs” first displays a heavily-retweeted article about the movie’s trailer, but a search for “Mad Men” instead first displays a more relevant tweet ahead of the heavily-favorited “Mad Men” mention by singer Lorde.”

The new interface proves to be simpler and better lists trends, related users, and news.  It does take a little while to finesse Twitter, which is a daunting task for new users.  Twitter is not the most popular social network these days and it’s using these updates to increase its appeal.

Whitney Grace, June 25, 2015

HP Sales Are Slow, But CEO Says Progress

June 24, 2015

According to Computer Weekly, “HP CEO Hails Business Split Progress Amid Downbeat Q2 Revenue Slumps.”  HP’s Enterprise Services unit posted the worst revenue results for the quarter, with a sixteen percent loss.  Several other business units also declined, and the company reported a seven percent net loss.

Ironically, the company’s stock rose 1 percent, mostly because HP is expanding into China through a new partnership with Tsinghua University.  The joint venture will focus on developing HP’s H3C technology and its China-based server business.  Supposedly it will have huge implications for the Chinese technology market.

Another piece of news is that HP will split up:

“[CEO Meg] Whitman also spoke in favour of the progress the company is making with its plans to separate into two publicly traded business entities: one comprised of its consumer PC and printing operations, and the other focused on enterprise hardware, software and services.

The past six months have reinforced Whitman’s conviction that this is the right path for the company to take, and the split is still on course to occur before the end of the firm’s financial year.”

The company wants to increase its revenue, but it needs to cut gross costs across the board.  HP is confident that it will work.  Sales will continue to be slow for 2015, but they can still do investment banking things at HP.

Whitney Grace, June 24, 2015

New Analysis Tool for Hadoop Data from Oracle

June 23, 2015

Oracle offers new ways to analyze Hadoop data, we learn from the brief write-up, “Oracle Zeroes in on Hadoop Data with New Analytics Tool” at PCWorld. Use of the Hadoop open-source distributed file system continues to grow among businesses and other organizations, so it is no surprise to see enterprise software giant Oracle developing such tools. This new software is dubbed Oracle Big Data Spatial and Graph. Writer Katherine Noyes reports:

“Users of Oracle’s database have long had access to spatial and graph analytics tools, which are used to uncover relationships and analyze data sets involving location. Aiming to tackle more diverse data sets and minimize the need for data movement, Oracle created the product to be able to process data natively on Hadoop and in parallel using MapReduce or in-memory structures.

“There are two main components. One is a distributed property graph with more than 35 high-performance, parallel, in-memory analytic functions. The other is a collection of spatial-analysis functions and services to evaluate data based on how near or far something is, whether it falls within a boundary or region, or to process and visualize geospatial data and imagery.”

The write-up notes that such analysis can reveal connections for organizations to capitalize upon, like relationships between customers or assets. The software is, of course, compatible with Oracle’s own Big Data Appliance platform, but can be deployed on other Hadoop and NoSQL systems, as well.
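To make the spatial half of that description concrete, here is a minimal sketch of the two questions the quote mentions: how near or far something is, and whether it falls within a boundary. This is not Oracle’s product or its API. The function names are my own, the distance uses the standard haversine great-circle formula, and the boundary is a simple latitude/longitude box rather than a true polygon.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_box(lat, lon, south, west, north, east):
    """True if the point lies inside a latitude/longitude box.

    Assumes the box does not cross the antimeridian.
    """
    return south <= lat <= north and west <= lon <= east


if __name__ == "__main__":
    # Hypothetical customer and store coordinates, for illustration only.
    customer = (38.0, -85.0)
    store = (38.5, -85.5)
    print(f"distance: {haversine_km(*customer, *store):.1f} km")
    print("inside region:", within_box(*customer, south=36.0, west=-90.0,
                                       north=40.0, east=-80.0))
```

Running checks like these over millions of records in parallel is the part a Hadoop-based tool is meant to handle; the arithmetic per record stays this simple.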

Cynthia Murrell, June 23, 2015


Basho Enters Ring With New Data Platform

June 18, 2015

When it comes to enterprise technology these days, it is all about making software compliant for a variety of platforms and needs. Compliancy is the name of the game for Basho, says Diginomica’s article, “Basho Aims For Enterprise Operational Simplicity With New Data Platform.” Basho’s upgrade to its Riak Data Platform is meant to integrate more closely with related tools and to make complex operational environments simpler. Data management and automation tools are another big seller for NoSQL enterprise databases, and Basho also added these to the Riak upgrade. Basho is not the only company trying to improve NoSQL enterprise platforms; others include MongoDB and DataStax. Basho’s advantage is delivering a solution using the Riak data platform.

Basho’s data platform already offers a variety of functions that people try to get to work with a NoSQL database and they are nearly automated: Riak Search with Apache Solr, orchestration services, Apache Spark Connector, integrated caching with Redis, and simplified development using data replication and synchronization.

“CEO Adam Wray released some canned comment along with the announcement, which indicates that this is a big leap for Basho, but also is just the start of further broadening of the platform. He said:

‘This is a true turning point for the database industry, consolidating a variety of critical but previously disparate services to greatly simplify the operational requirements for IT teams working to scale applications with active workloads. The impact it will have on our users, and on the use of integrated data services more broadly, will be significant. We look forward to working closely with our community and the broader industry to further develop the Basho Data Platform.’”

The article explains that the NoSQL market continues to grow and that enterprises need management as well as automation tools to handle the growing number of tasks databases are used for. While a complete solution for all NoSQL needs has yet to be developed, Basho comes fairly close.

Whitney Grace, June 18, 2015

