CyberOSINT banner

CSI Search Informatics Are Actually Real

October 29, 2015

CSI might stand for a popular TV franchise, but it also stands for “compound structured identification” explains in “Bioinformaticians Make The Most Efficient Search Engine For Molecular Structures Available Online.” Sebastian Böcker and his team at the Friedrich Schiller University are researching metabolites, chemical compounds that determine an organism’s metabolism.  Metabolites are used to gauge information about the condition of living cells.

While this is amazing science there are some drawbacks:

“This process is highly complex and seldom leads to conclusive results. However, the work of scientists all over the world who are engaged in this kind of fundamental research has now been made much easier: The bioinformatics team led by Prof. Böcker in Jena, together with their collaborators from the Aalto-University in Espoo, Finland, have developed a search engine that significantly simplifies the identification of molecular structures of metabolites.”

The new search works like a regular search engine, but instead of using keywords it searches through molecular structure databases containing information and structural formulae of metabolites.  The new search will reduce time in identifying the compound structures, saving on costs and time.  The hope is that the new search will further research into metabolites and help researchers spend more time working on possible breakthroughs.

Whitney Grace, October 29, 2015

Sponsored by, publisher of the CyberOSINT monograph

Apple May Open up on Open Source

October 27, 2015

Is Apple ready to openly embrace open source? MacRumors reports, “Apple Building Unified Cloud Platform for iCloud, iTunes, Siri and More.” Writer Joe Rossignol cites a new report from the Information that indicates the famously secret company may be opening up to keep up with the cloudy times. He writes:

“The new platform is based on Siri, which itself is powered by open source infrastructure software called Mesos on the backend, according to the report. Apple is reportedly placing more emphasis on open source software in an attempt to attract open source engineers that can help improve its web services, but it remains to be seen how far the company shifts away from its deep culture of secrecy.

“The paywalled report explains how Apple is slowly embracing the open source community and becoming more transparent about its open source projects. It also lists some of the open source technologies that Apple uses, including Hadoop, HBase, Elasticsearch, Reak, Kafka, Azkaban and Voldemort.”

Rossignol goes on to note that, according to Bloomberg, Apple is working on a high-speed content delivery network and upgrading data centers to better compete with its rivals in the cloud, like Amazon, Google, and Microsoft. Will adjusting its stance on open-source allow it to keep up?

Cynthia Murrell, October 27, 2015

Sponsored by, publisher of the CyberOSINT monograph

Xendo, Can Do

October 23, 2015

While it would be lovely to access and find all important documents, emails, and Web sites within a couple clicks, users usually have to access several programs or individual files to locate their information.  Stark Industries wanted users to have the power of Google search engine without compromising their personal security.  Xendo is a private, personal search engine that connects with various services, including email servers, social media account, clouds, newsfeeds, and more.

Once all the desired user accounts are connected to Xendo, the search engine indexes all the files within the services.  The index is encrypted, so it securely processes them.  After the indexing is finished, Xendo will search through all the files and return search results displaying the content and service types related to inputted keywords.  Xendo promises that:

“After your initial index is built, Xendo automatically keeps it up-to-date by adding, removing and updating content as it changes. Xendo automatically updates your index to reflect role and permission changes in each of your connected services. Xendo is hosted in some of the most secure data-centers in the world and uses multiple layers of security to ensure your data is secured in transit and at rest, like it’s in a bank vault.”

Basic Xendo search is free for individual users with payments required for upgrades.  The basic search offers deep search, unlimited access, and unlimited content, while the other plans offer more search options based on subscription.  Xendo can be deployed for enterprise systems, but it requires a personalized quote.

Whitney Grace, October 23, 2015
Sponsored by, publisher of the CyberOSINT monograph

Meg Whitman, President of HP, Gets Flack for Partial Follow-Through on Ultimatum

October 14, 2015

The article titled HP Didn’t Actually Fire All the Employees It Threatened to Cut on Business Insider details the management teachings from Hewlett Packard. To summarize, HP recently delivered an ultimatum to several hundred employees that they had to shift off HP’s payroll and become contract workers for significantly lower pay with HP’s partner Ciber. If they refused, they would be let go. Except that the employees mutinied and complained, resulting in HP negotiating for higher salaries from Ciber as well as holding on to a few employees who refused the deal. The article states,

“On top of that, HP is also shipping most of the jobs in this business unit offshore. Whitman wants 60% of the Enterprise Services division jobs to be in low-cost areas of the world, compared to less than 40% today. Employees in this unit fully expect HP to line up more take-it-or-leave it contract jobs, they tell us, so we’ll see how HP handles the next one if it does materialize.”

This is all in the midst of HP’s massive layoffs of over 80,000 employees, 51,000 of whom have already been let go. Morale must be under the building. The non-negotiable ultimatum strategy did not seem to work, and at any rate is bad business, especially when coupled with it being overturned later in a handful of instances.

Chelsea Kerwin, October 14, 2015

Sponsored by, publisher of the CyberOSINT monograph

Another Categorical Affirmative: Nobody Wants to Invest in Search

October 8, 2015

Gentle readers, I read “Autonomy Poisoned the Well for Businesses Seeking VC Cash.” Keep in mind that I am capturing information which appeared in a UK publication. I find this type of essay interesting and entertaining. Will you? Beats me. One thing is certain. This topic will not be fodder for the LinkedIn discussion groups, the marketers hawking search and retrieval at conferences to several dozen fellow travelers, or in consultant reports promoting the almost unknown laborers in the information access vineyards.

Why not?

The problem with search reaches back a few years, but I will add a bit of historical commentary after I highlight what strikes me as the main point of the write up:

Nobody wants to invest in enterprise search, says startup head. Patrick White, Synata

Many enterprise search systems are a bit like the USS United States, once the slickest ocean liner in the world. The ship looks like a ship, but the effort involved in making it seaworthy is going to be project with a hefty price tag. Implementing enterprise search solutions are similar to this type of ocean-going effort.

There you go. “Nobody.” A categorical in the “category” of logic like “All men are mortal.” Remarkable because outfits like Attivio, Coveo, and Digital Reasoning, among others have received hefty injections of venture capital in recent memory.

The write up makes this interesting point:

“I think Autonomy really messed up [the space]”, and when investors hear ‘enterprise search for the cloud’ it “scares the crap out of them”, he added. “Autonomy has poisoned the well for search companies.” However, White added that Autonomy was just the most high profile example of cases that have scared off investors. “It is unfair just to blame Autonomy. Most VCs have at least one enterprise search in their portfolio. So VCs tend to be skittish about it,” he [added.

I am not sure I agree. Before there was Autonomy, there was Fulcrum Technologies. The company’s marketing literature is a fresh today as it was in the 1990s. The company was up, down, bought, and merged. The story of Fulcrum, at least up to 2009 or so is available at this link.

The hot and cold nature of search and content processing may be traced through the adventures of Convera (formerly Excalibur Technologies) and its relationships with Intel and the NBA, Delphes (a Canadian flame out), Entopia (a we can do it all), and, of course, Fast Search & Transfer.

Now Fast Search, like most old school search technology, is very much with us. For a dose of excitement one can have Search Technologies (founded by some Convera wizards) implement Fast Search (now owned by Microsoft).

Where Are the Former Big Six in Enterprise Search Vendors: 2004 and 2015

Autonomy, now owned by HP and mired in litigation over allegations of financial fraud

Convera, after struggles with Intel and NBA engagements, portions of the company were sold off. Essentially out of business. Alums are consultants.

Endeca, owned by Oracle and sold as an eCommerce and business intelligence service. Oracle gives away its own enterprise search system.

Exalead, owned by Dassault Systèmes and now marketed as a product component system. No visibility in the US.

Fast Search, owned by Microsoft and still available as a utility for SharePoint. The technology dates from the late 1990s. Brand is essentially low profiled at this time.

Verity, Autonomy purchased Verity and used its customer list for upsales and used the K2 technology as part of the sprawling IDOL suite.

Fast Search reported revenues which after an investigation and court procedure were found to be a bit enthusiastic. The founder of Fast Search was the subject of the Norwegian authorities’ attention. You can check out the news reports about the prohibition on work and the sentence handed down for the issues the authorities concluded warranted a slap on the wrist and a tap on the head.

The story of enterprise search has been efforts—sometimes Herculean—to sell information access companies. When a company sells like Vivisimo for about one year’s revenues or an estimated $20 million, there is a sense of getting that mythic task accomplished. IBM, like most of the other acquirers of search technology, try valiantly to convert a utility into something with revenue lift. As I watch the evolution of the lucky exits, my overall impression is that the purchasers realize that search is a utility function. Search can generate consulting and engineering fees, but the customers want more.

That realization leads to the wild and crazy hyper marketing for products like Hewlett Packard’s cloud version of Autonomy’s IDOL and DRE technology or IBM’s embrace of open source search and the wisdom of wrapping that core with functions.

Enterprise search, therefore, is alive and well within applications or solutions that are more directly related to something that speaks to senior managers; namely, making sales and reducing costs.

What’s the cost of making sure the controls for an enterprise search system are working and doing the job the licensee wants done?

The problem is the credit card debt load which Googlers explained quite clearly. Technology outfits, particularly information access players, need more money than it is possible for most firms to generate. This contributes to the crazy flips from search to police analysis, from looking up an entry in a data base to an assertion that customer support is enabled, hunting for an article in this blog is now real time, active business intelligence, or indexing by proper noun like White House morphs into natural language understanding of unstructured text.

Investments are flowing to firms which could be easily positioned as old school search and retrieval operations. Consider Lexmark, a former unit of IBM, and an employer of note not far from my pond filled with mine run off in Kentucky. The company, like Hewlett Packard, wants to find a way to replace its traditional business which was not working as planned as a unit of IBM. Lexmark bought Brainware, a company with patents on trigram methods and a good business for processing content related to legal matters. Lexmark is doing its best to make that into a Trump scale back office content processing business. Lexmark then bought a technology dating from the 1980s (ISYS Search Software once officed in Crow’s Nest I believe) and has made search a cornerstone of the Lexmark next generation health care money spinning machine. Oracle has a number of search properties. Most of these are unknown to Oracle DBAs; for example, Artificial Linguistics, TripleHop, InQuira’s shotgun NLP technology, etc. The point is that the “brands” have not had enough magnetism to pull revenues on a stand alone basis.

Successes measured in investment dollars is not revenue. Palantir is, in effect, a search and retrieval outfit packaged as a super stealthy smart intelligence system. Recorded Future, funded by Google and In-Q-Tel, is doing a bang up job with specialized content processing. There are, remember, search and retrieval companies.

The money in search appears to be made in these plays:

  • The Fast Search model. Short cuts until an investigator puts a stop to the activities.
  • Creating a company and then selling it to a larger firm with a firm conviction that it can turn search into a big time money machine
  • Buying a search vendor to get its customers and opportunities to sell other enterprise software to those customers
  • Creating a super technology play and going after venture funding until a convenient time arrives to cash out
  • Pursue a dream for intelligent software and survive on research grants.

This list does not exhaust what is possible. There are me-too plays. There are mobile niche plays. There are apps which are thinly disguised selective dissemination of information services.

The point is that Autonomy is a member of the search and retrieval club. The company’s revenues came from two principal sources:

  1. Autonomy bought companies like Verity and video indexing and management vendor Virage and then sold other products to these firm’s clients and incorporated some of the acquired technology into products and services which allowed Autonomy to enter a new market. Remember Autonomy and enhanced video ads?
  2. Autonomy managed well. If one takes the time to speak with former Autonomy sales professionals, the message is that life was demanding. Sales professionals including partners had to produce revenue or some face time with the delightful Dr. Michael Lynch or other senior Autonomy executives was arranged.

That’s it. Upselling and intense management for revenues. Hewlett Packard was surprised at the simplicity of the Autonomy model and apparently uncomfortable with the management policies and procedures that Autonomy had been using in highly visible activities for more than a decade as a publicly traded company.

Perhaps some sources of funding will disagree with my view of Autonomy. That is definitely okay. I am retired. My house is paid for. I have no charming children in a private school or university.

The focus should be on what the method for generating revenue is. The technology is of secondary importance. When IBM uses “good enough” open source search, there is a message there, gentle reader. Why reinvent the wheel?

The trick is to ask the right questions. If one does not ask the right questions, the person doing the querying is likely to draw incorrect conclusions and make mistakes. Where does the responsibility rest? When one makes a bad decision?

The other point of interest should be making sales. Stated in different terms, the key question for a search vendor, regardless of camouflage, what problem are you solving? Then ask, “Will people pay money for this solution?”

If the search vendor cannot or will not answer these questions and provide data to be verified, the questioner runs the risk of taking the USS United States for a cruise as soon as you have refurbed the ship, made it seaworthy, and hired a crew.

The enterprise search sector is guilty of making a utility function appear to be a solution to business uncertainty. Why? To make sales. Caveat emptor.

Stephen E Arnold, October 8, 2015

Fighting the Academic Publishers Gets You Fired

September 11, 2015

Academic publishers, such as Springer and Elsevier, have a monopoly on academic publishing and they do not want to lose their grasp.  In the Slashdot science forum, a report from The Guardian was posted “Paywalled Science Journals Under Fire Again” describing how the academic publishers won a battle in Australia.

The Medical Journal of Australia (MJA) fired their editor Professor Stephen Leeder, when he expressed his displeasure over the journal outsourcing its functions to Elsevier.  Leeder might have lost his job, but he will speak at a symposium at the State Library of NSW about ways academic communities can fight against the commoditization of knowledge.

What is concerning is that academic publishers are more interested in turning a profit than expanding humanity’s knowledge base:

“Alex Holcombe, an associate professor of psychology who will also be presenting at the symposium, said the business model of some of the major academic publishers was more profitable than owning a gold mine. Some of the 1,600 titles published by Elsevier charged institutions more than $19,000 for an annual subscription to just one journal. The Springer group, which publishes more than 2,000 titles, charges more than $21,000 for access to some of its titles. ‘The mining giant Rio Tinto has a profit margin of about 23%,’ Holcombe said. ‘Elsevier consistently comes in at around 37%. Open access publishing is catching on, but it requires researchers to pay up to $3000 to get a single open access article published.’”

Where does the pursuit of knowledge actually take place if researchers are at the mercy of academic publishers?  One might say that researchers could publish their work for free on the Web, but remember that anyone can do that.  Being published under a reputable banner adds to study’s authenticity and also helps it get used to support other research.  The problem lies in the fact that big academic publishers limit who accesses their content to subscription holders and often those subscriptions are too expensive for the average researcher to afford on their own.  Researchers want to have access to more academic content, but it is being locked down.

Whitney Grace, September 11, 2015
Sponsored by, publisher of the CyberOSINT monograph

Algorithms Still Need Oversight

September 8, 2015

Many have pondered what might happen when artificial intelligence systems go off the rails. While not spectacular enough for Hollywood, some very real consequences have been observed; the BBC examines “The Bad Things that Happen When Algorithms Run Online Shops.”

The article begins by relating the tragic tale of an online T-shirt vendor who just wanted to capitalize on the “Keep Calm and Carry On” trend. He set up an algorithm to place random terms into the second half of that oft-copied phrase and generate suggested products. Unfortunately, the list of phrases was not sufficiently vetted, resulting in a truly regrettable slogan virtually printed on virtual examples. Despite the fact that the phrase appeared only on the website, not on any actual shirts, the business never recovered its reputation and closed shortly thereafter. Reporter Chris Baranuik writes:

“But that’s the trouble with algorithms. All sorts of unexpected results can occur. Sometimes these are costly, but in other cases they have benefited businesses to the tune of millions of pounds. What’s the real impact of the machinations of machines? And what else do they do?”

Well, one other thing is to control prices. Baranuik reports that software designed to set online prices competitively, based on what other sites are doing, can cause prices to fluctuate day-to-day, sometimes hour-to-hour. Without human oversight, results can quickly become extreme to either end of the scale. For example, for a short time last December, prices of thousands of products sold through Amazon were set to just one penny each. Amazon itself probably weathered the unintended near-giveaways just fine, but smaller merchants selling through the site were not so well-positioned; some closed as a direct result of the error. On the other hand, vendors trying to keep their prices as high as feasible can make the opposite mistake; the article points to the time a blogger found an out-of-print textbook about flies priced at more than $23 million, the result of two sellers’ dueling algorithms.

Such observations clearly mean that consumers should be very wary about online prices. The bigger takeaway, though, is that we’re far from ready to hand algorithms the reigns of our world without sufficient human oversight. Not yet.

Cynthia Murrell, September 8, 2015

Sponsored by, publisher of the CyberOSINT monograph

Elasticsearch is the Jack of All Trades at Goldman Sachs

August 25, 2015

The article titled Goldman Sachs Puts Elasticsearch to Work on Information Week discusses how programmers at Goldman Sachs are using Elasticsearch. Programmers there are working on applications to exploit both the data retrieval capabilities as well as the faculty it has for unstructured data. The article explains,

“Elasticsearch and its co-products — Logstash, Elastic’s server log data retrieval system, and Kibana, a dashboard reporting system — are written in Java and behave as core Java systems. This gives them an edge with enterprise developers who quickly recognize how to integrate them into applications. Logstash has plug-ins that draw data from the log files of 165 different information systems. It works natively with Elasticsearch and Kibana to feed them data for downstream analytics, said Elastic’s Jeff Yoshimura, global marketing leader.”

The article provides detailed examples of how Elastic is being used in legal, finance, and engineering departments within Goldman Sachs. For example, rather than hiring a “platoon of lawyers” to comb through Goldman’s legal contracts, a single software engineer was able to build a system that digitized everything and flagged contract documents that needed revision. With over 9,000 employees, Goldman currently has several thousand using Elasticsearch. The role of search has expanded, and it is important that companies recognize the many functions it can provide.

Chelsea Kerwin, August 25, 2015

Sponsored by, publisher of the CyberOSINT monograph


Researchers Glean Audio from Video

July 10, 2015

Now, this is fascinating. Scary, but fascinating. MIT News explains how a team of researchers from MIT, Microsoft, and Adobe are “Extracting Audio from Visual Information.” The article includes a video in which one can clearly hear the poem “Mary Had a Little Lamb” as extrapolated from video of a potato chip bag’s vibrations filmed through soundproof glass, among other amazing feats. I highly recommend you take four-and-a-half minutes to watch the video.

 Writer Larry Hardesty lists some other surfaces from which the team was able reproduce audio by filming vibrations: aluminum foil, water, and plant leaves. The researchers plan to present a paper on their results at this year’s Siggraph computer graphics conference. See the article for some details on the research, including camera specs and algorithm development.

 So, will this tech have any non-spying related applications? Hardesty cites MIT grad student, and first writer on the team’s paper, Abe Davis as he writes:

 “The researchers’ technique has obvious applications in law enforcement and forensics, but Davis is more enthusiastic about the possibility of what he describes as a ‘new kind of imaging.’

“‘We’re recovering sounds from objects,’ he says. ‘That gives us a lot of information about the sound that’s going on around the object, but it also gives us a lot of information about the object itself, because different objects are going to respond to sound in different ways.’ In ongoing work, the researchers have begun trying to determine material and structural properties of objects from their visible response to short bursts of sound.”

 That’s one idea. Researchers are confident other uses will emerge, ones no one has thought of yet. This is a technology to keep tabs on, and not just to decide when to start holding all private conversations in windowless rooms.

 Cynthia Murrell, July 10, 2015

Sponsored by, publisher of the CyberOSINT monograph

Digital Reasoning a Self-Described Cognitive Computing Company

June 26, 2015

The article titled Spy Tools Come to the Cloud on Enterprise Tech shows how Amazon’s work with analytics companies on behalf of the government have realized platforms like “GovCloud”, with increased security. The presumed reason for such platforms being the gathering of intelligence and threat analysis on the big data scale. The article explains,

“The Digital Reasoning cognitive computing tool is designed to generate “knowledge graphs of connected objects” gleaned from structured and unstructured data. These “nodes” (profiles of persons or things of interest) and “edges” (the relationships between them) are graphed, “and then being able to take this and put it into time and space,” explained Bill DiPietro, vice president of product management at Digital Reasoning. The partners noted that the elastic computing capability… is allowing customers to bring together much larger datasets.”

For former CIA staff officer DiPietro it logically follows that bigger questions can be answered by the data with tools like the AWS GovCloud and subsequent Hadoop ecosystems. He cites the ability to quickly spotlight and identify someone on a watch list out of the haystack of people as the challenge set to overcome. They call it “cluster on demand,” the process that allows them to manage and bring together data.

Chelsea Kerwin, June 26,  2015

Sponsored by, publisher of the CyberOSINT monograph

« Previous PageNext Page »