August 23, 2016
After several tests, the fourth HonkinNews video is available on YouTube. You can view the six-minute video at https://youtu.be/AIYdu54p2Mg. The HonkinNews highlights a half dozen stories from the previous week’s Beyond Search stream. The commentary adds a tiny twist to most of the stories. We know that search and content processing are not the core interests of the millennials. We don’t expect to attract much of a following from teens or from “real” search experts. Nevertheless, we will continue with the weekly news program because Google has an appetite for videos. We will continue with the backwoods theme and the 16 mm black and white film. We think it adds a high tech look to the endless recycling of search and content jargon which fuels information access today.
Kenny Toth, August 23, 2016
August 22, 2016
My recollection is that the search plumbing for Rocket Software Enterprise Search is AeroText. If you are not familiar with AeroText, the system was for a number of years a property of Lockheed Martin. But times change. Rocket Software purchased AeroText in 2008. The news release about the deal stated:
The AeroText product suite provides a fast, agile information extraction system for developing knowledge-based content analysis applications. The technology excels at developing a core understanding of content contained within unstructured text, such as emails and documents, as well as an ability to automatically reconcile information cited across multiple documents. Such a capability makes it suited for a variety of applications, from counter-terrorism and law enforcement to business intelligence and enterprise content management. AeroText was originally developed by Lockheed Martin and is often integrated into other solutions. AeroText solutions provide both information extraction and link analysis capabilities by converting unstructured information into structured information.
Is this information important? Well, to those who want to use open source search solutions, nah. To companies wanting a proprietary search system with a defense pedigree, yes.
If you want Rocket Software’s description of one of its uses of the AeroText technology, you can download the white paper “How Enterprise Search Enhances Enterprise Intelligence” at this link. You will have to register and be careful not to hit the “return” key. Don’t care? Well, prepare to complete the information a second time.
AeroText used to require human tweaked rules, a human built classification scheme, and content in XML. Each of these attributes is characteristic of a traditional approach to information retrieval.
Stephen E Arnold, August 22, 2016
August 16, 2016
In an exclusive interview, Yippy’s head of enterprise search reveals that Yippy launched an enterprise search technology that Google Search Appliance users are converting to now that Google is sunsetting its GSA products.
Yippy also has its sights on the rest of the high-growth market for cloud-based enterprise search. Not familiar with Yippy, its IBM tie up, and its implementation of the Velocity search and clustering technology? Yippy’s Michael Cizmar gives some insight into this company’s search-and-retrieval vision.
Yippy (OTC PINK:YIPI) is a publicly traded company providing search, content processing, and engineering services. The company’s catchphrase is, “Welcome to your data.”
The core technology is the Velocity system, developed by Carnegie Mellon computer scientists. Yippy had already obtained rights to the Velocity technology before IBM purchased Vivisimo. I learned from my interview with Mr. Cizmar that IBM is one of the largest shareholders in Yippy. Other facets of the deal included some IBM Watson technology.
This year (2016) Yippy purchased one of the most recognized firms supporting the now-discontinued Google Search Appliance. Yippy has been tallying important accounts and expanding its service array.
Michael Cizmar, Yippy’s senior manager for enterprise search
Beyond Search interviewed Michael Cizmar, the head of Yippy’s enterprise search division. Cizmar founded MC+A and built a thriving business around the Google Search Appliance. Google stepped away from on premises hardware, and Yippy seized the opportunity to bolster its expanding business.
I spoke with Cizmar on August 15, 2016. The interview revealed a number of little known facts about a company which is gaining success in the enterprise information market.
Cizmar told me that when the Google Search Appliance was discontinued, he realized that the Yippy technology could fill the void and offer more effective enterprise findability. He said, “When Yippy and I began to talk about Google’s abandoning the GSA, I realized that by teaming up with Yippy, we could fill the void left by Google, and in fact, we could surpass Google’s capabilities.”
Cizmar described the advantages of the Yippy approach to enterprise search this way:
We have an enterprise-proven search core. The Vivisimo engineers leapfrogged the technology dating from the 1990s which forms much of Autonomy IDOL, Endeca, and even Google’s search. We have the connector libraries that we acquired from Muse Global. We have used the security experience gained via the Google Search Appliance deployments and integration projects to give Yippy what we call “field level security.” Users see only the part of content they are authorized to view. Also, we have methodologies and processes to allow quick, hassle-free deployments in commercial enterprises to permit public access, private access, and hybrid or mixed system access situations.
With the buzz about open source, I wanted to know where Yippy fit into the world of Lucene, Solr, and the other enterprise software solutions. Cizmar said:
I think the customers are looking for vendors who can meet their needs, particularly with security and smooth deployment. In a couple of years, most search vendors will be using an approach similar to ours. Right now, however, I think we have an advantage because we can perform the work directly….Open source search systems do not have Yippy-like content intake or content ingestion frameworks. Importing text or an Oracle table is easy. Acquiring large volumes of diverse content continues to be an issue for many search and content processing systems…. Most competitors are beginning to offer cloud solutions. We have cloud options for our services. A customer picks an approach, and we have the mechanism in place to deploy in a matter of a day or two.
Connecting to different types of content is a priority at Yippy. Even though the company has a wide array of import filters and content processing components, Cizmar revealed that Yippy has “enhanced the company’s connector framework.”
I remarked that most search vendors do not have a framework, relying instead on expensive components licensed from vendors such as Oracle and Salesforce. He smiled and said, “Yes, a framework, not a widget.”
Cizmar emphasized that the Yippy, IBM, and Google connections were important to many of the company’s customers. He added that Yippy has also acquired the Muse Global connectors and the ability to build connectors on the fly. He observed:
Nobody else has Watson Explorer powering the search, and nobody else has the Google Innovation Partner of the Year deploying the search. Everybody tries to do it. We are actually doing it.
Cizmar made an interesting side observation. He suggested that Internet search needed to be better. Is indexing the entire Internet in Yippy’s future? Cizmar smiled. He told me:
Yippy has a clear blueprint for becoming a leader in cloud computing technology.
For the full text of the interview with Yippy’s head of enterprise search, Michael Cizmar, navigate to the complete Search Wizards Speak interview. Information about Yippy is available at http://yippyinc.com/.
Stephen E Arnold, August 16, 2016
July 25, 2016
Sinequa, a French search vendor, is hunting for partners in the US. The news appears in “Sinequa Partner Advantage Program Empowers the Channel to Capitalize on Leading Cognitive Search & Analytics Technology.” If you liked the title of this article, you will love the subtitle:
Company Launches New Partner Program to Drive Cross-Industry Adoption of Cognitive Search & Analytics and Address Growing Customer Demands
Keywords galore. What I noted was the euphony of “leading cognitive search and analytics technology.” A number of outfits are chasing the “cognitive search” pot of gold. Competitors include the champion in declining quarterly revenue IBM. Then there are the assorted machine learning folks at the Alphabet Google thing. Plus there are various and sundry deep learning initiatives appearing on a daily basis from the money crucible in Sillycon Valley; for example, Indico, MetaMind, Ripjar, Synapsify, and, my favorite, Idibon. I just love “idibon.” So many associations from ichibon to bon bon. Good, right?
Partners flock like Zika bearing mosquitoes when there is big money in a reseller/OEM/integrator tie up.
I learned from the Sinequa write up about Sinequa:
Sinequa continues to grow its partnerships with leading global systems integrators and value-add resellers (VARs) as well vendors of enterprise application, cloud and Big Data. In an effort to address rising customer demands from Global Fortune 2000 organizations for turning data into actionable insights, Sinequa extends its worldwide network with partners seeking to enrich their Big Data/analytics offerings in key strategic markets such as banking, defense and security, life sciences, manufacturing, utilities and government. The Sinequa Partner Advantage Program enables channel and service partners to quickly capitalize on the high growth opportunity in cognitive search and analytics. Designed to empower partners with certification programs, technical support and world-class training, Sinequa also offers partners performance-based incentives and marketing support programs…Certified partners access the recently introduced Sinequa ES Version 10. Powered by Machine Learning capabilities at its core, this ground breaking version helps deliver deep analytics of contents and user behavior, offering information with continually improving relevance to users in their work environments.
A point I think is important: Sinequa was founded in 2002. That makes the company 14 years young. Not quite a start up but agile enough when it comes to cognitive technology.
I assume that in today’s economic environment, potential partners will be swarming like the Zika bearing mosquitoes in the river marsh near my home in Harrod’s Creek, Kentucky. These critters seem to fancy my chubby, 72 year old body.
I have noted, however, that some vendors of search are having to work extra hard to close deals. Examples range from Big Blue in Union Square to SLI Systems in New Zealand and parts in between.
The idea of partnering is a good one. Endeca rose to its legitimate $100 million plus in search revenue with its carefully crafted partnering program. On the other hand, the Google Search Appliance partners continue to regroup because the wiser minds at Mother Google killed off the pricey Google Search Appliance. I treasure my print out of the GSA schedule with the five and six digit license fees for the wonderful GB 7007 and 9009 models. Imagine a locked down appliance for the price of a pre acquisition Autonomy IDOL license. Then when the document capacity of the search appliance was reached, a customer could license more Google Search Appliances. I found this business model interesting because taxi meter pricing is often an issue for chief financial officers who want to budget for certain products and services.
The upside of partnerships is that, as Endeca learned, unusual opportunities can be discovered. Once the deal is closed, the lucky partner has an opportunity to tailor the search system to meet the needs of the customer. Once up and running, life is good. Renewals, customization, consulting, maintenance fees, and other oddments make a search vendor’s life one of comfort and joy. The downsides include lawsuits, squabbles, and disruptions from competitors.
Worth watching how Sinequa maneuvers in the US market. Other French search vendors have found the costs and cultural issues a bit of a headache. Examples include Antidot, Pertimm, and Exalead, among others. Do you use Qwant?
Stephen E Arnold, July 25, 2016
July 12, 2016
I participated in a telephone call before the US holiday break. The subject was the likelihood that a potential investment in an enterprise search technology would be a winner. I listened for most of the 60 minute call. I offered a brief example of the over promise and under deliver problems which plagued Convera and Fast Search & Transfer. Several of the people on the call asked, “What’s a Convera?” I knew that today’s whiz kids are essentially reinventing the wheel.
I wanted to capture three ideas which I jotted down during that call. My thought is that at some future time, a person wanting to understand the incredible failures that enterprise search vendors have tallied will have three observations to consider.
Enterprise Search: Does a Couple of Things Well When Users Expect Much More
Enterprise search systems ship with filters or widgets which convert source text into a format that the content processing module can index. The problem is that images, videos, audio files, content from wonky legacy systems, or proprietary file formats like IBM i2’s ANB files do not lend themselves to indexing by a standard enterprise search system. The buyers or licensees of the enterprise search system do not understand this one trick pony nature of text retrieval. Therefore, when the system is deployed, consternation follows confusion when content is not “in” the enterprise search system and, therefore, cannot be found. There are systems which can deal with a wide range of content, but these systems are marketed in a different way and often cost millions of dollars a year to set up, maintain, and operate.
Net net: Vendors do not explain the limitations of text search. Licensees do not take the time or have the desire to understand what an enterprise search system can actually do. Marketers obfuscate in order to close the deal. Failure is a natural consequence.
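A minimal sketch of the filter idea, assuming a hypothetical registry keyed by file extension. The names FILTERS, plain_text_filter, and triage are mine for illustration and do not come from any vendor’s product. The point is only that a file with no matching filter never gets “in” the index:

```python
from pathlib import PurePath


def plain_text_filter(raw: bytes) -> str:
    """Hypothetical filter: turn raw bytes into text the indexer can use."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical registry: file extension -> conversion filter.
# A real system ships many more filters; this one only knows plain text.
FILTERS = {
    ".txt": plain_text_filter,
    ".csv": plain_text_filter,
}


def triage(filenames):
    """Split filenames into those a filter can handle and those it cannot."""
    indexable, skipped = [], []
    for name in filenames:
        ext = PurePath(name).suffix.lower()
        if ext in FILTERS:
            indexable.append(name)
        else:
            skipped.append(name)
    return indexable, skipped


indexable, skipped = triage(["memo.txt", "chart.anb", "briefing.mp4", "sales.CSV"])
print("indexable:", indexable)
print("not in the index:", skipped)
```

With this toy registry, the ANB chart and the video land in the skipped list. That is the moment when the user concludes that search is broken.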
Data Management Needed
The disconnect boils down to what digital information the licensee wants to search. Once the universe is defined, the system into which the data will be placed must be resolved. No data management, no enterprise search. The reason is that licensees and the users of an enterprise search system assume that “all” or “everything,” from maps to web content, from email to outputs from an AS/400 Ironside, is available any time. Baloney. Few organizations have the expertise or the appetite to deal with figuring out what is where, how much, how frequently each type of data changes, and the formats used. I can hear you saying, “Hey, we know what we have and what we need. We don’t need a stupid, time consuming, expensive inventory.” There you go. Failure is a distinct possibility.
Net net: Hope springs eternal. When problems arise, few know what’s where, who’s on first, and why I don’t know is on third.
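For the curious, a first pass at the inventory can be sketched in a few lines. This is an illustration under stated assumptions, not a data management method: it covers only files on a file system, the function names inventory and report are my own, and it uses the newest modification time as a rough stand-in for how frequently each type of data changes:

```python
import os
from collections import defaultdict
from datetime import datetime, timezone


def inventory(root):
    """Tally file count, total bytes, and newest modification time per extension."""
    stats = defaultdict(lambda: {"files": 0, "bytes": 0, "newest": 0.0})
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                info = os.stat(path)
            except OSError:
                continue  # unreadable file: skip it rather than guess
            entry = stats[ext]
            entry["files"] += 1
            entry["bytes"] += info.st_size
            entry["newest"] = max(entry["newest"], info.st_mtime)
    return dict(stats)


def report(root):
    """Print one line per file format found under root."""
    for ext, entry in sorted(inventory(root).items()):
        newest = datetime.fromtimestamp(entry["newest"], tz=timezone.utc)
        print(f"{ext:10} {entry['files']:6} files {entry['bytes']:12} bytes  newest {newest:%Y-%m-%d}")
```

Calling report on a shared drive answers what is where, how much, and which formats for that one drive. Email stores, databases, and legacy systems each need their own version of the exercise.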
July 8, 2016
Another day, another merger. PR Newswire released a story, VirtualWorks and Language Tools Announce Merger, which covers VirtualWorks’ purchase of Language Tools. With Language Tools, VirtualWorks will inherit computational linguistics and natural language processing technologies. VirtualWorks is an enterprise search firm. Erik Baklid, Chief Executive Officer of VirtualWorks, is quoted in the article:
“We are incredibly excited about what this combined merger means to the future of our business. The potential to analyze and make sense of the vast unstructured data that exists for enterprises, both internally and externally, cannot be understated. Our underlying technology offers a sophisticated solution to extract meaning from text in a systematic way without the shortcomings of machine learning. We are well positioned to bring to market applications that provide insight, never before possible, into the vast majority of data that is out there.”
This is another case of a company positioning itself as a leader in enterprise search. Is it anything special? Well, the news release mentions that several core technologies will be bolstered by the merger: text analytics, data management, and discovery techniques. We will have to wait and see what the future holds for the company in the enterprise search and business intelligence sector it seeks to lead.
July 4, 2016
Enterprise search is one of the driving forces behind an enterprise system because the entire purpose of the system is to encourage collaboration and quickly find information. While enterprise search is an essential tool, according to Computer Weekly’s article “Beyond Keywords: Bringing Initiative To Enterprise Search,” the feature is stuck in the past.
Enterprise search is due for an upgrade. The amount of enterprise data has increased, but the underlying information management system remains the same. Structured data is easy to fit into the standard information management system. However, it is the unstructured data that holds the most valuable information. Unstructured information is hard to categorize, but natural language processing is being used to add context. Ontotext combined natural language processing with a graph database, allowing the content indexing to make more nuanced decisions.
We need to level up the basic keyword searching to something more in-depth:
“Search for most organisations is limited: enterprises are forced to play ‘keyword bingo’, rephrasing their question multiple times until they land on what gets them to their answer. The technologies we’ve been exploring can alleviate this problem by not stopping at capturing the keywords, but by capturing the meaning behind the keywords, labeling the keywords into different categories, entities or types, and linking them together and inferring new relationships.”
In other words, enterprise search needs the addition of semantic search in order to add context to the keywords. A basic keyword search returns every result that matches the keyword phrase, but a context-driven search actually adds intuition behind the keyword phrases. This is really not anything new when it comes to enterprise or any kind of search. Semantic search is context-driven search.
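The difference can be shown with a toy sketch. Assume a small, hand-built entity lookup (hypothetical, and nothing like the NLP and graph database combination the article describes). A literal keyword match misses a document that uses a different name for the same thing, while a match on labeled entities finds it:

```python
# Hypothetical, hand-built lookup: surface form -> (canonical entity, type).
# A production system would derive these labels with NLP, not a dictionary.
ENTITIES = {
    "ibm": ("IBM", "company"),
    "big blue": ("IBM", "company"),
    "gsa": ("Google Search Appliance", "product"),
    "google search appliance": ("Google Search Appliance", "product"),
}

DOCS = {
    1: "Big Blue reported declining quarterly revenue.",
    2: "The Google Search Appliance was discontinued.",
    3: "IBM is a shareholder in the search vendor.",
}


def keyword_search(query, docs):
    """Return ids of documents containing the literal query string."""
    q = query.lower()
    return sorted(i for i, text in docs.items() if q in text.lower())


def entities_in(text):
    """Return canonical entities whose surface forms appear in the text.

    Uses naive substring matching, which is good enough for this toy data only.
    """
    lowered = text.lower()
    return {entity for form, (entity, _kind) in ENTITIES.items() if form in lowered}


def entity_search(query, docs):
    """Return ids of documents sharing at least one entity with the query."""
    wanted = entities_in(query)
    return sorted(i for i, text in docs.items() if wanted & entities_in(text))


print(keyword_search("IBM", DOCS))  # literal match only
print(entity_search("IBM", DOCS))   # also finds the "Big Blue" document
```

The user who searches for “IBM” no longer has to play keyword bingo with “Big Blue.” The cost is that someone, or something, has to build and maintain the entity labels.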
July 1, 2016
Voyager Search is a vendor of search and retrieval based on Lucene. I was not familiar with the company until I read “Voyager Search Improves Search Capabilities and Overall Usability With More Than 150 Updates to Its Version 1.9.8.” According to the write up:
In the new version, Voyager makes it easier to configure content in Navigo, its modern web app, extends its spatial content search, and improves the usability of its Navigo processing tools. Managing content in Navigo can now be done through the new personalized ‘My Voyager’ customization page, which allows customers to share saved searches and update display configurations through a drag and drop interface.
One point in the write up I noted was this statement: “An improved spatial search interface now includes the ability to draw and buffer points, lines and polygons.” The idea is that geo-spatial operations appear to be supported by the system.
I also highlighted this comment:
Voyager Search is a leading global provider of geospatial, enterprise search tools that connect, find and deliver more than 1,800 different file formats.
In my experience, support for more than 1,000 file formats suggests a large number of conversion widgets.
Stephen E Arnold, July 1, 2016
June 24, 2016
I spoke with a confused and unbudgeted worker bee at a giant outfit this weekend. The stellar professional was involved in figuring out what to do about enterprise search. The story is one I have heard many times in the last 40 years. The system doesn’t meet the needs of the users. The system is over budget. The system does not index in real time. Yadda yadda yadda.
The big question was, “Which enterprise search vendors offer a system which actually works and does not experience downtime, cost overruns, and user outrage?” Note that this is not the word “outage.” The word is “outrage.”
I don’t know of such a system. As a helpful 72 year old, I rattled off a list of vendors who purport to offer Big Data capable, next generation semantic-linguistic-NLP systems. True to form, I repeated the list twice. I thought he would cry.
For those of you who want to know the vendors I plucked from my list of outfits in the search and content processing game, I reproduce the list. If you want upsides, downsides, license fees, gotchas, and other assorted details, I will provide the information. But since you are not likely to buy me dinner this evening, you will have to pay for my thoughts.
Here’s the selected list. Reader, start your browser:
- Elasticsearch (Lucene)
- Fabasoft Mindbreeze
- IBM Omnifind
- IHS Goldfire
- Lucid Works (Solr)
- Squiz Funnelback
There are quite a few outfits, like Palantir, whose systems do search, but I trimmed the list to a few companies for my worried pal.
What’s interesting is that most of these outfits explain that their systems are much, much more than search and retrieval. Believe it or not as Mr. Ripley used to say.
Factoid: Most of these outfits have been around for quite a few years. Only Elasticsearch has managed to become a “brand” in the search space. What happened to Autonomy, Convera, Endeca, Fast Search & Transfer, and Verity since I wrote the first three editions of the Enterprise Search Report between 2003 and 2007? Ugly for some.
Search is a tough problem and has yet to deliver what users expect. Remember Google killed its search appliance. Ads are a better business because they spell money for Alphabet.
Stephen E Arnold, June 24, 2016
June 22, 2016
While Dark Web users understand the perks of anonymity, especially for those involved with illicit activity, consistency in maintaining that anonymity appears to be challenging. Geek.com published an article that showcases how one drug dealer revealed his identity while trying to promote his brand: Drug dealer busted after trying to trademark his dark web username. David Ryan Burchard of Merced, California reportedly made $1.25 million by selling marijuana and cocaine on the Dark Web before he trademarked the username he used to sell drugs, “caliconnect”. The article summarizes,
“He started out on Silk Road and moved on to other shady marketplaces in the wake of its highly-publicized shutdown. Burchard wound up on Homeland Security’s list of top sellers, though they were having trouble establishing a rock-solid connection between him and his online persona. They knew that Burchard was accumulating a large Bitcoin stash and that there didn’t appear to be a legitimate source. Then, finally, investigators got the break they were looking for. It seems that Burchard decided that his personal brand was worth protecting, and he filed paperwork to trademark “caliconnect.””
Whether this points to the proclivity of human nature to self-promote or the egoism of one person in a specific situation, it seems that all covering the story are drawing attention to this foiling move as a preventable mistake on Burchard’s part. Look no farther than the title of a recent Motherboard article: Pro-Tip: If You’re a Suspected Dark Web Drug Dealer, Don’t Trademark Your #Brand. The nature of promotions and marketing on the Dark Web will be an interesting area to see unfold.
Megan Feil, June 22, 2016