Voyager Search: New Release Available
July 1, 2016
Voyager Search is a vendor of search and retrieval based on Lucene. I was not familiar with the company until I read “Voyager Search Improves Search Capabilities and Overall Usability With More Than 150 Updates to Its Version 1.9.8.” According to the write up:
In the new version, Voyager makes it easier to configure content in Navigo, its modern web app, extends its spatial content search, and improves the usability of its Navigo processing tools. Managing content in Navigo can now be done through the new personalized ‘My Voyager’ customization page, which allows customers to share saved searches and update display configurations through a drag and drop interface.
One point in the write up I noted was this statement: “An improved spatial search interface now includes the ability to draw and buffer points, lines and polygons.” The idea is that geo-spatial operations appear to be supported by the system.
I also highlighted this comment:
Voyager Search is a leading global provider of geospatial, enterprise search tools that connect, find and deliver more than 1,800 different file formats.
In my experience, support for more than 1,000 file formats suggests a large number of conversion widgets.
The company bills itself as the “only install and go Solr/Lucene search engine.” Information about the company is available at this link. A demo is available here.
Stephen E Arnold, July 1, 2016
Enterprise Search Vendors: A Partial List
June 24, 2016
I spoke with a confused and unbudgeted worker bee at a giant outfit this weekend. The stellar professional was involved in figuring out what to do about enterprise search. The story is one I have heard many times in the last 40 years. The system doesn’t meet the needs of the users. The system is over budget. The system does not index in real time. Yadda yadda yadda.
The big question was, “What are the enterprise search vendors offering a system which actually works and does not experience downtime, cost overruns, and user outrage?” Note that this is not the word “outage.” The word is “outrage.”
I don’t know of such a system. As a helpful 72 year old, I rattled off a list of vendors who purport to offer Big Data capable, next generation semantic-linguistic-NLP systems. True to form, I repeated the list twice. I thought he would cry.
For those of you who want to know the vendors I plucked from my list of outfits in the search and content processing game, I reproduce the list. If you want upsides, downsides, license fees, gotchas, and other assorted details, I will provide the information. But since you are not likely to buy me dinner this evening, you will have to pay for my thoughts.
Here’s the selected list. Reader, start your browser:
- Attivio
- Coveo
- dtSearch
- Elasticsearch (Lucene)
- Fabasoft Mindbreeze
- IBM Omnifind
- IHS Goldfire
- Lookeen
- Lucid Works (Solr)
- Marklogic
- Maxxcat
- Polyspot
- Sinequa
- Solcara
- Squiz Funnelback
- Thunderstone
- X1
- Yippy
There are quite a few outfits, like Palantir, whose systems do search, but I trimmed the list to companies for my worried pal.
What’s interesting is that most of these outfits explain that their systems are much, much more than search and retrieval. Believe it or not as Mr. Ripley used to say.
Factoid: Most of these outfits have been around for quite a few years. Only Elasticsearch has managed to become a “brand” in the search space. What happened to Autonomy, Convera, Endeca, Fast Search & Transfer, and Verity since I wrote the first three editions of the Enterprise Search Report between 2003 and 2007? Ugly for some.
Search is a tough problem and has yet to deliver what users expect. Remember Google killed its search appliance. Ads are a better business because they spell money for Alphabet.
Stephen E Arnold, June 24, 2016
The Paradox of Marketing and Anonymity
June 22, 2016
While Dark Web users understand the perks of anonymity, especially for those involved with illicit activity, consistency in maintaining that anonymity appears to be challenging. Geek.com published an article that showcases how one drug dealer revealed his identity while trying to promote his brand: Drug dealer busted after trying to trademark his dark web username. David Ryan Burchard of Merced, California reportedly made $1.25 million by selling marijuana and cocaine on the Dark Web before he trademarked the username he used to sell drugs, “caliconnect”. The article summarizes,
“He started out on Silk Road and moved on to other shady marketplaces in the wake of its highly-publicized shutdown. Burchard wound up on Homeland Security’s list of top sellers, though they were having trouble establishing a rock-solid connection between him and his online persona. They knew that Burchard was accumulating a large Bitcoin stash and that there didn’t appear to be a legitimate source. Then, finally, investigators got the break they were looking for. It seems that Burchard decided that his personal brand was worth protecting, and he filed paperwork to trademark “caliconnect.””
Whether this points to the proclivity of human nature to self-promote or the egoism of one person in a specific situation, it seems that all covering the story are drawing attention to this foiling move as a preventable mistake on Burchard’s part. Look no farther than the title of a recent Motherboard article: Pro-Tip: If You’re a Suspected Dark Web Drug Dealer, Don’t Trademark Your #Brand. The nature of promotions and marketing on the Dark Web will be an interesting area to see unfold.
Megan Feil, June 22, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Enterprise Search Vendor Sinequa Partners with MapR
June 8, 2016
In the world of enterprise search and analytics, everyone wants in on the clients who have flocked to Hadoop for data storage. Virtual Strategy shared an article announcing Sinequa Collaborates With MapR to Power Real-Time Big Data Search and Analytics on Hadoop. A firm specializing in big data, Sinequa, has become certified with the MapR Converged Data Platform. The interoperation of Sinequa’s solutions with MapR will enable actionable information to be gleaned from data stored in Hadoop. We learned,
“By leveraging advanced natural language processing along with universal structured and unstructured data indexing, Sinequa’s platform enables customers to embark on ambitious Big Data projects, achieve critical in-depth content analytics and establish an extremely agile development environment for Search Based Applications (SBA). Global enterprises, including Airbus, AstraZeneca, Atos, Biogen, ENGIE, Total and Siemens have all trusted Sinequa for the guidance and collaboration to harness Big Data to find relevant insight to move business forward.”
Beyond all the enterprise search jargon in this article, the collaboration between Sinequa and MapR appears to offer an upgraded service to customers. As we all know at this point, unstructured data indexing is key to data intake. However, when it comes to output, technological solutions that can support informed business decisions will be unparalleled.
Megan Feil, June 8, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Mindbreeze Breaks into Slovak Big Data Market Through Partnership with Medialife
April 18, 2016
The article titled Mindbreeze and MEDIALIFE Launch Strategic Partnership on BusinessWire discusses what the partnership means for the Slovak and Czech Republic enterprise search market. MediaLife emphasizes its concentrated approach to document management systems for Slovak customers in need of large systems for the management, processing, and storage of documents. The article details,
“Based on this partnership, we provide our customers innovative solutions for fast access to corporate data, filtering of relevant information, data extraction and their use in automated sorting (classification)… Powerful enterprise search systems for businesses must recognize relationships among different types of information and be able to link them accordingly. Mindbreeze InSpire Appliance is easy to use, has a high scalability and shows the user only the information which he or she is authorized to view.”
Daniel Fallmann, founder and CEO of Mindbreeze, complimented himself on his selection of a partner in MediaLife and licked his chops at the prospect of the new Eastern European client base opened to Mindbreeze through the partnership. Other Mindbreeze partners exist in Italy, the UK, Germany, Mexico, Canada, and the USA, as the company advances its mission to supply enterprise search appliances as well as big data and knowledge management technologies.
Chelsea Kerwin, April 18, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
RAVN ACE Can Help Financial Institutions with Regulatory Compliance
March 31, 2016
Increased regulations in the financial field call for tools that can gather certain information faster and more thoroughly. Bobsguide points to a solution in, “RAVN Systems Releases RAVN ACE for Automated Data Extraction of ISDA Documents Using Artificial Intelligence.” For those who are unaware, ISDA stands for International Swaps and Derivatives Association, and a CSA is a Credit Support Annex. The press release informs us:
“RAVN’s ground-breaking technology, RAVN ACE, joins elements of Artificial Intelligence and information processing to deliver a platform that can read, interpret, extract and summarise content held within ISDA CSAs and other legal documents. It converts unstructured data into structured output, in a fraction of the time it takes a human – and with a higher degree of accuracy. RAVN ACE can extract the structure of the agreement, the clauses and sub-clauses, which can be very useful for subsequent re-negotiation purposes. It then further extracts the key definitions from the contract, including collateral data from tabular formats within the credit support annexes. All this data is made available for input to contract or collateral management and margining systems or can simply be provided as an Excel or XML output for analysis. RAVN ACE also provides an in-context review and preview of the extracted terms to allow reviewing teams to further validate the data in the context of the original agreement.”
The write-up tells us the platform can identify high-credit-risk relationships and detail the work required to repaper those accounts (that is, to re-draft, re-sign, and re-process paperwork). It also notes that even organizations that have a handle on their contracts can benefit, because the platform can compare terms in actual documents with those that have been manually abstracted.
Based in London, enterprise search firm RAVN tailors its solutions to the needs of each industry it serves. The company was founded in 2011.
Cynthia Murrell, March 31, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Allegedly Secretive Palantir Technologies Getting Chatty?
March 22, 2016
Many of the articles I read about Palantir Technologies describe the company as secretive. I am not sure that is 100 percent accurate. The company has videos on YouTube, for goodness’ sake.
I noted “How Palantir Uses Big Data to Find Missing Kids.” This article came hard on the heels of “Is Morgan Stanley Wrong about Big Palantir Valuation Markdown?”
The missing kids story emphasizes Palantir’s social “good” work. I noted this passage:
Lucky for Palantir, big data challenges are just as common in the nonprofit world as in the for-profit sector. Recently, the company, which started out partnering with the U.S. intelligence and defense communities in antiterrorism efforts, has turned its attention to one of the biggest current problems: The Syrian civil war and subsequent refugee crisis, via a collaboration with The Carter Center. “We’re a company that focuses on the world’s hardest problems,” says Karin Knox, head of Palantir’s philanthropy engineering team. “Right now we probably have a hand in all of them.”
Lucky.
Stephen E Arnold, March 22, 2016
Cost Common Sense: Why Your Search System Just Keeps Getting More and More Expensive
March 21, 2016
I read “4 Unseen Expenses with In-House IT Departments.” The information in the write up is helpful. Too bad more specialists keep the lid on cost data. In the write up, there are four “unseen expenses” which almost guarantee that enterprise search systems will, like the Energizer bunny, keep going and going. What are the costs? Here are the four from the write up:
- Staffing costs
- Downtime costs
- Ineffective IT support costs
- Cost of replacement.
I would mention several others, but I don’t want to exhaust my list of the costs associated with an enterprise search system. (Mine are split into planning or pre acquisition costs, procurement costs, initial installation costs, first year costs, and subsequent year costs.) Four, as you may conclude, gentle reader, only spot the iceberg of money that looms through the fog of disbelief.
The Energizer bunny of cost overruns, enterprise search.
My additions:
- The costs of operating legacy systems. As I have pointed out in previous books and articles, Fortune 1000 firms have five or more enterprise search systems in operation
- The costs of legal fees related to adjudications with the vendor or vendors for services related to the enterprise search system
- The costs of infrastructure surprises; for example, why is this system so slow to add new and changed content? Answer: We need more hardware, bandwidth, storage, memory, etc.
Enterprise search, after 50 years, is a chief financial officer’s bane in many organizations.
Stephen E Arnold, March 21, 2016
Lexmark and Search
March 20, 2016
Short honk. Last year, Forrester, the mid tier consulting firm, released a magic square for enterprise search. I noted this morning that Lexmark was relying on TechRepublic to push this old wine in a somewhat new bottle. You can see the pitch at this link. What’s remarkable about this particular magic square thing is that Lexmark is flagged as a leader in enterprise search. Lexmark, as you may know, acquired the ISYS Search Software system and Brainware a couple of years ago. ISYS is interesting because its technology was crafted in the 1980s. Lexmark’s financial challenges are similar to those faced by other print centric companies trying to make the transition to the digital ecosystem. But a leader in a sector which has largely embraced open source search technology? Interesting.
Stephen E Arnold, March 20, 2016
Elasticsearch Case Example: Scrunch
March 19, 2016
If you are using or considering the use of Elasticsearch, you will want to read “Lessons Learned From A Year Of Elasticsearch In Production.” The write up contains five excellent tips.
I highlighted this statement as one which Elasticsearch users will want to keep in mind:
If you can afford SSDs, then buy them. Elasticsearch does a lot of reading from disk and fast disks equal fast queries.
Elasticsearch is one reason proprietary search vendors are gasping for air.
Stephen E Arnold, March 19, 2016