SeMI: Yet Another Smart Search System

May 2, 2022

Once upon a time, search engines were incapable of understanding queries phrased like a question. With the advent of smarter technology, particularly machine learning and AI, search engines are almost as smart as a human. TechCrunch discusses how one company has created its take on smart search: “SeMI Technologies’ Search Engine Opens Up New Ways To Query Your Data.”

SeMI Technologies invented Weaviate, a vector search engine built on an AI-first database in which machine learning models output vectors, aka embeddings. The company wishes to commoditize the technology and has an open source business model. Bob van Luijt is the CEO and co-founder of SeMI. He wants his vector search engine to remain open source so it can help people and businesses that truly need it. SeMI did not create the models used in Weaviate; instead, it delivers the power and systems recommendations.
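For readers who want to see what "searching by vectors" means in practice, here is a minimal sketch. It is not Weaviate's API or implementation. The three-number embeddings, the document ids, and the function names are all made up for illustration; a real system would get its vectors from a machine learning model and use an approximate index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vec, index, k=2):
    """Brute-force search: score every stored vector, return the top k ids."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones have hundreds of dimensions.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # stands in for an embedded user question

print(nearest(query, index))
```

The point of the sketch is that the query never has to share a keyword with a document. Closeness in the vector space does the matching.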

SeMI Technologies has had over one hundred use cases, including startups that are powered by vector search engines and use Weaviate to deliver results. SeMI was not actively seeking investors when it received funding in 2020:

“SeMI raised a $1.2 million seed in August 2020 from Zetta Venture Partners and ING Ventures and since then has been on the radar of venture capital companies. Since then, its software has been downloaded almost 750,000 times, growth of about 30% per month. Van Luijt didn’t give specifics on the company’s growth metrics, but did say the number of downloads can correlate to sales of enterprise licenses and managed services. In addition, the spike in usage and understanding of the added value of Weaviate has caused all growth metrics to go up, and the company to exhaust its seed funding.”

The company has received more funding in a Series A round that ended with $16 million. The CEO will use the money to hire more employees in the US and Europe, expand its open source community, focus on go-to-market and products centered on the open source core, and invest in research where machine learning overlaps with computer science.

Whitney Grace, May 2, 2022

Deepset: Following the Trail of DR LINK, Fast Search and Transfer, and Other Intrepid Enterprise Search Vendors

April 29, 2022

I noted a Yahooooo! news story called “Deepset Raises $14M to Help Companies Build NLP Apps.” To me the headline could mean:

Customization is our business and services revenue our monetization model

Precursor enterprise search vendors tried to get gullible prospects to believe a company could install software and employees could locate the information needed to answer a business question. STAIRS III, Personal Library Software / SMART, and the outfit with forward truncation (InQuire) among others were there to deliver.

Then reality happened. Autonomy and Verity upped the ante with assorted claims. The Golden Age of Enterprise Search was poking its rosy fingers through the cloud of darkness related to finding an answer.

Quite a ride: The buzzwords sawed through the doubt and outfits like Delphis, Entopia, Inference, and many others embraced variations on the smart software theme. Excursions into asking the system a question to get an answer gained steam. Remember the hand crafted AskJeeves or the mind boggling DR LINK; that is, document retrieval via linguistic knowledge.

Today there are many choices for enterprise search: Free Elastic, Algolia, Funnelback now the delightfully named Squiz, Fabasoft Mindbreeze, and, of course, many, many more.

Now we have Deepset, “the startup behind the open source NLP framework Haystack,” not to be confused with Matt Dunie’s memorable “haystack with needles” metaphor, the intelware company Haystack, or a basic pile of dead grass.

The article states:

CEO Milos Rusic co-founded Deepset with Malte Pietsch and Timo Möller in 2018. Pietsch and Möller — who have data science backgrounds — came from Plista, an adtech startup, where they worked on products including an AI-powered ad creation tool. Haystack lets developers build pipelines for NLP use cases. Originally created for search applications, the framework can power engines that answer specific questions (e.g., “Why are startups moving to Berlin?”) or sift through documents. Haystack can also field “knowledge-based” searches that look for granular information on websites with a lot of data or internal wikis.

What strikes me? Three things:

  1. This is essentially a consulting and services approach
  2. Enterprise becomes apps for a situation, department, or specific need
  3. The buzzwords are interesting: NLP, semantic search, BERT, and humor.

Humor is a necessary quality when trying to make decades old technology work for distributed, heterogeneous data, email on a sales professional’s mobile, videos, audio recordings, images, engineering diagrams along with the nifty datasets for the gizmos in the illustration, etc.

A question: Is $14 million enough?

Crickets.

Stephen E Arnold, April 29, 2022

Enterprise Search Vendor Buzzword Bonanza!

April 25, 2022

Enterprise search vendors are similar to those two Red Bull-sponsored wizards who wanted to change aircraft—whilst in flight. How did that work out? The pilots survived. That aircraft? Yeah, Liberty, Liberty Mutual as the YouTube ads intone.

Enterprise search vendors want to become something different. Typical repositionings include customer support which entails typing in a word and scanning for matches and business intelligence which often means indexing content, matching words and phrases on a list, and generating alerts. There are other variations which include analyzing content and creating a report which tallies text messages from outraged customers.
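How little machinery the "business intelligence" repositioning requires can be shown in a few lines. The sketch below is my own illustration, not any vendor's product: the messages, the watchlist, and the function name `find_alerts` are hypothetical. It matches words and phrases on a list against text and emits alerts.

```python
import re


def find_alerts(documents, watchlist):
    """Return (doc_id, phrase) pairs for each watchlist phrase found in a document."""
    alerts = []
    for doc_id, text in documents.items():
        # Lowercase and strip punctuation so phrases match on word boundaries.
        normalized = " ".join(re.findall(r"[a-z0-9']+", text.lower()))
        padded = f" {normalized} "
        for phrase in watchlist:
            if f" {phrase.lower()} " in padded:
                alerts.append((doc_id, phrase))
    return alerts


# Hypothetical customer messages and a hypothetical watchlist.
documents = {
    "msg1": "Where is my order? I want a refund!",
    "msg2": "Thanks for the quick reply.",
}
watchlist = ["refund", "late delivery"]

for doc_id, phrase in find_alerts(documents, watchlist):
    print(f"ALERT: {doc_id} mentions '{phrase}'")
```

Tally the alerts by phrase and one has the report about outraged customers.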

Let’s check out reality. “Enterprise search” means finding information. Words and phrases are helpful. Users want these systems to know what is needed and then output it without asking the user to do anything. The challenge becomes assigning a jazzy marketing hook to make enterprise search into something more vital, more compelling, and more zippy.

Navigate to “What Should We Remember?” Bonanza. The diagram is a remarkable array of categories and concepts tailor-made for search marketers. Here’s an example of some of the zingy concepts:

  • Zero-risk bias
  • Social comparison
  • Fundamental attribution
  • Barnum effect — Who? The circus person?

Now mix in natural language processing, semantic analysis, entity extraction, artificial intelligence, and — my fave — predictive analytics.

How quickly will outfits in the enterprise search sector gravitate to these more impactful notions? Desperation is a motivating factor. Maybe weeks or months?

Stephen E Arnold, April 25, 2022

Enterprise Search Vendors: Sure, Some Are Missing But Does Anyone Know or Care?

April 20, 2022

I came across a site called Software Suggest and its article “Coveo Enterprise Search Alternatives.” Wow. What’s a good word for bad info?

The system generated 29 vendors in addition to Coveo. The options were not in alphabetical order or any pattern I could discern. What outfits are on the list? Here are the enterprise search vendors for February 2022, the most recent incarnation of this list. My comments are included in parentheses for each system. By the way, an alternative is picking from two choices. This is more correctly labeled “options.” Just another indication of hippy dippy information about information retrieval.

AddSearch (Web site search which is not enterprise search)

Algolia (a publicly traded search company hiring to reinvent enterprise search just as Fast Search & Transfer did more than a decade ago)

Bonsai.io (another Elasticsearch repackager)

Coveo (no info, just a plea for comments)

C Searcher (from HNsoft in Portugal; desktop search last updated in 2018 according to the firm’s Web site)

CTX Search (the expired certificate does not bode well)

Datafari (maybe open source? chat service has no action since May 2021)

Expertrec Search Engine (an eCommerce solution, not an enterprise search system)

Funnelback (the name is now Squiz. The technology is Australian)

Galaktic (a Web site search solution from Taglr, an eCommerce search service)

IBM Watson (yikes)

Inbenta (A Catalan outfit which shapes its message to suit the purchasing climate)

Indica Enterprise Search (based in the Netherlands but the name points to a cannabis plant)

Intrasearch (open source search repackaged with some spicy AI and other buzzwords)

Lateral (the German company with an office in Tasmania offers an interface similar to that of Babel Street and Geospark Analytics for an organization’s content)

Lookeen (desktop search for “all your data”. All?)

OnBase ECM (this is a tricky one. ISYS Search sold to Lexmark. Lexmark sold to Hyland. Hyland appears to be the proud possessor of ISYS Search and has grafted it to an enterprise content management system)

OpenText (the proud owner of many search systems, including Tuxedo and everyone’s fave BRS Search)

Relevancy Platform (three years ago, Searchspring Relevancy Platform was acquired by Scaleworks which looks like a financial outfit)

Sajari (smart site search for eCommerce)

SearchBox Search (Elasticsearch from the cloud)

Searchify (a replacement for Index Tank. who?)

SearchUnify (looks like a smart customer support system, a pitch used by Coveo and others in the sector)

Site Search 360 (not an enterprise search solution in my opinion)

SLI Systems (eCommerce search, not enterprise search, but I could be off base here)

Team Search (TransVault searches Azure Tenancy set ups)

Wescale (mobile eCommerce search)

Wizzy (the name is almost as interesting as the original Purple Yogi system and another eCommerce search system)

Wuha (not as good a name as Purple Yogi. A French NLP search outfit)

X1 Search (from Idea Labs, X1 is into eDiscovery and search)

This is quite an incomplete and inconsistent list from Software Suggest. It is obvious that there is considerable confusion about the meaning of “enterprise search.” I thought I provided a useful definition in my book “The Landscape of Enterprise Search,” published by Panda Press a decade ago. The book, like me, is not too popular or well known. As a result, the blundering around in eCommerce search, Web site search, application specific search, and enterprise search is painful. Who cares? No one at Software Suggest I posit.

My hunch is that this is content marketing for Coveo. Just a guess, however.

Stephen E Arnold, April xx, 2022

Enterprise Search: What Did Shakespeare Allegedly Write?

November 15, 2021

The statement, according to my ratty copy of Shakespeare’s plays edited by one of the professors who tried to get me out of the university’s computer “room” in 1964, presents the Bard’s original, super authentic words this way:

The play is Hamlet. The queen, looking queenly, says to the fellow Thespian: “The lady doth protest too much, methinks.”

Ironic? You decide. I just wanted to regurgitate what the professor wanted. Irony played no part in getting an A and getting back to the IBM mainframe and the beloved punch card machine.

I thought about “protesting too much” after I read “Making a Business Case for Enterprise Search.”

I noted this statement:

In effect you have to develop a Fourth Dimension costing model to account for the full range of potential costs.

Okay, the 4th dimension. Experts (real and self anointed) have been yammering about enterprise search for decades.

Why does an organization snap at the marketing line deployed by vendors of search and retrieval technology? The answer is obvious, at least to me. Someone believes that finding information is needed for some organizational instrumentality. Examples include finding an email so it can be deleted before litigation begins. Another is to locate the PowerPoint which contains the price the now terminated sales professional presented to close a very big contract. How about pinpointing who in the organization had access to the chemical composition of a new anti viral? Another? A shipment went walkabout. Some person making minimum wage has to locate products to be able to send out another shipment.

The laughable part of “enterprise search” is that there is no single system, including the craziness pitched by Amazon, Microsoft, Google, start ups with AI centric systems, or small outfits which have been making minimal revenue headway for a very long time from a small city in Austria or a suburb of the delightful metropolis of Moscow.

The cost of failing to find information cannot be reduced to the made up data about how long a person spends hunting for information. I believe a mid tier consulting outfit and a librarian cooked up this info-confection. Nor is any accountant going to be able to back out the “cost” of search in a cloud database service provided by one of the regulators’ favorite monopolies. No system manager I know keeps track of what time and effort goes into making it possible for a 23 year old art history major to locate the specific technical innovation in an autonomous drone. Information of this type requires features not included in Everything, X1, Solr, or the exciting Amazon knock off of Elastic’s follow on to Compass.

Enterprise information retrieval has been a thing for about 50 years. Where has the industry gone? Well, one search executive did a year in prison. Another is fighting extradition for financial fancy dancing. Dozens have just failed. Remember Groxis? And many others have gone to the search-doesn’t-work section of the dead software cemetery.

I find it interesting that people have to explain search in the midst of smart software, blockchain, and a shift to containerized development.

Oh, well. There’s the Sinequa calculator thing.

Stephen E Arnold, November 15, 2021

Elastic CEO on New Products and AWS Battle

November 10, 2021

Here is an interesting piece from InfoWorld about a company we have been following for years. Elastic is the primary developer behind the open source Elasticsearch and made its money vending managed services for the platform. Lately, though, the company has been expanding into new markets—application performance management (APM), observability, and security information event management (SIEM). The company’s CEO discusses this expansion as well as its struggle with Amazon over the use of Elasticsearch in, “Elastic’s Shay Banon: Why We Went Beyond our Search Roots—and Stood Up to ‘Bully’ AWS.”

First, reporter Scott Carey asks about the move into security. Banon admits Elastic was late to the SIEM game, but that timing gave the CEO a unique perspective. He makes this observation:

“When I got into security, I really didn’t understand why the market is so fragmented. I think a big part of it is top-down selling. It’s not like CISOs [Chief Information Security Officers] aren’t smart, but they’re not practitioners, so you can go in and more easily communicate to them that they need certain protection. I could see that there was tension between the security team and developers, operations, devops teams. Security didn’t trust them, and it was the same story as before with operators and developers. This is where I think our biggest opportunity is in the security market. To be one of the companies that brings the trends that caused dev and ops to come together and bring it to security.”

See the write-up for more of Banon’s observations on security, APM, and observability. As for the licensing battle with Amazon, that began in 2015 when AWS implemented its own managed Elasticsearch service without collaborating with Elastic. Carey notes both MongoDB and Cloudflare had similar issues with the mammoth cloud-services vendor. Elastic ultimately took a controversial step to deal with the problem. We learn:

“In a January blog post, Banon outlined how the company was changing its license for Elasticsearch from Apache 2.0 to a dual Elastic License and Server Side Public License (SSPL), a change ‘aimed at preventing companies from taking our Elasticsearch and Kibana products and providing them directly as a service without collaborating with us.’ AWS has since renamed its now-forked service as OpenSearch.”

Banon states he did not really want to change the license but felt he had to take a stand against AWS, which he compared to a schoolyard bully. The CEO has some sympathy for those who feel the decision was unfair to developers outside Elastic who had contributed to Elasticsearch. However, he notes, his company did develop 99% of the software. See the article for more of his reasoning, his perspective on Elasticsearch’s “very open and very simple” new license, and where he sees the company going in the future.

Cynthia Murrell November 10, 2021

Sinequa: Estimating Once and for All the Value of Search

November 8, 2021

Vendors of enterprise search systems have struggled for decades to explain the “value” of their systems and software. The task is a difficult one for several reasons:

  1. Search is a term which is difficult to define in a satisfactory way to each person, unit, department, job specialty, and executive in an organization. Why? Search is personal. Chemists don’t want what lawyers want;  marketers don’t want what invoice clerks want.
  2. Search is perceived as either built in (Microsoft and Oracle provide crude tools to find items) or free (bright computer grads know about Solr).
  3. Search over the last 50 years has fragmented into specific tools for specialist jobs because the one-size fits all has demonstrated it does not work, produces financial meltdown (Convera, Entopia, Delphis, Hakia, etc.), creates legal hassles (Autonomy and Fast Search & Transfer), and crazy marketing hyperbole which does not deliver on time, on target, or within the budget (Dieselpoint, Endeca, Teratext, etc.)

I read “Sinequa Develops ROI Calculator for Determining the Benefits of Enterprise Search.” The write up asserts:

The new online tool has been designed to assess a company’s productivity and predict the potential productivity gains achievable with the company’s recently released Insight Apps deployed with Sinequa’s Intelligent Enterprise Search platform.

Might be worth a look, but I have learned that it is better to have a specific problem regarding information retrieval and then spell out what’s required. Then one can go looking for a system which delivers. Being sued? E-discovery. Hunting for information on your local machine? Everything. Information related to a law enforcement case? Datawalk.

Stephen E Arnold, November 8, 2021

Enterprise Search: Will It Prove Sweet for MeiliSearch?

October 20, 2021

Developers looking for a search solution may want to check out MeiliSearch. The customizable search engine recently released a new version that boasts a refined indexer, customizable ranking rules, and a sort function that works as users type their query. Their website promises:

“For developers: Scalable, maintainable, customizable. MeiliSearch provides an extensive toolset for customization. Unlike with other search engines, these customization options are just that: optional. It works out-of-the-box with a preset that easily answers the needs of most applications. Communication is done with a RESTful API because most developers are already familiar with its norms.”

And we noted:

“For users: Blazing fast, relevant, typo tolerant. The search experience feels simple and intuitive. It’s all too common for search bars to make users feel like they have to learn a new language just to get the best results – or worse, that they have to jump back and forth between their search and Google just to get the right spelling or product UID. MeiliSearch makes searching simple and responsive, so the user can stay focused on the results.”

The company’s blog post, “What’s New in v0.22,” emphasizes the new sort-as-you-type feature. Writer Gui Machiavelli tells us one could previously achieve a similar result with custom ranking rules applied during index configuration, but this feature makes it easier. MeiliSearch is an open source project. The company was founded in 2018 and is based in Paris, France.
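To make "typo tolerant" concrete, here is a minimal sketch of the general idea using plain edit distance. This is not MeiliSearch's actual algorithm or API. The titles, the `max_typos` parameter, and both function names are assumptions made for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def typo_tolerant_search(query, titles, max_typos=1):
    """Return titles containing a word within max_typos edits of the query."""
    q = query.lower()
    hits = []
    for title in titles:
        words = title.lower().split()
        if not words:
            continue
        best = min(edit_distance(q, word) for word in words)
        if best <= max_typos:
            hits.append((best, title))
    hits.sort()  # closest matches first
    return [title for _, title in hits]


# Hypothetical product titles.
titles = ["Elastic bands", "Plastic cups", "Rubber trees"]

print(typo_tolerant_search("elastik", titles))
print(typo_tolerant_search("elastik", titles, max_typos=2))
```

Loosening `max_typos` trades precision for recall: one allowed typo finds only the elastic bands, two pulls in the plastic cups as well.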

Cynthia Murrell October 20, 2021

A Business Case for Search in the Time of Covid and the SolarWinds Misstep

February 8, 2021

Why does one working in an organization have to make a case for enterprise search? Oh, right, I forgot. Enterprise search has a rich history: Fast Search & Transfer with jail time for the founder, Autonomy with a sentencing date looming for the founder, Entopia with financial pain for its investors, and, well, the list of issues with enterprise search can be extended with references to IBM OmniSphere or STAIRS III, Delphes, Siderean, Arikus, Attensity, Brainware, Eegi, Relegence, Hakia, and the memorable Zaizi, among others.

“Making the Business Case for Enterprise Search” is sponsored. That means it is an advertisement, marketing collateral, and hoo hah. But what is its message? I noted this passage:

Knowledge-centric organizations know that tools such as intelligent search are critical for cutting through the noise and making relevant information discoverable. However, many executives don’t prioritize these types of tools.

Yep, and there is a reason. Consider that Elasticsearch is open source. Amazon offers search and is educating the enthusiastic for free. Put these successes against the backdrop of Google’s high profile failure: The GSA or Google Search Appliance, a fine product according to some Google engineers.

Regardless of today, large organizations typically have multiple information retrieval systems. The idea of federating the information is a really good one until the bean counters realize that the staff, professional for fee services, and the time required to figure out access controls, file formats, and how to cope with versions, rich media, trade secrets in engineering drawings and chemical formulas, and index latency cost more money than anyone revealed in a marketing pitch.

The write up notes:

In a recent survey, nearly half of all respondents said it was challenging finding the right information when they needed it.

One question: What’s right? The problem with enterprise search is that it is a fake discipline trying to gain traction in a world of business intelligence, analytics, and real time data capture, analysis, and outputs.

I laughed at the reminder “Don’t neglect security.” This is the era of the SolarWinds’ misstep. Security is underfunded in most organizations. Do responsible Boards of Directors and senior executives need to be reminded that their security systems are now Job Number One?

Enterprise search? Yeah, a hot enterprise solution. Just a solution which has become a utility and a free one via open source software at that.

Stephen E Arnold, February 8, 2021

The AWS Bulldozer and Elasticsearch: Can the Rubber Trees Grow Back?

January 22, 2021

In 1955 or 1956, I lived in Campinas, Brazil. My father worked for RG LeTourneau. He had the delightful job of setting up a factory to produce what were then called sheep foot rollers. Most people are not aware of the function of a sheep’s foot roller. Let me explain.

Hook a D9 or other comparable bulldozer to two or more sheep foot rollers. Drive the bulldozer, scraper, or other heavy duty machine through a grassy field, a jungle or grassland. Crush and smash the trees, plants, and animals. What’s in the wake of the snorting and roaring yellow beast is a surface almost ready for paving. That’s right. The sheep foot rollers made the Trans-Amazon highway a reality.

Holmes Sheepfoot Rollers & Parts

What did the fleets of earth moving machinery do to the Hevea brasiliensis, a species of rubberwood? Well, in the case of highway deforestation, the elastic plants did not fare particularly well.

What does this slice of my life have to do with search, retrieval, log file analysis, information access, and other content related activities?

“Stepping Up for a Truly Open Source Elasticsearch” reminded me of the impact of the bulldozers and the sheep foot roller combos. The write up explains:

We launched Open Distro for Elasticsearch in 2019 to provide customers and developers with a fully featured Elasticsearch distribution that provides all of the freedoms of ALv2-licensed software. Open Distro for Elasticsearch is a 100% open source distribution that delivers functionality practically every Elasticsearch user or developer needs, including support for network encryption and access controls. In building Open Distro, we followed the recommended open source development practice of “upstream first.”

Who is the “we” driving what I think of as a digital bulldozer? Why none other than Amazon.

I wrote about Elastic’s difficult decision to try to stave off the building of an information superhighway directly over the Elastic NV buildings in Amsterdam. You can find that essay in “Enterprise Search: Flexible and Stretchy. Er, No.”

I think my observation was that it was too late for Elastic NV. Perhaps the company can find a way to avoid the Bezos bulldozer. The sentiments about the virtues of open source software echo through the Amazon blog post and the Elastic NV explanation of its decision to be a different flavor of open source goodness.

Put that handwaving aside.

The function of the bulldozer and the sheep foot roller is to build a new trail. That trail leads to Amazon AWS revenues, service offerings, and integrated functionality.

Vrrooom. Too bad about those hyacinth macaws. My father and Mr. LeTourneau were not environmentalists. Neither was particularly elastic either. Both loved the results of big yellow machines dragging sheep foot rollers across the virgin landscape.

There’s a lesson here. The Trans-Amazon highway is visible from the international space station. The rubber trees and other trivialities are not.

Stephen E Arnold, January 22, 2021
