Crazy Enterprise Search Market Report for May 25, 2020

May 25, 2020

Another crazy enterprise search market report is now available. This one omits the date on which the report was written, falling back on the vague word “recent.” In fact, my hunch is that this is one dicey report marketed under different aliases in order to gin up sales.

The title? “Enterprise Search Market Dynamics, Comprehensive Analysis, Business Growth, Revealing Key Drivers, Prospects and Opportunities 2025”

What’s in this gem from Market Study Report? The write up about the report promises:

The recent document on the Enterprise Search market involves breakdown of this industry as well as division of this vertical. As per the report, the Enterprise Search market is subjected to grow and gain returns over the predicted time period with an outstanding growth rate y-o-y over the predicted period.

Yep, outstanding. Obviously the global economic downturn has not had an impact on the half century young enterprise search software sector.

Enterprise search solutions are hot items. Forget hand sanitizer and surgical masks, enterprise search solutions are the barn burners. Are there lines of eager customers queuing outside of Algolia, Coveo, Elastic, IBM Omnifind’s office, Lucidworks, and Microsoft’s search facility in Beijing? Sure, sure, long lines. No social distancing either. Jostling and crowding are what happens when a sizzler is on offer.

The report presents information “with regards to the geographical landscape.” Yep, but how many languages do enterprise search systems support?

What’s interesting is the list of companies analyzed in the report. Here you go:

Attivio Inc

Concept Searching Limited

Coveo Corp

Dassault Systemes

Expert System Inc

Google

Hyland

IBM Corp

Lucid Work (It would be helpful if the report authors spelled the name of the company correctly, wouldn’t it?)

Marklogic Inc

Micro Focus

Oracle

SAP AG

Microsoft

X1 Technologies

There are some notable omissions, but I won’t provide these names. Obviously I am not hip to where enterprise search is at this moment.

In the three editions of the Enterprise Search Report I wrote, it never crossed my mind to include a manufacturing cost structure analysis. Poor stupid me.

What seems clear is that whoever is marketing this report recycles the content under different names, hoping for a sale.

The data in the report, one hopes, is more polished than the promotional material.

Stephen E Arnold, May 25, 2020

Written by Stephen E. Arnold · Filed Under Marketing, News, Search | 1 Comment

Search: Contentious and Increasingly Horrible

May 25, 2020

I dropped enterprise search, commercial search, and vertical search to the bottom of my “Favorite Topics” list years ago.

Why?

The individuals popping up and off at conferences were disconnected from the realities of looking for information under stressful circumstances.

Hey, big rocks, how did you move from that quarry kilometers away and get yourselves smoothed down? Just like modern online search systems, you won’t get an answer. Finding information relevant to a query is as difficult as getting megalithic stones to become Chatty Kathies.

The thumb typing crowd, some are now in their mid forties, ASSUME that search has to think for the stupid user.

The techniques include smart software which skews results in ways an experienced researcher finds stupid. For those search experts concerned with making their information or their name appear number one on a results list, good search was anything that produced a top spot in a result list even if that result was stupid, irrelevant, or shameless ego jockeying. Then there are the chipper, super confident experts who emerged from an educational system which awarded those who showed up and sort of behaved a blue ribbon. Yep, everything that group does is just wonderful. Yeah, right.

You can see the consequences of two forces colliding when you read Science Magazine’s “They Redesigned PubMed, a Beloved Website. It Hasn’t Gone Over Well.”

You can work through the examples in the source article. The pain points range from appearance to search functionality.

Why did this happen?

The change is a result of people who do not have the experience of performing search under stressful conditions. No, I don’t mean locating the Cuba Libre restaurant in Washington, DC, on a Google Map. I mean looking up technical information to complete a lab test, perform a diagnosis, locate a procedure, or some similar action. There is a pandemic going on, isn’t there?

The complaints indicate that the “new” PubMed is not perceived as a home run.

Go read the original.

I want to offer several observations:

Those who do research with intent need predictability; that is, when a Boolean query is entered, the results should reflect that logic. Modern systems think Boolean is stupid. There you go, a value judgment from those with “Also Participated” ribbons in high school.
Interfaces should allow the user to select an approach. There are some users who like a blinking dot or a question mark. Enter the commands and get a text output. Others like the Endeca style training wheels, although I doubt if any of the modern “helper” interfaces know what Endeca offered. Others may want some other type of interface like a PhD approach; that is, push here, dummy. The point is: Why not allow the user to select the interface?
Change is introduced for dark purposes. Catalina has many points of friction so that Apple can extend its span of control. Annoying? Sure is. Why doesn’t Apple tell the truth about these friction points? What? Tell the truth? Are you crazy? Apple, like Facebook and Google, is doing what it can to protect its hegemony, and the user is the victim. Tough. The same logic applies to PubMed. Dollars to donuts there is a “reason” for the change, and it may be due to whimsy, money, or the need to demonstrate the team is actually doing something instead of just having meetings with contractors.

Net net: As I wrote for Barbara Quint in the now departed magazine Searcher, search is dead. Each day the hope for a better, more appropriate way to locate online information becomes lost in the mists of time. Getting relevant information from PubMed or any modern system is like trying to get the stone of Ollantaytambo to explain how the rocks moved eons ago.

Finding information today is more difficult than at any other time in my professional career. That’s a big problem.

Stephen E Arnold, May 24, 2020

Written by Stephen E. Arnold · Filed Under Feature, Search | 3 Comments

Microsoft and Its Latest Search Innovation: Moving Past Fast? Nope

May 22, 2020

I read “Microsoft Search: Search Your Document Like You Search the Web.” Perhaps Microsoft did not get the reports about the demise of the Google Search Appliance. That “invention” made clear that searching a corporate content collection like you search the Web was not exactly the greatest thing since sliced bread. There were a number of reasons for the failure of the GSA. It was a black box. You know that mere mortals could not tune the relevance component. You know that it produced results that left employees wondering, “Where is the document I wrote yesterday?” You know that the corpus of Web content is different from the fruit cake of corporate content. Web search returns something because the system is rigged to find a way to display ads to the hapless searcher.

Contrast this with documents in the cloud, in different systems like that old AS/400 Ironsides application used by the warehouse supervisors, and content tucked away on employees’ USB drives, mobile phones, the oldest kid’s iPad, and on services a go to sales professional uses to store PowerPoints for “special” customers. Then there are the documents in the corporate legal office and the consultants’ reports scanned and stored on the Marketing Department’s computer kept for interns.

Nevertheless, the article explains:

We’re utilizing well-established web search technologies, such as query and document understanding, and adding deep learning based natural language models. This allows us to handle a much broader set of search queries beyond “exact match.”

Okay, query expansion, synonym look up, and Fast Search’s concept feature. But there’s more:

With the recent breakthroughs in deep learning techniques, you can now go beyond the common search term-based queries. The result is answers to your questions based on the document content. This opens a whole new way of finding knowledge. When you’re looking at a water quality report, you can answer questions like “where does the city water originate from? How to reduce the amount of lead in water?”

May I suggest that Microsoft and dozens of other enterprise search vendors have promised magical retrieval?

May I point out that the following content types are usually outside the ken of the latest and greatest enterprise search confection; for example:

Quality control data on parts stored in an Autodesk engineering document
Real time data flowing into an organization from sensors
Video content, audio content, and rich media like photographs
Classified content or content restricted by certain constraints. (Access controls are often best implemented by specialized systems unknown to the greedy enterprise search indexing system.)
Documents obtained through an eDiscovery process for legal matters.

Has Microsoft solved these problems? Sure, if everything (note the logically impossible categorical affirmative) is in an Azure repository, it is conceivable that a user query could return a particular content object.

But that’s Microsoft fantasy land, and it is about as likely as Mr. Nadella arriving at work on the back of a unicorn.

Microsoft feels compelled to reinvent search every year or two. The longest journey begins with a single step. It is just that Microsoft took those steps decades ago and still has not reached the now rubblized Fred Harvey’s.

Stephen E Arnold, May 22, 2020

Written by Stephen E. Arnold · Filed Under Microsoft, News, Search | 1 Comment

Lucidworks: Buzzwording in the Pandemic

May 19, 2020

Lucid Imagination (the outfit which contributed some Lucene/Solr talent to Amazon search) renamed itself Lucidworks. The company then embarked on becoming a West Coast version of Fast Search & Transfer, a Splunk like outfit, and now a customer support provider.

That’s a remarkable trajectory for a company built on open source software with more than $200 million in funding since 2007.

One of the DarkCyber researchers spotted “Lucidworks Develops Deep Learning Solution to Make Chatbots Smarter.” The story appeared in a New Zealand online publication. That’s interesting, but more intriguing is that Lucidworks is following in the marketing footsteps of Attivio, Coveo, and other vendors of search and retrieval. The destination: customer service. Who doesn’t love automated customer support chat robots, self serve Web sites with smart software, and the general extinction of individuals who actually know a company’s software or hardware products?

The write up states:

Deep learning is essential for automated chatbots to understand natural language questions and to provide the right answers, which is something that AI-powered search firm Lucidworks has taken on board.

And why?

According to Lucidworks, companies rely on digital portals to provide information to users, whether digital commerce customers looking for product information before purchase, employees hunting for an HR document, or someone looking for an airline’s updated cancellation policies. Information is often scattered across disparate silos and is impossible for a user to locate using natural language questions.

But smart software is available from Amazon with a credit card and some free training courses. Outfits from Algolia to Voyager Search offer the service.

What is interesting is the buzzword salad tossed into this reheated plastic container of mapo tofu:

AI (artificial intelligence)
Automated
Chatbots
Conversational
Deep learning
Digital portals
Engagement
Experiences
Fusion
Natural language
Satisfaction
User intent
Virtual assistants

Quite a vocabulary and what seems an exercise in content marketing. Plus, eager customers in New Zealand will have an opportunity to help the company repay its investors the $200 million plus interest. That works out to 13 years in the enterprise search wilderness before arriving at chatbots.

Options abound and many of them are open source and well documented.

Stephen E Arnold, May 19, 2020

Written by Stephen E. Arnold · Filed Under Marketing, News, Search | Comments Off on Lucidworks: Buzzwording in the Pandemic

Boolean Is Better but Maybe Google Must Motor Through Ad Inventory by Relaxing Queries…a Lot?

May 17, 2020

A brief exchange on StackExchange demonstrates some common sense. One user, moseisley.2015, asks the community, “Should Default Search Behavior be ‘This AND That,’ or ‘This OR That’?” They elaborate:

“I have web application that shows lists of various data types … employees, customers, inventory items, orders, and so on. There’s one simple search field for doing a ‘global’ search … . Question is, when a user enters multi-word text in the field should the default search behavior be (1) this OR that or (2) this AND that? What default behavior do you think average users would expect?”

Their example lists four records: John Smith, John Jones, Michael Smith, and Betty Taylor-Smith. Would users expect the query “John Smith” to return just the first record (AND) or all four (OR)? As any online researcher from the ‘70s and ‘80s would tell you, the Boolean AND is the better default. The first respondent, SNag, sensibly writes:

“As a user, the more I type in, the more specific I’m expecting the results to get, and this is what happens with AND. With OR, your results would explode! If my search for popular Google Doodle games gave me everything that was popular, everything Google, everything Doodle and every game out there, I’d be lost! If you’re expecting your user to fetch all matching either John or Smith results, consider supporting syntax like John|Smith (where | is the logical symbol for OR) and placing a hint ? icon next to the search box to showcase the various supported syntaxes. You could also consider quotes in the search syntax for exact matches, where “Smith” wouldn’t match Taylor-Smith, but Smith would. “John”|”Smith” would then match all John and all Smith but not Betty Taylor-Smith.”

We concur. The second respondent, Big_Chair, adds a good observation—users without any programming background are probably unfamiliar with the | character and may need a more explicit cue that their query is about to return results based on OR rather than AND.
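The difference between the two defaults is easy to see in a few lines of Python. What follows is a minimal sketch, not the StackExchange poster's application: the record list comes from the example above, while the function names, the tokenizer, and the choice to treat a `|` anywhere in the query as a request for OR are assumptions made for illustration. The quoted exact-match syntax SNag mentions is not implemented.

```python
import re

# The four records from the StackExchange example.
RECORDS = ["John Smith", "John Jones", "Michael Smith", "Betty Taylor-Smith"]


def tokens(text):
    """Lowercase the text and split on anything that is not a letter or digit.

    "Taylor-Smith" therefore yields the tokens "taylor" and "smith".
    """
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def search(query, records, default="AND"):
    """Return the records matching the query, in their original order.

    AND is the default. A "|" in the query (hypothetical syntax) or
    default="OR" switches to OR matching.
    """
    terms = tokens(query)
    if not terms:
        return []
    use_or = "|" in query or default == "OR"
    combine = any if use_or else all
    return [r for r in records if combine(t in tokens(r) for t in terms)]


print(search("John Smith", RECORDS))                # AND: one record
print(search("John|Smith", RECORDS))                # OR: all four records
print(search("John Smith", RECORDS, default="OR"))  # OR as the default
```

With AND as the default, each extra word narrows the result list. With OR, each extra word widens it, which is the explosion SNag describes.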

Cynthia Murrell, May 17, 2020

Written by Stephen E. Arnold · Filed Under News, Search | 1 Comment

Google: Regular Search Not Up to Covid19 Queries. Who Knew?

May 15, 2020

Google has launched a new semantic search tool designed to help researchers fight this pandemic. The Google AI Blog reveals “An NLU-Powered Tool to Explore COVID-19 Scientific Literature.” As one might expect, researchers around the world have been turning out an enormous number of papers on the disease and how we might fight it. Why does this call for a special tool? Google researcher Keith Hall writes:

“Traditional search engines can be excellent resources for finding real-time information on general COVID-19 questions like ‘How many COVID-19 cases are there in the United States?’, but can struggle with understanding the meaning behind research-driven queries. Furthermore, searching through the existing corpus of COVID-19 scientific literature with traditional keyword-based approaches can make it difficult to pinpoint relevant evidence for complex queries. To help address this problem, we are launching the COVID-19 Research Explorer, a semantic search interface on top of the COVID-19 Open Research Dataset (CORD-19), which includes more than 50,000 journal articles and preprints.”

Based on the BERT technology recently injected into the general Google Search, this bespoke semantic AI has been trained on biomedical literature. The team chose to build a hybrid term-neural retrieval model for this platform—a combination of keyword search and neural retrieval; see the article for the technical details. Here’s how the search functions:

“When the user asks an initial question, the tool not only returns a set of papers (like in a traditional search) but also highlights snippets from the paper that are possible answers to the question. The user can review the snippets and quickly make a decision on whether or not that paper is worth further reading. If the user is satisfied with the initial set of papers and snippets, we have added functionality to pose follow-up questions, which act as new queries for the original set of retrieved articles.”
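For readers who want a feel for what a hybrid of keyword search and neural retrieval means, here is a toy sketch in Python. It is not Google's model. The scoring functions, the equal weighting, and the hand-made two-number vectors standing in for neural embeddings are all assumptions made for illustration. The last line mimics the follow-up behavior in the quote by re-ranking only the papers returned by the first query.

```python
import math


def term_score(query, text):
    """Fraction of query words that appear in the text (keyword side)."""
    q = set(query.lower().split())
    d = set(text.lower().split())
    return len(q & d) / len(q) if q else 0.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (neural side)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_rank(query, query_vec, docs, weight=0.5):
    """Rank (text, vector) pairs by a weighted sum of both scores."""
    scored = [
        (weight * term_score(query, text) + (1 - weight) * cosine(query_vec, vec), text, vec)
        for text, vec in docs
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(text, vec) for _, text, vec in scored]


# Placeholder corpus: the vectors are invented, not real embeddings.
DOCS = [
    ("masks reduce transmission", [1.0, 0.0]),
    ("vaccine trial results", [0.0, 1.0]),
    ("vaccine storage temperature", [0.2, 0.8]),
]

first = hybrid_rank("vaccine results", [0.0, 1.0], DOCS)[:2]
# A follow-up question is scored against the first result set only.
follow_up = hybrid_rank("storage temperature", [0.2, 0.8], first)
print([text for text, _ in follow_up])
```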

The open-alpha platform is available for free to the research community, and Google plans to continue refining the system over the next few months. May this tool help scientists find solutions that much faster.

Cynthia Murrell, May 15, 2020

Written by Stephen E. Arnold · Filed Under Google, News, Search | Comments Off on Google: Regular Search Not Up to Covid19 Queries. Who Knew?

Deindexing: Does It Officially Exist?

May 14, 2020

DarkCyber noted “LinkedIn Temporarily Deindexed from Google.” The rock solid, hard news service stated:

LinkedIn found itself deindexed from Google search results on Wednesday, which may or may not have occurred due to an error on their part. The telltale sign of an entire domain being deindexed from Google is performing a “site:” search and seeing zero results.

Mysterious.

DarkCyber has fielded two reports of deindexing from Google in the last three days. In one case a site providing automobile data was disappeared. In another, a site focused on the politics of the intelligence sector was pushed from page one to the depths of page three.

Why?

No explanation, of course.

LinkedIn is owned by Microsoft. Is that a reason? Did LinkedIn’s engineers ignore a warning about a problem in AMP?

Google does not make errors. If a problem arises, the cause is the vaunted Google smart software.

DarkCyber’s view is that Google is taking stepped up action to filter certain types of content. We have documented that one Google office has access to controls that can selectively block certain content from appearing in the public facing Web search system. The content is indeed indexed and available to those with certain types of access.

What’s up? Here are our theories:

Google is trying to deal with problematic content in a more timely manner by relaxing constraints on search engineers working in Google “virtual offices” around the world. Human judgments will affect some Web sites. (Contacting Google is as difficult as it has been for the last 20 years.)
Google wants to make sure that ads do not appear next to content that might cause a big spender to pull away. Google needs the cash. The thought is that Amazon and Facebook are starting to put a shunt in the money pipeline.
Google is struggling to control costs. Slowing indexing, removing sites from a crawl, and pushing content that is rarely viewed to the side of the Information Superhighway reduce some of the costs associated with serving more than 95 percent of the queries launched by humans each day.

Regardless of the real reason or the theoretical ones, Google’s control over findable content can have interesting consequences. For example, more investigations are ramping up in Europe about the firm’s practices (either human or software centric).

Interesting. Too bad others affected by Google actions are not of the girth and heft of LinkedIn. Oh, well, the one percent are at the top for a reason.

Stephen E Arnold, May 14, 2020

Written by Stephen E. Arnold · Filed Under Google, Microsoft, News, Search | Comments Off on Deindexing: Does It Officially Exist?

New Arnold-Steele Discussion: Findability Is Terrible

May 7, 2020

Robert David Steele, a former CIA professional, stored a video of our recent discussion about finding open source information. The main point is that findability has degraded to the point that results are generally useless. Bing, Google, and other ad-supported systems have abandoned precision and relevance. Search results are a dog’s breakfast. To view the findability discussion, navigate to this link. The video was produced by Mr. Steele.

Stephen E Arnold, May 7, 2020

Written by Stephen E. Arnold · Filed Under Business strategy, News, Search, search engine | Comments Off on New Arnold-Steele Discussion: Findability Is Terrible

Search Engine Optimization: The Next Frontier Is Smart SEO

April 29, 2020

Content strategy plans are the most overlooked part of any Web site design and advertising campaign. Good content is integral to selling a product or a service, but not everyone is good at creating it. News Patrolling runs down the “Best AI Tools For Content Marketing Strategy” and how AI is becoming an industry game changer.

Content is usually the first impression consumers have of companies. It is meant to engage the consumer, then:

“It serves as a tool to communicate with your audience. If you identify their pain points to provide them with a solution, they will trust you and be more interested in buying your offerings. The growth of your business depends on content strategy. It must be as effective as possible if you do not go downhill. Artificial intelligence can help you make an effective content marketing strategy. There are various tools to help you from targeting keywords to choosing the right topic. You will be surprised to know that AI tools can create a smarter content strategy by identifying the behaviour of users. Such software can help you increase revenues and reduce cost.”

The article recommends four content marketing tools: Hubpost, Quill, Clearscope, and BrightEdge. Hubpost is advertised as using machine learning to help one get an edge on competition. The software analyzes keywords to discover what consumers want, then it clusters topics based on competition level.

Quill specializes in keyword optimization and generating quality content. Clearscope also optimizes content using keywords. It helps you generate keywords based on Google data and select the best keywords to use. Once you choose a keyword and write your post, Clearscope compares the post with other top-ranking posts.

BrightEdge is one integrated software solution that provides performance measurement, optimization, and keywords. It is described as a one-size-fits-all for content marketing strategies.

AI can provide insights into how to create the best content, but the most important part of a content strategy plan remains creative humans.

Yep, SEO is modernizing and automating methods to ensure that ad-supported Web search engines decide what matches a query. Precision, recall, and objectivity? Forget those irrelevant concepts.

Whitney Grace, April 29, 2020

Written by Stephen E. Arnold · Filed Under AI, News, Search, SEO | Comments Off on Search Engine Optimization: The Next Frontier Is Smart SEO

Dig.ccMixter for Royalty-Free Tunes

April 22, 2020

Here is a resource that makers (and aspiring makers) of video content and games will want to bookmark. CCMixter is an online community where musicians share their work through Creative Commons licenses. Dig.ccMixter is the search portal into that content, free to download and use even for commercial purposes. Scrolling down reveals three categories: instrumental music for film & video; free music for commercial projects; and music for video games. Clicking the “Dig!” button leads to a keyword search page, where you can search by attributes like genre, mood, and instruments. The site’s About page, titled Yea, But Is It Legal? explains:

“This is a community music remixing site featuring remixes and samples licensed under Creative Commons licenses. Music on this site is licensed under a Creative Commons license. You are free to download and sample from music on this site and share the results with anyone, anywhere, anytime. Some songs might have certain restrictions, depending on their specific licenses. Each submission is marked clearly with the license that applies to it.”

So there you have it—a free source of music for your projects, even ones you intend to profit from. All you have to do is give credit where credit is due.

Interestingly, developers can also access the site’s ccHost Query API. We’re told:

“The ccHost Query API is an open, publicly available interface that is available for public use, especially by 3rd party websites, mobile applications, smart TV appliances and any other network connected device. We here at ccMixter use it to help expose the artists that upload their Creative Commons licensed music to audiences that otherwise would not have access to. The API and software implementation is owned by ArtIsTech Media under a license agreement with Creative Commons. The music itself is owned by the individual artists that uploaded it to the site and agree, through the Creative Commons licenses to share the music through this mechanism.”

Bing, Google, and Yandex are not suited for some types of music search. Enter Dig.ccMixter. Applause, please.

Cynthia Murrell, April 22, 2020

Written by Stephen E. Arnold · Filed Under News, Rich media, Search, search engine | Comments Off on Dig.ccMixter for Royalty-Free Tunes

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
