Can Enterprise Search Improve Governance? Security?

December 10, 2020

I thought about this question after I read “BA Insight Delivers Internet-Like Search for Egnyte Customers.” The write up is a content marketing item with some jazzy jargon; for example:

AI-driven enterprise search
Connector-driven software portfolio
Intelligent recommendations
Machine learning
Natural Language
User behavior
User productivity.

What is, I ask myself, AI driven enterprise search? I don’t know what AI means, and I still have not figured out what “enterprise search” means after writing The New Landscape of Search and a number of other books and monographs on this subject.

My recollection is that Attivio has been wrapping layers of functionality around Lucene, but maybe my recollection is faulty. I do recall the interesting business intelligence application which pivoted on baseball data.

But that was in 2007 when former Fast Search & Transfer professionals pivoted from ESP (enterprise search platform) to Attivio. Attivio’s founder told me “attivio” was an Italian-like word which implies forward movement. Today a jaunty MBA would call this “kinetic branding.” Whatever.

The focus of the marketing collateral is a deal with an outfit involved in resolving content chaos and delivering information cohesion. I am not exactly sure what this means, but here is the description offered by Attivio’s partner / licensee Egnyte:

Your files contain your most critical data, but, more than ever, they’re sprawled across disconnected systems, devices, locations, and apps. Egnyte enables you to gain visibility and control across a hybrid content stack while also improving employee experience and driving business advantage.

Egnyte is in the compliance business, the data governance business, the risk reduction business, and the cyber security business. But the key value proposition seems to be:

Unified multi cloud content search

Specifically:

Egnyte is the only all-in-one platform that combines data-centric security and governance, AI for real-time and predictive insights, and the flexibility to connect with the content sources and applications your business users know and love – on any device, anywhere, without friction.

The words “only” and “all” are blinking yellow lights to me. Categorical affirmatives are tough for me to accept. These types of “make a case” statements are, however, popular with the millennials and thumbtypers in marketing departments.

I took a look at one of the buzzwords used to describe the Egnyte system powered in part by Attivio and learned that these are the functions the platform delivers:

  • Breach reporting
  • Classification policies (for GDPR compliance, CCPA, HIPAA, etc.)
  • Content lifecycle management
  • Content safeguards
  • Custom keyword classification
  • Data subject access requests
  • Issue detection and alerting
  • Insider threat and ransomware detection
  • Multi-repository governance.

The combination of cyber security and search is interesting. However, the cyber security sector seems to have some explaining to do. Cyber crime, particularly insider threats and phishing, is experiencing a bad actor gold rush. Adding to the woe are reports of a cyber security firm’s inability to prevent a crippling cyber attack; specifically, “U.S. Cybersecurity Firm FireEye Discloses Breach, Theft of Hacking Tools.” What this means is that cyber security super stars are not secure. Thus, a firm which is a relative newcomer to cyber security and equipped with “only” and “all” assertions may face some interesting questions about the security of Egnyte and Attivio systems. I know I would ask some questions and carefully consider the responses. Insider threats and phishing are topics of interest to me.

Several observations:

  • Search vendors are indeed working overtime to find markets for what is a downloadable utility function
  • Partnerships are one way to generate sales leads and revenue from technical services and training
  • Organizations, regardless of type, face significant findability, security, and regulatory challenges.

Interesting play, but “only” and “all” are big concepts, particularly when Amazon AWS, to cite one example, offers technology to deliver a similar solution directly or via its extensive partner network.

Stephen E Arnold, December 10, 2020

OpenText: A Cyber Graphic Points to Its Future

December 9, 2020

When I think of OpenText, here’s what flashes through my mind:

  • BRS (Livelink)
  • Fulcrum
  • Hummingbird
  • InQuery
  • nQuire
  • Recommind
  • SGML search.

My recollection is that there may be a Web search engine, a search system for law firm email, and a database from Information Dimensions. I cannot recall, but the message seems clear:

OpenText is a company deeply involved in search and retrieval.

When I read “Mark J. Barrenechea Keynote: The Future of Cyber Resilience”, I realized that I am thinking about the “old” OpenText. What do I mean by “old”? That “old” OpenText was an enterprise search vendor wrapped in search-based applications like eDiscovery and content management.

Not any more.

Here’s the new OpenText:

image

Yep, the Rona, cyber security, health, and “agility, flexibility, and trust.” Who knew? Ice skaters call this a counter turn.

Stephen E Arnold, December 9, 2020

LinkedIn Reveals Disinterest in Search and Retrieval

December 7, 2020

LinkedIn does quite a bit of info-ramming when either one of my team or I log in to the Microsoft social media system. Here’s the graphic displayed when we were checking to see if our automated posts from this blog were appearing:

image

The eight “cards” tell me about LinkedIn Groups in which I may have an interest. The little boxes reveal a small amount of information about the content access topics in which the unemployed, the consultants cruising for gigs, and the self-promoters have an interest.

The table below presents some of the data in this graphic in tabular form. No, I did not use Excel 365 connected to Teams. Sorry, Mother Microsoft. I still recall Bob. (You remember Bob, don’t you, gentle reader?)

LinkedIn Group Name                                   Number of LinkedIn Followers
Data Science Central                                  374,694
Association for Intelligent Information Management    27,861
Scientific, Technical, Medical Publishing Group       12,253
Data & Text Analytics Professionals                   12,503
Special Libraries Asso.                               15,191
Semantic Web                                          15,098
Semantic Technologies Group                           3,772
Enterprise Search & Discovery                         624

LinkedIn does not reveal the hard count for its total number of registered humans, the number of human users who log on to the system once per week, or the number of paying human users. Hence, figuring out the percentage of LinkedIn members interested in these groups is a difficult task akin to predicting the share price of Palantir Technologies on January 1, 2022.

An outfit called Oberlo reports with confidence that LinkedIn has 660 million users. Close enough for horseshoes.

The table below presents the percentage of these LinkedIn users interested in each of the groups suggested to me:

LinkedIn Group Name                                   Percentage of LinkedIn Members Interested in These Topics
Data Science Central                                  0.0567718182%
Association for Intelligent Information Management    0.0042213636%
Scientific, Technical, Medical Publishing Group       0.0018565152%
Data & Text Analytics Professionals                   0.0018943939%
Special Libraries Asso.                               0.0023016667%
Semantic Web                                          0.0022875758%
Semantic Technologies Group                           0.0005715152%
Enterprise Search & Discovery                         0.0000945455%
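
For readers who want to check the arithmetic, here is a minimal Python sketch of how percentages like these can be reproduced. It assumes the follower counts from the first table and the Oberlo figure of 660 million users as the denominator. The names GROUP_FOLLOWERS, TOTAL_USERS, and share_percent are made up for this illustration.

```python
# Follower counts copied from the first table in this post.
GROUP_FOLLOWERS = {
    "Data Science Central": 374_694,
    "Association for Intelligent Information Management": 27_861,
    "Scientific, Technical, Medical Publishing Group": 12_253,
    "Data & Text Analytics Professionals": 12_503,
    "Special Libraries Asso.": 15_191,
    "Semantic Web": 15_098,
    "Semantic Technologies Group": 3_772,
    "Enterprise Search & Discovery": 624,
}

# Assumption: the Oberlo estimate of 660 million LinkedIn users.
TOTAL_USERS = 660_000_000


def share_percent(followers, total=TOTAL_USERS):
    """Return followers as a percentage of the total user count."""
    return followers / total * 100


# Print one line per group, using ten decimal places as in the table.
for name, followers in GROUP_FOLLOWERS.items():
    print(f"{name}: {share_percent(followers):.10f}%")
```

Swap in a different estimate for TOTAL_USERS and every percentage moves, but the ranking of the groups stays the same.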

Eyeballing my math, surely there are errors. How can such a compelling subject as Enterprise Search & Discovery appeal to 0.0000945455 percent of the LinkedIn members?

What’s interesting is that an astounding 0.0042213636 percent of the LinkedIn membership are pulled to the Association for Intelligent Information Management.

And the semantic topics. Magnetic indeed.

What does the analysis suggest? Anyone looking for a job in enterprise search may want to spin their expertise a different way.

Stephen E Arnold, December 7, 2020

Fess Up: Elasticsearch Is a Threat to Proprietary Search and Retrieval

December 1, 2020

We have been poking around the world of Elasticsearch-based information retrieval systems. There are some interesting plays; that is, entrepreneurs use Elasticsearch (Shay Banon’s open source system) as a platform.

Fess provides Elasticsearch for personal use, although one can employ the system for an organization. The system is:

Fess is Elasticsearch-based search server, but knowledge/experience about Elasticsearch is NOT needed because of All-in-One Enterprise Search Server. Fess provides Administration GUI to configure the system on your browser. Fess also contains a crawler, which can crawl documents on Web/File System/DB and support many file formats, such as MS Office, pdf and zip.

Fess became available in 2019. The CEO of the N2SM, Inc. company is Masaharu Manabe. Demonstrations and links to the code are available at this link. A fee-based version of the software is provided under the name N2 Search. More information about the for fee version is here. A discussion forum is available at this link.

Observation: The Elasticsearch ecosystem is providing alternatives to the proprietary search systems. Beyond Search thinks that some vendors of proprietary search software are likely to see Elasticsearch as digital kudzu. Good news or bad news for the Coveos, Fabasofts, and Microsoft Fast type folks? That’s a question some of these types of vendors’ stakeholders may be asking as they beat the bushes for deals in customer service, chatbots, business intelligence, and smart software services.

Stephen E Arnold, December 1, 2020

OpenText: The New Equilibrium. Think How? What?

November 27, 2020

I read a weird content-marketing, predict-the-future article called “OpenText CEO: Organizations Must Rethink Approach to Business, Technology.” OpenText is interesting for a number of reasons. It is a Canadian outfit. The company owns more search and retrieval systems than one can remember: Fulcrum, BRS, Dr. Tim Bray’s SGML search, and others. There are content management systems which once shipped with an Autonomy stub. I dimly recall that OpenText was into Hummingbird and maybe Information Dimensions too.

Wow.

Now a company which ostensibly sells content management is suggesting that there is a “new equilibrium” on deck for 2021. Fascinating. I am not sure about the old equilibrium, which seemed slightly crazy to me, but, hey, I am just reading what a Canadian outfit sees coming. I would prefer that the said Canadian outfit invest in enhancing the technologies it has, but I am flawed. That’s probably part of the old equilibrium.

The write up reports that the new equilibrium is part of the great rethink:

We are going through the fastest technology disruption in the history of the world. The shift to Industry 4.0 had already resulted in a huge increase in connectivity, automation, AI, and computing power. The response to COVID-19 has accelerated this process and forever changed the business environment.

Okay. How is that working out?

The pandemic has also forced a huge shift in time-to-value. Five years ago, companies would wait two years to deploy an ERP system. Now, the expectation is that you will have a solution in weeks, or even days.

Ah, ha. New system deployments have to be done faster. Is this an insight? I thought James Gleick’s Faster explained this process 20 years ago. It seems as if the OpenText insight has moved slowly through the great Canadian intellectual winter. Where is the management guru who lived on a sailboat in Canada when one needs him?

The new equilibrium for OpenText sounds a whole lot like Amazon Web Services or the Microsoft Azure “blue” thing. I noted:

These cloud solutions enable businesses to re-invent processes and seize emerging opportunities faster, easier, and more cost-effectively. Developer Cloud is particularly exciting. It will provide a platform for developers to create custom solutions to manage information, and will help build a community of innovators working together to create better enterprise applications.

From my point of view, this content marketing fluff has not changed my perception of OpenText which is:

OpenText software applications manage content or unstructured data for large companies, government agencies, and professional service firms.

Services, new equilibrium, rethink. Got it. Enterprise search. Jargon.

Stephen E Arnold, November 27, 2020

Elastic: The Add Value to Open Source Outfit Bounces Along

November 25, 2020

Elastic Adds New Features to Enterprise Search, Observability, and Security Solutions

Search and data-management firm Elastic has some new features to crow about. BusinessWire posts “Elastic Announces Innovations Across its Solutions to Optimize Search and Enhance Performance and Monitoring Capabilities.” One new tool is Kibana Lens, a visual data analysis tool with a drag-and-drop interface described as intuitive. There is also a beta launch of searchable snapshots, described as an efficient way to manage data storage tiers. The press release tells us:

“New expanded Elastic Observability features, including user experience monitoring and synthetics, give developers new tools to test, measure, and optimize end-user website experiences. The launch of a new dedicated User Experience app in Kibana provides Elastic customers with an enhanced view and understanding of how end users experience their websites. In addition, Elastic customers can use the new user experience monitoring feature to review Core Web Vitals, helping website developers interpret digital experience signals. Elastic users can also leverage a dev preview release of synthetic monitoring in Elastic Uptime to simulate complex user flows, measure performance, and optimize new interaction paths without impact to a website’s end users. The combination of these two new observability features gives Elastic customers a deeper view of their customers’ digital experience before and after a site update is deployed.”

See the write-up for its list of specific updates and features to Elastic’s Enterprise Search, Observability, Security, Stack, and Cloud products. Built around open source software, the company prides itself on its user-friendly products that have been adopted by major organizations around the world, from Cisco to Verizon. Elastic began as Elasticsearch Inc. in 2012, simplified its name in 2015, and went public in 2018. The company is based in Mountain View, California, and maintains offices around the world.

Cynthia Murrell, November 25, 2020

Enterprise Search: Still Crazy after All These Years

November 20, 2020

This is not old wine in new bottles. This is wine in those weird clay jars with the nifty moniker “amphora” filled with Oak Leaf Vineyards Sauvignon Blanc White Wine. Cough, cough.

CMS Wire gets it correct when it declares, “Scanning and Selecting Enterprise Search Results: Not as Easy as it Looks.” The article doesn’t even approach the formation of a query—finding the right wording then tweaking filters and facets to produce a manageable list. Here we are only looking at the next step. Though the task seems simple on its surface—scan a list of results and select the most relevant ones—writer Martin White explains why it is not so straightforward.

First is scanning results. Users’ perceptual speed differs, so for some folks (like those who are dyslexic, for example) the process can be so tedious as to make searching pointless. White tells us that inconvenient fact is often overlooked in the discussion of search functionality. Also under-considered is the issue of snippet length. A bit of research has been performed, but it involved web pages, which are themselves more easily scanned and assessed than content found in enterprise databases. Those documents are often several hundred pages long, so ranking algorithms often have trouble picking out a helpful snippet. Some platforms serve up a text sequence that contains the query term, others create computer-generated summaries of documents, and others reproduce the first few lines of each document. Each of these approaches is imperfect. Still others produce a thumbnail of a whole page that contains the search term, and that probably helps many users. However, there are accessibility problems with that method.

White concludes:

“We know from recent research that people may make different decisions from the information they perceive initially as relevant based on their expertise. Equally, most search metrics are based around the notional relevance of the results being presented in response to a query. If the true value of relevance cannot be well judged from the snippet, that calls any metrics associated with query performance (especially precision) into question.

“There are no easy solutions to the issues raised in this column. In the quest for achieving an acceptable user experience the points to consider are:

*Are the techniques used by the search application to create snippets appropriate to the types of content being searched?

*Can the format of snippets be customized by the user?

*How easy is it to scan and assess results from a federated search?

“In the final analysis, it doesn’t matter how sophisticated the search technology is (in terms of semantic analysis, etc.). What matters is if the user can make an informed judgment of which piece of content in the results serves their information requirement, reinforces their trust in the application and maintains the highest possible level of overall search satisfaction.”

Sigh. It seems the more developers work on enterprise search, the more complicated it is to effectively operate. The field has been at it for 50 years, and is still trying to deliver something useful. Still crazy after all these years too.

PS. Our esteemed check writer (Stephen E Arnold) wrote a book about enterprise search with the author of the source document. No wonder this essay seemed weirdly familiar. I had to proofread what turned out to be prose that made the Oak Leaf stuff welcome at the end of an editing day. Cough, cough, eeep. 

Cynthia Murrell, November 20, 2020

Survey Says Data Governance Is Important. But What Is Data Governance?

November 20, 2020

Here’s what the Google says governance means: The action or manner of governing. Okay, but what exactly is governing? Google says: Having authority to conduct the policy, actions, and affairs of a state, organization, or people.

Okay, now let’s add the magic word “data,” which is a plural, not a single thing. (That’s what datum means, right?)

Google says: Facts and statistics collected together for reference or analysis.

Let’s put the information together, shall we?

An organization uses authority to conduct policy, actions and affairs to deal with facts and statistics for reference or analysis.

Why care? The answer is found in “Businesses Positive about Data Governance but Still Struggle with Privacy Concerns.”

Okay, now we have linked dealing with information and privacy. This is getting interesting or is it? I go with the “not interesting,” but let’s plod forward in the write up.

A vendor of search and retrieval software sponsored a research project conducted by Standard & Poor 451 Research. Note: That report is titled “Pathfinder Report Market Intelligence: Information Driven Compliance and Insight. Two Sides of the Same Coin.” I am not sure about the “coin” metaphor, compliance, insight, and pathfinding. But no one ever accused me of understanding mid-tier consulting firms, sponsored research, and 18 year old vendors of proprietary search and retrieval software.

The 451 outfit tapped its pool of “survey responders” and discovered:

72 percent of enterprises believe data governance is an enabler of business value rather than a cost center.

Okay, that’s a lot of enterprises, assuming the sample was statistically valid, the questions not shaped, and the data analysis of the survey responses was performed on the up and up. But sponsored research is different from the often wonky academic research churned out by professors and work-from-home students. That’s better, right? 

I learned:

  • One in four organizations have more than 50 distinct data silos
  • 37 per cent of respondents say having relevant information automatically displayed, when the team needs it, would benefit them the most in the pursuit of automation.
  • Budget, privacy issues, and expertise are barriers. 

How does one deal with data silos, which I assume is “governance”? How does one deal with security? Privacy? How does an enterprise search company cope with the assorted sixes and sevens of data in an organization; for example, tweets, encrypted messages, images, geospatial data, videos, and information which must be kept isolated from the grubby “let’s federate information” crowd? (Why must some data be isolated? Find an attorney. Ask her what happens if information in a legal matter is out of her span of control.)

What’s the net net of the mid-tier consulting outfit’s report? Here it is:

Success requires alignment of business objectives by looking for common-denominator requirements across business units.

Let me be clear: Enterprise search is not the solution to problems with an “authority to conduct policy, actions and affairs to deal with facts and statistics for reference or analysis.”

Enterprise search is information retrieval, not data governance, no matter how much a marketer wishes it were. Enterprise search vendors have been struggling for relevance because Lucene/Solr are good enough and users want information to address right-now business issues. Library style lists of stuff to read or look up may not ring the chimes of a thumb typing user.

Want the full report? Go here. Please, keep marketing and governance separate. Statistics 101 offered some useful guidelines. Some, however, did not pay attention. You will have to register. Marketing is still marketing.

Stephen E Arnold, November 20, 2020

Voyager Search Tapped for USDA Search and Discovery Project

November 4, 2020

Low-profile enterprise search company Voyager Search just made an important deal with a high-profile government agency. AIThority announces, “New Light Technologies and Voyager Search Team Win New Contracts with the U.S. Department of Agriculture to Implement Data Search and Discovery Solutions.” Voyager’s partner in the project, New Light Technologies (NLT), is a consulting firm working in the areas of cloud tech, cybersecurity, software development, data analytics, geospatial tech, and scientific R&D. The write-up reports:

“Access to accurate information is crucial to the department’s mission to support sustainable agriculture production and protection of natural resources. Both NLT and Voyager Search bring many years of experience developing award-winning federal data integration and dissemination platforms and will build federated data search solutions to index and link disparate cloud-based and on-prem data sources, including large repositories of imagery and geospatial data files that are used for a variety of analytical reporting and data dissemination systems, such as the Global Agricultural Information Network, Global Agricultural & Disaster Assessment System, Crop Explorer, and the Geospatial Data Gateway. Leveraging NLT and Voyager Search’s Professional Services Department and Vose technology which provides robust spatial search capabilities, the team’s solution will enable users to search for data, content, and documents by who, what, when, and where. Together, the team is providing the technology and services to advance a modern data architecture for the department that will support improved information flow, security, and analysis as well as power the Artificial Intelligence (AI) and Machine Learning (ML) of the future.”

“Voyager” is a popular name for a business, so do not confuse Voyager Search with other enterprises like digital innovation firm Voyager, manufacturer Voyager Industries, or even the Voyager Company that pioneered CD-ROM production back in the day. Vose is the name of Voyager Search’s platform that will be used for the USDA project, but the company also offers Server, essentially Vose for larger implementations, and ODN (Open Data Network), a searchable global-content catalog. Both products build on Vose’s “smart spatial search” technology. Based in Redlands, California, Voyager Search was founded in 2008.

Cynthia Murrell, November 4, 2020

Persistent: Enterprise Search and Cloud Expertise

October 22, 2020

I checked my enterprise search files. Sure enough, Persistent Systems is in the enterprise search game. You can get a sense of the firm’s consulting orientation if you download and study “An Essential Primer on Enterprise Search Evaluation.” Yep, evaluation. Most organizations have employees who need to locate information: Text, videos, PowerPoints on clever sales professionals’ work laptops, documents generated by the less-than-forthcoming legal department, and information about recreational softball in the Era of the Rona. We noted that the company acquired Capiot. This is a company which provides integration services. To sum up, “enterprise search” appears to be a consulting services operation at Persistent. With a workable search solution available as open source, renting people who can allegedly make search perform magic tricks seems logical. But what about rich media, tweets, silos of data, and uncooperative sales professionals who tweak slide decks moments before making a pitch to up the chances for a sale? Let’s not dig too deeply into the contents of the “Essential Primer,” shall we? Enterprise search appears to be a synonym for “consulting.”

Stephen E Arnold, October 22, 2020
