
Do Search and CMS Deliver a Revenue Winner?

August 21, 2015

I spotted a write up called “Look for Enterprise Search, Analytics and These ECM Leaders for Your Transactional Content.” I found the article darned amazing even for public relations about a mid tier consulting firm and one of its analyses.

The main point of the article is that analysts have analyzed enterprise software and identified vendors who provide “ECM Transactional Content Services.” Fabricating collections of objects and slapping a jargon laden label on the batch is okay with me.


Empty calories await you, gentle reader.

What struck me as interesting was this statement:

Forrester Vice President and Principal Analyst Craig Le Clair points to key advancements and opportunities by the leading ECM providers to help enterprises realize greater value in these systems:

  • Ramping analytics to drive insight and reduce administrative burden
  • Accelerating their move to cloud
  • Improved search and content sharing
  • Using stronger and more open application program interfaces (APIs) that spur innovation
  • Moving quickly to fill gaps in their mobile road maps.

Notice the “ECM”. The acronym refers to software which provides editing, access, and publishing functions to its users. The idea, it seems, is that an employee will write a memo and the ECM will keep track of the document. In practice, based on my experience, the ECM recipe usually fails to satisfy my hunger.

ECM and its close cousins in acronym land are similar to the approach articulated by my kindergarten teacher more than half a century ago. She said, according to my mother, “Keep your mittens and lunch in your cubby.” The spirit of the kindergarten teacher lives on in enterprise content management systems.

Unfortunately those who have work to do often create content using tools suited for a specific task. For an engineer, that tool might be Solidworks. Bench chemists are often confused when an ECM is described as the tool for their work. One chemist said to me after an enthusiastic presentation by an information technology person, “I work with chemical structures. What’s this person talking about?” Lawyers in the midst of big risk litigation want to use their own and often flawed document systems.  Even the marketer who cheers for ECM for Web content parks some high value data in that wonderful Adobe creative cloud with some back up data on iCloud. I have spotted a renegade analyst with an off the books workstation equipped with an Australian text processing and search system. is notable for what is not available because executive brand entities roll their own content solutions.

I was able to review a copy of the consultant report upon which the article was based. Wowza. The write up assembled a grab bag of widely disparate companies, added three cups of buzzwords, and mixed in one kilo of MBAisms.

To be fair, the report identified “challenges.” These items baffled me. For example, “Deep experience in key transactional applications.” This is a challenge, really?

But the vendors in the report are able to “address emerging opportunities.” Okay, so these are not opportunities. The opportunities are emerging. Hmmm. Here’s an example: “Ramping analytics to drive insight and reduce administrative burden.” Yikes. Ramping analytics. Driving analytics. Reducing administrative burden. Very active stuff this ECM. Gerund alert. Gerund alert.

What companies are into this suite of challenges and emerging opportunities? Here’s the list of the mid tier touted stallions from the ECM stable:

  1. EMC, a company which is considering having a subsidiary of itself purchase the parent company. Folks, when a company does this type of recursive stuff, the core business might be a little bit uncertain.
  2. HP. Yep, an outfit which has lost its way, suffered five consecutive quarters of declining revenue, and bought a company for $11 billion and then wrote off most of that expense because the sellers of the company fooled HP, its consultants, accountants, and lawyers. Okay. A winner for the legal eagles maybe.
  3. IBM. Heaven help me. IBM has suffered declining revenues for 13 consecutive quarters, annoyed me with a blizzard of Watson silliness, and spent lots of time getting rid of businesses. I have a difficult time believing that IBM can manage enterprise content. But, hey, that’s just my rural Kentucky ignorance, right?
  4. Laserfiche. The company offers a “flexible, proven enterprise content management system.” I believe this statement. The company was founded in 1987 and sure seems to have its roots in well seasoned technology. The company has lots of customers and lots of awards. The only hitch in the git along is that I never ran across this outfit in my work. Bad luck I guess.
  5. Lexmark. Folks, let us recall the rumor that Lexmark and its content businesses are not money makers. I heard that the content cluster achieved an astounding $70 to $80 million shortfall. Who knows if this rumor is accurate. I do know that Lexmark is cutting staff, and one does not take this drastic step unless one needs to reduce costs pronto.
  6. M Files. I never heard of this outfit. I did a quick check of my files and learned that the company “helps enterprises find, share, and secure documents and information. Even in highly regulated industries.” The company is also “passionate about productivity.” The outfit relies on dtSearch for information access. This is okay because dtSearch can process most of the content within a Microsoft-centric environment. But M Files strikes me as a different type of outfit from HP or IBM. As I flipped through the information I had collected, the company struck me as a collection of components. Assembly required.
  7. Newgen Software. Another newbie for me. The company was in my Overflight archive. The firm provides BPM (business process management), ECM (enterprise content management), DMS (I have no idea what this acronym means), CCM (I have no idea what this acronym means), and workflow (I thought this was the same as BPM). The company operated from New Delhi. My thought? Another collection of components with assembly in someone’s future.
  8. Hyland OnBase. This is the third outfit on the list about which I have a modest amount of information. The company says that it is a “leader in ECM.” I believe it. The firm’s url is the same as its flagship product. The company was founded in 1991 and created OnBase, which is a plus. After 25 years, the darned thing should work better than a Rube Goldberg solution assembled from a box of components.
  9. OpenText. Okay, OpenText is a company which has more search engines and content processing systems than most Canadian firms. The challenge at OpenText is having enough cash to invest in keeping the diverse assortment of systems current. Which of these systems is the one referenced in the mid tier firm’s report? SGML search, BASIS, BRS, Nstein, the Autonomy stub in RedDot, Fulcrum, or some other approach? Details can be important.
  10. Unisys. Okay, finally a company that is essentially an integrator which still supports Burroughs mainframes. Unisys can implement systems because it is an integrator. For government work, Unisys matches the statement of work to available software. Although some might question this statement, Unisys can implement almost any kind of system eventually.

Several observations:

First, enterprise content management is a big and fuzzy concept. The evidence of this is the number of acronyms some of the companies use to explain what they do. I assume that it is my ignorance which prevents me from understanding exactly how scanning, indexing, retrieval, repurposing, workflow, and administrative functions work in a cost constrained, teleworker, mobile gizmo world.

Second, open source is knocking on the door of this sector. At some point, organizations will tire of the cost and complexity of collections of loosely federated and integrated software subsystems and look for an alternative. Toss in the word Big Data, and there will be a stampede of New Age consultants ready to step forward and reinvent these outfits. Disruption is probably less of a challenge than the challenge of keeping existing revenues from doing the HP, IBM, and Lexmark drift down.

Third, the search function seems to be a utility or an after thought. The only problem is that search does not work particularly well in an enterprise where the workers log in from Starbucks and try to interact with enterprise software from a Blackberry.

Fourth, what an odd collection of outfits. HP, IBM, and Lexmark along with 30 year old imaging firms plus some small outfits. Maybe the selection of firms makes sense to you, gentle reader. For me, the report makes evident the struggles of some experts in ECM, BPM, and the acronyms I know zero about.

In short, this mid tier report strikes me as a russische punschtorte. On the surface, the darned thing looks good, maybe mouth watering. After a chomp or two, I want a paprikahenderl.

This ECM thing is a confection, not a meaty chicken. Mixing in search does nothing for the recipe.

Stephen E Arnold, August 22, 2015

A Wit Emulates Chip Foose with a Magic Gartner Touch Up

August 18, 2015

I am out of the loop when it comes to mid tier consulting firms and analyzing their outputs on a regular basis. Much of the content is thinly disguised word play designed to generate sales leads. Think data lake and customer care.

I am out of the loop on almost everything. When my squirrel powered Internet connection works, I am lucky if I can see 10 percent of a page before my connection fags out.

I was able to read a story I found subjective and (okay, I admit it) amusing. If Jack Benny were alive, his writers might have incorporated the information into a Buck Benny episode with Andy Devine explaining the Gartnerian motion.

The write up was “Yellow and Blue Circles, Red Arrows Add to Gartner’s Magic Quadrant.” To me, I thought immediately of Chip Foose and his ability to take a ho hum vehicle, add some Foose-tian touches, and thrill the owner. I see these make overs as a Foose-tian deal with the devil, however. I like the autos as they are.

The write up pivots on a person who took two Gartner magic thingamajigs and tried to figure out what changed between two reports about something called “integrated systems.” I don’t know what that means, but as previously stated, I am in rural Kentucky.

Nothing catches the eye like an annual matrix analysis touch up. These are expensive and, in the end, subjective. Congruence? Similarity?

The point of the write up struck me as:

Incidentally, Valdis Filks, Gartner’s lead on the Magic Quadrant reports, tells us how to position and interpret MQS in Gartner’s set of reports about supplier ranking and positioning: “The MQ is focused on the ability of vendors to succeed in a specific market, rather than about products. There is lots in it about understanding customer needs, direction, service, marketing, support, innovation and many other criteria.”

I think I understand. Subjective decisions. No problemo.

The plotting of two magic whatevers (MWs) revealed:

There’s been a lot of movement and five new entries. Two previous entries have disappeared as well; Bull and Unisys.

Wasn’t Bull the Amesys owner? Doesn’t Unisys maintain Burroughs’ computers? I thought IBM did Linux mainframes and Watson? And Huawei? Okay.

My take pivots on this question, “What the heck is an integrated system?” Everything, including the choice of arrow color, seems somewhat arbitrary. General Eisenhower’s box, as I recall, relied sometimes on actual data.

What did that guy know? Probably not much about integrated systems.

Stephen E Arnold, August 18, 2015

Mid Tier Consulting Firm Identifies the New Economy

August 16, 2015

The information economy is officially over. There is a new economy in town, and you need to listen up. I can hear R Lee Ermey, the drill sergeant in Full Metal Jacket saying, “You are dumb, Private Pyle, but do you expect me to believe that you don’t know left from right?”

I read “Big Data Fades to the Algorithm Economy.” I thought Big Data was the future. Guess I was wrong again. I learned that “Algorithms are all around us.” Gee, I did not know that.

I learned:

For organizations, the opportunity will first center on monetizing their proprietary algorithms by offering licensing to other non-competing organizations. For example, a supply chain company can license its just-in-time logistics algorithms to a refrigerator manufacturer that seeks to partner with a grocery chain to automatically replenish food based on your eating habits. Why invent or slowly develop sophisticated algorithms at huge cost when you can license and implement them quickly at low cost?

There is “fevered questioning” underway. Gee, I did not know that. In my experience, I see more “fevered confusion.”

But the fix is “proprietary algorithms.”

Okay, what is an algorithm?

A formula or set of steps for solving a particular problem. To be an algorithm, a set of rules must be unambiguous and have a clear stopping point. Algorithms can be expressed in any language, from natural languages like English or French to programming languages like FORTRAN.

It seems to me that a procedure qualifies as an algorithm if it works. A proprietary algorithm is, I assume, a trade secret.
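
The two tests in the quoted definition, unambiguous steps and a clear stopping point, are easy to see in a textbook case. A minimal Python sketch, not taken from the write up, using Euclid's greatest common divisor method as the example:

```python
def gcd(a: int, b: int) -> int:
    """Euclid's method: every step is unambiguous and the loop must end."""
    a, b = abs(a), abs(b)
    while b != 0:        # clear stopping point: the remainder reaches zero
        a, b = b, a % b  # the remainder strictly shrinks on every pass
    return a


print(gcd(48, 18))
```

Nothing proprietary here. The recipe has been in the standard texts for a very long time.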

I am delighted that math is not involved; otherwise, there is the problem with companies built on procedures endlessly recycled from books like Algorithms in a Nutshell, Algorithms, The Art of Computer Programming, and other standard texts with loads of numerical recipes.

I understand. The mid tier consulting firm is defining a business process as an algorithm. Magic. Well, if Big Data won’t sell, if enterprise search won’t sell, if content management won’t sell, just go with algorithms.

How quickly will McKinsey, Bain, BCG, Booz, and SRI pick up on this conceptual breakthrough? Any moment now. I assume each company’s blue chip consultants know left from right — usually. Other outfits may get confused. Left? Right?

Stephen E Arnold, August 16, 2015

Enterprise Search: You Cannot Do It Yourself, People.

July 31, 2015

I love write ups like “Don’t Settle When It Comes to Enterprise Search Platforms.” These articles are designed to arm consulting firms with the marketing flim flam which positions each as an “expert” in enterprise information access. I would not be surprised to find copies of this article in the peddler kit of search sales professionals.

The main point of the write up is that enterprise search is a “platform.” Because there are options, no self respecting company will try to implement search without the equivalent of the F Troop in mid tier or below consultants.

I noted:

Let’s look at two very common workarounds some have tried, and then we will talk about why you must go with a reputable developer when you make your final decision.

When I read this, I wondered if the “expert” were familiar with the Maxxcat line of enterprise search systems or the Blossom hosted solution.

The write up dismisses an open source solution, apparently unaware of the research by Diomidis Spinellis and Vaggelis Giannikas published in the Journal of Systems and Software, March 2012, pages 666 to 682. That’s okay. My hunch is that those finding the “Don’t Settle” article compelling are not likely to be interested in researchy type stuff.

One of the more interesting segments in the write up is the assertion that scalability is a “given.” Hmmm. In my experience, there are some ongoing enterprise search challenges: Scalability is one facet of a nest of vipers which includes my favorite reptile, indexing latency.

The article states:

Open source platforms are only as scalable as their code allows, so if the person who first made it didn’t have your company’s needs in mind, you’ll be in trouble. Even if they did, you could run into a problem where you find out that scaling up actually reveals some issues you hadn’t encountered before. This is the exact kind of event you want to avoid at all costs.

I don’t want to rain on this parade of “information,” but every enterprise search system which I have had the pleasure of procuring, managing, investigating, and analyzing has scalability problems.

The reason is simple: The volume of changed information and the flow of new information go up. Whatever one starts with is rather rapidly choked. The solutions are painful: Spend more or index less.

I am not confident that one who follows the advice of certain experts will find his or her enterprise search journey pleasant. On the other hand, there are opportunities as Uber drivers one can pursue.

Stephen E Arnold, July 31, 2015

PowerPoint Enabled Big Data Presenters Rejoice

July 27, 2015

Navigate to “A Plethora of Big Data Infographics.” Note that the original write up misspells “plethora” as “pletora” but, as many in Big Data say, “it is close enough for horseshoes.”


I quit browsing after a baker’s dozen of these puppies. If you want to be an expert in Big Data, these charts will do the trick. I would steer clear of a person with a PhD in statistics, however.

Stephen E Arnold, July 27, 2015

Forbes and Some Big Data Forecasts

July 26, 2015

Short honk: For fee, mid tier consultants have had their thunder stolen. Forbes, the capitalist tool, wants to make certain its readers know how juicy Big Data is as a market. Navigate to “Roundup Of Analytics, Big Data & Business Intelligence Forecasts And Market Estimates, 2015.”

The write up summarizes the eye watering examples of spreadsheet fever’s impact on otherwise semi-rational MBAs, senior managers, and used car sales professionals. IDC, without the inputs of Dave Schubmehl, comes up with a spectacular number: $125 billion in 2015.

Sounds good, right?

The data will find their way into innumerable PowerPoint presentations. Snag ‘em while you can.

Stephen E Arnold, July 26, 2015

Big Data Basics: Garbage In, Garbage Out Still a Problem

July 20, 2015

The person writing “Data Integrity: A Sequence of Words Lost in the World of Big Data” appears to be older than 18. I don’t hear too many young wizards nattering about data integrity. The operative concept is that with enough data, the data work out the bumps in the Big Data tapestry. The cloth may have leaves and twigs in it. But when you make the woven object big enough and hang it on a wall in a poorly illuminated chateau, who can tell? Few visitors demand a ladder and a lanthorn to inspect the handiwork.

According to the write up:

The purpose of this post is to highlight the necessity to keep data clean and orderly so that the results of the analysis are reliable and trustworthy – if data integrity is intact, information derived from this data will be trustworthy resulting in actionable information.

Why tackle this topic in a blog for Big Data professionals?

Answer: No one pays much attention. The author saddles up and does the Don Quixote gallop at the Big Data hyperbole windmill.

The article includes a partial list of questions to ask and, keep this in mind, gentle reader, to answer. One example: “Are values outside of acceptable domain values?”
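
The quoted question about domain values is the kind of check that takes a few lines to automate. A minimal sketch in Python, assuming records arrive as dictionaries; the field name, the sample rows, and the allowed set are made up for illustration and do not come from the article:

```python
def out_of_domain(records, field, allowed):
    """Return the records whose value for `field` is not in the allowed set."""
    allowed = set(allowed)
    return [r for r in records if r.get(field) not in allowed]


# Hypothetical sample data: state codes checked against a tiny allowed domain.
rows = [
    {"id": 1, "state": "KY"},
    {"id": 2, "state": "XX"},   # not a valid code in this made-up domain
    {"id": 3, "state": None},   # a missing value also fails the check
]

bad = out_of_domain(rows, "state", {"KY", "OH", "TN"})
print([r["id"] for r in bad])
```

Asking the question is the easy part. Deciding what to do with the rows that fail is where the work begins.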

I found this article refreshing. Take a gander.

Stephen E Arnold, July 20, 2015

Holy Cow. More Information Technology Disruptors in the Second Machine Age!

July 11, 2015

I read a very odd write up called “The Five Other Disruptors about to Define IT in the Second Machine Age.”

Whoa, Nellie. The second machine age. I thought we were in the information age. Dorky machines are going to be given an IQ injection with smart software. The era is defined by software, not machines. You know. Mobile phones are pretty much a commodity with the machine part defined by fashion and brand and, of course, software.

So a second machine age. News to me. I am living in the second machine age. Interesting. I thought we had the Industrial Revolution, then the boring seventh grade mantra of manufacturing, the nuclear age, the information age, etc. Now we are doing the software thing.

My hunch is that the author of this strange article is channeling Shoshana Zuboff’s In the Age of the Smart Machine. That’s okay, but I am not convinced that the one, two thing is working for me.

Let’s look at the disruptors which the article asserts are just as common as the wonky key fob I have for my 2011 Kia Soul. A gray Kia Soul. Call me exciting.

Here are the four disruptors that, I assume, are about to remake current information technology models. Note that these four disruptors are “about to define IT.” These are like rocks balanced above Alexander the Great’s troops as they marched through the valleys in what is now Afghanistan. A 12 year old child could push the rock from its perch and crush a handful of Macedonians. Potential and scary enough to help Alexander to decide to march in a different direction. Hello, India.

These disruptors are the rocks about to plummet into my information technology department. The department, I wish to point out, works from their hovels and automobiles, dialing in when the spirit moves them.

Here we go:

  • Big Data
  • Cloud
  • Mobile
  • Social

I am not confident that these four disruptors have done much to alter my information technology life, but if one is young, I assume that these disruptors are just part of the everyday experience. I see grade school children poking their smart phones when I take my dogs for their morning constitutional.

But the points which grabbed my attention were the “five other disruptors.” I had to calm down because I assumed I had a reasonable grasp on disruptors important in my line of work. But, no. These disruptors are not my disruptors.

Let’s look at each:

The Trend to NoOps

What the heck does this mean? In my experience, experienced operations professionals are needed even at some of the smart outfits I used to work with.

Agility Becomes a First Class Citizen

I did not know that the ability to respond to issues and innovations was not essential for a successful information technology professional.

Identity without Barriers

What the heck does this mean? The innovations in security are focused on ensuring that barriers exist and are not improperly gone through. The methods have little to do with an individual’s preferences. The notion of federation is an interesting one. In some cases, federation is one of the unresolved challenges in information technology. Mixing up security, “passwords,” and disparate content from heterogeneous systems is a very untidy serving of fruit salad.

Thinking about information technology after reading Rush’s book of farmer flummoxing poetry. Is this required reading for a mid tier consultant? I wonder if Dave Schubmehl has read it? I wonder if some Gartner or Forrester consultants have dipped into its meaty pages. (No pun intended.)

IT Goes Bi Modal?

What the heck does this mean again? Referencing Gartner is a sure fire way to raise grave concerns about the validity of the assertion. But bi-modal. Two modes. Like zero and one. Organizations have to figure out how to use available technology to meet that organization’s specific requirements. The problem of legacy and next generation systems defines the information landscape. Information technology has to cope with a fuzzy technology environment. Bi modal? Baloney.

The Second Machine Age

Okay, I think I understand the idea of a machine age. The problem is that we are in a software and information datasphere. The machine thing is important, but it is software that allows legacy systems to coexist with more with it approaches. This silly number of ages makes zero sense and is essentially a subjective, fictional, metaphorical view of the present information technology environment.

Maybe that’s why Gartner hires poets and high profile publications employ folks who might find an hour discussing the metaphorical implications of “bare ruined choirs.”

None of these five disruptions makes much sense to me.

My hunch is that you, gentle reader, may be flummoxed as well.

Stephen E Arnold, July 11, 2015

Enterprise Search and the Mythical Five Year Replacement Cycle

July 9, 2015

I have been around enterprise search for a number of years. In the research we did in 2002 and 2003 for the Enterprise Search Report, my subsequent analyses of enterprise search both proprietary and open source, and the ad hoc work we have done related to enterprise search, we obviously missed something.

Ah, the addled goose and my hapless goslings. The degrees, the experience, the books, and the knowledge had a giant lacuna, a goose egg, a zero, a void. You get the idea.

We did not know that an enterprise licensing an open source or proprietary enterprise search system replaced that system every 60 months. We did document the following enterprise search behaviors:

  • Users express dissatisfaction about any installed enterprise search system. Regardless of vendor, anywhere from 50 to 75 percent of users find the system a source of dissatisfaction. That suggests that enterprise search is not pulling the hay wagon for quite a few users.
  • Organizations, particularly the Fortune 500 firms we polled in 2003, had more than five enterprise search systems installed and in use. The reason for the grandfathering is that each system had its ardent supporters. Companies just grandfathered the system and looked for another system in the hopes of finding one that improved information access. No one replaced anything was our conclusion.
  • Enterprise search systems did not change much from year to year. In fact, the fancy buzzwords used today to describe open source and proprietary systems were in use since the early 1980s. Dig out some of Fulcrum’s marketing collateral or the explanation of ISYS Search Software from 1986 and look for words like clustering, automatic indexing, semantics, etc. A short cut is to read some of the free profiles of enterprise search vendors on my Web site.

I learned about a white paper, which is 21st century jargon for a marketing essay, titled “Best Practices for Enterprise Search: Breaking the Five-Year Replacement Cycle.” The write up comes from a company called Knowledgent. The company describes itself this way on its Who We Are Web page:

Knowledgent [is] a precision-focused data and analytics firm with consistent, field-proven results across industries.

The essay begins with a reference to Lexis, which along with Don Wilson (may he rest in peace) and a couple of colleagues founded. The problem with the reference is that the Lexis search engine was not an enterprise search and retrieval system. The Lexis OBAR system (Ohio State Bar Association) was tailored to the needs of legal researchers, not general employees. Note that Lexis’ marketing in 1973 suggested that anyone could use the command line interface. The OBAR system required content in quite specific formats for the OBAR system to index it. The mainframe roots of OBAR influenced the subsequent iterations of the LexisNexis text retrieval system: Think mainframes, folks. The point is that OBAR was not a system that was replaced in five years. The dog was in the kennel for many years. (For more about the history of Lexis search, see Bourne and Hahn, A History of Online Information Services, 1963-1976.) By 2010, LexisNexis had migrated to XML and moved from mainframes to lower cost architectures. But the OBAR system’s methods can still be seen in today’s system. Five years. What are the supporting data?

The white paper leaps from the five year “assertion” to an explanation of the “cycle.” In my experience, what organizations do is react to an information access problem and then begin a procurement cycle. Increasingly, as the research for our CyberOSINT study shows, savvy organizations are looking for systems that deliver more than keyword and taxonomy-centric access. Words just won’t work for many organizations today. More content is available in videos, images, and real time almost ephemeral “documents” which can be difficult to capture, parse, and make findable. Organizations need systems which provide usable information, not more work for already overextended employees.

The white paper addresses the subject of the value of search. In our research, search is a commodity. The high value information access systems go “beyond search.” One can get okay search in an open source solution or whatever is baked in to a must have enterprise application. Search vendors have a problem because after decades of selling search as a high value system, the licensees know that search is a cost sinkhole and not what is needed to deal with real world information challenges.

What “wisdom” does the white paper impart about the “value” of search? Here’s a representative passage:

There are also important qualitative measures you can use to determine the value and ROI of search in your organization. Surveys can quickly help identify fundamental gaps in content or capability. (Be sure to collect enterprise demographics, too. It is important to understand the needs of specific teams.) An even better approach is to ask users to rate the results produced by the search engine. Simply capturing a basic “thumbs up” or “thumbs down” rating can quickly identify weak spots. Ultimately, some combination of qualitative and quantitative methods will yield an estimate of  search, and the value it has to the company.

I have zero clue how this set of comments can be used to justify the direct and indirect costs of implementing a keyword enterprise search system. The advice is essentially irrelevant to the acquisition of a more advanced system from a leading edge next generation information access vendor like BAE Systems (NetReveal), IBM (not the Watson stuff, however), or Palantir. The fact underscored by our research over the last decade is tough to dispute: Connecting an enterprise search system to demonstrable value is a darned difficult thing to accomplish.

It is far easier to focus on a niche like legal search and eDiscovery or the retrieval of scientific and research data for the firm’s engineering units than to boil the ocean. The idea of “boil the ocean” is that a vendor presents a text centric system (essentially a one trick pony) as an animal with the best of stallions, dogs, tigers, and grubs. The spam about enterprise search value is less satisfying than the steak of showing that an eDiscovery system helped the legal eagles win a case. That, gentle reader, is value. No court judgment. No fine. No PR hit. A grumpy marketer who cannot find a Web article is not value no matter how one spins the story.


Keyword Search Is Not Productive. Who Says?

June 30, 2015

I noticed a flurry of tweets pointing to a diagram which maps out the Future of Search. You can view the diagram at or Direct your attention to this assertion:

As amount of data grows, keyword search is becoming less productive.

Now look at what will replace keyword search:

  • Social tagging
  • Automatic semantic tagging
  • Natural language search
  • Intelligent agents
  • Web scale reasoning.

The idea is that we will experience a progression through these “operations” or “functions.” The end point is “The Intelligent Web” and the Web scale reasoning approach to information access.

Interesting. But I am not completely comfortable with this analysis.

Let me highlight four observations and then leave you to your own sense of what the Web will become as the amount of data increases.

First, keyword search is a utility function, and it will become ubiquitous. It will not go away or be forgotten. Keyword search will just appear in more and more human machine interactions. Telling your automobile to call John is keyword search. Finding an email is often a matter of plugging a couple of words into the Gmail search box.

Second, more data does translate to programmers lacing together algorithms to deliver information to users. The idea is that a mobile device user will just “get” information. This is a practical response to the form factor, methods to reduce computational loads imposed by routine query processing, and the human desire for good enough information. The information just needs to be good enough which will work for most people. Do you want your child’s doctor to take automatic outputs if your child has cancer?

Third, for certain types of information access, the focus is shifting, as it should, from huge flows of data to chopping flows down into useful chunks. Governments archive intercepts because the computational demands of processing information in real time for large numbers of users who need real time access are an issue. As data volume grows, computing horsepower is laboring to keep pace. Short cuts are, therefore, important. But most of the short cuts rely on having a question to answer. Guess what? Those short cuts are often keyword queries. The human may not be doing keyword searching, but the algorithms are.

Fourth, some types of information require both old fashioned Boolean keyword search and retrieval AND the manual, time consuming work of human specialists. In my experience, algorithms are useful, but there are subjects which require the old fashioned methods of querying, reading, researching, analyzing, and discussing. Most of these functions are keyword centric.
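
Old fashioned Boolean keyword search is not much code. A minimal sketch in Python, assuming whitespace tokenization and a toy in-memory inverted index; the sample documents and function names are illustrative and are not drawn from any particular search system:

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents containing every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Documents containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


# Hypothetical three document collection.
docs = {
    1: "enterprise search scalability",
    2: "keyword search latency",
    3: "content management",
}
index = build_index(docs)

print(boolean_and(index, "search", "latency"))
print(boolean_or(index, "scalability", "content"))
```

The reading, researching, analyzing, and discussing still fall to the human specialist. The sketch only covers the retrieval step.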

In short, keyword queries can be dismissed or dressed up in fancy jargon. I don’t think the method is going away too quickly. Charts and subjective curves are one thing. Real world information interaction is another.

Stephen E Arnold, June 30, 2015
