DuckDuckGo: Boosting Search via Open Source Cash Donations

March 21, 2015

There are many ways for commercial enterprises to gain traction via open source. Some companies, like IBM, cheerlead for Eclipse and Lucene, among other open source projects. Other companies hold conferences to tout an open source solution and then pitch extra cost add ons like consulting and training so the unfamiliar can become familiar with the “free” software. A few firms slip open source hints into their commercial messages. One company which sells a government- and academic-based search system used “open source” on a Web page. When I pointed out that this commercial outfit was hinting that its for fee, proprietary product was open source, the reference disappeared after a frisky email exchange. It seems that some company presidents do not look at their own firms’ Web sites.

I read “2015 Open Source Donations.” The write up was straightforward, listing various donations from DuckDuckGo to worthy causes. One of these is the Amnesic Incognito Live System or Tails.

I am okay with this support for open source via cash. Many firms have followed the path. I find it interesting that DuckDuckGo, which I understand is essentially a metasearch engine, is following this route.

Other commercial outfits will become more open about their support of open source. After all, why use a commercial, proprietary product when you can use a perfectly good open source product? All one needs is know how. That, of course, is what the open source services firms sell.

DuckDuckGo wants to keep the communities in which it has an interest watered, fed, and loved. Good deal.

Stephen E Arnold, March 21, 2015

Great Moments in Mid Tier Consulting: The Apple Watch

March 21, 2015

Short honk: I couldn’t resist. I read “Apple Watch Is Like an Invasive Weed Says Gartner.” The idea is that Apple identifies a market (an ecosystem) and then acts like kudzu or a killer carp. Here’s a passage from the Register I highlighted:

Hetu [a Gartner “expert”] says Apple Pay and the Apple Watch are Cupertino’s new invaders, with the latter gadget set to “further revolutionize how consumers are influenced to purchase and their paths to making purchases.”

Colorful. Now let’s see if the Apple Watch is digital kudzu. I know that Gartner has one English major operating as a “search expert.” It seems another Gartner “wizard” is vying for the moniker of “Chief Metaphorist.”

Kudzu is a good way to characterize the thinking of mid tier consulting firms desperate to close deals, keep their jobs, and forget that outfits like McKinsey, Bain, and BCG may be out of reach. A happy quack to King County Government for the image.

Well, if it sells work to those with cash, more power to the mid tier crowd. (Will Dave Schubmehl, who sold a report with my name on it for $3,500 on Amazon, and his cohorts at IDC move beyond recycling and into the realm of poetry?)

Stephen E Arnold, March 21, 2015

Enterprise Search Is Important: But Vendor Survey Fails to Make Its Case

March 20, 2015

I read “Concept Searching Survey Shows Enterprise Search Rises in the Ranks of Strategic Applications.” Over the years, I have watched enterprise search vendors impale themselves on their swords. In a few instances, licensees of search technology loosed legal eagles to beat the vendors to the ground. Let me highlight a few of the milestones in enterprise search before commenting on this “survey says, it must be true” news release.

A Simple Question?

What do these companies have in common?

  • Autonomy
  • Convera
  • Fast Search & Transfer?

I know from my decades of work in the information retrieval sector that financial doubts plagued these firms. Autonomy, as you know, is the focal point of on-going litigation over accounting methods, revenue, and its purchase price. Like many high-tech companies, Autonomy achieved significant revenues and caused some financial firms to wonder how Autonomy achieved its hundreds of millions in revenue. There was a report from Cazenove Capital I saw years ago, and it contained analyses that suggested search was not the money machine for the company.

And Convera? Excalibur, a document scanning outfit with some brute force searching technology, acquired the manual-indexing ConQuest Technologies and then morphed into Convera. Convera suggested that it could perform indexing magic on text and video. Intel dived in and so did the NBA. These two deals did not work out and the company fell on hard times. With an investment from Allen & Company, Convera tried its hand at Web indexing. Finally, stakeholders lost faith and Convera sold off its government sales and folded its tent. (Some of the principals cooked up another search company. This time the former Convera wizards got into the consulting engineering business.) Convera lives on in a sense as part of the Ntent system. Convera lost some money along the way. Lots of money as I recall.

And Fast Search? Microsoft paid $1.2 billion for Fast Search. Now the 1998 technology lives on within Microsoft SharePoint. But Fast Search has the unique distinction of facing a financial investigation for fancy dancing with its profit and loss statement and of having its founder facing a jail term. Fast Search ran into trouble when its marketers promised magic from the ESP system. When the pixie dust caused licensees to develop an allergic reaction, the trouble deepened. The scrambling caused some managers to flee the floundering Norwegian search ship and found another search company. For those who struggle with Fast Search in its present guise, you understand the issues created by Fast Search’s “sell it today and program it tomorrow” approach.

Is There a Lesson in These Vendors’ Trajectories?

What do these three examples tell us? High flying enterprise search vendors seem to have run into some difficulties. Not surprisingly, the customers of these companies are often wary of enterprise search. Perhaps that is the reason so many enterprise search vendors do not use the words “enterprise search”, preferring euphemisms like customer support, business intelligence, and knowledge management?

The Rush to Sell Out before Drowning in Red Ink

Now a sidelight. Before open source search effectively became the go to keyword search system, there were vendors who had products that for the most part worked when installed to do basic information retrieval. These companies’ executives worked overtime to find buyers. The founders cashed out and left the new owners to figure out how to make sales, pay for research, and generate sufficient revenue to get the purchase price back. Which companies are these? Here’s a short and incomplete list to help jog your memory:

  • Artificial Linguistics (sold to Oracle)
  • BRS Search (sold to OpenText)
  • EasyAsk (first to Progress Software and then to an individual investor)
  • Endeca to Oracle
  • Enginium (sold to Kroll and now out of business)
  • Exalead to Dassault
  • Fulcrum Technology to IBM (quite a story. See the Fulcrum profile at www.xenky.com/vendor-profiles)
  • InQuira to Oracle
  • Information Dimensions (sold to OpenText)
  • Innerprise (Microsoft centric, sold to GoDaddy)
  • iPhrase to IBM (iPhrase was a variant of Teratext’s approach)
  • ISYS Search Software to Lexmark (yes, a printer company)
  • RightNow to Oracle (RightNow acquired Dutch technology for its search function)
  • Schemalogic to Smartlogic
  • Stratify/Purple Yogi (sold to Iron Mountain and then to Autonomy)
  • Teratext to SAIC, now Leidos
  • TripleHop to Oracle
  • Verity to Autonomy and then HP bought Autonomy
  • Vivisimo to IBM (how clustering and metasearch magically became a Big Data system from the company that “invented” Watson)

The brand impact of these acquired search vendors is dwindling. The only “name” on the list which seems to have some market traction is Endeca.

Some outfits just did not make it or are in a very quiet, almost dormant, mode. Consider these search vendors:

  • Delphes (academic thinkers with linguistic leanings)
  • Edgee
  • Dieselpoint (structured data search)
  • DR LINK (Syracuse University and an investment bank)
  • Executive Search (not a headhunting outfit, an enterprise search outfit)
  • Grokker
  • Intrafind
  • Kartoo
  • Lextek International
  • Maxxcat
  • Mondosoft
  • Pertimm (reincarnated with Axel Springer (Macmillan) money as Qwant, which, according to Eric Schmidt, is a threat to Google. Yeah, right.)
  • Siderean Software (semantic search)
  • Speed of Mind
  • Suggest (Weitkämper Technology)?
  • Thunderstone

This is not a comprehensive list. I just wanted to lay out some facts about vendors who tilted at the enterprise search windmill. I think that a reasonable person might conclude that enterprise search has been a tough sell. Of the companies that developed a brand, none was able to achieve sustainable revenues. The information highway is littered with the remains of vendors who pitched enterprise search as the killer app for anything to do with information.

Now the survey purports to reveal insights to which I have been insensitive in my decades of work in digital information access.

Here’s what the company sponsoring the survey offers:

Concept Searching [the survey promulgator], the global leader in semantic metadata generation, auto-classification, and taxonomy management software, and developer of the Smart Content Framework™, is compiling the statistics from its 2015 SharePoint and Office 365 Metadata survey, currently unpublished. One of the findings, gathered from over 360 responses, indicates a renewed focus on improving enterprise search.

The focus seems to be on SharePoint. I thought SharePoint was a mishmash of content management, collaboration, and contacts along with documents created by the fortunate SharePoint users. Question: Is enterprise search conflated with SharePoint?

I would not make this connection.

If I understand this, the survey makes clear that some of the companies in the “sample” (method of selection not revealed) want better search. I want better information access, not search per se.

Each day I have dozens of software applications which require information access activity. I also have a number of “enterprise” search systems available to me. Nevertheless, the finding suggests to me that enterprise search is not and has not been particularly good. If I put on my SharePoint sunglasses, I see a glint of the notion that SharePoint search is not very good. The dying sparks of Fast Search technology smoldering in the fire at Camp DontWorkGud.

Images, videos, and audio content present me with a challenge. Enterprise search and metatagging systems struggle to deal with these content types. I also get odd ball file formats; for example, Framemaker, Quark, and AS/400 DB2 UDB files.

The survey points out that the problem with enterprise search is that indexing is not very good. That may be an understatement. But the remedy is not just indexing, is it?

After reading the news release, I formed the opinion that the fix is to use the type of system available from the survey sponsor Concept Searching. Is that a coincidence?

Frankly, I think the problems with search are more severe than bad indexing, whether performed by humans or traditional “smart” software.

According to the news release, my view is not congruent with the survey or the implications of the survey data:

A new focus on enterprise search can be viewed as a step forward in the management and use of unstructured content. Organizations are realizing that the issue isn’t going to go away and is now impacting applications such as records management, security, and litigation support. This translates into real business currency and increases the risk of non-compliance and security breaches. You can’t find, protect, or use what you don’t know exists. For those organizations that are using, or intend to deploy, a hybrid environment, the challenges of leveraging metadata across the entire enterprise can be daunting, without the appropriate technology to automate tagging.

Real business currency. Is that money?

Are system administrators still indexing human resource personnel records, in process legal documents related to litigation, and data from research tests and trials in an enterprise search system? I thought a more fine-grained approach to indexing was appropriate. If an organization has a certain type of government work, knowledge of that work can only be made available to those with a need to know. Is indiscriminate and uncontrolled indexing in line with a “need to know” approach?

Information access has a bright future. Open source technology such as Lucene/Solr/Searchdaimon/SphinxSearch, et al., is a reasonable approach to keyword functionality.

Value-added content processing is also important but not as an add on. I think that the type of functionality available from BAE, Haystax, Leidos, and Raytheon is more along the lines of the type of indexing, metatagging, and coding I need. The metatagging is integrated into a more modern system and architecture.

For instance, I want to map geo-coordinates in the manner of Geofeedia to each item of data. I also want context. I need an entity (Barrerra) mapped to an image integrated with social media. And, for me, predictive analytics are essential. If I have the name of an individual, I want that name and its variants. I want the content to be multi-language.

I want what next generation information access systems deliver. I don’t want indexing and basic metatagging. There is a reason for Google’s investing in Recorded Future, isn’t there?

The future of buggy whip enterprise search is probably less of a “strategic application” and more of a utility. Microsoft may make money from SharePoint. But for certain types of work, SharePoint is a bit like Windows 3.11. I want a system that solves problems, not one that spawns new challenges on a daily basis.

Enterprise search vendors have been delivering so-so, flawed, and problematic functionality for 40 years. After decades of vendor effort to make information findable in an organization, has significant progress been made? DARPA doesn’t think search is very good. The agency is seeking better methods of information access.

What I see when I review the landscape of enterprise search is that today’s “leaders” (Attivio, BA Insight, Coveo, dtSearch, Exorbyte, among others) remind me of the buggy whip makers driving a Model T to lecture farmers that their future depends on the horse as the motive power for their tractor.

Enterprise search is a digital horse, and one that is approaching breakdown.

Enterprise search is a utility within more feature rich, mission critical systems. For a list of 20 companies delivering NGIA with integrated content processing, check out www.xenky.com/cyberosint.

Stephen E Arnold, March 20, 2015

Accenture Makes a Big Purchase to Chase Government Clients

March 20, 2015

Accenture Federal Services (AFS) is one of the leading companies that provide technology and digital solutions for the US federal government. The parent company Accenture LLP has sought to increase its number of federal contracts as well as its products and services, so the company decided to purchase Agilex Technologies, Inc., says Big News Network in “Accenture Unit To Agilex Technologies.”

“‘Acquiring Agilex will help AFS further solidify our position as an innovative leader in the federal market. Combining our digital capabilities and agile methods will accelerate our ability to help clients harness the power of emerging digital technologies and rapid, predictable systems deployment for the federal government’s most complex challenges,’ said David Moskovitz, Accenture Federal Services chief executive.”

AFS plans to use Agilex’s technology to improve its own analytics, cloud, and mobile technology for federal organizations. Agilex, like its new owner, has worked with every cabinet-level department and with federal agencies in defense, intelligence, public safety, and civilian and military health.

AFS will have more to offer its federal clients, but the purchase does raise a question: will it lead to a monopoly on government contracts or increase the competition?

Whitney Grace, March 20, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Legal Technology Update

March 20, 2015

It seems that the field of legal tech is making progress. Above the Law reports on “Today’s (Legal) Tech: The State of Legal Technology in 2015.” Writer Nicole Black attended the LegalTech New York conference. She highly recommends this conference to her colleagues in the legal technology field, by the way. She also came away with a list of new legal tools. Be aware, though, that e-discovery and information-governance solutions are not among them; those areas just aren’t her cup of tea. Black writes:

“Whenever I attend LegalTech, one of my goals is to learn about new and interesting legal tools that are NOT related to e-discovery or information governance, since these areas simply don’t interest me. Trying to locate vendors with offerings outside of these two categories is no small task at LegalTech. The conference organizers seem to be single-mindedly focused on these subjects and you can’t walk more than two feet in the Exhibit Hall without tripping over a booth offering software related to either topic.

“But, I doggedly sifted through the slew of emails I received from vendors until I found a few with products that interested me. As is the case every year, a theme seems to emerge after I’ve met with the various vendors, and this year it was documents, documents, and more documents.”

Black goes on to list several vendors of interest. She met with three offering litigation-prep document management: Factbox, Allegory, and Opus 2 Magnum. Each works a little differently from the others, she notes. Then there’s Redact Assistant, which simplifies the removal of sensitive content; Plainlegal, which supplies document automation for IP filings; and Brainloop, which offers virtual data rooms to enhance collaboration. The final entry, Box, is a general online-document storage and collaboration tool that has been making inroads into the legal space.

Black wraps up her article with a description of swag found at the conference, but I’ll let you navigate to the article for those card-game-related details. It sounds like the conference was a lot of fun.

Cynthia Murrell, March 19, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Adobe: A Document Cloud Looms

March 19, 2015

Adobe is moving from PDF creation to document management. I avoid Adobe Acrobat because it bedeviled me years ago with a PDF dongle. The dongle had a counter. After we created the number of documents authorized by the dongle, the opportunity to purchase another dongle arose. Exciting. That warned me off the outfit.

I brushed against Adobe when I researched the original Enterprise Search Report in 2003. That was a mere 12 years ago, yet the memory is still fresh. I was trying to figure out what vendor provided the search system for Adobe products. After reading publicly accessible information and making fruitless attempts to speak to a person who knew about search at Adobe, I learned by accident the name of the provider.

Do you recognize the name Lextek? I sure did not. I offer a no cost summary of this company and its search system at this link. I was fascinated with Lextek because I had difficulty locating information using the Adobe products which incorporated this system. I had a short list of other search systems Adobe has used over the years to the same result. I invite you to fire up an Adobe product and try to locate the information needed to solve a problem or learn a procedure or figure out what state an Adobe software product is in. Let me know how that works out for you.

I read “Adobe Unveils Cloud Electronic Document Service.” I learned that “Adobe Systems will launch a cloud-based document management service within the month.” That’s soon. The article continued:

The company said the core of the new service is Adobe Acrobat, the world’s most sought-after document management software. The upgraded Adobe Acrobat Document Cloud enables document managers to produce, check and confirm official documents on both personal computers and mobile devices. They also can put an electronic signature to the Portable Document Format (PDF) file to give it a legal force, the company said.

Yikes, another silo of data for an organization to “federate.”

Several questions crossed my mind:

  • What is the search system for the service? (Lextek’s owners operate a confectionery store if I understood the research my team assembled.)
  • What is the programmatic access Adobe will provide to an organization placing its PDF documents in the Adobe Document Cloud?
  • What is the security provided for these customers?

Adobe’s play is an interesting one. I wonder if the company will allow its customers to mark documents “public” and then provide an online access service? Worth watching.

Stephen E Arnold, March 19, 2015

Enterprise Search: Messages Confuse, Confound

March 19, 2015

A couple of times a week, I review a free digital “newspaper” called Paper.li. I learned about this Paper.li “newspaper” when Vivisimo sent me its version of “search news.” The enterprise search newspaper I receive is assembled under the firm hand of Edwin Stauthamer. The stories are automatically assembled into “The Enterprise Search Daily.”

The publication includes a wide range of information. The referrer’s name appears with each article. The title page for the March 18, 2015, issue looks like this.

[Image: title page of The Enterprise Search Daily, March 18, 2015]

In the last week or so, I have noticed a stridency in the articles about search and the disciplines the umbrella term protects from would-be encroachers. Search is customer support, but from the enterprise search vendors’ viewpoint, enterprise search is the secret sauce for a great customer support soufflé. Enterprise search also does Big Data, business intelligence, and dozens of other activities.

The reason for the primacy of search, as I understand the assertions of the search companies and the self appointed search “experts,” is that information retrieval makes the business work. Improve search. It follows, according to the logic, that revenues will increase, profits will rise, and employee and customer satisfaction will skyrocket.

Unfortunately, enterprise search is difficult to position as the alpha and omega of enterprise software. Consider this article from the March 18 edition of The Enterprise Search Daily.

Why Enterprise Search is a Must Have for Any Enterprise Content Management Strategy

The article begins:

Enterprise search has notoriously been a problem in the content management equation. Various content and document management systems have made it possible to store files. But the ability to categorize that information intuitively and in a user-friendly way, and make that information easy to retrieve later, has been one of several missing pieces in the ECM market. When will enterprise search be as easy to use and insightful as Google’s external search engine? If enterprise search worked anywhere near as effectively as Google, it might be the versatile new item in our content management wardrobes, piecing content together with a clean sophistication that would appeal to users by making everything findable, accessible and easy to organize.

I am not sure how beginning with the general perception that enterprise search has been, is, and may well be a failure flips to a “must have” product. My view is that keyword search is a utility. For organizations with cash to invest, automated indexing and tagging systems can add some additional findability hooks. The caveat is that the licensee of these systems must be prepared to spend money on a professional who can ride herd on the automated system. The indexing strays have to be rounded up and meshed with the herd. But the title’s assertion is a dream, a wish. I don’t think enterprise content management is particularly buttoned up in most organizations. Even primitive search systems struggle to figure out what version is the one the user needs to find. Indexing by machine or human often leads to manual inspection of documents in order to locate the one the user requires. Google wanders into the scene because most employees give Google.com a whirl before undertaking a manual inspection job. If the needed document is on the Web somewhere, Google may surface it if the user is lucky enough to enter the secret combination of keywords. Google is deeply flawed, but for many employees, it is better than whatever their employer provides.


Apache Samza Revamps Databases

March 19, 2015

Databases have advanced far beyond the basic relational databases. They need to be consistently managed and have real-time updates to keep them useful. The Apache Software Foundation hosts Apache Samza, software for asynchronous stream processing that originated at LinkedIn. Samza was built to work in conjunction with Apache Kafka.

If you are interested in learning how to use Apache Samza, the Confluent blog posted “Turning The Database Inside-Out With Apache Samza” by Martin Kleppmann. Kleppmann recorded a seminar he gave at Strange Loop 2014 that explains how Samza can improve many features of a database:

“This talk introduces Apache Samza, a distributed stream processing framework developed at LinkedIn. At first it looks like yet another tool for computing real-time analytics, but it’s more than that. Really it’s a surreptitious attempt to take the database architecture we know, and turn it inside out. At its core is a distributed, durable commit log, implemented by Apache Kafka. Layered on top are simple but powerful tools for joining streams and managing large amounts of data reliably.”
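The “inside out” idea in the quote can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Samza or Kafka code: `CommitLog` and `MaterializedView` are hypothetical names, the log lives in memory rather than in durable distributed storage, and no Samza or Kafka API is used. The point is only that the append-only log is the source of truth and the queryable table is derived state rebuilt by replaying it.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Hypothetical in-memory stand-in for a durable, append-only log."""
    entries: list = field(default_factory=list)

    def append(self, key, value):
        # Writes are only ever appended; existing entries are never changed.
        self.entries.append((key, value))
        return len(self.entries) - 1  # offset of the new entry

    def read_from(self, offset=0):
        return self.entries[offset:]


class MaterializedView:
    """Derived state, built by consuming the log in order."""

    def __init__(self, log):
        self.log = log
        self.offset = 0  # next log position this view has not yet applied
        self.state = {}

    def catch_up(self):
        for key, value in self.log.read_from(self.offset):
            if value is None:
                self.state.pop(key, None)  # None is treated as a delete marker
            else:
                self.state[key] = value    # the latest write for a key wins
            self.offset += 1
        return self.state


log = CommitLog()
log.append("user:1", "Ada")
log.append("user:2", "Grace")
log.append("user:1", "Ada L.")  # an update is just another log entry
log.append("user:2", None)      # so is a delete

view = MaterializedView(log)
print(view.catch_up())
```

Because the view is only a replay of the log, a second view with different logic (a search index, a cache) could be built from the same entries without touching the writers.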

Learning new ways to improve database features and functionality always improves your skill set. Apache software also forms the basis for many open source projects and startups. Martin Kleppmann’s talk might give you a brand new idea or at least improve your database.

Whitney Grace, March 20, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

SharePoint Gets Serious with Information Governance

March 19, 2015

SharePoint has enjoyed continued success over the last 15 years, but it has not been without some bumps along the way. Information governance is one of the noted areas in which SharePoint has fallen flat. Read more in the CMS Wire article, “Keeping SharePoint In Check with Information Governance.”

The article begins:

“Historically, SharePoint was thought to cause as many information governance problems as it solved. The 2001 to 2003 versions did not show Microsoft putting much effort into helping customers with information governance. But after the massive take up of SharePoint Portal Server 2007 licenses, and the often negative conversations coming out of the sizable SharePoint user community, Microsoft started to take governance issues seriously.”

In addition to keeping an eye on your news feed for the latest SharePoint buzz, staying tuned to experts in the field is a great way to save time and get pointed information pertaining to improving a SharePoint installation. Stephen E. Arnold has one such SharePoint feed on his Web site, ArnoldIT.com. Focusing on tips, tricks, and news, Arnold collates much of the content that users and managers alike will find helpful for navigating day-to-day SharePoint operations.

Emily Rae Aldridge, March 19, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Give Employees the Data they Need

March 19, 2015

A classic quandary: will it take longer to reinvent a certain proverbial wheel, or to find the documentation from the last time one of your colleagues reinvented it? That all depends on your organization’s search system. An article titled “Help Employees to ‘Upskill’ with Access to Information” at DataInformed makes the case for implementing a user-friendly, efficient data-management platform. Writer Diane Berry, not coincidentally a marketing executive at enterprise-search company Coveo, emphasizes that re-covering old ground can really sap workers’ time and patience, ultimately impacting customers. Employees simply must be able to quickly and easily access all company data relevant to the task at hand if they are to do their best work. Berry explains why this is still a problem:

“Why do organizations typically struggle with implementing these strategies? It revolves around two primary reasons. The first reason is that today’s heterogeneous IT infrastructures form an ‘ecosystem of record’ – a collection of newer, cloud-based software; older, legacy systems; and data sources that silo valuable data, knowledge, and expertise. Many organizations have tried, and failed, to centralize information in a ‘system of record,’ but IT simply cannot keep up with the need to integrate systems while also constantly moving and updating data. As a result, information remains disconnected, making it difficult and time consuming to find. Access to this knowledge often requires end-users to conduct separate searches within disconnected systems, often disrupting co-workers by asking where information may be found, and – even worse – moving forward without the knowledge necessary to make sound decisions or correctly solve the problem at hand.

“The second reason is more cultural than technological. Overcoming the second roadblock requires an organization to recognize the value of information and knowledge as a key organizational asset, which requires a cultural shift in the company.”

Fair enough; she makes a good case for a robust, centralized data-management solution. But what about that “upskill” business? Best I can tell, it seems the term is not about improving skills, but about supplying employees with resources they need to maximize their existing skills. The term was a little confusing to me, but I can see how it might be catchy. After all, marketing is the author’s forte.

Cynthia Murrell, March 19, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
