October 15, 2014
I assume this statement is a surprise to some folks:
Now, disclosure text has become very small, and the shading very subtle, meaning users often don’t realize they are clicking through to ads rather than the most relevant result for their query.
In an increasingly important quest for revenue, these allegedly deceptive ads may be just the beginning of math club maneuvers. Relevance has a new meaning. Perhaps it is a synonym for revenue?
Stephen E Arnold, October 14, 2014
October 15, 2014
Web site design used to be reserved for graphic designers with a fancy degree and background in computer science. Times have changed from the daunting trials of coding to simple click and drag selections. Web site design services such as WordPress, Tumblr, Wix, Weebly, and Squarespace simplify the process so anyone can create a decent site in seconds. If, however, you are interested in building a site that is more interactive than standard templates, then start taking advantage of UICloud.
“UICloud is a project created by Double-J Design. It collects the best UI element designs from the Internet all over the world and provides a search engine for you to find the best UI element that you need. We are aiming to create the biggest platform for designers to showcase their top user interface designs and for developer to get the best UI elements for their project easily and quickly.”
UICloud combines elements of Web site browsing and searching in one place. If you search for a specific topic, the results appear in thumbnails so you can preview the art. It takes advantage of the “magazine” format that’s grown popular. Categories are reminiscent of old webrings and link lists that used to collect related Web sites in one place. Categories are a neat feature, because they save the trouble of searching and take you straight to browsing. Remember how half the links used to be defunct? It is easy to see that happening.
Users can submit their user interface design to UICloud and then it will be added to the search results. All the listings might not be under the creative commons agreement. The UICloud team notes that you need to check with the artist before you use them.
October 14, 2014
Is it true that if one bets on enough race horses, one will win? Seems logical to those who hang out at Churchill Downs I suppose.
Miles away from the race track, I found the audiences I addressed at last week’s intelligence and law enforcement conference skeptical of Google’s search results. In 2013, there was more surprise when I demonstrated how queries for “twerk” did not involve too much “search.”
After the sessions, attendees commented on how much work is required to ferret out relevant results to queries. The notion that LE and intel professionals had to learn command line syntax to get useful information was a situation I did not think would arise. Hey, Google has smart software, artificial intelligence, the world’s fastest search engine, yada yada.
Searching Google is actually difficult if one wants to answer certain types of questions; for example, who in Scotland sells tactical shotgun silencers. Give that query a whirl in your spare time.
I read “Google Set to Lead Huge Investment in Magic Leap and Its ‘Cinematic Reality’.” The write up provides a surprisingly poignant glimpse into the Google business machine. Google is no longer content to borrow notions like head mounted mobile phones. Google wants to lift beyond big balloons which rhymes with “loons.” Google does not want to solve death.
Nope. Google is betting that it can invest in companies and tap into a new, swelling revenue stream. Search, it seems, has become an optimization task for the Googlers. The future lies in “cinematic reality.”
Google wants to be the lead elephant in the investing parade for Magic Leap. You can work through the original document to get a sense of the Fancy Dan augmented reality technology Magic Leap allegedly possesses.
My view is that Google has to find a way to sustain revenue growth. Search is not the prize winning stallion it once was. I assume that Google believes that investments in companies that deliver magic will produce big bucks.
For me, I am concerned that the utility of the Google search system will continue to decrease for the types of research I do. If the feedback I received from LE and intel professionals is representative, there are a number of serious individuals who want a Google search to return relevant results, not ads and promotions for Google products and services.
I am all for magic, but magic involves tricks. Search requires more than wild bets and a faith in magic.
I do not crave a more realistic three dimensional experience. I am okay with a system that:
- Includes useful content in an accessible interface. Google’s convoluted blog search is not what I call accessible.
- Presents results that are in line with needs of the user, not the needs of the advertiser.
- Provides more frequently refreshed indexes for pages with content that is not focused on Dancing with the Stars, vacations, and hotels.
I want some of that old time search magic. Maybe a futuristic, robotic pony clone will make Google billions. I prefer a search donkey that gets the job done. Onward, precision and recall.
Stephen E Arnold, October 14, 2014
October 13, 2014
In the supplemental lecture added after the intel conference ended, I addressed the topic of disappearing content. The “right to be forgotten” is one of the great ideas emerging from government committees. I wonder who wants to be forgotten? I provided some basic information about finding information about these forgotten entities.
One of the attendees at my lecture alerted me to Profile Engine. I navigated to the link and learned:
Profile Engine is a fairly low-budget-looking search engine, started in 2007 in New Zealand and partly owned by the Auckland University of Technology. It allows you to find people on social networks. Google has been getting a lot of requests to reverse this trend—almost 3,300 results from Profile Engine have been taken down by Google since May, when the “right to be forgotten” came into effect.
You can find Profile Engine at http://profileengine.com/. We can’t endorse the system, but we will check it out, and I will have an update for my next lecture. Conference organizers extend invitations via email. If you don’t hear about an event, you need to get yourself unforgotten. That’s a bit of humor for this Monday morning.
Stephen E Arnold, October 13, 2014
October 12, 2014
Oracle’s Secure Enterprise Search offered advanced security. Perfect Search stressed its speed. SES has been marginalized. That particular security pitch did not work. Perfect Search also has faded from the scene.
Perhaps pitching both security and speed will yield more together than as separate features.
SRCH2 asserts that it is four times faster than open source search engines. None of the open source search engines is a speed demon. Speed boosts require additional work on the specific subsystem introducing the latency for a particular deployment.
SRCH2’s “Real Time Computer Requires Faster Search” makes a case for the optimization built in to SRCH2’s system. The article states:
SRCH2 offers the world’s fastest search engine. Why is speed so important? After all, the human eye can’t detect the difference between a 10-millisecond and 50-millisecond response time.
Some data backing this assertion would be helpful. In a direct comparison of Lucid Works’ technology with ElasticSearch’s technology, the ArnoldIT team found that one was faster in indexing and the other was faster in query processing. Both could be improved with focused optimization. Perhaps SRCH2 will share some of their data which backs up the “four times faster” claim? (I am not at liberty to release the performance data a client requested my team compile from live tests on my test corpus.)
Security gets its turn in SRCH2’s “SRCH2 Introduces Access Control Lists to Improve Search Security.” The article states:
SRCH2 took the approach of providing native support of access control to set restrictions on search results. With SRCH2’s ACL feature, developers can restrict user permissions to access either certain records in an index, or specific attributes within a record or set of records.
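For readers who want to picture what record-level and attribute-level restrictions mean in practice, here is a minimal sketch in Python. It is not SRCH2's API. The `Record` class, the `search` function, the role names, and the sample data are all hypothetical, invented only to illustrate the two kinds of restriction the article describes.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical indexed record with two layers of access control."""
    record_id: str
    fields: dict          # attribute name -> value
    allowed_roles: set    # roles permitted to see the record at all
    # attribute name -> roles permitted to see that attribute
    restricted_fields: dict = field(default_factory=dict)


def search(records, term, role):
    """Return (record_id, visible_fields) for matches that `role` may see."""
    hits = []
    for rec in records:
        # Record-level check: skip records the role cannot access.
        if role not in rec.allowed_roles:
            continue
        # Attribute-level check: drop attributes the role cannot access.
        visible = {
            name: value
            for name, value in rec.fields.items()
            if name not in rec.restricted_fields
            or role in rec.restricted_fields[name]
        }
        # Match only against visible attributes so hidden values cannot leak.
        if any(term.lower() in str(v).lower() for v in visible.values()):
            hits.append((rec.record_id, visible))
    return hits


# Hypothetical sample data.
RECORDS = [
    Record(
        "r1",
        {"title": "Budget memo", "salary": "budget 90000"},
        {"hr", "staff"},
        {"salary": {"hr"}},
    ),
    Record("r2", {"title": "Board budget minutes"}, {"board"}),
]
```

In this sketch a "staff" user searching for "budget" sees only the title of r1, an "hr" user sees the salary attribute as well, and neither sees r2. Matching against visible attributes only is a design choice of the sketch; whether a commercial engine behaves the same way is something to confirm with the vendor.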
The approach is useful. However, it is less robust than the Oracle approach which implemented a wider range of features provided by specialized Oracle subsystems.
Will the combination of security and speed pay off for SRCH2? Good question. I do not have an answer.
Stephen E Arnold, October 11, 2014
October 10, 2014
A happy quack to the reader who forwarded me a link to the biographical information for Gerald Burnand. I learned from this information page that Convera lives on as Ntent. The page reports:
Ntent. Privately Held; Search Technology, Semantics, Advertising and Marketing company. Previously was Vertical Search Works, born from the merger of Convera and Firstlight ERA (a UK company).
Stephen E Arnold, October 11, 2014
October 10, 2014
I worked through the 34 page report “Industry Watch. Search and Discovery. Exploiting Knowledge, Minimizing Risk.” The report is based on a sampling of 80,000 AIIM community members. The explanation of the process states:
Graphs throughout the report exclude responses from organizations with less than 10 employees, and suppliers of ECM products and services, taking the number of respondents to 353.
The demographics of the sample were tweaked to discard responses from organizations with fewer than 10 employees. The sample included respondents from North America (67 percent), Europe (18 percent) and “rest of world” (15 percent).
Some History for the Young Reader of Beyond Search
AIIM has roots in imaging (photographic and digital imaging). Years ago I spent an afternoon with Betty Steiger, a then well known executive with a high profile in Washington, DC’s technology community. She explained that the association wanted to reach into the then somewhat new technology for creating digital content. Instead of manually indexing microfilm images, AIIM members would use personal computers. I think we connected in 1982 at her request. My work included commercial online indexing, experiments in full text content online, a CD ROM produced in concert with Predicasts and Lotus, and automated indexing processes invented by Howard Flank, a sidekick of mine for a very long time. (Mr. Flank received the first technology achievement award from the old Information Industry Association, now the SIIA.)
AIIM had its roots in the world of microfilm. And the roots of microfilm reached back to University Microfilms at the close of World War II. After the war, innovators wanted to take advantage of the marvels of microimaging and silver-based film. The idea was to put lots of content on a new medium so users could “find” answers to questions.
The problem for AIIM (originally the National Micrographics Association) was indexing. As an officer at a company considered in the 1980s to be one of the leaders in online and semi automated indexing methods, I had a great deal to discuss with Ms. Steiger.
But AIIM evokes for me:
Microfilm —> Finding issues —> Digital versions of microfilm —> CD ROMs —> On premises online access —> Finding issues.
I find the trajectory from microfilm to pronouncements about enterprise search, content processing, and eDiscovery fascinating. The story of AIIM parallels the challenges of the traditional publishing industry (what I call the “dead tree method”), which has, like Don Quixote, galloped into battle with ones and zeros.
Asking a trade association’s membership for insights about electronic information is a convenient idea. What’s wrong with sampling the membership and others in the AIIM database, discarding those who belong to organizations with fewer than 10 employees, and tallying up the survey “votes”? For most of those interested in search, absolutely nothing. And that may be part of the challenge for those who want to get smart about search, findability, and content processing.
Let’s look at three findings from the 30 plus page study. (I have had to trim because the number of comments and notes I wrote when reading the report is too massive for Beyond Search.)
Finding: 25 percent have no advanced or dedicated search tools. 13 percent have five or more [advanced or dedicated search tools].
Talk about good news for vendors of findability solutions. If one thinks about the tens of millions of organizations in the US, one just discards the 10 percent with 10 or fewer employees, and there is apparently quite a large percentage with simplistic tools. (Keep in mind that there are more small businesses than large businesses by a very wide margin. But that untapped market is too expensive for most companies to penetrate with marketing messages.) The study encourages the reader to conclude that a bonanza awaits the marketer who can identify these organizations and convince them to acquire an advanced or dedicated search tool.
There is a different view. The research Arnold IT (owner of Beyond Search) has conducted over the last couple of decades suggests that this finding conveys some false optimism. For example, in the organizations and samples with which we have worked, we found almost 90 percent saturation of search. The one on one interviews reveal that many employees were unaware of the search functions available for the organization’s database system or specialized tools like those used for inventory, the engineering department with AutoCAD, or customer support. So, the search systems with advanced features are in fact in most organizations. A survey of a general population reveals a market that is quite different from what the chief financial officer perceives when he or she tallies up the money spent for software that includes a search solution. But providing one system is a problem: the engineering department’s drawings and specifications, the legal department’s confidential documents, the HR unit’s employee health data, and the Board of Directors’ documents revealing certain financial and management topics have to remain in silos. There is, we have found, neither an appetite to gather these data nor the money to figure out how to make images and other types of data searchable from a single system.
Far better to use a text oriented metasearch system and dismiss data from proprietary systems, images, videos, mobile messages, etc. We know that most organizations have search systems about which most employees know nothing. When an organization learns about these systems and then gets an estimate for creating one big federated system, the motivation drains from those who write the checks. In our research, senior management perceives aggregation of content as increasing risk and putting an information time bomb under the president’s leather chair.
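To make the text oriented metasearch idea concrete, here is a minimal sketch in Python. It assumes each silo can be reduced to a dictionary of document identifiers and plain text, which is exactly the simplification that leaves out images, drawings, and proprietary systems. The silo names, documents, and the term-counting scorer are hypothetical, and the sketch does not describe any vendor's product.

```python
def search_silo(documents, query):
    """Score each document by how many distinct query terms it contains."""
    terms = set(query.lower().split())
    results = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        score = len(terms & words)
        if score:
            results.append((score, doc_id))
    return results


def metasearch(silos, query, limit=10):
    """Send one query to every silo and merge the hits into a single list."""
    merged = []
    for silo_name, documents in silos.items():
        for score, doc_id in search_silo(documents, query):
            merged.append((score, silo_name, doc_id))
    # Highest score first; ties broken by silo name, then document id.
    merged.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return merged[:limit]


# Hypothetical text-only silos.
SILOS = {
    "legal": {"contract-7": "supplier contract renewal terms"},
    "hr": {"policy-2": "leave policy renewal", "memo-9": "parking memo"},
}

hits = metasearch(SILOS, "contract renewal")
```

The merge step is simple here because every silo uses the same scorer. In a real deployment each system ranks in its own way, so scores are not directly comparable, and that is one reason the estimate for a true federated system tends to be sobering.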
Finding: 47% feel that universal search and compliant e-discovery is becoming near impossible given the proliferation of cloud share and collaboration apps, personal note systems and mobile devices. 60% are firmly of the view that automated analytics tools are the only way to improve classification and tagging to make their content more findable.
The thrill of an untapped market fades when one considers the use of the word “impossible.” AIIM is correct in identifying the Sisyphean tasks vendors face when pitching “all” information available via a third party system. Not only are the technical problems stretching the wizards at Google, the cost of generating meaningful “unified” search results is a tough nut to crack for intelligence and law enforcement entities. In general, some of these groups have motivation, money, and expertise. Even with these advantages, the hoo hah that many search and eDiscovery vendors pitch is increasing potential customers’ skepticism. The credibility of over-hyped findability solutions is squandered. Therefore, for some vendors, their marketing efforts are making it more difficult for them to close deals and causing a broader push back against solutions that are known by the prospects to be a waste of money. Yikes. How does a trade association help its members with this problem? Well, I have some ideas. But as I recall, Ms. Steiger was not too thrilled to learn about the nitty gritty of shifting from micrographics to digital. Does the same characteristic exist within AIIM today? I don’t know.
October 9, 2014
I just returned after three days at a content processing conference. This was not one of the search engine optimization or vendor rah rah shows for search and business intelligence. Nevertheless, several presentations and numerous participants voiced a need for a “big red button.”
I think search and content processing vendors may want to spend a few minutes thinking about this metaphor.
So what’s a big red button? The idea among the law enforcement and intelligence professionals at this conference in the suburbs of Washington, DC borrows from an office supply vendor’s advertising campaign. The ad made the point that ordering from the vendor was as easy as pressing a big red button labeled “Easy.”
How does an ad for ink jet cartridges and pencils relate to six and seven figure enterprise search and content processing systems?
Easy, of course.
At this conference, the attendees and a number of speakers talked about the need to simplify findability, tracking, and analysis systems. The fancy visualizations, the ability to store massive amounts of data in a secure cloud, and the appetite among investigators for data are all on the rise.
The usability of the systems is either choked by work cycles that do not produce useful outputs, held back by a shortage of specialists who can operate these systems, or weighted down with bells and whistles that get in the way of some essential functions.
Enterprise search, analysis systems, and intelligence systems were described by one exhibitor as “the major barrier to sales.” One of the speakers from an investigative unit groused, “Once set up, my team has a very difficult time making changes to get the outputs in line with our operational needs.”
A recent study by the Association for Information and Image Management (AIIM) reported that “more than half of the organizations surveyed show little maturity in their approach to search, with no strategy, no allocated budget and no identified owner.”
How can vendors deliver solutions when the customers exhibit indifference to useful technology? How can the technology deliver results to the user so that more informed decisions can be made?
These are important questions that cannot be answered by references to low cost search options, buzzwords, and bootlegging fixes so a single user or a small unit can access digital information.
Several observations are warranted:
First, the sales cycle is becoming longer for many vendors, not just those at the intelligence trade show I attended. Digital solutions are procured in a way that defers a decision. None of the individuals involved wants to make a choice that will lead to pushback from users or scrutiny from the accounting department.
Second, the users get tangled in complex systems. When new systems are explored, the users want simplicity and the vendors deliver complexity. The “failure to communicate” adds bureaucratic friction and in some cases flare ups among vendors, decision makers, and technical staff reviewing a solution.
Third, the benefits of a new system or an incumbent system are often very difficult if not impossible to demonstrate. Without concrete data about cost/benefit, crimes solved, or good decision/bad decision ratios, search and content processing have a credibility problem.
The big red button is a powerful metaphor that suggests a pivotal moment in digital information access has arrived. Without a big red button, search and content processing may face even stronger headwinds going forward.
Stephen E Arnold, October 9, 2014
October 8, 2014
One contender in the site search market was recently upgraded to support more than 30 languages. We learn of the enhancement in, “Hawk Search Widens its Global Reach” at Direct Marketing News. The write-up tells us:
“Hawk Search’s solution offers support for more than twice as many languages as other site search providers, according to the company. The multi-language feature—which offers such languages as Spanish, Mandarin, German, French, and Russian—can be implemented site-wide for a unified customer experience with search and browse functions. The multi-language feature allows Hawk Search to do its thing on established international sites in addition to serving as a boon for online retailers looking to expand their regions.”
Hawk Search is produced by Thanx Media, which was founded in 2005 and makes its home in Glen Ellyn, Illinois. The company offers a suite of information-management tools including data collection and ecommerce as well as site search. They also appear to be hiring as of this writing, in case anyone is interested.
Cynthia Murrell, October 08, 2014
October 7, 2014
Google, believe it or not, is responsible in part for the problems with enterprise search. The idea is advanced in “Why the ‘Google Paradigm’ Has Damaged Enterprise Search.” The core of the argument is that people use Google for Web search. The resulting perception is that “enterprise search is as easy as Google web search, and that a central index of an enterprise is the right way to do enterprise search.”
Google’s entrance into enterprise search was one of the company’s earliest attempts to enter a market in which revenue came from a subscription or license, not a fee for advertising. The Google Search Appliance was a server loaded with a version of Google’s Web search system. Based on our work with the first GSA, it was clear that like many other Google products and services from the 2001 to 2004 period, Google was operating on some Googley assumptions; for example:
- Google assumed that a version of its Web search system stripped of its ad matching was good enough for finding textual content in an organization.
- The company assumed that the systems from Autonomy, Endeca, and Fast Search & Transfer, the dominant enterprise search vendors at this time, were too complex for most technical staff in an organization. The time and complexity of these systems contributed to the high user dissatisfaction with these systems. The high cost of these industry leaders’ systems contributed to management grousing about search.
- Google assumed that it could disintermediate traditional information technology departments and deal directly with end users.
Google crafted a server that was positioned as a “search toaster.” The basic unit cost less than $2,000 and sported an interface that required the licensee to plug in basic information and click a button to start the indexing process.
The Google Search Appliance by 2007 had an estimated 50,000 licensees. At that time, the product line had expanded but the locked down nature of the Google Search Appliance and the key word approach of the system were creating sales opportunities for other search appliance vendors; namely, Thunderstone, Maxxcat, and Index Engines.
Google added features and fiddled with the license fees, hardening the GSA product line with hot backups, connectors, and extensibility via licensed vendors. Few analysts paid much attention to the product licensing fees for the various “GB” or Google boxes. If you want to get a sense of the costs for building out a GSA system that can process 100 million documents, navigate to www.GSAadvantage.gov and search for Google’s search appliances. The costs work out to be comparable to or slightly higher than a similar installation from Autonomy, Endeca, or Fast. The high prices remain today.
Google learned from the GSA experience. Instead of offering an enterprise cloud solution, the company has left a limited and pricey GSA product line in the market and provided a modest commitment to this enterprise search solution. Google’s cloud solution manifests itself in Google’s site search features. I am waiting for Google either to kill the product line or amp up its commitment. In my opinion, the GSA is in no man’s land at this time. It appears that not even Google can respond to the needs of the enterprise findability users. If any company could crack the code, would it not be Google or a Xoogler’s start up?
As the GSA emerged as a placeholder product, professionals became more and more dependent on Google’s Web search. In Europe, for example, Google’s Web search commands a market share in excess of 80 percent. In Denmark, Google’s share of Web search is north of 90 percent. In the US, Google has a 65 to 75 percent share of Web search, depending on which consultancies’ numbers one uses.
The word “search” became synonymous with Google. Enterprise search vendors began to use jargon other than search. This step was a natural reaction to hearing from prospects, “We want a search that works just like Google.” What the prospects meant was a system that was easy to use and seemed to deliver useful results in the hits displayed at the top of a results list, a page of images, or a map showing a location.
Google Web search, not the Google Search Appliance, reflected a broader shift in the information access market. Users of Web search and enterprise systems wanted and still want:
- Systems that do not require the user to invest much time and effort in getting an answer
- Systems that can produce useful outputs whether text, images, or maps with data displayed on them
- Systems that deliver “answers” without the delays (latency) many enterprise systems force on users.
Google’s ability to respond to this enterprise demand has been ineffectual. As other Web search vendors have found, key word retrieval does not solve the problems basic search systems spawn. The GSA is evidence that Google does not have the key to unlock the revenue vault for enterprise search.
What Google search has done (inadvertently I might add) has been to make crystal clear that users do not want to work hard for information users perceive as useful. Precision and recall are irrelevant because voting and advertisers influence Google Web search results. Users love Google’s outputs.
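For readers who have not met the two measures, precision is the share of returned results that are relevant, and recall is the share of all relevant items that were returned. A minimal sketch in Python follows. The document identifiers and the relevance judgments are invented for illustration and describe no real result list.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: identifiers the system returned
    relevant:  identifiers a human judged relevant (assumed known)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    true_positives = len(retrieved_set & relevant_set)
    # Guard against empty result lists or empty judgment sets.
    precision = true_positives / len(retrieved_set) if retrieved_set else 0.0
    recall = true_positives / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list: two useful hits and two promotional ones.
retrieved = ["doc1", "doc2", "ad1", "ad2"]
relevant = ["doc1", "doc2", "doc3"]

precision, recall = precision_recall(retrieved, relevant)
```

With this made-up data, half of what came back is useful and one relevant document never surfaced. Both numbers depend on someone having judged relevance in advance, which is the expensive part in any real evaluation.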
In the organization, procurement teams, individual users, and senior management boil their needs down to one simple statement: “We want search that is just like Google.”
That’s a big, big problem for search and content processing vendors. Google Web search is not about relevance, objective information, or accuracy. Google is easy and “good enough.” In an organization, people want easy. But in an organization the results have to be timely, comprehensive in terms of what information is available to an organization, and accurate.
On the Web Google can skip content that is malformed or stored on a server that does not respond to a Google spider quickly enough. In an organization, the content has to be available. On the Web, the advertisers and the users’ own behavioral data pay the bills. In an organization, the organization has to pay the bills. Google has more money from a different business model than most organizations. Google pumps money into plumbing to deliver the service that makes money. Organizations want to fix the amount spent on search and the funds are not infinite.
For search vendors, the problem of Google’s dominance in Web search makes product differentiation difficult. Google’s business model creates challenges for vendors who have to justify the “value” and hence the “cost” of their search systems. For traditional search vendors, ease of use is very, very difficult because of the nature of the questions enterprise system users have.
Google is a mirror in which societal, cultural, and intellectual changes in information access are reflected. For many years, I have called attention to the verbal push ups search vendors use to try and make sales. The struggles Hewlett Packard has had with Autonomy provide a glimpse of how “value” can be difficult to change into hard cash. Microsoft’s Delve illustrates that search for Office 365 is a combination of contacts, alerts, and personalization, not key word search. The dependence of enterprise search companies on cash from venture capital sources illustrates that traditional search is a very, very tough business to make into something sustainable and profitable without financial life support. The expectation that Watson will become a $10 billion business in 60 months is disconnected from the experience of other smart companies. In the history of enterprise search, only Autonomy reported revenues of more than $800 million from enterprise licenses. IBM projects more than 10 X this revenue in 60 months. It took Autonomy more than a decade to hit $500 million.
The reality is that Google is not the problem. Google is a metaphor for what users want when it comes to information access.
The write up asserts:
The Google paradigm also ignores the challenge of scalability. Indexing the enterprise for a centralized enterprise search capability requires major investment. In addition, centralization runs counter to the realities of the working world where information must be distributed globally across a variety of devices and applications. The amount of information we create is overwhelming and the velocity with which that information moves increases daily.
Interesting statement. For me, the problem of the Google paradigm is that it is another bit of jargon that sidesteps what information retrieval must deliver in today’s business environment. Whoever cracks the code can make money. My hunch is that Google probed the enterprise search market and is trying to figure out how to make it pay off in a significant way. Google may be trapped in the same problem space through which other enterprise search and content processing vendors slog. The question may be, “Is there a way out of the swamp and into a land of milk, honey, sustainable revenue, and healthy margins?”
Stephen E Arnold, October 7, 2014