October 16, 2014
This may be old news. We were updating our list of search engines and received an error from the service called Gool.li, a metasearch system. Our last check for this system was in January 2013. At that time the company’s Web site was online and an Android app was available. The name is a variant of the Arabic phrase for “tell me”. More information about the system is available in a nine slide deck at this link.
As you may recall, the service used a panel-style interface or what the company called “cards design”. Each panel corresponded to particular types of content.
The system was described as delivering “knowledge as a service.” One interesting feature of the search results was a grouping of links by domains.
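For readers who have not seen domain grouping in a results list, here is a minimal sketch in Python. It is not Gool.li's code. The function name `group_by_domain` and the sample URLs are assumptions invented for illustration; the sketch only shows how links merged from several engines could be bucketed by host name.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_domain(urls):
    """Group result links under the host name each link points to."""
    groups = defaultdict(list)
    for url in urls:
        # netloc holds the host portion of the URL, e.g. "example.com"
        host = urlparse(url).netloc.lower()
        groups[host].append(url)
    return dict(groups)


# Hypothetical merged results from several engines (not real Gool.li output)
results = [
    "http://example.com/a",
    "http://example.org/x",
    "http://example.com/b",
]
grouped = group_by_domain(results)
```

A display layer could then show one heading per host with that host's links listed beneath it.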
The company was based in Montréal and was a project of Al Akhawayn University. My search file suggests that the system architect may have been Jawad Jari and that the service utilized Amazon Web Services.
Web metasearch seems to be a harsh taskmaster.
Stephen E Arnold, October 17, 2014
October 15, 2014
There is a presentation “Kicking the Bukkit: Anatomy of an Open Source Meltdown” by Ryan Michela, a developer with experience in open source. Over several years, an open source game project rose and fell. I am not too interested in open source games. At the end of the Slideshare document, there are five reasons an open source game project failed.
Let me summarize these and encourage you to work through the full 55 slide deck. How many of these issues may have an impact on open source search systems? Keep in mind that commercial enterprises like Attivio and IBM make use of open source technology.
- Inclusion of decompiled code in an open source project
- License issues
- Tie ups within the community before a project gains momentum
- No contributor license agreement
- Disgruntled developers in the community.
The presentation includes a quote that I noted:
It only takes one unhappy developer to kill an unprotected project.
Is there an open source search company vulnerable to one or more of these issues? I can name a couple. I wonder if the firm’s funding sources are concerned about their investment “kicking the bucket”?
Stephen E Arnold, October 15, 2014
October 15, 2014
I assume this statement is a surprise to some folks:
Now, disclosure text has become very small, and the shading very subtle, meaning users often don’t realize they are clicking through to ads rather than the most relevant result for their query.
In an increasingly important quest for revenue, these allegedly deceptive ads may be just the beginning of math club maneuvers. Relevance has a new meaning. Perhaps it is a synonym for revenue?
Stephen E Arnold, October 14, 2014
October 15, 2014
Web site design used to be reserved for graphic designers with a fancy degree and background in computer science. Times have changed from the daunting trials of coding to simple click and drag selections. Web site design services such as WordPress, Tumblr, Wix, Weebly, and Squarespace simplify the process so anyone can create a decent site in seconds. If, however, you are interested in building a site that is more interactive than standard templates, then start taking advantage of UICloud.
“UICloud is a project created by Double-J Design. It collects the best UI element designs from the Internet all over the world and provides a search engine for you to find the best UI element that you need. We are aiming to create the biggest platform for designers to showcase their top user interface designs and for developer to get the best UI elements for their project easily and quickly.”
UICloud combines elements of Web site browsing and searching in one place. If you search for a specific topic, the results appear in thumbnails so you can preview the art. It takes advantage of the “magazine” format that’s grown popular. Categories are reminiscent of old webrings and link lists that used to collect related Web sites in one place. Categories are a neat feature, because they save the trouble of searching and take you straight to browsing. Remember how half the links used to be defunct? It is easy to see that happening.
Users can submit their user interface design to UICloud and then it will be added to the search results. All the listings might not be under the creative commons agreement. The UICloud team notes that you need to check with the artist before you use them.
October 14, 2014
Is it true that if one bets on enough race horses, one will win? Seems logical to those who hang out at Churchill Downs I suppose.
Miles away from the race track, I found the audiences I addressed at last week’s intelligence and law enforcement conference skeptical of Google’s search results. In 2013, there was more surprise when I demonstrated how queries for “twerk” did not involve too much “search.”
After the sessions, attendees commented on how much work is required to ferret out relevant results to queries. The notion that LE and intel professionals had to learn command line syntax to get useful information was a situation I did not think would arise. Hey, Google has smart software, artificial intelligence, the world’s fastest search engine, yada yada.
Searching Google is actually difficult if one wants to answer certain types of questions; for example, who in Scotland sells tactical shotgun silencers. Give that query a whirl in your spare time.
I read “Google Set to Lead Huge Investment in Magic Leap and Its ‘Cinematic Reality’.” The write up provides a surprisingly poignant glimpse into the Google business machine. Google is no longer content to borrow notions like head mounted mobile phones. Google wants to lift beyond big balloons which rhymes with “loons.” Google does not want to solve death.
Nope. Google is betting that it can invest in companies and tap into a new, swelling revenue stream. Search, it seems, has become an optimization task for the Googlers. The future lies in “cinematic reality.”
Google wants to be the lead elephant in the investing parade for Magic Leap. You can work through the original document to get a sense of the Fancy Dan augmented reality technology Magic Leap allegedly possesses.
My view is that Google has to find a way to sustain revenue growth. Search is not the prize winning stallion it once was. I assume that Google believes that investments in companies that deliver magic will produce big bucks.
For me, I am concerned that the utility of the Google search system will continue to decrease for the types of research I do. If the feedback I received from LE and intel professionals is representative, there are a number of serious individuals who want a Google search to return relevant results, not ads and promotions for Google products and services.
I am all for magic, but magic involves tricks. Search requires more than wild bets and a faith in magic.
I do not crave a more realistic three dimensional experience. I am okay with a system that:
- Includes useful content in an accessible interface. Google’s convoluted blog search is not what I call accessible.
- Presents results that are in line with needs of the user, not the needs of the advertiser.
- Provides more frequently refreshed indexes for pages with content that is not focused on Dancing with the Stars, vacations, and hotels.
I want some of that old time search magic. Maybe a futuristic, robotic pony clone will make Google billions. I prefer a search donkey that gets the job done. Onward, precision and recall.
Stephen E Arnold, October 14, 2014
October 13, 2014
In the supplemental lecture added after the intel conference ended, I addressed the topic of disappearing content. The “right to be forgotten” is one of the great ideas emerging from government committees. I wonder who wants to be forgotten? I provided some basic information about finding information about these forgotten entities.
One of the attendees at my lecture alerted me to Profile Engine. I navigated to the link and learned:
Profile Engine is a fairly low-budget-looking search engine, started in 2007 in New Zealand and partly owned by the Auckland University of Technology. It allows you to find people on social networks. Google has been getting a lot of requests to reverse this trend—almost 3,300 results from Profile Engine have been taken down by Google since May, when the “right to be forgotten” came into effect.
You can find Profile Engine at http://profileengine.com/. We can’t endorse the system, but we will check it out, and I will have an update for my next lecture. Conference organizers extend invitations via email. If you don’t hear about an event, you need to get yourself unforgotten. That’s a bit of humor for this Monday morning.
Stephen E Arnold, October 13, 2014
October 12, 2014
Oracle’s Secure Enterprise Search offered advanced security. Perfect Search stressed its speed. SES has been marginalized. That particular security pitch did not work. Perfect Search also has faded from the scene.
Perhaps pitching both security and speed will yield more together than as separate features.
SRCH2 asserts that it is four times faster than open source search engines. None of the open source search engines is a speed demon. Speed boosts require additional work on the specific subsystem introducing the latency for a particular deployment.
SRCH2’s “Real Time Computer Requires Faster Search” makes a case for the optimization built into SRCH2’s system. The article states:
SRCH2 offers the world’s fastest search engine. Why is speed so important? After all, the human eye can’t detect the difference between a 10-millisecond and 50-millisecond response time.
Some data backing this assertion would be helpful. In a direct comparison of Lucid Works’ technology with ElasticSearch’s technology, the ArnoldIT team found that one was faster in indexing and the other was faster in query processing. Both could be improved with focused optimization. Perhaps SRCH2 will share some of the data that back up the “four times faster” claim? (I am not at liberty to release the performance data a client requested my team compile from live tests on my test corpus.)
On the security side, there is SRCH2’s “SRCH2 Introduces Access Control Lists to Improve Search Security.” The article states:
SRCH2 took the approach of providing native support of access control to set restrictions on search results. With SRCH2’s ACL feature, developers can restrict user permissions to access either certain records in an index, or specific attributes within a record or set of records.
The approach is useful. However, it is less robust than the Oracle approach, which implemented a wider range of features provided by specialized Oracle subsystems.
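To make the two levels of restriction in the quoted passage concrete, here is a minimal sketch in Python. It is not SRCH2's API and says nothing about how SRCH2 implements the feature. The function name `filter_results`, the two ACL dictionaries, and the sample records are all assumptions invented for illustration. The sketch applies a record-level check first and then strips restricted attributes from whatever records remain.

```python
def filter_results(records, user, record_acl, attribute_acl):
    """Return only the records and attributes the user may see.

    record_acl maps a record id to the set of users allowed to see it.
    attribute_acl maps a field name to the set of users allowed to see it;
    fields not listed there are treated as unrestricted (an assumption).
    """
    visible = []
    for record in records:
        # Record-level check: skip records the user is not listed for.
        if user not in record_acl.get(record["id"], set()):
            continue
        # Attribute-level check: keep a field only if it is unrestricted
        # or the user appears in that field's allowed set.
        cleaned = {
            field: value
            for field, value in record.items()
            if field not in attribute_acl or user in attribute_acl[field]
        }
        visible.append(cleaned)
    return visible


# Hypothetical index contents and permissions
records = [
    {"id": 1, "title": "Budget memo", "amount": 90000},
    {"id": 2, "title": "Board minutes", "amount": 5},
]
record_acl = {1: {"alice", "bob"}, 2: {"alice"}}
attribute_acl = {"amount": {"alice"}}

bob_view = filter_results(records, "bob", record_acl, attribute_acl)
alice_view = filter_results(records, "alice", record_acl, attribute_acl)
```

In this toy setup, one user sees a single record with the restricted field removed, while the other sees both records in full. A production system would have to apply such checks inside the engine rather than after the fact, which is presumably what "native support" refers to.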
Will the combination of security and speed pay off for SRCH2? Good question. I do not have an answer.
Stephen E Arnold, October 11, 2014
October 10, 2014
A happy quack to the reader who forwarded me a link to the biographical information for Gerald Burnand. I learned from this information page that Convera lives on as Ntent. The page reports:
Ntent. Privately Held; Search Technology, Semantics, Advertising and Marketing company. Previously was Vertical Search Works, born from the merger of Convera and Firstlight ERA (a UK company).
Stephen E Arnold, October 11, 2014
October 10, 2014
I worked through the 34 page report “Industry Watch. Search and Discovery. Exploiting Knowledge, Minimizing Risk.” The report is based on a sampling of 80,000 AIIM community members. The explanation of the process states:
Graphs throughout the report exclude responses from organizations with less than 10 employees, and suppliers of ECM products and services, taking the number of respondents to 353.
The demographics of the sample were tweaked to discard responses from organizations with fewer than 10 employees. The sample included respondents from North America (67 percent), Europe (18 percent) and “rest of world” (15 percent).
Some History for the Young Reader of Beyond Search
AIIM has roots in imaging (photographic and digital imaging). Years ago I spent an afternoon with Betty Steiger, a then well known executive with a high profile in Washington, DC’s technology community. She explained that the association wanted to reach into the then somewhat new technology for creating digital content. Instead of manually indexing microfilm images, AIIM members would use personal computers. I think we connected in 1982 at her request. My work included commercial online indexing, experiments in full text content online, a CD ROM produced in concert with Predicasts and Lotus, and automated indexing processes invented by Howard Flank, a sidekick of mine for a very long time. (Mr. Flank received the first technology achievement award from the old Information Industry Association, now the SIIA.)
AIIM had its roots in the world of microfilm. And the roots of microfilm reached back to University Microfilms at the close of World War II. After the war, innovators wanted to take advantage of the marvels of microimaging and silver-based film. The idea was to put lots of content on a new medium so users could “find” answers to questions.
The problem for AIIM (originally the National Micrographics Association) was indexing. Because I was an officer at a company considered in the 1980s to be one of the leaders in online and semi automated indexing methods, Ms. Steiger and I had a great deal to discuss.
But AIIM evokes for me:
Microfilm —> Finding issues —> Digital versions of microfilm —> CD ROMs —> On premises online access —> Finding issues.
I find the trajectory from microfilm to pronouncements about enterprise search, content processing, and eDiscovery fascinating. The story of AIIM parallels the challenges facing the traditional publishing industry (what I call the “dead tree method”), which has, like Don Quixote, galloped into battle with ones and zeros.
Asking a trade association’s membership for insights about electronic information is a convenient idea. What’s wrong with sampling the membership and others in the AIIM database, discarding those who belong to organizations with fewer than 10 employees, and tallying up the survey “votes”? For most of those interested in search, absolutely nothing. And that may be part of the challenge for those who want to get smart about search, findability, and content processing.
Let’s look at three findings from the 30 plus page study. (I have had to trim because the number of comments and notes I wrote when reading the report is too massive for Beyond Search.)
Finding: 25 percent have no advanced or dedicated search tools. 13 percent have five or more [advanced or dedicated search tools].
Talk about good news for vendors of findability solutions. If one thinks about the tens of millions of organizations in the US and discards the 10 percent with 10 or fewer employees, there is apparently quite a large percentage with simplistic tools. (Keep in mind that there are more small businesses than large businesses by a very wide margin. But that untapped market is too expensive for most companies to penetrate with marketing messages.) The study encourages the reader to conclude that a bonanza awaits the marketer who can identify these organizations and convince them to acquire an advanced or dedicated search tool. There is a different view. The research Arnold IT (owner of Beyond Search) has conducted over the last couple of decades suggests that this finding conveys some false optimism. For example, in the organizations and samples with which we have worked, we found almost 90 percent saturation of search. The one on one interviews reveal that many employees were unaware of the search functions available for the organization’s database system or specialized tools like those used for inventory, the engineering department with AutoCAD, or customer support. So, search systems with advanced features are in fact in most organizations. A survey of a general population reveals a market that is quite different from what the chief financial officer perceives when he or she tallies up the money spent for software that includes a search solution. But there are problems with providing one system to handle the engineering department’s drawings and specifications, the legal department’s confidential documents, the HR unit’s employee health data, and the Board of Directors’ documents revealing certain financial and management topics. These content types have to remain in silos. There is, we have found, neither an appetite to gather these data nor the money to figure out how to make images and other types of data searchable from a single system.
Far better to use a text oriented metasearch system and dismiss data from proprietary systems, images, videos, mobile messages, etc. We know that most organizations have search systems about which most employees know nothing. When an organization learns about these systems and then gets an estimate for creating one big federated system, the motivation drains from those who write the checks. In our research, senior management perceives aggregation of content as increasing risk and putting an information time bomb under the president’s leather chair.
Finding: 47% feel that universal search and compliant e-discovery is becoming near impossible given the proliferation of cloud share and collaboration apps, personal note systems and mobile devices. 60% are firmly of the view that automated analytics tools are the only way to improve classification and tagging to make their content more findable.
The thrill of an untapped market fades when one considers the use of the word “impossible.” AIIM is correct in identifying the Sisyphean tasks vendors face when pitching “all” information available via a third party system. Not only are the technical problems stretching the wizards at Google, but the cost of generating meaningful “unified” search results is a tough nut to crack even for intelligence and law enforcement entities. In general, some of these groups have motivation, money, and expertise. Even with these advantages, the hoo hah that many search and eDiscovery vendors pitch is increasing potential customers’ skepticism. The credibility of over-hyped findability solutions is squandered. Therefore, for some vendors, their marketing efforts are making it more difficult for them to close deals and causing a broader push back against solutions that are known by the prospects to be a waste of money. Yikes. How does a trade association help its members with this problem? Well, I have some ideas. But as I recall, Ms. Steiger was not too thrilled to learn about the nitty gritty of shifting from micrographics to digital. Does the same characteristic exist within AIIM today? I don’t know.
October 9, 2014
I just returned after three days at a content processing conference. This was not one of the search engine optimization or vendor rah rah shows for search and business intelligence. Nevertheless, several presentations and numerous participants voiced a need for a “big red button.”
I think search and content processing vendors may want to spend a few minutes thinking about this metaphor.
So what’s a big red button? The idea among the law enforcement and intelligence professionals at this conference in the suburbs of Washington, DC, borrows from an office supply vendor’s advertising campaign. The ad made the point that ordering from the vendor was as easy as pressing a big red button labeled “Easy.”
How does an ad for ink jet cartridges and pencils relate to six and seven figure enterprise search and content processing systems?
Easy, of course.
At this conference, the attendees and a number of speakers talked about the need to simplify findability, tracking, and analysis systems. The fancy visualizations, the ability to store massive amounts of data in a secure cloud, and the appetite among investigators for data are all on the rise.
The usability of the systems is either choked by work cycles that do not produce useful outputs, held back by a shortage of specialists who can operate these systems, or weighted down with bells and whistles that get in the way of some essential functions.
Enterprise search, analysis systems, and intelligence systems were described by one exhibitor as “the major barrier to sales.” One of the speakers from an investigative unit groused, “Once set up, my team has a very difficult time making changes to get the outputs in line with our operational needs.”
A recent study by the Association for Information and Image Management (AIIM) reported that “more than half of the organizations surveyed show little maturity in their approach to search, with no strategy, no allocated budget and no identified owner.”
How can vendors deliver solutions when the customers exhibit indifference to useful technology? How can the technology deliver results to the user so that more informed decisions can be made?
These are important questions that cannot be answered by references to low cost search options, buzzwords, and bootlegging fixes so a single user or a small unit can access digital information.
Several observations are warranted:
First, the sales cycle is becoming longer for many vendors, not just those at the intelligence trade show I attended. Digital solutions are procured in a way that defers a decision. None of the individuals involved wants to make a choice that will lead to pushback from users or scrutiny from the accounting department.
Second, the users get tangled in complex systems. When new systems are explored, the users want simplicity and the vendors deliver complexity. The “failure to communicate” adds bureaucratic friction and in some cases flare ups among vendors, decision makers, and technical staff reviewing a solution.
Third, the benefits of a new system or an incumbent system are often very difficult if not impossible to demonstrate. Without concrete data about cost/benefit, crimes solved, or good decision/bad decision ratios, search and content processing have a credibility problem.
The big red button is a powerful metaphor that suggests a pivotal moment in digital information access has arrived. Without a big red button, search and content processing may face even stronger headwinds going forward.
Stephen E Arnold, October 9, 2014