February 24, 2014
Yahoo may not be able to wriggle out of the Microsoft Bing search deal. Microsoft may not be making much progress in catching Google, and Yahoo may want to swizzle a different spin on Web search. Microsoft’s voice enabled technology seems to be disappointing Ford. The US auto maker may be embracing BlackBerry’s QNX system. Yep, BlackBerry, a stellar outfit in my experience. Microsoft has some issues to resolve, particularly if it loses a major account to the shareholder-pleasing Waterloo, Ontario company.
I read “Yahoo Launches $10 Million Research Effort to Invent a Smarter Siri.” I find the notion that a large company can invent voice search that is “better” than another voice search system interesting. Google has a voice search system, and there are a number of companies eager to make their voice search technology available to Yahoo. But Yahoo apparently has confidence in Carnegie Mellon University, the outfit that delivered Lycos, Vivisimo, and Claritech to information seekers in the past.
According to the Technology Review article:
Ron Brachman, head of Yahoo Labs, says that he expects the InMind project to experiment with apps that are capable of rudimentary conversation—for example, asking a person follow-up questions and making suggestions based on new information. “This is missing from Siri,” he says, adding that although Apple’s personal assistant is impressive, it doesn’t attempt to understand the context in which it is being asked a question: it doesn’t understand what the user is doing or might need at the moment.
With Web search shifting to mobile like iron filings following a magnet, users find typing less facile on a mobile device. Will Yahoo crack the code in five years with the help of the CMU professors and students?
Five years is a long time. Like Facebook and Google, Yahoo may find it more expedient to start buying voice recognition companies and licensing available technology. WhatsApp, a company that Facebook bought in February, promptly said it would not change. I learned today that Facebook will be adding voice calls to WhatsApp. How long did that “will not change” statement endure? WhatsApp did not have five days.
Yahoo may not have five years.
Stephen E Arnold, February 24, 2014
February 24, 2014
Finally, there are easier ways to find out whether your great idea has already been patented by an earlier-rising birdie. GCN reveals two new tools in, “Patent Search Engines Aim to Open Innovations to the World.”
The Lens is an open search engine built specifically for hunting down patent information, created by Richard Jefferson of the Queensland University of Technology. The Lens crawls through about 100 million documents in 90 countries, and its creator hopes it will help level the innovation playing field. Interestingly, Jefferson traces his lineage directly to Thomas Jefferson, who started the U.S. patent system in the first place. Perhaps that is why Richard Jefferson seeks to rectify the “dire straits” he feels that system is now in: being gamed by companies “incredibly skilled in hiding the ball in intentionally opaque patents.” The article tells us:
“The Lens already hosts several tools for analysis and exploration of the patent literature, including graphical representations of search results to advanced bioinformatics tools. In 2014 developers will be working to create forms of the Lens that can allow all annotations, commentary and sharing to be behind firewalls for those who need it, without forsaking the open and inclusive cyberinfrastructure, the organization said on its website.”
Meanwhile, the U.S. Patent and Trademark Office (PTO) itself seeks to address the need for streamlined patent search with its Global Patent Search Network. The article doesn’t say how many countries this engine reaches, but does mention that the PTO has worked with China’s government to make their patent documentation searchable; that cooperation is nothing to sneeze at. The article reveals:
“Users can search documents, including published documents and granted patents, recorded from 2008 to 2011. The records are available in English machine translations, which PTO acknowledged could sometimes generate awkward wording, but ‘provided an excellent way to determine the gist of the information in a foreign patent.’”
So, next time you want to know whether your invention has already been invented, turn to these tailor-made search engines.
Cynthia Murrell, February 24, 2014
February 17, 2014
The article titled Twitter.com Gets New Search Filters for News, Videos, and People You Follow on TNW pronounces that Twitter has improved search (a little bit). In sum, Twitter’s search will now allow its users to search in the categories of photos, videos, news, people you follow, and locations. This is certainly meant to make search easier for its users. When it comes to sorting through the millions of Tweets, it might come in handy to have more specific filters. The article explains,
“Twitter revealed the new features today with a tweet, but it’s not clear exactly when the filters began rolling out. Earlier filters let you specify whether you were searching for photos or people. The official iOS and Android Twitter apps got new search filters last November. Twitter’s Advanced Search feature still exists for those who need the extra search operator functions.”
The announcement tweet read, “We’re bringing new filters to search on http://twitter.com: by videos, news, people you follow, and more.” This small change might not be the most exciting innovation in search, but the article does express some interest in the new ability to weed out irrelevant tweets when searching for something read earlier in the day.
Chelsea Kerwin, February 17, 2014
February 16, 2014
The article on ITWorld titled China’s Baidu Testing Search Engines for Brazil, Egypt, Thailand explores the ambition of China’s premier search engine. For some years the company has contemplated moving beyond China, and in 2008 began targeting Japan. Now it is readying a move into Egypt, Thailand and Brazil, although the search sites are still in the internal testing phase, according to Baidu spokesman Kaiser Kuo. The article explains,
“The three sites can be found at www.baidu.com.eg, www.baidu.co.th, www.baidu.net.br and are designed in the local language of each market. In addition to a search bar, the landing pages to the sites offer direct links to popular services such as Facebook, YouTube, as well as Hao123, Baidu’s own local Web directory. Besides Web search, the sites also contain different features such as image and video search, along with language translation.”
The expansion into international waters means contending with Google, the giant that claims just under 70% of all searches as of December 2013. In the same month Baidu accounted for just under 20% of searches on desktop PCs. Spokesman Kuo made it clear that Baidu is not content to stop at Egypt, Thailand and Brazil, but plans to develop search engines for other nations too, and is currently building an office in Shenzhen solely for international operations.
Chelsea Kerwin, February 16, 2014
February 13, 2014
I have an iPod and an iPad kicking around. We even have a Mac computer. My wife has an iPhone. The gizmo provides her iPhone equipped friends with myriad opportunities to look at baby pictures, check lunch dates on their calendars, and make phone calls. None of the gizmos is without flaws. The Apple product line up is premium priced and designed to meet the perceived needs of semi-affluent or pretend-affluent customers.
I read “A Look at Apple’s R&D Expenditures from 1995-2013.” I urge you to read the story but the main point is the diagram showing Apple’s spending for research and development. I translate “research and development” to “innovation” but you may have a different way to define the phrase. I have snipped a small segment of the chart to illustrate what has happened to Apple, based on the data presented in the write up.
Look at that ramp up. What is fascinating is that the scale in 2013 noses into the $4 billion range. The take away is that the amount of money Apple is spending is rising pretty quickly. Apple has money in the bank and some products that continue to sell well.
Apple is able to invest increasing amounts of money in innovation because it has money.
Search vendors face innovation problems. The chatter on LinkedIn and in the write ups for conferences that flood my email talks around innovation. The discussion pivots on some well worn themes, only tangentially related to information retrieval.
Innovation in search has stalled. Apple is spending aggressively to help ensure its innovation flow.
But what happens when a search vendor with far less money has to innovate? DARPA will award a handful of contracts. Venture funding sources will want a pay off.
The net net is that innovation in search is not that different from innovation at Apple in its need for financial investment. Apple can write the checks. Most search vendors, despite the flashy webinars and mindless news releases, cannot.
Stephen E Arnold
February 12, 2014
I read “Gödel, Escher, Bach: An Eternal Golden Braid” in 1999 or 2000. My reaction was, “I am glad I did not have Dr. Douglas R. Hofstadter critiquing my lame work for the PhD program at my university.” Dr. Hofstadter’s intellect intimidated me. I had to look up “Bach” because I knew zero about the procreative composer of organ music. (Heh, heh)
Imagine my surprise when I read “Why Watson and Siri Are Not Real AI” in Popular Mechanics magazine. Popular Mechanics is not my first choice as an information source for analysis of artificial intelligence and related disciplines. Popular Mechanics explains saws, automobiles, and gadgets.
But there was the story, illustrated with one of those bluish Jeopardy Watson photographs. The write up is meaty because Popular Mechanics asked Dr. Hofstadter questions and presented his answers. No equations. No arcane references. No intimidating the fat, ugly grad student.
The point of the write up is probably not one that IBM and Apple will like. Dr. Hofstadter does not see the “artificial intelligence” in Watson and Siri as “thinking machines.” (I share this view along with DARPA, I believe.)
Here’s a snippet of the Watson analysis:
Watson is basically a text search algorithm connected to a database just like Google search. It doesn’t understand what it’s reading. In fact, read is the wrong word. It’s not reading anything because it’s not comprehending anything. Watson is finding text without having a clue as to what the text means. In that sense, there’s no intelligence there. It’s clever, it’s impressive, but it’s absolutely vacuous.
I had to look up vacuous. It means, according to the Google “define” function: “having or showing a lack of thought or intelligence; mindless.” Okay, mindless. Isn’t IBM going to build a multi-billion dollar a year business on Watson’s technology? Isn’t IBM delivering a landslide business to the snack shops adjacent to its new Watson offices in Manhattan? Isn’t Watson saving lives in Africa?
The interview uses a number of other interesting words; for example:
Yet my favorite is the aforementioned—vacuous.
Please, read the interview in its entirety. I am not sure it will blunt the IBM and Apple PR machines, but kudos to Popular Mechanics. Now if the azure chip consultants, the failed Webmasters turned search experts, and the MBA pitch people would shift from hyperbole to reality, some clarity would return to the discussion of information retrieval.
Stephen E Arnold, February 11, 2014
February 12, 2014
Last I knew, the Google Search Appliance (GSA) had trimmed its product line, eliminated the impulse buy option for the Mini, and kept the price at the higher end of the appliance market.
I learned over the last two years that Google has placed more than 60,000 GSAs in organizations. I have no idea if the number is valid, but if it is, the GSA is one of the top dogs in enterprise search. I also heard that there was a small team working on the GSA and an even smaller team handling customer support. Google pushes functions to resellers who deal with the customers. Google outsources manufacturing of the GSA. Most important, Google seems to have an off-again, on-again interest in on premises search. The future, as I understand it, is the cloud. The GSA is, in my opinion, an anachronism in the Nest, X Labs, and Android-Chrome world. But, hey, I have been wrong before. I once asserted that basic search should not be a challenge for most organizations. Wow, did I get that wrong! Jail time, law suits, and DARPA’s almost admission that search is not working notwithstanding.
The GSA has been around almost a decade. Version 7.2 is “a leader in the Garnet Enterprise Search MQ.” I certainly don’t doubt the word of an estimable azure chip consulting firm. No, no, no.
The new version, according to Google, delivers:
- Metadata sorting. A function available in the 1983 version of Fulcrum Technologies’ system
- Language translation. A function available from Delphes in the 1990s
- A document preview function. iPhrase in 1999 delivered this feature
- Entity recognition. Verity implemented this function in the 1980s
- Dynamic navigation. Endeca rolled out this feature in 1998
In my opinion, the GSA is catching up to innovations available for many years from other vendors. Comparing the EPI Thunderstone and Maxxcat appliances to the GSA emphasizes that the GSA is not quite at parity with other products in the channel.
According to “Google Updates Enterprise Search Appliance Tool,”
The GSA 7.2 update comes more than a year after the firm upgraded the GSA to version 7.0, and builds on the features included in that update. The most notable includes the ability to improve the way data can be indexed with key attributes, such as author name, or the date it was created.
How much does a GSA cost? According to the US government’s GSAadvantage.gov, a 36 month license for a GB 7007 is $69,296 for 500,000 documents. Have more documents? Pay for an upgrade. However, I can use a hosted service like Blossom Software to index my content for about $2,400 per month. I can use the low cost dtSearch solution for $160 per seat. I can download an open source solution and do it myself.
For an organization with 20 million documents to index, the cost of the GSA solution noses into HP Autonomy territory. Too rich for my blood, and I think that lower cost appliance vendors will see the Google Search Appliance as a lead generator.
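The arithmetic behind that comparison can be sketched in a few lines of Python. This is a rough back-of-the-envelope sketch, not a pricing model: the dollar figures are the ones quoted above, while the function names and the linear scaling to 20 million documents are my assumptions. Google's actual upgrade pricing is not stated here, and the hosted service's document volume is not given either, so the comparison is not apples to apples.

```python
# Figures quoted in the post; everything else is illustrative.
GSA_LICENSE_USD = 69_296       # 36 month license for 500,000 documents
GSA_LICENSE_MONTHS = 36
GSA_DOCUMENT_LIMIT = 500_000
HOSTED_USD_PER_MONTH = 2_400   # "about $2,400 per month" hosted service


def gsa_monthly_cost() -> float:
    """Spread the GSA license fee evenly over its 36 month term."""
    return GSA_LICENSE_USD / GSA_LICENSE_MONTHS


def hosted_cost(months: int) -> int:
    """Total hosted-service fees over a number of months."""
    return HOSTED_USD_PER_MONTH * months


def gsa_linear_estimate(documents: int) -> float:
    """ASSUMPTION: license cost scales linearly with document count.

    Real upgrade pricing may differ; this only shows the order of magnitude.
    """
    return GSA_LICENSE_USD * documents / GSA_DOCUMENT_LIMIT


if __name__ == "__main__":
    print(f"GSA per month:           ${gsa_monthly_cost():,.2f}")
    print(f"Hosted over 36 months:   ${hosted_cost(36):,}")
    print(f"GSA, 20M docs (linear):  ${gsa_linear_estimate(20_000_000):,.0f}")
```

Under the linear assumption, the 20 million document case lands in the millions of dollars, which is the sense in which the price noses into high-end territory.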
I wonder if those azure chip consultants have licensed the GSA to handle their Intranet information retrieval tasks?
Stephen E Arnold, February 12, 2014
February 11, 2014
One of my two or three readers sent me a link to “DARPA-BAA-14-21: Memex.” The item is interesting because it reaches back to the idea of Vannevar Bush, sidesteps the use of the word “Memex” by a search vendor once operating in the United Kingdom, and provides pretty clear proof that DARPA is not happy with search. You can dig into the details at https://www.fbo.gov/utils/view?id=32c351ba7850360e140a29f363819052.
US government content has some interesting characteristics. One of the most interesting is that items like DARPA-BAA-14-21 appear without context. For example, there is not a hint, nary a whisper of In-Q-Tel’s investments in search and content processing. Years ago, I heard at an intel conference that In-Q-Tel funds promising companies but few of these deliver operational payoffs. You can see a list of In-Q-Tel investments at https://www.iqt.org/portfolio/. Some of these companies deliver darned interesting demonstration systems. Others have offered solutions that were eventually abandoned. Others are like Fourth of July fireworks; that is, the financial support and walk arounds provide the type of show that some decision makers perceive as progress and purposeful action.
The net net is that this DARPA item underscores that existing information retrieval systems are not appropriate for the future needs of DARPA. For me, this is one indication that my assertion about the troubled state of information retrieval is on the mark.
Perhaps the funding, the TREC tests, and the DARPA solicitation will yield a payoff for operational personnel. “Perhaps” is a bit soft even if the devalued dollars are real. Our research offers some interesting facts suggesting that finding information today is more difficult than it was five years ago.
Stephen E Arnold, February 11, 2014
February 10, 2014
I read “Fastgründer John Markus Lervik dømt til fengsel.” Assuming the story is accurate, Dr. John Lervik, the founder of Fast Search & Transfer, will serve at least one year in prison. The issue is related to the financial reporting of Fast Search & Transfer.
In 2008, Microsoft purchased the company for about $1 billion, a deal compared to the price Hewlett Packard paid for Autonomy and about what Oracle paid for Endeca. Dr. Lervik will have to pay legal fees. He will take appropriate legal steps to overturn the decision.
Enterprise search is a tough nut to crack technically and financially. The monetary challenges stem from the brutal costs of marketing and customer support. But these are at least as expensive as the cost of dealing adequately with technical challenges of enterprise search. For example:
- The time required to make a system deliver what the marketers assure customers are “ready to deploy” functions. Most large scale search solutions are not products. These are complex systems. Because each customer has specific requirements, the marketers do not understand that what they sold may take time to create, test, and deliver. Time is money. With an open ended problem, the cost is staggering.
- The problem of responding to crashes. When an enterprise search system flips over and dies, the cause may be the vendor, the reseller, or the client. Unfortunately the vendor takes the heat because many tech centric managers feel the “buck stops here.” Responding when a client is crazy mad is expensive. Failing to address the client’s need may delay payments or trigger legal action. Expensive stuff.
- The need to invest to keep pace with the information environment. Most of the mainstream search systems, including Fast Search and other older systems, focused on text. Handling different file types and different content types is an expensive operation for some vendors. The choice is stark: Spend and develop the components in house, spend money for third party solutions and then spend more to integrate those solutions into the core system, buy a company that has the people and the software needed, or ignore the client. There may be other options, but these four have big price tags. The cost of keeping up is brutal because information retrieval does not stand still.
- Figuring out why routine operations are slow or output unexpected results. Most search systems are far trickier to set up than licensees expect. With many knobs to turn, Fast Search could be tweaked so that results could boost certain content or address relevancy under specific circumstances. In a complex system, like Fast and many others, turning one knob and experimenting with threshold values could cause some darned exciting consequences. Rolling back those changes was an exciting operation in itself. When a Fast engineer had to figure out how to get the system back on track, the work was not trivial. What’s it cost to get an expert engineer to figure out what a licensee did? In many instances, a lot.
If you add up the costs of the technical work required for a complex search system, the need for money is significant. Dr. Lervik is not a financial expert; he is an expert in information retrieval. Not even ex-Googlers are adept managers. Witness the AOL goof related to “distressed babies.”
But a senior manager is expected to find solutions to difficult managerial, technical, and financial challenges. If the news story is true, it seems that Dr. Lervik was caught in a situation that set the stage for the unfortunate drama that has been playing out over the last five years.
The big question is:
Will other search and content processing vendors find themselves in a similar situation?
In my opinion, yes.
Warning signs are easy to spot. When search vendors that are seven or 12 years old continue to suck in venture funding, the warning flags are flying in my opinion. Search is essentially a zero license fee utility at this point. Firms that have yet to return a profit or show significant growth may find themselves taking financial short cuts.
The Xenky analyses make clear that financial stress is nothing new to search vendors. Check out the Convera, Delphes, and Fulcrum Technologies profiles. What’s different is that in today’s business environment, the consequences may be increasingly severe. You can find case studies of search vendors at www.xenky.com/vendor-profiles. There is no charge for these reports. Many describe enterprise search solutions that struggled financially and either shut down or sold out.
Enterprise search is a tough business. A sad quack for Dr. Lervik.
Stephen E Arnold, February 10, 2014
February 10, 2014
I came across www.news-spectrum.com. The system looked a bit like some of Autonomy’s visualizations. Here’s the splash screen for the service:
The idea is that a story can be viewed through time. There are news spectra for the UK, Europe, the US, politics, business, and a handful of other categories. A click on the “detail” button displays stories in the topic stream.
I navigated to a whois service and learned that the domain name is registered to Autonomy, now a unit of Hewlett Packard.
In my lectures for law enforcement and intelligence professionals, I point out that locating information in “news” is getting more difficult. Sites like News Spectrum and some others do not include a search function. When the search function is present, the user has to turn cartwheels to get useful information. For example, navigate to World News and run a query. I used Sochi. Here’s the result list:
Scroll down the page and this is what I saw:
Videos. Videos. More videos. Where is the text? You have to do some experimenting. A tip is to select a language and then rerun the query.
The problem, of course, is that most people just take what a system displays in the case of News Spectrum and World News.
Any type of in depth research requires some specialized, and often time consuming, tactics. You can learn more about how to get through the Kevlar padding that sites wrap around their indexes.
Net net: A news search can look good and run little videos. But for in depth information, news search is getting increasingly difficult.
Stephen E Arnold, February 10, 2014