Bing’s Got Some Useful Search Features

June 11, 2009

The goslings have been fumbling around because web feet don’t equate with Web searching. Nevertheless, we have gathered together six tips that we found particularly useful when running “bings” on the Microsoft search site. Tell your friends to “bing it.” Remember: in the examples below omit the initial and trailing quotation marks.

Tip 1: to see what’s hot in feeds. Enter this string in front of your query: “feed:”. The resulting query looks like this: “feed:Cleveland +"business information"”.

Tip 2: this is a direct match for the “other guy’s” syntax. To locate only documents of a particular file type, such as Adobe PDF or PowerPoint, you use this string in front of your query: “filetype:”. The resulting query looks like this: “filetype:ppt +sharepoint”.

Tip 3: this is another direct match for the “other guy’s” syntax. To locate only documents within a particular Web site you use this string in front of your query: “site:”. The resulting query looks like this: “site:arnoldit.com +search”.

Tip 4: this tip is essential if you are looking for hits from a little known or unpopular site like the National Railway Retirement Board. To determine if a Web site is in the Bing index, enter this string in front of your query: “url:”. The resulting query looks like this: “url:marad.gov”

Tip 5: this tip is popular with the goslings who enjoy online music. To locate hits on sites that have links to specific filetypes, precede the query with the string “contains:” The resulting query looks like this: “contains:mp3”

Tip 6: Boolean operators are available. Use “+” for AND and “-” for NOT.

Remember to put bound phrases such as “White House” in quotes to minimize false drops.
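For goslings who would rather script their bings than type them, here is a minimal Python sketch that assembles query strings from the operators in the tips above. The function names `bing_query` and `bing_url` are my own inventions for illustration, and the sketch assumes the familiar `https://www.bing.com/search?q=` address format. It only builds strings; it does not call any Bing API.

```python
from urllib.parse import urlencode


def bing_query(terms=(), phrases=(), exclude=(), **operators):
    """Assemble a query string from the operators described in the tips.

    operators: prefix operators such as feed="Cleveland" or filetype="ppt".
    terms:     words joined with "+" (AND).
    phrases:   bound phrases, wrapped in straight quotes and joined with "+".
    exclude:   words prefixed with "-" (NOT).
    """
    parts = [f"{name}:{value}" for name, value in operators.items()]
    parts += [f"+{term}" for term in terms]
    parts += [f'+"{phrase}"' for phrase in phrases]
    parts += [f"-{term}" for term in exclude]
    return " ".join(parts)


def bing_url(query):
    """Percent-encode the query into a search URL (assumed address format)."""
    return "https://www.bing.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Tip 1: feed:Cleveland +"business information"
    print(bing_query(phrases=["business information"], feed="Cleveland"))
    # Tip 2: filetype:ppt +sharepoint
    print(bing_query(terms=["sharepoint"], filetype="ppt"))
    # Tip 5 plus a NOT term: contains:mp3 -ringtone
    print(bing_query(exclude=["ringtone"], contains="mp3"))
    print(bing_url(bing_query(terms=["sharepoint"], filetype="ppt")))
```

The helper puts the prefix operator first because that is how the tips write their examples. Whether Bing cares about the order is something the goslings have not tested.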

We think the system has some useful features. You can get other tips for running bings on the new Microsoft system at these locations:

  • Help for the system is in tiny gray type at the foot of the Bing splash page. You can go directly to the online information pages by clicking here.
  • MalekTips has some useful tips here.
  • Digital Inspiration has some interesting tips here. I quite liked tip 1 so the full Bing is available to users where access may be limited in some way.

Remember: don’t say the “other guy’s” name when you mean Web search. Say, “Bing it.” I am working on making this word an active part of my vocabulary.

Stephen Arnold, June 11, 2009

Microsoft Health: A New Thrust

June 4, 2009

Shift your attention from Bing.com to a sector that is a must-win for Microsoft. Ina Fried reported here that Microsoft acquired Rosetta Biosoftware from the struggling pharmaceutical company, Merck. Rosetta Biosoftware is a unit of Rosetta Inpharmatics. Based in Seattle, the 300 person firm had been hit with cutbacks due to the financial climate. The software unit, which had about 60 employees, was expected to keep its lights on. According to Ms. Fried’s “Microsoft Buys Merck Unit in Life Sciences Push” here,

Microsoft, which has a separate Amalga product family for hospitals, announced in April that it would offer Microsoft Amalga Life Sciences as an effort to help in the drug research software arena. The tools are designed to help manage and analyze the large amounts of data gathered in the process of designing new drugs.

What’s Rosetta Biosoftware’s business? According to a profile here, the company

develops informatics solutions and provides services that enable research organizations to efficiently and effectively conduct life-saving discoveries and develop drugs.

As for Microsoft’s Amalga, according to Microsoft here, the company

develops its own powerful health solutions, such as Amalga and HealthVault. Together, Microsoft and its industry partners are working to advance a vision of unifying health information and making it more readily available, ensuring the best quality of life and affordable care for everyone.

Looks to me as if the dust up between Microsoft and Google in the health sector is likely to become more intense.

Stephen Arnold, June 3, 2009

Container Vessel Vertical Search

May 3, 2009

Short honk: MarketWatch reported here that a free vertical search system for ocean going container vessels has been launched. The story “Linescape Launches a Free and Independent Ocean Container Schedule Search Engine” pointed to http://www.linescape.com. For me the most interesting comment in the description of this vertical search engine was:

Linescape has introduced several advanced features unique to its search engine, including an innovative “Route Planner” that presents users with a matrix of all the possible port combinations between two geographical areas and which carriers serve those routes. This very powerful and easy to use feature will simplify one of the most difficult tasks of a user — trying to choose the best possible origin and destination ports. A unique aspect of the website is the visibility of the number of transhipments in a journey, allowing a user to balance journey times with numbers of tranships to keep risks of delay to a minimum. Another advanced feature is the innovative “Multiple Lines” feature, whereby users are able to automatically build routes via transhipments, even between two unrelated shipping lines.

Linescape is a leading online provider of comprehensive ocean container shipping schedules for shippers and freight forwarders worldwide. Headquartered in Burlingame, California, Linescape was created in 2008 by a team of professionals who have spent many years in industries that require the shipping of goods to customers all over the world.

NetBase and Content Intelligence

April 30, 2009

Vertical search is alive and well. Technology Review described NetBase’s Content Intelligence here. The story, written by Erica Naone, was “A Smarter Search for What Ails You”. Ms. Naone wrote:

organizes searchable content by analyzing sentence structure in a novel way. The company created a demonstration of the platform that searches through health-related information. When a user enters the name of a disease, he or she is most interested in common causes, symptoms, and treatments, and in finding doctors who specialize in treating it, says Netbase CEO and cofounder Jonathan Spier. So the company’s new software doesn’t simply return a list of documents that reference the disease, as most search engines would. Instead, it presents the user with answers to common questions. For example, it shows a list of treatments and excerpts from documents that discuss those treatments. The Content Intelligence platform is not intended as a stand-alone search engine, Spier explains. Instead, Netbase hopes to sell it to companies that want to enhance the quality of their results.

NetBase (formerly Accelovation) has developed a natural language processing system. Ms. Naone reported:

NetBase’s software focuses on recognizing phrases that describe the connections between important words. For example, when the system looks for treatments, it might search for phrases such as “reduce the risk of” instead of the name of a particular drug. Tellefson notes that this isn’t a matter of simply listing instances of this phrase, rather catching phrases with an equivalent meaning. Netbase’s system uses these phrases to understand the relationship between parts of the sentence.
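To make the connecting-phrase idea concrete, here is a toy Python sketch. It is emphatically not NetBase's method, which the write up does not disclose. It assumes a short, hand-listed set of phrases (the `CONNECTING_PHRASES` list and the `extract_relations` function are hypothetical names), whereas the article says the real system catches phrases with an equivalent meaning without such a list.

```python
import re

# Hypothetical, hand-listed connecting phrases. The article says the real
# system recognizes equivalent phrasings; this toy only knows these strings.
CONNECTING_PHRASES = [
    "reduce the risk of",
    "reduces the risk of",
    "lower the risk of",
    "lowers the risk of",
]

# Modal verbs trimmed from the end of the left-hand side ("X may reduce ...").
TRAILING_MODAL = re.compile(r"\s+(may|can|could|might)$", re.IGNORECASE)


def extract_relations(text):
    """Return (left side, connecting phrase, right side) triples.

    Each sentence is scanned for a known phrase; the words before it are
    treated as the candidate treatment and the words after it as the
    condition. This is plain string matching, not real sentence parsing.
    """
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        for phrase in CONNECTING_PHRASES:
            position = lowered.find(phrase)
            if position == -1:
                continue
            left = TRAILING_MODAL.sub("", sentence[:position].strip(" ,.;"))
            right = sentence[position + len(phrase):].strip(" ,.;!?")
            if left and right:
                relations.append((left, phrase, right))
    return relations


if __name__ == "__main__":
    sample = (
        "Regular exercise may reduce the risk of heart disease. "
        "The clinic opens at nine."
    )
    for triple in extract_relations(sample):
        print(triple)
```

The design point is that the search keys on the relationship phrase, not on the name of a drug. The hard part, recognizing phrasings that mean the same thing, is exactly what this sketch leaves out.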

At this point in the write up, I heard echoes of other vendors with NLP, semantics, bound phrase identification, etc. Elsevier has embraced the system for its illumin8 service. You can obtain more information about this Elsevier service here. Illumin8 asked me, “What if you could become an expert in any topic in a few minutes?” Wow!

The NetBase explanation of content intelligence is:

… understanding the actual “meaning” of sentences independent of custom lexicons. It is designed to handle myriads of syntactical sentence structures – even ungrammatical ones – and convert them to logical form. Content Intelligence creates structured semantic indexes from massive volumes of content (billions of web-pages and documents) used to power question-and-answer type of search experiences.

NetBase asserts:

Because NetBase doesn’t rely on custom taxonomies, manual annotations or coding, the solutions are fully automated, massively scalable and able to be rolled-out in weeks with a minimal amount of effort. NetBase’s semantic index is easy to keep up-to-date since no human editing or updates to controlled vocabulary are needed to capture and index new information – even when it includes new technical terms.

Let me offer several observations:

  • The application of NLP to content is not new and it imposes some computational burdens on the search system. To minimize those loads, NLP is often constrained to content that contains a restricted terminology; for example, medicine, engineering, etc. Even with a narrow focus, NLP remains interesting.
  • “Loose” NLP can squirm around some of the brute force challenges, but it is not yet clear if NLP methods are ready for center stage. Sophisticated content processing often works best out of sight, delivering to the user delightful, useful ways to obtain needed information.
  • A number of NLP systems are available today; for example, Hakia. Microsoft snapped up PowerSet. One can argue that some of the Inxight technology acquired first by Business Objects then by the software giant SAP is an NLP system. To my knowledge, none of these has scored a hat trick in revenue, customer uptake, and high volume content processing.

You can get more information about NetBase here. You can find demonstrations and screenshots. A good place to start is here. According to TechCrunch:

NetBase has been around for a while. Originally called Accelovation, it has raised $9 million in two rounds of venture funding over the past four years, has 30 employees…

In my files, I had noted that the funding sources included Altos Ventures and ThomVest, but these data may be stale or just plain wrong. I don’t have enough information about Netbase to offer substantive comments. NLP requires significant computing horsepower. I need to know more about the plumbing. Technology Review provided the sizzle. Now we need to know about the cow from which the prime rib comes.

Stephen Arnold, April 30, 2009

Autonomy Thrives in Lousy Economic Climate

April 29, 2009

I am at Day Two of the Boston Search Engine Meeting. At the break, I talked with a small group and the subject was the impact of the financial climate on the enterprise search vendors. I heard the names of two vendors who in the opinion of a couple of people with whom I spoke are gasping for nutrients in the form of dollars and euros. I don’t feel comfortable mentioning the name of one semantic-centric vendor and one non-US vendor who were the subject of speculation. In my opinion, there are probably a half dozen or more of the companies that I track in a resource pickle.

One notable exception is the UK based vendor Autonomy. I did not see a representative of Autonomy at this conference, but I have been too busy to conduct an inventory of the attendees. Autonomy reported a week or two ago that it was likely to have a solid financial performance. I did a quick check, and it is evident, if I understand Autonomy’s data, that the lousy climate is not inhibiting Autonomy’s growth.

You can read Kathy Sandler’s take here. She reported on April 23, 2009, that Autonomy plans to upgrade its 2009 earnings projections. I am not a financial whiz, but the information in Ms. Sandler’s write up looks good across the board – revenue, earnings, and cost management.

My high school history teacher was fond of repeating the alleged anecdote about the drunk General US Grant and President Lincoln’s alleged comment: “Find out what he’s drinking and send a case to my other generals.”

Is it time for other enterprise search companies to take a hard look at what fuels Autonomy’s crops? Say what you will about the company’s acquisition strategy, the firm seems to be harvesting.

Are Autonomy’s competitors too arrogant to look at Mr. Lynch and determine what he does to harvest cash as others shrivel?

Stephen Arnold, April 29, 2009

Demographics and Their Search Implications: Breathing Room for Online Dinosaurs

April 25, 2009

ReadWriteWeb.com’s “The Technology Generation Gap at Work is Oh So Wide” pointed to a study that I had heard about but not seen. A happy quack to RW2 for the link to the LexisNexis results here. RW2 does a good job of summarizing the highlights of the research, conducted for this unit of Reed Elsevier, the Anglo Dutch giant that provides access to the US legal content in its for fee service. You can read Sarah Perez’s summary here.

I wanted to add three observations that diverge from the RW2 report and are indirectly referenced in the WorldOne Research 47 page distillation of the survey data and accompanying analysis. Keep in mind that the research is now about nine months old and aimed at a sample of those involved in the world’s most honorable profession, lawyering.

First, the demographics are bad news for the for fee vendors of online information. As each cohort makes its way from the Wii to the iPhone, the monetization methods, the expectations of the users, and the content forms themselves must be set up to morph without paying humans to fiddle.

Second, as I zoomed through the data, I came away convinced that lawyers’ perception of technology and mine are different. As a result, I think the level of sophistication in this sample is low compared to that of the goslings swimming in my pond filled with mine run off water. The notion that lawyers who are younger are more technologically adept may be little more than an awareness of the iPhone, not next generation text and content processing systems.

Third, the overall direction of the survey and the results themselves make it clear that it will be a while before the traditional legal information sources are replaced by a gussied up Google Uncle Sam, but it will happen.

My conclusion is that LexisNexis got the reassurance it wanted from these data. Is that confidence warranted as law firms furlough or rationalize staff, face clients who put caps on certain expenses, and look at the lower cost legal services available in the land of outsourcing, India?

Stephen Arnold, April 25, 2009

Google Local Push in Australia

April 24, 2009

I don’t care too much for print directories. The Google has a formidable directory initiative. I found the story in The Standard here interesting. In “Google to Boost Local Businesses with AdWords Offer”, Kathryn Edwards wrote:

Google Australia Wednesday announced it will offer a free A$75 (US$53) search marketing campaign to help more than one million Australian small and medium businesses. According to the search engine giant, more Australians than ever before are researching products and services online, before venturing into a shop, with Monash University research showing that this trend makes up 50 per cent of Australian shoppers.

Google is offering a helping hand to get businesses to shift into a higher gear for online marketing. Will this type of booster program find its way elsewhere? If the Australian program takes off, the Google may become the de facto online information source for small and mid sized businesses. Bad news for print directory businesses.

Stephen Arnold, April 23, 2009

eBay: Another Yahoo

April 22, 2009

I am tired of the Microsoft Yahoo grade school love affair. The Google “dossier” function is old news because the GOOG disclosed similar functions in a patent application containing the Michael Jackson example two or three years ago. (If you are a reader of this Web log, you know that Googler Cyrus [last name withheld by me as a courtesy] told anyone who would listen that I made up the example.) I am not interested in how much Yahoo’s revenues have fallen. Not much of a surprise because the Yahooligans are drifting despite senior Yahoo technologists’ insisting that I am wrong and the Yahooligans are right. They aren’t. Now the revenue problem puts the company into flashing yellow light mode.

No Google, no Microsoft, and no Yahoo.

Let’s think about this recent Forbes.com article “eBay: Back to Basics” by Taylor Buley here. The write up does a credible job of summarizing some of eBay’s challenges. I want to be more specific. eBay is a bit like the snail darter. The creature is like Kentucky’s own Townsend’s Big-eared Bat. Endangered.

Here’s my take on the eBay problem which includes some points either deleted, ignored, or known to be too addled for the buttoned up Forbes crowd and their pop up ads:

  1. Search. The eBay search system does not work for me. Let’s say I want to buy a refurbished Hewlett Packard NC4010 laptop. Try the query. What do you get? The same lousy results I do. Anything with the string. Try to limit to the actual computer itself. Not possible. This problem is sufficiently annoying to make me endure the wackiness of Google Shopping or the Amazon outfit that routinely changes my one click settings. Anything but eBay. The search system is awful in my opinion. eBay tried Thunderstone. eBay rolled its own. I am not sure what can be done to fix the core finding engine. My hunch is that there is neither the money, the desire, nor the expertise to pull this baby out of the fire.
  2. Fraud. I learned a long time ago that eBay works overtime to keep the magnitude of the fraud issue under wraps. I have been snookered a number of times. There was the famous mobile phone play by a person with the alias of Wiguna. I took matters into my own hands, located the Wiguna person, and in person relayed my experience to an individual familiar with the outfit where Wiguna did some work. With this indirect method, I got my money back. I don’t think I am alone. eBay is trying to “do” security but not spending enough management time and attention to understand the magnitude of the problem and then not having enough resources to take action. Too little and too late, much too late.
  3. Vendors. The actions eBay has taken have angered some folks who sell products on eBay. Maybe this is a replay of the tragedy of the commons. But up the road from Harrod’s Creek, a small eBay service company has shut its doors. The company could not cope with the combined effects of the hassle, the fees, and the fraud. A couple more Kentucky folks have to shoot squirrels to eat I suppose. The cause was eBay, not these honest people who sold my used computer gear for me.

The most recent actions underscore the cluelessness of the company. Skype and StumbleUpon are in play. Skype was a poorly taken decision. StumbleUpon was simply fumbled by inept running backs. The management teams have not been able to take purposeful action, underscoring the fact that MBAs and folks with great personalities cannot run companies anchored in technology that cannot be fossilized. Change has marginalized eBay and now the company is in the same canoe as Yahoo.

Forbes is skeptical. I’m not. I think eBay’s radar is not working. Like a blinded big eared bat, eBay is headed for an inevitable collision.

Stephen Arnold, April 22, 2009

Google and Guha: The Semantic Steamroller

April 17, 2009

I hear quite a lot about semantic search. I try to provide some color on selected players. By now, you know that I recycle in this Web log, and this article is no exception. The difference is that few people pay much attention to patent documents. In general, these are less popular than a printed dead tree daily paper, but in my opinion quite a bit more exciting. But that’s what makes me an addled goose, and you a reader of free Web log posts.

You will want to snag a copy of US20090100036 from our ever efficient USPTO. Please, read the instructions for running a query on the USPTO system. I don’t provide free support for public facing, easy to use, elegant interfaces such as that available from the Federal government.

The “eyes” of Googzilla. From US20090100036, Figure 21, Cyrus, in case you want to see what your employer is doing these days.

The title of the document is “Methods and Systems for Classifying Search Results to Determine Page Elements” by a gaggle of Googlers, one of whom is Ramanathan Guha. If you read my Google Version 2.0 or the semantic white paper I wrote for Bear Stearns when it was respected and in business, you know that Dr. Guha is a bit of a superstar in my corner of the world. The founder of Epinions.com and a blue chip wizard with credentials (Semantic Web RDF, Babelfish, Open Directory, etc.) that will take away the puffery of newly minted search consultants, Dr. Guha invented, wrote up, and filed five major inventions. These five set forth the Programmable Search Engine. You will have to chase down one of my for fee writings to get more detail about how the PSE meshes with Google’s data management inventions. If you are IBM or Microsoft, you will remind me that patents are products and that Google is not doing anything particularly new. I love those old eight track tapes, don’t you?

The new invention is the work of Tania Bedrax-Weiss, Patrick Riley, Corin Anderson, and Ramanathan Guha. Dr. Guha’s first name is spelled “Ramanthan” in the patent snippet I have. Fish & Richardson, Google’s go-to search patent attorney, may have submitted it correctly in October 2007 but it emerged from the USPTO on April 16, 2009, with the spelling error.

The application is a 33 page long document, which is beefy by Google’s standard. Google dearly loves brevity so the invention is pushing into Gone with the Wind length for the GOOG. The Fish & Richardson synopsis said:

This invention relates to determining page elements to display in response to a search. A method embodiment of this invention determines a page element based on a search result. The method includes: (1) determining a set of result classifications based on the search result, wherein each result classification includes a result category and a result score; and (2) determining the page element based on the set of result classifications. In this way, a classification is determined based on a search result and page elements are generated based on the classification. By using the search result, as opposed to just the query, page elements are generated that corresponds to a predominant interpretation of the user’s query within the search results. As result, the page elements may, in most cases, accurately reflect the user’s intent.

Got that? If you did not, you are not alone. The invention makes sense in the context of a number of other Google technical initiatives ranging from the non hierarchical clustering methods to the data management innovations you can spot if you poke around Google Base. I noted classification refinement, snippets, and “signal” weighting. If you are in the health biz, you might want to check out the labels in the figures in the patent application. If you were at my lecture for Houston Wellness, I described some of Google’s health related activities.
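For those who did not get it, here is how I read the two steps in the synopsis, sketched in Python. Everything in the sketch is my own assumption for illustration: the names (`ResultClassification`, `classify_results`, `choose_page_element`), the scoring by normalized weight, the threshold, and the category-to-element table. The patent application's actual scoring is not described in this post, so treat this as a cartoon, not Google's method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ResultClassification:
    """One result classification: a category plus a score, per the synopsis."""
    category: str
    score: float


# Hypothetical mapping from a dominant category to a page element.
PAGE_ELEMENTS = {
    "health": "symptoms-and-treatments panel",
    "shopping": "price comparison panel",
}


def classify_results(results):
    """Step 1: derive result classifications from the search results.

    results is a list of (category, weight) pairs, one per hit. Weights are
    summed per category and normalized so the scores add up to 1.
    """
    totals = defaultdict(float)
    for category, weight in results:
        totals[category] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    classifications = [
        ResultClassification(category, total / grand_total)
        for category, total in totals.items()
    ]
    return sorted(classifications, key=lambda rc: rc.score, reverse=True)


def choose_page_element(classifications, threshold=0.5):
    """Step 2: pick a page element from the predominant classification.

    Returns None when no category dominates or no element is mapped to it.
    """
    if not classifications or classifications[0].score < threshold:
        return None
    return PAGE_ELEMENTS.get(classifications[0].category)


if __name__ == "__main__":
    hits = [("health", 2.0), ("health", 1.0), ("shopping", 1.0)]
    ranked = classify_results(hits)
    print(ranked)
    print(choose_page_element(ranked))
```

The point to notice is that the input is the set of results, not the query text. That is the distinction the synopsis draws when it talks about the predominant interpretation within the search results.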

On the surface, you may think, “Page parsing. No big deal.” You are not exactly right. Page parsing at Google scale, the method, and the scores complement Google’s “dossier” function about which Sue Feldman and I wrote in our September 2008 IDC client only report. This is IDC paper 213562.

What does a medical information publisher need with those human editors anyway?

Stephen Arnold, April 17, 2009

Newspapers Are Goners: Lawyers Will Not Save the Day

April 7, 2009

The patricians at the big name media companies are busy dissing the GOOG. I don’t think Googzilla has much to do with the sorry state of newspaper publishing. Check out Eric Savitz’ “One Classified Ad Web Site to Rule Them All” here. Note: Barron’s is a dead tree outfit owned by News Corp., an enterprise that finds little room in its heart for Google love. Mr. Savitz reported that Craigslist.org is a de facto classified ad monopoly. As he stated the matter:

According to new data from Hitwise, traffic to online classified advertising sites increased 84% in February from a year ago. The sector has seen positive growth in all but one month over the last three years. And while hardly the only player in the game, the single biggest beneficiary of the trend is Craigslist. According to Hitwise, of the top 100 classified ad Web sites, all but 3 were localized versions of Craigslist.

Mr. Savitz’ employer may want to put the evil on the owners of Craigslist.org. The GOOG is innocent when it comes to sucking classified ad revenue sweets from the dying trees supporting the newspaper industry. Keep in mind that if the dead tree patricians have children under the age of 24, the progeny are users of Craigslist.org. How else does one find an apartment in Alphabet City? Newspaper killers are using the vertical search provided by Craigslist.org. Even the Googlers use Craigslist.org from what I hear.

Stephen Arnold, April 7, 2009
