Twitter Bashing

May 1, 2009

Short honk: If you hate Twitter, you will love this criticism of Twitter. It appeared on the MadAtoms.com Web log here. The article, “The Devolution of the Internet” by Farley Elliott, is entertaining and insightful. Among the weaknesses of Twitter, Mr. Elliott highlighted:

… perhaps the most disgusting part of Twitter is it’s most basic: it is a chatroom. A quick check of the calendar reveals that it’s not 1995. Yet twitter allows in the same riffraff that early chatrooms attracted, but without any of the moderation, or the ability to spend more than 140 characters wording up trolls and goons.

A keeper for sure.

Stephen Arnold, April 30, 2009

Microsoft Has a Top Search Term. Google.

May 1, 2009

The Guardian dropped its Google voodoo doll and pins and picked up a story about Microsoft’s Live.com and the service’s most popular search term. The story ran in the dead tree outfit’s Web log, called PDA The Digital Blog, which is quite trendy and quite a mouthful. The title of the story is “Most Search Term on Microsoft’s Live Search is … Google”. You can read it here. The story was somewhat hard to follow, with odd comments such as “More after the jump” inserted in the middle of paragraphs; it provides a smattering of statistics, and its reference to “a Live Search overhaul” later this spring also puzzled me. I found the write up interesting for two reasons:

First, many people use a default search engine as a portal. It is easier, I have been told, to type the name of the service into a search box than to key the full location into the browser’s address bar. With lots of copies of Internet Explorer in front of people, it makes sense that a widely used search service like Google would be one of the top terms in any browser.

Second, the data displayed in the write up show (if indeed they are accurate) that Microsoft alone is not a top destination in either Google’s or Yahoo’s top search listings. I would conclude that people use Microsoft to run their queries on other services. Not good news for Microsoft in my part of the goose pond.

Stephen Arnold, May 1, 2009

SearchMe Changes

May 1, 2009

SearchMe, http://www.searchme.com, promotes itself as “true, blended multimedia search.” You get video, images, music, Web pages, Twitter results, and more, organized by relevance. It’s a visual slideshow interface, so you see a miniature Web page instead of having to click through a link. Results returned for “Iron Chef Japan” varied, including a Flickr picture, a Yahoo! video, an About.com listing for Japanese food, and the Fine Living channel profile of the show. Results for “NASA shuttle launch” were less impressive, returning the NASA home page, a CBS news article, and a CNN news article, but no videos. I didn’t see any social media results in either search. The Web site functions like Viewzi, which I talked about here, but doesn’t have Viewzi’s various entertaining display options. SearchMe also has a best-selling iPhone app and is configured for several mobile platforms, which gives it a leg up on other visual search engines.

Jessica Bratcher, May 1, 2009

Web Site Search: More Confusion

May 1, 2009

Diane Sterling, e-Commerce Times, wrote a story that appeared in my newsreader as a MacNewsWorld.com story called “The Wide Open World of Web Site Search”. You can find the article here. The write up briefly profiles several search systems; namely:

  • SLI Systems here. I think of this company as providing a product that makes it easy to display items from a catalog, find indexed items, and buy a product. The company has added a number of features over the years to deliver facets, related searches, and suggestions. In my mind, the product shares some of the features of EasyAsk, Endeca, and Mercado (now owned by Omniture), among others.
  • PicoSearch here is a hosted service, and I think of it as a vendor offering indexing in a way that resembles Blossom.com’s service (used on this Beyond Search Web log) or the “old” hosted service provided by Fast Search & Transfer prior to its acquisition by Microsoft. Google offers this type of search as well. Google’s Site Search makes it easy to plop a Google search box on almost any site, but the system does not handle structured content in the manner of SLI Systems, for example.
  • LTU Technologies here. I first encountered LTU when it was demonstrating its image processing technology. The company has moved from its government and investigative focus to e-commerce. The company’s core competency, in my view, is image and video processing. The system can identify visual similarity. A customer looking at a red sweater can be shown other, visually similar products. No human has to figure out the visual similarity.

Now the article is fine, but I was baffled by the use of the phrase “Web site search”. The idea, I think, is to provide the user with a “finding experience” that goes beyond keyword searching. On that count, SLI and LTU are good examples for e-commerce (online shopping). PicoSearch is an outlier because it offers a hosted, text-centric search solution.

Another issue is that the largest provider of site search is our good pal Googzilla. Google does not rate a mention, and I think that is a mistake. Not only does Google make it possible to search structured data but the company offers its Site Search service. More information about Site Search is here.
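To show how lightweight this type of hosted, text-centric site search can be, here is a minimal sketch in Python. The domain and query are hypothetical placeholders; the point is that the site owner simply forwards the visitor’s query to Google with a site: restriction instead of indexing any content locally.

```python
# A minimal sketch of the idea behind hosted site search: hand the
# query to Google, restricted to one domain. The domain and query
# below are hypothetical placeholders, not a real deployment.
from urllib.parse import urlencode

def site_search_url(query: str, domain: str) -> str:
    """Build a Google query URL limited to a single Web site."""
    return "https://www.google.com/search?" + urlencode(
        {"q": f"site:{domain} {query}"}
    )

print(site_search_url("red sweater", "example.com"))
# https://www.google.com/search?q=site%3Aexample.com+red+sweater
```

Structured content is another matter; a catalog needs fields, facets, and prices, which is where systems like SLI Systems earn their keep.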

These types of round up articles, in my opinion, confuse those looking for search solutions. What’s the fix? First, I think the write up should have signaled the e-commerce focus in the title and included the words “e-commerce search” early in the text. Second, I think the companies profiled should have been ones that deliver e-commerce search functions. None of the profiled companies has a big footprint in the site search world that I track. This does not mean that the companies don’t have beefy revenue or satisfied customers. I think that the selection is off by 15 degrees and a bit of a fruit salad, not a plate of carrots.

Why do I care?

There is considerable confusion about search. There are significant differences between a search system for a text-centric site and a search system for a structured information site such as an e-commerce site. One could argue that Endeca is a leader in e-commerce. That’s fine, but most people don’t know this side of Endeca, and glossing over it adds to the muddle. The result, in my experience, is that the reader is confused. The procurement team is confused. And competitors are confused. Search is tough enough without having the worlds of image, text, and structured data scrambled unnecessarily.
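The difference is easy to see in code. Below is a minimal sketch in Python, using made-up catalog data, of the two models: a text-centric engine matches words in documents, while a structured e-commerce engine filters on fields and returns facet counts for navigation. Nothing here reflects any vendor’s actual implementation.

```python
# Toy contrast between text-centric and structured (e-commerce) search.
# The product data, fields, and prices are invented for illustration.
products = [
    {"title": "Red wool sweater", "price": 49.0, "brand": "Acme"},
    {"title": "Blue cotton sweater", "price": 29.0, "brand": "Basics"},
    {"title": "Red rain jacket", "price": 89.0, "brand": "Acme"},
]

def text_search(query):
    """Text-centric model: return items whose text contains the query."""
    return [p for p in products if query.lower() in p["title"].lower()]

def faceted_search(query, max_price):
    """Structured model: filter on a field, then count brand facets."""
    hits = [p for p in text_search(query) if p["price"] <= max_price]
    facets = {}
    for p in hits:
        facets[p["brand"]] = facets.get(p["brand"], 0) + 1
    return hits, facets

print(text_search("sweater"))       # two sweaters, any price
print(faceted_search("red", 60.0))  # one hit plus a brand facet count
```

A procurement team that needs the second function will not be happy with a product that delivers only the first, which is exactly the confusion these round ups create.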

Stephen Arnold, May 1, 2009

The Google Causes a Swallowing Problem

May 1, 2009

The Financial Times has picked up the Google bat and taken a whack at Googzilla. The article “Gagging on Google” appeared in the Financial Times here. The piece, posted on the Maverecon blog by a serious looking fellow named Willem Buiter, has a killer lead:

Google is to privacy and respect for intellectual property rights what the Taliban are to women’s rights and civil liberties: a daunting threat that must be fought relentlessly by all those who value privacy and the right to exercise, within the limits of the law, control over the uses made by others of their intellectual property.  The internet search engine company should be regulated rigorously, defanged and if necessary, broken up or put out of business.  It would not be missed.

I have been involved in online information for a long time. I have written numerous dull articles and monographs. Never did the notion of comparing an online vendor to the Taliban cross my mind. In a way, it is quite imaginative, and it provides a good example of how the dead tree crowd is responding to Google. Metaphors in a blog can be a powerful weapon. At least, I surmise that’s what the top dogs at the FT believe.

Mr. Buiter touches on the “m” word via indirection, takes on copyright head on, and casts Google’s Street View as “the universal voyeur.” I don’t have the energy to see if Mr. Buiter knows that Udi Manber’s street images for Amazon’s A9 pioneered this type of content enrichment. I suppose Mr. Buiter is happy with a dead tree telephone directory and no photo of the business or home he is trying to find in the rain in heavy traffic. Mr. Buiter has also discovered Google’s tracking cookies. I was disappointed that he cited another source instead of my 2005 study The Google Legacy for his explanation of cookies. The wrap up is a call to readers to accept this assertion:

Google company’s founding motto is: ‘Don’t be evil.’  But it does evil.  It has indeed, become the new evil empire of the internet.  It is time for people to take a stand, as individual consumers and internet users, and collectively through laws and regulations, to tame this new Leviathan.  When I get back from this trip, I will do my best to remove every trace of Google from my computers, even the tracking cookies (if I can!).

I think that the FT has trumped the vituperation directed at Google by the Guardian and the Telegraph. I am looking forward to what these dead tree outfits write. In my opinion, that Taliban comparison is going to be tough to beat.

Unfortunately for the dead tree crowd, the GOOG has been plodding along for a decade. Now the newspaper folks have discovered how online works. What revelations await me? What do I know? I just captured the information I unearthed about Google in publishing in my new monograph. I wonder if the FT will review it? Probably not. I focus on what Google will be doing in a year or two, not on what Google has been doing for a decade. Ah, for the days of yesteryear. When grapes were not sour. When newspapers were the information giants. When paper was cheap. When eight year olds would peddle them for a few pence. When ink was economical. When there was no other way to get information…

Stephen Arnold, May 1, 2009

The Beeb and Alpha

April 30, 2009

I am delighted that the BBC, the once non-commercial entity, has a new horse to ride. I must admit that when I think of the UK and a horse to ride, my mind echoes with the sound of Ms. Sperling saying, “Into the valley of death rode the 600”. The story (article) here carries a title worthy of the Google-phobic Guardian newspaper: “Web Tool As Important as Google.” The subject is the Wolfram Alpha information system, which is “the brainchild of British-born physicist Stephen Wolfram”.

Wolfram Alpha is a new content processing and information system that uses a “computational knowledge engine”. There are quite a few new search and information processing systems. In fact, I mentioned two of these in recent Web log posts: NetBase here and Veratect here.

Image: Can Wolfram Alpha or another search start up Taser the Google?

My reading of the BBC story turned up a hint that Wolfram Alpha may have a bit of “fluff” sticking to its ones and zeros. Nevertheless, I sensed a bit of glee that Google is likely to face a challenge from a math-centric system.

Now let’s step back:

First, I have no doubt that the Wolfram Alpha system will deliver useful results. Not only does Dr. Wolfram have impeccable credentials, he is letting math do the heavy lifting. The problem with most NLP and semantic systems is that humans are usually needed to figure out certain things regarding “meaning” of and in information. Like Google, Dr. Wolfram lets the software machines grind away.

Second, in order to pull off an upset of Google, Wolfram Alpha will need some ramp up momentum. Think of the search system as a big airplane. The commercial version of the big airplane has to be built, made reliable, and then supported. Once that’s done, the beast has to taxi down a big runway, build up speed, and then get aloft. Once aloft, the airplane must operate and then get back to ground for fuel, upgrades, etc. The Wolfram Alpha system is in its early stages.

Third, Google poses a practical problem for Wolfram Alpha and for Microsoft, Yahoo, and the others in the public search space. Google keeps doing new things. In fact, Google doesn’t have to do big things. Incremental changes are fine. Cumulatively these increase Google’s lead or its “magnetism”, if you will. So competitors are going to have to find a way to leapfrog Google. I don’t think any of the present systems has the legs for this jump, including Wolfram Alpha, because it is not yet a commercial grade offering. When it is, I will reassess my present view. What competitors are doing is repositioning themselves away from Google. Instead of getting sand kicked in one’s face on the beach, the competitors are swimming in the pool at the country club. Specialization makes it easier to avoid Googzilla’s hot breath.

To wrap up, I hope Wolfram Alpha goes commercial quickly. I want to have access to its functions and features. Before that happens, I think that the Beeb and other publishing outfits will be rooting for the next big thing in the hopes that one of these wizards can Taser the Google. For now, the Tasers are running on a partial charge. The GOOG does not feel them.

Stephen Arnold, May 1, 2009

USA.gov Gets Social

April 30, 2009

What a stunning announcement. Navigate to AllFacebook.com here and read the story “Facebook Signs Agreement with GSA”. At first glance, I thought “GSA” meant the Google Search Appliance. Ho hum. I have heard that the GOOG will be interested in the contract now held by other vendors when recompete time rolls around. Old news. But when I scanned the AllFacebook.com item I learned something quite remarkable. The US government has inked a deal with Facebook.com. The party to the deal is the US General Services Administration, one of the US government’s purchasing and administrative arms. These are big arms, too. Think World Wrestling Federation. The site with the Facebook.com deal is http://www.usa.gov (formerly FirstGov.gov).

Facebook.com is one of the social networking sites that boasts a pretty good retention rate. I have heard that about 65 to 70 percent of sign ups use the service. The Twitter critter retains only about 40 percent. Check my figures because I am operating on conference baloney today. Your taste in stats sandwiches may vary.

The story, written by Nick O’Neill, features a USA.gov logo that says, “Government made easy.” Okay, how does Facebook.com fit in? The story quotes administration officials who said:

“USA.gov is breaking new ground by migrating to new media sites to provide a presence and to open up a dialog with the public. We know that many other agencies want to do the same, and having these agreements is an important first step.” Under the new administration and the leadership of a new CTO and CIO, government agencies will get access to many of the publicly available technologies that would have previously been impossible to include within projects. I know it’s silly but advertising a government job on Facebook would have taken so many hurdles previously that in the end it would not be worth it.

I don’t want to speculate on how the USA.gov site will leverage the Facebook.com service. I must go on record as honking, “The GSA is showing some teen spirit.” I do have some questions flapping around. I will capture one before it wings away: “What security provisions will be put in place to deal with the issues related to personal or sensitive information?”

Facebook.com is reasonably secure unless a user becomes careless with friend lists, user name and password, and what’s posted. A happy honk to Facebook.com for the deal. The security folks at the GSA will be popular in the near future, I wager.

Stephen Arnold, April 30, 2009

NetBase and Content Intelligence

April 30, 2009

Vertical search is alive and well. Technology Review described NetBase’s Content Intelligence here. The story, written by Erica Naone, was “A Smarter Search for What Ails You”. Ms. Naone wrote:

… organizes searchable content by analyzing sentence structure in a novel way. The company created a demonstration of the platform that searches through health-related information. When a user enters the name of a disease, he or she is most interested in common causes, symptoms, and treatments, and in finding doctors who specialize in treating it, says Netbase CEO and cofounder Jonathan Spier. So the company’s new software doesn’t simply return a list of documents that reference the disease, as most search engines would. Instead, it presents the user with answers to common questions. For example, it shows a list of treatments and excerpts from documents that discuss those treatments. The Content Intelligence platform is not intended as a stand-alone search engine, Spier explains. Instead, Netbase hopes to sell it to companies that want to enhance the quality of their results.

NetBase (formerly Accelovation) has developed a natural language processing system. Ms. Naone reported:

NetBase’s software focuses on recognizing phrases that describe the connections between important words. For example, when the system looks for treatments, it might search for phrases such as “reduce the risk of” instead of the name of a particular drug. Tellefson notes that this isn’t a matter of simply listing instances of this phrase, rather catching phrases with an equivalent meaning. Netbase’s system uses these phrases to understand the relationship between parts of the sentence.
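The passage describes what sounds like pattern-driven relation extraction. Here is a minimal sketch in Python of that general technique; the connector phrases, example sentences, and the extract_treatments helper are my own invented illustrations, not NetBase’s actual method.

```python
import re

# Toy phrase-based relation extraction: instead of searching for a
# drug name, match connector phrases that signal a "treats" relation.
# Patterns and example sentences are invented for illustration.
PATTERN = re.compile(
    r"(?P<drug>\w[\w\s-]*?) "
    r"(?:reduces the risk of|lowers the risk of|is used to treat) "
    r"(?P<disease>\w[\w\s-]*)",
    re.IGNORECASE,
)

def extract_treatments(sentence):
    """Return (drug, disease) pairs signaled by a connector phrase."""
    return [
        (m.group("drug").strip(), m.group("disease").strip())
        for m in PATTERN.finditer(sentence)
    ]

print(extract_treatments("Aspirin reduces the risk of heart attack."))
# [('Aspirin', 'heart attack')]
print(extract_treatments("Metformin is used to treat type 2 diabetes."))
# [('Metformin', 'type 2 diabetes')]
```

A production system presumably generalizes far beyond literal patterns, catching phrases with equivalent meanings as the article says; the sketch only shows the shape of the idea.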

At this point in the write up, I heard echoes of other vendors with NLP, semantics, bound phrase identification, etc. Elsevier has embraced the system for its illumin8 service. You can obtain more information about this Elsevier service here. Illumin8 asked me, “What if you could become an expert in any topic in a few minutes?” Wow!

The NetBase explanation of content intelligence is:

… understanding the actual “meaning” of sentences independent of custom lexicons. It is designed to handle myriads of syntactical sentence structures – even ungrammatical ones – and convert them to logical form. Content Intelligence creates structured semantic indexes from massive volumes of content (billions of web-pages and documents) used to power question-and-answer type of search experiences.

NetBase asserts:

Because NetBase doesn’t rely on custom taxonomies, manual annotations or coding, the solutions are fully automated, massively scalable and able to be rolled-out in weeks with a minimal amount of effort. NetBase’s semantic index is easy to keep up-to-date since no human editing or updates to controlled vocabulary are needed to capture and index new information – even when it includes new technical terms.

Let me offer several observations:

  • The application of NLP to content is not new, and it imposes some computational burdens on the search system. To minimize those loads, NLP is often constrained to content that uses a restricted terminology; for example, medicine, engineering, etc. Even with a narrow focus, NLP remains interesting.
  • “Loose” NLP can squirm around some of the brute force challenges, but it is not yet clear whether NLP methods are ready for center stage. Sophisticated content processing often works best out of sight, delivering to the user delightful, useful ways to obtain needed information.
  • A number of NLP systems are available today; for example, Hakia. Microsoft snapped up Powerset. One can argue that some of the Inxight technology, acquired first by Business Objects and then by the software giant SAP, is NLP technology. To my knowledge, none of these has scored a hat trick in revenue, customer uptake, and high volume content processing.

You can get more information about NetBase here. You can find demonstrations and screenshots. A good place to start is here. According to TechCrunch:

NetBase has been around for a while. Originally called Accelovation, it has raised $9 million in two rounds of venture funding over the past four years, has 30 employees…

In my files, I had noted that the funding sources included Altos Ventures and ThomVest, but these data may be stale or just plain wrong. I don’t have enough information about NetBase to offer substantive comments. NLP requires significant computing horsepower. I need to know more about the plumbing. Technology Review provided the sizzle. Now we need to know about the cow from which the prime rib comes.

Stephen Arnold, April 30, 2009

Veratect: Trend Prediction

April 30, 2009

A happy quack to the reader who sent me a link to the New Zealand Herald’s story “Tech Start-Up Picked Up Swine Flu Trends ‘Weeks Ago’” here. The story carried a Seattle dateline, so this may be old news. The story reported that:

Veratect, a two-year-old company with less than 50 employees, combines computer algorithms with human analysts to monitor online and off-line sources for hints of disease outbreaks and civil unrest worldwide.  It tracks thousands of “events” each month – an odd case of respiratory illness, or a run on over-the-counter medicines, for example – then ranks them for severity and posts them on a subscription-only web portal for clients who want early warnings.
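The article offers no technical detail, but the workflow it describes (collect event signals, score them for severity, surface the worst first) can be pictured with a toy Python sketch. The weights and events below are invented for illustration only.

```python
# Toy severity ranking of disease "events". Weights and data invented.
WEIGHTS = {"respiratory illness": 3.0, "medicine run": 2.0,
           "school closure": 2.5}

events = [
    {"place": "Region A", "signals": ["respiratory illness", "medicine run"]},
    {"place": "Region B", "signals": ["medicine run"]},
    {"place": "Region C", "signals": ["respiratory illness", "school closure"]},
]

def severity(event):
    """Sum the weights of an event's signals; unknown signals count 1.0."""
    return sum(WEIGHTS.get(s, 1.0) for s in event["signals"])

for e in sorted(events, key=severity, reverse=True):
    print(f"{e['place']}: {severity(e):.1f}")
# Region C: 5.5 / Region A: 5.0 / Region B: 2.0
```

The hard part, of course, is the human analyst layer that decides which signals matter; the arithmetic is the easy bit.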

Veratect extracts information from online sources and data flows. If you are interested in the company, you can obtain more information here.

Stephen Arnold, April 30, 2009

Bandwidth Cost

April 29, 2009

A happy quack to the reader who wrote, asking me to comment on the cost of bandwidth. His point of reference was the New York Times’s article “In Developing Countries, Web Grows Without Profit” here. One passage from the article stuck with me:

“I believe in free, open communications,” Dmitry Shapiro, the company’s chief executive, said. “But these people are so hungry for this content. They sit and they watch and watch and watch. The problem is they are eating up bandwidth, and it’s very difficult to derive revenue from it.”

My views on this issue are well documented in my books and studies. Let me recap three ideas and invite feedback on these.

First, most users and content-centric outfits make errors when estimating the costs of online access. Even today, in my experience, unexpected spikes in telco fees are greeted with surprise and indignation. I hesitate to suggest it, but bandwidth is assumed to be cheap, readily available, and without much technical interest. As the New York Times’s article points out, bandwidth is an issue, and it can be a deal breaker financially and technically.

Second, in theory bandwidth is unlimited. The “unlimited” comes with two trap doors. One is the money available to apply to the problem. Bandwidth, even today, is not free. Someone has to build the plumbing, pay for infrastructure, hire the technical staff, and work the back office procedures. The second trap door is time. It is possible in Kentucky to make a call and get more bandwidth. But within the last two months, we found that making this call did not result in immediate bandwidth. The vendor said, “We can reprovision you within 72 hours. Take it or leave it.” The reason for the statement, I learned, was the financial noose tightening around the vendor’s neck. The vendor, in turn, told me to wait.

Third, user expectations are now being shaped in a direction that makes bandwidth, infrastructure, and technical resources increasingly fragile. Here’s an example. Last night in a restaurant, a young man at the table next to mine watched a YouTube.com video on a mobile device. That young man in Boston, like young people throughout the world, sees the Internet (wireless or wireline) as a broadcast channel. In my experience, this shift to rich media will put financial and technical pressure on the infrastructure needed for this use of the Internet.
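To put a rough number on that pressure, consider a back-of-envelope calculation. Every input below (bit rate, viewing time, audience size, transit price) is an assumption for illustration, not a figure from the article.

```python
# Back-of-envelope: bandwidth consumed by casual mobile video viewing.
# All inputs are illustrative assumptions, not real tariffs or data.
BITRATE_KBPS = 500        # assumed video stream bit rate
MINUTES_PER_DAY = 30      # assumed viewing per user per day
USERS = 100_000           # assumed audience size
PRICE_PER_GB = 0.10       # assumed transit cost, dollars per GB

bytes_per_user_day = BITRATE_KBPS * 1000 / 8 * MINUTES_PER_DAY * 60
gb_per_month = bytes_per_user_day * USERS * 30 / 1e9

print(f"{gb_per_month:,.0f} GB per month, ~${gb_per_month * PRICE_PER_GB:,.0f}")
# 337,500 GB per month, ~$33,750
```

Even with generous assumptions, the transfer bill grows linearly with the audience while the revenue, as the article suggests, often does not.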

In short, I think there’s a cost problem looming. Will it arrive before the technical problem? Pick your poison.

Stephen Arnold, April 29, 2009
