The Beeb and Alpha

April 30, 2009

I am delighted that the BBC, the once non-commercial entity, has a new horse to ride. I must admit that when I think of the UK and a horse to ride, my mind echoes with the sound of Ms. Sperling saying, “Into the valley of death rode the 600”. The story (article) here carries a title worthy of the Google-phobic Guardian newspaper: “Web Tool As Important as Google.” The subject is the Wolfram Alpha information system which is “the brainchild of British-born physicist Stephen Wolfram”.

Wolfram Alpha is a new content processing and information system that uses a “computational knowledge engine”. There are quite a few new search and information processing systems. In fact, I mentioned two of these in recent Web log posts: NetBase here and Veratect here.

Can Wolfram Alpha or another search start up Taser the Google?

My reading of the BBC story includes a hint that Wolfram Alpha may have a bit of “fluff” sticking to its ones and zeros. Nevertheless, I sensed a bit of glee that Google is likely to face a challenge from a math-centric system.

Now let’s step back:

First, I have no doubt that the Wolfram Alpha system will deliver useful results. Not only does Dr. Wolfram have impeccable credentials, he is letting math do the heavy lifting. The problem with most NLP and semantic systems is that humans are usually needed to figure out certain things regarding “meaning” of and in information. Like Google, Dr. Wolfram lets the software machines grind away.

Second, in order to pull off an upset of Google, Wolfram Alpha will need some ramp up momentum. Think of the search system as a big airplane. The commercial version of the big airplane has to be built, made reliable, and then supported. Once that’s done, the beast has to taxi down a big runway, build up speed, and then get aloft. Once aloft, the airplane must operate and then get back to ground for fuel, upgrades, etc. The Wolfram Alpha system is in its early stages.

Third, Google poses a practical problem to Wolfram Alpha and to Microsoft, Yahoo, and the others in the public search space. Google keeps doing new things. In fact, Google doesn’t have to do big things. Incremental changes are fine. Cumulatively these increase Google’s lead or its “magnetism”, if you will. So competitors are going to have to find a way to leapfrog Google. I don’t think any of the present systems has the legs for this jump, including Wolfram Alpha, because it is not yet a commercial grade offering. When it is, I will reassess my present view. What competitors are doing is repositioning themselves away from Google. Instead of getting sand kicked in one’s face on the beach, the competitors are swimming in the pool at the country club. Specialization makes it easier to avoid Googzilla’s hot breath.

To wrap up, I hope Wolfram Alpha goes commercial quickly. I want to have access to its functions and features. Before that happens, I think that the Beeb and other publishing outfits will be rooting for the next big thing in the hopes that one of these wizards can Taser the Google. For now, the Tasers are running on a partial charge. The GOOG does not feel them.

Stephen Arnold, May 1, 2009

USA.gov Gets Social

April 30, 2009

What a stunning announcement. Navigate to AllFacebook.com here and read the story “Facebook Signs Agreement with GSA”. At first glance, I thought “GSA” meant the Google Search Appliance. Ho hum. I have heard that the GOOG will be interested in the contract now held by other vendors when recompete time rolls around. Old news. But when I scanned the AllFacebook.com item I learned something quite remarkable. The US government has inked a deal with Facebook.com. The party to the deal is the US General Services Administration, one of the US government’s purchasing and administrative arms. These are big arms, too. Think World Wrestling Federation. The site with the Facebook.com deal is http://www.usa.gov (formerly FirstGov.gov).

Facebook.com is one of the social networking sites that boasts a pretty good retention rate. I have heard that about 65 to 70 percent of sign ups use the service. The Twitter critter retains only about 40 percent. Check my figures because I am operating on conference baloney today. Your taste in stats sandwiches may vary.

The story, written by Nick O’Neill, features a logo of USA.gov that says, “Government made easy.” Okay, how does Facebook.com fit in? The story quotes administration officials who said:

“USA.gov is breaking new ground by migrating to new media sites to provide a presence and to open up a dialog with the public. We know that many other agencies want to do the same, and having these agreements is an important first step.” Under the new administration and the leadership of a new CTO and CIO, government agencies will get access to many of the publicly available technologies that would have previously been impossible to include within projects. I know it’s silly but advertising a government job on Facebook would have taken so many hurdles previously that in the end it would not be worth it.

I don’t want to speculate on how the USA.gov site will leverage the Facebook.com service. I must go on record as honking, “The GSA is showing some teen spirit.” I do have some questions flapping around. I will capture one before it wings away: “What security provisions will be put in place to deal with the issues related to personal or sensitive information?”

Facebook.com is reasonably secure unless a careless person becomes careless with friend lists, user name and password, and what’s posted. A happy honk to Facebook.com for the deal. The security folks at the GSA will be popular in the near future I wager.

Stephen Arnold, April 30, 2009

SAP Makes Tiny TREX in the Enterprise

April 30, 2009

I pay attention to SAP for three reasons. First, it is a compass company that helps me get a sense of what’s ahead for large, old-fashioned software and services companies. As more vendors push on premises software plus for fee services, the SAP “example” is quite important in my opinion.

Second, SAP made an investment in its own search system TREX and then last year permitted its venture arm to pump money into Endeca, a company that was founded in 1999. I think there was some Inktomi and Yahoo DNA in the company at its inception. The company, therefore, can provide a baked in search solution or point to a third party such as Endeca to crack the findability problems that the SAP file systems pose to some SAP users.

Third, Microsoft wanted to buy SAP. I learned in Europe last year that more than half of SAP’s installations are Windows centric. This means that SAP drinks and must survive on the Microsoft Kool-Aid. With search an interesting initiative at Microsoft, the dependence on Microsoft suggests to me that Fast ESP could be provided as an add on or a component of one or more Microsoft servers. In a nonce (a word I quite like), the TREX and third party solutions have a new challenge with which to deal.

I read with interest the story on CNBC called “SAP Profit Misses Expectations, Keeps Outlook”. You can read the item here. The story mentioned that operating profit slipped eight percent, but SAP did return a profit. The CNBC item then stated:

Analysts have said cost management would be important this year as investors are keen to see whether SAP can defend its operating margin and market share against Oracle. SAP said it expects 2009 operating environment to remain challenging and that it will continue with cost-cutting measures, adding that measures initiated in October had “really taken hold”, without elaborating. SAP implemented measures to save costs in October after sales dropped sharply and said it would continue to slash costs amid a reduction of its workforce to 48,500 from 51,800 at the time. Walldorf-based SAP has given no target for its key software and software-related sales this year.

The questions that winged through my mind were:

  1. If Microsoft bundles Fast ESP with servers destined for SAP customers, what happens to TREX? Will it find itself in the same position as the Oracle SES10g system that is desperately seeking connectors and customers?
  2. Will SAP put more money into Endeca or step back? Either choice will have some implications for Endeca. Uncertainty in uncertain times is like an unknown exponent in an equation in my opinion.
  3. What about third party vendors who assert support for SAP systems? I don’t want to mention any of these folks, but I know that SAP connectivity and search can be fraught with challenges. Heck, integration can be downright exciting. If SAP goes Microsoft for search, this action may further churn the enterprise search waters.

My opinion is that SAP will have to focus on revenues and margins. If that means sacrificing employees and partners, my hunch is that the SAP ship will toss some excess baggage overboard. I don’t recall many life rafts dotting the sides of the building in lovely Walldorf.

Stephen Arnold, April 30, 2009

NetBase and Content Intelligence

April 30, 2009

Vertical search is alive and well. Technology Review described NetBase’s Content Intelligence here. The story, written by Erica Naone, was “A Smarter Search for What Ails You”. Ms. Naone wrote:

organizes searchable content by analyzing sentence structure in a novel way. The company created a demonstration of the platform that searches through health-related information. When a user enters the name of a disease, he or she is most interested in common causes, symptoms, and treatments, and in finding doctors who specialize in treating it, says Netbase CEO and cofounder Jonathan Spier. So the company’s new software doesn’t simply return a list of documents that reference the disease, as most search engines would. Instead, it presents the user with answers to common questions. For example, it shows a list of treatments and excerpts from documents that discuss those treatments. The Content Intelligence platform is not intended as a stand-alone search engine, Spier explains. Instead, Netbase hopes to sell it to companies that want to enhance the quality of their results.

NetBase (formerly Accelovation) has developed a natural language processing system. Ms. Naone reported:

NetBase’s software focuses on recognizing phrases that describe the connections between important words. For example, when the system looks for treatments, it might search for phrases such as “reduce the risk of” instead of the name of a particular drug. Tellefson notes that this isn’t a matter of simply listing instances of this phrase, rather catching phrases with an equivalent meaning. Netbase’s system uses these phrases to understand the relationship between parts of the sentence.
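To make the idea concrete for myself, here is a minimal sketch of phrase-based relation spotting. This is my own toy illustration, not NetBase's method. The phrase list, the "treats" label, and the function name `find_relations` are all assumptions I made up, and a real system would need far more than a handful of regular expressions to catch "phrases with an equivalent meaning".

```python
import re

# Hypothetical relation phrases. Only "reduce the risk of" comes from the
# Technology Review story; the other variants are invented for illustration.
RELATION_PHRASES = {
    "treats": [
        r"reduces? the risk of",
        r"lowers? the (?:risk|chance) of",
        r"helps? (?:to )?prevent",
    ],
}


def find_relations(sentence):
    """Return (subject, relation, object) tuples found in one sentence.

    The text before the matched phrase is treated as the subject and the
    text after it as the object. This is a crude stand-in for real parsing.
    """
    results = []
    for label, patterns in RELATION_PHRASES.items():
        for pattern in patterns:
            match = re.search(pattern, sentence, flags=re.IGNORECASE)
            if match:
                subject = sentence[:match.start()].strip(" .,")
                obj = sentence[match.end():].strip(" .,")
                results.append((subject, label, obj))
                break  # one hit per relation label is enough for this sketch
    return results


if __name__ == "__main__":
    print(find_relations("Aspirin reduces the risk of heart attack."))
    print(find_relations("Regular exercise lowers the chance of stroke."))
```

The point of the sketch is that the query keys on the connecting phrase, not on the name of a drug. Two differently worded sentences land under the same relation label, which is roughly what the quoted passage describes.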

At this point in the write up, I heard echoes of other vendors with NLP, semantics, bound phrase identification, etc. Elsevier has embraced the system for its illumin8 service. You can obtain more information about this Elsevier service here. Illumin8 asked me, “What if you could become an expert in any topic in a few minutes?” Wow!

The NetBase explanation of content intelligence is:

… understanding the actual “meaning” of sentences independent of custom lexicons. It is designed to handle myriads of syntactical sentence structures – even ungrammatical ones – and convert them to logical form. Content Intelligence creates structured semantic indexes from massive volumes of content (billions of web-pages and documents) used to power question-and-answer type of search experiences.

NetBase asserts:

Because NetBase doesn’t rely on custom taxonomies, manual annotations or coding, the solutions are fully automated, massively scalable and able to be rolled-out in weeks with a minimal amount of effort. NetBase’s semantic index is easy to keep up-to-date since no human editing or updates to controlled vocabulary are needed to capture and index new information – even when it includes new technical terms.

Let me offer several observations:

  • The application of NLP to content is not new and it imposes some computational burdens on the search system. To minimize those loads, NLP is often constrained to content that contains a restricted terminology; for example, medicine, engineering, etc. Even with a narrow focus, NLP remains interesting.
  • “Loose” NLP can squirm around some of the brute force challenges, but it is not yet clear if NLP methods are ready for center stage. Sophisticated content processing often works best out of sight, delivering to the user delightful, useful ways to obtain needed information.
  • A number of NLP systems are available today; for example, Hakia. Microsoft snapped up PowerSet. One can argue that some of the Inxight technology, acquired first by Business Objects and then by the software giant SAP, is an NLP system. To my knowledge, none of these has scored a hat trick in revenue, customer uptake, and high volume content processing.

You can get more information about NetBase here. You can find demonstrations and screenshots. A good place to start is here. According to TechCrunch:

NetBase has been around for a while. Originally called Accelovation, it has raised $9 million in two rounds of venture funding over the past four years, has 30 employees…

In my files, I had noted that the funding sources included Altos Ventures and ThomVest, but these data may be stale or just plain wrong. I don’t have enough information about NetBase to offer substantive comments. NLP requires significant computing horsepower. I need to know more about the plumbing. Technology Review provided the sizzle. Now we need to know about the cow from which the prime rib comes.

Stephen Arnold, April 30, 2009

Veratect: Trend Prediction

April 30, 2009

A happy quack to the reader who sent me a link to the New Zealand Herald’s story “Tech Start-Up Picked Up Swine Flu Trends ‘Weeks Ago’” here. The story carried a Seattle dateline, so this may be old news. The story reported that:

Veratect, a two-year-old company with less than 50 employees, combines computer algorithms with human analysts to monitor online and off-line sources for hints of disease outbreaks and civil unrest worldwide.  It tracks thousands of “events” each month – an odd case of respiratory illness, or a run on over-the-counter medicines, for example – then ranks them for severity and posts them on a subscription-only web portal for clients who want early warnings.

Veratect extracts information from online sources and online data flows. If you are interested in the company, you can obtain more information here.

Stephen Arnold, April 30, 2009

SharePoint Joel Is Revved Up

April 30, 2009

My newsreader delivered “Five No Brainer Reasons to Install SP2 (When you are ready!!!)” The three exclamation points were in the original post. The author of this article was Joel Oleson. It is clear that he put considerable work into the lists that make up this Web log post. You can read his five reasons to install Service Pack 2 for SharePoint here. If you skip over the five reasons and navigate to the list of links, you will find a number of useful pointers. The cumulative effect of these lists is both positive and negative. The positive side is that the lists of SharePoint resources will save hours if not days of hunting and pecking for information. The negative side is that the lists are a mute reminder of the crazy engineering that created the need for Mr. Oleson’s post in the first place.

Stephen Arnold, April 30, 2009

Consultant Discerns Subtle Trend: Shocks Business Community

April 30, 2009

I found “A Value Focus for Enterprise IT” here a darned remarkable write up. Andrew Parker reported:

Recently, Forrester Research has been picking up signals of a shift in mindset among IT executives. For weeks after the credit crunch hit, most IT leaders welcomed “survive the recession” advice. But, after months of firefighting, downturn fatigue has set in. No one thinks our troubles are over, but apparently many IT leaders feel they have taken – or at least defined – the steps needed to support business stability in the short term. Now they want to refocus on more thoughtful topics.

The economic news has been reasonably easy to figure out. I am not able to summarize the entire write up. But I can highlight one of the consulting firm’s findings about the impact of the economy on information technology:

IT leaders must drive value individually. The CIO and direct reports must engage with business leaders to identify and execute on areas of value delivery. Inside the IT group this means leading by example – for instance, by devoting time to business issues and prominently featuring business impact in internal communications. Outside IT, building personal networks across the wider business, and marketing IT activities to business leaders.

I was thinking that IT budgets are going to be under scrutiny, managers who muff their numbers will be out of work, and procurements will be under great pressure to deploy systems that *actually work*. I did not know that the economy was bad. Thank you, consultants and journalists, for this insight.

Stephen Arnold, April 30, 2009

Bandwidth Cost

April 29, 2009

A happy quack to the reader who wrote, asking me to comment on the cost of bandwidth. His point of reference was the New York Times’s article “In Developing Countries, Web Grows Without Profit” here.

“I believe in free, open communications,” Dmitry Shapiro, the company’s chief executive, said. “But these people are so hungry for this content. They sit and they watch and watch and watch. The problem is they are eating up bandwidth, and it’s very difficult to derive revenue from it.”

My views on this issue are well documented in my books and studies. Let me recap three ideas and invite feedback on these.

First, most users and content centric outfits make errors when estimating the costs of online access. Unexpected spikes in telco fees are even today in my experience greeted with surprise and indignation. I hesitate to suggest that bandwidth is assumed to be cheap, readily available, and without much technical interest. As the New York Times’s article points out, bandwidth is an issue, and it can be a deal breaker financially and technically.

Second, in theory bandwidth is unlimited. The “unlimited” comes with two trap doors. One is the money available to apply to the problem. Bandwidth, even today, is not free. Someone has to build the plumbing, pay for infrastructure, hire the technical staff, and work the back office procedures. The second trap door is time. It is possible in Kentucky to make a call and get more bandwidth. But within the last two months, we found that making this call did not result in immediate bandwidth. The vendor said, “We can reprovision you within 72 hours. Take it or leave it.” The vendor made the statement, I learned, because of a tightening financial noose around the vendor’s neck. The vendor in turn told me to wait.

Third, user expectations are now being shaped in a direction that makes bandwidth, infrastructure, and technical resources increasingly fragile. Here’s an example. Last night in a restaurant, a young man at a table next to mine watched a YouTube.com video on a mobile device. That young man in Boston and young people throughout the world see the Internet (wireless or wireline) as a broadcast channel. In my experience, this shift to rich media will put financial and technical pressure on infrastructure needed for this use of the Internet.

In short, I think there’s a cost problem looming. Will it arrive before the technical problem? Pick your poison.

Stephen Arnold, April 29, 2009

Google Pirate Bay Parallel

April 29, 2009

Silicon.com here published “Does Pirate Bay verdict spell trouble for Google?” Simon Levine, the author, reviews the basics of the Pirate Bay file sharing matter. One of the comments that caught my attention was:

One interesting feature of the case is that the founders of the site were found guilty of helping to make copyright-protected content available, a secondary act of infringement which is different to actual copyright infringement in the traditional or primary sense. Similar to Napster, The Pirate Bay site did not host audio and video files, but instead included links to material hosted elsewhere on the internet. In other words, the site was found to be guilty of facilitating illegal file-sharing.

In my opinion, the Pirate Bay matter is likely to cast a shadow across other indexing services. Whether the shadow presages dawn or sunset is unknown to me. I don’t think the matter will go away quietly into that good night.

Stephen Arnold, April 29, 2009

Google Education

April 29, 2009

In my Google: The Digital Gutenberg here, I comment about Google’s increasing utility in education. The Google Channel, available on YouTube.com, can serve as a basic course in computer science or it can be a resource for a PhD student. The Google services, when carefully filtered by a professor, can provide a full educational resource. With educational publishers under increasing financial pressure, an online alternative will emerge. Will Google emerge as the dominant player? Who knows?

If you are interested in this aspect of Google, you will find Google Apps Education: The Rise of Cloud Computing on Campus here a useful Educom session to attend. The article approaches the topic from the Google Apps angle. For me the most interesting comment in the write up was the identification of Jeff Keltner as one of the Googlers playing a role in Google’s educational initiative. His title is Google Applications for Education.

Stephen Arnold, April 29, 2009
