Overflight Enhancements

December 9, 2008

ArnoldIT.com’s Google monitoring service made some changes over the last few days. You can access the service by clicking here. Overflight Google allows you to look at the most recent Web log posts on more than 70 Google Web logs. The change is the addition of a link that says, “Show Overflight Update Stream”. When you click it, we display the additions to Google Web logs and put the date on each item. The Update Stream function has been added for each of the Google Web log clusters. If you want to scan headlines, you can browse the most recent items for each of the Google Web logs.

The other enhancement is the addition of entity extraction to the Exalead search system’s index of the corpus of Google Web logs. I am not too happy with the phrase “vertical search”, but I must admit, the Exalead index of more than 70 Web logs is a sharply focused vertical search engine. Here’s a screen shot of the Exalead entity extraction. You can use it to learn the name of the Google customer at Genentech and to find similarly interesting facts about the GOOG.

entity extraction 1

A happy quack to the Exalead team. More enhancements are coming. If you would like an Overflight service on your Web site, write seaky2000 at Yahoo dot com.

Stephen Arnold, December 9, 2008

Google and Salesforce.com: The Plot Thickens

December 8, 2008

For years, I have heard that Google had an interest in Salesforce.com. In my for-fee briefings, I dig into the Salesforce.com technology for multi tenant applications. I am certainly no wizard in the magical world of patent documents, but I thought some of the Salesforce.com methods were somewhat elaborate. In those briefings, I commented that Google seemed to have another approach that exploited some of its more unusual inventions. One example is the elaborate system to determine the context of a user. I refer to these as the Guha patent documents. There are others, of course. My point is that Google seemed to be building functions into its broader data management and container operations. (Please, don’t write and ask me for these briefings. I don’t release that type of information into the wild nor in these largely recycled Web log musings.)

I read “Force.com + Google App Engine = Cloud Relationship Management” by Steve Gillmor here with the thought, “Yep, the GOOG is on the move.” Mr. Gillmor’s write up’s lead paragraph hit the nail on the head. He wrote on December 7, 2008, “Salesforce and Google have extended their strategic partnership with Force.com for the Google App Engine.” His article provides useful technical background and some observations about Google’s approach to an “operating system.” You will want to read this article and then save it to your GoogleOnTheMove folder.

My take on this expanded use of the Google App Engine reaches outside the boundaries of Mr. Gillmor’s story. My thoughts are:

  • Google gets Salesforce.com to hook into more Google technology without significant risk or cost. If Salesforce.com’s multi-tenant technology is suitably impressive, Google could increase its involvement with Salesforce.com. If the merged clouds don’t work too well, Google has learned possibly significant information about the Salesforce.com approach.
  • Google receives valuable information about such factors as the efficiency of the Salesforce.com system.
  • Google has a reasonably well-controlled lab test for hooking clouds together. The Salesforce.com cloud is more of a wrapper around the data stores at the core of Salesforce.com. Google is more of a next-generation cloud engineered to minimize certain types of bottlenecks associated with traditional database management systems.

Salesforce.com, on the other hand, has more marketing clout. I have heard that the Google relationship makes otherwise dry explanations of multi-tenant technology more interesting. Who knows? Sales presentations are like magic. What you see is often not what allows the magician to entertain and enthrall the audience.

The big loser in the deal is Microsoft. The Google and Salesforce.com relationship comes at a time when Microsoft is making a push for its Dynamics system. Customers will want to hear about the new Google-Salesforce.com deal. That can complicate some procurements and maybe derail some others.

But the best part is that Google still retains its freedom with regard to CRM. Google can still buy Salesforce.com or it can pass. Google can sign similar cloud federation deals with other vendors, or at some point, stitch together existing Google services to offer its own cloud-based CRM solution. To sum up, the Google is once again using its mass to distort the enterprise information market. Google’s “dark matter” lets it exert influence in ways that can be difficult to detect.

Stephen Arnold, December 8, 2008

Google: Putting Capex on a Diet

December 8, 2008

The point to keep in mind is that Google has been working for a decade to build out its infrastructure. One of the benefits of the company’s willingness to tackle hard engineering problems is that Google obtains a better return on its hardware dollar. Data included in The Google Legacy (2005) suggested that Google can spend a dollar and get as much as five times the performance that a non-Googlized data center would get. The data appeared in Google technical papers. Some of these papers were written by big Googlers; others by small Googlers. What the performance data share is information that provides a glimpse of the computing capability in Google’s data centers. If we flip the performance data around, a competitor would have to invest as much as five times what Google spends to get comparable performance. Is Google’s engineering that cost effective? Well, a five hundred percent performance boost may be optimistic, but when a data center can cost $600 million the implications are interesting. A competitor would have to spend more than Google to match Google’s performance on data manipulation, disk reads, and queries per second. Let’s assume that Google gets a 25 percent boost. For a competitor to match Google’s performance, the competitor would have to have the known bottlenecks under control and then spend another $150 million, which makes a $600 million data center hit the books at $750 million. If you pick a larger performance boost such as two hundred percent, the $600 million data center will require $1.2 billion in capex to match Google’s capacity. Of course, no one would believe that Google wrings such a performance advantage from its commodity hardware. Competitors prefer branded equipment. What’s in the back of my mind is that Google may be keeping its cards close to its chest.
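
For readers who want to run their own numbers, the capex comparison can be sketched in a few lines of Python. This is a back-of-the-envelope model, not Google’s accounting. The function name `competitor_capex`, the assumption that cost scales linearly with the performance ratio, and the sample ratios are all illustrative assumptions; only the $600 million data center figure comes from the discussion above.

```python
def competitor_capex(baseline_capex_musd: float, performance_ratio: float) -> float:
    """Estimate what a competitor must spend (in millions of US dollars)
    to match the performance an efficient operator gets from
    baseline_capex_musd.

    performance_ratio is the efficient operator's performance per dollar
    divided by the competitor's: 1.25 means a 25 percent advantage,
    2.0 means twice the performance, and so on.
    Assumption: cost scales linearly with the ratio.
    """
    if performance_ratio <= 0:
        raise ValueError("performance_ratio must be positive")
    return baseline_capex_musd * performance_ratio


if __name__ == "__main__":
    BASELINE = 600.0  # data center cost in millions of dollars

    # Hypothetical scenarios; the ratios are assumptions, not measured data.
    scenarios = [("25 percent advantage", 1.25),
                 ("two times", 2.0),
                 ("five times", 5.0)]

    for label, ratio in scenarios:
        total = competitor_capex(BASELINE, ratio)
        extra = total - BASELINE
        print(f"{label}: ${total:,.0f} million total, ${extra:,.0f} million extra")
```

The linear model is the simplest possible choice. A real comparison would also have to account for power, cooling, staffing, and the cost of getting the known bottlenecks under control in the first place.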

The Washington Post’s “Google Turns Down Some of NC’s Tax Incentives” explains that the economic downturn, among other factors, may be causing Google to trim its capital expenditures. The Washington Post here quotes a letter Google sent to North Carolina officials. For me the key phrase was:

While Google “remains pleased and committed to its Lenoir operations,” economic conditions make it too difficult to be sure the $600 million data center complex will expand as fast as previously thought, the letter said. “Yet the company fully expects to achieve employment and capital investment levels that are consistent with those that the state announced in 2007,” Charlotte attorney John N. Hunter wrote on behalf of Google.

The Google capex expenditures are going to become more important. The economic downturn is affecting most organizations, and I think the GOOG may be battening down its hatches. Good Morning Silicon Valley takes this position. You can read its take on the capex shift here.

What happens if Google does trim its capex for data centers? Maybe Microsoft’s new data centers will leap frog over Google? Google could find itself on the wrong side of high performance if Microsoft builds its own super performance innovations into its data centers. What the Washington Post makes clear is that Google is slowing down at least in North Carolina. The Google may be trying to trim costs by rethinking certain investments. This is another sign of Google’s increasing maturity and could indicate the opening that Microsoft needs to hobble the search Googzilla.

Stephen Arnold, December 6, 2008

Arnold on Disintermediation in New Italian Compendium

December 8, 2008

December 2008 is shaping up as a busy book month. I received on December 6, 2008, my copy of “Galassia Web: La Cultura nella Rete”, published by Civita Associazione with the support of Boeing. I contributed a chapter that begins on page 67 and ends on page 80. My contribution was “Giochi di Open Access e altre nuove tecnologie di communicazione: la tentazione disintermediazion”. If your Italian is a bit rusty, the approximate English translation is “The Interplay of Open Access and Other New Technologies.”

The main point of my contribution hinges on disintermediation. Institutions such as museums and libraries want to provide an online catalog and some type of access to the information under their stewardship. But large companies such as Google are slowly aggregating a broad range of content. For now, commercial enterprises have not shown a desire to create an aggregated service that includes indexes, images, music, and other information public institutions have created. The risk is that unless groups of institutions take the lead in aggregation, the commercial service may by default become the library or the museum for Internet users. In short, the disintermediation that ravaged commercial online services and corporate libraries may now have an impact on the information now in the control of universities, public agencies, privately-endowed institutions, and governmental entities. I don’t have a timeline, but I make the point that acting in a parochial way may waste time. Action can provide a countermeasure for the forces of disintermediation.

I want to send a happy quack to the publisher, Moira Macpherson, and the editorial team that made this collection of essays a reality. So, here comes, “Quack!”

Stephen Arnold, December 8, 2008

Yahoo Jumping Ahead of Google

December 7, 2008

On December 7, 2008, PCWorld reported that Yahoo will offer abstracts, not laundry lists of search results. The news story I saw appeared in the Yahoo technology news service. You can read “Yahoo Technology Will Offer Abstracts of Search Results” here. If the link goes dead, try the PCWorld site itself here. When I saw the story, the search engine on the PCWorld site couldn’t locate the story. Nothing new there, of course. The key point in the unsigned article was that Yahoo’s Bangalore research facility has figured out how to abstract key information on the page. The idea is that when a user searches for “hotel”, the system would provide an address, map, and other information. I described a similar function in my description of Google’s dossier function. See US20070198481. According to the news story, Yahoo will roll out this service in 2009. My thought is that these types of smart services work really well when described on paper. The difficulty with these “report” or “answer” type systems is that language can be tricky. Google’s approach relies on “context”, a system and method disclosed in the February 2007 patent documents filed by Google’s Ramanathan Guha. My hunch is that Yahoo went public because of the rumors that Google was starting to use some of its niftier technology in certain public facing services. The Googler with whom I had interaction in London knew zero about the dossier function. Maybe Yahoo is trying to jump ahead of Google. We’ll see. I think Yahoo needs to address the shortcomings of its core search service first.

Stephen Arnold, December 7, 2008

Social Software Failures

December 7, 2008

On the flight from London to lovely Kentucky, I reflected on the “big buzz” at the International Online Conference. Delegates seemed fascinated by “social” software companies, features, applications, and technology. From the keynote to the endnote, social was the cat’s pajamas.

You will want to read J.W. Crump’s “A Look at Failed Social Networks”. You can find the article at BivingsReport.com here. The write up presents three cases in sufficient detail to provide useful insights into the use of “crowdsourcing” to provide various features and benefits to users. His analysis of Wal*Mart’s The Hub reveals that the service did not allow its users sufficient freedom.

The second case was VitalSkate, a site for those who enjoy ice skating. The lesson extracted from this social software service was the operator did not understand the users of the site.

The third case was iYomu, which was a social software site for folks like me who are older. Among the reasons this site failed was its lack of purpose.

For me, the most interesting chunk of information was the inclusion of a timeline prepared by Danah Boyd and Nicole Ellison. Scanning the list is an easy way to identify the major players and the steady increase in these types of Web sites.

The thoughts that struck me as I reviewed Mr. Crump’s article were:

  1. Social networks seem to require purpose, entertainment, and a keen understanding of the needs of the site’s users.
  2. Social software appeals to those who are young in heart or who have a significant need and find that requirement satisfied by a service that is appropriate and fun to use.
  3. Social software and its attendant services are not a slam dunk. Despite the excitement, MySpace.com and Facebook.com seem to offer services with wide appeal to a youthful demographic.

When the social “big buzz” is transported to an organization, a number of questions may require some consideration before deploying one of the zippy new services I saw demonstrated at the International Online Show; for example:

  1. A small organization may not have the number of employees and authorized users to make a social site generate sufficient traffic and information to warrant keeping the service online.
  2. The cost of implementing and verifying a workable security system may be too onerous for most organizations. With a slap dash approach, the security and privacy methods may leave the organization exposed.
  3. Are employees in an organization willing to participate in social software services? If the financial pressures are increasing, employees may be unwilling to allocate the time necessary to participate in meaningful ways.

I understand the interest in social software and the functions it makes available. The question is, “Will these services offer up enough tangible benefits to make the investment worthwhile?”

For me, the answer to the question is, “It depends.” Some governmental agencies and not-for-profit outfits may find social software helpful. In regulated businesses, I am skeptical.

Stephen Arnold, December 7, 2008

Information 2009: Challenges and Trends

December 4, 2008

Before I was once again sent back to Kentucky by President Bush’s appointees, I recall sitting in a meeting when an administration official said, “We don’t know what we don’t know.” When we think about search, content processing, assisted navigation, and text mining, that catchphrase rings true.

Successes

But we are learning how to deliver some notable successes. Let me begin by highlighting several.

Paginas Amarillas is the leading online business directory in Colombia. The company has built a new system using technology from a search and content processing company called Intelligenx. Similar success stories can be identified for Autonomy, Coveo, Exalead, and ISYS Search Software. Exalead has deployed a successful logistics information system which has made customers’ and employees’ information lives easier. According to my sources, the company’s chief financial officer is pleased as well because certain time consuming tasks have been accelerated, which reduces operating costs. Autonomy has enjoyed similar success at the US Department of Energy.

Newcomers such as Attivio and Perfect Search also have satisfied customers. Open source companies can also point to notable successes; for example, Lemur Consulting’s use of Flax for a popular UK home furnishing Web site. In Web search, how many of you use Google? I can conclude that most of you are reasonably satisfied with ad-supported Web search.

Progress Evident

These companies underscore the progress that has been made in search and content processing. But there are some significant challenges. Let me mention several which trouble me.

These range from legal inquiries into financial improprieties at Fast Search & Transfer, now part of Microsoft, to open Web squabbles about the financial stability of a Danish company which owns Mondosoft, Ontolica, and Speed of Mind. Other companies have shut their doors; for example, Alexa Web search, Delphes, and Lycos Europe. One vendor in Los Angeles has had to slash its staff to three employees and take steps to sell the firm’s intellectual property, which rightly concerns some of the company’s clients.

User Concerns

Another warning may be found in the results from surveys such as the one I conducted for a US government agency in 2007 that found dissatisfaction with existing search systems in the 65 percent range. AIIM, a US trade group, reported slightly lower levels of dissatisfaction. Jane McConnell’s recently released study in Paris reports data in line with my findings. We need to be mindful that user expectations are changing in two different ways.

First, most people today know how to search with Google and get useful information most of the time. The fact that Google is search for upwards of 65 percent of North American users and almost 75 percent of European Union users means that Google is the search system by which users measure other types of information access. Google’s influence has been essentially unchecked by meaningful competition for 10 years. In my Web log, I have invested some time in describing Microsoft’s cloud computing initiatives from 1999 to the present day.

For me and maybe many of you, Google has become an environmental factor, and it is disrupting, possibly warping, many information spaces, including search, content processing, data management, applications like word processing, mapping, and others.

time-space-warping

Microsoft is working to counter Google, and its strategy is a combination of software and low adoption costs. I believe that Microsoft’s SharePoint has become the dominant content management, collaboration, and search platform with 100 million licenses in organizations. SharePoint, however, is not widely understood to be technically complex and a work in progress. Anyone who asserts that SharePoint is simple or easy is misrepresenting the system. Here’s a diagram from a Microsoft Certified Gold vendor in New Zealand. Simple this is not.

sharepoint-vendor-diagram

FastForward Search Blog on the Future of Blogs

December 3, 2008

I find sponsored Web logs fascinating. These quasi-promotional information services can be informative and quirky. Years ago, the Fast Search & Transfer company fired up the FastForward Web log to provide me and thousands of others with snippets of information about the Fast ESP user group conference. I asked about the user group focus a while back and learned that the Fast Forward Web log was reaching beyond that narrow focus.

That extension is quite evident. In fact, I recently read two posts here about the future of Web logs. One article was “The Uncertain Future of Blogging” by Jevon MacDonald; the other, “In 2010 What Will Replace Newspapers and Network TV?” I found the information in both interesting; in Mr. MacDonald’s piece, the data about media found their way into my statistics file. Then I began to reflect on a sponsored Web log’s role in the future of media. Here’s my chain of reasoning:

  1. A company Web log morphs into a community Web log and the company that started the Web log is acquired by Microsoft. I have little doubt that financial support for the Web log will be available no matter what happens in the wide world of blogging in the months ahead.
  2. The future of media appears to be pretty grim with big companies embracing Web logs. Furthermore, the tools of blogging will now become powerful instruments in the hands of trained media professionals. If print newspapers can’t fly, the pilots will get a new airplane. That airplane may be blogging.
  3. Web log writers today have to innovate and shake blogging out of its doldrums. Big changes are coming fast.

I have over simplified the arguments in these two posts, so you must read the original write ups. What troubles me is that I expect to read about search and content processing, not about the problems of newspapers and other media companies. I want to know about the method Microsoft Fast used to get a government installation in a Scandinavian company up and running to make its spotlight function work well. I want to know how Microsoft Fast will handle voice to text in media files. I want to know how Microsoft Fast will integrate with Dynamics’ information stores held in SQL Server tables. I want to know the status of the Microsoft Fast investigation underway in Norway and how to explain the issue to a contract officer who asks me for my view on the subject.

My opinion is that these search-centric topics are now out of bounds or out of information gas. I also think that the Web log is now a philosophical sounding board with a touch of consultant flummery added for color. To some, search is less exciting than thinking about the future of Web logs when more newspapers bite the dust. Not to me. I want to read about ESP.

I would be eager to read FastForward if it returned to its roots and presented more substantive information about Microsoft Fast search, content processing, and information technology. I may be too limited in my thinking, but a Web log anchored in Fast ESP should address topics germane to the software. But I’m an addled goose, easily confused by buzzwords like Enterprise 2.0 and analyses of the death of old media. What do you think? Should I reevaluate the FastForward blog?

Stephen Arnold, December 3, 2008

New Overflight Features

December 2, 2008

ArnoldIT.com has added two new features to its public Overflight service. Overflight provides a round up of news stories on more than 70 Google Web logs.

Exalead, developers of the CloudView information access system, indexes the Google Web logs. You can access the service from the Overflight splash page here or navigate to this vertical content collection here. Among the features of this index are:

  • A thumbnail view of the Web page in a relevance ranked results list
  • Full text searching of the entire corpus of Google’s Web logs. This means that you can identify a topic such as “semantics” and locate every reference to the subject. Given the recent suggestion that Google is not interested in semantics, you will find that when you run a query for “semantics”, Google’s Web logs make it clear that semantics are an important subject at Google.
  • You can access content with a date filter. With this feature you can segment a results list by time.
  • If a query matches documents in languages other than English, you can select the documents by language. The query “pdf” returns hits in English, Japanese, Portuguese, Spanish, and Korean.

A happy quack to the Exalead technical team. Use the comments section of this Web log to share your ideas with us.

The highest traffic on Overflight is flowing to the Google Ads Web logs here. We have added a continuously updated stream to this selection of Google Web logs. The “river” makes it easy to watch for important new developments in Google advertising services. You can try it here.

Watch for other developments on the public Overflight site. If your firm would like to create this type of information service for your Intranet, Web log, or competitive intelligence program, write us at seaky2000 at yahoo dot com.

Stephen Arnold, December 1, 2008

Social Media and a Search Postscript

December 2, 2008

A happy quack to my one Canadian reader who sent me a link to “Most Businesses Don’t Have a Handle on Social Media Marketing”. You can find this story by Nestor Arellano here. The article appeared in ITBusiness, a Canadian Web site. Mr. Arellano summarizes the results of a survey by the Marketing Executives Networking Group in the US. The study revealed that executives realize social media is important, but at the time of the survey about 60 percent of those surveyed in the sample allocate significant money to this type of marketing. For me the most interesting comment in the article was:

For instance, 33.1 per cent of them said they were never able to determine ROI, while an additional 25.6 per cent said they “hardly ever” knew if their social media efforts were worthwhile.  In addition, only 26.3 per cent thought social media marketing was more effective than using other online media tools for marketing, such as search advertising or display ads.

Mr. Arellano also notes that more than half of the IT staff and senior executives are not too keen about the use of social media for marketing. The concern is that employees will waste time fiddling with the applications. Social tools are likely to enter organizations the way personal computers did in the early 1980s; that is, some employees will just use these systems under the radar of management. One of the applications that may have applicability is a social wiki. Employees and possibly customers can contribute information.

After reading Mr. Arellano’s write up, I remain balanced on the fence with regard to social media in organizations. Employees are often spurred by their advisors and blue chip consultants to use social tools to improve operations, reduce costs, or deliver whatever other benefit is attributed to these systems. The issues that disconcert me are:

  1. Liability. Most consultants and employees are not the targets of legal action. As a result, their desires are not tempered by the liabilities certain information systems create the moment the systems become available. Consultants will weasel word their way around responsibility, and the employees often change roles or even jobs, leaving the “problem” to their successors.
  2. Finding what’s in these systems. Most employees and consultants deal with the surface or one aspect of a system. However, when someone asks, “Who gave the customer this information about this product?” someone has to hunt for the answer. My team has had this thankless task, and it is neither easy nor cheap to pull together the information from and within a social media system. We love this type of job, but it is expensive and time consuming. Furthermore, what’s “in” these systems can be quite interesting.
  3. Controlling costs. It is easy to fire up a wiki. It can be expensive to put in place a reliable mechanism to manage the wiki, keep the content fresh, and expand the system so it does not die on the vine. Like newsgroups, participation is often sporadic and brief. Social systems have to be marketed and managed. If ignored, these systems become a cost sinkhole when a problem surfaces. There is neither staff nor time to figure out what happened and then to fix the problem.

There are legitimate uses of messaging systems within organizations and for customers. The approach with which I am most comfortable is one that includes planning, budgeting, controlled testing, focused deployment, assessment, and then next steps, if any. Creating a customer-employee wiki and stepping back to see what happens is too risky for my taste. Companies have learned the importance of managing user groups. Now most organizations with active customer involvement try to schedule “summits” or some other type of organized activity over which the company exercises control. A user group can turn into a snake pit left to its own devices. An unmanaged wiki may have the same characteristics.

The problem of capturing the information, indexing it, and making it searchable is not intractable. Exalead has a social search capability, for example. But some effort is required before the system is deployed. Otherwise, the cost of playing catch up may erode the financial payoff of the system itself.

We have investigated a number of start ups in the last year. We’ve found that when the organization bakes in social media, the problems of shoe horning a social media system into an established organization are minimized. Consultants and some bureaucrats assume that the new functions can be layered on to the existing infrastructures and business methods. In our experience, the assumption may be incorrect. Furthermore, organizations engineered to use Internet centric applications as part of their basic business methods have a cost advantage. There is not much information about this aspect of social media available to us at this time. We also have a working hypothesis that start ups with social media baked in may have a competitive advantage over established operations that are taking the add on approach to social media.

Stephen Arnold, December 2, 2008
