The Knol Way: A Google Wobbler on the Information Highway

September 1, 2008

Harry McCracken greeted Google on September 1, 2008, with a less-than-enthusiastic discussion of Knol, Google’s user-generated repository of knowledge. The story ran in Technologizer, a Web log I find useful. You can read the full text of the story here. The thesis of the write up, as I understand the argument, is that the service, while a good idea, lacks depth. The key point for me was this statement:

Knol’s content will surely grow exponentially in the months to come, but quantity is only one issue. Quality needs to get better, too–a Knol that’s filled with swill would be pretty dismaying, and the site in its current form shows that the emphasis on individual authors creates problems that Wikipedia doesn’t have. Basic functionality needs to get better, too: The Knol search engine in its current form seems to be broken, and I think it needs better features for separating wheat from chaff. And I’d give the Knol homepage a major overhaul that helps people find the best Knols rather than featuring some really bad ones.

I agree. One important point is that the Wikipedia method of allowing many authors to fiddle has its ups and downs. Knol must demonstrate that it is more than a good idea poorly executed; under its present set up, the service lacks the human editorial input that seems to be necessary.

I have a mental image of the Knol flying across the information super highway and getting hit by a speeding Wikipedia. Splat. Feathers but no Knol.

In closing, let me reiterate that I think Knol is not a Wikipedia. It is a source of input for Google’s analytical engines. The idea is that an author is identified with a topic. A “score” can be generated so that the GOOG has another metric to use when computing quality. My hunch is that the idea is to get primary content that is copyright free in the sense that Google doesn’t have to arm wrestle publishers who “own” content. The usefulness to the user is a factor, of course, but I keep thinking of Knol as useful to Google first, then me.
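To make the notion concrete, here is a minimal sketch of the kind of author-topic “score” I have in mind. To be clear, this is my speculation rendered as Python; the function, the signals, and the weights are my own inventions, not anything Google has disclosed.

    def author_quality_score(author_knols, reader_ratings, topic):
        """Score an author on a topic from authored pages and reader feedback."""
        on_topic = [k for k in author_knols if topic in k["topics"]]
        if not on_topic:
            return 0.0
        # Depth: what share of the author's output addresses the topic.
        depth = len(on_topic) / len(author_knols)
        # Reception: average reader rating of the on-topic pieces.
        ratings = [reader_ratings.get(k["id"], 0.0) for k in on_topic]
        reception = sum(ratings) / len(ratings)
        # Blend the two signals; the 0.4/0.6 weights are arbitrary placeholders.
        return 0.4 * depth + 0.6 * reception

    knols = [{"id": "k1", "topics": {"search", "google"}},
             {"id": "k2", "topics": {"search"}}]
    print(author_quality_score(knols, {"k1": 4.5, "k2": 3.0}, "search"))  # 2.65

A signal like this would give Google a per-author quality number to feed its ranking machinery, which is precisely why I see Knol as useful to Google first.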

Will Google straighten up and fly right the way the ArnoldIT.com logo does? Click here to see the logo in action. Very consistent duck, I’m sure. Will Knol be as consistent? I don’t know. Like the early Google News, the service is going to require programmatic and human resources, which may be a while in coming. For now, Google is watching clicks. When the Google has sufficient data, more direction will be evident. If there’s no traffic, this service will be an orphan. I hope Googzilla dips into its piggy bank to make Knol more useful and higher quality.

Stephen Arnold, September 1, 2008

IBM and Sluggish Visualizations: Many-Eyes Disappointment

September 1, 2008

IBM’s Boston research facility offers a Web site called Many Eyes. This is another tricky URL. Don’t forget the hyphen. Navigate to the service at http://www.many-eyes.com. My most recent visit to the site, on August 31, 2008, at 8 pm Eastern, timed out. The idea is that IBM has whizzy new visualization tools. You can explore these or, when the site works, upload your own data and “visualize” it. The site makes clear the best and the worst of visualization technology. The best, of course, is the snazzy graphics. Nothing catches the attention of a jaded Board of Directors’ compensation committee like visualizing the organization’s revenue. The worst is that visualization is still tricky, computationally intensive, and capable of producing indecipherable diagrams. A happy quack to the reader who called my attention to this site, which was apparently working at some point. IBM has a remarkable track record in making its sites unreliable and difficult to use. That’s a type of consistency, I suppose.

Stephen Arnold, September 1, 2008

Citation Metrics: Another Sign the US Is Lagging in Scholarship

August 31, 2008

Update: August 31, 2008. Mary Ellen Bates provides more color on the “basic cable” problem for professional information. Worth reading here. Econtent does an excellent job on these topics, by the way.

Original Post

A happy quack to the reader who called my attention to Information World Review’s “Numbers Game Hots Up.” This essay appeared in February 2008 and I overlooked it. For some reason, I am plagued by writers who use the word “hots” in their titles. I am certain Tracey Caldwell is a wonderful person and kind to animals. She does a reasonable job of identifying problems in citation analysis. Dr. Gene Garfield, the father of this technique, would be pleased to know that Ms. Caldwell finds his techniques interesting. The point of the long essay, which you can read here, is that some publishers’ flawed collections yield incorrect citation counts. For me, the most interesting point in the write up was this statement:

The increasing complexity of the metrics landscape should have at least one beneficial effect: making people think twice before bandying about misleading indicators. More importantly, it will hasten the development of better, more open metrics based on more criteria, with the ultimate effect of improving the rate of scientific advancement.

Unfortunately, traditional publishers are not likely to do much that is different from what the firms have been doing since commercial databases became available. The reason is money. Publishers long to make enough money from electronic services to enjoy the profit margins of the pre-digital era. But digital information has a different cost basis from the 19th-century publishing model. The result is reduced coverage and a reluctance to move too quickly to embrace content produced outside of that model.

Services that use other methods to determine link metrics exist in another world. If you analyze traditional commercial information, the Web dimension is either represented modestly or ignored. Ms. Caldwell’s analysis looks at the mountain tops, but it does not explore the valleys. In those crevices is another story; namely, researchers who rely on commercial databases are likely to find themselves lagging behind researchers in countries where commercial databases are simply too expensive for most to use. A researcher who relies on a US or European commercial database is likely to get only an incomplete picture.
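The undercounting problem is easy to demonstrate. Here is a toy illustration, in Python, of how a partial collection skews citation counts; the papers and figures are invented for the example.

    citations = {
        "paper_a": ["p1", "p2", "p3", "p4"],  # cited by four papers in reality
        "paper_b": ["p2", "p5"],              # cited by two
    }

    # A commercial database that indexes only part of the citing literature
    # (here it omits p3, p4, and p5 as "out of scope") reports lower counts.
    indexed = {"p1", "p2"}

    for paper, citers in citations.items():
        true_count = len(citers)
        counted = sum(1 for c in citers if c in indexed)
        print(f"{paper}: true citations {true_count}, counted {counted}")
    # paper_a: true citations 4, counted 2
    # paper_b: true citations 2, counted 1

Note that the errors are not uniform: which papers get shortchanged depends entirely on what the vendor chose to index.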

Stephen Arnold, August 31, 2008

Google Maps Attract Flak

August 31, 2008

Google inked a deal with GeoEye to deliver 0.5 meter resolution imagery. One useful write up appears in Softpedia here. The imagery is not yet available but will be when the GeoEye-1 satellite begins streaming data. The US government limits the resolution of commercial imagery. The Post Chronicle here makes this comment, illustrating the keen insight of traditional media:

Google did not have any direct or indirect financial interest in the satellite or in GeoEye, nor did it pay to have its logo emblazoned on the rocket. [emphasis added]

In my opinion, Google will fiddle the resolution to comply. Because GeoEye-1 was financed in part by a US government agency, my hunch is that Google will continue to provide geographic services to the Federal government and its commercial and Web users. The US government may get the higher resolution imagery. The degraded resolution will be for the hoi polloi.

Almost coincident with news of this lash up, Microsoft’s UK MSN ran “UK Map Boss Says Google Wrecking Our Heritage.” You can read this story here. The lead paragraph to this story sums up the MSN view:

A very British row appears to be brewing after the president of the British Cartographic Society took aim at the likes of Google Maps and accused online mapping services of ignoring valuable cultural heritage. Mary Spence attacked Google, Multimap and others for not including landmarks like stately homes and churches.

The new GeoEye imagery will include “valuable cultural heritage” as well as cows in the commons and hovels in Hertfordshire.

Based on my limited knowledge of British security activities, I would wager a curry that Google’s GeoEye maps will be of some use to various police and intelligence groups working for Queen and country. Microsoft imagery, in comparison, will be a bit lower resolution, I surmise. MSN UK will keep me up to date on this issue, I hope.

Stephen Arnold, August 31, 2008

No Google Killer Yet

August 31, 2008

I think it is still August 30, 2008, here in the hollow. My newsreader delivered to me a September 1, 2008, article. DMNews is getting a jump on publishing, presumably so the staff can make it to the holiday picnic. The authors are a team: Ellen Keohane and Mary Elizabeth Hurn. You can read the article here.

The main point of the article is that Google is the leader in search. There were two interesting points for me.

First, the authors identified a search engine of which I knew not: UBExact. The URL is http://www.ubexact.com. I ran one test query. I selected the geographic search for Louisville and entered the term “blacktop resurfacing”. The system generated zero results. I will check it out in a couple of months.

Second, the duo made a comment I found intriguing:

And, as with Wikia Search, Mahalo, OrganizedWisdom.com and Scour.com, UBExact also uses humans to improve the search experience. Human editors are contracted to eliminate spam, malicious content, unwanted ads and dead links and pages, Stephenson said. In addition to vetting content, the contractors also organize Web sites based on content so users can search on UBExact by category.

Humans are expensive, and it will be interesting to see if privacy concerns and click fraud impair Google. Oracle SES10g pitched security; customers did not value it, and I am not sure customers will value UBExact’s human-curated hooks either. Agree? Disagree? Let me know.

Stephen Arnold, September 1, 2008

Why Dataspaces Matter

August 30, 2008

My posts have been whipping super-wizards into action. I don’t want to disappoint anyone over the long American “end of summer” holiday. Let’s consider a problem in information retrieval and then answer, in a very brief way, why dataspaces matter. No, “dataspaces” is not a typographical error.

Set Up

A dataspace is somewhat different from a database. A dataspace can contain databases, but it can also encompass other information objects, garden-variety metadata, and new types of metadata which I like to call meta metadata. All of these are represented in an index. For our purpose, we don’t have to worry about the type of index. We’re going to look up something in any of the indexes that represent our dataspace. You can learn more about dataspaces in IDC report #213562, published on August 28, 2008. It’s a for-fee write up, and I don’t have a copy. I just contribute; I don’t own these analyses published by blue chip firms.
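To make the set up concrete, here is a minimal Python sketch of the unified-index idea: one lookup structure spanning database rows, documents, and email. This is my simplification for illustration, not anyone’s actual implementation; the object types and identifiers are invented.

    from collections import defaultdict

    index = defaultdict(list)  # term -> list of (object_type, object_id)

    def register(obj_type, obj_id, terms):
        """Add any information object, not just database rows, to one index."""
        for term in terms:
            index[term.lower()].append((obj_type, obj_id))

    register("db_row",   "customers/1742", ["Madhavan", "invoice"])
    register("document", "reports/q3.pdf", ["Google", "dataspaces"])
    register("email",    "msg-9981",       ["dataspaces", "Madhavan"])

    # One lookup spans every object type -- the point of the exercise.
    print(index["dataspaces"])  # [('document', 'reports/q3.pdf'), ('email', 'msg-9981')]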

Now let’s consider an interesting problem. We want to index people, figure out what those people know about, and then generate results to a query such as “Who’s an expert on Google?” If you run this query on Google, you get a list of hits like this.

[Screenshot: Google results for the expert query]

This is not what I want. I require a list of people who are experts on Google. Does Live.com deliver this type of output? Here’s the same query on the Microsoft system:

[Screenshot: Live.com results for the same query]

Same problem.

Now let’s try the query on Cluuz.com, a system that I have written about a couple of times. I run the query “Jayant Madhavan” and get this:

[Screenshot: Cluuz.com results for the query “Jayant Madhavan”]

I don’t have an expert result list, but I have a wizard and direct links to people Dr. Madhavan knows. I can make the assumption that some of these people will be experts.

If I work in a company, the firm may have the Tacit system. This commercial vendor makes it possible to search for a person with expertise. I can get some of this functionality in the baked-in search system provided with SharePoint. The Microsoft method relies on the number of documents a person known to the system writes on a topic, which is crude but better than nothing; a sketch of the approach appears below. I could, if I were working in a certain US government agency, use the MITRE system that delivers a list of experts. The MITRE system is not one whose screen shots I can show, but if you have a friend in a certain government agency, maybe you can take a peek.
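Here is the document-count method reduced to a minimal Python sketch. This is my reconstruction of the general idea, not Microsoft’s actual code, and the documents and names below are toy data.

    from collections import Counter

    documents = [
        {"author": "J. Madhavan", "topics": {"google", "deep web"}},
        {"author": "J. Madhavan", "topics": {"google", "dataspaces"}},
        {"author": "S. Arnold",   "topics": {"google"}},
    ]

    def experts_for(topic):
        """Rank people by how many known documents they wrote on the topic."""
        counts = Counter(d["author"] for d in documents if topic in d["topics"])
        return counts.most_common()

    print(experts_for("google"))  # [('J. Madhavan', 2), ('S. Arnold', 1)]

The weakness is obvious: writing many documents on a topic is not the same as being an expert on it.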

None of these systems really do what I want.

Enter Dataspaces

The idea for a dataspace is to process the available information. Some folks call this transformation, and it really helps to have systems and methods to transform, normalize, parse, tag, and crunch the source information. It also helps to monitor the message traffic for some of that meta metadata goodness. An example of meta metadata is an email. I want to index who received the email; who forwarded it, to whom, and when; and any cutting or copying of its contents into other documents, along with the people who have access to those documents. You get the idea. Meta metadata is where the rubber meets the road in determining what’s important regarding information in a dataspace.
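The email example, rendered as a minimal Python sketch: each action on a message becomes an indexable provenance record. The event names and field layout are my own inventions for illustration.

    import datetime

    meta_metadata = []

    def record_event(event, message_id, actor, **details):
        """Log who did what to an information object, and when."""
        meta_metadata.append({
            "event": event,  # e.g., received, forwarded, copied_into
            "message_id": message_id,
            "actor": actor,
            "when": datetime.datetime.utcnow().isoformat(),
            **details,
        })

    record_event("received", "msg-9981", "alice@example.com")
    record_event("forwarded", "msg-9981", "alice@example.com", to="bob@example.com")
    record_event("copied_into", "msg-9981", "bob@example.com", target="reports/q3.doc")

    # "Who has touched the content of msg-9981?" becomes a filter over the log.
    print([e["actor"] for e in meta_metadata if e["message_id"] == "msg-9981"])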


Growth of Electronic Information

August 29, 2008

Larry Borsato, writing for the Industry Standard, presents some interesting information about the growth of electronic information. You can read his article “Information Overload on the Web, and Searching for the Right Sifting Tool” here. The most startling item was this statement:

IBM predicts that in the next couple of years, information will double every 11 hours [PDF].

Mr. Borsato runs down the problems encountered when looking for information using various search services. He’s right. Search is a problem. But that doubling of information every 11 hours underscores the opportunity that exists for a person or company with an information access solution.
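Taking IBM’s figure at face value, a quick back-of-the-envelope calculation in Python shows how aggressive the claim is:

    hours_per_doubling = 11  # IBM's claimed doubling period

    doublings_per_day = 24 / hours_per_doubling           # about 2.2 doublings a day
    growth_per_day = 2 ** doublings_per_day               # about 4.5x per day
    growth_per_week = 2 ** (24 * 7 / hours_per_doubling)  # roughly 40,000x per week

    print(f"{growth_per_day:.1f}x per day, {growth_per_week:,.0f}x per week")

Whether or not the prediction holds, even a fraction of that growth rate swamps today’s sifting tools.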

Stephen Arnold, August 29, 2008

Google: Dashboard or Buzz Word

August 29, 2008

ZDNet’s “Google Apps Dashboard: Serious about the Enterprise?” does a good job of explaining that Google continues to push into the corporate market. The article, written by Michael Krigsman, summarizes a software component that gives Google Apps Premier licensees a way to check on the status of the services. For me, the most interesting point Mr. Krigsman made was:

Although Google may offer this service level to large accounts such as Cap Gemini, I doubt smaller customers will receive any personalized attention whatsoever. After all, Google isn’t known for providing stellar customer service; actually, the company’s customer care record sucks widgets. Only time will tell whether Google can successfully transition from its mass market consumer mentality to becoming a trusted, service oriented enterprise vendor.

I too have heard that Google does not return telephone calls, misses meetings, and ignores teleconference start times. But I have also heard that Google commissioned an expert to analyze the weaknesses of its sales approach and listened as the consultant explained that Google had to change its ways.

Google is a decade old, and it must give up some of its math club ethos, not just create software and spout buzz words. Will the company make the shift? I think we must wait and see.

Stephen Arnold, August 29, 2008

Computerworld: Google’s Not Hot

August 26, 2008

The Computerworld story surprised me. Preston Gralla, a really big name in tech journalism, wrote an opinion piece called “Why Google Has Lost Its Mojo — And Why You Should Care”. You can read the full text of this important essay here. The most important point in Mr. Gralla’s write up is the title. It says it clearly: Google has no spice, zing, magic, or voodoo. In Mr. Gralla’s view, Google’s medicine men have lost “it”.

Consider this statement:

So why do I think it’s lost its mojo? Let’s start with the way it treats its employees. Google’s largesse has been legendary — free food, liberal maternity and parental leave, on-site massages, fitness classes and even oil changes. But according to a recent New York Times article, those days may be gone.

Once employees sense a downshift, human resources professionals have to scramble.

I posted an innocuous story about the Amtrak passenger service selecting Autonomy. One of the outfits fighting for this project was Google, and Google lost this high profile account. Google has other challenges as well, including legal hassles, some big and some small, and these take time to address. Google’s technology is showing some flaws. Ads still work, but other functions are buggy. Google has started an investment branch; its foundation is pushing “green” technology. Former employees are no longer content to surf on Google; some, like the team behind Cuil.com, are competing. The fact that those Xooglers rolled out a tasty confection before it was complete does little to polish the reputation of Google and its Xooglers. For me, the fact that Computerworld is souring on Google is news. An amazing turn of events for Googzilla.

Stephen Arnold, August 26, 2008

How Yahoo Will Catch Google in Search

August 25, 2008

Here’s an interview you must read. On August 25, 2008, the Financial Express (India) here published an interview with Yahoo’s super wizard, Prabhakar Raghavan. Dr. Raghavan is the head of research at Yahoo, a Stanford professor, and a highly regarded expert in search, database, and associated technologies. He’s even the editor of computer science and mathematics journals. A fellow like this can leap over Google’s headquarters and poke out Googzilla’s right eye. The interview, conducted by Pragati Verma, provides a remarkable look inside the plans Yahoo has to regain control of Web search.

There were a number of interesting factoids that caught my attention in this interview. Let me highlight a few.

First, Yahoo insists that the cost of launching Web search is $300 million. Dr. Raghavan, who is an expert in things mathematical, said:

Becoming a serious search player requires a massive capital investment of about $300 million. We are trying to remove all barriers to entry for software developers, who have ideas about how to improve search.

The idea is to make it easy for a start up to tap into the Yahoo Web index and create new services. The question nagging at me is, “If Web search costs $300 million, why hasn’t Yahoo made more progress?” I use Yahoo once in a while, but I find that its results are not useful to me. When I search Yahoo Stores, I have a heck of a time finding what I need. What’s Yahoo been doing since 1998? Answer: losing market share to Google and spending a heck of a lot more than a paltry $300 million losing ground.
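The “tap into the index” model looks something like the following Python sketch. To be clear, the endpoint, parameters, and response shape here are placeholders I invented to illustrate the general pattern; this is not the actual BOSS API.

    import json
    import urllib.parse
    import urllib.request

    def web_search(query, app_id, count=10):
        """Fetch raw results from a (hypothetical) hosted Web search index."""
        params = urllib.parse.urlencode({"appid": app_id, "q": query, "count": count})
        url = f"https://search.example.com/v1/web?{params}"
        with urllib.request.urlopen(url) as response:
            return json.load(response)

    # A start up's "new service" is whatever re-ranking, filtering, or mash-up
    # it layers on top of results like these:
    # results = web_search("enterprise search", app_id="YOUR_APP_ID")

The $300 million buys the crawling, indexing, and serving plumbing behind that one URL; the start up supplies only the layer on top.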

Second, Google can lose share to search start ups. Dr. Raghavan said:

According to comScore data, Google had a 62% share of the US search market in May, while we had 21% and MSN 9%. Our prediction models suggest that Google could lose a big chunk of its market share, as BOSS partners and players come in.

My question is, “If Google is vulnerable, why haven’t other well-funded search systems, Microsoft for example, made any headway?” The notion that lots of little mosquitoes can hobble Googzilla is not supported by Yahoo’s many search efforts. These range from Mindset to InQuira, from Flickr search to the deal with IBM, etc. Chatter and projections aside, Google’s share is increasing, and I don’t see much zing from the services using the Yahoo index so far.

Finally, people don’t want to search. I agree. There is a growing body of evidence that keyword search is generally a hassle. Dr. Raghavan said:

Users don’t really want to search. They want to spend time on their work, personal lives and entertainment. They come to search engines only to get their tasks done. We will move search to this new paradigm of getting the task done….

My question is, “How is Yahoo, with its diffused search efforts, its jumble of technologies, and its inability to make revenue progress without a deal from Google, going to reverse its trajectory?” I wish Yahoo good luck, but the company has not had much success in the last year or so.

Yahoo lost its way as a directory, as a search system, and as a portal. I will wait to see how Yahoo can turn its “pushcart full of odds and ends” into a Formula One racer.

Stephen Arnold, August 25, 2008
