Google, Mian Mian, and Revisionism

December 17, 2009

A happy quack to the reader who sent me a link to Computerworld Asia’s “Chinese Author Sues Google over Book Scanning.” I visited China once and learned quickly that figuring out who is on first is tough. This write up is clear enough. An author, the popular Mian Mian, asserts that Google scanned Acid Lover without permission. There are several points in the write up that are fuzzy, maybe even wuzzy:

  • The aggrieved author wants US$8,770
  • The aggrieved author wants a public apology
  • Chinese authors want Google to pay them when their books are scanned for Google Books.

The key revisionist passage for me was:

“Google earlier argued that they didn’t violate copyright law as they only displayed a small amount of text of my book, but I think their move has seriously hurt Chinese writers’ rights,” the paper [China Daily] quoted Mian as saying.

Not fuzzy, certainly wuzzy.

Stephen E. Arnold, December 17, 2009

Quick confession. This is a freebie. I wish I could get paid in yuan. I will contact the Foreign Claims Settlement Commission to point out this situation.

LinkedIn and Faceted Search

December 16, 2009

I know the social search revolution has taken place. The parade has left me unimpressed. I am content in the goose pond. Every few days I swat a pesky comment about my lack of excitement about social search, social networking, and social communications. Folks who are “social” seem to be triggered by my goosely musings. But let’s focus on reality. LinkedIn has become a giant job hunt and self promotion service in my opinion. That’s fine. I learned yesterday from a person talking about JobAngels.org that unemployment may be higher than the official government figures. Almost any service that promises an opportunity, no matter how slight, to land a gig that pays will attract attention. That may say more about the nature of social life in 21st century America than online in my opinion.

Image source: http://www.emergingspirit.ca/files/images/Unemployment-LR2.preview.jpg

I grant that there are some interesting chunks of content on services such as LinkedIn. One of my colleagues is active in the social search discussion group. In my opinion, high quality content is the exception, not the rule. As the economy worsens, those without a job are looking for online services to be more than search and retrieval. Online is one more communications tool. It is, therefore, not surprising to me that the “traditional methods” of generating income that engaged my father are anachronistic or dropping like flies when the county sprays 2,4-D over this goose’s mine run off pond.

LinkedIn has been tough to use since I “joined” the service years ago. I have one of the goslings manage the requests to join my network. The few times I have offered a substantive comment, I received email about my inability to provide what the “members” of my “network” wanted. Tough. If folks don’t like my opinions, there is an easy answer. Click away.

I heard at a conference on December 14, 2009, that Endeca is providing the faceted search system for LinkedIn. Let’s assume this is true. Is search what LinkedIn needs? A related question:

“Will the Endeca search system and ‘true guided navigation’ improve LinkedIn?” My answer, “Some.”

I can hear LinkedIn saying, “Better search means a better search experience.” Okay, okay. But what’s the point of putting a new coat of paint on a house without getting a home inspector to look for those glitches that make the difference between a good deal and a money pit? LinkedIn is one example of a user-fueled information system. It is easy to allow these services to operate without any significant editorial oversight. But is that what I want?

No. I want professional services to have more than eBay style controls and the “wisdom of crowds.” I expect professional services to add knowledge value. Frankly I don’t think search does that. An externality like search software cannot remediate certain structural and economic factors that make an information service what it is. I anticipate that lots of search vendors will disagree. That’s okay, just check out the editorial policy of this Web log before you try to reform me with your opinions.


The new interface. Search cannot solve content quality problems. Marketers may make this assertion. The reality is that finding baloney does little to help me locate Kobe beef. Fancy search is not able to address the substantive content issues. More sophisticated content processing is required. Where’s a reputation score? What about lineage?

First, in the new faceted LinkedIn interface, the inclusion of hot links that allow a person to locate related information is useful. However, when the people using the service are on the hunt for jobs, I see no substantive change in the trajectory of LinkedIn knowledge value. Software is external. By itself, code does not impart knowledge value to information that may have deeper issues. What about a user like me? I am not looking for jobs. I am not looking for classmates. I am not looking to find an expert. In short, the information payload of LinkedIn for the type of work I do is unchanged with a better search system. The content is the issue, not the access. Search in many organizations is a bandage that is too small to cover the deeper problems in the service.

Second, guided navigation is now a commodity. One can implement the function with open source tools or just get a low cost solution from Microsoft, one of the leaders in driving down the perception that facets are better, faster, and cheaper. With a content base like LinkedIn’s, adding facets is not rocket science, and I wonder why the firm has not moved more quickly to implement a more modern search system. The slow shift to a more robust search system is encouraging. I hope that the firm dives into the deeper issues of its social service. LinkedIn is not bad, but I think it could be significantly better.

Third, search will not address the interface issues, the begging-for-dollars screens, the confusing set up of the site itself, and the deteriorating quality of the content on the site. Obviously when users can build a profile and post information, the “quality” is in the hands of the users. In a professional service (which I assume LinkedIn wants to be), the proliferation of job ads, specious assertions about expertise, and the recycling of content by those who want to inform group members creates noise. LinkedIn has an editorial obligation to ensure that the system has content that merits inclusion. In my opinion, a search system that finds information goes half way. Smart search needs to calculate quality, filter baloney, and assign “reputation” scores. The “owner” of the Web site must experiment constantly to find ways to minimize noise as perceived by a paying customer from the useful content objects in the system.
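For readers who want to see how little machinery guided navigation needs, here is a minimal sketch in Python. It is an illustration only, not Endeca's or LinkedIn's implementation. The profile records, the field names, and the helper functions `facet_counts` and `refine` are all hypothetical.

```python
from collections import Counter

# Hypothetical profile records; names and fields are made up for illustration.
PROFILES = [
    {"name": "A", "industry": "Software", "location": "Louisville"},
    {"name": "B", "industry": "Software", "location": "London"},
    {"name": "C", "industry": "Publishing", "location": "London"},
]

def facet_counts(records, fields):
    """For each facet field, count how many records carry each value."""
    return {f: Counter(r[f] for r in records if f in r) for f in fields}

def refine(records, field, value):
    """Narrow a result set to the records that match one facet value."""
    return [r for r in records if r.get(field) == value]

# The counts drive the clickable links; a click triggers refine().
counts = facet_counts(PROFILES, ["industry", "location"])
london = refine(PROFILES, "location", "London")
```

Note what the sketch does not do: it counts and filters whatever is in the records. If the records are baloney, the facets are baloney with tallies attached, which is the point of the three observations above.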

A quick example is that several people post links to the content of this Web log in the search group. I don’t have a problem with this. My concern is that I write articles that are often humorous. I suppose I could put a label on the write ups that are intentionally Swiftian such as my statement that the USPTO’s search service is great. It is not. It is awful and it should have been fixed years ago. Posting one of my Swiftian stories without context creates the impression that a particular article is “real”, not a jibe. Posting multiple stories from my Web log is laziness, not substantive information. I would prefer that a person cite my writing and then add new information to the topic. I see similar laziness when a person asks to be on my “network”. I pay someone to be “me” in the social space. When I get a name and zero context, I have told my online counterpart to ignore the request. Perhaps LinkedIn could provide a sample of what an “add me to your network” email looks like. Seems simple enough to me.

I am delighted that Endeca or whatever search vendor sold this job made a sale. I want to be clear about my opinion. LinkedIn needs more aggressive, direct action in these areas before a search system will deliver a payoff to me; for example:

  1. Filtering of job ads. I find pleas for people to help a recruiter earn some bucks or Euros by placing an Autonomy SME annoying
  2. Enforcing some editorial controls on the content and putting in place guidelines for cross posting. The goal should be to add value, not point to articles out of context
  3. Providing a service that can be navigated without recourse to a search box. I have to tell you that when the person I pay to be “me” on the social networks shows me what LinkedIn looks like, I am baffled. I can’t tell what’s what. When I walk through the response to a request to join my network, I need more than “met you at a conference”. Not a chance that I will remember this person. I instruct my agent to ignore such requests. Detail needed!
  4. The fee approach is okay. Just make it clear what is provided. As far as I can tell, paying gets more of what’s free. I will pay for substantive services. Why not make it clear what one gets for how much? Maybe I don’t use the system enough to see this basic information? Perhaps LinkedIn may want to look at their core presentation of information with the trifocals of a 65-year-old addled goose?

I suppose LinkedIn will point out that I am indeed little more than an addled goose. That’s par for the LinkedIn course. High value content is needed and lots of it. Then search is useful.

I would have used a different approach by putting the money into improving content and the core navigation logic. Searching flawed content does not constitute a net gain for me. You may find the LinkedIn guided navigation just what you need. For me, more substantive work is required. Just my opinion.

Stephen E. Arnold, December 16, 2009

Oyez, oyez, I want to report to the American Battle Monuments Commission that I was not paid to write about LinkedIn, its content, its navigation, and its business model. I anticipate a dust up in the comments section of this Web log. That’s the price of not looking for work, not being a social goose, and not rolling over when search is positioned to solve problems that information retrieval cannot address no matter how much dough is wrapped about the spiced apples.

Computerworld Does SEO the Holiday Way

December 16, 2009

I enjoy irreverence. This addled goose often practices the art, taking his yoga mat to the side of the mine run off pond and stretching amidst the acid fumes in Harrod’s Creek. Unfortunately getting indexed in a search engine is a stress inducer. In fact, if an organization is not in Google, that organization may not exist for some customers and prospects. Not surprisingly, making certain a Web site is Google friendly is important and to some organizations no laughing matter.

If your Web site is an also ran in the Google engine results list when you search for your product or firm, you may enjoy “Ten Tips to Make Sure Your Firm Is Ranked by Search Engines.” The idea is that 10 tips for good SEO are packaged in holiday wrap; for example, “Build new toys” is a token for your firm giving Googzilla money for AdWords in order to promote your Web site.

How useful are these 10 tips? I think more effort was invested in creating the holiday bon mots than in focusing on specific recommendations to readers. Missing from the list are several tips that, in the goose’s experience, are semi useful in making sure that Googzilla indexes a Web site and generates a results list ranking that does not embarrass.

Here are our tips, and no faux holiday cheer is needed for these. The goose lacks the literary polish of the Computerworld wordsmiths:

  1. Have substantive content that uses concrete words and phrases. Google sucks at poetry, so crunchy words are better than phrases like “Test run on the sleigh”. Google likes the notion of semantic vectors. The goose is not sure about sleighs.
  2. Make certain that the pages you expose to the Googlebot comply with Google’s Webmaster guidelines. Microsoft and Yahoo sort of track Google’s suggestions, and you may want to follow along.
  3. Obtain legitimate backlinks that relate to the content on your site.

In short, read the Computerworld piece for some seasonal joy. For some pragmatic SEO, follow the goose’s recommendations.

Stephen E. Arnold, December 16, 2009

Oyez, oyez, National Archives. You need to know I was not paid to point out that the deathless prose of Computerworld was created by someone who was paid for the 10 tips. The goose was not paid anything. Which write up do you think should be placed in the National Archives? I vote for the goose’s.

Content Guide

December 16, 2009

With the furor over copyright, I assumed that “free content” was going the way of the dodo. I was wrong. If you are looking for “free downloads”, you may want to take a look at “100+ Sites to Download Everything Online.” Some of the links struck me as quite useful; for example:

  • Audio books
  • Books and documents
  • eBooks.

Useful post.

Stephen E. Arnold, December 15, 2009

I feel compelled to report to the Federal Mine Safety and Health Review Commission that I was not paid to point out where an industrious person can dig for free content.

Bing and Fast Food

December 16, 2009

Short honk: I am recycling a post from Gizmodo. I found this post a poetic flight of fancy. The news item was “Has Anybody Used Bing to Find the Nearest Arby’s? Whoa, Man. Whoa.” Here’s the text:

The search engine Bing is kinda like Arby’s. You know it exists, but you never eat there.

How inappropriate. I feed the goslings at Arby’s at least twice a month. I do use Arby’s coupons. I do not get cash back, and I know the goslings appreciate the limited menu, curly fries, and bottomless soft drinks. How can this culinary experience relate in a metaphorical sense to an expensive search technology, tony UX, and an endlessly scrolling results list for images? No comparison. The goslings like Arby’s.

Stephen E. Arnold, December 16, 2009

I have to reveal to the Bureau of Labor Statistics that I am working really hard to knock down this cruel poetic jab directed at Microsoft, Bing.com, and the image of the Web search system. I will have some extra curly fries, but I have to pay for them. You don’t have to pay for this short honk and I was not paid to write it.

Are Google Users Ready to Step Up to Fusion Tables? Nah.

December 16, 2009

WolframAlpha and Google have a tiny challenge. Both firms’ rocket scientists and algorithm wranglers understand the importance of herding data. Take this simple test. Navigate first to WolframAlpha and enter a word pair. Try UK population. Now navigate to Google’s public facing Fusion table demo here. What did you get? How did it work? Do you know why the systems responded as they did? How do you improve your query?

My hunch is that few readers of this Web log can answer these questions. Agree? Disagree? Well, I am not running an academic class, so if you flunked, that’s okay with me. I think most people will flunk, including some of the lesser lights at the Google and at WolframAlpha.

Against this background, the Google rolled out an API for Fusion tables. You can get the Googley story in the write up “Google Fusion Tables API.” My view is that Google’s moves in structured data are quite important, generally unknown, and essentially incomprehensible to those who suffered through high school algebra.

My opinion is that this API will result in some applications that will make Google’s significant commitment and investment in structured data more understandable. If you are ahead of the curve, the Google is on the march. If you have no clue what this post means, maybe you should think about changing careers. Wal+Mart greeter is somewhat less challenging than the intricacies of Google’s context server technology.

Stephen E. Arnold, December 16, 2009

Okay, I rode by Google’s DC headquarters. No one waved. No one paid me. I suppose I should report this fact to the manager of the Union Station taxi dispatchers. Nah, those folks don’t care that this is a freebie either.

Record Labels Pivot Point: Saturday Night Fever

December 15, 2009

I don’t know much about the record industry or the music business. I know that certain segments squabble. Once in a while a record mogul gets killed. That’s business in the US of A, I suppose.

I found “Understanding The Decline And Fall Of The Major Record Labels” interesting. The idea that stuck with me after I finished reading the TechDirt article was that

Having reached the peak of the CD boom in 1999, the record industry had become a nearly $15-billion-a-year juggernaut, but under the pressure for more growth they collapsed, and, in the process, a vicious cycle of expectations had been set that strained the artists, the fans, the culture, and their systems to the point of breaking. Since record industry was unable to deliver new music with “consistent tactical excellence,” they began to fray at the edges. Disruptive technologies were released, an epidemic of file-sharing proceeded, and, at this critical juncture, vested interests of music executives struggled and competed to achieve repetitive consumption through obsolescence. But these executives were too late, as the record industry, by externalizing the blame for their decline in sales, had already started to show symptoms of stage three, Denial of Risk and Peril.

This snippet originated in a Hypebot post by Kyle Bylin.

But another interesting comment appeared in the comments to the TechDirt article. I quote a comment from Mr. Panik:

News papers became obsessed with profits at the expense of the “editorial” content. Auto makers put profit ahead of quality. Health care became wealth care. TV, music, movies have all been selling shoddy but charging customers premium prices. Do not get me started on education…. Universities are stealing the students research, hoping to make a profit with out compensating the producer. Flipping houses? Selling broken software? The food we get may be killing us? Just for profit? Lets hope that the “Best Government That Money Can Buy” will step in and save us.

I found the statement intuitively on target, and it applies to other information sectors. In my opinion, the culprit in 2010 will be Eric Schmidt dressed in John Travolta’s costume for the dance competition in Saturday Night Fever. History repeats itself in my opinion.

Stephen E. Arnold, December 15, 2009

I wish to disclose to the American Film Institute that I was not paid to write this article or craft this awful comparison. Eric Schmidt is a better dancer than John Travolta in my opinion.

Oracle and Open Source

December 15, 2009

Open source has a future in the enterprise. IBM has made its commitment to open source clear. I can license a mainframe running an open source Linux OS. I know that IBM has a revenue imperative; that is, the company takes technical steps in order to generate revenue. I suppose this means that IBM is pragmatic, and it suggests that open source in this one instance may not be “open” in the sense that some of those in the open source community understand the term.

The same can be said of other commercial open source “plays”. Some are positive. Last week in London, Charlie Hull, Lemur Consulting, explained his firm’s commitment to open source, the open source community, and Lemur’s customers. I like his approach.

When I read “Oracle Makes Commitments to Customers, Developers and Users of MySQL,” I found myself asking some questions. Why is the deal between Oracle and Sun Microsystems stuck? Why is there so much consternation about the MySQL database? Why is Oracle making public commitments to a governmental group half a world away?

The write up said:

No later than six months after the anniversary of the closing, Oracle will create and fund a storage engine vendor advisory board, to provide guidance and feedback on MySQL development priorities and other issues of importance to MySQL storage engine vendors.

User groups—particularly uncontrolled user groups—and advisory boards can become problematic. I have seen a number of user groups become focal points for certain issues in enterprise software. The recent shift to software vendor owned and operated conferences is one reaction to the uncontrolled user group.

In my opinion, a certain large software vendor will release an open source data management system that will undermine today’s commercial and open source systems. If and when this release takes place, I think the data management world will face significant disruption. In fact, the concern about MySQL could accelerate this disruptive action. I don’t think that Oracle will be able to “control” this “advisory board”. Control is a large part of a successful publicly traded company.

Furthermore, Oracle’s apparent inability to get this deal wrapped up may be the inadvertent trigger for an even more disruptive event. Will Oracle’s assurances be enough for the European Union watchdogs? My hunch is that traditional software vendors will find themselves bitten by their own business processes. Just my opinion.

Stephen E. Arnold, December 15, 2009

Oyez, oyez, I am delighted to report to the Alcohol, Tobacco, Firearms, and Explosives Bureau (Justice) that I was not paid to write this statement about the lack of adaptability in large enterprise software companies. This is an explosive idea: open source and the enterprise.

Wave Goes Down Drain?

December 15, 2009

I read a suggestive article “Terminal Wave: The Google Wave Failure” by themilwaukeeseo. The basic idea is that Wave is going down the drain because it has flaws, lots of them. You can read the article for the details of Wave’s inadequacies. A key point for me was:

No one can figure out how to use it effectively. It’s not that people don’t understand the basic notion of how to compose a WAVE, or even how to add in other people, but it’s not nearly as fluent as it was made out to be.

Several points:

  1. Wave is a typical Google service. It is not a commercial product. Wave is a demo, or, more accurately, a beta demo
  2. Google has a tough time thinking for the average Joe or Jill. Google is trying to make something usable for what Google thinks is an average Joe or Jill, not what an average Joe and Jill actually are
  3. Wave may be a part of a far larger data management system. Viewed this way, Wave may be the equivalent of a telescope poked from the Google submarine to see what life on the surface is.

My thought is that those who want to surf on Google may want to splash in the Wave. Learning to swim may be preferable to getting swamped when the big one rolls ashore.

Stephen Arnold, December 16, 2009

Oyez, oyez, I want to reveal to NOAA that this coming digital Wave was offered without compensation. Yep, another freebie.

Kngine: Web 3.0 Search

December 15, 2009

A happy quack to the reader who alerted me to Kngine, not to be confused with Autonomy’s origin kinjin. I think both are pronounced in a similar way. Kngine (based in Cairo) is an:

evolutionary Semantic Search Engine and Question Answer Engine designed to provide meaningful search result, such as: Semantic Information about the keyword/concept, Answer the user’s questions, Discover the relations between the keywords/concepts, and link the different kind of data together, such as: Movies, Subtitles, Photos, Price at sale store, User reviews, and Influenced story. We working on new indexing technology to unlock meaning; rather than indexing the document in Inverted Index fashion, Kngine tries to understand the documents and the search queries in order to provide meaningful search result.

There is some information about Kngine’s plumbing in the High Scalability Web log. The system uses “semantic technology”. One interesting feature of the system is snippet search. The idea is:

Snippet Search results will consist of collection of rich ranked paragraphs rather than collection of documents links. Snippet Search paragraphs is semantically related to what you looking for (i.e. content what you looking) so we will be able to get what he looking for directly without open other pages.

Haytham El-Fadeel in his blog provided additional color about the search system. He wrote on September 4, 2009:

Kngine long-term goal is to make all human beings systematic knowledge and experience accessible to everyone. We aim to collect and organize all objective data, and make it possible and easy to access. Our goal is to build on the advances of Web search engine, semantic web, data representation technologies a new form of Web search engine that will unleash a revolution of new possibilities.

I ran a number of queries on the system. I found the results useful. My query for Amtrak provided relevant hits, some suggested queries, and a thumbnail.


You can contact the company at Info@Kngine.com.

Stephen E. Arnold, December 15, 2009

Okay, okay, someone fed me date nut bread this morning in the hopes I would write about their product. That did not work. I ate the date nut bread and wrote about this outfit in Cairo. I guess this shows that you can pay this goose, but the goose does what it wants. Honk.
