Google Gains Three Patents
May 21, 2009
Now that Google: The Digital Gutenberg is out, I want to catch up on some interesting invention news from the Google.
First, Anna Patterson (former Googler and founder of Cuil.com) assigned her phrase invention to Google. Now that invention (discussed in my Google Version 2.0) has been awarded a patent. Phrase detection is important in content processing and Dr. Patterson’s approach is suggestive. You can find the document US7536408 here.
Second, Marissa Mayer (a serious Googler) and some colleagues such as Krishna Bharat received a patent for query rewriting with entity extraction. Entity extraction is a core method that surfaces in the Guha disclosures and in the work of Alon Halevy and his team. This is an important method in my opinion and the patent document here said:
A system receives a search query, determines whether the received search query includes an entity name, and determines whether the entity name is associated with a common word or phrase. When the entity name is associated with a common word or phrase, the system generates a link to a rewritten query, performs a search based on the received search query to obtain first search results, and provides the first search results and the link to the rewritten query. When the entity name is not associated with a common word or phrase, the system rewrites the received search query to include a restrict identifier associated with the entity name, generates a link to the received search query, performs a search based on the rewritten search query to obtain second search results, and provides the second search results and the link to the received search query.
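The branching in that abstract is easier to follow as code. What follows is a minimal sketch under loud assumptions, not Google's implementation: the lookup tables, the `source:` restrict syntax, and the names `find_entity`, `handle_query`, and `Response` are all hypothetical, and the abstract does not say how entity names are detected or how a "common word" is decided. The sketch only shows which query gets executed and which one is offered as a link.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tables: entity name -> restrict identifier (invented syntax).
ENTITY_RESTRICTS = {
    "new york times": "source:new_york_times",
    "time": "source:time_magazine",
}
# Hypothetical set of entity names that are also common words or phrases.
COMMON_WORDS = {"time"}


@dataclass
class Response:
    executed_query: str            # the query whose results are shown
    alternate_link: Optional[str]  # the other form of the query, offered as a link


def find_entity(query: str) -> Optional[str]:
    """Return the longest known entity name that appears as whole words in the query."""
    padded = f" {query.lower()} "
    for name in sorted(ENTITY_RESTRICTS, key=len, reverse=True):
        if f" {name} " in padded:
            return name
    return None


def handle_query(query: str) -> Response:
    name = find_entity(query)
    if name is None:
        # No entity name found: run the query unchanged, no link.
        return Response(query, None)
    # Rewrite the query "to include a restrict identifier" for the entity.
    rewritten = f"{query} {ENTITY_RESTRICTS[name]}"
    if name in COMMON_WORDS:
        # Ambiguous name: search the received query, link to the rewritten one.
        return Response(executed_query=query, alternate_link=rewritten)
    # Unambiguous name: search the rewritten query, link back to the received one.
    return Response(executed_query=rewritten, alternate_link=query)
```

In this toy setup a query such as "time travel" runs as typed and offers the restricted version as a link, while "election new york times" runs in restricted form and offers the original as the link. The design point is that the system hedges only when the entity name could be an ordinary word.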
The third disclosure that caught my attention was a system and method for Web page authoring for structured documents. The invention adds some beef to the JotSpot and Knol bones. You can find US7536641 here.
You may notice that I am pointing to the FreePatentsOnline.com system, not directly to the USPTO. Performance of the USPTO system has annoyed me in the last few months. Whatever FreePatentsOnline.com is doing, I find the response time somewhat more zippy.
Keep in mind that wizards from Microsoft and other big firms point out that Google does not make use of its disclosed technology. In fact, obtaining a patent or filing an application may be little more than intellectual sky writing. Categorical affirmatives are often mushy when applied to an outfit like Google. Make your own decision.
Stephen Arnold, May 21, 2009
Addled Goose in Flight
May 20, 2009
No joke. This is a real goose photo. You can see the original here and read the caption. Addled geese are agile, talented, and good looking. Here’s a copy of this inspirational image. If the image does not appear, click here. To get enterprise search to work, one needs this degree of flexibility.
Copyright Telegraph, 2009.
Stephen Arnold, May 20, 2009
Google Books: Peace, Capitulation, or Tactic
May 20, 2009
My newsreader spat out Miguel Helft’s “Google Book-Scanning Pact to Give Libraries a Say in Prices”. You can read the story here. Mr. Helft reported:
In a move that could blunt some of the criticism of Google for its settlement of a lawsuit over its book-scanning project, the company signed an agreement with the University of Michigan that would give some libraries a degree of oversight over the prices Google could charge for its vast digital library.
Mr. Helft does not tackle the strategic implications of this deal. I am not going to tackle them either. I want to ponder whether this means peace, capitulation, or another tactic.
Stephen Arnold, May 20, 2009
IBM Goes the Toaster Route
May 20, 2009
I had to slap my beak two times after reading “IBM Rolls Out the ‘Smart Cube’ with App Market: Think Enterprise iPod-iTunes Combo” here. The news story by Larry Dignan appeared in a ZDNet Web log here. The article explained that IBM has taken one of its expensive servers, preloaded business software on the gizmo, and created an online store to sell lucky owners of a Smart Cube more software. Mr. Dignan explained that I should think of the Smart Cube and its $8,000 starting price as a variant of the iTunes iPod play from Apple.
That’s a pretty good analogy.
When I visited the Smart Cube online app store here, I ran a query for enterprise search using the Smart Market search system. What did I find? I saw four hits that included the word enterprise and search but none of the hits provided a link to a search and retrieval system.
With search an important service, I was surprised that none of IBM’s many partners in enterprise search made it to the Smart Market store. Even more interesting was that I could find zero references to the Yahoo IBM free version of Lucene that I have running on one of my NetFinity servers.
My thoughts:
- Search is simply not perceived as a serious enterprise application by the innovators who created the Smart Cube and Smart Market. In my opinion, this makes the Smart Cube somewhat dumb. Your mileage may vary.
- The search system for the Smart Market itself defaults to an OR, not an AND. This yields the four false drops in my result set here.
- IBM has not made substantial progress in [a] either its own packaging of must-have enterprise functions or [b] the search system for its own Web sites.
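The OR-versus-AND point in the list above is simple to demonstrate. Here is a minimal sketch, assuming a made-up four-item catalog and a hypothetical `search` function; it is not the Smart Market's actual system, only an illustration of how an OR default produces false drops for a two-word query.

```python
# Hypothetical catalog titles, invented for illustration only.
CATALOG = [
    "Enterprise Accounting Suite",
    "Enterprise Backup Manager",
    "Job Search Portal",
    "Patent Search Toolkit",
]


def search(query, titles, mode="AND"):
    """Return titles matching the query terms under AND or OR semantics."""
    terms = query.lower().split()
    # AND requires every term to appear in a title; OR requires at least one.
    match = all if mode == "AND" else any
    return [
        title for title in titles
        if match(term in title.lower().split() for term in terms)
    ]


or_hits = search("enterprise search", CATALOG, mode="OR")
and_hits = search("enterprise search", CATALOG, mode="AND")
```

With this catalog, the OR query returns all four titles even though none is a search system, while the AND query returns nothing. An empty result set is less flattering, but it is the more honest answer.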
For this addled goose, this is a dumb move for both the Smart Cube and the Smart Market. At least my iTunes offers a search function. It is lame in my view, but it does offer the service, unlike the Smart Cube.
Stephen Arnold, May 20, 2009
Yahoo: Chasing Google with Semantic Intent
May 20, 2009
Information Week’s story “Yahoo Aims to Redefine What It Means to Search” which you must read here brought a tear to the eye of the addled goose. Yahoo aimed its former IBM and Verity “big gun” at Googzilla and fired a shot into the buttocks of their Mountain View neighbors. Mr. Claburn, the author of the Information Week story, offered:
As described by Raghavan, Yahoo is directing its search efforts toward assessing user intent. When a user types “Star Trek,” Raghavan said, he doesn’t want 10 million documents, he wants actors and show times.
Information Week approaches the yawning gap between Google and Yahoo in a kinder, gentler way. Thomas Claburn wrote:
it’s perhaps understandable why Yahoo might want to re-frame the debate. Given its lack of success challenging Google directly — Google’s April search share in the U.S. reached 64.2%, a 0.5 point gain, while Yahoo’s search share fell to 20.4%, a 0.1 point decline, according to ComScore — Yahoo wants to change the game.
How will Yahoo deliver its better mousetrap?
Yahoo is relying on its partners to feed it with structured data.
Google’s approach includes algorithmic methods, the programmable search engine methods (Ramanathan Guha), and user intent (Alon Halevy). Yahoo, on the other hand, wants Web site operators and other humans to do the heavy lifting.
Yahoo’s focus on user intent could lead to happier users, if Yahoo Search can guess user intent accurately. It could also help Yahoo make more money from advertising. “If we can divine the user’s intent, that’s obviously of great interest to advertisers,” said Raghavan.
Advertisers want eyeballs of buyers. Google delivers eyeballs in droves. One percent of two billion is a useful segment. Yahoo has struggled to: [a] deliver segments that make advertisers abandon Google’s big data method for the flawed Panama system, [b] monetize its hot, high traffic services like Flickr in an effective manner, and [c] put real flamethrowers on the GOOG’s hindquarters, which is what Yahoo has seen since mid 2003.
Yahoo will need divine intervention to close the gap with Google. More importantly, neither Google nor Yahoo has an answer to the surging popularity of Twitter, Facebook, and other real time search systems. I am watching the sky for an omen that Woden is arriving to help the Yahooligans. So far, no portents, just PR.
Stephen Arnold, May 20, 2009
Google Pays So You Can Play YouTube Videos
May 20, 2009
Who pays? Google. TechDirt’s story “YouTube Ordered to Pay $1.6 Million To ASCAP” reported that a legal shoe dropped. You can read the full story here. For me the interesting point was:
ASCAP gets to go in and demand cash from anyone who benefits from music anywhere, and a judge sorta randomly makes up reasons to give them cash. I know that ASCAP supporters will claim that the money is for songwriters, not the record labels, and it’s important and blah blah blah. But the whole system of such collective licenses is a mess that it makes it close to impossible to do anything with music without getting yourself into a huge licensing hole. For more than a century now, Congress and the courts seem to look at every innovation and simply slap another license fee on it, and leave it to the courts to sort out any mess. All of these license fees add up to a massive tax on innovation that divert money from good business models and into the hands of collections societies, who siphon off a piece and often don’t do a very good job distributing that cash. It’s a massively inefficient model that’s simply not needed.
Economy day for the Google.
Stephen Arnold, May 20, 2009
The Real Time Search Delusion
May 20, 2009
We are in for an exciting autumn. I keep looking for signs that Microsoft is making headway in the new Administration. I thought Google would nail some major deals, but I have a hunch that Redmond has put a stick in the Google’s advance in one key US government agency. More about this when a “real journalist” posts a Kobe beef news story. I just link and comment as an addled goose should.
I was thinking about Google’s government snub when a reader sent me a link to the UK’s Guardian newspaper, a good source of anti Google messaging. The story “Google ‘Falling behind Twitter’” did not disappoint.
The point of the story was that Google has been body slammed by Twitter in the area of real time search based on microblogging. For me, the most interesting segment in Richard Wray’s story was these remarks allegedly made by Googlers here:
“People really want to do stuff real time and I think they [Twitter] have done a great job about it,” Page said in a closing address at Google’s Zeitgeist conference. “I think we have done a relatively poor job of creating things that work on a per-second basis.”
Let’s think about this. Twitter is not new. Twitter contains short comments that are broadcast to people. Twitter is the hot info company. Twitter makes no sense to some people.
If the Guardian’s story is accurate, Google is just now realizing that it missed the Twitter brass ring. A question to ask the Googlers is, “Why?” The Google fix is to build Twitter applications and tap the buzz that way. Good plan, but the approach concedes, in my opinion, that Twitter at the moment owns the real time consumer content space. Said another way, Google missed a chance, just as Microsoft missed Web search and IBM missed on PC operating systems.
What I find fascinating is that the time between these business misses (what Ben Gilad calls “business blindspots”) is decreasing. With fewer opportunities to rework online content, a miss today has a greater chance of going critical without warning.
Google glitches, Google legal hassles, and now a Google business blindspot. Twitter that.
Stephen Arnold, May 20, 2009
Google Health
May 20, 2009
A battle is shaping up among some heavy hitters for digital health services. If you want a useful summary of what Googzilla has been doing, you can click here to read Mark Gibbs’s overview of the service. For me the most interesting comment was:
Google Health provides an API based on a subset of the “Continuity of Care Record” API described as “a standard format for transferring snapshots of a patient’s medical history.” This API allows developers to build software that can create and read consumer’s medical records with sophisticated authorization and access controls.
Not much about search and data mining in the story, however. Keep in mind that Google products and services have search baked in. Google seems to be pursuing a consumer strategy whilst Microsoft is chasing the health enterprise. Lots of excitement coming in this sector. Health information is in the same sorry state as the US health care system. My thought is that it will evolve along the same lines as the US auto and airline industry. That’s a comforting notion, isn’t it?
Stephen Arnold, May 20, 2009
Wired Fraying and Shorting Out
May 19, 2009
Making money with electronic information is tough. Making money writing about the wired world is also difficult. Joel Johnson’s interesting “Welcome, Wired. We Call This Land Internet” here provided me with a useful anecdote for an upcoming talk I will be giving at an NFAIS conference at the end of June 2009. Mr. Johnson informed me that Wired Magazine may be killed off. No surprise. The magazine business was challenging when readers did not worry about dead trees and the chemicals in ink, distribution costs, and the millions of direct mail solicitations required to build a subscription list. In May 2009, the magazine business is different from those salad days between the late 17th century and 2008. Mr. Johnson wrote:
Wired is great print, but if the magazine can’t make money and is shuttered, taking the website down with it, I’m going to be livid. Not that making money online is easy—it’s not, especially without sacrificing your ethics and your voice—but if any mainstream outlet should be able to make the transition, it should be Wired. I fear that may be impossible, not just for Wired but for all these old brands, because they can’t accept that the work at which they have excelled for years will be just as important when it’s online—and online only.
I bought two magazines at the airport news kiosk this morning. The total price was about $14. I paid for them, but I was the only person in the shop in Washington Reagan Airport buying magazines. With buyers like me in short supply and advertisers trying to figure out how to maximize their ad dollars, Wired is not the only traditional publication to face a problematic future.
I also thought about the Wired wizards who described the brave new digital world. It is one thing to write about electronic information. It is quite another to make money from a print publication that contains information about online. I don’t think the Wired Web site can survive in its present form, regardless of the fate of the print publication.
Again. That pesky writing and doing problem.
Stephen Arnold, May 19, 2009
Stating the Old and Obvious: Google Makes Book Grab
May 19, 2009
In this morning’s Washington Post (May 19, 2009), I was captivated by an essay by Brewster Kahle called “A Book Grab by Google.” You can find the online version of the story here, hiding behind an annoying pop up ad with the close button tucked in the upper left hand corner. Like this is going to make me buy whatever was on the annoying red ad thing? Mr. Kahle is a computer legend of sorts, and at the Library of Congress he has a killer rep. He provided the LoC with a copy of the Internet to tuck into its treasure trove of books, manuscripts, and fungible artifacts that jam its warehouses.
The trigger for the write up is an impending court decision about Google and its now-contentious deal to scan books. In the essay, Mr. Kahle stated:
If approved, the settlement would produce not one but two court-sanctioned monopolies. Google will have permission to bring under its sole control information that has been accessible through public institutions for centuries. In essence, Google will be privatizing our libraries. It may seem puzzling that a civil lawsuit could yield monopolies. Traditionally, class-action lawsuits cluster a group of people who have suffered the same kind of harm as a result of alleged wrongful conduct. And under this settlement, authors who come forward to claim ownership in books scanned by Google would receive $60 per title.
He concluded:
This settlement should not be approved. The promise of a rich and democratic digital future will be hindered by monopolies. Laws and the free market can support many innovative, open approaches to lending and selling books. We need to focus on legislation to address works that are caught in copyright limbo. And we need to stop monopolies from forming so that we can create vibrant publishing environments.
Several thoughts flapped through the addled goose’s mind this morning; namely:
- There’s been a whole lot of scanning going on over the years. University Microfilms, now reincarnated as ProQuest, owned by Cambridge Scientific, has been processing dissertations, newspapers, and other publications. Similar digital collections have been built by companies with an earthworm-like profile. For example, sector leaders include Ebsco, Thomson Reuters, Reed Elsevier, and dozens of others. None has fired anyone’s imagination, yet one might view each of these collections as expensive and quasi-monopolistic. Google ignites considerable discussion; its fellow scanners elicit a bored “who” and “what”?
- The Google Book gig has been cranking along for years. The scanning project is not new, and I recall hearing that some, maybe all, of the participating libraries get a copy of the data for their archive. With amalgamated online public access catalogs, the libraries have a door through which those interested in the scanned material can walk. Unless I am missing something, I can visit most university libraries in Kentucky, use the facilities, and pay not a cent for the access. I like visiting libraries in person and online access is more of a pre-research step for me, not the main deal. My Kindle hurts my eyes, and I have to put rubber shelf liner on my chest when I prop my Dell Mini 9 under my chin to read much more than a screen of content in the evening. I don’t even print out my own books. I wait until the publisher sends me a hard copy, I buy one at the bookstore, or I order an out of print item from the world’s smartest man’s Amazon service.
- Microsoft and Yahoo jumped into the scanning game and then jumped out. Libraries are not rolling in dough, and I don’t see any of the commercial database outfits stepping forward with the moxie and the money to do a better, quicker, or more thorough job than the Google. Do library associations have the cash? Do government agencies have the cash? Does your university have the cash and management expertise to lead a consortium to scan books? Do publishers have the cash and the technical acumen to scan their books? No, no, no, no. A part of me is grateful that Googzilla is funding this initiative. Most libraries with which I am familiar face serious financial challenges. When libraries close, where do the books go? To yard sales and to second hand dealers. Not to students or researchers, I assert.
- I am not sure this book controversy is about books. After a decade of viewing Google as an online advertising agency and the focal point of the search engine optimization scamorama, pundits, wizards, and mavens want to play catch up. Google—as I have documented in my three Google studies—has been plodding forward a couple of inches at a time for quite a while. In fact, Google Books is not a main event. It is a side show.
I find it amusing that books, not Google’s other and more significant initiatives, have drilled into the incisors of the informed. In my view, books are, like video, in the process of change. Perhaps because books are cultural totems, the scanning project captures headlines.
Google is a new type of enterprise, and it operates on a level and in a dimension that will disrupt more than the book sector. More about these disruptive forces appears in my Google trilogy: The Google Legacy (2005), Google Version 2.0 (2007), and Google: The Digital Gutenberg (2009). More info here.
Stephen Arnold, May 19, 2009