Google Tosses Another Olive Branch
December 3, 2009
In my research, Google typically offers a pretty good deal and if people take advantage of that deal, fine. If not, the Google moves forward. The challenge in my opinion is to figure out if what Google offers is a good deal. Viewed from Google’s perspective, the benefits (one assumes) are obvious. Viewed from the outsiders’ perspective, the deal may be a bad one or just not recognized as a deal. Google does not have a master of ceremonies to ask, “Door Number 1?”
“Google Offers Publishers Limit on Free News Access” (this is an ephemeral Yahoo link so it may go dead without warning, gentle reader) may be a deal or it may not. I can’t say. The idea is that the Google will provide a control panel so a news outfit can jiggle some settings. Sure, this sounds trivial, and it is. But the news outfit has to know how to find the control panel, take the time to jiggle some settings, and then remember that the jiggling constitutes some willful interaction with the Google.
Let’s think about this.
First, the Google is * trying * to accommodate news outfits who ignored Google for 11 years and now want to use if not control the Google playground equipment. That’s fine. A bit late and somewhat unusual in the knock down, drag out American business world. But Google, if the story cited is accurate, is taking a step.
Second, the deal requires that the news outfits know that a deal is offered. Google does not connect the dots as we point out in the six part video series “How to Make Money with Google”. Googlers connect dots without thought. Googlers assume that when Google offers a deal, others can exert sufficient mental energy to see dots and connect them. I recall the risks of using the word “assume”. (Ass-of-you-and-me, according to my grade school teacher in Brazil.)
Third, “control” has lots of implications. These may not be a big deal today but down the road “control” may gather some meaning as Google drives its Hummer forward.
Will news outfits play “Deal or No Deal” or will publishers take Door Number 3? Stay tuned.
Stephen Arnold, December 3, 2009
Oyez, oyez, Government Printing Office. I was not paid to write this fine essay with its references to Monty Hall and Milorad Mandić Manda, host of the Serbian version of Deal or No Deal.
Will Google Ring Its Digital Cash Register for Thee and TV
December 3, 2009
One of the media mavens wrote “Is YouTube Ready for Primetime? Google Wants to Stream TV, for a Fee” and illustrated the write up with pictures of my relatives from Harrod’s Creek cooking over a wood fire. The maven also chose a picture that showed my relatives watching the flames burning instead of sitting in the wooden hovel telling stories about the deer who got away.
The point of this write up to me is summed up in this passage:
YouTube already lets users watch a smattering of TV shows for free, with advertising. Now it envisions something similar to what Apple and Amazon already offer: First-run shows, without commercials, for $1.99 an episode, available the day after they air on broadcast or cable. Sources say the site’s negotiations with the networks and studios that own the shows are preliminary. But both sides seem optimistic, since models for such deals already exist. No comment from YouTube.
I thought that YouTube.com was 99 percent crap and one percent good content. I thought Google lacked the business acumen to do “something” with YouTube.com. I thought YouTube.com was a giant mistake that would function like a giant, digital albatross and make Google into a current version of the ancient mariner. Guess I was misunderstanding previous punditry?
I don’t have too much to add to the maven’s analysis. I won’t even question this statement:
But while Web users have an insatiable appetite for video, they’ve yet to demonstrate much interest in paying for it. If any of this is going to work, that will have to change.
I think I will raise some questions that might be considered:
- What are the core technologies on which this alleged new initiative of Google rests? When were they developed? What other functions do these have, assuming the technologies do indeed exist?
- What are the costs of the alleged new service? How will Google’s present business model benefit from such a push if such a push takes place?
- What other rich media could be affected by such a push assuming the push is extended?
- Why haven’t other companies built a sufficiently significant competitive barrier to entry? What are the weaknesses of the existing services? Is Google’s service competitively flawed? If so, in what way? Is it superior? In what way?
I can generate some other questions that pundits, mavens, azure chip consultants, Google watchers, and other experts should be considering. I wonder if these wizards know the Pope has a TV channel on Google? Digital Gutenberg, anyone?
Stephen Arnold, December 3, 2009
I wish to disclose to American Society of Composers, Authors and Publishers that I was not paid to point out that questions about Google are yet to be answered. Maybe the truth is within one of the many Sergey-and-Larry-eat-pizza books?
Google Nails Duplicate Detection Invention
December 3, 2009
I know that most of my two or three readers do not give a goose feather for duplicate detection. Pretty boring stuff. Google result lists seem to be just one list with few repeating objects. Even in the Google News service, identical stories rarely slip through the digital net.
The ever reliable USPTO has granted a patent to the Google for its duplicate detection method. If you want to know a bit more about the Google approach, you will want to download US7,627,613, “Duplicate Document Detection in a Web Crawler System”. Before my pals at various search and content processing companies email me to explain that their duplicate detection is better, save that energy. No one at the Beyond Search goose pond is asserting “better”. The Google invention deals with scale, petabytes of digital crapola deduped quickly and reasonably effectively. The “scale” idea is one clue to Google’s technology. The challenges of scale are not well understood unless you have to figure out what to do with trillions of instances of digital crapola.
Google says in its glorious prose:
Duplicate documents are detected in a web crawler system. Upon receiving a newly crawled document, a set of documents, if any, sharing the same content as the newly crawled document is identified. Information identifying the newly crawled document and the selected set of documents is merged into information identifying a new set of documents. Duplicate documents are included and excluded from the new set of documents based on a query independent metric for each such document. A single representative document for the new set of documents is identified in accordance with a set of predefined conditions.
Notice what’s left out? Now read the patent document. Notice what’s left out? Google does not make explicit how these separate inventions interlock. Those interlocks are sort of important, particularly if you are a competitor and one of your 20 somethings says, “That’s obvious. I can code that up myself.” Scale. Remember scale. Remember that Google can convert speech to text and then dedupe those outputs too. Scale. Performance. Cost. Useful Google concepts all.
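To make the abstract concrete, here is a minimal sketch of the flow it describes: a newly crawled document joins the set of documents sharing its content, and one representative is chosen using a query independent metric. Everything specific is an assumption for illustration: the `Doc` and `DuplicateIndex` names, the exact-match SHA-256 fingerprint over whitespace- and case-normalized text, and the single `score` field standing in for the metric. The patent abstract does not disclose these details, and this toy says nothing about the scale problem.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    url: str
    content: str
    score: float  # hypothetical stand-in for a query independent metric


def fingerprint(content: str) -> str:
    """Hash of case- and whitespace-normalized text (an assumed notion of 'same content')."""
    normalized = " ".join(content.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateIndex:
    """Groups documents by content fingerprint and tracks one representative per group."""

    def __init__(self):
        self._sets = {}  # fingerprint -> list of Doc sharing that content

    def add(self, doc: Doc) -> Doc:
        """Merge a newly crawled document into its duplicate set; return the representative."""
        group = self._sets.setdefault(fingerprint(doc.content), [])
        group.append(doc)
        # Representative = highest-scoring member; ties keep the earliest arrival.
        return max(group, key=lambda d: d.score)

    def groups(self):
        """Return the current duplicate sets as a list of lists."""
        return list(self._sets.values())


if __name__ == "__main__":
    index = DuplicateIndex()
    first = Doc("http://a.example/story", "Breaking  News today", 0.2)
    second = Doc("http://b.example/story", "breaking news TODAY", 0.9)
    print(index.add(first).url)   # only member so far
    print(index.add(second).url)  # higher score takes over as representative
```

Exact hashing only catches documents that are identical after normalization. Near-duplicate detection and the interlocks with the rest of a crawler are precisely what a sketch like this leaves out.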
Stephen Arnold, December 3, 2009
I wish to disclose to the National Constitution Center that I was not paid to write this essay with its implicit reference to the constitutional right of Google competitors to misunderstand the notion of “scale” in Google’s weird vocabulary.
Microsoft and the Apple iPhone
December 2, 2009
Short honk: I am sitting yet again in the international terminal in Hotlanta’s airport. I hear a conversation between two 30 somethings about their recent Microsoft experience. One had on a Microsoft T shirt, so maybe one was a “real” Microsoft engineer. The snippet of their enthusiastic conversation was that 25 percent of Microsoft employees use an Apple iPhone. I know that the spiffy Microsoft store on the Redmond campus sells only mobile devices with Windows Mobile 6.5 living in the devices’ solid state heart. Pretty startling market penetration if true. With an Android phone from Google heating up the blogosphere, can Microsoft make headway? I wonder if I can get paid to use a Windows Mobile 6.5 phone?
Stephen Arnold, December 1, 2009
I wish to disclose to the General Services Administration that I was not paid to write about the iPhone or the Google Android phone. I presume both of these devices will be on the GSA schedule.
Throwing Water on Real Time Search
December 2, 2009
Short honk: I read this TechCrunch article “Twitturly Sold for a Song” weeks ago. I wanted to point out a passage that I found interesting:
Joel Strellner, who started the project, finally put Twitturly up for sale on Flippa ten days ago, and the auction just ended. Only five bids came in, and the sale ultimately netted no more than $8,500.
Is real time search losing its magnetic force? Looks like it. Since I am talking about real time search, I want to include this example of water being thrown on the hot fires of big money in this corner of the search world.
Stephen Arnold, December 2, 2009
Disclosure: Yep, I am paid to talk about real time search. I am not paid to write about it. Alert to local Justice of the Peace. I am without compensation.
Some Thoughts About Real Time Content Processing
December 2, 2009
I wanted to provide my two or three readers with a summary of my comments about real time content processing at the Incisive international online information conference. I arrived more addled than normal due to three mechanical failures on America’s interpretation of a joint venture between Albanian and Galapagos Airlines. That means Delta Airlines I think.
What I wanted to accomplish in my talk was to make one point—real time search is here to stay. Why?
First, real time means lots of noise and modest information payload. To deal with lots of content requires a robust and expensive line up of hardware, software, and network resources. Marketers have been working overtime by slapping “real time” on any software product conceivable in the hopes of making another sale. And big time search vendors essentially ignored the real time information challenge. Plain vanilla search on content updated when the vendor decided was an easier game.
Real time can mean almost anything. In fact, most search and content processing systems are not even close to real time. The reason is that slow downs can occur in any component of a large, complex content processing system. As long as the user gets some results, for many of the too-busy 30 somethings that is just fine. Any information is better than no information. Based on the performance of some commercial and governmental organizations, the approach is not working particularly well in my opinion.
Let me give you an example of real time. In the 1920s, America decided that no booze was good news. Rum runners filled the gap. The US Coast Guard learned that it could tune a radio receiver to a frequency used by the liquor smugglers. The intercepts were in real time, and the Coast Guard increased its interdiction rate. The idea was that a bad guy talked and the Coast Guard listened in real time even though there was a slight delay in wireless transmissions. The same idea is operative today when good guys intercept mobile conversations or listen to table talk at a restaurant.
The problem is that communications and content believed to be real time are not. SMS may be delivered quickly, but I have received SMS sent a day or more earlier. The telco takes considerable license in billing for SMS and delivering SMS. No one seems to be the wiser.
A content management system often creates this type of conversation in an organization. Jack: “I can’t find my document.” Jill: “Did you put it in the system with the ‘index me’ metatag?” Jack: “Yes.” Jill: “Gee, that happens to me all the time.” The reason is that the CMS indexes when it can or on a specific schedule. Content in some CMSs is not findable. So much for real time in the organization.
An early version of the Google Search Appliance could index so aggressively that the network was choked by the googlebot. System administrators solved the problem by indexing once a day, maybe twice a day. Again, the user perceives one thing and the system is doing another.
This means that real time will have a specific definition depending on the particular circumstances in which the system is installed and configured.
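The gap between what the user perceives and what a scheduled indexer does can be put in numbers. The sketch below assumes a hypothetical system that indexes at fixed hours each day and computes how long a newly written document waits before it becomes findable. The function names and the daily-hours schedule model are illustrative assumptions, not a description of any particular CMS or appliance.

```python
from datetime import datetime, timedelta


def next_index_run(written_at, run_hours):
    """Return the first scheduled indexing run at or after written_at.

    run_hours is a list of whole hours (0-23) at which the assumed indexer runs daily.
    """
    day_start = written_at.replace(hour=0, minute=0, second=0, microsecond=0)
    for day_offset in (0, 1):  # today's remaining runs, then tomorrow's
        for hour in sorted(run_hours):
            run = day_start + timedelta(days=day_offset, hours=hour)
            if run >= written_at:
                return run
    raise ValueError("run_hours must not be empty")


def findability_delay(written_at, run_hours):
    """How long a document written at written_at waits for the next indexing run."""
    return next_index_run(written_at, run_hours) - written_at


if __name__ == "__main__":
    written = datetime(2009, 12, 2, 9, 30)
    print(findability_delay(written, [2]))      # once a day at 02:00
    print(findability_delay(written, [2, 14]))  # twice a day at 02:00 and 14:00
```

Under these assumptions a document saved at 9:30 am waits 16.5 hours with a single nightly run and 4.5 hours with two runs. Neither is what most users would call real time, and the figures ignore the time the indexing pass itself takes.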
Several business sectors are gung ho for real time information.
Financial services firms will pay $500,000 for a single Exegy high speed content processing server. When that machine is saturated, just buy another Exegy server. Microsoft is working on a petascale real time content processing system for the financial services industry which will compete with such established vendors as Connotate and Relegence. But a delay of a millisecond or two can spoil the fun.
Accountants want to know exactly what money is where. Purchase order systems and accounts receivable have to be fast. Speed does not prevent accidents. The implosion of such corporate giants as Enron and Tyco make it clear that going faster does not make information or management decisions better.
Intelligence agencies want to know immediately when a term on a watch list appears in a content stream. A good example is “Bin Ladin” or “Bin Laden” or a variant. A delay can cost lives. Systems from Exalead and SRA can handle this type of problem and a range of other real time tasks without breaking a sweat.
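The watch list idea can be sketched in a few lines. This is a toy under stated assumptions: the variants are supplied by hand, the "stream" is any iterable of text messages, and matching is a case-insensitive regular expression. The function names are hypothetical, and the sketch does not reflect how the commercial systems named above work.

```python
import re


def build_watch_pattern(variants):
    """Compile one case-insensitive pattern matching any listed spelling variant.

    Whitespace inside a variant is matched flexibly, so "Bin  Laden" still hits.
    """
    parts = [r"\s+".join(re.escape(word) for word in v.split()) for v in variants]
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


def scan_stream(messages, pattern):
    """Yield (position, matched_text) for each message containing a watch list term."""
    for position, text in enumerate(messages):
        match = pattern.search(text)
        if match:
            yield position, match.group(0)


if __name__ == "__main__":
    pattern = build_watch_pattern(["Bin Ladin", "Bin Laden"])
    stream = ["weather is fine", "mention of bin  laden here", "BIN LADIN again"]
    for position, term in scan_stream(stream, pattern):
        print(position, term)
```

Because `scan_stream` is a generator, a hit is reported as soon as its message arrives rather than after the whole batch is read. Transliteration variants beyond a hand-built list, and doing this at volume without delay, are the hard parts this leaves out.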
The problem is that there is no certifying authority for “real time”. Organizations trying to implement real time may be falling for a pig in a poke or buying a horse without checking to see if it has been enhanced at a horse beauty salon.
In closing, real time is here to stay.
First, Google, Microsoft, and other vendors are jumping into indexing content from social networks, RSS feeds, and Web sites that update when new information is written to their databases. Like it or not, real time links or what appear to be real time links will be in these big commercial systems.
Second, enterprise vendors will provide connectors to handle RSS and other real time content. This geyser of information will be creating wet floors in organizations worldwide.
Third, vendors in many different enterprise sectors will be working to make fresh data available. You may not be able to escape real time information even if you work with an inventory control system.
Finally, users, particularly recent college graduates, will get real time information their own way, like it or not.
To wrap up, “what’s happening now, baby?” is going to be an increasingly common question you will have to answer.
Stephen Arnold, December 2, 2009
Oyez, oyez, I disclose to the National Intelligence Center that the Incisive organization paid me to write about real time information. In theory, I will get some money in eight to 12 weeks. Am I for sale to the highest bidder? I guess it depends on how good looking you are.
How to Make Money as a Google Partner Video Now Available
December 2, 2009
Arnold Information Technology, http://www.arnoldit.com, released the fifth video of its six-video series called “How to Make Money with Google,” outlining how people can tap the Google revenue-generating machine. This video focuses on the Google partner and reseller program and premiered today at http://www.arnoldit.com/video.
The purpose of this short video series–watching all six videos takes about forty minutes–is to give clear, factual information on four specific ways an enterprising individual, a services company, or a diversified company can use the Google platform to produce revenue while meeting the needs of their customers and prospects. The videos are available for personal and educational use with no fee.
The newest video highlights Google’s partnership opportunities, experiences of Adhere Solutions, one of Google’s premier partners, and an overview of the upsides and challenges of partnering with Google. For anyone seeking to become a Google Apps, Search, or Maps partner, this short video will help you to ride the Google Wave.
“Adhere Solutions is happy to participate in this partner video to help companies understand how they can work with Google. We receive many questions regarding how to become a successful Google partner, and this video series by ArnoldIT is the only one we have found that provides valuable and straightforward information on how to work with Google,” says Jim Orris, Director of Adhere Solutions.
“To be successful (as a partner), commitment to Google and its platform is essential,” Stephen E. Arnold, president of Arnold Information Technology, said. “But the potential for earnings has no limit.” Arnold has published three Google monographs and these videos are based on the information compiled for The Google Legacy, Google Version 2.0, and Google: The Digital Gutenberg. The monographs are available from Infonortics Ltd., in Tetbury, Glos., at http://www.infonortics.com.
Other videos include an overview of money-making opportunities, including why the Google opportunity is similar to the opportunity Microsoft created with its MS DOS software in the early 1980s; using Google’s Adsense advertising module; the opportunity Google presents programmers who can create applications for the Google platforms such as Android for mobile devices; an overview of Google and Search Engine Optimization; and a video titled “Google Creates Opportunity,” which emphasizes the opportunity to grow with Google as the company strives for $100 billion in revenue.
“Google’s innovation drives opportunity for you,” Arnold said. “I wanted to provide some basic, factual information about what I see as the Google revenue opportunity. Information about Google is everywhere, but the upside of Google as an opportunity is not widely known. I will make these videos available without charge in the hopes that the Google revenue opportunities get broader dissemination.”
The series will be posted at http://www.arnoldit.com/video. Videos will be released on a seven- to 10-day cycle from today to Nov. 20. ArnoldIT.com has no relationship with Google. The information presented in the video represents the views and findings of ArnoldIT.com’s analyses of Google. The videos were directed by Chris Forrester, Perceality Productions, at http://twitter.com/perceality. The samba music is courtesy Sounddogs.com. For information about other uses of the videos, contact ArnoldIT.com at seaky2000 [at] yahoo.com.
About Stephen E. Arnold
Stephen E. Arnold monitors search, content processing, text mining and related topics from his office in Kentucky. He works with colleagues worldwide on a wide range of online and content-related projects. The company’s Web site is http://arnoldit.com, and the Beyond Search blog is at http://arnoldit.com/wordpress/.
Jessica Bratcher, December 2, 2009
Disclosure: Ms. Bratcher was paid by Stephen E. Arnold to write this shameless marketing article. Erik S. Arnold did not pay a dime, and he posted a Tweet that we worked on this video on a Saturday night when he could have been out with his wife having fun. Please, report this to the Planned Parenthood group. I am a miserable father today just as I was when he was a wee lad searching Dialog on my Texas Instruments Silent 700.
Google and Business
December 1, 2009
Computerworld ran a story I found interesting for two reasons. First, Google apparently cooperated with Computerworld to some degree. Second, the story lists a number of Google sales wins in the business market. Cooperation seems to be part of the new Google. Hey, with a TV show and journalists eager to write about Google functions that have been “unknown” to mavens and pundits, the Mountain View crowd is making hay while the sun shines.
Here are the clients I noted in the article “Can Google Really Hack It in Business?” Note: annoying pop ups appear in front of the text. I bet these sell lots of products to real humans. Geese like me ignore them with extreme prejudice.
- AlertSite
- City of Los Angeles
- Sonian
- Sprinklr
Quite a line up of customers. I also liked the “revelation” that Google has a master plan. After 11 years, I think it is admirable that this insight has been articulated.
Stephen Arnold, December 1, 2009
Oyez, oyez, I herewith disclose that I was not paid to point out that Computerworld has nailed the obvious. I want to share this information with the watchdogs at the Refugee Resettlement Agency.
When Your Search Traffic Heads South, What Ya Gonna Do?
December 1, 2009
Google gives new sites a little tailwind. Once a site is established, Googzilla, according to my research, becomes a bit more demanding. Over time, sites drift downward in a search results list if the webmaster does not pump value into a Web site. The algorithmic approach is not perfect, but I have examined a number of clients’ sites where the drift downward seems to be a natural process, if a Google algorithm is “natural”.
I read Problogger’s “What to Do When Your Search Rankings Drop” and the comments available at 2 pm eastern on November 30, 2009. The core of the article for me was embodied in this passage:
This approach [cross references and backlinks] had worked for me – however when my Google traffic disappeared I was left with little and realized how short sighted I’d been. I began to change my focus and started working on other sources of traffic. I still love the traffic that Google sends me but today if it all disappeared it would hurt – but it wouldn’t be the end for my business.
To keep traffic up, find other ways to get traffic. Good advice. Might be tough for some webmasters to implement because quite a few people with whom I work live in a Google fishbowl. In fact, the more Google seeps into the global information infrastructure, the more probable the outcome that Google becomes “the Internet”.
Stephen Arnold, December 1, 2009
Okay, a disclosure. Another freebie. I will send a Vulcan mind meld to the Rehabilitation Services Administration, whose professionals can “fix up” blogs and Web sites with declining traffic I presume.
Can Microsoft and Its Petascale Financial Services Mining Project Succeed
December 1, 2009
The goslings and I were chattering and quacking in Harrod’s Creek. One of our cousins was killed and eaten for an American holiday. What a way for our beloved friend and colleague to go: deep fried in an oil drum behind the River Creek Inn.
As we were recalling the best moments in Theodore the Turkey’s life, we discussed the likelihood of Microsoft’s petascale content mining project hitting a home run. The idea, as we addled geese understand it, is that Microsoft wants to process lots of content and generate high value insights for the money crazed MBAs in the world’s leading financial institutions.
The project tackles a number of tough technical problems; for example, getting around the inherent latency in petascale systems, dealing with the traditional input output balkiness of Windows plumbing, and crunching enough data with sufficient accuracy to make the exercise worth the time of the financial client. You may find my earlier post germane.
Other outfits are in this game as well. Some are focused on the hardware / firmware / software side like Exegy. Others provide toolkits like Kapow Technologies. Some Beltway Bandits operate low profile content filtering systems for governmental and commercial clients. And there is the old nemesis, Googzilla, happily chewing through one trillion documents every few days. Finally, some of the financial institutions themselves have pumped support into outfits like Connotate. Even the struggling Time Warner owns some nifty technology in the Relegence unit. So, what’s new?
Three thoughts as I prepare to comment about the push into perceived real time processing at the International Online Show:
- The cost of slashing latency with any type of content is going to be one expensive proposition. Not even some governments have the cash to muscle up a server with terabytes of RAM. Yep, terabytes.
- Figuring out what process left another process gasping for air requires some programmers who can plow through code looking for an errant space, an undefined variable, or a bit of an Assembler hack that pushed when it should have popped.
- Latency outside the span of control of the system can render some outputs just plain wrong. Delay is bad; bad outputs are even worse.
If you have not been tracking Microsoft’s big initiatives, you may want to spend some more time grinding through the ACM and other scholarly papers such as “Towards Loosely Coupled Programming on Petascale Systems” and poking around on the Microsoft Web site. To find useful stuff, I use the Google Microsoft index. If you aren’t familiar with it, check it out here.
I wonder if this stuff will be part of SharePoint 2011 and available as a Microsoft Fast ESP plug in?
Stephen Arnold, December 1, 2009
Yes, oh, yes. Let me disclose to the National Institute of Science and Technology that I was not paid to write this humorous essay. Consider it a name day present. If I am late, that’s latency. If I am early, that’s predictive output.

