Google on Path to Becoming the Internet

September 28, 2009

I thought I made Google’s intent clear in Google Version 2.0. The company provides a user with access to content within the Google index. The inventions reviewed briefly in The Google Legacy and in greater detail in Google Version 2.0 explain that information within the Google data management system can be sliced, diced, remixed, and output as new information objects. The process is analogous to what an MBA does at Booz, McKinsey, or any other rental firm for semi-wizards: intakes become high-value outputs. I was delighted to read Erick Schonfeld’s “With Google Places, Concerns Rise that Google Just Wants to Link to Its Own Content.” The story makes clear that folks are now beginning to see that Google is a digital Gutenberg and is a different type of information company. Mr. Schonfeld wrote:

The concerns arise, however, back on Google’s main search page, where Google is indexing these Places pages. Since Google controls its own search index, it can push Google Places more prominently if it so desires. There isn’t a heck of a lot of evidence that Google is doing this yet, but the mere fact that Google is indexing these Places pages has the SEO world in a tizzy. And Google is indexing them, despite assurances to the contrary. If you do a search for the Burdick Chocolate Cafe in Boston, for instance, the Google Places page is the sixth result, above results from Yelp, Yahoo Travel, and New York Times Travel. This wouldn’t be so bad if Google wasn’t already linking to itself in the top “one Box” result, which shows a detail from Google Maps. So within the top ten results, two of them link back to Google content.

Directories are variants of vertical search. Google is much more than rich directory listings.

Let me give one example, and you are welcome to snag a copy of my three Google monographs for more examples.

Consider a deal between Google and a mobile telephone company. The users of the mobile telco’s service run a query. The deal makes it possible for the telco to use the content in the Google system. No query goes into the “world beyond Google”. The reason is that Google and the telco gain control over latency, content, and advertising. This makes sense. Let’s assume that this is a deal that Google crafts with an outfit like T Mobile. Remember: this is a hypothetical example. When I use my T Mobile device to get access to the T Mobile Internet service, the content comes from Google with its caches, distributed data centers, and proprietary methods for speeding results to a device. In this example, as a user, I just want fast access to content that is pretty routine; for example, traffic, weather, flight schedules. I don’t do much heavy lifting from my flaky BlackBerry or old-person-hostile iPhone / iTouch device. Google uses its magical ability to predict, slice, and dice to put what I want in my personal queue so it is ready before I know I need the info. Think “I am feeling doubly lucky”, a “real” patent application, by the way. T Mobile wins. The user wins. The Google wins. The stuff not in the Google system loses.
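The predict-and-queue idea in this hypothetical can be sketched in a few lines of Python. Everything here is invented for illustration: the class name, the frequency-based predictor, and the cache layout are my assumptions, not a description of Google’s or T Mobile’s actual plumbing.

```python
from collections import Counter, defaultdict

class PrefetchCache:
    """Toy model of a carrier-side cache that warms results for the
    queries a user is most likely to repeat (traffic, weather, flights)."""

    def __init__(self, fetch, top_n=3):
        self.fetch = fetch                   # function: query -> result (the slow path)
        self.history = defaultdict(Counter)  # user -> query frequencies
        self.cache = defaultdict(dict)       # user -> {query: result}
        self.top_n = top_n

    def record(self, user, query):
        self.history[user][query] += 1

    def warm(self, user):
        # Pre-fetch the user's most frequent queries before the user asks.
        for query, _ in self.history[user].most_common(self.top_n):
            self.cache[user][query] = self.fetch(query)

    def lookup(self, user, query):
        self.record(user, query)
        if query in self.cache[user]:
            return self.cache[user][query], True   # served from cache: low latency
        return self.fetch(query), False            # slow path: out to the network
```

The point of the design is the `warm` step: the carrier pays the fetch cost before the user asks, so routine queries come back from the local cache instead of the “world beyond Google.”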

Interesting? I think so. But the system goes well beyond directory listings. I have been writing about Dr. Guha, Simon Tong, Jeff Dean, and the Halevy team for a while. The inventions, systems and methods from this group have revolutionized information access in ways that reach well beyond local directory listings.

The Google has been pecking away for 11 years, and I am pleased that some influential journalists / analysts are beginning to see the shape of the world’s first transnational information access company. Google is the digital Gutenberg, well into the process of moving info and data into a hyper state. Google is becoming the Internet. If one is not “in” Google, one may not exist for a certain sector of the Google user community. Googleo ergo sum.

Stephen Arnold, September 28, 2009

A SharePoint Dev Wiki: No NDA Required

September 28, 2009

A happy quack to the reader who sent me a link to a new SharePoint developer wiki. You can locate the service at SharePoint Dev Wiki. The person responsible for the wiki is Jeremy Thake, a Microsoft whiz. He wrote:

I want to try and make this a collective site to map all the information on WSS 4.0 and SP2010 from around the web in blogs, forum, twitter and web sites around the Internet. This should allow people to start here and either take a shallow dip into the information or deep dive into certain features and find the external resources of relevance.

When I visited, there were a number of current posts. None was about search, but that will change when the SharePoint 2010 beastie leaves the lair. Looks promising. No NDA required, either.

Stephen Arnold, September 28, 2009

Cognitive Decline. Does This Mean Students Are Getting Stupider?

September 28, 2009

I read the article “Cognitive Decline of the University Population”. I liked the graphs: lines heading south. The explanation:

It should not be surprising that a shrinking percentage of college students can write well or do basic mathematics, let alone appreciate Proust or quantum mechanics. Increased years of education have not actually increased verbal abilities in the general population, which at least partially supports the signaling and sorting model of higher education (the primary value of credentials is that they reflect more or less invariant qualities such as IQ and Conscientiousness), as opposed to the model that higher education builds human capital.

Clear writing for sure. I think this means that students are following the magnetic pull of the norm. My recollection from the hollow in Kentucky is that Alexis-Charles-Henri Clérel de Tocqueville figured this out by considering the consequences of “middling values”.

Stephen Arnold, September 28, 2009

Yebol Web Search: Semantics, Facets, and More

September 28, 2009

“Do We Really Need Another Search Engine?” is an article about Yebol, another search engine. The write up included this description of the new system:

According to its developers, “Yebol utilizes a combination of patented algorithms paired with human knowledge to build a Web directory for each query and each user.  Instead of the common ‘listing’ of Web search queries, Yebol automatically clusters and categorizes search terms, Web sites, pages and contents.” What this actually means is that Yebol uses a combination of methods – web crawlers and algorithms combined with human intelligence – to produce a “homepage” for each and every search query. For example, search Bell Canada in Yebol and, instead of a Google-style listing of results, you’re presented with a “homepage” that provides details about Bell’s various enterprises, executives, competitors as well as a host of other information including recent Tweets that mention Bell.

The site at http://www.yebol.com includes the phrase “knowledge based smart search.” I ran a query for Google and received a wealth of information: links, facets, hot links to Google Maps, etc.

[Screenshot: Yebol results for the query “Google”]

My search for dataspace, on the other hand, was not particularly useful. I anticipate that the service will become more robust in the months ahead.
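The clustering and categorizing Yebol describes can be approximated with a simple grouping step. This is only a sketch of the idea: the categories, titles, and URLs below are invented, and Yebol’s patented methods are surely far richer.

```python
def build_query_homepage(results):
    """Group a flat result list into a Yebol-style per-query 'homepage'
    of labeled sections instead of a single ranked listing."""
    homepage = {}
    for category, title, url in results:
        homepage.setdefault(category, []).append((title, url))
    return homepage

# Hypothetical results for a query such as "Bell Canada":
flat = [
    ("Enterprises", "Bell Mobility", "example.com/mobility"),
    ("Executives", "CEO profile", "example.com/ceo"),
    ("Enterprises", "Bell Media", "example.com/media"),
]
homepage = build_query_homepage(flat)
```

Each category becomes one section of the “homepage,” which is the presentation difference the write up contrasts with a Google-style flat listing.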

The PC World write up about Yebol said:

At launch, Yebol can provide categorized results for more than 10 million search terms. According to the company it intends to provide results for ‘every conceivable search term’ in the next three to six months.

The founder, Hongfeng Yin, was a senior data mining researcher on the Yahoo! Data Mining Research team, where he built the core behavioral targeting technologies and products that generate hundreds of millions in revenue. Prior to Yahoo, he was a software manager and senior staff software engineer with KLA-Tencor. He worked several years on noetic sciences and human thinking theory with Professor Dai Ruwei and Professor Tsien Hsue-shen (Qian Xuesen) at the Chinese Academy of Sciences. He has a Ph.D. in Computer Science from Concordia University, Canada, and a Master’s degree from Huazhong University of Science and Technology, China. Hongfeng holds multiple patents on search engines, behavioral targeting, and contextual targeting.

The Yebol launch news release is here. The challenge will be to deliver a useful service without running out of cash. The use of patented algorithms is a positive. Combining these recipes with human knowledge can be tricky and potentially expensive.

Stephen Arnold, September 28, 2009

Desire, Google, and a Database

September 28, 2009

The article “Internet Search: Your Secrets Are Not Safe” seemed to be a rehash. But I did find some useful information. For me, the most interesting comment in the article was:

Big daddy Google has over 63 per cent of market share in searches, and the largest ‘desire database’. A version is available on its site where you can compare the popularity, over time, of any two products in any specified location. No absolute figures are offered — they don’t tell you that X brand had 1,000 hits in Mumbai while Y brand had 2,000. They just give out relative shares in a graph. But, make no mistake, they have the figures. This database is the mother lode of consumer profiling. Marketing strategies get a whole new meaning when a dossier of searches reveals the direction of people’s curiosities or needs. Marketing firms have been known to purchase parts of this database. Last year, UK advertising broker Phorma bought millions of personal details from Internet service provider British Telecom to sell to companies interested in online advertising. Most big search engine operators have said that they won’t be selling their database, probably because it is much more valuable if they have a monopoly over it. They can use it to charge a fee from advertisers. A watered-down version of this is already online — the subject-linked ads you see when you open an email message. Management of minds, mediated through machines, may well be at hand.

How does Google Books play into this desire angle? Books provide the knowledge foundation on which desire perches in my opinion.
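The “relative shares” presentation the quoted passage describes is easy to reproduce: divide each brand’s count by the total and throw the absolutes away. The raw figures below are invented; the point is that the graph shape survives while the volumes stay private.

```python
def relative_shares(counts):
    """Convert absolute hit counts into percentage shares, the way a
    Trends-style graph hides raw volumes but keeps relative popularity."""
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: round(100.0 * v / total, 1) for k, v in counts.items()}

# Hypothetical raw figures that the search engine keeps to itself:
raw = {"brand_x": 1000, "brand_y": 2000}
print(relative_shares(raw))  # → {'brand_x': 33.3, 'brand_y': 66.7}
```

Advertisers see the 33/67 split; only the operator knows the 1,000 and the 2,000, which is exactly why the database is worth more unshared.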

Stephen Arnold, September 28, 2009

Consultant Temp Omits Context for ATT and Google FCC Dust Up

September 28, 2009

I thought ATT was miffed because Google Voice can block calls ATT cannot. With Google’s method Google gets an edge over ATT. Big surprise, right? The Google can block calls to places like Harrod’s Creek. ATT can charge more for this type of connection. I know. ATT is my phone company.

Then, I read “AT&T Calling Google a Noisome Trumpeter to FCC”. Gerson Lehrman Group is a rental agency for consultants. The idea is a good one: save the big fees imposed by McKinsey, Booz, and Boston Consulting Group and get solid advice. I think it works reasonably well in this belt-tightening market. The analysis of the ATT and Google dust-up over Google Voice does what most MBA-inspired analyses do: describe what’s in the newspapers. One comment caught my attention:

AT&T points out the FCC’s fourth principle of the Internet Policy Statement to be about competition among network providers, application and service providers, and content providers. The FCC issue will be if customers with IP connections are favored in making calls with lower costs and more UC capabilities. The goal for the U.S. market has to be that competition improves communications connectivity regardless of the type of provider.

My view of the squabble is that ATT now realizes that Google is a next generation telecommunications company. In fact, Google’s engineers have pushed into technical fields that the “old” Baby Bells converted into Wal*Marts and Costcos. Like farmers angered by new uses for their land, the telcos want to go back to the halcyon days of the past.

Google has marginalized the past, particularly with regard to telecommunications in four ways. None of these is referenced in the consulting firm’s analysis:

  1. Google has built a global infrastructure that provides digital or bit-centric services unencumbered by the methods and systems that US telcos in particular provide their customers. The platform approach means that telco is one business thrust, not THE business thrust.
  2. The technology in play at Google is in some cases based upon a Bell Labs-style of investment; that is, bright people working on big problems. When a breakthrough emerges, Google makes an effort to allow various Google units to “do something” with the invention. I would direct the GLG MBA to consider how Google has learned from a patent application that has now migrated to Alcatel Lucent. ATT had access to the same invention, missed its significance, and now faces a significant challenge in data management. That is just one example from the dozens I have gathered, gentle reader. ATT’s research arm, while impressive, is not like Google’s. I think Google has some refugees from the “old” Bell Labs too.
  3. ATT, like other US telcos, continues to resist what seems to be an obvious tactic: exploiting Google. In the US, companies like ATT prefer to block, chastise, and criticize aspects of Google that are little more than manifestations of its applications platform. Google Voice is an application, and it is not a particularly smart one as Google apps go, based on my research. Instead of asking the question “How can we exploit this Google service?”, the response from publishers, media companies, telcos, and some government agencies is to put Google in a box and keep it there. As I argued in 2004 in The Google Legacy, the river of change has broken through a dam. The river cannot be “put back.”
  4. Analyses that convert a long document into a summary are useful. I do this myself, but when that summary leaves out context, the points without proper definitions float like a firefly’s disembodied glow. What else is Google probing in the telco space? That’s an important question because ATT is dealing with a probe, not an assault. Is ATT missing a larger strategic challenge? Can an Apple ATT tie-up win a game that Apple and ATT do not fully understand?

To wrap up, the addled goose gets very nervous when he meets an agency rental sporting an MBA name tag. By the way, what does this mean: “The letter to the FCC is from AT&T’s Federal Regulatory and deduces from the hearsay about blocked rural calls that Google saves on the higher termination costs imposed by rural telcos.” Too much MBA sophistication for me.

The tag on the bottom of the article speaks volumes, “Request a Consultation.” This addled goose is quite happy, however, to see the article labeled as a marketing item just like this Web log.

Stephen Arnold, September 28, 2009

Google Android Flapette

September 27, 2009

Android Guys published “Google Responds Cyanogate 09”, and the story caught my attention. The goslings have been enmeshed in a Google development project, and we have not paid much attention to Android. Android Guys do. This story contained an interesting comment about confusion regarding certain open source issues. Android Guys point out that Android is open source with some constraints. What are those constraints? Well, that is one of those sort-of-clear Google points. Best bet: do not wrap Google’s apps into one’s own Android code.

Stephen Arnold, September 27, 2009

SharePoint Yearns for Some of SQL Server 2008

September 27, 2009

I am interested in how the Microsoft teams create interesting puzzles for me to solve. Example: install SharePoint and then figure out which pieces of SQL Server are really needed. If you are interested in this type of problem and its answer, then you will want to read “How SQL Server 2008 Components Impact SharePoint Implementation”. The title promises more than the article delivers, but what it does provide was useful to me. Ross Mistry lists the four SQL Server components that SharePoint absolutely, positively needs tonight. The list triggered a question in my addled goose brain, “Why not deliver SharePoint to a client with the requisite components in one package?” Nah, that would be too complicated. Now, about that search system without the 50-million-document ceiling: free extra or for-fee standalone component? I don’t know the answer.

Stephen Arnold, September 27, 2009

Microsoft Fast ESP with the Microsoft Bing Translator

September 27, 2009

A happy quack to the reader who sent me a link to a write up and a screenshot of the integrated translation utility in the new Fast ESP. The idea is to run a query and get results from documents in different languages. Click on an interesting document and get the translation. To my eye the layout of the screen looked a little Googley, but that’s because I look at the world through the two oohs in the Google logo. The write up is “Enterprise Search and Bing Services – Part 1: The Bing Translator” and you should read the story. Here’s the screenshot that caught my attention:

[Screenshot: Fast ESP results with the Bing Translator option]

The article said:

In this example, not only is the user’s query translated and expanded to include other languages (French, German, and Chinese), but the user has the ability to translate the teasers or the entire document using the Bing Translator. The search results also include query highlighting for each of the multiple translations of the query. Finally, the user can use the slider bar (or the visual navigator) to favor documents written in certain languages. Any slider action causes the result set to update automatically. The relevance control behind this slider widget is actually a feature of FAST ESP, but it shows another way of surfacing cross-lingual search.

No information was provided about the computational burden the system adds to a Fast ESP system. Interesting, however. I prefer to see a translated version of the document’s title and snippet in the results list with an option to view the hit in its original language. The “old” Fast Search & Transfer operation had some linguistic professionals working in Germany. I wonder if that group is now marginalized or if it has been shifted to other projects. Info about that linguistic group would be helpful. Use the comments section of this Web log to share if you are able.
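Stripped of the Bing service itself, the mechanics in the write up reduce to three steps: translate the query, search each language, and let a slider reweight results by language. A minimal sketch follows, with a hard-coded mini-dictionary standing in where the real translator would sit; every name, term, and weight here is an assumption for illustration.

```python
# Stand-in for a translation service; a real system would call the
# Bing Translator (or similar) instead of this hard-coded table.
TRANSLATIONS = {
    "car": {"fr": "voiture", "de": "Auto", "zh": "汽车"},
}

def expand_query(query):
    """Return (language, term) variants of the query for cross-lingual search."""
    variants = [("en", query)]
    variants += list(TRANSLATIONS.get(query, {}).items())
    return variants

def rerank(results, slider):
    """results: (language, doc, score) triples; slider: language -> weight.
    Mimics the slider control that favors documents in chosen languages."""
    return sorted(results, key=lambda r: r[2] * slider.get(r[0], 1.0), reverse=True)
```

Moving the slider just changes the per-language weights and re-sorts, which matches the described behavior of the result set updating on any slider action.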

Stephen Arnold, September 27, 2009

Oracle Spells Out Flaw in Its Core Data Management System

September 27, 2009

Another white paper on Bitpipe. Sigh. I get notices of these documents with mind-numbing regularity. Most are thinly disguised apologia for a particular product in a congested market. I clicked on the link for the Line56 document “A Technical Overview of the Sun Oracle Exadata Storage Server and Database Machine” and started speed reading. [To access this link you may have to backtrack and get a Bitpipe user name and password.] I made it to page 29, but a fish hook was tugging at my understanding. I backtracked and spotted the segment that caused a second, closer reading. The headline was “Today’s Limits on Database I/O” on page 2. Here’s the segment:

The Oracle Database provides an incredible amount of functionality to implement the most sophisticated OLTP and DW applications and to consolidate mixed workload environments. But to access terabytes databases with high performance, augmenting the smart database software with powerful hardware provides tremendous opportunities to deliver more database processing, faster, for the enterprise. Having powerful hardware to provide the required I/O rates and bandwidth for today’s applications, in addition to smart software, is key to the extreme performance delivered by the Exadata family of products. Traditional storage devices offer high storage capacity but are relatively slow and can not sustain the I/O rates for the transaction load the enterprise requires for its applications. Instead of hundreds of IOPS (I/Os per second) per disk enterprise applications require their systems deliver at least an order of magnitude higher IOPS to deliver the service enterprise end-users expect. This problem gets magnified when hundreds of disks reside behind a single storage controller. The IOPS that can be executed are severely limited by both the speed of the mechanical disk drive and the number of drives per storage controller.

After the expensive upgrades and the additional licenses, I wonder how Oracle shops are going to react to this analysis of the limits of the traditional Oracle data management system. Even more interesting to me is that the plumbing has not been fixed. The solution is more exotic hardware. Do I hear the tolling of the bell for the Codd database? I do hear the sound of more money being sucked into the “old way”. Check out Aster Data or Infobright. Might be useful.
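The controller bottleneck Oracle describes is easy to put into back-of-envelope numbers. The figures below are illustrative assumptions (roughly 180 IOPS for a fast mechanical drive, a made-up controller ceiling), not vendor specifications.

```python
def aggregate_iops(disks, iops_per_disk, controller_limit):
    """Aggregate random I/O from a shelf of mechanical disks, capped by
    the single storage controller sitting in front of them."""
    return min(disks * iops_per_disk, controller_limit)

# 100 drives at ~180 IOPS each could supply 18,000 IOPS, but a controller
# capped at 10,000 IOPS discards nearly half of it -- the "hundreds of
# disks behind a single storage controller" problem in the white paper.
print(aggregate_iops(disks=100, iops_per_disk=180, controller_limit=10_000))  # → 10000
```

Adding disks past the controller’s ceiling buys capacity but no more I/O, which is why the pitch ends in more exotic hardware rather than a software fix.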

Stephen Arnold, September 27, 2009
