Copyright and the Real Time Microblog Phenom

May 24, 2009

Liz Gannes’ “Copyright Meets a New Worthy Foe: The Real Time Web” is an interesting article. You can find it on NewTeeVee.com here. Her point is that copyright, the Digital Millennium Copyright Act, and other bits and pieces of legal whoopdedoo struggle with real time content from Twitter-like services. She wrote:

If you’re a copyright holder and you want to keep up with your pirated content flitting about the web — well, good luck. The way the DMCA is set up means you’re always chasing, and the real-time web is racing faster than ever before. Analytics services are only just emerging that will tell you where your views are coming from on a semi-real-time basis. That’s especially true for live video streaming sites such as Ustream and Justin.tv. Justin.tv, in particular, has come under fire by sports leagues for hosting camcorded streams of live game broadcasts. The company says it takes down streams whenever it is asked to. But the reality is, often the moment has passed.

In short, information flows move more quickly than existing business methods. An interesting illustration of this flow for video is Twiddeo here. Government officials have their work cut out for them with regard to ownership, copyright, and related issues.

But…

As I read this article, I thought about the problem Google has at this time with real time content. Google’s indexing methods are simply not set up to handle near instantaneous indexing of content regardless of type. In fact, fresh search results on Google News are stale when one has been tracking “events” via a Twitter-like service.

As important is the “stepping back” function. On Google’s search results displays, how do I know what is moving in near real time; that is, what’s a breaking idea, trend, or Tweet? The answer is, “I don’t.” I can hack a solution with Google tools, but even then the speed of the flow is gated by Google’s existing indexing throughput. To illustrate the gap, run a query for American Idol on Google News and then run the query on Tweetmeme.com.

Two different slants biased by time. In short, copyright problem and Google problem.

Stephen Arnold, May 24, 2009

Kumo: Now a Wrestling Metaphor

May 24, 2009

Richard Adhikari’s “Can a Semantic Kumo Wrestle Google to the Mat?” explores the Redmond giant’s most recent effort to make headway in the Web search market. Mr. Adhikari reported that Kumo incorporates the natural language processing technology Microsoft picked up last year via its Powerset acquisition. NLP allegedly gives a search system more ability to understand a user’s query. For me, the most interesting comment in the write up was this passage, which has as its source an expert on search, Rob Enderle of the Enderle Group:

“Kumo was designed from the ground up to be a Google killer,” Enderle told TechNewsWorld. “Microsoft put a lot of effort into it.”

Mr. Adhikari then included this statement, the attribution of which struck me as ambiguous:

The project may be a costly one for Redmond. The amount of time and money Microsoft has spent on Kumo has caused deep divisions within the vendor’s management, Enderle said. “I understand a lot of people on the Microsoft board want them to stop this project,” he added. “They want Microsoft to focus on things they do well and not waste any more money.”

In my opinion, it’s tough to know if this set of assertions is 100 percent accurate. What is clear to me are these points:

  • The Google “killer” metaphor is now almost obligatory. The issue for Microsoft is not doing better in search in the eyes of eCommerceTimes.com. The object is to kill Google. Can this be true? More than Powerset and marketing will be needed to impede Googzilla. It’s been a decade with zero progress and I keep thinking that try and try again is a great philosophy but a decade?
  • The Board dissension, if true, may accelerate the discussion of splitting Microsoft into three or four units and deriving more shareholder value from the aging software company. With its share price in the value range, a break up would add some spice to a plain vanilla stock.
  • The semantic theme is clearly a PR magnet. Semantics play a role in many search systems, but most of the plumbing is kept out of sight. Users want answers, not polysyllabic promises. Google, despite its flaws, seems to deliver a search suite that appeals to about 70 percent of the search market.

For more on this Microsoft Google tussle, see my Bing Kumo article here.

Twitter Search a Quitter

May 23, 2009

Louis Gray’s “Twitter’s Search Engine Is Very, Very, Broken” here underscored the plight of those engaged in information retrieval. Mr. Gray wrote:

The promise of Twitter’s advanced search capability is tremendous – letting you dice your queries by the sender and recipient, and even limiting the date range for said tweets, the location, hashtags or even emoticons. And at one time, it was a valuable resource. Now, depending on which account you’re viewing, the data set could be as small as a week, or oddly, in some cases, not available at all.

If I waddled my addled goose body from pond to keyboard, I could make the same assertion about any search and retrieval system with which I am familiar. In fact, I have been clear about the challenges of search and retrieval. I track about 350 vendors with my monitoring tools, and I could point to examples of problems with any of these companies’ systems.

So, flawed search is nothing new.

Some quick illustrations. You may be able to replicate these queries yourself, but some examples perform differently at different times.

First, Microsoft’s Live Search. Run this query: “educational materials”. Scan the results. My set is biased toward state sources and health. What’s up? I want links to outfits like NEA.org. Problem: context. Most search systems lack the technology to deliver context aware searches. Is Microsoft search “broken”? Not really.

Second. Yahoo’s shopping search. Run this query “dell mini9 ssd” from the search box here and be sure to click on “shopping”. What do you get? Zero hits. Isolated instance? Nope, for certain queries Yahoo works pretty well, but for others it’s as off base as Microsoft’s Live Search.

Third. Google. Navigate to Google.com and enter this query: “eccs”. The acronym stands for “emergency core cooling system” and Google returns only false drops. Google fails this query test.

What’s happening?

The reality is that no search system works particularly well. Search is good enough, and in my opinion, that’s the state of the art. Twitter is no better and no worse than most free search systems. Will search improve? Slowly, goose lovers, slowly.

Stephen Arnold, May 23, 2009

Time Magazine, Owner of AOL: Google Has Won

May 23, 2009

Short honk: Article in Time Magazine called “What Will the World Do with More Search Engines?” here. Author: Douglas McIntyre. The quote that speaks for itself:

Creating a new search engine is a tremendous risk at this stage because it’s remarkably expensive to build and market one that has any chance in the mass market. To make the proposition harder, not only do people prefer Google to other products, but also most people are not able to tell whether a search product coming to market now is better. Good is so excellent that it is not good anymore.

Translation: Google wins. That’s what makes traditional journalism so darned wonderful.

Stephen Arnold, May 22, 2009

Universal: The New Information Baloney

May 23, 2009

English does not have enough words to keep parvenus, azure chip consultants, and newly minted experts happy. The terminology of search has reached a critical point. Everyone knows how to search. In meetings last week, I learned that “search is a been there, done that” experience. I also learned that “search is not interesting”. One bright young engineer told me and others in the group, “Our employees are basically search experts”.

In such an environment, I concluded that words like “universal”, “unified” and “user experience” define the search landscape. Toss in the notion of a “social experience”, a “community”, and “real time” and we have a new way to make information available.

Search has been thrown from the marketers’ bandwagon. Out of sight, search is no longer a problem. This seems now to be a universal truth.

What’s happening is the poeticization of search. The people whom I have been encountering have adopted a weird language that does little to resolve challenges in finding information. Let’s look at several examples and see if there is a message in this linguistic information baloney.

First, read “Yahoo Eyes Acquisitions, Social Media” here. The story exists without much context, which is understandable in a short write up. The language regarding search illustrates the baloney to which I have referred. The author Alexei Oreskovic offered me this statement: “Yahoo will introduce new products this fall that will give users a more unified experience across its network of websites and showcase the company’s strategy to grow again, after much of 2008 was marred by the failed deal talks with Microsoft Corp.” A “unified experience” is a phrase that seemed to suggest Yahoo’s making or becoming a single unit. Yahoo is not a single unit. When I go to the Yahoo splash page, enter search terms, and get a result list, I get one thing. When I navigate to my Yahoo Mail account and enter search terms in one of the * two * search boxes, I don’t get unity. I get a list that may or may not be what I expected. Forget relevance. The user interface offers me two search boxes instead of one box with a way to indicate which collection I wish to search. Make a goof and you don’t get unification. If you are like me, you get what’s unexpected. The relevance and precision of the results are lost in the “experience”. On a very fundamental level, Yahoo has quite a bit of work to do across and within its high traffic services. A “unified experience” does not mean very much. The reality is the opposite of unity.

Second, in 2007, Google rolled out “universal search”. You can refresh your memory of this notion here. Two years later, there is no “universal” in Google search. Look at the main page at Google.com. You have to select a specific collection or index and then run your query within that collection. Universal means run separate queries and then glue together the results. I don’t see much “universal” in this approach. I see separate tasks and lots of manual grunt work. Other vendors have adopted the word “universal” in a lemming like response to Google’s own baloney. Which other search vendors have trotted out universal search recently? Kosmix here. Search Cowboys and the search engine optimization crowd here. And many others. The phrase “universal search” connotes some magic land where information is available in a form that goes down like a child’s breakfast of Fruity Pebbles.

Read more

Microsoft Trains for Cage Matches

May 23, 2009

The Toronto Globe & Mail’s “Toronto Firm Wins in Suit against Microsoft” here almost went direct to the trash basket. Before I hit the delete key, I thought I should scan the article by Simon Avery. The story was chilling, and it’s about 75 degrees here in the mine run off pond in Harrods Creek, Kentucky. Mr. Avery wrote:

Privately-held i4i Inc. said that several years ago it approached the world’s largest software company with a breakthrough product in data processing, only to be spurned and to see its technology show up later in versions of Microsoft Word.

The courts agreed. Wow. Giant firm. Small outfit. Allegations of improper behavior and a lawsuit. Big companies can appeal and probably prevail. Looks like Microsoft is warming up for the coming cage match with the GOOG. One hopes that it performs more effectively in search than it has in the Xbox wars. Click here for a surprising stat.

Stephen Arnold, May 21, 2009

Beyond Search to Cybersocialization

May 23, 2009

Ad Age is not my first choice in morning reading. I scanned the article “The Coming End of YouTube, Twitter and Facebook Socialism” by Simon Dumenco here because these services are going like great guns in World War One. In fact, one of my history profs asserted that World War One was a precursor to World War Two, and that, ladies and gentlemen, ushered in much of our current brave new world.

Mr. Dumenco asserted:

It’s sweet, really, that venture capitalists have ponied up millions so that we can all keep tweeting. It’s also more than a bit scary. Because more and more of us are increasingly addicted not only to Twitter, but to other services that lack workable business models. What happens if the “dealers” who feed our habits disappear? (It’s been known to happen. Last week, for instance, Yahoo announced it was shutting down last century’s hot social-networking-esque service, GeoCities, for which it paid $3.5 billion in 1999.)

After reminding me that money may not be “smart” when it comes to Ad Age’s view of business, Mr. Dumenco concluded:

Seriously. I love YouTube, I’ve made some interesting connections through Facebook, and I enjoy Twittering. (Last week, for instance, I tweeted about an astonishing bit of information I came across in Britain’s Daily Telegraph: YouTube “reportedly uses as much bandwidth as the entire internet took up in 2000.”)

Two thoughts this gloomy day in Cleveland, once home to the burning river:

One. The notion of socialism is interesting, but I don’t think the argument was developed, nor presented in a way that squeezed the milk from this metaphor. The referenced services are superficially similar if one views them from the with-it, Mad Ave vantage point. But I don’t see the three as having much similarity with socialism, let alone with the coinage “cybersocialization”. Balkanization and middle school friendship groups seem to be more on point to me along with the prospect of skills no longer in demand in today’s job market. The lack of cohesion within the services and their interesting swarming behavior seems to be a new type of social integration. The old and familiar “isms” don’t provide much in the way of handholds in my opinion.

Two. The ad perspective is commissionable monetization. None of these three services has figured out a business model for itself that generates sufficient cash to keep the Odwalla flowing for the employees and contractors. More problematic is the reality that Mad Ave types have to sit on the sidelines, Twittering and updating Facebook pages, without an opportunity for billing. The market available to Facebook owners, for example, seems to be a challenge to squeeze into the old, familiar business models. As a result, Facebook is experimenting, and while it tries to figure out its financial future, the Mad Ave types are being driven wacky because they can’t figure out how to cash in on the service. The squirming reminds me of a class of third graders denied recess.

Bottom line for me: traditional and familiar advertising models have to be custom fit to these new markets. Venture firms have nothing to do with this problem, nor can money solve the problem. When I watched a jeweler in Istanbul fit a stone into a cheap sterling silver setting, I was surprised at how long it took. The value of the finished piece was insufficient to compensate the worker for the labor. But the jeweler managed. Ad execs may not have the degrees of freedom the jeweler enjoyed. With each passing day, the old Mad Ave skills may be eroding.

Like the publishing and financial sectors, the issue is not socialization. It is life on the dole or a McJob. I don’t think the end of YouTube, Facebook, or Twitter is impossible. I think that more disruption will take place before these outfits bite the dust. Traditional advertising is likely to face a greater challenge in the short term. Tweet that.

Stephen Arnold, May 23, 2009

Microsoft and Online Spending

May 22, 2009

When you are north of $65 billion a year in revenue, why explain? Mary Jo Foley’s “Microsoft’s Ozzie Defends Microsoft’s Aggressive Online Spending” surprised me for two reasons. First, the old saw “never complain, never explain” seemed to be ignored. And, second, Microsoft has cash, is floating a financial instrument to raise more cash, and the Windows 7 cash dump truck will be arriving in the near future. So, spend what you want seems like a reasonable approach.

Read Ms. Foley’s article here and make up your own mind. She reported:

Microsoft’s growing family of enterprise-focused services — Exchange Online, SharePoint Online, etc. — have taught the company a lot about cloud requirements. Its investments in consumer services have taught the company important lessons about scale, Ozzie said. The underlying infrastructure Microsoft has built to deploy and run its consumer services is now being extended to support other services throughout the company, he said. Ozzie pointed to “Cosmos,” the high-scale file system that is part of Microsoft’s Azure cloud platform, as ultimately supporting and aiding every consumer, enterprise and developer property at Microsoft. He noted that the management systems for Microsoft’s current and existing cloud services are all derived from the learnings Microsoft has gleaned from managing its consumer online services. Ozzie said he believed one of Microsoft’s main advantages vis-a-vis its cloud competitors is “the fact we build both platforms and applications.”

This sounds reasonable to me. The one thought that struck me was that Microsoft’s spending in the last few years has not brought the financial home run that I had anticipated. For example, the purchases of Fast Search & Transfer and Powerset now seem in retrospect to be interesting ideas but not yet ready to deliver megabucks. In fact, if the chatter I heard in San Francisco last week is anywhere near accurate, Microsoft may be bundling Fast ESP with SharePoint for certain clients to prevent a third party vendor from getting its snoot in the cubicles at a Microsoft-centric organization. Second, the money poured into Vista was not exactly wasted, but the grousing and negative vibes rightly or wrongly flowing through the Web postings did not put money in the bank. I don’t pay much attention to consumer products like the Xbox and the Zune, but so far neither has been the Gold Rush for which I had hoped.

Nevertheless, Microsoft has money, and in my opinion, it can spend it any way it wants to spend it. The “defensive” spin tells me that maybe there is some force field operating when Microsoft gets in front of investors and bankers. What do these audiences know or do that makes a technologist adopt a tone that connotes uncertainty and doubt?

On a related note, a reader groused about my pointing to information about SharePoint that I obtained from Mary Jo Foley’s article about search for the new SharePoint. I point to stories. I don’t create the stories. I encouraged the reader to take his grousing to my source. I enjoyed his commentary, but I can’t do much to infuse accuracy in the stories which I read and upon which I comment. When I checked out his complaint, Ms. Foley seemed to be recycling what Microsoft has said about enterprise search. I find it quite common that “old” news is included in “new” news releases. The reason, I believe, is comprehensiveness, not nefarious behavior.

I like Ms. Foley’s work, and I will continue to point to her write ups, which are often better than the drivel I find in other free Web sources.

Stephen Arnold, May 21, 2009

Google, YouTube, and Digital Volume

May 22, 2009

Short honk: A year or so ago, I learned that Google received about one million new video objects per month. TechCrunch reported here that Google’s YouTube.com ingests about 20 hours of video every minute. I don’t know if this estimate is spot on, but it is clear that YouTube is amassing one of the world’s largest collections of rich media content in digital form. For me, the most interesting information in the write up was:

Back in 2007, shortly after Google bought the service, it was 6 hours of footage being uploaded every minute. As recently as January of this year, that number had grown to 15 hours, according to the YouTube blog. Now it’s 20 — soon it will be 24.

Lots of data means opportunity for the GOOG. I am looking forward to having the audio information searchable.
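To put those per-minute rates in per-day terms, here is a back-of-envelope sketch. The three rates (6, 15, and 20 hours of video per minute) come from the quoted passage. The function names and the date labels are my own, and the derived numbers are plain multiplication, not figures reported by YouTube or TechCrunch.

```python
# Back-of-envelope arithmetic on the upload rates quoted above.
# The rates are taken from the quoted passage; everything derived
# from them is simple multiplication, not a reported YouTube figure.

MINUTES_PER_DAY = 24 * 60


def hours_uploaded_per_day(hours_per_minute):
    """Hours of video arriving in one day at a given per-minute rate."""
    return hours_per_minute * MINUTES_PER_DAY


def viewing_days_per_day(hours_per_minute):
    """Days of nonstop viewing needed to watch one day's uploads."""
    return hours_uploaded_per_day(hours_per_minute) / 24


# Labels are illustrative shorthand for the dates mentioned in the quote.
rates = [("2007", 6), ("January 2009", 15), ("May 2009", 20)]

for label, rate in rates:
    print(
        f"{label}: {rate} hours per minute -> "
        f"{hours_uploaded_per_day(rate):,} hours per day, "
        f"{viewing_days_per_day(rate):,.0f} viewing days per day"
    )
```

If the 20 hour figure is in the ballpark, one day of uploads works out to roughly 1,200 days of nonstop viewing, which is more than three years. That is a lot of audio waiting to be made searchable.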

Stephen Arnold, May 22, 2009

Tough to Search When Computers Are Off

May 22, 2009

Courant.com reported that a computer virus caused problems for the US Marshals’ information system. You can read “Mystery Virus Strikes Law Enforcement Computers, Forcing FBI, US Marshals to Shut Down Parts of Networks” here. Security is an important consideration in online systems and for search and content processing. Tough to perform information retrieval when the computers are offline.

Stephen Arnold, May 22, 2009
