Autonomy Heads East

February 22, 2010

A happy quack to the reader who keeps me up to date on the big doings at Autonomy. Today’s news is “China’s Civil Aviation Administration Selects Autonomy’s Meaning Based Computing Platform for Its Digital Library.” According to the write up:

With the rapid development of the Chinese aviation industry, the CAAC has seen a dramatic increase in the number of documents in its digital library. As a result, the CAAC recognized the need to implement a sophisticated information access solution that can deliver scalability and high-speed access to its millions of documents. Autonomy IDOL will allow the CAAC to integrate all of its different data types from various content repositories within the CAAC Digital Library, and provide advanced functionalities, including conceptual search, classification, clustering, and document recommendations.

When I have Chinese food, I am hungry a couple of hours later. Will Autonomy satisfy the appetite of the Chinese aviation system users? Stay tuned.

Stephen E Arnold, February 20, 2010

No pay again. I will report this to the US FAA. Busy agency these days.

Google and Energy

February 22, 2010

I left the power generation industry in 1975 (I think). I did a study of the online transaction service rolled out for Enron’s energy trading. That project forced me to look at how other companies dabbled in this once little-known niche in the US energy sector. Anyone remember Aquila, a name derived from the Latin word for eagle? Aquila is still around, but it does business as Black Hills Energy. The other companies in this sector now have some competition.

The basics of energy trading are a variant of online search and retrieval. Information is indexed and then either analysts or smart software work through the data and their changes. The algorithms stipulate that when A happens, B should occur if the probability is X.
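The "when A happens, B should occur if the probability is X" idea can be sketched in a few lines. This is only an illustration: the `Rule` class, the `decide` function, the event name, and the 0.8 threshold are hypothetical and are not drawn from any real trading system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    trigger: str            # the event "A" (hypothetical name)
    action: str             # the response "B" (hypothetical name)
    min_probability: float  # the threshold "X"


def decide(rule: Rule, event: str, probability: float) -> Optional[str]:
    """Return the rule's action when the event matches the trigger and
    the estimated probability meets the threshold; otherwise None."""
    if event == rule.trigger and probability >= rule.min_probability:
        return rule.action
    return None


# Assumed example values, chosen only to show the two outcomes.
rule = Rule(trigger="price_spike", action="sell", min_probability=0.8)
print(decide(rule, "price_spike", 0.9))  # threshold met
print(decide(rule, "price_spike", 0.5))  # threshold not met
```

A production system would estimate the probability from indexed data rather than receive it as an argument; the sketch shows only the decision step.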

In short, energy trading is just another application running on a computer. The reason I mention this is that Google is now in the buying and selling of energy business. You can get the basics in “Google Energy Can Now Buy and Sell Electricity.”

Most of the commentary I have scanned suggests that Google will be able to save money on its own electricity bills. That’s partially correct. My view is that the Google platform is going to take the “old” Enron model and improve it. Just as search and retrieval in the late 1990s was stuck in a rut, energy trading is similarly encumbered with inefficiencies.

My view: Google could be a bigger and better Enron. I do hope that its managers exercise somewhat better judgment than the “old” Enron group did. Worth watching because prior to this announcement I think the power generators, energy traders, and Wall Street mavens did not perceive Google as a mover or shaker in financial markets.

Well, that group of pundits will regroup once the light bulb goes on. Will the power be intermediated through Google’s trading desk? Buying and selling stuff based on digital data is just another Google application. Simple statement. Big implications in my opinion. Those janitor methods at Google are going to be busy little beavers.

Stephen E Arnold, February 21, 2010

I wish to report to the DOE that I was not paid to write this article. I was thinking of making the disclosure to the SEC, but I think that group has its hands full with traditional publicly traded power generation companies.

Perfect Video Search?

February 22, 2010

I wrangled a free meal from two Perfect Search engineers. I learned that Perfect Search was providing the technology for i.TV, pronounced “i dot TV.”

I have written about Perfect Search’s robust, high-performance search and content processing system previously. You may know that the company was founded in 2007 by veterans of the search industry. Perfect Search has achieved significant, game-changing, patent-protected innovation in the core processes of search. The Perfect Search system can chop down the number of servers needed to manipulate petabytes of data by an order of magnitude. The result is increases in indexing and query speeds and throughput and dramatically lower infrastructure costs. Perfect Search products include a Database Search Appliance for Oracle, a OneBox Extender for the Google Search Appliance, and search for Backup and Storage solutions.

I am not “into video” so I was not familiar with i.TV. The company offers an application for the iPhone and iPod touch that helps people discover, share and consume media. With i.TV, users can browse hundreds of thousands of up-to-date local TV and movie listings, as well as a catalog of hundreds of thousands of TV and movie titles available for download and DVD rental. i.TV also includes community features and allows people to write reviews, rate shows and recommend shows to followers and friends on Twitter and Facebook. i.TV enables users to watch movie trailers and television previews, purchase movie tickets, manage their Netflix queues, and use their iPhone or iPod touch as a remote control.

My host, Ken Ebert, one of Perfect Search’s senior technologists, told me:

We have been able to replace the native search functionality of the MySQL application and integrate the Perfect Search engine into the i.TV application and have high-throughput functionality for indexing of new data and querying of the multiple MySQL databases that i.TV maintains. Companies that have multiple relational databases struggle to index and search these content repositories in a timely, cost-effective manner, especially when the query involves complex database joins. We are able to search over a billion records on a single Database Search Appliance. We are excited to be able to be involved with a company that has such a great product and that is poised to have significant growth.

Mr. Ebert explained that the Perfect Search team was delighted to be installed as part of one of the top downloaded iPhone applications.

We downloaded the app and were able to locate specific shows quickly and easily. When I travel, I will be able to catch my History Channel favorite, “Engineering an Empire.” The i.TV app is available at the Apple Store and in the iTunes store and is a top download. Perfect Search brings order to the untidy world of programming databases. From the iPhone there is snappy performance for basic and advanced search.

Besides matching up geo-codes to determine the customer location, Perfect Search is handling some complex database joins, allowing i.TV customers to search by TV Network, actor name, or TV Show title with blistering response times. Perfect Search is also providing queries of several TV and movie listing databases.

Stephen E Arnold, February 22, 2010

I did get a free meal, but I was not otherwise compensated for this write up. I will report good food and fine company as a payoff to the Economic Research Service, a unit of the Department of Agriculture. Adhere Solutions is working with Perfect Search. My son is a smart lad in my opinion.

Search and the Open Source Card

February 22, 2010

A happy quack to the reader who sent me a link to Michael Tiemann’s “How Open Source Software Can Save the ICT Industry One Trillion Dollars per Year.” You can find the seven page document at http://regmedia.co.uk/2010/02/18/tiemann_cost_of_development_paper.pdf. When the paper was written in the fall of 2009, Michael Tiemann was the President of the Open Source Initiative and Vice President of Open Source Affairs at Red Hat. This firm is one of the highest profile commercial enterprises to have built a business on open source software. You can get the rosy financial news by searching Google Finance for RHT.

When I read the paper, I found myself in general agreement. But Red Hat is in the operating system and middleware business. For companies eager to chop down the license fees charged by commercial software companies, Red Hat’s approach is a must-have play.

My interest is search and content processing, and I think that many organizations are struggling to define search. If the news flowing from companies like Lemur Consulting and Lucid Imagination is accurate, some commercial search vendors no longer get a chance to compete. The outfits happy with Red Hat, JBoss, and other open source software are likely to hop on the Lucene / Solr bandwagon.

You can get a very upbeat picture of the benefits of open source software in Mr. Tiemann’s white paper. So if you want to make a case to go open source, you will want to download the document and tuck it in your “Sources” file.

There are some interesting “factoids” in the paper; for instance:

  • A reminder that most commercial software installations end up as train wrecks.
  • Costs and unnecessary expenses continue to escalate for organizations relying on commercial software.
  • Proprietary software inhibits innovation.

But what about search?

Let me identify what I think is an interesting trend regarding open source and commercial vendors of search and content processing systems.

First, I have noted that one company has cut a deal with a commercial enterprise to make “connectors” available to the open source licensees. Connectors are the code widgets that allow one type of content such as Lotus Notes email to be indexed by a third-party system such as Lucene. This merging of commercial and open source suggests to me that for certain types of software, the open source community does not provide what many organizations need. After all, what good is a search system if it cannot index information in a widely used email system like Lotus Notes? I am not suggesting that the rosy picture painted by Mr. Tiemann is incorrect, but I think this is an interesting open source gap. Perhaps it will be filled by Red Hat?

Second, a number of high profile companies are offering open source operating systems. One notable example is a large search vendor’s operating system for mobile devices. If I were a struggling mobile company, I would certainly look closely at an open source, no-fee operating system. One would think that such a mobile operating system would sweep through the telecommunications industry like wildfire. What I learned last week was that Motorola was giving the for-fee Windows 7 Mobile a very close look. Why? If the open source mobile operating system has a fraction of the payoffs referenced in Mr. Tiemann’s essay, why hook up with a very proprietary outfit like Microsoft? What does Motorola know that I don’t know?

Third, a number of vendors are talking about such Frankencode approaches as “support for open systems”, “full embrace of standards,” and “our APIs are open”. What do these phrases mean? On the surface, these vendors of proprietary systems seem to be leading me down the open source path. However, are these vendors using language in a way to lure the red fox to the steel trap?

Fourth, a very large outfit has figured out how to run Linux on its mainframes. What’s the purpose of this technical cartwheel? If I “buy” a mainframe, won’t the margins be sustained by boosting the price of those funny little connectors that mainframes use to hold drives in the DASD or the truly weird cables needed to hook certified gizmo A to certified gizmo B?

My hunch is that open source is a significant trend in software. Some of the success of open source is driven by those who want to create software to hold down costs and operate in a manner that to some degree reduces the brutal costs associated with certain commercial software products.

I think there is a big marketing and PR play underway as well. The use of the phrases “open source” and “support standards” sounds pretty good. Get the software into the company. When the organization’s boss figures out that the existing tech staff cannot make the open source software work as everyone believed it would, then the consulting engineers are ready to pounce.

My view is that one needs to bring the same discipline to defining requirements, testing software, and performing financial analyses regardless of the software type. This means that commercial and open source adherents will have to prove that their products and services can stand and deliver.

Without that discipline, “open source” is little more than a buzzword like “social media.”

Stephen E Arnold, February 22, 2010

No one paid me to write about open source. Because open source is “free” and I was not compensated, I am at a loss to know to whom to report my financial lapse. Maybe the Department of Treasury is the outfit in charge? Treasury knows money or at least how to print it I believe.

Buzz Search: Defaults Do Not Fly

February 22, 2010

Editor’s Note: Constance Ard, the Answer Maven, is one of the goslings. She wrote an overview of Google Buzz search functionality. Ms. Ard is active in the Special Libraries Association, heads up the legal interest group, and has an MLS with an emphasis on online search, taxonomies, and content processing.

With the release of Buzz flapping everyone’s wings over the last Internet half-life, it’s time to consider some practical application for Buzz. Danny Sullivan at Search Engine Land has laid the groundwork for searching Buzz.

For the record, the type-it-in-the-box-and-trust-the-search-results approach isn’t enough with this service from Google. You can see below that Buzz, a social media tool that gets feeds from Twitter, Google Reader, Friend Feed, and SMS, displays results from a typical box search that are surprisingly old in the real-time scheme of things.

These results are for a search done at approximately 8 p.m. EST on February 17, 2010, through the Buzz search box with the term: Olympics. The first result is time-stamped 4:50 p.m. The last result was stamped 9:41 a.m. and the second was stamped 8:23 a.m. These are not exactly real-time results and not even reverse chronological in display.
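The ordering complaint can be checked mechanically. The sketch below uses only the three time stamps quoted above and assumes all of them fall on the same day; the function names are illustrative and have nothing to do with how Buzz itself ranks results.

```python
def to_minutes(stamp: str) -> int:
    """Convert a stamp such as '4:50 p.m.' to minutes after midnight."""
    clock, meridiem = stamp.split(" ", 1)
    hours, minutes = (int(part) for part in clock.split(":"))
    hours %= 12  # treat 12 as 0 before applying the a.m./p.m. offset
    if meridiem.lower().startswith("p"):
        hours += 12
    return hours * 60 + minutes


def is_newest_first(stamps: list[str]) -> bool:
    """True when every result is at least as recent as the one after it."""
    values = [to_minutes(s) for s in stamps]
    return all(a >= b for a, b in zip(values, values[1:]))


# Displayed order as described: first, second, and last result.
displayed = ["4:50 p.m.", "8:23 a.m.", "9:41 a.m."]
newest_first = sorted(displayed, key=to_minutes, reverse=True)

print(is_newest_first(displayed))
print(newest_first)
```

A true reverse chronological display would place the 9:41 a.m. item ahead of the 8:23 a.m. item.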

[Screenshots: Buzz search results for “Olympics”]

The same search on Buzzzy.com (selected results shown below) done at the same approximate time provides even more irritating displays. Has anyone heard of time and date stamps? I understand that in real-time search hours count, but in search, pinpointing an accurate date and time is essential.


Twitter and Mining Tweets

February 21, 2010

I must admit. I get confused. There is Twitter, TWIT (a podcast network), TWIST (a podcast from another me-too outfit), and “tweets”. If I am confused, imagine the challenge for text processing and then analyzing short messages.

Without context, a brief text message can be opaque to someone my age; for example, “r u thr”. Other messages say one thing, “at the place, 5” and mean to an insider “Mary’s parents are out of town. The party is at Mary’s house at 5 pm.”

When I read “Twitter’s Plan to Analyze 100 Billion Tweets”, several thoughts struck me:

  1. What took so long?
  2. Twitter is venturing into some tricky computational thickets. Analyzing tweets (the word given to 140 character messages sent via Twitter and not to be confused with “twits”, members of the TWIT podcast network) is not easy.
  3. Non-US law enforcement and intelligence professionals will be paying a bit more attention to the Twitter analyses because Twitter’s own outputs may be better, faster, and cheaper than setting up exotic tweet subsystems.
  4. Twitter makes clear that it has not analyzed its own data stream, which surprises me. I thought these young wizards were on top of data flows, not sitting back and just reacting to whatever happens.

According to the article, “Twitter is the nervous system of the Web.” This is a hypothetical, and I am not sure I buy that assertion. My view is that Google’s more diverse data flows are more useful. In fact, the metadata generated by observing flows within Buzz and Wave are potentially a leapfrog. Twitter is a bit like one of those Faith Popcorn-type of projects. Sniffing is different from getting the rare sirloin in a three star eatery in Lyon.

The write up points out that Twitter will use open source tools for the job. There are some juicy details of how Twitter will process the traffic.

A useful write up.

Stephen E Arnold, February 22, 2010

No one paid me to write this article. I will report non payment to the Department of Labor, where many are paid for every lick of work.

Jargon Means Shields Up for Consultants

February 21, 2010

I just read “Computer Jargon Baffles Users, Hinders Security.” This is a Thomson Reuters’ news story, and I don’t know if the wild and crazy url will work when you read this. Not my fault. Email Thomson Reuters, whose customer support crew is ready to help you.

The news story is one that runs every few months. The idea is that jargon is pretty much impossible for the average person to figure out. The argument in the Thomson Reuters’ story pivots on security, but the journalist could have picked on search, business intelligence, or any other common enterprise application. Jargon is a defense mechanism. Magic.

[Image: force field. Source: http://s.bebo.com/app-image/7979726037/5411656627/PROFILE/i.quizzaz.com/img/q/u/08/04/08/Force_Field.jpg]

For me, the key passage in the Thomson Reuters’ story was:

“The malicious and criminal use of cyberspace today is stunning in its scope and innovation,” said Dell Services President Peter Altabef. One problem is that computer “geeks” use jargon to cloak their work in scholarly mystique, resulting in a lack of clarity in everything from instruction manuals and systems design to professional training, the experts said. “If you don’t demystify security, people become anxious about it and don’t want to do it,” former U.S. Homeland Security Secretary Michael Chertoff told Reuters on the sidelines of the EastWest Institute security meeting in Brussels.

I had a conversation with a big wheel from a blue chip consulting firm. I really want to reveal which firm, but my legal eagle squawks when I provide certain information in this Web log. The guts of the conversation are easy to summarize.


IBM: From Mainframes to SEO

February 21, 2010

IBM’s alleged mastery of SEO baffles me. I remember hearing a talk several years ago from another IBM professional. I think the person’s name was Morano, Morone or Morrano (not Murano, that’s the glass place near Venice). I blanked out of the talk because IBM makes life really tough for me to locate information on its Web site. An outfit with an almost unusable search system is not going to have much credibility lecturing me about getting indexed in Google. I received via my trusty Overflight service a snippet pointing me to Writing for Digital. This blog ran an article I found darned remarkable. The story, which I urge you to read, is “Case Study: 2 Kinds of Organic Search Competition.” I am not “into” search engine optimization. I am into creating what I think is useful content for myself. If others read what I develop, okay with me. If others don’t read my information, okay with me. I am in the minority, but I think more folks should create original information and skip the SEO work that fascinates so many experts.

Several comments about this write up:

First, it uses the phrase “link juice” as a tag. I am not sure how many experts, even the SEO experts, use the phrase “link juice” when searching. In the database business, we would use unusual terms to keep tabs on wily vendors. Humans, nope?

Second, presumably “link juice” is in line with the author’s recommendations for high value SEO. This passage caught my attention:

One of the principles of our book is to do keyword research before you begin to even concept, let alone produce a Web page. In the old days, this didn’t happen very often on marketing pages. Traditionally, the messaging for a campaign was determined and the framework for the campaign’s Web copy was written before the search experts were brought in to choose the best keywords for it. That made it difficult to attain tight relevance between the copy and the keywords—leading to poor organic search performance.

Yep, “link juice” matches this info.

Third, how much work is required to get a $100 billion outfit’s boss indexed? Obviously a whole lot. Consider this passage:

Every speech by Sam Palmisano receives a lot of media attention, with good reason. So, the landing page was linked to by such media outlets as BusinessWeek, and others. It had blog mentions galore, including ReadWriteWeb, and others. Within two weeks of the speech, the landing page had more external link equity than all but the most central pages in ibm.com, which have link equity mostly by virtue of the pages in ibm.com that link to them, not as much by external links. After a few weeks, the speech landing page had more external link equity than all but a handful of pages in ibm.com.

In my deep experience as an addled goose, I think this is more craziness than this addled goose can tolerate. You may be different. That’s what makes goose races so exciting.

In short, I encourage you to follow in IBM’s footsteps if you [a] have a $100 billion revenue stream at your back, [b] provide an almost unusable Web site to the hapless folks looking for documentation for an IBM device requiring a FRU, and [c] are trying to cook up a reputation as a guru in a field that is filled with land mines, potholes, and engineers who prefer clean code and original content.

Wowza! Why not make some changes to IBM.com and provide substantive content?

Stephen E Arnold, February 21, 2010

No one paid me to write this. I suppose I should report non payment to the GSA. IBM just landed a big contract to fix up the GSA’s computer systems. I will probably send my inputs to an IBM system which will certainly work like most IBM search systems.

Quote to Note: Google on Telco Push

February 21, 2010

PC World snagged an alleged quote from top Googler, Eric Schmidt. This struck me as a keeper. You can find the source of the alleged quote in the story “Google CEO Has No Plans to Compete With Mobile Operators”. Here’s the statement:

“We are not going to be investing in broad-scale infrastructure,” Schmidt said. “It’s a very tough business and it’s not one for which we are very well optimized….” “We don’t want people to discriminate between different providers of the same kind of media.”

Good stuff!

Stephen E Arnold, February 18, 2010

No one paid me to highlight this alleged quote. The word “alleged” means that I should report this lack of compensation to the Department of Justice, masters of the use of the word “alleged” I have heard.

Microsoft and Yahoo, The Challenges

February 21, 2010

eWeek, once one of the big dogs in the Ziff Communications kennel, ran the story “Microsoft, Yahoo Face Integration Challenges, Analysts Say” on February 20, 2010. No kidding? I set this short write up aside because I was not sure how to comment on the analysis by the analysts. I decided to point out the challenges expressed in the article even though these were scattered and not grouped to make explicit that Microsoft and Yahoo have some challenges ahead. Here goes:

  1. Nine months to achieve integration, full shift by 2012
  2. Microsoft’s ad system is ready to tackle the Google in hand-to-hand combat
  3. Combined market share about 30 percent. Google’s market share is 65 percent of the US market, maybe more, so that’s like a handicap in golf, right?
  4. Yahoo’s hot search features will add lift to Bing. What about Bing’s UX?

My thought? We will know at the end of 2012 if not sooner. If this flops, what is Plan B? Lots of assumptions, lots of challenges. No Plan B. Even Alexander the Great had a Plan B until he fell ill and died.

Stephen E Arnold, February 22, 2010

Nope, no one paid me to write about Alexander the Great. Ah, a disease. I must report getting no dough for this short item to the NIH?
