Quote to Note: Microsoft Most Profitable

May 29, 2010

I wanted to capture this quotation. I have a hunch I may want to use it in one of my columns or lectures. “Microsoft’s CEO: ‘We’re Still the Most Profitable’” contained this alleged statement by Steve Ballmer, chief executive of Microsoft:

I will make more profit and certainly there is no technology company on the planet that is as profitable as we are.—May 28, 2010

I suppose the meaning of “technology” is implied. The categorical “no technology company on the planet” is a strong statement. “More profit”, of course, depends on accounting methods. Interesting statement at a time when the technology action seems to be swirling around some other companies.

Stephen E Arnold, May 29, 2010


MBA Work Targets

May 28, 2010

Navigate to Universum, a research outfit. The 2010 “American MBA Survey” leader board is available. What is interesting is that Google retains its Number One spot. This means that if you have a tattoo that says Harvard-certified MBA, you want to work at Google. The best MBAs, in theory, get to work at their first choice employers. Well, it is a theory.

There were some surprises in the 2010 ranking. Examples include:

  • Apple at the Number Five spot
  • Amazon at Number 11
  • Microsoft at Number 15
  • Sony at Number 26, ahead of IBM at Number 29. Sony?
  • The FBI at Number 57. Wow!
  • Oracle at Number 97.

MBAs have made the world a better, cleaner, safer, more productive place. I say that with whatever is left of my Halliburtonized, Booz Allen & Hamilton-processed heart. Just look at the economic performance of the economy as a whole and the companies on this list. More MBAs. Just what business needs. A search company tops the list. I wonder if anyone at Google has thought of writing an economic analysis explaining how important Google ads are to the economy? Oh, I remember. Already done. Just not by an MBA, but by an economist. Different skills.

Stephen E Arnold, May 28, 2010


Fly.com, Flight Search

May 28, 2010

Short honk: A happy quack to the reader who alerted me to Fly.com. I ran several queries and found the service speedy, delivering results on a par with those I routinely use for my travels. I noticed that discount airlines such as Southwest did not appear in the results for my test queries. I use Southwest to travel from Louisville to Chicago Midway and to Baltimore’s BWI airport. I added Fly.com to my bookmark manager, but I still have to knit together Southwest trips with other carriers’ service to get the lowest fares.

Stephen E Arnold, May 28, 2010


Google the Data Hoover

May 28, 2010

Today the goslings and I had a short chat about data privacy. Shortly after our discussion, I read “A Matter of Trust: 10 Places Google Collects User Data From”. It is a useful, but incomplete, round up of services from which Google allegedly captures user information. Worth a quick look. Then read “How Much Do You Trust Google?” Seems like a lot of people trust Google. Quite a differing view of the company. Which is accurate?

Stephen E Arnold, May 28, 2010


SAS Text Analytics and Teragram

May 28, 2010

I received a call about Teragram, the text processing company that SAS acquired a couple of years ago. I did a quick Overflight check and realized that I had not documented the absorption of Teragram into SAS. Teragram’s technology is alive and well, but the SAS positioning is for content processing to be a component of SAS Text Analytics. The product and solution has its own subsite within SAS.com. You can locate the details at http://www.sas.com/text-analytics/.

Another important point is that SAS Text Analytics includes four components. The first is the SAS Enterprise Content Categorization function. The system parses content and identifies entities. Metadata are created along with category rules.

The second function is SAS Sentiment Analysis. A number of companies are competing in this sector. The SAS approach sucks in emails, tweets, and other documents. The system identifies various subjective shades in the source content.

SAS Text Miner now includes both text and data mining operations. The system is not one of those Web 2.0, “it is really easy” solutions. The system is easy to use, but to put “easy” in context, you will need programming and statistical savvy along with solid data set building skills.

The SAS Ontology Management solution provides a centralized method for keeping index terms and metatags consistent. Sounds easy, but this type of consistency is the difference between useful and useless information. SharePoint lacks this type of functionality. You have been given a gentle reminder about consistent tagging, dear SharePoint user.

SAS has a blog focused on text analytics. You can read “The Text Frontier” but last time I checked, the blog’s most recent update was posted in March 2010.

Bottomline: Teragram is alive and well, just part of SAS Text Analytics.

Stephen E Arnold, May 28, 2010


Ancestry Data

May 27, 2010

My family has a long history of failures. In fact, on a recent trip to the UK, I found scratched into a centuries-old stone wall near Tetbury, “No Arnolds Allowed.” If you have some relatives who sojourned in the UK, you may want to take a look at “Nonconformist Records Archived Online.” Begin your skeleton hunt by navigating to London Historical Records, 1500s-1900s.

Stephen E Arnold, May 27, 2010


Duck Duck Go in the News

May 27, 2010

“Privacy Protecting Search Engine Challenges Google” reminded me about the Duck Duck Go search engine. A quick check of my Overflight files revealed that the company’s name popped up in a comment on a Beyond Search article called “The Addled Goose’s KMWorld Lecture”, which appeared in late 2008. The “official” date for the company’s inception is 2009. Duck Duck Go has found an audience for its message of search privacy. According to the MIT Technology Review article:

[Duck Duck Go] is a search engine that is profoundly–some might say radically–private. Unlike Google, it doesn’t build a user profile for you, store your IP address, or collect any other information that could ever tie a particular search to you. That makes it impossible, for example, for a future more-evil version of Weinberg (or his company, were someone to buy it) to exploit that data by selling it to advertisers without your permission (as Digg, MySpace, Facebook, and others have done). Or for the company to accidentally make search data public so that someone can connect whole strings of searches to the individuals who conducted them (as was done with AOL data in 2006). Or for a more intrusive U.S. or foreign government to successfully subpoena your search history.

You can read Duck Duck Go’s original write up about its approach in “Duck Duck Go Searches Are Now Externally Anonymous”. I found some interesting information in the write up; for example, a clever use of Amazon’s S3 service.

We ran some test queries and found the results useful. The brains behind the search service is an MIT grad, Gabriel Weinberg. Unlike Google, he is approachable and has an interest in startups. If you are not familiar with Duck Duck Go, spend some time running queries on the system. We did and we were impressed.

Stephen E Arnold, May 27, 2010


Google and Its Value to the US Economy

May 27, 2010

I ignored this news item when I first read about Google’s value to the US economy. In a conference call yesterday, one of the participants mentioned the “importance” and “value” of Google. I took a look at a couple of the write ups generated by non-Googlers. I have some skepticism about the utterances of Googlers, and even Xooglers have to be approached with aural radar in operational mode.

“Google Says It Helps Generate $54 Billion for Businesses and Nonprofits” is representative. The numbers are Google-scale. Who can miss the $54 billion? The method, not surprisingly, is based on advertising.

For me, the most interesting passage in the Los Angeles Times’s article was:

The company said businesses bring in $2 in revenue for every $1 they spend on AdWords, Google’s online search advertising program. Separately, Google assumed that businesses get five clicks on their search results for every one click on their ads. Based on that, the company calculates that businesses get $8 in profit for every $1 they spend on AdWords.

Seems reasonable on the surface, I suppose. But fiddle the assumptions and you can make Google even more significant or much less significant. And what happens if one asks, “Does advertising contribute to the US economy?” Google would argue, “Yes.” Those who have lost their jobs or have felt the impact of the economic downturn might answer, “No, Google did not do too much for me.”
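
To see how much the result depends on the assumptions, here is a minimal Python sketch of the arithmetic in the quoted passage. The $2 of revenue per $1 of ad spend and the five search clicks per ad click come from the Los Angeles Times passage. The `organic_click_value` discount is my assumption, not something stated in the quotation; I set it to 0.7 only because that value makes the numbers land on about $8. The function name and parameters are hypothetical.

```python
def profit_per_ad_dollar(revenue_per_ad_dollar=2.0,
                         organic_clicks_per_ad_click=5.0,
                         organic_click_value=0.7):
    """Estimate profit for each $1 spent on AdWords.

    revenue_per_ad_dollar and organic_clicks_per_ad_click come from the
    quoted passage. organic_click_value is an ASSUMPTION: the worth of an
    unpaid search click relative to a paid ad click.
    """
    # Revenue attributed directly to the paid ad clicks.
    ad_revenue = revenue_per_ad_dollar
    # Revenue attributed to unpaid search clicks, discounted by the assumption.
    organic_revenue = (revenue_per_ad_dollar
                       * organic_clicks_per_ad_click
                       * organic_click_value)
    # Subtract the $1 that was spent on the ads.
    return ad_revenue + organic_revenue - 1.0


# Fiddle one assumption and watch the headline figure move.
for value in (0.3, 0.7, 1.0):
    print(value, round(profit_per_ad_dollar(organic_click_value=value), 2))
```

Drop the discount to 0.3 and the figure is about half as large. Treat every search click as equal to an ad click and it climbs well past $8.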

Forget the numbers and the assumptions.

For me, the more interesting question is, “Why did Google feel compelled to have its world-class economist demonstrate the value of Google?”

The firm has a dominant position in Web search. The company generates most of its $27 billion in revenue from ads. But in terms of the average person, does Google’s value as calculated mean anything? Will elected officials see the number and conclude that Google is really important?

I am fascinated by the timing of this report. Is the document evidence for potential advertisers who can now turn to Facebook or other options? If the report is a type of marketing collateral, will it convince advertisers to spend more on Google?

Shareholders and stakeholders hope so. Google’s share price has fallen more than 20 percent since January 1, 2010.

Stephen E Arnold, May 27, 2010


Sybase Touts Search Prowess

May 27, 2010

“Sybase IQ Update Strengthens Database Query, Search Features” is, I suppose, a response of sorts to my opinion that SAP has not made a commitment to search. With TREX and some Endeca support, SAP seems content to rely upon third parties to make content findable within a sprawling SAP construct.

The story trots out an azure chip consultant to dash around the circus ring. I love those azure chip “we” statements. Right. The core of the announcement is that Sybase IQ 15.2 can search unstructured content and perform such tricks as term frequency analysis. Yep, that’s text analytics.
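
For readers who have not bumped into term frequency, here is a minimal Python sketch of the idea: count how often each term appears in a chunk of text. This is a generic illustration under my own assumptions, not Sybase IQ’s implementation or syntax. The function name, the tokenizing rule, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter


def term_frequency(text):
    """Count how many times each term appears in a piece of text."""
    # Lowercase the text and keep runs of letters and digits as terms.
    # A real system would also handle stop words, stemming, and phrases.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


sample = "Search is hard. Enterprise search is harder."
counts = term_frequency(sample)

# Show the two most frequent terms in the sample.
print(counts.most_common(2))
```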

The passage that caught my attention was this:

“Sybase IQ is best known for its extreme performance, allowing decision-makers to analyze business trends, predict outcomes, and revise strategies, often in a matter of seconds,” Joydeep Das, Director of Product Management, Data Warehousing and Analytics at Sybase, said in a statement. “With Sybase IQ 15.2, enterprises are now able to analyze previously untapped sources of information, such as web content and email, to deliver smarter answers across structured and unstructured data.”

What is Sybase IQ? I dug through my Overflight file on this product and jotted down these points:

  • This is a column-based database. The column approach stacks data vertically, not in the Excel-like horizontal row format. Arguments between row store and column dudes are esoteric. In a nutshell, certain types of processes are facilitated with the column approach. Keep in mind it is a relational database and some RDBMS jockeys can make row stores into columnar structures.
  • For certain types of data, reads are faster. Some data warehouse jockeys swear by the column method. Sun was a cheerleader, but we know what happened to Sun, so that may not be the endorsement it once was. Keep in mind that Sybase itself was acquired after compiling an interesting financial track record, but those offices are cool looking.
  • The company’s definition of data federation confused me. Does Sybase perform a Mark Logic type of function, creating a repository? Does Sybase work like the original Vivisimo federating method? What happens if I need to see the source? Sybase has its indexes within its system. I am not sure how IQ queries external databases, makes sure security is observed, and then returns results without getting confused about who provided what and which data item is visible to a particular user. But like many vendors, azure chip folks are happy to parrot “federated”, cash their check, and move on to the next client.
  • You can, if you hurry, download the Sybase description of IQ’s architectural strength at this link. I checked it on May 26, 2010, and it was valid. Wait too long and the PDF may be unavailable.
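
The row versus column point in the first bullet can be sketched in a few lines of Python. This is a toy illustration of the two layouts, not a description of how Sybase IQ stores data. The field names and values are made up for the example.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "east", "sales": 120},
    {"id": 2, "region": "west", "sales": 80},
    {"id": 3, "region": "east", "sales": 200},
]


def to_columns(records):
    """Pivot row records into one list of values per field."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column layout: each field's values sit together in one list.
columns = to_columns(rows)

# Summing one field from rows touches every record and every field in it.
total_from_rows = sum(record["sales"] for record in rows)

# Summing from columns reads only the single list that matters.
total_from_columns = sum(columns["sales"])

print(total_from_rows, total_from_columns)
```

Both totals match. The difference is how much data has to be read to get there, which is why the column approach suits analytic queries that scan a few fields across many records.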

With Sybase in a new home, a product in Version 15 will have an opportunity to show how it can grow and contribute to the SAP franchise. With SAP pursuing inorganic growth, the expectations and timeline will be key factors in my opinion. The database sector has some dominant players with upstarts like Google ready to enter the fray. Can Sybase challenge IBM, Microsoft, Oracle, and the NoSQL crowd? I hope so.

Stephen E Arnold, May 27, 2010


Open and Closed and the Value of Boundaries

May 26, 2010

A happy quack to the reader who sent me a link to “Standard Wishes: An Interview with the Creators of PostScript”. As we work through the program for the Lucene Revolution, I am getting some interesting information from those involved in open source search and content processing.

This interview includes an interesting comment regarding “platform independence.” I am not sure if this phrase appears in Charles Geschke’s response to the question, “What is the next outstanding problem to solve in the realm of computer programming or computer science?”

The answer from Mr. Geschke was:

It is so frustrating that this many years later we’re still in an environment where someone says if you really want this to work you have to use Firefox. We should be way past that point by now! The whole point of the universality of the Web would be to not have those kind of distinctions, but we’re still living with them. It’s always fascinating to see how long it takes for certain pieces of historical antiquity to die away. The more you put them in the browsers you’ve codified them as eternal, and that’s stupid.

The follow on question was: “Do you see this as a failure of the standards process we have today?”.

The answer from Mr. Geschke was:

Unless there’s an active, vibrant organization who takes ownership of the standard and either polices or makes so easily and readily available the implementation of that standard that no one tries to do it on their own, you don’t have a standard. That’s always the dilemma we dealt with in the early days with PostScript. If the clones had managed to wrest control of PostScript away from us, we would never have gotten to PostScript 3. It would have by that point devolved into a set of incompatibilities that would have made its whole premise pointless.

When I think about these comments in terms of open source search, several questions come to mind:

  1. Is the underlying foundation of “platform independent” based on control; that is, a file works anywhere as long as one uses the preferred method of a particular outfit?
  2. How can fragmentation be constrained without limiting innovation; that is, will there be so many versions of an open source solution that complexity undermines benefits of software like Google Android?
  3. Is the solution a benevolent dictator and not the messiness of the free market; that is, is the approach taken by Apple, to cite one example, the one to make sales to a majority in a market segment because boundaries make good neighbors?

Today powerful companies are working hard to make their solutions the “only acceptable” solution. Some organizations like this approach. IBM is a $100 billion business for this reason. Some consumers like the constraints a brand imposes. Apple controls about one third of North American music sales for a reason.

I have not worked out how open source search and content processing will intersect with the commercial interests of companies like IBM, Google, and Oracle, to name three interesting companies with different open source policies.

Stephen E Arnold, May 26, 2010

