What Data Miners Mine
September 10, 2009
KDNuggets focuses on useful information about machine processing of data and text. The Web site published a summary of visitor-provided information. The focus of the poll was answering the question, “What type of data are you processing?” The summary held two surprises for me. First, monitoring Web click stream data was gaining popularity but lagged far behind the processing of tabular data. You can get the poll data at “Data Types Analyzed / Mined Poll”. Second, the data type “Music / Audio” enjoyed a significant increase. Interesting. The mining of structured data holds the top spot, followed by time series data.
Stephen Arnold, September 10, 2009
Windows 7 Talking Points – Foreshadowing the Fast ESP Push?
September 10, 2009
I read with interest Apple Insider’s “Microsoft Unleashes Retail Talking Points Attacking Linux, Macs”. The article summarizes one tactic in marketing Windows 7 when it ships. For me the most interesting segment in the article was:
BestBuy and other retail employees can earn a $10 copy of Windows 7 for completing the training. The training course attacking Linux has been covered by other sources, including Overclock.net, where one reader commented, “I think I now know why, when I enter BestBuy, the employees say the odd lies that they do.” Microsoft is also publishing a series attacking Macs, and AppleInsider has obtained the first screenshots publicly published of this training series for retail employees. The first page of the course, outlining its “objectives,” says customers get “a lot more computer with a Windows-based PC than a Mac,” that PCs run more software programs and “come in a wide variety of colors and configurations,” and claims that “customers have less to learn to get started with a Windows 7-based PC.”
The thought I had was, “Will Microsoft use a similar method to move the Fast ESP search system?” My hunch is that the company will combine talking points with financial incentives. Losing the enterprise search sector could have a negative impact on the company.
Stephen Arnold, September 10, 2009
NStein Added to Overflight
September 10, 2009
Short honk: Nstein, a content processing company that has morphed into a more complex beastie, is now part of Overflight. You can see the most recent news by navigating to the Overflight home page and clicking on Nstein.
Stephen Arnold, September 10, 2009
Google Throws Life Preserver to Newspapers Again
September 10, 2009
The Google assumes that users will “get it”. As a result, the company is not into bicycles with training wheels. The newspaper industry got thunked on the head with Google’s approach. Now the company has inflated a big, yellow kiddy duck and pushed it toward the newspaper industry. You can read the details in the Nieman Journalism Lab story “Google Developing a Micropayment Platform and Pitching Newspapers: ‘Open’ Need Not Mean Free”. Not much to summarize. Google is pretty good at microbilling, so in my opinion, the Google flipped some bits and inflated the yellow kiddy duck. Will it be enough? I don’t think so. Three reasons:
- The cost structure for news is going to be tough to “blogize”; that is, generate good info with new content methods
- The newspapers will find themselves competing with other folks who are generating “good enough” content and who are content to survive on fame, AdWords, or a sponsor
- Google itself has some nifty content combinatorial tools. When those are made available, software – not humans – can generate some nifty outputs. I describe some of these in my Google: The Digital Gutenberg.
So, the duckie is there, but it won’t get some newspaper publishers out of the deep end. Quack.
Stephen Arnold, September 10, 2009
Coveo Adds Muscle to Its Email Search System
September 9, 2009
Short honk: My Overflight service alerted me to Coveo’s most recent boost to its enterprise email search system. You can read the full text of the article in “Coveo Launches New Email Search Paradigm with Cross Enterprise Email Search”. For me the most interesting comment in the write up was:
Traditional email search enables only the employee to access his or her desktop email, reducing knowledge sharing and creating roadblocks to information access. Coveo’s industry-first Enterprise Email Search package provides organizations with the cross-organizational ability to index, tag, categorize and securely access all enterprise email content to reduce risk and conduct investigations and provide completely unified access to email and attachments regardless of where they are stored, offline within PST files on desktops or laptops, or online within active Exchange or email archiving platforms such as Symantec Enterprise Vault and Quest Archive Manager.
For a full discussion of the new system, navigate to the Coveo Web site.
Stephen Arnold, September 9, 2009
Google, First a God, Then a Religion
September 9, 2009
I think one of the New York newspapers ran a story that suggested Google was “god”. Julius Caesar would have gone along with this idea. He was pretty flexible and pragmatic when it came to power. Now a humor charged Web site has, according to APCMag.com, created a church of Google. Okay. Good idea. The article is “Is ‘Googlism’ the New Religion?” There are nine reasons. If you need something in your life, check it out.
Stephen Arnold, September 9, 2009
Beyond the Database: Implications for Organizations
September 9, 2009
The challenge in information technology in general and information management in particular is that we face a “bridge” challenge. On one side are individuals using a wide range of devices. These include Microsoft Zune HD, Google Android phones, and netbooks like the one I am using. Millions of the young-at-heart have full-scale computers like this Apple iTouch.
On the other side are large businesses with entrenched information technology infrastructures. Change is expensive, time consuming, and often fiercely resisted by employees. Change means relearning methods that work.
When someone searches for information, a variety of sources is available. For example, if a NOAA professional gets a weather alert, he or she can pull from many sources. The problem is that the “answer” is not evident.
What about a search for “Florida severe weather”? Bing and Google return laundry lists of results. My research suggests that users do not want laundry lists. Users do want answers or a result that gets them closer to an answer and farther from the almost useless laundry list of results.
In this talk (converted to an essay), I will comment about some of Google’s new technology, but I want to point out that Microsoft is working in this field as well. Most of the major players in search, content processing, and business intelligence know that laundry lists are a dead end, of low value, and a commodity.
Google’s corporate strategy looks unorganized. The Sunday, January 28, 2007, New York Times article about Steve Ballmer included a reference to Google’s dependence on search advertising. The implication was that Google is a one-trick pony and therefore vulnerable. Google is in a tough spot because if advertising goes south, the company has to have a way to monetize its infrastructure. Google has spent billions building a global datasphere, a subject to which I will return at the end of this talk / essay.
Stand on the edge of a slice in the land near Antrim, Northern Ireland. You see a gap which you can cross using a rope bridge. Someday, a modern steel structure may be put in place. But for now, the Northern Ireland residents need a “good enough” solution.
That’s the problem the Federal government and many organizations face. Instead of a gap in the terrain, there are many legacy systems inside the organization and new systems outside the organization. The systems gap creates major problems in information access, security, and efficiency. In today’s economic climate and in the new Administration’s commitment to serving citizens, a digital bridge is needed, sooner rather than later.
The opportunity is to bridge these two different sides of the river of technology that flows through our society. Similar gaps exist elsewhere: the structured versus unstructured information gap, the legacy systems versus Web service enabled systems gap, the Microsoft versus Google gap, the archived data versus real time data gap, the semantic versus statistical gap, and others.
The questions are, “How can we get the bridge built?” and “How can we deal with these gaps?”
These are important issues, and the good news is that tools and approaches are now becoming available. I will highlight some of Google’s innovations and mention one company that has a product available that provides a functional “bridge” between existing IT infrastructure and Google’s services. Many tools are surprisingly affordable, so progress—in my opinion—will be picking up steam in the next six to 12 months.
Because I have limited time, I will focus on Google and make do with side references to other vendors working to build bridges between organizations’ internal systems and the fast-moving, sometimes more innovative world external to the organization.
I have written three monographs about Google technology: The Google Legacy in 2005, Google Version 2.0 in 2006, and Google: The Digital Gutenberg this year. Most of the information I am going to mention comes from my research for these monographs which are available from Infonortics, Ltd. (http://www.infonortics.com). The information in my monographs comes from open source intelligence.
In the Q&A session, I will take questions about IBM’s, Microsoft’s, and other companies’ part in this information drama, but in this talk most of my examples will be drawn from Google. I don’t work for Google and Google probably prefers that I retire, stop writing my monographs and blog posts about the company, and find a different research interest.
Let’s start with a query for an airplane flight.
Google and Sony Tie Up Implications for Competitors
September 9, 2009
When Google held a seat on the Apple board of directors, one assumed that the companies were in sync. Then the Google board seat went away. A short time later, Google inked a deal with the struggling Sony. The most interesting angle was the increasingly controversial Google Books project. Sony updated its somewhat clumsy ebook and announced that the device would use Google Books or at least some of them. I did not bother to point out that the axis of Google – Sony seemed to offset the more and more chummy relationship between Microsoft and Amazon. These large scale movements are outside the scope of this somewhat narrow Web log. I did notice that few people explored this new tie up in any detail.
I want to mention one news story that struck me as interesting and possibly germane to the Google – Sony love fest. “Sony Music in Mexico Raided by Police” reported “Thousands of CDs and storage devices were seized after a raid on Sony Music Entertainment Mexico offices linked to a legal case with ranchera singer Alejandro Fernandez.” Sony, as you may recall, ceded the personal media player sector to Apple. Then Sony deployed what some called a “root kit”. In short, lots of Sony excitement to go with the financial excitement the company has delivered to investors.
My question, “Will Google competitors and some who are skeptical of Google’s innate goodness probe into the Google – Sony deal?” I wonder if there may be a gap that can be leveraged?
Stephen Arnold, September 1, 2009
Google and Consolidated Messaging
September 9, 2009
Last evening I attended a reception at Google’s Washington, DC offices. The goose kept his beak down and his ears open. One of the most interesting comments was the phrase “consolidated communications”. Imagine my surprise when I realized that the Googler was referring to knitting Google services together. The interlocks have been evident in Google’s research papers, but now the Google was edging forward—finally. You can read “Google Voice Finally Marries SMS And Email” and more details. Microsoft is on this trail as well. What I find different about each firm’s approach is that Google focuses on the individual user and Microsoft focuses on the enterprise. This difference is going to become more important over time. The notion of having messages in forms that meet a user’s situational needs is important and a component of the Wave tech pack which is gaining strength.
Stephen Arnold, September 9, 2009
Oracle Cuddles with InQuira
September 9, 2009
Database Journal’s “Oracle Announces Integration of Oracle CRM On Demand with Partner InQuira’s Web Self-Service Applications Available On Demand” solves a problem for some Oracle CRM customers and is good news for InQuira, a vendor of natural language processing technology. The story reported:
Oracle announced the availability of an integration combining the capabilities of Oracle CRM On Demand with InQuira’s On Demand Web self-service applications. The integrated, on demand service solution enables customers to go seamlessly from self-service to live agent-assisted service.
Several thoughts beg to be captured. First, what happened to Oracle’s SES10g search technology? My recollection is that SES10g could handle similar functionality. Second, will this tie up be an early signal that Oracle is embracing third party solutions for search and content processing?
Stephen Arnold, September 9, 2009