Addressing the Limitations of Traditional Databases
October 20, 2009
I saw a news release (“SAS and Teradata Extend In Database Analytics Leadership, Capabilities”) that will be good news for customers of SAS and Teradata (once part of NCR, I believe) who are struggling with very large datasets. The idea is that the two firms will join forces to put more number crunching inside the data warehouse environment. Processing should be faster, and some of the tedious export-and-crunch steps will be reduced and in some cases eliminated.
When I read the story, however, I saw the tie-up as yet another workaround for the limitations of traditional data management systems. With large data sets, it takes too much time and effort to extract subsets with an old-style query and then process them with numerical recipes. Speed has to be a key feature. Even the opaque Google Squared and Wolfram Alpha services work quickly and allow a user to interact with data. Obviously, for users accustomed to the cache-based systems offered by Web search companies, the hurry-up-and-wait approach that traditional databases impose is a problem.
My view is that a leapfrog technology will create a lot of pain for vendors anchored in the traditional RDBMS approach to data. I don’t see leapfrog innovation from any of the established database vendors. I see more innovation from companies like Aster Data and Infobright. But even these firms’ solutions may not be able to handle the petascale data flows of real-time information streams and the growing piles of legacy data that must be crunched for a thorough analysis.
In short, a leapfrog data management technology could disrupt the traditional RDBMS market. And quickly. Characteristics of a leapfrog solution could include:
- A hybrid on-premises and cloud approach
- Support for new types of queries such as lineage, which makes it possible to “see” where data originate (a sketch of the idea appears after this list)
- Massively parallel computation without a data cube
- User-friendly syntax.
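To make the lineage idea concrete, here is a minimal, purely hypothetical Python sketch of one way a system could carry provenance along with each value. The class name and the warehouse record IDs are my inventions, not any vendor’s API.

```python
from dataclasses import dataclass

# Hypothetical illustration: each value carries the identifiers of the
# warehouse records it derives from, so lineage survives computation.
@dataclass
class TracedValue:
    value: float
    sources: frozenset = frozenset()

    def __add__(self, other: "TracedValue") -> "TracedValue":
        # Combining two values merges their lineage sets.
        return TracedValue(self.value + other.value, self.sources | other.sources)

# Figures "loaded" from different warehouse rows; the row IDs are invented.
q3 = TracedValue(1_200_000.0, frozenset({"sales_fact:row-8841"}))
q4 = TracedValue(1_550_000.0, frozenset({"sales_fact:row-9127"}))

total = q3 + q4
print(total.value)            # 2750000.0
print(sorted(total.sources))  # shows where the number "originates"
```

A production system would push this bookkeeping down into the query engine, of course. The point is only that lineage is metadata a system can propagate, not a report someone reconstructs after the fact.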
Any candidates to suggest?
Stephen Arnold, October 20, 2009
I got paid a dollar by SAS a couple of years ago but nothing for pointing out that this solution is promising but not a leapfrog. Ah, that’s life.
Battle of the BI Systems
October 20, 2009
I was in Europe and besieged by business intelligence or BI conversations. No one wants “search”. The user wants “answers”. The shortest way from a question to an answer is the next generation business intelligence system or BI solution. The only problem is that if a person does not know what question to ask, it remains tricky for a business intelligence system to provide much of an answer.
Not surprisingly, the BI vendors have approached this problem in a straightforward manner. The BI system is configured to provide reports on a number of standard topics. The most generalized collection of these standard components is called a dashboard. The idea is that a user looks at a dashboard and recognizes a heading or graphic that provides the “answer”. The user has been changed from a person who asks a question into a person who recognizes a link that leads to an answer.
Search has become business intelligence. In my opinion, some customer-sensitive search vendors are shifting their angle of approach to the market by several degrees. These vendors offer tools that an information technology department or a reseller can use to build dashboards.
One company that is in this business is Attivio, formed a couple of years ago by former Fast Search & Transfer professionals. In my meetings, Attivio was not mentioned, but in the airport waiting for my flight back to the US I read “Traction Software Selects Attivio to Power Information Access for Enterprise 2.0 Social Software”. The article was brief, but I clicked to the Attivio Web site and followed links to Traction Software. These companies are pushing into social business intelligence.
One of the people who regaled me with business intelligence information provided a useful item of information: Yellowfin in Australia. I was not familiar with the company. The person who explained this company to me mentioned four points:
- Although the product is commercial, some open source goodness flavors this solution
- The software includes analytics and some nifty graphics
- Geographical support is baked in
- The system offers social information support.
The Yellowfin site offers demos, and I signed up for additional information. Here’s an example of the type of dashboard one can make available to end users.
I saw one demo whilst in Europe which I will profile in a separate write up. (I have to check to see if the information is publicly available.)
Three observations:
- Search is shifting to the dashboard model
- The lines between business intelligence and UX (Microsoft’s buzzy term for a slick interface) are blurring; these dashboards are starting to look quite a bit like the intelligence community’s screen displays
- Users don’t have to know the ins and outs of either the data or the logic to get what’s needed from a system. In my opinion, this approach can “dumb down” some types of information access, which may be a shortcut to making some ill-advised decisions.
The dashboard approach to information access is one that gets potential users excited and interested in spending money for a new approach.
Stephen Arnold, October 20, 2009
Traditional Database Alternatives Listed
October 20, 2009
A happy quack to the reader who sent me a link to “NoSQL: Distributed and Scalable Non-Relational Database Systems”. I have mentioned the problems that petascale data flows create for traditional RDBMS systems. Jeremy Zawodny has gathered together a number of non-SQL databases and provided a comment about each in this Linux Magazine write up. I recommend that readers of this Web log read the article and then tuck a copy away for one of those dreary days when Dr. Codd’s baby spits up. The article includes a list of attributes for the non-SQL data management systems. Software profiled includes Redis, Tokyo Cabinet, Apache CouchDB, Riak, and several others. I don’t have any quibbles with the article. My research indicates that a shake up in the data management world is coming. The big one will not originate from today’s leaders in data management. An outlier in this sector is going to set off some market upheavals. Look for late 2010, which is not too far off.
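If you have not poked at one of these systems, here is a minimal sketch of the key-value model that several of the profiled systems share. I use the redis-py client for illustration; the key names are mine, not Mr. Zawodny’s, and the snippet assumes a Redis server running locally.

```python
import redis  # redis-py client; assumes a Redis server on localhost:6379

r = redis.Redis(host="localhost", port=6379, db=0)

# No schema, no tables, no joins: just set and get keys.
r.set("user:1001:name", "addled goose")
print(r.get("user:1001:name"))  # b'addled goose'

# Atomic server-side counters suit write rates that make an RDBMS wheeze.
r.incr("page:views")
print(r.get("page:views"))  # b'1' on a fresh database
```

The trade-off is giving up ad hoc joins and cross-record transactions in exchange for speed and easy horizontal scaling.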
Stephen Arnold, October 20, 2009
No pay for this opinion. Sniffle.
Unusable Web Pages
October 20, 2009
I noticed a growing annoyance (maybe problem is a better word) with public-facing Web pages. Let me do a quick rundown of what I noted.
Too Much Code
The assumption that a person visiting a Web site has a fast, low-latency broadband connection is a bad one. I experienced severe latency using Internet connections in hotels, at a conference, and in a public Internet café in Paris. Pages would not render. I looked at several search vendors’ Web sites and was unable to access the splash pages. I am not sure what the firms’ Web teams are trying to accomplish. One result was that I could not answer a client’s questions about a service, so I just deleted these companies from the procurement list for the job I was doing. These vendors were not pushing crazy animations. The firms were, I suppose, trying to make their sites more “useful” with too much JavaScript or lousy programming. I want to mention these vendors, but my attorney is a spoil sport.
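One rough sanity check a Web team can run is to measure what a page actually forces a visitor to download. Here is a minimal Python sketch; the URL is a placeholder, not one of the vendors I am declining to name, and the count covers only the base HTML, not the scripts, styles, and images pulled in afterward.

```python
import urllib.request

# Placeholder URL; substitute the splash page you want to weigh.
url = "http://www.example.com/"

with urllib.request.urlopen(url, timeout=30) as response:
    body = response.read()

# The base HTML is a floor, not a ceiling: every referenced script, style
# sheet, and image adds to what a slow hotel connection must carry.
print(f"{url}: {len(body):,} bytes of HTML")
```

If the base page alone runs to hundreds of kilobytes, a visitor on a laggy café connection never sees the splash page at all.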
Pop Ups
A number of Web sites, such as those from the publishers of Network World and Computer World, use full-page ads that display before a story appears. These are delightful when I have a high speed connection. These are less delightful when I have to use a sluggish Internet connection or a connection with latency issues. The pop up may or may not go away. After accessing the story, hitting the Back arrow often presents a blank page. In one instance, the blank page would not go away. My solution was to avoid reading the stories from this publisher. A related issue surfaces on some tech news sites where underlined words, hot links in videos, and dense blocks of links to other nifty articles from the publisher are presented on a Web page. I use a Toshiba netbook, and I simply cannot cope with the presentation and the pop ups that take three to ten seconds to go away. Once again my solution is to *not* read articles from these publishers.
Hard to Spot Ads
I have to tell you that I get pretty angry when a story is not a story. Ads are inserted in content flows and poorly marked. Maybe a person with perfect vision can see the tiny notices, but at my age, I can’t deal with these. My normal habit is to click on the top five or six links regardless of topic, scan the item, and then move on. A recent change on Digg.com embeds a sponsored link in a content flow. Here’s a screenshot with the paid listing marked. I added the yellow highlight so this would stand out.
My fix has been to stop using Digg.com.
Autorunning Videos
I also avoid any Web site that autoplays a video. I don’t have much of a speaker in my Toshiba, so noise is not the problem. The issue arises from the sluggish performance I have experienced.
Search vendors and sites running content related to search may want to think about the implications of annoying users. I am content to fly alone, but real customers and prospects may decide to join my flock. The cute and bloated, therefore, produce the opposite of what the companies want to achieve. When the videos require that I download or update special programs, I get really testy. I can’t mention a certain large software company that wants to displace Adobe Flash, which I also dislike because of its really wonderful “Flash cookies”. Grrr.
Stephen Arnold, October 20, 2009
Dear US government, no one paid me to write this article. My dog did lick me when I was inserting the screenshot so that may count as compensation in some instances.
Reflections on SharePoint and Search
October 19, 2009
I had an interesting conversation with a librarian at the international library conference in Hammersmith on October 15, 2009. This professional from a central European country asked my views about SharePoint. As I understood her comments, she was in a SharePoint-centric environment and found that making small changes to improve information access was difficult, if not impossible.
One reason was her organization’s bureaucratic set up. Her unit was in the same building as the information technology group, but IT reported to one senior manager and her library to another. Another reason was the reluctance of the IT team to make too many changes to SharePoint and its indifference to her legacy library systems. Her challenge was even more difficult because there were multiple legacy systems in the information center. One of these systems offered search, and she wanted to have SharePoint make use of the content in that legacy system’s repository. She was not sure which vendor’s search system was inside the legacy system, but she thought it was from the “old” Fast Search & Transfer outfit.
Okay, I thought. Fast Search & Transfer, the company Microsoft bought in 2008 and had spent the last 18 months converting into the world-class search system for SharePoint. Fast ESP would break through the 50 million document ceiling in SharePoint search and add user experience features plus a nifty range of functions.
To make a long story short, she wanted to know, “Will Microsoft and SharePoint support the ‘old’ versions of Fast ESP?” I told her that I recalled reading that Microsoft would stand behind the Fast ESP technology for a decade. She replied, “Really?” Frankly, I do not know if that guarantee extends to OEM instances of the “old” Fast ESP. My hunch is that the problem will fall upon the vendor licensing the Fast ESP engine. But I have learned and suffered from the fact that it is very easy to write marketing collateral and somewhat more difficult to support a system, any system, for a decade. I know mainframe systems that have been around for 30 years, possibly more. But the Linux-centric search systems built to index Web content are a different kettle of Norwegian herring.
My understanding of the trajectory of Fast ESP is that the company took the core of its high-speed Web indexing system and reworked it to handle enterprise content. The “old” Fast Search & Transfer abandoned the Web search market when the company’s top brass realized that Google’s technology was powering past the Fast technology. Enterprise search became the focus of the company. Over the years, the core of Fast’s Web search system has been “wrapped” and “tweaked” to handle the rigors and quite different requirements of enterprise search. The problems with the approach, as I pointed out in the three editions of the Enterprise Search Report I wrote between 2003 and 2006, were:
- Lots of moving parts. My research revealed that a change in a Fast script “here” could produce an unexpected result elsewhere in the system “there”. Not even my wizard son could deal with these discomforting technical killer bees. Chasing down these puzzling behaviors took time. With time converted to money, I concluded that lots of moving parts was not a net positive in my enterprise search engagements. My reaction is the same as that of the librarian’s IT department: once a system is working, just leave it alone.
- Performance. In the late 1990s, Web search required a certain type of spidering. Today, indexing has become trickier. Updates are often larger than in the past, so the content processing subsystem has to do more work. Once the index has been updated, other processes can take longer because indexes are bigger or broken into chunks. Speeding up a system is not a simple matter of throwing hardware at the problem. In fact, adding more hardware may not improve performance because the bottleneck may be a consequence of poor engineering decisions made a long time ago or because the hardware was added to the wrong subsystem; for example, the production server, not the indexing subsystem.
- Ignorance. Old systems were built by individuals who may no longer be at the company. Today, the loss of the engineers with implicit knowledge of a subsystem makes it very difficult, maybe even impossible, to resolve certain problems such as inefficient use of scratch space for index updates. I recall one job we did at a major telco several years ago. The legacy search system was what it was. My team froze it in place and worked with data the legacy system wrote to a file on a more modern server on a fixed schedule; a minimal sketch of that polling approach appears after this list. Not perfect, but it was economical and it worked. No one at the company knew how the legacy search system worked. The client did not want to pay my team to figure out the mystery. And I did not want to make assumptions about how long-gone engineers cooked their code. Companies that buy some vendors’ search systems may discover that there is a knowledge problem. Engineers often document code in ways that another engineer cannot figure out. So the new owner has to work through the code line by line to find out what is going on. Executives who buy companies make some interesting but often uninformed assumptions; for example, that the software we just bought works, is documented, and is assembled in a textbook manner. Reality is less tidy than the fantasies of a spreadsheet jockey.
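For the curious, here is a minimal sketch of that freeze-and-poll pattern. The paths, polling interval, and CSV format are hypothetical stand-ins; the actual telco engagement’s details stay confidential.

```python
import csv
import shutil
import time
from pathlib import Path

# Hypothetical locations and schedule; the real engagement differed.
EXPORT_FILE = Path("/mnt/legacy/export/search_dump.csv")
WORK_DIR = Path("/var/ingest/incoming")
POLL_SECONDS = 3600  # the legacy system wrote its file on a fixed schedule

def poll_once(last_mtime: float) -> float:
    """Copy the legacy export if it changed; feed it to the modern system."""
    if not EXPORT_FILE.exists():
        return last_mtime
    mtime = EXPORT_FILE.stat().st_mtime
    if mtime <= last_mtime:
        return last_mtime  # nothing new this cycle
    staged = WORK_DIR / f"dump_{int(mtime)}.csv"
    shutil.copy2(EXPORT_FILE, staged)  # never touch the legacy box directly
    with staged.open(newline="") as handle:
        for row in csv.reader(handle):
            pass  # hand each record to the modern indexing pipeline here
    return mtime

if __name__ == "__main__":
    seen = 0.0
    while True:
        seen = poll_once(seen)
        time.sleep(POLL_SECONDS)
```

The virtue of the pattern is that the legacy system is never modified; the new system consumes only what the old one already writes.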
Internet Boiling Down to a Few Big Players
October 19, 2009
Short honk: Network World ran a story which struck me as 9/10ths marketing. The story was “The Internet Has Shifted under Our Feet”. I found some of the assertions in the article quite interesting, but there was no source for the factoids. First, 30 companies consume (maybe “account for” is more accurate) 30 percent of Internet traffic. Second, more interconnections mean more traffic will be moved locally. This is an interesting idea, and I think it is good news for content delivery networks. Finally, “Traffic these days is concentrated on a small number of Web and video protocols, while peer-to-peer traffic has nosedived in the past two years.” In short, the Internet may become the province of a handful of companies, just as the US auto industry boiled down to a couple of viable outfits.
Stephen Arnold, October 19, 2009
I wrote this for a VC type who bought me lunch.
Will UK Ad Probe Poke the Big Dogs?
October 19, 2009
I learned on Friday that the UK’s Office of Fair Trading was “deepening its probe into the online targeting of advertising and pricing”. The story I saw was on a ZD Web log. The probe seems to be a broad one. Will the net snag Google? If it does, the company has one more legal hassle to address. How many little people rendered Gulliver helpless? How many lawyers will it take to trip the Google? What about Microsoft and its renewed ad activity? Will the OFT ding Bing? And Yahoo? Yahoo is Yahoo.
Stephen Arnold, October 19, 2009
This is not an ad. It is a freebie. Gnarf.
Google Wave Has Modest Amplitude
October 19, 2009
A happy quack to the reader who sent me a link to “Google Wave: Mark Status As Undefined”. The author takes a critical look at Google Wave, makes some pointed observations about how it stacks up against a former Microsoft executive’s experience, and offers some food for thought about what happened when that executive joined Google. This write up is interesting, but I think it pegs its argument on the wrong wizard. Google Wave is a subset of a larger tech initiative at Google. The person responsible for this initiative is not the former Microsoft executive. Nevertheless, I found the write up a refreshing change from the run-of-the-mill commentary about Wave.
Stephen Arnold, October 19, 2009
No money and not even an invitation from the Google to test Wave. Sniff. Sniff. Goose discrimination.
Light Bulb Goes On: Google Is in the TV Biz
October 19, 2009
On my diagram of the businesses that Google is disrupting, I have a big label on motion pictures, videos, and television. Most of those in my lectures on Thursday (October 15) and Friday (October 16) just did not see the connection between Google and “real” rich media. I suppose those doubters are in a state of denial. Take the Guardian’s story “C4 Deal with YouTube Will Let Users Watch Full-Length TV Dramas Online. Deal Highlights Shift in British Viewing Habits.” The story reported:
The deal will see about 50 hours of Channel 4 programming made available on the Google-owned videosharing website soon after airing on TV in a move designed to get it on the front foot as traditional broadcasters scramble to develop revenue-generating online content distribution strategies. For its part, YouTube is seeking to move beyond short clips, often produced by members of the public, to offer full-length TV shows and other higher quality video content. The three-year deal with YouTube will see the broadcaster provide its content, including 3,000 hours of archive programming of shows such as Brass Eye, Derren Brown, Ramsay’s Kitchen Nightmares and Teachers, free. The partners will share income on advertising around Channel 4 content, though details of the revenue split were not revealed.
What is not in the story is the context of the move by Google. Why not? In my opinion, this move is easy to dismiss as simply another Google beta test with, after all, just one TV outfit.
So denial. Understatement. Lack of context. These are the three flaws in most analysis of Google. I want to point out that looking only at what Google has already done is not too useful. When these deals are announced, it is too late for some of Google’s competitors to take action. Sure, competitors can negotiate their own deals, but these knee-jerk reactions trigger some unpleasant consequences; namely, dealing with the costs associated with digitizing, serving, and storing chunky data like video. Google’s advantage is scale at costs below what knee-jerk outfits will incur. Thus, the context of this Google deal has some grim implications for competitors:
First, Google can ramp up video and do it quickly and more economically than most firms. Click here for one view.
Second, with Google’s third party payer business model, the lure of the Google approach may be enticing to rich media outfits with an appetite for new revenue options.
Third, the lack of context for Google’s actions means that those who join in the Google video content push are likely to see one thing when Google is doing something quite different.
Intrigued? This is a marketing blog, so you have to buy Google: The Digital Gutenberg to get more info on how Google is moving information access to a metalevel. Click the ad at the top of this Web page.
Stephen Arnold, October 19, 2009
The cab driver was nice to me as I wrote the first draft of this article. Compensation enough I suppose.
Google Keeps on Moving in the Book Sector
October 18, 2009
Most companies finding themselves buried in vituperation, legal matters, and worldwide pushback would play it cool. Maybe offer an olive branch, hold a meet up, or send happy smoke signals. Google is not “most companies.” I emerged from a grueling meeting to learn that Google is jumping into the eBook market in 2010. Why announce this? In my opinion, the Google is reminding anyone who is sentient that the Google is going to move forward in books. The forum for the announcement, according to the AFP news story “Google Editions to Be Up and Running Next Year”, was the annual love-in for publishers. Each year in beautiful Frankfurt, the big dogs of publishing gather to celebrate their products, industry, business methods, and all-round great business judgment. Google chose this forum to make known:
Google’s online service Google Editions enabling electronic books to be downloaded to mobile telephones and readers will have some half-a-million publications available in the first half of next year.
There you go. In-your-face forward movement. Unlike Google’s baby-step approach to some sectors, the Google laced up its waffle stompers and clomped through the clubby world of publishing. Reaction? My hunch is that it will take publishers and folks like Amazon some time to figure out exactly what is going to happen. Jeff Bezos, an early investor in Google, is probably wondering what “gratitude” means.
In my opinion, the killer fact in the AFP story was:
According to the Association of American Publishers, e-book sales in the United States totaled 113 million dollars last year, a leap of 68 percent from the previous year but still tiny compared with an estimated total of 24.3 billion spent on all books.
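Do the arithmetic: 113 million against 24.3 billion works out to roughly half of one percent of US book spending. Tiny indeed, even after a 68 percent leap.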
Any doubt that Google is moving forward in the book market? The intent is clear to the addled goose.
Stephen Arnold, October 18, 2009
Sadly no one provided wampum for this write up.