
XML Appliances

March 22, 2010

I worked my way through a run down of XML appliances. The write up was “The Modern XML Appliance: From Acceleration to Integration and into the Cloud.” The focus was dedicated hardware devices that allegedly cope with the size of XML files. XML can be more verbose than documents stored in an application’s native file format. In addition, XML is not * one thing *. There are flavors of XML and within XML documents one can encounter issues. For example, have you heard, “Where’s that referenced entity?”
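The “Where’s that referenced entity?” complaint is easy to reproduce. Below is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the helper name `check_entities` and the sample strings are my own illustrations, not something from the article. It shows a document that parses cleanly next to one that refers to an entity nobody declared.

```python
import xml.etree.ElementTree as ET


def check_entities(xml_text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# &amp; is one of XML's built-in entities, so this parses.
good = "<doc>Fish &amp; chips</doc>"

# &chips; is never declared, so the parser gives up on it.
bad = "<doc>Fish &chips; forever</doc>"

print(check_entities(good))
print(check_entities(bad))
```

An appliance or a software clean up layer has to make the same decision for every incoming file: reject it, repair it, or pass it along and hope.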

Years ago, I looked at a then-new method of coping with the numerous problems of Extensible Markup Language. That firm was Rocket XML, I believe. The approach was to implement some of these clean up functions via software. The article I read focused exclusively on gizmos that are often expensive, tricky to configure, and narrow in their functionality. Some of the appliances have limits on XML file size. So what’s the fix? Manual rework or expensive off loading methods are the approaches some organizations have to follow.

I did a bit of poking around and found another XML appliance article by the same outfit (IT Business Edge), “XML Appliances Get New Mission: Integrating B2B and the Cloud.” This write up ignores:

  • Increased interest in putting such functions as XML processing in firmware, not boxes
  • The need for restrictive XML in order to deal with variances in files
  • The costs of making the jump to the cloud at the present time with the current limitations.

I think the XML appliance sector is an interesting one, but like many specialized niche markets, change is coming, and it will come quickly.

Stephen E Arnold, March 22, 2010

A freebie. No one paid me to write this article. I don’t know to whom in the US government to report a no pay write up about XML. Maybe the Government Printing Office? Baffled am I.

Autonomy and Customer Relationship Management

March 21, 2010

I learned when I read “Autonomy Delivers the First Meaning-Based Multichannel Customer Interaction Analytics Application” that:

  1. Autonomy was at the Gartner Customer Relationship Management Summit. (Gartner is, like Ovum, an azure chip consultant with aspirations to define the information technology world. I believe everything both firms output too. Well, almost, I suppose.)
  2. Autonomy is a player in customer relationship management.

According to the write up:

Autonomy Explore gives businesses a much more insightful and valuable view of their customers. For instance, the same customer that submitted a complaint to the contact center, searched for products on the company’s Web properties, and then commented about the company on Twitter, may have expressed different levels of satisfaction and used different terms through each channel. Autonomy Explore detects the evolving sentiment of that customer by analyzing the concepts and patterns communicated across each touch point, in order to more effectively engage with that customer as well as other customers from the same segment.

The Autonomy system includes a range of functions, including concept understanding and automated reporting and workflow. For more information, navigate to www.autonomy.com.

Stephen E Arnold, March 19, 2010

An unpaid write up. I will report this to the agency with the best customer support in the Federal government, the Office of Citizen Services.

Interdependency: Why IT Costs Are Tough to Control

March 20, 2010

Network World ran a very interesting article which, in my opinion, helps explain why search and content processing applications are characterized by sky rocketing costs. The story is “Is IT Keeping Up with a Changing Infrastructure?” When I read the write up, I realized that most IT departments are like buggy whip manufacturers who did not want to manufacture automobile seat covers. Bad move but understandable. Buggy whips were comfortable just like silos of on premises applications and users who did not know that data could be mashed up and displayed in an actionable format.

For me, the most interesting segment of the article was:

A new study from Forrester Research Inc. shows that application developers and their project managers are not keeping up with the times…. [a] senior analyst…, said IT pros aren’t necessarily adjusting to what is the new reality of a tough economy and the popularity of certain technology trends.

I think I would have inserted the word “some” so that the statement would have stated: “some IT pros aren’t necessarily adjusting.”

In San Francisco earlier this week, I talked with a New York consulting firm. One of the interesting throw away remarks was that this outfit has found a number of new customers among the consulting firms in New York. I probed but was unable to get the names of this company’s consulting firm clients, but I recall the comments made during our chat.

One message that came through was that consulting firms are struggling to manage their information technology operations. The challenges range from cost control to finding information that someone in the consulting firms knows is on a server.

Network World has hit the nail on the head. I wonder if the clients of the firms who purport to point out IT problems have the expertise, money, and time to fix their own IT problems.

My hunch? No. But talking about the flaws in companies is much easier and more fun than fixing one’s own problems.

Just my opinion.

Stephen E Arnold, March 20, 2010

A freebie. No one paid me to write this. I will report information technology cost issues to the Government Accountability Office, an outfit with responsibility for tackling such issues. I don’t think the GAO works for free as I do, but perhaps the entity will sympathize.

Google, Profile Capturing, and User Intent

March 18, 2010

On March 16, 2010, Google nailed a patent for its invention “Profile Based Capture Component,” US7,680,809. On the surface, maybe not so exciting. Google wizard Steve Lawrence had a hand in this invention filed in early 2004. Don’t you admire the turnaround time? Here’s the abstract:

An indexing system in a computer system may include applications, a capture processor, a queue, a search engine, and a display processor. The indexing system captures events of user interactions with the applications. Events are queued and if indexable, indexed and stored for user access through the search engine. Capture components in the capture processor can include a keyboard capture component that processes user keystrokes to determine events. A display capture component captures event data from windows associated with the applications. Display event data can be captured on a polling schedule or based on state changes of window elements. To determine target applications and window applications of interest application profiles and window profiles can be used.

Google’s predictive methods revealed.
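To make the abstract less abstract, here is a toy sketch of the flow it describes: capture events, queue them, index the indexable ones, and let the user search. Everything in it is my assumption for illustration (the class name `CapturePipeline`, the use of application names as “profiles,” the whitespace tokenizer). It is not Google's implementation and it does not come from the patent document.

```python
from collections import defaultdict, deque


class CapturePipeline:
    """Toy capture -> queue -> index -> search loop (hypothetical design)."""

    def __init__(self, profiles):
        # Application names we care about; stands in for "application profiles".
        self.profiles = set(profiles)
        self.queue = deque()            # captured events waiting to be processed
        self.index = defaultdict(set)   # term -> ids of events containing it
        self.events = {}                # event id -> (application, text)

    def capture(self, app, text):
        """Queue one user interaction event."""
        self.queue.append((app, text))

    def process(self):
        """Drain the queue, indexing only events that pass the profile check."""
        while self.queue:
            app, text = self.queue.popleft()
            if app not in self.profiles or not text.strip():
                continue  # not indexable under this sketch's rules
            event_id = len(self.events)
            self.events[event_id] = (app, text)
            for term in text.lower().split():
                self.index[term].add(event_id)

    def search(self, term):
        """Return stored events whose text contains the term."""
        ids = self.index.get(term.lower(), ())
        return [self.events[i] for i in sorted(ids)]


pipeline = CapturePipeline(profiles=["editor"])
pipeline.capture("editor", "Quarterly budget draft")
pipeline.capture("game", "budget")   # ignored: no profile for this application
pipeline.process()
print(pipeline.search("budget"))
```

The profile check is the interesting design choice. It decides which applications get watched before anything reaches the index.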

Stephen E Arnold, March 18, 2010

A freebie. I will report non payment to the USPTO which recently rejected a fax because it was “backwards”. The other pages, as I understand the event, were not backwards.

Mainframe Cost: Migration Motive

March 16, 2010

You are babysitting a mainframe. The iPod listening 20 somethings don’t want to dig into the legacy code. You are reluctant to involve the IBM-savvy specialists and their new BMW work wagons. And for good reason. Navigate to “An Expert View on Mainframe Migration” at http://www.computing.co.uk/. The article provides a useful business case for dumping big iron. The source for the write up is DFA president Francis Feldman. He provides some useful factoids, including this gem:

“We expected to see a drop of between 30 and 70 per cent,” he said.

Particularly interesting was the list of the nine steps or checkpoints for migrating an application from the legacy system to a newer, much cheaper modern platform. I don’t want to recycle his list, but I can highlight three items and urge you to visit Computing UK for the full write up.

The three highlights of the write up I noted were:

First, the rework did require recoding and tweaking. The method involved recompiling into code that conformed to the ANSI standard.

Second, the legacy system and the new system were operated in parallel for a period of time. How many organizations bother with this step today?

Finally, the items on the checklist provide a solid anchoring in what one should consider. The first item is the key one in my opinion: “Asset catalogue and consistency assessment.”
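The parallel run in the second highlight is worth a sketch, because it is the step organizations are most tempted to skip. The idea: feed the same inputs to the legacy routine and the migrated routine and log every disagreement. The function names and the interest calculation below are hypothetical stand-ins of mine; the article does not describe DFA's tooling.

```python
def parallel_run(inputs, legacy_fn, migrated_fn):
    """Run both implementations on the same inputs and collect mismatches."""
    mismatches = []
    for item in inputs:
        old, new = legacy_fn(item), migrated_fn(item)
        if old != new:
            mismatches.append((item, old, new))
    return mismatches


def legacy_interest(cents):
    # Hypothetical legacy rule: 5 percent, fractions of a cent truncated.
    return cents * 5 // 100


def migrated_interest(cents):
    # Hypothetical rewrite: 5 percent, rounded half up. A subtle behavior change.
    return (cents * 5 + 50) // 100


report = parallel_run([100, 200, 250], legacy_interest, migrated_interest)
print(report)  # [(250, 12, 13)]
```

One mismatched cent is exactly the sort of thing a recompile-and-tweak migration introduces and a parallel run catches.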

I think this article is a download and save candidate. One question, “How much cheaper would a cloud solution be?” My view is that on premises installations are tomorrow’s mainframes. Just my opinion.

Stephen E Arnold, March 16, 2010

A freebie. No one paid me to write about mainframes. Because the subject is a mainframe, I will report non payment to the IRS.

Limitations of MSFT Exchange 2010

March 16, 2010

I am not sure how one of my goslings came across this spreadsheet tucked away on the Microsoft Exchange Web log. When I tried to access the file, the system did not recognize my “official” Microsoft MSDN user ID nor my Windows Live credentials. So you may have to register to access the blog. Once there, you need to look for the download section and visually inspect the file names for the one that points to the Exchange Performance Excel spreadsheet. Running a query in the blog’s search box produced zero hits for me. But with some persistence and patience I was able to get a copy of the spreadsheet. Latency was a problem when I was fiddling with this download. (Note: if the link is dead, write one of the goslings at benkent2020 at yahoo dot com, and maybe he will email you a copy of this document.)

Once you get the document “Scalability Limitations”, you will see some pretty interesting information. One quick example is that the spreadsheet includes three columns of specifics about scaling amidst the more marketing oriented data on the spreadsheet. These three juicy columns are:

  • Limitation
  • Issue
  • Mitigation.

Here’s the information for the row Database Size:

  • Limitation–Exchange 2007 – 200GB; Exchange 2010 – 2TB or 1 disk, whichever is less
  • Issue–The DB size guidance changed from 200GB (if you are in CCR) to 2TB or 1 disk, whichever is greater (if you have 2+ copies of the DB in question)
  • Mitigation—Blank. No information.

Okay.

I hope you are able to locate this document. For those of you eager to install Exchange 2010, SharePoint 2010, and Fast Search 2010, you will want to make sure you have these types of spreadsheets at your fingertips * before * you jump on the Microsoft Enterprise steam engine. The information in the spreadsheet makes clear why some types of email content processing may be expensive to implement.

Stephen E Arnold, March 16, 2010

This is the equivalent of the free newspaper Velocity in Louisville. Read it for nothing. I will report working for no dough to the Jefferson County agency that thinks I work in Louisville when I spend most of my time in the warm embrace of airlines.

More XML Expertise to Google

March 16, 2010

According to ZDNet, Tim Bray, founder of OpenText and collaborator with Ramanathan Guha on things XML, is now a Googler. The story “Ex-Sun Director Bray Joins Google’s Android Team” notes that Mr. Bray will work on the Android. The addled goose wants to point out that there are some big semantic Web guns in the Google arsenal now. Is Google becoming the big gun in the semantic Web or just the semantic Web?

Stephen E Arnold, March 16, 2010

Nope, a free one. No one paid me to reference semantic weapons. I will report this free write up to the FCC.

SQL Does Too Scale

March 16, 2010

The Dennis Forbes on Software and Technology blog published “Getting Real about NoSQL and the SQL-Isn’t-Scalable Lie”. The article caught my attention because it expresses the viewpoint that SQL does scale. I found the write up interesting, and I wanted to highlight several of the arguments presented.

First, the article points out that bashing SQL is an increasingly popular sport. Mr. Forbes writes:

In the case of the NoSQL hype, it isn’t generally the inventors over-stating its relevance — most of them are quite brilliant, pragmatic devs — but instead it is loads and loads of terrible-at-SQL developers who hope this movement invalidates their weakness.

Second, he makes clear that SQL does scale. He offers:

Such a solution — even on a stodgy old RDBMS — is scalable far beyond any real world need because you’ve built a system for a large corporation, deployed in your own datacenter, with few constraints beyond the limits of technology and the platform. Your solution will cost hundreds of thousands of dollars (if not millions) to deploy, but that isn’t a critical blocking point for most enterprises. This sort of scaling that is at the heart of virtually every bank, trading system, energy platform, retailing system, and so on. To claim that SQL systems don’t scale, in defiance of such obvious and  overwhelming evidence, defies all reason.

Third, he points out that progress is being made:

Scalability noise based upon the limitations of a cloud vendor’s offerings needs to be put into context: They don’t apply to most of the users of relational databases. MySQL isn’t the vanguard of the RDBMS world. Issues and concerns with it on high load sites have remarkably little relevance to other database systems. And of course the SQL/RDBMS world is changing (side note: Few love SQL, but I’ve yet to see a viable replacement). Wouldn’t it be a grand world where every desktop (platforms that spend about 99% of their time completely idle) in a corporation was a part of the corporate cloud, all seamlessly acting as a part of the corporate information system in a reliable, redundant way? A simple SQL statement silently and transparently fulfilled by hundreds of distributed systems?
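The closing image, one SQL statement quietly fulfilled by many machines, is the scatter-gather pattern. Here is a minimal sketch, assuming three in-memory SQLite databases stand in for the idle desktops. The `orders` table, the shard contents, and the function names are invented for illustration and are not from Mr. Forbes’s article.

```python
import sqlite3
from collections import Counter


def make_shard(rows):
    """Create one in-memory SQLite database holding a slice of the data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def scatter_gather(shards, sql):
    """Send the same statement to every shard and merge the partial sums."""
    totals = Counter()
    for conn in shards:
        for region, subtotal in conn.execute(sql):
            totals[region] += subtotal
    return dict(totals)


SQL = "SELECT region, SUM(amount) FROM orders GROUP BY region"

shards = [
    make_shard([("east", 10), ("west", 5)]),
    make_shard([("east", 7)]),
    make_shard([("west", 3), ("north", 1)]),
]

totals = scatter_gather(shards, SQL)
print(totals)
```

The merge step only works this simply because sums of sums are still sums. Averages, joins across shards, and transactions are where the “silently and transparently” part gets expensive.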

But the really interesting part of the write up is the comment section of the Web log. Some comments are clever and others like Alex Popescu’s are thought provoking. Excellent write up.

Stephen E Arnold, March 17, 2010

No one paid me to write this. Because the article is about relational databases, I will report non payment to DHS, an outfit with quite fascinating RDBMS challenges. Those folks and their consultants get paid I believe.

Another Google Jibe

March 15, 2010

Poor, poor Google. From top of the world to a punching bag in less than three months. This new decade is proving to be a challenging one for Google. I just read “Six Delusions of Google’s Arrogant Leaders.” I want to disclose that I too have been accused of being arrogant. Now I don’t have any good reason to be arrogant. I just find that approach works for me, but, please, keep in mind that I am an addled goose, live in rural Kentucky, and am wandering slowly toward being 66 years old. I am no sports car in today’s NASCAR ego race.

But Google! According to the write up, Google is coming across as “cocky”. I don’t want to run down the six delusions. I invite you to go direct and suck up the juiciness yourself. However, I can point to two of the examples and offer a comment.

The first is “users are hungry for Google synergy.” I am not sure what synergy means. I know that the Google platform is one that works like a giant plastic bag wrapped around the earth. The idea is to put everyone in the bag and keep them there. This is mostly complete, but about 25 percent of Web users are outside of the bag and Google wants to get them in one way or another. The notion that users want this is irrelevant. What this delusion makes clear is that Google is retrofitting public relations baloney to match what the company has been working on for about a decade. What’s interesting is that it has taken mavens, pundits, and “real” journalists 360 months to figure out the Google game plan. Who’s delusional? Google which has mostly accomplished its mission or the folks just figuring out that Google has been and will continue to push the Google PR line?

The second delusion is that “Google is a worker’s utopia.” Okay, when you take money to do work, by definition, this situation is not utopia for the workers. Companies can make work less onerous or more meaningful, but it is work. I don’t think the Googlers I know are doing much more than drinking the Google Kool-Aid, trying to build their knowledge value, and get some money. Like Apple, Google operates a reality distortion field, and, let’s face it, having Google on one’s résumé is arguably more impressive than a degree in Harry Potter studies from Frostburg College. My view is that Google manipulates its workers as effectively as it manipulates the media. Like the media, Google employees play along. It’s a game with high stakes, but it is a game. Google knows exactly what it is doing.

Now what’s the arrogance? The arrogance is not unique to Google. I call this the Math Club Syndrome. Here’s how it works. A group of folks with specialized interests and skills bond, sort of like a golf foursome from Sigma Chi fraternity. The difference is that no one understands the Math Club and most people understand and envy the Sigma Chi golf foursome. As a method of coping with a world that simply does not understand math, the math club becomes insular. The club’s rules are insider rules and act like a protective barrier. No problem until the math club becomes the first next generation supra-national company jousting on an apparently equal footing with China, the Department of Justice, and giants like Microsoft.

What do we expect from the Math Club? I expect Math Club behavior, complete with the insider jokes about janitors in patent documents. (Oh, janitors is a way of describing Google’s semi autonomous agents which “clean up” statistical anomalies in petascale flows of data. Snort, snort, get it. Janitor equals Dilbert’s garbage collector, the smartest person in the comic strip. Oh, you don’t get it? Well, there you have it. A mismatch between Math Club humor and you, gentle reader.)

My view is that it is time to quit worrying about Google’s power and time to start figuring out how to surf on Google. My column for KMWorld and this month’s column for the Smart Business Network address two different ways to surf on Google. I don’t grouse. I accept that over the last decade Google has emerged as a new ecosystem. You can’t kill it because the Googlers who leave the company spawn Google-centric entities. My last count tallied a couple of hundred of these Xoogler ventures. And Facebook is not much more than a “legacy” of Google. Maybe Facebook will become the new Google, but that won’t change the arrogance.

Math Club is congruent with arrogance. Reality. Live within it; don’t deny it.

Stephen E Arnold, March 14, 2010

No one paid me to write this article. Because I have not been paid and I refer to psychological behavior, I will report my writing for no pay to the Surgeon General who understands such esoteric notions as delusions.

Database Skirmishes: Relational versus Non-Relational

March 15, 2010

A happy quack to the reader who sent me a link to “SCALE 8x: Relational vs. Non-relational.” I was able to access the link, but the page was marked “subscribers only.” The main point of the write up was to explain some of the performance differences between Codd databases like SQL Server and Oracle and the non relational data management systems like those in use at Google or Digg.com. You can download the slides upon which the article was based at http://www.pgexperts.com/document.html?id=40. The article also includes a useful write up about the issue at http://ossdbsurvey.org/.

Stephen E Arnold, March 15, 2010

No one paid me to write this news item. I will report this sad fact to Health & Human Services because I used a variant of the word “relation”.
