K-Now: Here and Now
December 17, 2008
Guest Feature by Dawn Marie Yankeelov, AspectX.com
I have been discussing progress in semantic knowledge structures with Entrepreneur and Researcher Sam Chapman of K-Now who has recently left the University of Sheffield, Department of Computer Science, in the United Kingdom to go full-time into the delivery of semantic technologies in the enterprise. His attendance at the ISWC 2008 has created some momentum to engage new corporations in a discussion on a recently presented paper on “Creating and Using Organisational Semantic Webs in Large Networked Organisations” by Ravish Bhagdev, Ajay Chakravarthy, Sam Chapman, Fabio Ciravegna and Vita Lanfranchi. Knowledge management has shifted as evidenced in his paper. He contends with others that a more localized approach based on a particular perspective of the world in which one operates is far more useful than a centralized company view. All-encompassing ontologies are not the answer, according to Chapman. In the paper, his team indicates:
A challenge for the Semantic Web is to support the change in knowledge management mentioned above, by defining tools and techniques supporting: 1) definition of community-specific views of the world; 2) capture and acquisition of knowledge according to them; 3) integration of captured knowledge with the rest of the organisation’s knowledge; 4) sharing of knowledge across communities.
At K-Now, his team is focused upon supporting large scale organizations to do just this: capturing, managing and storing knowledge and its structures, as well as focusing upon how to reuse and query flexible dynamic knowledge. Repurposing trusted knowledge in K-Now is not based on fixed corporate structures and portal forms, but rather on capturing knowledge in user alterable forms at the point of its generation. Engineering forms, for example, that assist in monitoring aerospace engines during operations worldwide can be easily modified to suit differing local needs. Even with such modifications enabled, the approach still captures integrated, structured knowledge suitable for spotting trends. Making quantitative queries without any pre-agreed central schemas is the objective. This is possible, under K-Now’s approach, due to the use of agreed semantic technology and standards.
LogRhythm: Analysis and Search of Log Files
December 17, 2008
A couple of years ago I visited a very big US government agency. I asked about log files. I learned that these were often deleted without review. The reason was, as I recall, that log files were too big. Okay, that told me quite a bit about the US government’s interest in log files. Had this big government agency had access to LogRhythm, maybe those log files would have been reviewed. LogRhythm (a variant of logarithm, get it?) is a special purpose content processing system with a search component. You can read the MarketWatch news item here. The company’s system can automate monitoring, analysis and alerting for internal or external threats. The company has added what it calls “intelligent IT search.” The software classifies content and adds metadata to log entries. One use of the system is to query logs for an audit event; that is, modifications to access authentication privileges linked to a user’s network login. I think this means that an organization fires a guy or gal. LogRhythm makes it easy to find out if said guy or gal has taken an action that the organization deems inappropriate. The metadata generated from log files includes consistent date and time stamping, prioritization of events, and context tags that distinguish a harmless file transfer from a file transfer that goes to an external IP address from a secure source within the organization. If you are struggling with log file analysis, LogRhythm may be able to help. More information is available here.
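To make the metadata idea concrete, here is a minimal sketch in Python of the three enrichments the news item describes: a consistent time stamp, a priority, and a context tag. This is not LogRhythm's code or API. The names `LogEvent` and `enrich`, the priority values, and the rule "private source address, non-private destination address means external transfer" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from ipaddress import ip_address


@dataclass
class LogEvent:
    """Hypothetical enriched log record; not a LogRhythm structure."""
    timestamp: str                      # ISO 8601 string in UTC
    priority: int                       # assumed scale: 1 = routine, 5 = review now
    tags: list = field(default_factory=list)
    message: str = ""


def enrich(raw_time, time_format, source_ip, dest_ip, message):
    """Add a consistent time stamp, a priority, and a context tag to one entry."""
    # Assumption: the raw time is already UTC; only the format is normalized.
    stamp = datetime.strptime(raw_time, time_format).replace(tzinfo=timezone.utc)

    # Assumption: a transfer from a private address to a non-private one
    # is the "external" case that deserves a higher priority.
    leaves_network = (ip_address(source_ip).is_private
                      and not ip_address(dest_ip).is_private)
    if leaves_network:
        tags, priority = ["external-transfer"], 5
    else:
        tags, priority = ["internal-transfer"], 1

    return LogEvent(stamp.isoformat(), priority, tags, message)


if __name__ == "__main__":
    event = enrich("2008-12-17 09:15:00", "%Y-%m-%d %H:%M:%S",
                   "10.0.0.5", "8.8.8.8", "file sent")
    print(event)
```

A real system would need far richer rules than one address check, but the sketch shows why tagged entries are easier to query than raw text.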
Overflight: Pop Up Summaries Implemented
December 17, 2008
Overflight Google provides a dashboard of Google’s Web log posts. You can access the service here. Just select a Google category. You can hover over a Google blog post title, and the Overflight system displays a snippet of the Web post. Google coordinates its public announcements with its Web log posts. One feature of the pop up is that it will identify the article as a unique post or a cross post. The Web logs are grouped into five categories:
We are planning another vertical Overflight. Watch for the announcement. You can search the full text of Google’s Web log posts using the Exalead search system. In head to head comparisons, ArnoldIT.com found that Exalead does a better job of adding metadata such as dates and entity extraction. Run a query on Overflight via Exalead and then run the same query on Google’s Blogsearch. Judge for yourself.
Stephen Arnold, December 17, 2008
Amazon’s Approach to Staff Motivation
December 16, 2008
The London Times has a tendency to report news that is more like the made up stories that I thought only big US newspapers offered. A colleague in Europe sent me a link to this London Times’s story here about Amazon, the much loved online retailer run by the world’s smartest man. Amazon, if the story in the London Times is true, seems to have a personnel touch closer to the idiosyncrasies reported by Richard von Krafft-Ebing (Amazon link is here) than those softies Robert L. Mathis and John H. Jackson (Amazon link is here). The Times’s headline is certainly a grabber, “Revealed: Amazon Staff Punished for Being Ill.” You can read the zippy prose here. As a really fragile goose, I find the thought of my game master putting me in the roaster because my feathers fall out quite troubling. That’s why I am not sure the Times has reported the whole story. But what Claire Newell and Daniel Foggo stated is chilling; to wit:
Warned that the company refuses to allow sick leave, even if the worker has a legitimate doctor’s note. Taking a day off sick, even with a note, results in a penalty point. A worker with six points faces dismissal.
I would be a gone goose for sure. The Times’s editors allowed Amazon to respond. I found this comment interesting:
We want our associates to enjoy working at Amazon.co.uk and the interests of all workers are represented by a democratically elected employee forum who meets regularly with senior management. This forum was consulted before the workforce elected to reduce breaks to 15 and 20 minutes on an eight hour shift in order to cut the total working day by half an hour.
Amazon has a remarkable balance sheet. Its expenditures for R&D and infrastructure seem modest. Compared to companies like Google and Microsoft, Amazon seems to have cracked the code for creating massive Web centric systems on a shoestring. Furthermore, Amazon has been quicker than either Google or Microsoft to release commercial Web services such as the Amazon cloud based computing service and its online storage system. Amazon also has its home grown search system. The guru of search high tailed it to Google. I wonder if the Amazon holiday personnel policies influenced that decision?
If the Times’s story is correct, maybe some of that balance sheet magic comes by applying Herr Krafft-Ebing’s use cases to staff. I bet I could work long hours if I wore the cruel shoes brilliantly described by Herr Krafft-Ebing. If you are not familiar with Herr Krafft-Ebing and his research into human motivation, dive in here. Just make sure no colleagues or children are peeking over your shoulder. Please, keep in mind that I am pointing to a London Times’s news story and I am not sure that story is rolling down the same railroad tracks as I. I do fancy the image of Amazon managers in Herr Krafft-Ebing’s high heels, though.
Stephen Arnold, December 16, 2008
Teragram: Growth Strong Despite Downturn
December 16, 2008
I enjoy contrarians. I say the economy is lousy. A contrarian tells me that the economy is wonderful. I say that financial fraud undermines investor confidence. The contrarian tells me to trust American Express. In fact, American Express is one of the most trusted companies in the United States. In my view, I wouldn’t trust this outfit to walk my dog.
Teragram issued an interesting news release that contains information that is contrary to information I have compiled. Specifically, Teragram, now a unit of SAS, the statistics outfit, said here:
At a time when enterprises are concerned about a lagging economy and the bottom line, Teragram has consistently provided proven, money-saving knowledge management tools. Teragram helps knowledge workers automatically organize unstructured data sources, making information more accessible and enabling faster and more accurate knowledge and information sharing. This helps enterprises efficiently manage their growing amounts of information, saving time, resources and money.
I profiled Teragram in one of my studies for a teen aged publisher and reported that the company had some solid clients, interesting technology, and a hosted option to give its customers flexibility. But the economy is lousy and I am not inclined to trust big companies. Therefore, I will keep my eye on Teragram to make sure that it continues to move smoothly against the currents that are carrying some search and content processing companies over Victoria Falls. Yahoo is in some trouble with its world class search system. I reported on TeezIR’s elusiveness. SurfRay remains a mystery. Delphes seems to be on hiatus. Entopia is a flat out goner. And I know of one “big name” that is literally fighting for its life. Could it be good public relations? The marketing clout of SAS? Teragram’s Harvard connection? If anyone knows Teragram’s secret, please, share it.
Stephen Arnold, December 16, 2008
New Arabic Search Engine Yamli
December 16, 2008
Update December 16, 2008 7:41 am Eastern
A reader provided two useful links. One suggests that Xooglers are involved. Navigate here. The other, in Arabic, may be of interest as well. That story from Onkosh.com is here. Two happy quacks to my one reader in the sunny climes of the Near East.
Original Post
A happy quack to the reader who alerted me to this news story. Click quickly. The Businesswire stories can disappear without warning. The release said: “Yamli.com, a startup targeting the Arabic Web, unveiled its new search engine that allows users to easily search Arabic content in all its forms. Various studies show that transliterated Arabic content is ubiquitous due to a large portion of Arabic Internet users choosing to write Arabic phonetically using Latin characters in an ad-hoc and informal fashion. Yamli automatically expands Arabic keyword searches to include all of their transliterated variations and returns results for both Arabic and transliterated content. This feature is a breakthrough for Arabic Internet users who are frustrated with having to repeatedly search different variations of their query when searching for music, news or videos.” Let me know how you like this system. I had to translate words to Arabic using an online translation service. I then pasted the results into another online translation system. In short, I couldn’t figure out whether the results were on target or not. Help, please.
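For readers who want to see what “expanding a keyword to its transliterated variations” might look like, here is a toy sketch in Python. It is not Yamli's method, which the release does not describe. The function `expand_query` and the four-letter table `LATIN_VARIANTS` are hypothetical, and the sketch ignores vowels, which real transliteration has to handle.

```python
from itertools import product

# Hypothetical, deliberately tiny table: each Arabic letter maps to Latin
# spellings people use informally. A real table would be far larger.
LATIN_VARIANTS = {
    "\u062d": ["h", "7"],    # HAH
    "\u062e": ["kh", "5"],   # KHAH
    "\u0628": ["b"],         # BEH
    "\u0631": ["r"],         # REH
}


def expand_query(arabic_keyword):
    """Return the original keyword followed by every Latin-letter variant."""
    # Letters missing from the table pass through unchanged.
    choices = [LATIN_VARIANTS.get(ch, [ch]) for ch in arabic_keyword]
    variants = ["".join(combo) for combo in product(*choices)]
    return [arabic_keyword] + variants


if __name__ == "__main__":
    for term in expand_query("\u062d\u0628"):
        print(term)
```

A search front end could then run one query per variant and merge the hits, which is the behavior the release describes as returning results for both Arabic and transliterated content.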
Stephen Arnold, December 16, 2008
Dead Tree Mouthpiece Asks What Is XML
December 16, 2008
Search and content processing vendors are reasonably comfortable with XML or documents in Extensible Markup Language formats. I don’t think much of the content management industry, but I know that most of these outfits can figure out when and how to use XML. Even Word 2007 takes a run at XML. Like an inexperienced soccer player, sometimes Microsoft gets to the right spot and then misses the goal. But the company is trying. You will find “What the Hell Is XML? And Should It Really Make Any Difference to My Business?” in Publishers Weekly here a good read. The author is Mike Shatzkin, and he does a good job of explaining that publishers have to slice and dice their content; that is, repurpose information to make new products or accelerate the creation of new information. He then presents XML as “information to go”. I think the notion is to embed XML tags into content so that software can do some of the work once handled by expensive human editors. For me, the most interesting comment in the article was this passage:
Here’s what we call the Copernican Change. We have lived all our lives in a universe where the book is “the sun” and everything else we might create or sell was a “subsidiary right” to the book, revolving around that sun. In our new universe, the content encased in a well-formed XML file is the sun. The book, an output of a well-formed XML file, is only one of an increasing number of revenue opportunities and marketing opportunities revolving around it. It requires more discipline and attention to the rules to create a well-formed XML file than it did to create a book. But when you’re done, the end result is more useful: content can be rendered many different ways and cleaved and recombined inexpensively, unlocking sales that are almost impossible to capture cost-effectively if you start with a “book.”
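Mr. Shatzkin’s claim that one well-formed XML file can feed many outputs can be illustrated with a short Python sketch that uses only the standard library. The sample markup, its element names (`book`, `chapter`, `heading`, `para`), and the three helper functions are assumptions made for illustration; no publisher’s actual schema is implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the element names are illustrative only.
SOURCE = """<book>
  <title>Sample Title</title>
  <chapter><heading>One</heading><para>First paragraph.</para></chapter>
  <chapter><heading>Two</heading><para>Second paragraph.</para></chapter>
</book>"""


def is_well_formed(text):
    """True if the text parses as XML; mismatched tags raise ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def table_of_contents(text):
    """One output: the list of chapter headings."""
    root = ET.fromstring(text)
    return [chapter.findtext("heading") for chapter in root.findall("chapter")]


def chapter_as_text(text, index):
    """Another output: a single chapter cleaved out as plain text."""
    chapter = ET.fromstring(text).findall("chapter")[index]
    body = "\n".join(para.text for para in chapter.findall("para"))
    return f"{chapter.findtext('heading')}\n{body}"


if __name__ == "__main__":
    print(is_well_formed(SOURCE))
    print(table_of_contents(SOURCE))
    print(chapter_as_text(SOURCE, 1))
```

The same source yields a contents list and a standalone chapter without any editor touching the text, which is the “cleaved and recombined inexpensively” point in the passage.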
XML and its antecedents have been around for 30 years. Anyone remember CALS or SGML? The metaphor of Copernicus’ insights into how the solar system worked seems to suggest a new world view. Okay, but after Copernicus there was a period of cultural adjustment. I don’t think the dead tree crowd has the luxury of time. My recollection is that the clock strikes midnight for the New York Times in a couple of months. Sam Zell has already embraced bankruptcy as a gentlemanly way of dealing with the economics of the dead tree business model. The Newsweek Magazine staff is working on résumés and Web logs, not the jumbo next issue. Heliocentrism is a nifty concept, but it won’t work because, like Copernicus’ De revolutionibus orbium coelestium, it was delivered from the print shop only at the very end. Oh, Copernicus allegedly got the book right before he ascended to his caelum.
I think that it is too late for most of the dead tree outfits. Fitting in a way, I suppose, Copernicus died just as his insights became available to a clueless public… printed on paper. A possible symmetry exists between Mr. Shatzkin’s reference to Copernicus and what has happened to most traditional publishers.
Stephen Arnold, December 16, 2008
Google: Worms Are Turning
December 16, 2008
Google is not accustomed to having its plans jeopardized by the likes of the Wall Street Journal. After a decade of baffling the pundits with free Odwalla beverages and lunch entertainment from the likes of Tony Bennett, the GOOG is thrashing. To add to the misery of the Wall Street Journal story here, the SFGate online site published “Google Off List of 20 Most Trusted Companies.” You can read this story here. American Express and eBay are allegedly perceived as more trustworthy than Google. Wow. eBay and PayPal. More trusted. When will other shoes begin to drop? Last week I listened as a Googler ran the game plan; that is, did a standard presentation about the firm’s capabilities. The presentation was warm, interesting, and what is on the Google Web site. Googlers, I opine, only know what Mother Google wants them to know. I have often mentioned Googler Cyrus, a high ranking Googler, who told me I Photoshopped a Google report that looked a lot like a dossier prepared by the police on a suspect. I pointed out to dear Cyrus that the image came from a Google patent document. The Googler did not believe me. Now you try to find in Google a hit on my name, my study Google Version 2.0, and patents. You won’t be able to find it. Somehow the links to my study of Google patents are really tough to find. I find this amusing. I wonder if Google finds my analyses a wee bit off putting? Now the GOOG is battling a dead tree traditional media company and finding itself no longer among the most trusted companies. What’s amazing to me is that it has taken a decade for pundits, wizards, and assorted Google search experts to figure out some of the Google’s more interesting initiatives. There’s more in the closet. I can hardly wait to see what antics dead tree media and the GOOG will display. For a quick primer, check out my Google studies here.
Stephen Arnold, December 16, 2008
Wall Street Journal Figures Out What Google Is Doing, Gets Criticized
December 15, 2008
The Wall Street Journal’s Vishesh Kumar and Christopher Rhoads stumbled into a hornet’s nest. I think surprise may accompany these people and their editor for the next few days. The story is “Google Wants Its Own Fast Track on the Web.” The story is here at the moment, but it will probably disappear or be unavailable due to heavy click traffic. Read it quickly so you have the context for the hundreds of comments this story has generated. Pundits whose comments I found useful are the Lessig Blog, Om Malik’s GigaOM, and Google’s own comment here.
The premise of the article is that the GOOG wants to create what Messrs. Kumar and Rhoads call “a fast lane.” In effect, the GOOG wants to get preferential treatment for its traffic. The story wanders forward with references to network neutrality, which is probably going to die like a polar bear sitting on an ice chunk in the Arctic circle. Network neutrality is a weird American term that is designed to prevent a telco from charging people based on arbitrary benchmarks. The Bell Telephone Co. figured out that differential pricing was the way to keep the monopoly in clover a long time ago. The lesson has not been forgotten by today’s data barons. The authors drag in the president elect and wrap up with use of a Google-coined phrase “OpenEdge.”
Why the firestorm? Here are my thoughts:
First, I prepared a briefing for several telcos in early 2008. My partner at the Mercer Island Group and I did a series of briefings for telecommunication companies. In that briefing, I showed a diagram from one of Google’s patent documents, enriched with information from Google’s technical papers. The diagram showed Google as the intermediary between a telco’s mobile customers and the Internet. In effect, with Google in the middle, the telco would get low latency rendering of content in the Googleplex (my term for Google’s computer and software infrastructure). The groups to a person snorted derision. I recall one sophisticated telco manager saying in the jargon of the Bell head, “That’s crap.” I had no rejoinder to that because I was reporting what my analyses of Google patents and technical papers said. So, until this Wall Street Journal story appeared, the notion of Google becoming the Internet was not on anyone’s radar. After all, I live in Kentucky and the Mercer Island Group is not McKinsey & Co. or Boston Consulting Group in terms of size and number of consultants. But MIG has some sharp nails in its toolkit.
Second, in my Google Version 2.0, which is mostly a summary of Google’s patent documents from August 2005 to June 2007, I reported on a series of five patent documents, filed the same day and eventually published on the same day by the ever efficient US Patent & Trademark Office. The five documents disclosed a big, somewhat crazy system for sucking in data from airline ticket sellers, camera manufacturers, and other structured data sources. The invention figured out the context of each datum and built a great big master database containing the data. The idea was that some companies could push the data to Google. Failing that, Google would use software to fill in the gaps and therefore have its own master database. Bear Stearns was sufficiently intrigued by this analysis to issue a report to its key clients about this innovation. Google’s attorneys asserted that the report contained proprietary Google data, an objection that went away when I provided the patent document number and the url to download the patent documents. Google’s attorneys, like many Googlers, are confident but sometimes uninformed about what the GOOG is doing with one paw while the other paw adjusts the lava lamps.
Third, in my Beyond Search study for the Gilbane Group, I reported that Google had developed the “dataspace” technology to provide the framework for Google to become the Internet. Sue Feldman at IDC, the big research firm near Boston, was sufficiently interested to work with me to create a special IDC report on this technology and its implications. The Beyond Search study and the IDC report went to hundreds of clients and were ignored. The idea of a dataspace with metadata about how long a person looks at a Web page and the use of meta metadata to make queries about the lineage and certainty of data was too strange.
What the Wall Street Journal has stumbled into is a piece of the Google strategy. My view is that Google is making an honest effort to involve the telcos in its business plan. If the telcos pass, then the GOOG will simply keep doing what it has been doing for a decade; that is, building out what I called in January 2008 in my briefings “Google Global Telecommunications”. Yep, Google is the sibling of the “old” AT&T model of a utility. Instead of just voice and data, GGT will combine smart software with its infrastructure and data to marginalize quite a few business operations.
Is this too big an idea today? Not for Google. But the idea is sufficiently big to trigger the storm front of comments. My thought is, “You ain’t seen nothing yet.” Ignorance of Google’s technology is commonplace. One would have thought that the telcos would take Google seriously by now. Guess not. If you want to dig into Google’s technology, you can still buy copies of my studies:
- The Google Legacy: How Google’s Internet Search Is Transforming Application Software, Infonortics, 2005 here
- Google Version 2.0: The Calculating Predator, Infonortics, 2007 here
- Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work, Gilbane Group, 2008 here
Bear Stearns is out of business, so I don’t know how you can get a copy of that 40 page report. You can order the dataspaces report directly from IDC. Just ask for Report 213562.
If you want me to brief your company on Google’s technology investments over the last decade, write me at seaky2000 at yahoo dot com. I have a number of different briefings, including the telco analysis and a new one on Google’s machine learning methods. These are a blend of technology analysis and examples in Google’s open source publications. I rely on my analytical methods to identify key trends and use only open source materials. Nevertheless, the capabilities of Google are–shall we say–quite interesting. Just look at what the GOOG has done in online advertising. The disruptive potential of its other technologies is comparable. What do you know about containers, janitors, and dataspaces? Not much I might suggest if I were not an addled goose.
Oh, let me address Messrs. Kumar and Rhoads, “You are somewhat correct, but you are grasping at straws when you suggest that Google requires the support and permission of any entity or individual. The GOOG is emerging as the first digital nation state.” Tough to understand, tough to regulate, and tough to thwart. Just ask the book publishers suggest I.
Stephen Arnold, December 15, 2008
Your Identity
December 15, 2008
One of my three or four readers wrote me because in the comments section of this Web log his / her email address appeared. As Homer Simpson said, “D’oh.” If you wish to post and remain anonymous, navigate to Mailinator here or use a similar service. Another option is to NOT read my personal Web log.
The likelihood that I am assuming responsibility for what a person provides in the comments section of this Web log is addled. If you are in doubt about the purpose or seriousness of this Web log, you need to read the editorial policy on the About page. If you want to jump there now, click this link. Just skip the baloney about my background and read the Disclaimer.
My approach to analysis baffles certain linear thinkers. Dear, confused Ms. Sperling–my high school English teacher–thought Robert Burns’s 1794 song “A Red Red Rose” was about really beautiful, true, romantic love. Nope. It’s about a sailor’s having a girl in every port. I believe she went to her grave misunderstanding Mr. Burns’s poetry.
If you, gentle reader, can’t winnow the goose feathers from the giblets, remove this Web log from your news reader. Feel free to complain about my for fee work for you, but this free stuff is what it is–recycled information with some notes I make for myself.
The fact that “Beyond Search” Web log is publicly available is due to the nature of the free content management system I use. I make little effort to code around WordPress’ problems. I go default all the way, gentle readers.
Stephen Arnold, December 15, 2008