K-Now: Here and Now

December 17, 2008

Guest Feature by Dawn Marie Yankeelov, AspectX.com

I have been discussing progress in semantic knowledge structures with entrepreneur and researcher Sam Chapman of K-Now, who recently left the Department of Computer Science at the University of Sheffield in the United Kingdom to work full time on delivering semantic technologies to the enterprise. His attendance at ISWC 2008 has created some momentum to engage new corporations in a discussion of a recently presented paper, Creating and Using Organisational Semantic Webs in Large Networked Organisations, by Ravish Bhagdev, Ajay Chakravarthy, Sam Chapman, Fabio Ciravegna and Vita Lanfranchi. As the paper shows, knowledge management has shifted. Chapman contends, with others, that a localized approach based on the particular perspective of the world in which one operates is far more useful than a centralized company view. All-encompassing ontologies are not the answer, according to Chapman. In the paper, his team writes:

A challenge for the Semantic Web is to support the change in knowledge management mentioned above, by defining tools and techniques supporting: 1) definition of community-specific views of the world; 2) capture and acquisition of knowledge according to them; 3) integration of captured knowledge with the rest of the organisation’s knowledge; 4) sharing of knowledge across communities.

At K-Now, his team is focused on helping large-scale organizations do just this: capturing, managing and storing knowledge and its structures, as well as reusing and querying flexible, dynamic knowledge. Repurposing trusted knowledge in K-Now is not based on fixed corporate structures and portal forms, but rather on capturing knowledge in user-alterable forms at the point of its generation. Engineering forms, for example, that assist in monitoring aerospace engines during operations worldwide can be easily modified to suit differing local needs. Even with such modifications, the system still captures integrated, structured knowledge suitable for spotting trends. The objective is to make quantitative queries possible without any pre-agreed central schemas. This is possible, under K-Now's approach, because of the use of agreed semantic technology and standards.
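To make the schema-free idea concrete, here is a minimal sketch of the general technique: knowledge captured from user-alterable forms is stored as subject-predicate-object triples, so quantitative queries work without a pre-agreed central schema. This is not K-Now's implementation; all names and data are invented for illustration, and a production system would use RDF standards rather than a toy store.

```python
# Toy triple store illustrating schema-free capture and query.
# Invented for illustration; not K-Now's actual software.

class TripleStore:
    """Stores (subject, predicate, object) facts with no fixed schema."""
    def __init__(self):
        self.triples = []

    def add(self, subject, predicate, obj):
        self.triples.append((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        # None acts as a wildcard, so no central schema is required.
        return [t for t in self.triples
                if (subject is None or t[0] == subject)
                and (predicate is None or t[1] == predicate)
                and (obj is None or t[2] == obj)]

store = TripleStore()
# Two sites capture engine observations via locally modified forms;
# the forms differ, but both map onto shared semantic terms.
store.add("engine-42", "hasVibrationLevel", 3.1)
store.add("engine-42", "observedAt", "site-A")
store.add("engine-99", "hasVibrationLevel", 7.8)
store.add("engine-99", "observedAt", "site-B")

# A quantitative trend query across sites, agreed on no schema upfront.
high = [s for s, p, o in store.query(predicate="hasVibrationLevel") if o > 5.0]
print(high)  # ['engine-99']
```

The point of the wildcard query is that each community can add its own predicates as its forms evolve, yet trend-spotting queries over shared terms keep working.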

Read more

LogRhythm: Analysis and Search of Log Files

December 17, 2008

A couple of years ago I visited a very big US government agency. I asked about log files. I learned that these were often deleted without review. The reason was, as I recall, that the log files were too big. Okay, that told me quite a bit about the US government's interest in log files. Had this big government agency had access to LogRhythm, maybe those log files would have been reviewed. LogRhythm (a variant of logarithm, get it?) is a special-purpose content processing system with a search component. You can read the MarketWatch news item here. The company's system can automate monitoring, analysis and alerting for internal or external threats. The company has added what it calls "intelligent IT search." The software classifies content and adds metadata to log entries. One use of the system is to query logs for an audit event; that is, modifications to the access and authentication privileges linked to a user's network login. I think this means that an organization fires a guy or gal. LogRhythm makes it easy to find out if said guy or gal has taken an action that the organization deems inappropriate. The metadata generated from log files includes consistent date and time stamping, prioritization of events, and context tags to distinguish a harmless file transfer from one that goes to an external IP address from a secure source within the organization. If you are struggling with log file analysis, LogRhythm may be able to help. More information is available here.
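As a hedged sketch of the kind of enrichment described above (not LogRhythm's actual implementation), the snippet below normalizes a log line's time stamp, assigns a priority, and adds a context tag separating an internal file transfer from one headed to an external IP address. The log format, network ranges, and rules are all invented for illustration.

```python
# Illustrative log-entry enrichment: consistent time stamps, event
# priority, and context tags. Formats and rules here are assumptions.

import ipaddress
from datetime import datetime

INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8"),
                 ipaddress.ip_network("192.168.0.0/16")]

def enrich(raw):
    # Assumed raw format:
    # "2008-12-17 09:15:02 FILE_XFER src=10.0.0.5 dst=203.0.113.9"
    parts = raw.split()
    ts = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
    event = parts[2]
    fields = dict(p.split("=") for p in parts[3:])
    dst = ipaddress.ip_address(fields["dst"])
    external = not any(dst in net for net in INTERNAL_NETS)
    return {
        "timestamp": ts.isoformat(),  # consistent date/time stamping
        "event": event,
        # a transfer leaving the organization is flagged, a local one is not
        "priority": "high" if external and event == "FILE_XFER" else "low",
        "tags": ["external-destination"] if external else ["internal"],
    }

entry = enrich("2008-12-17 09:15:02 FILE_XFER src=10.0.0.5 dst=203.0.113.9")
print(entry["priority"])  # high
```

Once entries carry this metadata, the audit query described in the article (who changed whose login privileges, and what happened next) becomes a filtered search over structured fields instead of a grep through raw text.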

Overflight: Pop Up Summaries Implemented

December 17, 2008

Overflight provides a dashboard of Google's Web log posts. You can access the service here. Just select a Google category. When you hover over a Google blog post title, the Overflight system displays a snippet of the Web post. Google coordinates its public announcements with its Web log posts. One feature of the pop up is that it identifies an article as a unique post or a cross post. The Web logs are grouped into five categories.

We are planning another vertical Overflight. Watch for the announcement. You can search the full text of Google's Web log posts using the Exalead search system. In head-to-head comparisons, ArnoldIT.com found that Exalead does a better job of adding metadata such as dates and extracted entities. Run a query on Overflight via Exalead and then run the same query on Google's Blogsearch. Judge for yourself.

Stephen Arnold, December 17, 2008

Teragram: Growth Strong Despite Downturn

December 16, 2008

I enjoy contrarians. I say the economy is lousy. A contrarian tells me that the economy is wonderful. I say that financial fraud undermines investor confidence. The contrarian tells me to trust American Express. In fact, American Express is one of the most trusted companies in the United States. In my view, I wouldn’t trust this outfit to walk my dog.

Teragram issued an interesting news release that contains information that is contrary to information I have compiled. Specifically, Teragram, now a unit of SAS, the statistics outfit, said here:

At a time when enterprises are concerned about a lagging economy and the bottom line, Teragram has consistently provided proven, money-saving knowledge management tools. Teragram helps knowledge workers automatically organize unstructured data sources, making information more accessible and enabling faster and more accurate knowledge and information sharing. This helps enterprises efficiently manage their growing amounts of information, saving time, resources and money.

I profiled Teragram in one of my studies for a teenaged publisher and reported that the company had some solid clients, interesting technology, and a hosted option to give its customers flexibility. But the economy is lousy, and I am not inclined to trust big companies. Therefore, I will keep my eye on Teragram to make sure that it continues to move smoothly against the currents that are carrying some search and content processing companies over Victoria Falls. Yahoo is in some trouble with its world-class search system. I reported on TeezIR's elusiveness. SurfRay remains a mystery. Delphes seems to be on hiatus. Entopia is a flat-out goner. And I know of one "big name" that is literally fighting for its life. Could Teragram's reported growth be good public relations? The marketing clout of SAS? Teragram's Harvard connection? If anyone knows Teragram's secret, please, share it.

Stephen Arnold, December 16, 2008

Dead Tree Mouthpiece Asks What Is XML

December 16, 2008

Search and content processing vendors are reasonably comfortable with XML, or documents in Extensible Markup Language format. I don't think much of the content management industry, but I know that most of these outfits can figure out when and how to use XML. Even Word 2007 takes a run at XML. Like an inexperienced soccer player, sometimes Microsoft gets to the right spot and then misses the goal. But the company is trying. You will find "What the Hell Is XML? And Should It Really Make Any Difference to My Business?" in Publishers Weekly here; it is a good read. The author is Mike Shatzkin, and he does a good job of explaining that publishers have to slice and dice their content; that is, repurpose information to make new products or accelerate the creation of new information. He then presents XML as "information to go". I think the notion is to embed XML tags into content so that software can do some of the work once handled by expensive human editors. For me, the most interesting comment in the article was this passage:

Here’s what we call the Copernican Change. We have lived all our lives in a universe where the book is “the sun” and everything else we might create or sell was a “subsidiary right” to the book, revolving around that sun. In our new universe, the content encased in a well-formed XML file is the sun. The book, an output of a well-formed XML file, is only one of an increasing number of revenue opportunities and marketing opportunities revolving around it. It requires more discipline and attention to the rules to create a well-formed XML file than it did to create a book. But when you’re done, the end result is more useful: content can be rendered many different ways and cleaved and recombined inexpensively, unlocking sales that are almost impossible to capture cost-effectively if you start with a “book.”
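The "content as the sun" idea can be sketched in a few lines: one well-formed XML file, rendered two different ways without touching the source. The element names and renderings below are invented for illustration; a publisher's real schema would be far richer.

```python
# Single-source publishing sketch: one XML file, multiple outputs.
# Element names are hypothetical, not from any real publishing schema.

import xml.etree.ElementTree as ET

source = """<article>
  <title>Winter Recipes</title>
  <para>Roast the chestnuts.</para>
  <para>Serve warm.</para>
</article>"""

root = ET.fromstring(source)

def to_html(node):
    # Render for the Web: title becomes a heading, paragraphs become <p>.
    paras = "".join(f"<p>{p.text}</p>" for p in node.findall("para"))
    return f"<h1>{node.findtext('title')}</h1>{paras}"

def to_plain_text(node):
    # Render for a text feed: same content, different presentation.
    lines = [node.findtext("title").upper()]
    lines += [p.text for p in node.findall("para")]
    return "\n".join(lines)

print(to_html(root))
print(to_plain_text(root))
```

The discipline Mr. Shatzkin describes lives in the source file: once the tagging is consistent, each new "revenue opportunity" is just another rendering function, not another editorial project.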

XML and its antecedents have been around for 30 years. Anyone remember CALS or SGML? The metaphor of Copernicus' insights into how the solar system worked seems to suggest a new world view. Okay, but after Copernicus there was a period of cultural adjustment. I don't think the dead tree crowd has the luxury of time. My recollection is that the clock strikes midnight for the New York Times in a couple of months. Sam Zell has already embraced bankruptcy as a gentlemanly way of dealing with the economics of the dead tree business model. The Newsweek Magazine staff is working on résumés and Web logs, not the jumbo next issue. Heliocentrism is a nifty concept, but the timing is sobering: De revolutionibus orbium coelestium was finally delivered from the print shop only at the very end. Copernicus allegedly got the book right before he ascended to his caelum.

I think that it is too late for most of the dead tree outfits. Fitting in a way, I suppose: Copernicus died just as his insights became available to a clueless public… printed on paper. A possible symmetry exists between Mr. Shatzkin's reference to Copernicus and what has happened to most traditional publishers.

Stephen Arnold, December 16, 2008

Wall Street Journal Figures Out What Google Is Doing, Gets Criticized

December 15, 2008

The Wall Street Journal’s Vishesh Kumar and Christopher Rhoads stumbled into a hornet’s nest. I think surprise may accompany these people and their editor for the next few days. Their story is “Google Wants Its Own Fast Track on the Web.” The story is here at the moment, but it will probably disappear or be unavailable due to heavy click traffic. Read it quickly so you have the context for the hundreds of comments this story has generated. Pundits whose comments I found useful are the Lessig Blog, Om Malik’s GigaOM, and Google’s own comment here.

The premise of the article is that the GOOG wants to create what Messrs. Kumar and Rhoads call “a fast lane.” In effect, the GOOG wants to get preferential treatment for its traffic. The story wanders forward with references to network neutrality, which is probably going to die like a polar bear sitting on an ice chunk in the Arctic Circle. Network neutrality is a weird American term for rules designed to prevent a telco from charging people based on arbitrary benchmarks. The Bell Telephone Co. figured out long ago that differential pricing was the way to keep the monopoly in clover. The lesson has not been forgotten by today’s data barons. The authors drag in the president-elect and wrap up with the Google-coined phrase “OpenEdge.”

Why the firestorm? Here are my thoughts:

First, I prepared a briefing for several telcos in early 2008. My partner at the Mercer Island Group and I did a series of briefings for telecommunication companies. In that briefing, I showed a diagram from one of Google’s patent documents, enriched with information from Google’s technical papers. The diagram showed Google as the intermediary between a telco’s mobile customers and the Internet. In effect, with Google in the middle, the telco would get low-latency rendering of content in the Googleplex (my term for Google’s computer and software infrastructure). To a person, the groups snorted derision. I recall one sophisticated telco manager saying, in the jargon of the Bell head, “That’s crap.” I had no rejoinder because I was reporting what my analyses of Google patents and technical papers said. So, until this Wall Street Journal story appeared, the notion of Google becoming the Internet was not on anyone’s radar. After all, I live in Kentucky, and the Mercer Island Group is not McKinsey & Co. or Boston Consulting Group in terms of size and number of consultants. But MIG has some sharp nails in its toolkit.

Second, in my Google Version 2.0, which is mostly a summary of Google’s patent documents from August 2005 to June 2007, I reported on a series of five patent documents, filed the same day and eventually published on the same day by the ever-efficient US Patent & Trademark Office. The five documents disclosed a big, somewhat crazy system for sucking in data from airline ticket sellers, camera manufacturers, and other structured data sources. The invention figured out the context of each datum and built a great big master database containing the data. The idea was that some companies could push the data to Google. Failing that, Google would use software to fill in the gaps and therefore have its own master database. Bear Stearns was sufficiently intrigued by this analysis to issue a report to its key clients about this innovation. Google’s attorneys asserted that the report contained proprietary Google data, an objection that went away when I provided the patent document number and the url to download the patent documents. Google’s attorneys, like many Googlers, are confident but sometimes uninformed about what the GOOG is doing with one paw while the other paw adjusts the lava lamps.

Third, in my Beyond Search study for the Gilbane Group, I reported that Google had developed the “dataspace” technology to provide the framework for Google to become the Internet. Sue Feldman at IDC, the big research firm near Boston, was sufficiently interested to work with me to create a special IDC report on this technology and its implications. The Beyond Search study and the IDC report went to hundreds of clients and were ignored. The idea of a dataspace with metadata about how long a person looks at a Web page, and of meta-metadata to support queries about the lineage and certainty of data, was too strange.

What the Wall Street Journal has stumbled into is a piece of the Google strategy. My view is that Google is making an honest effort to involve the telcos in its business plan. If the telcos pass, then the GOOG will simply keep doing what it has been doing for a decade; that is, building out what I called in January 2008 in my briefings “Google Global Telecommunications”. Yep, Google is the sibling of the “old” AT&T model of a utility. Instead of just voice and data, GGT will combine smart software with its infrastructure and data to marginalize quite a few business operations.

Is this too big an idea today? Not for Google. But the idea is sufficiently big to trigger the storm front of comments. My thought is, “You ain’t seen nothing yet.” Ignorance of Google’s technology is commonplace. One would have thought that the telcos would take Google seriously by now. Guess not. If you want to dig into Google’s technology, you can still buy copies of my studies:

  1. The Google Legacy: How Google’s Internet Search Is Transforming Application Software, Infonortics, 2005 here
  2. Google Version 2.0: The Calculating Predator, Infonortics, 2007 here
  3. Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work, Gilbane Group, 2008 here

Bear Stearns is out of business, so I don’t know how you can get a copy of that 40 page report. You can order the dataspaces report directly from IDC. Just ask for Report 213562.

If you want me to brief your company on Google’s technology investments over the last decade, write me at seaky2000 at yahoo dot com. I have a number of different briefings, including the telco analysis and a new one on Google’s machine learning methods. These are a blend of technology analysis and examples in Google’s open source publications. I rely on my analytical methods to identify key trends and use only open source materials. Nevertheless, the capabilities of Google are–shall we say–quite interesting. Just look at what the GOOG has done in online advertising. The disruptive potential of its other technologies is comparable. What do you know about containers, janitors, and dataspaces? Not much I might suggest if I were not an addled goose.

Oh, let me address Messrs. Kumar and Rhoads: “You are somewhat correct, but you are grasping at straws when you suggest that Google requires the support and permission of any entity or individual. The GOOG is emerging as the first digital nation state.” Tough to understand, tough to regulate, and tough to thwart. Just ask the book publishers, I suggest.

Stephen Arnold, December 15, 2008

Google Recipes

December 15, 2008

Last week I showed some “in the wild” functions on Google. These are test pages on which certain Google features appear. Finding an “in the wild” service is a hit-and-miss affair. I was curious about the query “recipes”. On Wednesday, December 10, 2008, I ran the query, and the result was the ho-hum, regular Google laundry-list format. Today (Sunday, December 14, 2008), the query generated an interesting result page. First, the Programmable Search Engine drop-down box appears. Second, the source of the recipes is a Web site at http://allrecipes.com. Third, a hot link to a definition of recipes appears under the line about customized search results; for example, Results 1-10 of about 148,000,000 for recipes [definition]. (0.15 seconds). When I clicked the definition, I was directed here.


Advertisers may be willing to pay extra to be featured in the Google categories for their Web site or via the “definition” hot link. Add to this the insertion of AdWords into the drop-down suggestion box, and what have you got? Subtle monetization. The GOOG is going to hit its revenue targets by offering advertisers some very tasty ad options. Ads, like Web pages, are losing their zing. The GOOG is responding.

Stephen Arnold, December 15, 2008

Nstein and Taxonomy Improvement

December 14, 2008

Nstein Technologies is helping media companies like Scripps, Bonnier, and Time expand search taxonomies to return better search results. By customizing word relationships, Nstein uses semantics to categorize results in context. The goal is to increase user satisfaction: give users better results, and they are more likely to return to the Web site. To support the idea, Nstein redesigned its entire site, incorporating a custom taxonomy to increase reader satisfaction. Their example: “Stuffing” was added to the taxonomy, and an association was made between “Dressing” and “Stuffing,” so that no matter which keyword a reader chose, all relevant recipes would appear. Companies also are going further than custom taxonomies: they are adopting and expanding authority files (controlled lists of products, companies, locations, people, etc.). It all comes down to making search better.
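The “Stuffing”/“Dressing” association above can be sketched in a few lines: a query on either term is expanded through the taxonomy, so all relevant recipes surface. The data structure and matching logic below are invented for illustration, not Nstein's implementation.

```python
# Illustrative taxonomy-association query expansion.
# Associations and documents are hypothetical sample data.

# Symmetric associations between taxonomy terms.
ASSOCIATIONS = {"stuffing": {"dressing"}, "dressing": {"stuffing"}}

# Each document is tagged with taxonomy terms.
DOCUMENTS = {
    "Classic Bread Stuffing": {"stuffing"},
    "Cornbread Dressing": {"dressing"},
    "Green Bean Casserole": {"casserole"},
}

def search(term):
    # Expand the query with associated terms before matching tags.
    terms = {term.lower()} | ASSOCIATIONS.get(term.lower(), set())
    return sorted(title for title, tags in DOCUMENTS.items() if tags & terms)

print(search("Stuffing"))  # both the stuffing and the dressing recipe
```

Authority files slot into the same picture: a controlled list of product or company names gives the expansion step one canonical form per entity, so variant spellings match the same documents.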

Jessica Bratcher, December 14, 2008

Autonomy: The Next Big Thing

December 14, 2008

I enjoy the hate mail I get when I write about Autonomy’s news announcements. Some of my three or four readers think that I write these items for Autonomy. Wrong. I am reporting information that my trusty newsreader delivers to me. Here’s a gem that will get the anti-Autonomy crowd revved on a Sunday morning. The article appeared on SmartBrief.com as news. The headline was an attention grabber: “Autonomy at the Cutting Edge of New Multi-Trillion dollar Sector According to Head of Gartner Research.” You can read it here. The url is one of those wacky jobs that can fail to resolve. The core of the story for me is that Gartner has identified a “multi-trillion dollar sector.” That has to be good news to those who pay Gartner to make forecasts about markets. Search and content processing has been chugging along in the $1.3 to $3.0 billion range if one ignores the aberration that is Google. I find it hard to believe that Gartner’s financial forecasts can be spot on, but who knows? In case you want to know, a trillion is a one followed by a dozen zeros. The Gartner fellow with the sharp and optimistic pencil is identified as Peter Sondergaard, Senior Vice President, Gartner Research. The source, according to the news release, is an interview with an outfit called Business Spectator. I wonder if a few extra zeros were added as Mr. Sondergaard’s pronouncement was recorded? So, what does this forecast have to do with Autonomy? Autonomy said in its input to SmartBrief:

Autonomy Corporation plc, a global leader in infrastructure software for the enterprise, today announced that its vision of searching and analyzing structured and unstructured data has now been validated as the next big thing in business IT. According to an interview with Business Spectator, Peter Sondergaard, Senior Vice President, Gartner Research, predicts that the next quantum leap in productivity will come from the use of IT systems that analyze structured and unstructured data. Sondergaard says that Autonomy is at the cutting edge of the new search technology, a sector in the IT industry that will ultimately earn multi trillion dollar revenues.

The story appeared on PRNewswire and on one of the Thomson Reuters’ services. With economies tanking, I am delighted to know that the sector in which I work is slated to become a multi trillion dollar business. I hope I live long enough. Since laughter is a medicine that extends one’s life, I look forward to more Gartner forecasts and to Autonomy’s riding the crest of this predicted market boom.

Stephen Arnold, December 15, 2008

Expert System’s COGITO Answers

December 12, 2008

Expert System has launched COGITO Answers, which streamlines search and provides customer assistance on Web sites, e-mail and mobile interfaces such as cell phones and PDAs while creating a company knowledge base. The platform allows users to search across multiple resources with a handy twist: it uses semantic analysis to absorb and understand a customer’s lingo, analyzing the meaning of the text to process search results rather than just matching keywords. It interprets word usage in context. The program also tracks customer interactions and stores all requests so the company can anticipate client needs and questions, thus cutting response time and increasing accuracy. You can get more information by e-mailing answers@expertsystem.net.

Jessica Bratcher, December 12, 2008
