OneRiot Identifies Challenges to Monetizing Real Time Info
April 13, 2010
OneRiot’s Kimbal Musk has identified three challenges to monetizing real time information. His analysis appears in “Monetizing the Realtime Web” on the company’s blog. I agree that there is interest in real time information. Even SAS, the analytics giant, wants to hop on this fast moving content train. Law enforcement has long had an interest in knowing what’s going on, particularly in situations when mobile devices are used to pass messages quickly. The challenges are, however, formidable. Mr. Musk identifies these hurdles:
- “Real time targeting”; that is, knowing what message goes to whom at a particular point in time. In my experience, advertisers want to fire info rifle shots, not shotgun blasts. Real time targeting, however, can be computationally expensive.
- “Data is everything”; that is, individual messages must be processed and converted into meaningful information. Google has had this challenge gripped in its teeth for more than a decade, and many organizations are still struggling with it. There are cost and precision issues to resolve in addition to the technical challenges. Better metadata are needed to make some real time information useful to an advertiser.
- Advertisers have some learning to do. Missionary marketing is important, and some old expectations and habits can be difficult to change.
Mr. Musk also provides some color about OneRiot’s approach, which makes a useful case study.
The challenge is not just OneRiot’s. Google continues to tweak its presentation of real time results. Our research suggests that users skip over the real time results. Some topics don’t have real time results; others do. Traditional searchers, therefore, don’t see information consistently in result sets. Consistency is important.
The larger issue, in my opinion, is that some real time results lack context. Additional information may be needed to make sense of them, and injected content wrappers can provide the user with what is needed to interpret an otherwise cryptic or out of context item of information. A user who runs a query on a current event such as updates to the PGA tournament presumably has context. But even these messages may need framing.
Based on our research, injection and wrapper technology is available at this time, just not deployed. Real time information is likely to benefit when more than the terse message is presented. Smart software may be able to shoulder the burden, converting isolated items into mini news stories.
Whoever cracks this problem will have an edge in monetization because machine generated wrappers can carry attached ads, offering more advertising hooks.
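To make the idea concrete, here is a minimal sketch of what a machine generated wrapper might look like. This is not OneRiot’s or Google’s method; the context lookup, the field names, and the sample message are all hypothetical.

```python
# A minimal sketch of a machine generated "wrapper" for a terse real time
# item. The context lookup, field names, and sample message are hypothetical,
# not any vendor's actual pipeline.

def build_context(topic: str) -> str:
    # A real system would query a knowledge base or news index here.
    stub_context = {
        "PGA": "PGA Tour event in progress; scores update hole by hole.",
    }
    return stub_context.get(topic, "No background available.")

def wrap_item(message: str, topic: str) -> dict:
    """Attach background context and an ad hook to a raw message."""
    return {
        "raw_message": message,                   # the terse real time item
        "context": build_context(topic),          # framing for the reader
        "ad_slot": "sponsored:" + topic.lower(),  # the monetization hook
    }

print(wrap_item("Westwood -12 thru 14", "PGA"))
```

The ad_slot field is the point: once an item carries machine generated context, an ad can be matched to that context rather than to the cryptic raw message.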
Stephen E Arnold, April 13, 2010
Unsponsored post.
Google Snags Programmable Search Engine Patent
April 11, 2010
Short honk: The programmable search engine invention has been granted a US patent. Filed in August 2005 and published in February 2007, the PSE provides a glimpse of Google’s systems and methods for performing sophisticated content processing. Dr. Ramanathan Guha, inventor of the PSE, has a deep interest in data management, the semantic Web, and context tagging. You can download a copy of US7693830 from the USPTO. Four other PSE patent applications were published on the same day in February 2007, a testament to Dr. Guha’s ability to invent and write complex patent applications in a remarkably short period of time. The PSE is quite important, with elements of the invention visible in today’s Google shopping service, among others.
Stephen E Arnold, April 9, 2010
Unsponsored post.
Boston Search Engine Meeting and Exalead
April 9, 2010
The Evvie Award recognizes outstanding work in the field of search and content processing. Ev Brenner, one of the original founders of the Boston Search Engine Meeting, emphasized the need to acknowledge original research and innovative thinking. After Mr. Brenner died, the Boston Search Engine Meeting, then owned by a company in the UK, instituted the Evvie award. This year, Exalead, one of the leaders in search-based applications, and ArnoldIT.com are sponsoring the award. In addition to a cash recognition of $1,000, the recipient receives the Evvie shown below.
For more information about the premier search and content processing conference, navigate to the Search Engine Meeting Web site. You can review the program and pre-conference activities.
For more information about Exalead, navigate to the Exalead Web site. You can see a demonstration of the Exalead system on the ArnoldIT.com site here and you can explore next generation search and content processing innovations at Exalead’s “labs” site.
For more information about the award, click here.
Stephen E Arnold, April 9, 2010
This post is sponsored by ArnoldIT.com, Exalead, and Information Today, Inc.
AIIM Report on Content Analytics
March 30, 2010
A happy quack to the reader who sent me a link, available from the Allyis Web site, for the report “Content Analytics – Research Tools for Unstructured Content and Rich Media”. If you are trying to figure out what about 600 AIIM members think about the changing nature of information analysis, you will find this report useful. I flipped through the 20 pages of data from what strikes me as a somewhat biased sample of enterprise professionals. Your mileage may vary, of course.

One quick example. In Figure 4 (“How would you rate your ability to research across the following content types?”) on page 7, the respondents report that they are pretty good at searching customer support logs. The respondents are also confident of their ability to search “case files” and “litigation and legal reports.” My research suggests that these three areas are real problems in most organizations. I am not sure how this sample interprets their organizations’ capabilities, but I think something is wacky. How can, for example, a general business employee assess the ease with which litigation content can be researched? Lawyers are the folks who have the expertise. At any rate, another flashing yellow light is the indication that the respondents have a tough time searching for press articles and news along with collateral, brochures, and publications. This is pretty common content, and an outfit that can search “case files” should be able to locate a brochure. Well, maybe not?
There were three findings that I found interesting, but I am not ready to bet my bread crust on the solidity of the data.
First, consider Figure 14 (“What are your spending plans for the following areas in the next 12 months?”). The top dog is enterprise search applications. This should give some search vendors the idea to market to the AIIM membership.
Second, respondents, according to the Key Findings, can find information on the Web more easily than they can find information within their organization. This matches what Martin White and I reported in our 2009 study Successful Enterprise Search Management. It is clear that this finding underscores the wackiness in Figure 4, page 7.
Finally, the Conclusion, page 15 states:
The benefits of investment in Finance and ERP systems have only come to the fore with the increasing power of Business Intelligence (BI) reporting tools and the insight they provide for business managers. In the same way, the benefits of Content Management systems can be much more heavily leveraged by the use of Content Analytics tools.
I don’t really understand this paragraph. Finance has been stretched by the present economic climate. ERP is a clunker. Content management systems are often quite problematic. So what’s the analysis? How about cost overruns?
I tucked the study into my reference file. You may want to do the same. If the Allyis link goes dead, you can get the report directly from AIIM but you may have to join the association.
Stephen E Arnold, March 31, 2010
Like the report, a freebie.
SAS Teragram in Marketing Push
March 25, 2010
Two readers on two different continents sent me links to write ups about SAS Teragram. As you may know, SAS has been a licensee of the Inxight technology for various text processing operations. Business Objects bought Inxight, and then SAP bought Business Objects. I was told a year or so ago that there was no material change in the way in which SAS worked with Inxight. Not long after I heard that remark, SAS bought the little-known specialist content processing firm, Teragram. Teragram, founded by Yves Schabes and a fellow academic, landed some big clients for the firm’s automated text processing system. These clients included the New York Times and, I believe, America Online.
Teragram has integrated its software with Apache Lucene, and the company has rolled out what it calls a Sentiment Analysis Manager. The idea behind sentiment analysis is simple. Process text such as customer emails and flag the ones that are potential problems. These “problems” can then be given special attention.
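To show the simple idea in code, here is a minimal sketch of lexicon-based flagging. This is an illustration only, assuming a toy negative-term list; Teragram’s Sentiment Analysis Manager uses far more sophisticated linguistics than this.

```python
# A minimal sketch of sentiment flagging for customer email, assuming a
# simple negative-term lexicon. Illustration only; not Teragram's method.

NEGATIVE_TERMS = {"refund", "broken", "cancel", "angry", "lawsuit", "worst"}

def flag_problem(email_text: str, threshold: int = 2) -> bool:
    """Flag an email when it contains enough negative terms."""
    words = {w.strip(".,!?").lower() for w in email_text.split()}
    return len(words & NEGATIVE_TERMS) >= threshold

emails = [
    "The product is broken and I want a refund now.",
    "Thanks, the new release works great.",
]
for msg in emails:
    print(flag_problem(msg), "-", msg)  # True for the problem email
```

The flagged messages are the ones that would be routed for special attention.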
The first news item I received from a reader was a pointer to a summary of an interview with Dr. Schabes on the Business Intelligence Network. Like ZDNet and Fierce Media, it is a pay-for-coverage service. The podcasts usually reach several hundred people, and the information is recycled in print and as audio files. The article was “Teragram Delivers Text Analytics Solutions and Language Technologies.” You can find a summary in the write up, but the link to the audio file was not working when I checked it (March 24, 2010, at 8 am Eastern). The most interesting comment in the write up, in my opinion, was:
Business intelligence has evolved from a field of computing on numbers to actually computing on text, and that is where natural language processing and linguistics comes in… Text is a reflection of language, and you need computational linguistics technologies to be able to turn language into a structure of information. That is really what the core mission of our company is to provide technologies that allow us to treat text at a more elaborate level than just characters, and to add structure on top of documents and language.
The second item appeared as “SAS Text Analytics, The Last Frontier in the Analysis of Documents” in Areapress. The passage I noted in that write up was this list of licensees:
Associated Press, eBay, Factiva, Forbes.com, Hewlett Packard, New York Times Company, Reed Business Information, Sony, Tribune Interactive, WashingtonPost.com, Wolters Kluwer, Yahoo! and the World Bank.
I am not sure how up to date the list is. I heard that the World Bank recently switched search systems. For more information about Teragram, navigate to the SAS Web site. Could this uptick in SAS Teragram marketing be another indication that making sales is getting more difficult in today’s financial climate?
Stephen E Arnold, March 25, 2010
A no fee write up. I will report this sad state of affairs to the IMF, which it appears is not clued in like the World Bank.
IBM and Its Do Everything Strategy
March 24, 2010
I read an unusual interview with Steve Mills. The story was “Q&A: IBM’s Steve Mills on Strategy, Oracle, and SAP.” What jumped out at me was that there was no reference to Google that I noticed. Odd. Google seems to be ramping up in the enterprise sector and poised to compete with just about everyone in the enterprise software and services market. When I noticed this, I decided to work through the interview to see the rationale for describing companies that are struggling with many “push back” issues from customers, resellers, and partners. The hassles Oracle is now enduring with regard to open source and SAP’s service pricing fluctuations are examples of companies struggling to deal with changing market needs.
Please read the original interview because I am comfortable highlighting only three comments in a blog post.
First, Mr. Mills said:
Our technology delivers important elements of the solution, but there are often third-party application companies that add to that solution. No one vendor delivers everything required. The average large business, if you went into their compute centers around the world, runs 50,000 to 60,000 programs that are part of 2,000 to 4,000 unique applications.
Yes, and it is the cost and complexity of the IT infrastructure in those companies today that are creating pressures on the CFO, the users, and stakeholders. IBM’s engineers helped create the present situation, and the company is now in a position where those customers are likely to look for lower cost, different types of options. If I have a broken auto, would I go to the mechanic who failed to make the repair on an earlier visit? I would seek a new mechanic, but perhaps IBM’s cash rich customers don’t think the way I do.
Second, Mr. Mills offered this “fact”:
But in the enterprise, for every dollar invested in ERP, there will be five dollars of investment made around that ERP package to get it fully implemented, integrated, scaled and running effectively.
My view is that the time value of these dinosaur-like applications is likely to come under increasing pressure from new hires. The younger engineers are more comfortable with certain approaches to computing. Over time, the IBM “factoid” will be converted into a question like, “If we shift to Google Apps, perhaps we could save some money?” The answer would require verification, but if the savings are accurate, the implications for Oracle and SAP are significant. I think IBM will either have to buy its way into the cloud and “try to make up the revenue delta” on volume or find itself in the same boat as other “old style” enterprise software vendors.
Third, Mr. Mills stated:
It’s money. That’s the No. 1 motivator. And money is not a single-dimensional factor because there’s short-term money, long-term money and money described in broader value terms versus the cost of a product. The surrounding costs are far in excess of products. Every month, customers convert from Oracle to DB2. Why do they do that? Well, Oracle is expensive. Oracle tries to use pricing power to capture a customer and then get the customer to keep on paying. Oracle raises its prices constantly. Oracle does not provide a strong support infrastructure. There are many customers who have decided to move away from Oracle across a variety of products because of those characteristics.
I agree. The implication is that IBM is a low cost option. Well, maybe in some other dimension which the addled goose cannot perceive. My view is that time, value, and cost will conspire to create a gravity well into which the IBM-like companies will be sucked. IBM’s dalliance with open source, its adherence to its services model, and its reliance on acquisitions to generate revenue may lose traction in the future.
And finding stuff in IBM systems? Not mentioned. Also interesting.
I don’t know when, but IBM’s $100 billion in revenue needs some oxygen going forward. The race is not a marathon. It’s more like a 200 or 440. Maybe Google will be in the race? Should be interesting.
Stephen E Arnold, March 24, 2010
No pay for this write up. I will report this to the GSA, which has tapped IBM to build its next generation computing infrastructure. I think IBM will be compensated for this necessary work.
Google Bombshell: Links to Intelligence Services Alleged
March 22, 2010
I was plonking along looking at ho hum headlines when I spotted “Chinese Media Hits Out at Google, Alleges Intelligence Links”. The addled goose does not know anything about this source or about the subject of the article. But the addled goose is savvy enough to know that if this story is true, it is pretty darned important. The main point of the story in the Economic Times / India Times is:
Xinhua said in an editorial: “Some Chinese Internet users who prefer to use Google still don’t realize perhaps that due to the links between Google and the American intelligence services, search histories on Google will be kept and used by the American intelligence agencies.”
Okay, that’s interesting. Several years ago, I heard a talk by a citizen in Washington, DC who made a similar comment. My recollection is that Google was pretty darned mad. I wondered if the citizen in Washington, DC was right or wrong. If another source comes up with more detail, the story becomes much more interesting.
Chinese intelligence agents are pretty savvy. And the Ministry of State Security is one of the best. I can’t remember whether Section 6 is the go-to bunch, but perhaps more information will surface.
Stephen E Arnold, March 22, 2010
A freebie. I will report non payment to the DC Chief of Police, who is really clued into Google’s activities in Washington.
Attensity in PR Full Court Press
March 2, 2010
Risking the quacking of the addled goose, Attensity sent me a link to its “new” voice of the customer service. I have been tracking Attensity’s shift from deep extraction for content processing to customer support for a while. I posted on the GlobalETM.com site a map of search sectors, and Attensity is wisely focusing on customer support. You can read the “new” information about customer support at the company’s VOC Community Advantage page. The idea is to process content to find out if customers are a company’s pals. Revenues and legal actions can be helpful indicators too.
What interested me was the link to the Attensity blog post “Leveraging Communities through Analytic Engines”, which presents an argument that organizations have useful data that can yield insights. I found this passage interesting:
Analytical engines cannot stop at simply producing a report for each community; they have to become a critical part of the platform used by the organizations to interact with and manage their customers. This platform will then integrate the content generated by all channels and all methods the organization uses to communicate, and produce great insights that can be analyzed for different channels and segments, or altogether. This analysis, and the subsequent insights, yield far more powerful customer profiles and help the organization identify needs and wants faster and better. Alas, the role of analytical engines for communities is not to analyze the community as a stand-alone channel, although there is some value on that as a starting point, but to integrate the valuable data from the communities into the rest of the data the organization collects and produce insights from this superset of feedback.
Now this is an interesting proposition. The lingo sounds a bit like that cranked out by the azure chip crowd, but that is what many search and content processing vendors do now: wordsmithing.
An “analytical engine”, obviously one like Attensity’s, becomes an integration service. In my opinion, this elevation of a component of text processing to a much larger and vital role sounds compelling. The key word for me is “superset”. This notion of taking a component and popping it up a couple of levels is what a number of vendors are pursuing. Search is not finding. Search is a user experience. Metatagging is not indexing. Metatagging is the core function of a content management system.
I understand the need to make sales, and as my GlobalETM.com diagram shows, the effort is leading to marketing plays that position search and content processing technologies as higher value solutions. From a marketing point of view, this makes sense. The problem is that most vendors are following this path. What happens is that the technical plumbing does one or two things quite well and then some other things not so well.
Many vendors run into trouble with connectors or performance or the need for new coding to “hook” services together. Setting Attensity aside, how many search and content processing vendors have an architecture that can scale economically, quickly, and efficiently? In my experience, scaling, performance, and flexibility – not the marketing lingo – make the difference. Just my opinion.
Stephen E Arnold, March 2, 2010
No one paid me to write this. I suppose I have to report poverty to the unemployment folks. Ooops. Out of money like some of the search and content processing vendors.
Twitter and Mining Tweets
February 21, 2010
I must admit, I get confused. There is Twitter, TWIT (a podcast network), TWIST (a podcast from another me-too outfit), and “tweets”. If I am confused, imagine the challenge of processing and then analyzing short messages.
Without context, a brief text message can be opaque to someone my age; for example, “r u thr”. Other messages say one thing (“at the place, 5”) and mean something else to an insider (“Mary’s parents are out of town. The party is at Mary’s house at 5 pm.”).
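To illustrate in a toy way why these messages resist machine processing, here is a minimal sketch of shorthand expansion. The lookup table is hypothetical; the point is that the insider meaning of “at the place, 5” is not in the text at all, so no dictionary recovers it.

```python
# A minimal sketch of why short messages resist text processing. The
# shorthand table is a hypothetical stand-in for the shared context
# that human readers carry around.

SHORTHAND = {
    "r": "are",
    "u": "you",
    "thr": "there",
}

def expand(message: str) -> str:
    """Expand known shorthand tokens; leave unknown tokens alone."""
    return " ".join(SHORTHAND.get(tok, tok) for tok in message.split())

print(expand("r u thr"))          # -> "are you there"
print(expand("at the place, 5"))  # insider meaning remains unrecoverable
```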
When I read “Twitter’s Plan to Analyze 100 Billion Tweets”, several thoughts struck me:
- What took so long?
- Twitter is venturing into some tricky computational thickets. Analyzing tweets (the word for the 140-character messages sent via Twitter, not to be confused with “twits”, members of the TWIT podcast network) is not easy.
- Non US law enforcement and intelligence professionals will be paying a bit more attention to the Twitter analyses because Twitter’s own outputs may be better, faster, and cheaper than setting up exotic tweet subsystems.
- Twitter makes clear that it has not analyzed its own data stream, which surprises me. I thought these young wizards were on top of data flows, not sitting back and just reacting to whatever happens.
According to the article, “Twitter is the nervous system of the Web.” This is a hypothesis, and I am not sure I buy the assertion. My view is that Google’s more diverse data flows are more useful. In fact, the metadata generated by observing flows within Buzz and Wave are potentially a leapfrog. Twitter is a bit like one of those Faith Popcorn-type projects. Sniffing is different from getting the rare sirloin in a three star eatery in Lyon.
The write up points out that Twitter will use open source tools for the job. There are some juicy details of how Twitter will process the traffic.
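This post does not name the tools, so the following is only a guess at the general shape of the job: a map-reduce style term count of the sort open source frameworks such as Hadoop made routine. Everything here, from the framework choice to the sample tweets, is my assumption, sketched in plain Python for readability.

```python
# A minimal sketch of the kind of batch aggregation an open source
# map-reduce framework might run over a large tweet archive. The
# framework choice (Hadoop-style) and sample tweets are assumptions.

from collections import Counter
from itertools import chain

tweets = [
    "google buzz launches",
    "buzz and wave compared",
    "wave invites available",
]

def map_phase(tweet: str) -> list:
    # Emit one (term, 1) pair per token, as a mapper would.
    return [(tok, 1) for tok in tweet.split()]

def reduce_phase(pairs) -> Counter:
    # Sum counts per term, as reducers would after the shuffle.
    counts = Counter()
    for term, n in pairs:
        counts[term] += n
    return counts

print(reduce_phase(chain.from_iterable(map_phase(t) for t in tweets)))
```

At 100 billion tweets the same logic runs distributed across a cluster, which is presumably where the open source tooling earns its keep.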
A useful write up.
Stephen E Arnold, February 22, 2010
No one paid me to write this article. I will report non payment to the Department of Labor, where many are paid for every lick of work.
Global ETM Dives into Enterprise Search Intelligence
February 18, 2010
Stephen E Arnold has entered into an agreement with Global Enterprise Technology Management, an information and professional services company in the UK. Mr. Arnold has created a special focus page about enterprise search on the Global ETM Web site. The page is now available, and it features:
- A summary of the principal market sectors in enterprise search and content processing. More than a dozen sectors are identified. Each sector is plotted in a grid using Mr. Arnold’s Knowledge Value Methodology. You can see at a glance which search sectors command higher and lower Knowledge Value to organizations. (Some of the Knowledge Value Methodology was described in Martin White’s and Stephen E. Arnold’s 2009 study Successful Enterprise Search Management.)
- A table provides a brief description of each of the search market sectors and includes hot links to representative vendors with products and services for that respective market sector. More than 30 vendors are identified in this initial special focus page.
- The page includes a selected list of major trends in enterprise search and content processing.
Mr. Arnold will be adding content to this Web page on a weekly schedule.
Information about GlobalETM is available from the firm’s Web site.
Stuart Schram IV, February 18, 2010
I am paid by ArnoldIT.com, so this is a for-fee post.