Lingospot: Technology for Publishers
July 10, 2009
I have been thinking about the problems facing traditional publishing companies. Some pundits suggest that publishers need to solve their problems with technology. My view is that publishers must think about technology in a way that makes sense for their company’s business and work processes. I have come across two companies that have been providing publishers with quite sophisticated technology. The management of these two firms’ clients gets the credit. The technology vendors are enablers.
I provided some recent information about Mark Logic Corporation in my summary of my presentation to the Mark Logic users’ group a few weeks ago. You can refresh your memory about Mark Logic’s technology by scanning the company’s Web site.
The other company is Lingospot. Among its customers are Forbes Magazine and the National Newspaper Association. The company offers hosted software solutions. Publishing companies comprise some of Lingospot’s customers, but the firm has deals with marketing firms as well.
The company describes its technology in this way:
Lingospot’s patented topic recognition and ranking technology is the result of more than eight years of development and four years of machine learning. To date, our algorithm has identified over 30 million unique topics drawn from more than two billion pages that we have crawled and analyzed. During the last four years, we have collected over five billion data points on such topics, including the context in which readers have chosen to interact with each topic. What does all this mean for our clients? By partnering with Lingospot you have access to the leading topic recognition, extraction and ranking technology, as well as the accumulated machine learning of our platform. This translates into a more engaging experience for your readers and substantially higher metrics and revenue for you.
My understanding of this technology is that Lingospot can automatically generate topic pages from a client’s content and then handle syndication of the content. The Lingospot system works with text, images, and video.
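Lingospot has not published its algorithm, so here is only a minimal sketch of the general topic-extraction idea (counting and ranking candidate phrases across documents), not the company’s patented method. The documents and the capitalized-phrase heuristic are my own toy assumptions:

```python
import re
from collections import Counter

def extract_topics(documents, top_n=5):
    """Rank candidate topics by frequency across a set of documents.

    A toy illustration of topic extraction: capitalized multi-word
    phrases stand in for recognized topics. A production system would
    use trained entity recognition, disambiguation, and reader-behavior
    signals of the sort Lingospot describes.
    """
    counts = Counter()
    for text in documents:
        # Runs of capitalized words serve as crude topic candidates.
        for match in re.findall(r"(?:[A-Z][a-z]+ ?)+", text):
            counts[match.strip()] += 1
    return [topic for topic, _ in counts.most_common(top_n)]

docs = [
    "Forbes Magazine covers Silicon Valley startups.",
    "Silicon Valley firms court Forbes Magazine readers.",
]
print(extract_topics(docs, top_n=2))
```

Each extracted topic could then anchor an auto-generated topic page that aggregates the matching documents.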
The company is based in Los Angeles. It was founded by Nikos Iatropoulos, who was involved with several other marketing-centric companies. He worked for Credit Suisse and, like the addled goose, did a stint with Booz, Allen & Hamilton. His co-founder is Gerald Chao, the company’s chief technical officer. Prior to founding Lingospot, Mr. Chao served as a senior programmer at WebSciences International and as a programmer at IBM. He holds an MS in Computer Science and a PhD in statistical natural language processing, both from UCLA.
Publishers are embracing technology. My hunch is that the business models need adjustment.
Stephen Arnold, July 10, 2009
Kartoo Adds New Interface Functions
July 9, 2009
Kartoo has added some features to its interface. If you have not visited the site for a while, you will want to navigate to the Kartoo main page and set your preferences for this Flash-based metasearch system. The interface has visual impact, but an addled goose like me wanted pop-up explanations of the icons. The options page looks like this:
Now enter your query in the search box at the top of the page. Unlike the Kartoo interface of the past, you have a larger, cleaner presentation of the relevant hits. When you hover over an icon, Kartoo displays a relationship line. For the query “US financial crisis” the system displayed these results:
When you click on one of the thumbnail images, Kartoo sends you to the source site. If you hover, Kartoo displays a pop up with a text snippet.
In the left column of the interface are two buttons. You can select what supplementary content you want to see. I selected topics, allowing me quick access to only those hits about one of the identified categories. I also instructed the system to show me images. You can see the images, which are presented in low resolution, in the scrollable sidebar below the topics.
Kartoo Technologies is based in Paris. The company has been one of the firms pushing the envelope in search interface designs and controls. Information about the company’s products and technologies may be found on the Kartoo corporate Web site. The company now has more than 200 customers who use the firm’s technologies for visualization and intelligence monitoring. The Kartoo teams are located in Clermont-Ferrand, France.
Stephen Arnold, July 9, 2009
Mobile Content: A Really Big Market
July 9, 2009
Short honk: A consulting firm with which I am not familiar calculated the size of the mobile content market. You can find the news story “Mobile Content to Top $340 billion in 2013” here. In addition to the startling figure, the article said:
The billing relationship operators have with subscribers, marks them apart from other companies, allowing them to exert significant levels of control over what content the majority of subscribers have access to.
Yep, billing relationships and customer control. Big opportunities. Will the telcos muff the bunny or put the bunny in the hutch?
Stephen Arnold, July 9, 2009
The Guardian Sees Opportunity in User Generated Content
July 9, 2009
The Guardian is taking a week off from its nibbling at Googzilla’s paws. I found “User-Generated Content Is Only the Beginning” here to be a gust of early autumn wind. The essay is by Jeff Jarvis, author of WWGD. He wrote:
Bild’s [a newspaper] partnering with readers is part of a trend that this paper’s editor, Alan Rusbridger, is calling the “mutualisation of news”. Last week, he talked with staff about the concept, using as examples the paper’s own collaboration with readers. The separation between reporter and reader, he said, blurs as they work together. That is the future – and natural state – of media; collaboration not just in content creation, but now in advertising as well.
The idea is that publishing companies can tap into the stem cell growth of user generated content. Sounds good. My thought is that those autumn winds, no matter how warm and gentle, come before the long winter. Where is my parka?
Stephen Arnold, July 9, 2009
Chrome: A Shiny Wrap for the Plumbing of the Google
July 8, 2009
Short honk: The other shoe has fallen. I can’t compete with the hundreds of analyses of Google’s announcement that its Chrome browser is not really a browser after all. Not much of a news flash for the goslings at Beyond Search’s headquarters in rural Kentucky. Quite a few Google watchers were surprised, and that is encouraging because Google has other shoes to drop. A few are not even shoes. Think steel-toed boots that will make a big noise in the database world when the Google decides the time is propitious to upset the well-ordered lives of IBM and Oracle.
For the moment, however, I want to point you to TechMeme’s excellent write up “Google Drops a Nuclear Bomb on Microsoft and It’s Made of Chrome”. For me, the key sentence in the article was:
What Google is doing is not recreating a new kind of OS, they’re creating the best way to not need one at all.
There you have it. An open source alternative to Microsoft’s desktop operating system. Windows Mobile never had a chance against Android. The fine Vista product gave Google time to snap together bits and pieces to create a non-operating system. Now that Windows 7 is getting press and the Wall Street SlimFast-gulping wizards are salivating over the Windows 7 revenue bonanza, the Google tosses a sneaker from the math club window. It hits on a slow news day and bounces up to give Mr. Redmond a nosebleed. Ouch.
Let me offer several addled goose observations:
- The open source angle is a nifty one. Google does the “community” a service and automatically enlists those math club members worldwide in a mission to snap a chastity belt on Microsoft’s promiscuous software.
- The earlier notion of Chrome as a browser wows the azure chip consultants, the Google pundits, and the legions of analysts who bought into the idea that the browser wars were back and competition among browsers was a good thing for everyone. Close but no cigar. Chrome is code shim on steroids.
- With Chrome, Google takes one step closer to the 21st century embodiment of the full service digital Swiss Army knife. Viewed from one angle, Google is a publisher. Viewed from another, Google offers an operating system, embraces and validates cloud computing, and makes it financially more difficult for a competitor or a government to duplicate the Google’s “as is” infrastructure. (The British government wants to use the Google instead of its bungled national health information service. Sure, the Microsoft HealthVault will get a piece of the action, but this is a Google-centric play, not a Microsoft-centric play from what I hear.)
To wrap up, the Google is now officially on the march. Microsoft is just a way station on this modern day pilgrimage. What is at the end? I think Google needs a flag because it now looks more like an emerging nation than a services and software company. Chrome is an “as is” play. Dataspaces are “to be” and qualify as a steel-toed boot. If you think Microsoft is the only target, you may want to step back and look at the larger universe of computing services, in my opinion.
Stephen Arnold, July 8, 2009
SharePoint Sunday: Best Practices Series
July 8, 2009
An interesting series of Web log articles lit up my radar screen. You can find the first installment of “Best Practices of SharePoint Farm Configuring and Deployment” here. SharePoint Magazine plans six segments:
- Part 1 – Architecture and Logical Planning
- Part 2 – Installation is here.
- Part 3 – Development Environment
- Part 4 – Backup and Recovery Strategy
- Part 5 – Virtualization
- Part 6 – Post Deployment.
The diagram below provides a visual reminder of why such best practices write-ups are essential for SharePoint. The system is complicated, and one can argue over lunch whether the Google Wave demo is more or less complicated. I know for certain that SharePoint is a hairy beastie. Your mileage may differ, and Microsoft Certified Partners will insist that SharePoint is not too tricky. Just add a dash of hoisin sauce to the MVP’s pre-invoice appetizer.
Source: Michael Nemtsev
The author, Michael Nemtsev, is a Microsoft MVP and an expert in SharePoint and .NET technologies. With a strong software engineering and development background, Michael has been leading development projects for industry giants like Microsoft, IBM, and Dell since 2000. Based in Sydney, Australia, Michael is a SharePoint consultant for Readify, Australia’s industry-recognized business technology company. Michael runs the “SharePoint Tips & Tricks” site http://sharepoint-sandbox.com, writes a SharePoint blog, and tweets actively (via @laflour).
I will do my addled goose best to update the links as I locate them. If I miss one, please use the comments section of this Web log to point the two or three readers I have to the source material.
Stephen Arnold, July 8, 2009
Overflight for Attensity
July 8, 2009
Short honk: ArnoldIT.com has added Attensity to its Overflight profile service. You can see the auto generated page here. We will be adding additional search and content processing companies to the service. No charge, and this is a version of the service I use when those who hire the addled goose ask me to prepare competitive profiles. I have a list of about 350 search and content processing vendors. I will peck away at this list until my enthusiasm wanes. If you want a for fee analysis of one of these companies, read the About section of this Web log before contacting me. Yep, I charge money for “real” analysis. Some folks expect me to survive on my good looks and charming personality. LOL.
Stephen Arnold, July 8, 2009
Profit in Data Theft
July 8, 2009
I read Dancho Danchev’s “Microsoft Study Debunks Profitability of the Underground Economy” here. I have not read the Microsoft study, but I want to recommend both documents to my addled geese and you, gentle reader. The arguments strike me as germane to online information and data. Mr. Danchev’s view is that there is money in underground endeavors. Check out the article. My question is, “If there weren’t money in underground plays, why would there be so many efforts?”
Stephen Arnold, July 8, 2009
Elsevier Policy Change for Sponsored Publications
July 8, 2009
Short honk: According to CBC News, “Publisher Elsevier to Revise Journal Sponsorship Rules” here. The CBC said:
“Elsevier will review practices related to all article reprint, compilation or custom publications and set out guidelines on content, permission, use of imprint and repackaging to ensure that such publications are not confused with Elsevier’s core peer-reviewed journals and that the sponsorship of any publication is clearly disclosed,” the company said in a statement on Thursday. The company expects to complete its review and issue new guidelines by June 30. An internal review showed a series of publications produced in Australia between 2000 and 2005 used the words “Journal of” in their name but lacked proper disclosure of sponsors, were not, in fact, journals and should not have been titled as such, the company said.
A shift in advertorials and their attendant revenue is underway. No word about the impact on those researchers who were unaware of the difference between peer review and sponsored publications. A somewhat spicy commentary is here.
Stephen Arnold, July 8, 2009
Hewlett Packard Extreme Scale Out for Extreme Cost
July 7, 2009
I read CBR Online’s June 11, 2009, article “HP Unveils new Portfolio for Extreme Scale-Out Computing” here. I decided to think about the implications of this announcement before offering some observations relative to the cost of infrastructure for search, content processing, and text mining licensees. The CBR Online staff did a good job of explaining the technical innovations of the system. One point nagged at me, however. I jotted down the passage that I have been thinking about:
HP said that the new ProLiant SL server family with consolidated power and cooling infrastructure and air flow design uses 28% less power per server than traditional rack-based servers. Its volume packaging also reduces the acquisition costs for customers who require thousands of server nodes.
Now this type of savings makes perfect sense. For example, if an online service were to experience growth to support 3.5 billion records and deliver 200 queries per second, under traditional search software and server architecture, the vendor may require up to 1,000 servers (dual processor, dual core, 32 GB of RAM). Lots of servers translates to lots of money. Let’s assume that energy costs consume about 15 percent of an information technology group’s budget.
A 28 percent reduction in power would trim only about four cents from each IT budget dollar (28 percent of the 15 percent that goes to power). But what’s the cost of these new servers? HP hasn’t made prices available on its Web site. I was invited to call an 800 number to talk with an HP sales engineer, however. The problem is that information is growing more rapidly than the 28 percent savings, so I think this type of hardware solution is a placeholder. The math can be fun to work out, but the cost savings won’t be material because exotic servers are incrementally better than what’s available as a white box. Google figured this out a long time ago and has looked for power savings in many different places. These include on-board batteries instead of uninterruptible power supplies and software efficiencies. Using commodity gizmos gives Google, as I reported in my BearStearns reports, a significant cost advantage that is not a few cents on the dollar but a four-X reduction; that is, $1.00 becomes $0.25. Google has an efficiency weapon that cannot be countered with name-brand hardware delivering modest improvements. Just my opinion, gentle reader, except that BearStearns paid me for my research and then published it in two reports in 2006 and 2007. Old news, I know, but the newer stuff is available in Google: The Digital Gutenberg here.
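The arithmetic behind my skepticism is simple enough to check. Using the assumptions stated above (power is 15 percent of the IT budget; HP cuts power 28 percent; commodity infrastructure cuts total cost four-X), the two levers are not in the same league:

```python
# Back-of-envelope comparison of the two cost levers, using the
# assumptions in the text: power is 15% of the IT budget, HP's new
# servers cut power 28%, and a Google-style commodity approach cuts
# total cost by a factor of four ($1.00 becomes $0.25).
power_share = 0.15
hp_power_cut = 0.28

savings_per_dollar_hp = power_share * hp_power_cut  # about $0.042
savings_per_dollar_commodity = 1.00 - 0.25          # $0.75

print(f"HP power savings per budget dollar:  ${savings_per_dollar_hp:.3f}")
print(f"Four-X commodity savings per dollar: ${savings_per_dollar_commodity:.2f}")
```

Roughly four cents versus seventy-five cents per budget dollar, which is why modest power efficiency strikes me as a placeholder.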
PC World shed some light on the configuration options of these servers. James Niccolai’s “HP Designs New Server for Extreme Scale Out Computing” here said:
HP introduced three ProLiant SL models, all 2u deep, but it emphasized that the configurations are flexible, and HP will even work with large customers to design specific server boards, said Steve Cumings, an HP marketing director. The SL160z is for memory- and I/O-intensive applications and comes with up to 128GB of DDR3 memory. The SL170z is a storage-centric system that can hold six 3.5-inch drives. And the SL2x170z, for maximum compute density, can hold up to four quad-core processors, for a total of 672 processor cores in a 42u rack.
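The 672-core figure checks out if one assumes two half-width nodes per 2u chassis, each node holding the four quad-core processors mentioned. That node packaging is my assumption; the article gives only the chassis height, the processor count, and the rack total:

```python
# Sanity-check PC World's density figure for the SL2x170z. The
# two-nodes-per-2u-chassis packaging is an assumption on my part;
# the article states only "up to four quad-core processors" and
# "672 processor cores in a 42u rack".
rack_units = 42
chassis_height_u = 2
nodes_per_chassis = 2      # assumed half-width trays
processors_per_node = 4    # "up to four quad-core processors"
cores_per_processor = 4

cores_per_rack = (rack_units // chassis_height_u) \
    * nodes_per_chassis * processors_per_node * cores_per_processor
print(cores_per_rack)  # 672, matching the article
```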
I found the write up “HP High Efficiency Cloud Infrastructure Servers, Moving Away from Blades, Learning from Data Center Operation” in Green Data Center Blog here quite interesting. The point the article made was that blades create some data center management problems. The article stated:
The problem with blades is high density computing created hot spots with problems airflows. But, this behavior to use blades was driven by chargeback models that used rack space occupied. Which artificially can bring down IT costs when in reality it increases costs. Just read the above quotes again, on how these latest servers are the most efficient.
The infrared snap included with the Green Data Center Blog makes the hot spots easy to “spot”, no pun intended:
To be fair, the same problem was evident when I visited an organization’s on premises data center with dozens of Dell blades stuffed into racks. The heat was uncomfortable and the air conditioning was unable to cope with the output.