Concept Searching Update
July 3, 2009
Founded in 2002, Concept Searching provides licensees with search, auto-classification, taxonomy management and metadata tagging solutions. You can download a fact sheet about the privately held firm here. The software can be used on an individual user’s computer or mounted on servers to deliver enterprise solutions. The company’s secret sauce is its statistical metadata generation and classification method. The technology uses concept extraction and compound term processing to facilitate access to unstructured information. The company operates from Stevenage in Hertfordshire. A list of the Concept Searching offices is here.
The company emphasizes the value of lateral thinking, and its approach to content analysis implements numerical recipes to find these insights and linkages within unstructured text.
When I updated my profile for this company earlier this year, I noted that the firm had signed Portal Solutions, a company that focuses on things Microsoft. The idea is to make it possible for a user to search for “insider dealing” and retrieve documents where that bound phrase does not appear but a related phrase such as “insider trading” does appear. This type of system appeals to intelligence officers and financial analysts. Concept Searching’s methods generated lists of related topics. You can see an example of the system in action by navigating to this page. I ran several test queries and the interface provided useful information and suggestions about other related content in the processed corpus. A screen shot of the output appears below:
Concept Searching is a Microsoft and Fast Search partner. The idea is that Concept Searching’s technology complements and in some cases extends the search and content processing services in Microsoft products. In May 2009, the company sponsored a best practices site for Microsoft SharePoint. The deal involves a number of companies, including SchemaLogic, KnowledgeLake, and K2 Technologies among others. The site is supposed to go live in the next couple of weeks, but I don’t have a URL or a date at this time.
The company had a busy May, signing deals with Allianz Global Investors, Directory, and AT&T Government Solutions.
For me, the most interesting system that Concept Searching offers is its ability to generate and classify terms found in SharePoint documents into a taxonomy. The company has prepared a brief video that demonstrates this functionality. You can find the video here. The company’s approach does not require a separate index. Microsoft Enterprise Search can use the outputs of the Concept Searching system. I noted two “uniques” in the narrative to the video, and I remain skeptical about categorical affirmatives. I think the bound phrase extraction and the close integration with SharePoint are benefits. I just bristle when I hear “unique”, which means the one and only anywhere in the world. Broad assertion in my experience.
Concept Searching’s president, Martin Garland, said here:
Our intellectual property is still unique as we are the only statistical search technology able to identify multi-word patterns within text and insert these patterns directly into the index at ingestion or creation time. We call this “Compound Term Processing”.
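To make the idea in Mr. Garland's statement concrete, here is a minimal sketch of statistical multi-word pattern detection feeding an index at ingestion time. This is my own illustration, not Concept Searching's method, which the company has not disclosed in the material above. The function names, the thresholds, and the use of an approximate pointwise mutual information (PMI) score over adjacent word pairs are all assumptions chosen for brevity.

```python
from collections import Counter, defaultdict
import math
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def find_compound_terms(docs, min_count=2, min_pmi=1.0):
    """Return adjacent word pairs that co-occur more often than chance.

    Hypothetical helper: scores each pair with an approximate PMI,
    log2(count(a, b) * N / (count(a) * count(b))), where N is the
    total number of tokens in the corpus.
    """
    unigrams, bigrams = Counter(), Counter()
    for text in docs:
        tokens = tokenize(text)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total = sum(unigrams.values())
    compounds = set()
    for (a, b), n in bigrams.items():
        if n < min_count:
            continue  # too rare to trust the statistic
        pmi = math.log2((n * total) / (unigrams[a] * unigrams[b]))
        if pmi >= min_pmi:
            compounds.add((a, b))
    return compounds


def build_index(docs, compounds):
    """Build an inverted index holding single words and detected compounds."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        tokens = tokenize(text)
        for token in tokens:
            index[token].add(doc_id)
        # Insert the multi-word patterns directly into the same index.
        for pair in zip(tokens, tokens[1:]):
            if pair in compounds:
                index[" ".join(pair)].add(doc_id)
    return index


if __name__ == "__main__":
    sample_docs = [
        "The regulator opened an insider trading inquiry.",
        "Insider trading charges were filed against the trader.",
        "The trader denied any wrongdoing.",
    ]
    terms = find_compound_terms(sample_docs)
    index = build_index(sample_docs, terms)
    print(sorted(index["insider trading"]))
```

On a corpus this tiny, a pair such as "the trader" can also clear the threshold, which is why a real system would need stop word handling, larger samples, and better statistics. Note too that the sketch only indexes phrases as units. Linking "insider dealing" to "insider trading" is a separate relatedness problem that this code does not attempt.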
Last week I sat in a briefing given by a member of Microsoft’s enterprise search team. I thought I heard descriptions of functions that struck me as quite similar to those performed by Concept Searching and such companies as Interse in Copenhagen, Denmark.
I think it will be fruitful to watch what features and functions are baked into the upcoming Microsoft Fast ESP version of the old Fast Search & Transfer system. Remember: the roots of Fast Search stretch deep to 1997, a year before Google poked its nose from the Stanford baby crib.
Partners like Concept Searching have invested significant resources in Microsoft technologies. Will Microsoft respect these investments, or will Microsoft, in an effort to recoup its $1.23 billion investment, take a hard line toward such companies as Concept Searching?
I am on the fence regarding this issue.
Stephen Arnold, July 3, 2009
Sci Tech Publishers: Doom Looms for the Tech Challenged
July 3, 2009
Quite interesting essay by Michael Nielsen: “Is Scientific Publishing about to Be Disrupted?” The answer is soon. I don’t agree. Sci tech publishing is in the midst of a crisis. If you want to know about Mr. Nielsen’s good news interpretation of the coming disruption, dive in.
Mr. Nielsen, in case you haven’t been keeping up with quantum computation, is a real life wizard. He is one of the pioneers of quantum computation. Together with Ike Chuang of MIT, he wrote the standard text on quantum computation. This is the most highly cited physics publication of the last 25 years, and one of the ten most highly cited physics books of all time (Source: Google Scholar, December 2007). He is the author of more than fifty scientific papers, including invited contributions to Nature and Scientific American. His research contributions include involvement in one of the first quantum teleportation experiments (related), named as one of Science Magazine’s Top Ten Breakthroughs of the Year for 1998, quantum gate teleportation, quantum process tomography, the fundamental majorization theorem for comparing entangled quantum states, and critical contributions to the formula for the quantum channel capacity.
He explains that publishers are victims of a local optimum; that is, publishers know where they should take their companies. Publishers just can’t bridge the gap. He provides a useful discussion of the knocks traditional media deliver to the digital door to online information.
But the guts of the write up are gathered in his discussion of non traditional publishing of scientific and technical information. The links are useful and the examples are compelling. Let me mention one; the others you can glean directly from his write up. He wrote:
Or consider startups like SciVee (YouTube for scientists), the Public Library of Science, the Journal of Visualized Experiments, vibrant community sites like OpenWetWare and the Alzheimer Research Forum, and dozens more. And then there are companies like WordPress, Friendfeed, and Wikimedia, that weren’t started with science in mind, but which are increasingly helping scientists communicate their research. This flourishing ecosystem is not too dissimilar from the sudden flourishing of online news services we saw over the period 2000 to 2005.
He concludes his essay with some examples of new opportunities. His recipe for success is that publishers must understand technology in the way Steve Jobs and Messrs Brin and Page do. That’s where he and I part company. A technologist like Mr. Nielsen assumes that a motivated manager can identify, recruit, and manage a world class technologist or somehow edge closer to this capability.
Won’t happen. Technologists like Mr. Nielsen come from a different dimension; sci tech publishers adopt a very different technology world. Nevertheless, the essay is interesting and worth reading.
Stephen Arnold, July 3, 2009
OECD Data Diving
July 3, 2009
Short honk: Want to explore OECD country data? First, read the BBC story “Exploring the OECD Web Site” then navigate to OECD Explorer. Ideal for those who want short cuts to data analysis.
Stephen Arnold, July 3, 2009
UFC 2010: HTML 5, Air, and Silverlight
July 3, 2009
Mary Jo Foley opened my eyes to a new unlimited online fighting battle in 2010. Her story with a lamentably cryptic headline appeared on June 11, 2009 as “Microsoft .Net RIA Services: Not until 2010.” You can find the article here. Her story revealed that Microsoft will try to push its Rich Internet Application technology into the market in 2010. She wrote:
.Net RIA Services is designed to allow coders to bring together the .Net programming model with Microsoft’s Silverlight competitor to Adobe Flash. Microsoft made a Community Technology Preview (CTP) of the technology available in March, but didn’t provide any final availability information.
The RIA acronym means stuff like Adobe Flash and Google’s HTML 5 methods. The idea is that a computing device with an Internet connection can look and feel like a traditional application, a DVD player, or an immersive game. The end of shrink-wrap software, and of the money machine that made Microsoft and Adobe the big dogs each is today, is likely to be a whine and a stumble, a limp along, not a footrace.
I want to capture my thoughts about the dust up:
- I think Adobe is the weakest of the three combatants in the UFC 2010 digital slugfest. Adobe’s pushing the envelope with its license fees now. The sudden spate of security problems coupled with the balky nature of some Adobe Air implementations means that whatever cash Adobe has will not be enough to cope with the GOOG and the Softies.
- The Google team has a quasi-open source angle. The Microsoft team wants everyone to get with the Windows agenda, memorize it, and live it. This is a toss up because Google has been stumbling of late with regard to security, government regulations, and that old annoyance copyright. Microsoft is Microsoft, so it is a force no matter how wacky the Silverlight code may be.
- The financial climate, despite the sunny news from TV commentators, looks bleak to me. As a result, each of these UFC 2010 fighters will be ready to rumble. I think fingers in the eyes, low blows, and blows to the back of the neck will be entertaining tactics to watch.
In short, Ms. Foley reminded me to make time in 2010 for this traveling road show.
Stephen Arnold, July 3, 2009
Google Books: Legal Eagles Carry On… Er, Carrion
July 2, 2009
It is official. An investigation of Google Books is stumbling forward. You can get the “D” word on this by reading DOJ Confirms Antitrust Investigation Into Google Book Settlement. Will this be Google’s Salamis? Exciting times for the Google, those who are parties to the signed contract, those who are excluded, those who are fearful, and those who don’t see what the Google has been doing for more than a decade. As I said in my study The Google Legacy, “Lawyers can kill the GOOG.” Was I prescient? Just plain wrong? We’ll know in three or four years I suppose.
Stephen Arnold, July 2, 2009
YAGG: Google App Engine Takes a Long Lunch
July 2, 2009
Short honk: Fresh from its criticism of Microsoft’s approach to data centers, Google makes clear its engineering approach to reliability. TechCrunch reported “Google App Engine Broken For 4 Hours And Counting.” That early Google patent document about quality of service may not be in the hands of the App Engine team I surmise. YAGG is the addled goose’s acronym for “yet another Google goof.” Will Google issue another critique of the Microsoft approach today to obfuscate what seems to be a Googley way to bring some fireworks to App Engine users’ pre holiday festivities?
Stephen Arnold, July 2, 2009
Selling Bing: Great Expectorations
July 2, 2009
I was not going to comment on the vomit and porn advertisement for Microsoft. Nasty stuff. I want to point you, gentle reader, to the Register’s “Microsoft Distances Self from IE 8 Puke Ads.” Gavin Clarke wrote:
Microsoft told Cnet’s Chris Matyszczyk: “While much of the feedback to this particular piece of creative was positive, some of our customers found it offensive, so we have removed it.” The ad was one of four in Microsoft’s Better Browser campaign of spoof 1950s infomercials, and the point was to promote IE 8’s private browsing feature.
Impressive creative and remarkable rationalization. However, keep in mind that this is a company that bought a search vendor involved in an ongoing police investigation that has now seeped to the accounting firm validating the Fast Search financials. Par for the course. I wonder if Microsoft Fast works as well as the actress’s faux expectoration? Probably not a question I wish to explore. I think I will run a query on Bing.com for “management judgment.” Isn’t this ad the Dickens?
Stephen Arnold, July 2, 2009
Search and Maxing Out the Grid
July 2, 2009
I recall meetings at Halliburton’s old Nuclear Utilities Services unit where we talked about the problem of sucking too much power from the grid. The grid, of course, is a metaphor for a complicated set up of devices and cables that move power from where it is produced to an end point or end points. I found the reference in the Slashdot article “NSA To Build 20-Acre Data Center In Utah” a blast from the past. My thought is that the problem of power sucking data centers is not on most folks’ radar. Along with the deteriorating roads and bridges in rural Kentucky, the power generation industry faces a similar problem. Search requires big data centers. Green yapping aside, as the volume of data increases, the need for megawatts goes up. Green is good but not even zippy chips and clever ideas like uninterruptible power supplied by conventional flashlight batteries are enough. Exciting times and costs ahead for those with big data centers. Plumbing may make the difference between the winners and the losers in search and content processing.
Stephen Arnold, July 2, 2009
Bing.com Tweets
July 2, 2009
Short honk: I noticed on July 1, 2009, that Bing.com has begun adding Twitter messages to its search results. The Google looks a bit flat footed in this area, although Google Wave was a good demo. Bing. Is it the real thing? You can find more information in IT Pro’s “Bing Integrates Twitter Data into Search”.
Stephen Arnold, July 2, 2009
SAP: Dinosaurs Resist Extinction
July 2, 2009
Kelly Fishash’s “SAP Hits On Demand SaaS Button to Avoid Extinction” here reminded me that I had in my write up pile a comment about the German software giant’s latest reflex action. Mr. Fishash wrote:
SAP, in a spectacular U-turn, has leapt on board the software-as-a-service bandwagon – the company confirmed its new selling strategy yesterday [June 10, 2009]. The German software giant, which was speaking at an On-Demand conference in Amsterdam on Wednesday, said it will launch SaaS functionality add-ons for its existing Business Suite ERP customers soon. It will wedge open the door to its Large Enterprise on-Demand product, to allow companies to bolt on SAP’s web offerings with their core, on-site or hosted ERP platforms.
I think SAP is one of those companies that merits close observation. The company is a variant of the IBM approach to software and services; that is, big, complex, expensive, and an exemplar of the “take your medicine” method. The SAP TREX search system is interesting, but I don’t see much about it. I track TREX in my Overflight service (sorry, this part of the service is not available for free at this time). I did a write up about TREX in one of the three editions of Enterprise Search Report I wrote. I did not include the system in my 2008 Beyond Search because I just wasn’t hearing much about the company. I continue to follow the SAP outfit because it pumped cash into Endeca via its venture unit a year or so ago. I wondered if SAP execs recognized that Endeca required similar upfront consulting for its search and content processing system. The SAP system is front loaded in the same way, and both SAP and Endeca avoid offering bargain basement pricing on enterprise systems.
Now I learn that after a run at raising some fees, SAP is embracing SaaS or Software as a Service which is a more trendy name than timesharing.
Dennis Howlett’s “European SaaS Vendors: Not Quite Comfortable in Their Skins” here made this point in his June 10, 2009 article:
you have John Wookey’s announcement of SAP’s saas plans. Confused or not, it speaks volumes that SAP chose to make the public announcement to the industry itself. It was greeted with muted acceptance with some muttering that it was defensive while others immediately thought ‘cost.’
I have a slightly different view; specifically:
- SAP is struggling with two financial challenges. The first is the money sucked into SAP’s black hole of engineering. The company has to spend to keep the quite interesting collection of systems and subsystems working for today’s customers. Second, the company has to find a way to fund research that gets the SAP systems out of the dinosaur trap and into the Googzilla type of low cost engineering mode that Messrs. Brin and Page use. Even Amazon has figured out that open source and commodity hardware are a way to control costs. (Amazon reliability is another issue, however.)
- SAP’s customers are happy because the system is up and running, business procedures are understood by licensees’ employees, and senior management just pays for engineering support and upgrades. The big invoices are behind the company. Happy days!
- Competitors like Salesforce.com and the Google are not deaf, blind, and mute to the opportunities the IBMs, Microsofts, Oracles, and SAPs create. So, SAP with its juicy client base and “interesting financial challenges” chugs along with a system creaking under complexity, almost immune to substantive change.
I think sudden shifts like the SaaS “love” are little more than signals that an era is ending. I keep watching for similar indicators from IBM, Microsoft, and Oracle. I wonder which of these three will follow in the footsteps of the SAP dinosaur?
Stephen Arnold, July 2, 2009