Semantic Web: Useful Links
May 10, 2008
Advancing Insights posted a list of useful links for “Web 3.0, RDF, and the Semantic Web”. A content goose squawk for Jim Wilde for the links. Clicking through these documents is instructive. If you follow Google’s activities in the semantic space, you can see why Google has pushed forward with its programmable search engine.
Invented by a former IBM Almaden scientist, the PSE or programmable search engine could, if deployed on a large scale by Google, make Google the de facto “hub” for semantic processing. You can download one of the Google PSE documents by navigating to the USPTO’s awesome Web site and searching for US2007 00386616, filed on April 10, 2005, and published on February 15, 2007.
Stephen Arnold, May 9, 2008
New Oracle White Paper: How To Move from Ultra Search to Secure Enterprise Search
May 9, 2008
On April 22 Oracle released a new (to me) white paper on switching over to its new Secure Enterprise Search (SES) program. The company is discontinuing its previous search product, Ultra Search, an engine used to find public documents on corporate servers at large. SES will be included in all future products.
The paper is meant to be a tool to help users migrate from Ultra Search to SES, described as a newer, faster and more secure product developed from Ultra Search building blocks.
Customers are “strongly urged” to make the change to SES. Sounds ominous to me. I wonder if that means Oracle will not only be dropping Ultra Search, but also any support for the older search engine. There’s also a list of ten considerations for migration, and those ten things are not necessarily working in SES’s favor. “Strongly urged,” indeed.
What really caught my attention was this, stated plainly on Page 2: “There are many new features in SES not included in UltraSearch. This document considers only features which have changed, or existed in UltraSearch, but are not present in SES.” Why would you talk about features NOT present in the newer system?
Oracle database administrators understand Oracle reasoning. I’m not a certified Oracle DBA. Ergo, I’m only amused by this trope.
You can download the somewhat hard-to-find document here. Get a copy now before it eludes the Oracle search technology.
Stephen Arnold, May 10, 2008
The AP Analyzes Microsoft’s Live Search Options
May 9, 2008
Out of the aether, I received Jessica Mintz’s story “With Microsoft Mum, Analysts Mull Next Moves for Live Search”. You can read the story here or here. As I often say, snag it quickly. The wild and wonderful world of the Associated Press’s online system can baffle even a skilled researcher.
I scanned the story, intrigued that “analysts mull” much of anything related to search, text processing, or information retrieval. The sector has a glass ceiling that kicks in at the $350 to $400 million level with most companies in the market trying to make the losses and voracious appetite for investment look like a great business.
Her analysis, which I’m confident is Grade A for the AP, arrived as I read the PCWorld story “Microsoft’s Answer to Google Sky to Launch at End of May”.
Microsoft has to do more than play me-too if Microsoft is going to hobble Googzilla. The GOOG isn’t very good at PR, marketing, or sales. At least, Microsoft pays attention. That’s a good thing, I suppose.
Ms. Mintz’s interesting essay is about Microsoft after Yahoo. I think her point is that without Yahoo, Microsoft has no easy, fast, cheap way to increase its search traffic and, hence, its online ad revenue. She writes:
Some analysts say Microsoft must increase its search traffic to attract advertisers. Others believe Microsoft should concede that market to Google Inc. and find success elsewhere — leapfrogging rivals in areas such as display and mobile advertising. All that is clear is Microsoft must come up with a Plan C soon, after acknowledging that its Plan A of going solo was troubled, forcing it to turn to Plan B of acquiring Yahoo. Part of the problem analysts face predicting Microsoft’s next moves is that the company has already tried the obvious tactics. It built its own search-ad platform from scratch and spent $6 billion to buy a major online advertising company, aQuantive. Microsoft overhauled its search engine technology, and most analysts agree that its results are at least as good as Google’s. It tweaked the design of its Live Search service to become more like Google.
Whoa, Nellie!
The most interesting information for me was that Ms. Mintz presents a series of action items. I’m not sure if these are Mr. Ballmer’s or if these have been constructed from the search experts Ms. Mintz interviewed for this story. Set aside provenance for a moment. Let’s look at each action item. For ease of comparison, I put Ms. Mintz’s suggestions in the column “Microsoft Tasks” and my comment in the column labeled “Beyond Search”.
Microsoft Tasks | Beyond Search Comment |
Do the basics | Google’s been at the basics since 1998. Time to start I guess |
Innovate in “quick waves” to force Google to play catch up | “quick” and “Microsoft” are an oxymoron |
Change the basic experiences of communication and search | Microsoft needs to deliver search that people actually use |
Gain scale | Good idea. Google’s been building plumbing for a decade. Microsoft’s just started |
I don’t think my research for Google Version 2.0 supports the idea that Microsoft can catch Google with these four actions, individually or collectively. Let me run through my reasoning based on the information available to me.
First, Google delivers a search experience that is increasing its market share. Google’s approach works. Microsoft’s approach hasn’t. What’s astounding to me is that with Internet Explorer’s default search set to the Live.com service, the canyon in market share is almost unbelievable. IE users are ignoring the default search box and consciously selecting Google. That’s just amazing. One bit of bad news. The market share data are not accurate. Google’s market share is in the 80 percent range. In countries like Denmark, Google’s share is over 90 percent.
Google: Content Management for YouTube
May 9, 2008
My hobby is reading Google’s opaque, jargon-filled, and disjointed patent documents. If you are following the $1 billion legal dispute between the GOOG and the media dinosaur Viacom or you upload video to Google, you will want to take a gander at US 20080109369, “Content Management System” by eight Googlers.
The invention is a control panel that shifts certain content tasks to the person posting content to the Google system. There are references to bits of Google technical magic that make the system smarter than the clunky content management systems that most organizations use.
In my opinion, this Google disclosure could shift the burden from Google to the person or software function posting content. You can download the document from the wonderful system provided without charge by the US Patent & Trademark Office. I’m interested in your views of US 20080109369. The Viacom attorneys have undoubtedly gone over this invention with the legal acumen embodied in their sleek selves. I just read this stuff as I find it. This one’s worth a quick look if you are curious about one of Google’s systems for handling the more than one million video uploads pumped into the company every three or four weeks.
Keep in mind that the system and method in this patent document can be extended to other types of content. This invention could–note the could, please–make Google into a great big database publisher. Now Google is just inventing, not doing, what the system and method asserts. Patent applications aren’t products and services.
Stephen Arnold, May 9, 2008
Lingospot: In Text Content Discovery Means Auto Linking
May 9, 2008
A semi-happy quack to the person who called Lingospot to my attention. The company uses linguistic analysis to identify and create dynamic links on publishers’ or bloggers’ pages. The idea is that you hover over a Lingospot link, a “Discovery Bubble” pops up and shows content from related Web sites. The user then discovers new, contextually relevant content. The company offers “online content discovery services”. The pitch is that a publisher doesn’t have the money to pay a human to build these “See Also” references.
Lingospot inked a deal with Yedda.com. You can read the full news story here. Hurry, PR announcements can be ephemeral. A more compelling illustration of Lingospot’s system in action appears on the Forbes.com Web site. The idea is that the technology will increase a Forbes visitor’s “engagement”. Translation: time on the site and clicks which presumably will boost ad revenue. Lingospot asserts that a typical licensee would enjoy a two to five percent increase in page views, a significant boost for a high-traffic site.
Forbes.com exposes the content to the Lingospot system, and then the Lingospot linguistic technology generates the self-referential links. My tests on Forbes.com may have been erroneous, but I was bounced around the sprawling Forbes.com Web site, not sent to relevant content on Business Week’s or Fortune’s Web site, which would have been more useful to me.
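To make the auto-linking idea concrete, here is a minimal sketch of the mechanical half of the job: wrapping known phrases in links. It is not Lingospot's method. Lingospot says it uses linguistic analysis to choose the phrases; this sketch swaps in a fixed, hand-built phrase list. The function name `auto_link`, the phrase table, and the example.com URL are all hypothetical.

```python
import re
from html import escape


def auto_link(text, phrase_to_url):
    """Wrap each known phrase found in plain text with an HTML anchor.

    Assumption: phrase_to_url is a hand-built lookup table. A real service
    would select phrases through linguistic analysis, not a fixed list.
    """
    if not phrase_to_url:
        return text
    # Try longer phrases first so "search engine" wins over "search".
    phrases = sorted(phrase_to_url, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    lookup = {p.lower(): url for p, url in phrase_to_url.items()}

    def replace(match):
        # Keep the reader's original capitalization inside the link text.
        url = lookup[match.group(0).lower()]
        return '<a href="{}">{}</a>'.format(escape(url, quote=True), match.group(0))

    return pattern.sub(replace, text)


# Hypothetical usage: one phrase, one made-up target URL.
linked = auto_link("Google leads search.", {"Google": "https://example.com/google"})
```

Note that every link target comes from the publisher's own table, which is one plausible reason a visitor stays inside a single site instead of landing on a competitor's pages.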
My typical behavior on any site that features pop ups is to dismiss and ignore annoying fly over ads. I then avoid any links in the article text that produce these pop up links.
I may be the odd duck out (see logo to understand the metaphor), but I want to scan the Forbes.com article, not ads, not related content, and not pop ups that get in the way of reading the story. The Forbes’ story is what caused me to click in the first place. You may have a different view of these helpful “Discovery Bubbles”, and I encourage you to form your own opinion.
There’s a modest amount of information on the Lingospot.com Web site. If I were to sign up for the service, the Lingospot.com system puts a JavaScript snippet on the Web log. At this time, the company supports Blogger, Movable Type, TypePad, and WordPress. You can, however, put the Lingospot function on any page. I decided to opt out of the service. My experience is that link services have to process the content on the Web site, and the delay can be a couple of days. However, I added Lingospot.com to my list of auto linkers which includes such companies as AdValiant.com, EchoTopic.com, and Kontera.com. You can find even more of these services on the Online Marketing Innovations’ Web log Folden here.
One of my sources told me that the company’s NLP technology is based on five years of research and development. The beta service became available in 2005. The company was founded in 2006.
Lingospot’s CEO is Nikos Iatropoulos. You can hear an interview with him at Social Buzz. Bob Sherry, formerly at ValueClick, is the senior VP of sales and marketing for the company. The firm operates from its offices in Los Angeles. If you want to know more about Lingospot, you could ring 310 475 1600 and leave a message.
The use of linguistic technology to make related information available is a good idea. Describing these pop ups as content discovery tools seems to be massaging a well-worn advertising chestnut. As Google’s dominance of online search continues without a significant challenge from Ask.com, Microsoft.com, or Yahoo.com, these “new marketing tools” become important to Web masters who can’t generate enough revenue from traffic to keep the lights on.
What’s interesting to me is that the once-exotic linguistic and semantic technologies are now sufficiently tame for use by marketing companies. The semantic revolution is indeed here when an account rep can mouth a phrase like “in text content discovery”, confident that the demo relies on high-tech voodoo that sort of works. For cash-strapped and traffic-challenged publishers, auto linking may be a modest silver bullet.
Stephen Arnold, May 9, 2008
Search Mountebanks
May 8, 2008
Author’s Note: This is an opinion piece, and it relates to the challenges that organizations face when trying to get the straight dope on an enterprise search, text mining, or any other complex enterprise software solution. If you are supremely confident of your knowledge, enjoy cutting corners, or perceive yourself as smarter than your customers when it comes to business–do not read this essay. Others may proceed at their own risk. I have masked the identity of the companies and individuals in the two “stories” so these folks can continue to pull skunks from their hats without their colleagues and customers seeing the reality behind the stage dressing.
Let’s look at this statement from a ZDNet Web log on May 7, 2008. The post is an interview conducted by Michael Krigsman, a good writer. The subject is US government information technology failures. Mr. Krigsman interviewed technical professionals working at CA (I think that’s the acronym for the “old” Computer Associates). One CA participant is Gil Digioia, a CA vice president, and the other is Jose (sic) Mora, a senior director. Both of these CA specialists are involved with “Federal Project Portfolio Management Sales for CA Clarity”. (You can read more about CA Clarity here.) I’m not sure whom Mr. Krigsman is quoting in the segment below, but I thought these comments were remarkable:
There is room for execution improvement regarding the triple constraints of scope, time, and budget. The big reason is lack of “critical corrective action” from high-level decision makers within the organization. This can result from either lack of decision-making or leaders who don’t have the proper information to make decisions that ultimately impact the project.
Requirements also tend to change after projects have been awarded, and are often different at project conclusion from what was specified in the original proposal. These changes tend to disrupt project work flow and collaboration. Such challenges pose particular difficulties for organizations that don’t have a repeatable governance process in place or lack the proper technology to react easily to those changes.
Problems can arise at the project management level, the executive decision-making level, and with technology. For example, problems are sometimes caused by legacy systems that can’t adapt to the rapid changes in information these organizations face.
My interpretation of these statements is that Federal managers can’t manage. The folks involved in requirements don’t know what they need, so technical requirements are built on Jello. Legacy systems screw up the newer systems. What this says to me is: “The vendor is NOT at fault when projects fail.”
Is this your IT or search consultant? Are you getting a skunk instead of a more tractable animal? Image source: http://cjonline.com/images/092906/41543_270.jpg
Powerful stuff. I relished how the interview subjects shifted the problems to the client. In fact, these comments have been made about enterprise search (what I call behind-the-firewall search or Intranet search). A search vendor groused to me at the Boston Search Engine Meeting that one of his largest clients doesn’t know what search is supposed to do. The client is the problem. When I pay to have my roof repaired, if the roof leaks, am I at fault? I expect the roofing guy to fix the roof. I guess my simple reasoning doesn’t apply to information technology projects.
Cluuz.com: Military Intelligence-Like Functions for Web Metasearch
May 8, 2008
One of my business associates in Canada sent me a link to an interesting search engine named Cluuz.com. The system–unlike the shy Powerset, a media darling developing a semantic search engine–is available for anyone to use. Navigate to Cluuz.com. Make sure you add the extra “u”, or you will be looking at a plain text page from the graphically restrained Clue Computing operation in cow country.
Cluuz.com takes results and applies semantic processes to them. Some of the company’s display options are a bit too sophisticated for my 64-year-young eyes, but I found the system quite useful. Let’s run through a basic search and take a cursory look at some of the features that I found interesting. Then I want to comment on the semantic search boom or boomlet (depending on how jaded you are), and conclude with several observations. In the last few days, the shrinking violets in the Big Name search vendors’ public relations department have reduced their flow of 30-something insights. Perhaps my comments about semantic search will “goose” them into squawking. I certainly hope so. Life’s no fun in rural Kentucky without well-groomed Ivy League wizards asserting their intellectual superiority in email speak.
A Query for Cluuz.com
Navigate to the Cluuz.com splash screen. Make certain that you have checked the option under the search box for “Charts”. We’ll look at the other options in a moment. Now enter the test query as shown in italics: Google +”programmable search engine”. Here’s my result for this query on May 7, 2008:
The system takes results from MSN (search.live.com) and Yahoo, processes them, and displays this map. Note that the system identifies important people and companies. The system correctly identifies the Google Forms service as related to the “programmable search engine”.
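For readers who want a feel for the metasearch step, here is a minimal sketch of one common way to combine ranked result lists from several engines: sum each URL's reciprocal rank across the lists. This is an assumption for illustration only. Cluuz.com does not disclose how it merges its feeds, and the function name `merge_results` and the placeholder URLs are hypothetical. The semantic step that finds people and companies is not shown.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Merge ranked URL lists from several engines into one ranking.

    Each URL scores 1/rank in every list where it appears, and the scores
    are summed. A URL that two engines both rank highly rises to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists standing in for two engines' answers.
engine_a = ["site-a", "site-b", "site-c"]
engine_b = ["site-b", "site-d"]
merged = merge_results([engine_a, engine_b])
```

In this example `site-b` comes out first because both engines return it, which is the basic appeal of metasearch: agreement between sources is treated as a relevance signal.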
The system offers other ways to view the results set. For example, you can look at hits from the search engines to which the query is passed as a traditional laundry list. Other choices include a cluster display and a Flash display which is, in my opinion, cluttered with sliders, controls, and options.
You can also enter a more complex query using the Cluuz.com advanced search page. In my tests, the system did a good job of dealing with specific Boolean queries. You can also set preferences, which may not be necessary for a metasearch-based approach to generating hits.
Apple Going Its Own Way in Search
May 8, 2008
On May 6, 2008, the USPTO granted US 7,369,987 to Apple Inc. In my research for Beyond Search, one source told me that Apple was having some “difficulties” with its search-and-retrieval system for iTunes and OS X. I dismissed the comment because I had no corroboration. Apple is paranoid about what it does and how it does it. I was, therefore, intrigued by the invention disclosed as a “Multi-Language Document Search and Retrieval System”.
I’m no attorney, so you will need to download the document from the wonderful search system provided without charge by the US Patent & Trademark Office. Please, pay close attention to the syntax the USPTO’s outstanding search system requires. Google-style queries won’t work on this puppy.
Apple’s invention, according to US 7,369,987, is:
A multi-lingual indexing and search system … that performs tokenization and stemming in a manner which is independent of whether index entries and search terms appear as words in a dictionary.
The disclosures in this document make it clear that Apple, like Google and Microsoft, is poking around in similar algorithmic gardens. The claims put Apple in the search game. The document makes for interesting reading if you like legalese and information retrieval jargon. Maybe the iTunes’ search system will be juiced. I’m pretty happy with the built-in search function on my trusty Mac.
Stephen Arnold, May 8, 2008
Redmond Magazine: Brainware a Winner in Desktop Search; Google, the Loser
May 8, 2008
Redmond Magazine, an independent publication that tracks things Microsoft, featured a “bake off” among desktop search systems. The companies in the technology comparison are Brainware, dtSearch, Google Desktop Search, and Microsoft Windows Desktop Search. The winner? Brainware’s Globalbrain search technology based on patented technology. The company uses trigrams–three letter sequences–to identify relevant documents.
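A minimal sketch of trigram matching may help here. It breaks text into overlapping three-character sequences and ranks documents by how many trigrams they share with the query, measured as Jaccard overlap. This illustrates the general technique only. Brainware's patented method is not described in the article, and the function names are hypothetical.

```python
def trigrams(text):
    """Return the set of overlapping three-character sequences in text."""
    normalized = " ".join(text.lower().split())  # lowercase, collapse whitespace
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_similarity(query, document):
    """Jaccard overlap of the two trigram sets: 0.0 (none) to 1.0 (identical)."""
    q, d = trigrams(query), trigrams(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


def rank(query, documents):
    """Order documents from most to least similar to the query."""
    return sorted(documents, key=lambda doc: trigram_similarity(query, doc), reverse=True)


# The misspelled query still lands on the right document.
best = rank("desktop serch", ["enterprise resource planning", "desktop search tools"])[0]
```

One practical appeal of the approach is tolerance for typos: "serch" and "search" still share the trigram "ser" and "rch", so a misspelled query can match without a dictionary or spelling corrector.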
You can read the summary of the bake off here. These types of reports can disappear or become hard to find in a blink, so click quickly.
The most interesting part of the analysis is that Globalbrain scored lower on features, but beat the other vendors’ systems on documentation, ease of installation, ease of use, and administration. dtSearch, however, matched Globalbrain on installation ease. In my own tests of these systems, I found some of Globalbrain’s terminology confusing, particularly with regard to selecting specific collections to index and search. Obviously the Redmond Magazine team didn’t have any problem with terminology.
Another key finding was that the lowest-rated search system was Google Desktop Search. In our tests, Google fared near the top of the stack, but it lagged behind both Coveo and ISYS Search Software. These two vendors’ products were not included in the Redmond Magazine analysis.
Microsoft Windows Desktop Search scored only slightly better than Google. In our tests of Windows Desktop Search, we encountered odd latency when the system processed certain queries. Again, our experience seems to be at variance with the Redmond Magazine results.
Bottom line: Redmond Magazine flagged Brainware’s Globalbrain as the “winner”; Google, the vendor with the desktop search system that needs the most work. One tip: if you run a query on Live.com or Yahoo.com to locate Windows desktop search, make certain you get the desktop system, not the SharePoint Search Express system. Two different animals, and the Redmond Magazine test looked at the desktop version which does not require SharePoint.
Stephen Arnold, May 8, 2008
EasyAsk: Business Intelligence for End Users
May 7, 2008
Progress Software purchased EasyAsk in May 2005. Prior to the change in ownership, EasyAsk offered natural language search to a range of government and commercial clients. After the buy out, Progress narrowed the focus of EasyAsk, as I understand the transition, from a broad search vendor to eCommerce.
The initial positioning, according to information in my files, was:
The Progress EasyAsk Division provides natural language ad-hoc query solutions that empower non-technical users to quickly find and retrieve critical business information from multiple enterprise data sources. In addition, EasyAsk provides an integrated search, navigation and merchandising platform that optimizes the shopping experience on many of the world’s most successful eCommerce sites.
The value add that EasyAsk offered customers was a higher conversion rate than the conversion rate achieved by competitive software. In 2006, some of the company’s licensees–for example, Redcats USA and Lillian Vernon–reported conversion rates at least 15 percent higher than the rates from competitors’ software. You can try out the EasyAsk system yourself by navigating to Lillian Vernon or Lands’ End. EasyAsk’s commercial customers don’t make their system accessible to outsiders. If you get a chance to access Ceridian’s Intranet, you can check out EasyAsk in a behind-the-firewall setting because EasyAsk is now pushing into the business intelligence market.
Now EasyAsk is expanding its scope and asserting that its system is a front end for the data mart or data warehouse. EasyAsk calls its approach “operational business intelligence”. EasyAsk describes its system as being “closer to the ground”; that is, it’s more accessible than traditional BI systems. Users require little or no training to create a custom report. Interaction is via a traditional search box or a point-and-click, assisted navigation interface. If a data warehouse is already built, EasyAsk can deploy its system in a matter of days.
In an interview on the Business Intelligence Network, EasyAsk’s Dr. Larry Harris, vice president and general manager of the EasyAsk division of Progress Software, said:
The inherent complexity of traditional BI tools prevents organizations from deploying these solutions company wide, and this inhibits individuals who might otherwise be able to act on the insight these tools provide from making better business decisions. EasyAsk for Operational BI provides employees at all levels of the organization with the ability to perform ad hoc business analysis as well as search for existing reports through the familiar search box interface, empowering them to make better business decisions more quickly.
A number of vendors are addressing the knowledge barrier that prevents industrial-strength business intelligence systems from broader use in an organization. If you know how to code and have a degree in statistics, the complexities of building queries and manipulating data cubes are trivial. For the average MBA, building a chopper from a stack of parts would be less difficult.
This graphic shows typical outputs from EasyAsk in response to a user’s natural language query. The user types a query; for example, “Crosstab sales by customer’s state and category” or “What account in the Bay area had the most orders in Q4, 2007?”
That’s the hurdle BI or business intelligence must leap over without tripping. EasyAsk’s trampoline is its NLP or natural language processing capability. The idea is that the user can type a “natural” question. The EasyAsk system “understands” the user’s query, converts it to a form understandable by the system, retrieves the needed information, and displays an answer.
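The "converts it to a form understandable by the system" step can be sketched with a toy translator that turns one narrow question shape into SQL. This is a pattern-matching stand-in, not natural language processing and not EasyAsk's technology. The function name `crosstab_to_sql` and the `orders` table are hypothetical, and the sketch handles only "crosstab X by Y and Z" with single-word names, so a phrase like "customer's state" is beyond it.

```python
import re


def crosstab_to_sql(question, table="orders"):
    """Translate 'crosstab <measure> by <dim> and <dim>' into a SQL string.

    Returns None for any question that does not fit this one pattern.
    Assumption: the table and column names exist in the target database.
    """
    match = re.fullmatch(
        r"\s*crosstab\s+(\w+)\s+by\s+(\w+)\s+and\s+(\w+)\s*",
        question,
        re.IGNORECASE,
    )
    if match is None:
        return None
    # \w+ admits only letters, digits, and underscores, so the captured
    # words are safe to place in the query as identifiers.
    measure, row_dim, col_dim = (group.lower() for group in match.groups())
    return (
        "SELECT {r}, {c}, SUM({m}) AS total_{m} "
        "FROM {t} GROUP BY {r}, {c}"
    ).format(r=row_dim, c=col_dim, m=measure, t=table)


sql = crosstab_to_sql("Crosstab sales by state and category")
```

The gap between this toy and a usable product is the whole business: handling possessives, synonyms, date ranges such as "Q4, 2007", and superlatives such as "the most orders" is where the hard linguistic work sits.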