FleeQ, a Semantic Search Engine
June 17, 2010
FleeQ is “a Web 3.0” search engine. The company’s Web site says, “Search everything in real time!” Another universal affirmative. According to the company’s Web site, “FleeQ pays 20X the CPC of AdSense.” I am a simple goose, so I noticed that one site describes itself in two different ways. The company is based in Palo Alto, California.
The system, according to the firm’s Web site:
“FleeQ is a new kind of network. It powers your websites search/discovery for your users.”
In order to get a better sense of the system, I ran a number of test queries. You can follow along but make certain you enter the address: http://www.fleeq.com. Once you enter the site, it is a bit of work to get back to the search box.
Here’s the splash page which points out that I am using the Flash version of the service:
My most interesting test query was for the term “taxonomy.” The list of hits includes two references to Wikipedia. This is the default results list:
The points to note are the two tabs which allow one click access to images and videos. There is a list of tabs across the screen below the search box. A click on the Facebook tab displays hits from Facebook that include the string “taxonomy”.
I did not discuss FleeQ.com in my lecture at the SLA’s Spotlight session. There are other real time search engines that illustrate the concepts in my talk.
I found FleeQ.com useful. The system strikes me as a metasearch with considerable plumbing designed to generate revenue from partners’ Web traffic.
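For readers who want a concrete picture of what I mean by metasearch, here is a minimal sketch in Python. It is not FleeQ's method, which the company does not disclose on the pages I reviewed. The function name `merge_results` and the sample source lists are hypothetical. The sketch assumes each source returns a ranked list of URLs and interleaves them round-robin, dropping duplicates.

```python
from itertools import zip_longest


def merge_results(*ranked_lists):
    """Interleave ranked result lists round-robin, keeping the first copy of each URL."""
    seen = set()
    merged = []
    # zip_longest pads the shorter lists with None so every source is fully consumed.
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical result lists from two sources for the same query.
web_hits = ["wikipedia.org/wiki/Taxonomy", "example.com/taxonomy"]
social_hits = ["example.com/taxonomy", "example.org/post/1", "example.org/post/2"]
combined = merge_results(web_hits, social_hits)
```

A production metasearch system would also score and re-rank the hits. Round-robin is simply the easiest merge to show.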
Worth a look and the revenue generating options may be of interest. You can find some monetization information at http://www.fleeq.com/new/publishers.php. I am not sure I noted the “semantic” angle of the system, but you may be more discerning than I.
Stephen E Arnold, June 17, 2010
Freebie
Exalead and Dassault Tie Up, Users Benefit
May 24, 2010
A happy quack to the reader who alerted us to another win by Exalead.
Dassault Systèmes (DS) (Euronext Paris: #13065, DSY.PA), one of the world leaders in 3D and Product Lifecycle Management (PLM) solutions, announced an OEM agreement with Exalead, a global software provider in the enterprise and Web search market. As a result of this partnership, Dassault will deliver discovery and advanced PLM enterprise search capabilities within the Dassault ENOVIA V6 solutions.
The Exalead CloudView OEM edition is dedicated to ISVs and integrators who want to differentiate their solutions with high-performing and highly scalable embedded search capabilities. Built on an open, modular architecture, Exalead CloudView uses minimal hardware but provides high scalability, which helps reduce overall costs. Additionally, Exalead’s CloudView uses advanced semantic technologies to analyze, categorize, enhance and align data automatically. Users benefit from more accurate, precise and relevant search results.
This partnership with Exalead demonstrates the unique capabilities of ENOVIA’s V6 PLM solutions to serve as an open federation, indexing and data warehouse platform for process and user data, for customers across multiple industries. Dassault Systèmes PLM users will benefit from its Exalead-empowered ENOVIA V6 solutions to handle large data volumes thus enabling PLM enterprise data to be easily discovered, indexed and instantaneously available for real-time search and intelligent navigation. Non-experts will have the opportunity to access PLM know-how and knowledge with the simplicity and the performance of the Web in scalable online collaborative environments. Moreover, PLM creators and collaborators will be able to instantly find IP from any generic, business, product and social content and turn it into actionable intelligence.
Stephen E Arnold, May 22, 2010
Freebie.
Cognition and Bing
May 20, 2010
“Cognition Technologies to power Microsoft’s Bing now!” discloses Cognition Technologies’ semantic technology as applied to Microsoft’s “decision engine” Bing. How will this improve Bing? At the core, the technology will help Bing deal with an “understanding” of the English language, says the official press release.
The “semantic map,” as it is dubbed, contains a gigantic collection of semantic contexts (over ten million), including representations, taxonomy, and word meaning distinctions. Cognition writes in their press release that over “540,000 word senses; 75,000 concept classes; 8,000 nodes; and 510,000 word stems” and other high-level features of semantic processing exist to help Bing process queries properly. The resources were codified and reviewed by lexicographers and linguists over a period of 25 years.
Will the semantic map make Bing understand our garbled search pecks instantaneously and deliver accurate results? Maybe, but with Google’s “humongous amount of data it indexes” and loyal site traffic, it may be a long battle. According to the blog article’s author, what Bing does have going for it is a clean interface, excellent “information aggregation,” and solid concept/summary extraction. The semantic technology should only add to that and make Bing stronger.
Samuel Hartman, May 20, 2010
Milward from Linguamatics Wins 2010 Evvie Award
April 28, 2010
The Search Engine Meeting, held this year in Boston, is one of the few events that focuses on the substance of information retrieval, not the marketing hyperbole of the sector. With the conference entering its second decade, the speakers tackle challenging subjects. This year speakers addressed such topics as “Universal Composable Indexing” by Chris Biow, Mark Logic Corporation; “Innovations in Social Search” by Jeff Fried, Microsoft; “From Structured to Unstructured and Back Again: Database Offloading” by Gregory Grefenstette, Exalead; and a dozen other important topics.
From left to right: Sue Feldman, Vice President, IDC, Dr. David Milward, Liz Diamond, Stephen E. Arnold, and Eric Rogge, Exalead.
Each year, the best paper is recognized with the Evvie Award. The “Evvie” was created in honor of Ev Brenner, one of the pioneers in machine-readable content. After a distinguished career at the American Petroleum Institute, Ev served on the planning committee for the Search Engine Meeting and contributed his insights to many search and content processing companies. One of the questions I asked after each presentation was, “What did Ev think?”. I valued Ev Brenner’s viewpoint as did many others in the field.
The winner of this year’s Evvie award is David R. Milward, Linguamatics, for his paper “From Document Search to Knowledge Discovery: Changing the Paradigm.” Dr. Milward said:
Business success is often dependent on making timely decisions based on the best information available. Typically, for text information, this has meant using document search. However, the process can be accelerated by using agile text mining to provide decision-makers directly with answers rather than sets of documents. This presentation will review the challenges faced in bringing together diverse and extensive information resources to answer business-critical R&D questions in the pharmaceutical domain. In particular, it will outline how an agile NLP-based approach for discovering facts and relationships from free text can be used to leverage scientific knowledge and move beyond search to automated profiling and hypothesis generation from millions of documents in real time.
Dr. Milward has 20 years’ experience of product development, consultancy and research in natural language processing. He is a co-founder of Linguamatics, and designed the I2E text mining system which uses a novel interactive approach to information extraction. He has been involved in applying text mining to applications in the life sciences for the last 10 years, initially as a Senior Computer Scientist at SRI International. David has a PhD from the University of Cambridge, and was a researcher and lecturer at the University of Edinburgh. He is widely published in the areas of information extraction, spoken dialogue, parsing, syntax and semantics.
Presenting this year’s award were Eric Rogge, Exalead, and Liz Diamond, niece of Ev Brenner. The award winner received a recognition award and a check for $500. A special thanks to Exalead for sponsoring this year’s Evvie.
The judges for the 2010 Evvie were Dr. David Evans (Evans Research), Sue Feldman (IDC), and Jill O’Neill, NFAIS.
Congratulations, Dr. Milward.
Stuart Schram IV, April 28, 2010
Sponsored post.
The Seven Forms of Mass Media
April 21, 2010
Last evening on a pleasant boat ride on the Adriatic, a number of young computer-scientists-to-be were asking about my Google lecture. A few challenged me, but most seemed to agree with my assertion that Google has a large number of balls in the air. A talented juggler, of course, can deal with five or six balls. The average juggler may struggle to keep two or three in sync.
One of the students shifted the subject to search and “findability.” As you know, I floated the idea that search and content processing is morphing into operational intelligence, preferably real-time operational intelligence, not the somewhat stuffy method of banging two or three words into a search box and taking the most likely hit as the answer.
The question put to me was, “Search has not kept up with printed text, which has been around since the 1500s, maybe earlier. What are we going to do about mobile media?”
The idea is that we still have a difficult time locating the precise segment of text or datum. With mobile devices placing restraints on interface, fostering new types of content like short text messages, and producing an increasing flow of pictures and video, finding is harder not easier.
I remembered reading “Cell Phones: The Seventh Mass Media” and had a copy of this document on my laptop. I did not give much weight to the assertion that mobile devices were a mass medium, but I thought the insight had relevance. Mobile information comes with some interesting characteristics. These include:
- The potential for metadata derived from the user’s mobile number, location, call history, etc
- The index terms in content, if the system can parse information objects or unwrap text in an image or video such as converting an image to ASCII and then indexing the name of a restaurant or other message in an object
- Contextual information, if available, related to content, identified entities, recipients of messages, etc.
- Log file processing for any other cues about the user, recipient(s), and information objects.
What this line of thinking indicates is that a shift to mobile devices has the potential for increasing the amount of metadata about information objects. A “tweet”, for instance, may be brief, but one could, given the right processing system, impart considerable richness to the information object in the form of metadata of one sort or another.
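A minimal sketch in Python may make the idea of an enriched information object concrete. Everything in it is an assumption for illustration: the `InformationObject` class, the `enrich` function, the metadata keys, and the sample message are hypothetical, and the index term extraction is a crude word split, not a real content processing system.

```python
from dataclasses import dataclass, field
import string


@dataclass
class InformationObject:
    """A short message plus whatever metadata a processing system attaches."""
    text: str
    metadata: dict = field(default_factory=dict)


def enrich(obj, location=None, recipients=(), log_cues=None):
    """Attach the four kinds of metadata listed above, where available."""
    # Index terms: lowercase words longer than two characters, punctuation stripped.
    words = (w.strip(string.punctuation).lower() for w in obj.text.split())
    obj.metadata["index_terms"] = sorted({w for w in words if len(w) > 2})
    # User-derived metadata such as location (hypothetical value supplied by caller).
    if location is not None:
        obj.metadata["location"] = location
    # Contextual information such as message recipients.
    if recipients:
        obj.metadata["recipients"] = list(recipients)
    # Any other cues recovered from log files.
    if log_cues:
        obj.metadata["log_cues"] = dict(log_cues)
    return obj


tweet = enrich(
    InformationObject("Dinner at Trattoria Roma tonight!"),
    location="Trieste",
    recipients=["@colleague"],
)
```

The five-word message ends up carrying index terms, a location, and a recipient list, which is the sort of richness a finding system could exploit.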
The previous six forms of media—[I] print (books, magazines, and newspapers), [II] recordings; [III] cinema; [IV] radio; [V] television; and [VI] Internet—fit neatly under the umbrella of [VII] mobile. The idea is mobile embraces the other six. This type of reasoning is quite useful because it gathers some disparate items and adds some handles and knobs to the otherwise unwieldy assortment in the collection.
In the write up referenced above, I found this passage interesting: “Mobile is as different from the Internet as TV is from the radio.”
The challenge that is kicked to the side of the information highway is, “How does one find needed information in this seventh mass media?” Not very well in my experience. In fact, finding and accessing information is clumsy for textual information. After 500 years, the basic approach of hunting, Easter egg style, has been facilitated by information retrieval systems. But I think most people who look for information can point out some obvious deficiencies. For example, most retrieval systems ignore content in various languages. Real time information is more of a marketing ploy than a useful means of figuring out the pulse count for a particular concept. A comprehensive search remains a job for a specialist who would be recognized by an archivist who worked in Ephesus’ library 2500 years ago.
Are you able to locate this video on Ustream or any other video search system? I could not, but I know the video exists. Here is a screen capture. Finding mobile content can be next to impossible in my opinion.
When I toss in the radio and other rich media content, finding and accessing pose enormous challenges to a researcher and a casual user alike. In my keynote speech on April 15, 2010, I referenced some Google patent documents. The clutch of disclosures provide some evidence that Google wants to apply smart software to the editorial job of creating personalized rich media program guides. The approach strikes me as an extension of other personalization approaches, and I am not convinced that explicit personalization is a method that will crack the problem of finding information in the seventh medium or any other for that matter.
Here’s my reasoning:
- Search and retrieval methods for text don’t solve problems. Processing more information means longer results lists and an increase in the work required to figure out where the answer is.
- Smart systems like Google’s or the Cuil Cpedia project are in their infancy. An expert may find fault with smart software that is actually quite stupid from the informed user’s point of view.
- Making use of context is a challenging problem for research scientists but asking one’s “friends” may be the simplest, most economical, and widely used method. Facebook’s utility as a finding system or Twitter’s vibrating mesh may be the killer app for finding content from mobile devices.
- As impressive as Google’s achievements have been in the last 11 years, the approach remains largely a modernization of search systems from the 1970s. A new direction may be needed.
The bright young PhDs have the job of figuring out if mobile is indeed the seventh medium. The group with which I was talking or similar engineers elsewhere have the job of cracking the findability problem for the seventh medium. My hope is that on the road to solving the problem of the new seventh medium’s search challenge, a solution to finding information in the other six is discovered as well.
The interest in my use of the phrase “operational intelligence” tells me one thing. Search is a devalued and somewhat tired bit of jargon. Unfortunately substituting operational intelligence for the word search does not address the problem of delivering the right information when it is needed in a form that the user can easily apprehend and use.
There’s work to be done. A lot of work in my opinion.
Stephen E Arnold, April 20, 2010
No sponsor for this post, gentle reader.
Eclectic List of Semantic Tools
April 20, 2010
I reviewed a list of semantic tools in the write up “Brown Bag Lunch: Methods for Semantic Discovery, Annotation and Mediation”. If you want a list of links to help orient you to the varied, interesting world of semantics, take a peek at the table in this article. I noted some unusual and possibly incorrect entries, but on the whole you will find the information in the table thought provoking. The list begins below the somewhat intimidating diagram of a semantic process.
Stephen E Arnold, April 20, 2010
Lexalytics Reaches for the Cloud
April 20, 2010
Reaching out to a varied audience of users, Lexalytics Web Service can augment brand/reputation management by providing advanced text analytics from a variety of sources.
PRWeb reports in their article, “Lexalytics Unveils Lexascope Web Service for Social Media & Sentiment Analysis” that this new service works easily and inexpensively from the get go to integrate Lexalytics’ sentiment analysis, entity extraction, and thematic analysis directly into the user’s own business intelligence applications. According to Seth Redmore, vice president of products, “If it’s text, and it’s English, we can read it and add value to it.”
Targeting three different types of audiences, Lexalytics is looking at larger enterprises with specific, “point” text analytics problems they need to address; companies that are providing specific media and reputation management service; and companies who want to add value to the content that they are distributing. In short, this Web service provides an extremely quick analysis of thousands of documents; the work of many, many humans.
Melody K. Smith, April 20, 2010
Note: Post was not sponsored.
Google Recipes
April 19, 2010
Years ago I tried to keep pace with Google’s recipe demonstrations. There were classifications of recipes. There were some ingredient angles. Then recipes became – well – plain old recipes. I read in “Google Adds Rich Snippets for Recipes” that recipe queries on Google return “rich snippets.” The write up says you may need to do some fiddling. My sources suggest that more recipe fiddling will be forthcoming. Whatever happened to the Google refrigerator?
Stephen E Arnold, April 19, 2010
A freebie, no ingredients.
3M and NLP
April 18, 2010
Natural Language Processing (NLP) seems to be the hot topic of late. More and more technology companies are utilizing this in their software packages. 3M has released its next generation computer-assisted coding system for both inpatient and outpatient records. You can see 3M’s next generation NLP system for coding patient intake forms by downloading the demo from this link. With claims of improving productivity and decreasing costs, 3M shares a product study that produced immediate results. A small controlled study measured the improvement a hospital might see when it installs 3M Codefinder Computer-Assisted Edition for inpatient coding. Within a single afternoon—and with only one hour of training—coders were able to reduce the time spent coding records by nearly 30 percent. The time savings became even greater as the complexity of the medical records increased. Impressive claims. Check it out for yourself.
Melody K. Smith, April 18, 2010
Note: Post was not sponsored.
Explaining Artificial Intelligence to Everyone
April 18, 2010
Science Daily ran a story on April 1, 2010. I was not sure if this story was a joke or whether it was serious. I will let you decide. The title was “Grand Unified Theory of AI: New Approach Unites Two Prevailing but Often Opposed Strains in Artificial-Intelligence Research.” The write up explains the Math Club approach; that is, the use of numerical methods, which are now popular. The article describes the rules based approach, which requires a human to write the rules. The core of the story is a pitch for the “Church system”. Science Daily explains:
“With probabilistic reasoning, you get all that structure for free,” Goodman says. A Church program that has never encountered a flightless bird might, initially, set the probability that any bird can fly at 99.99 percent. But as it learns more about cassowaries — and penguins, and caged and broken-winged robins — it revises its probabilities accordingly. Ultimately, the probabilities represent all the conceptual distinctions that early AI researchers would have had to code by hand. But the system learns those distinctions itself, over time — much the way humans learn new concepts and revise old ones. “What’s brilliant about this is that it allows you to build a cognitive model in a fantastically much more straightforward and transparent way than you could do before,” says Nick Chater, a professor of cognitive and decision sciences at University College London. “You can imagine all the things that a human knows, and trying to list those would just be an endless task, and it might even be an infinite task. But the magic trick is saying, ‘No, no, just tell me a few things,’ and then the brain — or in this case the Church system, hopefully somewhat analogous to the way the mind does it — can churn out, using its probabilistic calculation, all the consequences and inferences. And also, when you give the system new information, it can figure out the consequences of that.”
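The flightless bird example can be sketched in a few lines of Python. This is not Church, which is its own probabilistic programming language, and it is not the researchers' model. It is a plain Beta-Bernoulli update that I am swapping in to show the revision step. The function name and the pseudo-counts are assumptions: 9,999 "can fly" and 1 "cannot fly" are chosen only so that the starting belief is 99.99 percent.

```python
from fractions import Fraction


def update_fly_belief(prior_fly, prior_no_fly, observations):
    """Return the posterior mean probability that a bird can fly.

    prior_fly and prior_no_fly are pseudo-counts encoding the starting belief.
    observations is an iterable of booleans: True if the observed bird flew.
    """
    observations = list(observations)
    flew = sum(1 for can_fly in observations if can_fly)
    fly = prior_fly + flew
    no_fly = prior_no_fly + (len(observations) - flew)
    return Fraction(fly, fly + no_fly)


# Starting belief: 9999 / (9999 + 1) = 99.99 percent.
prior = update_fly_belief(9999, 1, [])

# A cassowary, a penguin, and a caged robin do not fly; two sparrows do.
posterior = update_fly_belief(9999, 1, [False, False, False, True, True])
```

With these assumed numbers the belief moves from 9999/10000 to 10001/10005, roughly 99.96 percent. The shift is small because the prior is strong, but each flightless bird pushes the estimate down without anyone writing a rule by hand.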
We talked about this write up at lunch and decided that we would invite readers to read the article and draw a conclusion about a “unified theory of artificial intelligence.”
Stephen E Arnold, April 19, 2010
A freebie.