Q-Sensei 2.0

April 27, 2012

Q-Sensei adds features to its ontology-based search system, we learn in MarketWatch’s “Q-Sensei Enterprise V2.0 Unveiled to Rapidly Develop Tailored Search applications for Big Data.” Prominently featured are ontology-based data processing and configuration, plus a new API to handle big data more efficiently.

What’s an ontology? We keep forgetting. The dictionary says it’s “the branch of metaphysics that studies the nature of existence or being as such.” Wait, that can’t be right. . . . Ok, in information system lingo, ontology “formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts.” That’s better.

The press release says the newest version of Q-Sensei’s enterprise search platform is designed to tailor search-based applications quickly and flexibly to the needs of its clients, using data from Intranets, social media, third parties, and the Internet. We learn from the write up:

“With Q-Sensei Enterprise’s new ontology-based data processing, businesses can rapidly develop new, tailored search-based applications by using existing RDF and OWL resources such as database models, industry or domain-specific ontologies, process definitions and project configurations. This new processing approach also enables harmonization of semantics, components and functionality across business applications. It also improves the speed and efficiency of data process and indexing, increasing platform performance.”

Version 2.0 also boasts a semi-automatic, guided configuration and a new API that makes it easier to integrate Q-Sensei into other applications.

Q-Sensei was created in 2007 with the merger of the German Lalisio and the American QUASM, and now has offices in both Brooklyn and Erfurt, Germany. Q-Sensei focuses on multi-dimensional search, which it defines as combining full-text and dynamic faceted search with real-time content analysis. The company maintains that its solutions make it easy to find what you need, even if you don’t have the appropriate keywords on hand.

Cynthia Murrell, April 27, 2012

Sponsored by Ikanow

IBM Buys Vivisimo Allegedly for Its Big Data Prowess

April 25, 2012

Big data. Wow. That’s an angle only a public relations person with a degree in 20th century American literature could craft. Vivisimo is many things, but a big data system? News to me for sure.

IBM has been a strong consumer and integrator of open source search solutions. Watson, the game show winner, used Lucene with IBM wrapper software to keep the folks in Jeopardy post production on their toes.


A screen shot of the Vivisimo Velocity system displaying search results for the RAND organization. Notice the folders in the left hand panel. The interface reveals Vivisimo’s roots in traditional search and retrieval. The federating function operates behind the scenes. The newest versions of Velocity permit a user to annotate a search hit so the system will boost it in subsequent queries if the comment is positive. A negative rating on a result suppresses that result.

I learned that IBM allegedly purchased Vivisimo, a company which I have covered in my various monographs about search and content processing. Forbes ran a story which was at odds with my understanding of what the Vivisimo technology actually does. Here’s the Forbes title: “IBM To Buy Vivisimo; Expands Bet On Big Data Analytics.” Notice the phrase “big data analytics.”

Why do I point out the “big data” buzzword? The reasons include:

  • Vivisimo has a clustering method which takes search results and groups them, placing similar results identified by the method in “folders.”
  • Vivisimo has a federating method which, like Bright Planet’s and Deep Web Technologies’, takes a user’s query and sends the query to two or more indexing systems, retrieves the results, and displays them to the user.
  • Vivisimo has a clever de-duplication method which makes the results list present a duplicated item only once. This is important when one encounters a news story which appears on multiple Web sites.
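The federating and de-duplication ideas in the list above can be sketched in a few lines of Python. This is a minimal illustration, not Vivisimo's method: the two backends, the result fields, and the title-based duplicate test are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical result record: a dict with "title" and "url" keys.
Result = Dict[str, str]
Backend = Callable[[str], List[Result]]


def normalize_title(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def federated_search(query: str, backends: List[Backend]) -> List[Result]:
    """Send one query to every backend and merge the hits, dropping duplicates."""
    seen = set()
    merged: List[Result] = []
    for backend in backends:
        for hit in backend(query):
            key = normalize_title(hit["title"])
            if key in seen:
                continue  # the same story was already returned by another source
            seen.add(key)
            merged.append(hit)
    return merged


# Two stand-in "indexing systems" that return canned hits (assumed data).
def news_index(query: str) -> List[Result]:
    return [
        {"title": "IBM To Buy Vivisimo", "url": "http://a.example/1"},
        {"title": "Search Vendor Roundup", "url": "http://a.example/2"},
    ]


def web_index(query: str) -> List[Result]:
    return [
        {"title": "IBM to buy  Vivisimo", "url": "http://b.example/9"},
        {"title": "Clustering Explained", "url": "http://b.example/7"},
    ]


results = federated_search("vivisimo", [news_index, web_index])
for hit in results:
    print(hit["title"], hit["url"])
```

A production system would likely compare content fingerprints rather than titles, but the shape is the same: fan the query out, merge, and keep one copy of each item.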

According to the write up in Forbes, a “real” news outfit:

IBM this morning said it has agreed to acquire Vivisimo, a Pittsburgh-based provider of big data access and analysis tools.

Okay, but in Beyond Search we have documented that Vivisimo followed this trajectory in its sales and marketing efforts since the company opened for business in 2000. In fact, the Wikipedia write up about Vivisimo says this:

Vivisimo is a privately held enterprise search software company in Pittsburgh that develops and sells software products to improve search on the web and in enterprises. The focus of Vivisimo’s research thus far has been the concept of clustering search results based on topic: for example, dividing the results of a search for “cell” into groups like “biology,” “battery,” and “prison.” This process allows users to intuitively narrow their search results to a particular category or browse through related fields of information, and seeks to avoid the “overload” problem of sorting through too many results.


Google and Its Stock Split

April 16, 2012

I pointed out that the big news from the Google quarterly report was the erosion of revenue from Google’s core business.

Other addled geese, poobahs, and mavens found the stock split more troubling. A good example of the reaction is this Reuters real news story: “Google’s Evil Stock Split.” The idea is that Google seems to be perilously close to violating guidelines put in place 90 years ago. Here’s the key point in my opinion:

Google has, now, clearly violated the spirit of the NYSE rules, if not their letter. It took 15 months for the independent directors on the board to be persuaded of this, in long and secret deliberations.

Well, the independent directors * were * convinced.

I also enjoyed this comment in the Reuters real news story:

This move, then, is basically a way for Google to try to retreat back into its pre-IPO shell as much as possible. It never really wanted to go public in the first place — it was forced into that by the 500-shareholder rule — but at this point, Google is far too entrenched in the corporate landscape to be able to turn back the clock. It’s too big, and too important, and has been public for too long. That’s the thing about going public: it might suck, but once you’ve done it, you’ve done it. And at that point, if you try to pull a stunt like this, you risk looking all too much like Rupert Murdoch.

Okay, real Silicon Valley is starting to look like the real news paragon, Rupert Murdoch.

Wow.

My take is very simple.

The Googlers know that revenue softening can no longer be swept under the rug or surrounded with big band music and fancy dancing. The numbers are too big. The declines are double digits. The grousing about Panda and the push to get people to buy AdWords are too visible to some Web site operators.

Therefore, the stock play is designed to leave the existing management team in charge as the financial news gets increasingly dodgy. The Google senior management team does not want to be looking at a start up to fund without the Google ID card in their pocket.

So the erosion of online ad efficiency is causing the control push. Because this has been going on among the independent directors, I have concluded that the revenue erosion was noticeable in 2010, maybe earlier.

Will control reverse the online advertising money machine’s functioning? Nah, but the days of the “Google can do no wrong” are either over or drawing to a close. Google has these issues with which to contend:

  1. Legal hassles. Big disc brake applied to some activities.
  2. Amazon, Apple, and Facebook. Each of these companies has learned from Google. This is The Google Legacy I wrote about back in 2004 or 2005. You might want to check it out because Amazon, Apple, and Facebook have out Googled Google and seem to be gaining strength as Google does the fancy dancing.
  3. Costs from brute force solutions. Google spends a lot of dough to keep its brute force indexing system up and running. Facebook, on the other hand, can just spider Web urls which its members have posted. No brute force required to get started with an interesting search solution. Amazon has slapped A9 in the AWS plumbing and can move into search niches where Google has not gotten significant traction. Apple, which Google really wants to emulate, keeps chugging along with a walled garden and customers’ religious fervor. Do you know anyone with religious fervor toward the Google? Well, I know one company. Oracle. See item one above.

Net net: Blekko/Yandex and Facebook could put the squeeze on the Google with a little luck and some good timing. How will Google respond? No clue. Google is not accustomed to playing defense. Ego is a potent concept. As the Greek tragedian said:

Cleverness is not wisdom. Bacchæ l. 395

Stephen E Arnold, April 18, 2012

Sponsored by Pandia.com

Boxfish Brings Search to TV

April 16, 2012

Technology Review recently reported on a new startup that helps users search for words and phrases from TV in the article “Searching the Small Screen.”

According to the article, California-based Boxfish opened a beta version of its site to the public in late March, allowing users to search through words and phrases that have been spoken on television over the past month. The site also allows users to see topics that are trending and set up alerts for specific terms.

Boxfish is currently indexing TV dialogue from the US, UK, and Ireland, and it plans to add Australia and Canada soon.

The article states:

“The site is simple to use. If you search for, say, “cookie,” you’ll receive a list of results posted in chronological order along with a bit of the transcript in which the word appeared. On the right side of the screen you can see how many times it has been used recently, on how many channels, and also the words most commonly used in the same context. Click on a search result and you’ll see a big chunk of the transcript with bold text indicating the section that includes the search term.”
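The behavior described in the quote can be sketched as a small transcript search. This is a toy illustration under stated assumptions, not Boxfish's implementation: the transcript rows, the stopword list, and the choice to sort oldest first are all invented for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical transcript lines: (ISO timestamp, channel, spoken text).
TRANSCRIPTS: List[Tuple[str, str, str]] = [
    ("2012-04-15T20:05", "Channel A", "the cookie recipe calls for butter"),
    ("2012-04-14T09:30", "Channel B", "a browser cookie tracks your visit"),
    ("2012-04-16T07:45", "Channel A", "budget talks resume on Monday"),
    ("2012-04-16T18:10", "Channel C", "try this chocolate cookie recipe"),
]

# Small assumed stopword list so filler words do not dominate the context counts.
STOPWORDS = {"a", "the", "for", "this", "your", "on", "try"}


def search_transcripts(term: str) -> Dict[str, object]:
    """Return matching lines in chronological order plus simple usage statistics."""
    term = term.lower()
    hits = [row for row in TRANSCRIPTS if term in row[2].lower().split()]
    hits.sort(key=lambda row: row[0])  # ISO timestamps sort correctly as strings
    channels = {channel for _, channel, _ in hits}
    context = Counter(
        word
        for _, _, text in hits
        for word in text.lower().split()
        if word != term and word not in STOPWORDS
    )
    return {
        "hits": hits,
        "mentions": len(hits),
        "channels": len(channels),
        "context": context.most_common(3),
    }


summary = search_transcripts("cookie")
for timestamp, channel, text in summary["hits"]:
    print(timestamp, channel, text)
print(summary["mentions"], "mentions on", summary["channels"], "channels")
print("common context words:", summary["context"])
```

The three statistics mirror what the article says appears beside the results: how often the word was used, on how many channels, and which words tend to accompany it.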

Since the product is so new, Boxfish still has a few kinks to work out. However, this could be a cool new way for TV watchers to keep up with anything from politics and current events to the latest celebrity gossip.

Jasmine Ashton, April 16, 2012


Desktop Search Moves to the Cloud

April 12, 2012

TechCrunch’s Colleen Taylor recently reported on a new app called Found that lets you find and access your documents whether they are on your computer or online, in the article “Found Makes Searching for Files Anywhere Super Simple (and Really Sick).”

According to the article, the San Francisco-based app aims to organize the mess of documents that are relevant to our work and personal lives. Found currently plugs into Gmail, Google Documents, and Dropbox, and the company says that it will be adding additional integrations in the near future.

Taylor states:

“Once you install it on your computer, looking for things in Found quickly becomes second nature — and you quickly start to wonder about how much time you wasted searching for things before you had it. Of course, the real key will be seeing how snappy the Found app is once more people are using it after the public launch later this spring — nowadays, an app is only as good as it can scale. But at the moment, Found is looking very like a very promising tool for the those of us who are a bit less organized with our files than we’d like to be.”

While the app won’t be released to the public until mid-May, you can see how Found works via an embedded video in Taylor’s article. The notion of a cloud service indexing content on a local machine may give some users pause. We prefer to use behind-the-firewall solutions. Even cloud back ups are solutions which don’t address the issues we face.

Jasmine Ashton, April 12, 2012


Open Source Analytics Information Service Now Available

April 9, 2012

ArnoldIT has rolled out The Trend Point information service. Published Monday through Friday, the information service focuses on the intersection of open source software and next-generation analytics. The editors and researchers identify high-value source documents and then encapsulate these documents into easily digested articles and stories. In addition, critical commentary, supplementary links, and important facts from the source document are provided. Unlike a news aggregation service run by automated agents, The Trend Point relies on librarians and researchers who use the ArnoldIT Overflight tools to track companies, concepts, and products. The combination of human-intermediated research with Overflight provides an executive or business professional with a quick, easy, and free way to keep track of important developments in open source analytics.


Stories include:

According to the publisher, Stephen E Arnold:

We believe that commercial abstracting and indexing services have become untenable for the busy professional. We have combined traditional indexing, literature reviews, and critical commentary which help reduce the time required to pinpoint the meaningful information in this exploding open source analytics field.

Our business model is to provide high value information without a fee. Individuals, law firms, and private equity firms wanting additional information about the people, companies, and products we cover are free to contact us. Like other professional services’ firms, we rely on motivated individuals with an information need to tap into our full-scale, in-depth research.

What sets TheTrendPoint and other ArnoldIT.com information services apart is that their approach is similar to that used by commercial information services such as Medline and Disclosure, two information services designed to make reference services more useful.

At this time, TheTrendPoint.com is designed to complement the finding services which ArnoldIT.com publishes. ArnoldIT.com is one of the leading sources of information on subjects ranging from search and content processing to next-generation intelligence systems.

New content is added to the service Monday to Friday. For more information about the service, contact the publisher at seaky2000 at yahoo dot com.

Kenneth Toth, April 9, 2012


Video Search: An Open Opportunity for GreenButton

April 9, 2012

New Zealand is known for its beautiful countryside, the popular movies filmed there, sheep, and Dot Com. Business Insider reports there is another item to add to the island nation’s “list of reasons to be famous” in “Tiny New Zealand Company Brings Cool Microsoft Video Tech to the World.” The small startup GreenButton used search technology from Microsoft Research to create InCus, a service that transcribes audio and video files to make them searchable. It is aimed at corporate enterprises that want searchable digital media libraries. We learned:

“InCus is based on Microsoft’s Audio Video Indexing Service (MAVIS), which was previously only being tested by a few government agencies. That makes this the first commercially available use of MAVIS, GreenButton CEO Scott Huston told Business Insider. Naturally, inCus is running on Windows Azure.”

GreenButton also sells an Amazon-like cloud and other cloud applications, and it specializes in 3-D rendering apps. Other companies like Cisco and Autonomy have similar services for video and audio, but GreenButton’s InCus is the only one for the cloud. GreenButton has this corner of the market to itself for now, but it won’t be long before a bigger company develops its own video indexing service. Things are heating up in this part of the cloud market.

Whitney Grace, April 9, 2012


Iowa Government Gets a Digital Dictionary Provided By Access

April 7, 2012

How did we get by without the invention of the quick search to look up information?  We used to use dictionaries, encyclopedias, and a place called the library.  Access Innovations, Inc. has brought the Iowa Legislature General Assembly into the twenty-first century.

The write-up “Access Innovations, Inc. Creates Taxonomy for Iowa Code, Administrative Code and Acts” tells us the data management industry leader has built a thesaurus that allows the Legislature to search its library of proposed laws, bills, acts, and regulations.  Users can also add their unstructured data to the thesaurus.  Access used their Data Harmony software to provide subscription-based delivery and they built the thesaurus on MAIstro.

“The project differed from typical index and thesaurus creation because the Iowa Legislative Services Agency needed to maintain its existing codes from each back-of-the-book index, rather than starting from scratch and creating new codes.  One reference alone, the Blue Index, included 2,300 index terms.  To create the thesaurus, Access looked at different methods to apply to each term according to the existing references, tied preferred terms to the existing codes, and added related terms to the preferred terms.   The codes covered previous legislation dating as far back as 1953 to legislation through 2010.  Also, the custom taxonomy was built with only four levels in order to meet Iowa Legislative Services’ navigation requirements.  Typically, thesauri are not limited by a specified number of levels.”
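The constraints in the quote (preferred terms tied to existing codes, related terms attached to preferred terms, and a four-level cap) can be sketched as a small data structure. This is an illustration only, not Data Harmony or MAIstro; the class names, term labels, and codes below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAX_LEVELS = 4  # the quoted write-up says the taxonomy was limited to four levels


@dataclass
class Term:
    preferred: str                 # preferred label for the concept
    code: str                      # existing index code carried over (made-up values here)
    parent: Optional[str] = None   # preferred label of the broader term, if any
    related: List[str] = field(default_factory=list)


class Thesaurus:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}

    def level_of(self, label: str) -> int:
        """Count how deep a term sits; top-level terms are level 1."""
        level = 1
        term = self.terms[label]
        while term.parent is not None:
            term = self.terms[term.parent]
            level += 1
        return level

    def add(self, term: Term) -> None:
        """Add a term, refusing unknown parents and anything below the level cap."""
        if term.parent is not None and term.parent not in self.terms:
            raise KeyError(f"unknown broader term: {term.parent}")
        level = 1 if term.parent is None else self.level_of(term.parent) + 1
        if level > MAX_LEVELS:
            raise ValueError(f"{term.preferred!r} would sit at level {level}")
        self.terms[term.preferred] = term

    def by_code(self, code: str) -> List[str]:
        """Find preferred terms that carry a given legacy code."""
        return [t.preferred for t in self.terms.values() if t.code == code]


thesaurus = Thesaurus()
thesaurus.add(Term("Taxation", "X-100"))
thesaurus.add(Term("Property tax", "X-110", parent="Taxation", related=["Assessments"]))
thesaurus.add(Term("Exemptions", "X-111", parent="Property tax"))
thesaurus.add(Term("Homestead credit", "X-112", parent="Exemptions"))
print(thesaurus.level_of("Homestead credit"))
print(thesaurus.by_code("X-110"))

try:
    thesaurus.add(Term("Too deep", "X-113", parent="Homestead credit"))
except ValueError as err:
    print("rejected:", err)
```

Enforcing the depth limit at insertion time keeps the navigation requirement from being violated later, which is one plausible way to honor a fixed-level design.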

The new legal thesaurus makes it much easier to find new laws and their changes instead of having to browse through pages of books. Access Innovations hopes its project for the Iowa Legislature General Assembly will encourage other government bodies to turn their libraries over to the company for indexing. Not only would that make it easier for politicians and their staff to conduct research, maybe it could improve the political situation in the US. Making part of a job easier tends to make people happy.

Whitney Grace, April 7, 2012


Michael Moody Joins Lucid Imagination

March 30, 2012

MarketWatch recently reported on Lucid Imagination, the commercial company for Apache Lucene and Solr search technology, in the article “Lucid Imagination Names Software Development Luminary Michael Moody Senior Vice President of Engineering.”

According to the article, Michael Moody brings more than 30 years of software engineering experience to the search technology company. He has held senior positions at several companies, including Spigit, Jaspersoft, and Portal Software.

Mr. Moody said:

“Thanks to Lucid Imagination, companies will be able to meet the challenge of analyzing their big data before the rapid adoption leads to operational chaos, lost opportunities, and reduced competitiveness,” said Moody. “We have the technology, business model and people in place to help drive a complete transformation of enterprise search and retrieval that will lead to phenomenally better and faster decision making.”

My colleagues and I are very excited to see Michael Moody’s addition to the Lucid Imagination team.

I speak for the ArnoldIT team when I assert that we are confident that his expertise will help the company come up with even better ways to overcome the challenges of enterprise search and big data access.

We have noticed that a number of open source search vendors are touting performance enhancements, fail over methods, and value added indexing advantages which Lucid’s search system allegedly does not provide. Assertions are easy. Real world deployments are different from talking about delivering cost savings and improved efficiencies to a customer.

We have just completed a fly over of open source search vendors. In our view, Lucid’s search system outdistanced the other Lucene-based search systems we examined.

We try to avoid Mac vs. PC type hassles, but the key difference among open source search vendors boils down to who can deliver efficiencies to the licensee, offer financial stability, and provide 24×7 engineering support and services. Measured against our “real world” yardstick, Lucid Imagination earns our trust. There is more to the company than a single entrepreneur working nights and weekends to compete. Just our view. Maybe our Overflight report will become publicly available. Who knows?

In the meantime, navigate to www.lucidimagination.com and learn more about the company.

Jasmine Ashton, March 30, 2012


How to Create Your Own Oracle Text Index

March 22, 2012

The Swiss-Army Development blog recently released some useful information about keyword search with Oracle Text in the post “Keyword Search via Oracle Text.”

The post attempts to create a foundation for using Oracle Text to implement full text search in a table. It takes readers step-by-step through the process of building the back end of an Oracle Text Index and then leveraging that index to include full text search.

The writer states the reasoning behind this project:

“Oracle text is a feature available in the Oracle Database and is used to provide keyword search indexing to large blocks of text and even binary formatted files like Word and PDF files. As part of a project I am working on, I need to create a keyword search index that spans multiple columns. This will allow my users to search for keywords in the title, abstract and content of a note entered into the system. The note could be in the form of an uploaded file, or it could be manually entered through the interface.”
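The idea of one keyword index spanning several columns can be illustrated without a database. The sketch below is plain Python, not Oracle Text and not the blog author's code; the notes table, column names, and tokenizer are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical notes table: each row has an id plus three text columns.
NOTES: List[Dict[str, str]] = [
    {"id": "1", "title": "Quarterly budget", "abstract": "Spending summary", "content": "Travel costs rose."},
    {"id": "2", "title": "Travel policy", "abstract": "Rules for trips", "content": "Book early."},
    {"id": "3", "title": "Hiring plan", "abstract": "Budget for new staff", "content": "Two engineers."},
]
COLUMNS = ("title", "abstract", "content")


def tokenize(text: str) -> List[str]:
    """Split on whitespace, strip basic punctuation, and lowercase each word."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def build_index(rows: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each keyword to the ids of rows containing it in any indexed column."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        for column in COLUMNS:
            for word in tokenize(row[column]):
                index[word].add(row["id"])
    return index


def search(index: Dict[str, Set[str]], keyword: str) -> List[str]:
    """Return the sorted ids of rows that mention the keyword."""
    return sorted(index.get(keyword.lower(), set()))


index = build_index(NOTES)
print(search(index, "budget"))
print(search(index, "travel"))
```

A single lookup finds a word whether it appeared in the title, the abstract, or the content, which is the behavior the quoted writer wants from a multi-column index.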

The Swiss wash their cows, a useful activity if not germane to milk, cheese, and beef.

Similar to Oracle Text perhaps?

Stephen E. Arnold, March 22, 2012
