Coveo Connects

November 1, 2010

Knowledge and information are directly related to a company’s success. Coveo taps into this link as a leading provider of enterprise search and customer information access solutions. The PR-USA.net article “Coveo Announces New Information Indexing Connectors Including Support for Microsoft SharePoint 2010” tells the story of how “Coveo offers a richer, more integrated view of enterprise knowledge and information compared to what’s available with Microsoft’s native search.”

The article further reports that, through its Enterprise Search 2.0 approach, Coveo can “bring the benefits of unified information access to customers faster, and less expensively, than is possible with traditional solutions including SharePoint Search or Microsoft FAST.” Because Coveo dynamically indexes data and presents it in a unified view, organizations get immediate value from the information and knowledge stored as structured and unstructured data across the enterprise, in any system, without moving the data. Thus, the extended Coveo offers superior functionality and integration. Our recommendation: connect with Coveo.

Harleena Singh, November 1, 2010

Anti Search in 2011

November 1, 2010

In a recent meeting, several of the participants were charged with disinformation from the azurini.

You know. Azurini, the consultants.

Some of these were English majors, others former print journalists, and some unemployed search engine optimization experts smoked by Google Instant.

But mostly the azurini emphasize that their core competency is search, content management, or information governance (whatever the heck that means). In a month or so, there will be a flood of trend write ups. When the Roman god looks to his left and right, the signal for prognostication flashes through the fabric covered cube farms.

To get ahead of the azurini, the addled goose wants to identify the trends in anti search for 2011. Yep, anti search. Remember that in a Searcher article several years ago, I asserted that search was dead. No one believed me, of course. Instead of digging into the problems that ranged from hostile users to the financial meltdown of some high profile enterprise search vendors, search was the big deal.

And why not? No one can do a lick of work today unless that person can locate a document or “find” something to jump start activity. In a restaurant, people talk less and commune with their mobile devices. Search is on a par with food, a situation that Maslow would find interesting.

The idea for this write up emerged from a meeting a couple of weeks ago. The attendees were trying to figure out how to enhance an existing enterprise search system in order to improve the productivity of the business. The goal was admirable, but the company was struggling to generate revenues and reduce costs. The talk was about search but the subtext was survival.

The needs for the next generation search system included:

  • A great user experience
  • An iPad app to deliver needed information
  • Seamless access to Web and Intranet information
  • Google-like performance
  • Improved indexing and metatagging
  • Access to database content and unstructured information like email.

Read more

Open Source Search Run Down

October 25, 2010

“Open Source Search with Lucene & Solr” provides a useful overview of information similar to that presented at the Lucene Revolution in Boston, October 7 and 8, 2010. I found the information useful. Even though I poked my head into most sessions and met a number of speakers, Igvita.com has assembled a number of useful factoids. Here’s a selection of four.

First, the Salesforce.com implementation of Lucene “consists of roughly 16 machines, which in turn contain many small and sharded Lucene indexes. Currently, [Salesforce.com] handles 4,000 queries per second (qps) and provides an incremental indexing model where the new user data is searchable within ~ three minutes.”

Second, iTunes is a Lucene user “said to be handling up to 800 queries per second.” I thought Apple was drinking Google Kool-Aid or was before the friction between the two companies entered into a marital separation without counseling.

Third, I found this description of Lucene/Solr interesting:

If Lucene is a low-level IR toolkit, then Solr is the fully-featured HTTP search server which wraps the Lucene library and adds a number of additional features: additional query parsers, HTTP caching, search faceting, highlighting, and many others. Best of all, once you bring up the Solr server, you can speak to it directly via REST XML/JSON API’s. No need to write any Java code or use Java clients to access your Lucene indexes. Solr and Lucene began as independent projects, but just this past year both teams have decided to merge their efforts – all around, great news for both communities. If you haven’t already, definitely take Solr for a spin.
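To make the "speak to it directly via REST" point concrete, here is a minimal sketch in Python. The host, port, and `/solr/select` path are assumptions about a local test install, and the helper names `build_solr_query_url` and `search` are my own, not part of Solr. The query parameters (`q`, `wt`, `rows`, `facet`, `facet.field`) are standard Solr request parameters, but check them against your own version before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed location of a local Solr install; change host, port, and path to match yours.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_solr_query_url(query, rows=10, facet_field=None, base_url=SOLR_SELECT_URL):
    """Build the HTTP GET URL for a Solr query that returns JSON."""
    params = [("q", query), ("wt", "json"), ("rows", str(rows))]
    if facet_field:
        # Ask Solr to count matching documents per value of this field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return base_url + "?" + urlencode(params)


def search(query, **kwargs):
    """Send the query to a running Solr server and return the parsed JSON reply."""
    with urlopen(build_solr_query_url(query, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL, so it runs without a Solr server.
    print(build_solr_query_url("title:lucene", rows=5, facet_field="category"))
```

Calling `search("title:lucene")` requires a live Solr server at the assumed address. The URL builder alone is enough to show that no Java client is involved, just an HTTP request.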

Finally, this passage opened my eyes to some interesting opportunities.

Instead of running Lucene or Solr in standalone mode, both are also easily integrated within other applications. For example, Lucandra is aiming to implement a distributed Lucene index directly on top of Cassandra. Jake Luciani, the lead developer of the project, has recently joined the Riptano team as a full-time developer, so do not be surprised if Cassandra will soon support a Lucene powered IR toolkit as one of its features! At the same time, Lily is aiming to transparently integrate Solr with HBase to allow for a much more flexible query and indexing model of your HBase datasets. Unlike Lucandra, Lily is not leveraging HBase as an index store (see HBasene for that), but runs standalone, albeit tightly integrated Solr servers for flexible indexing and query support.

Navigate to the Igvita Web site and get the full scoop, not a baby cup of goodness.

Stephen E Arnold, October 25, 2010

Freebie

The Bonsai Method: Google and Change

October 24, 2010

When I was in Japan, I watched a bonsai “treasure” work his magic. I liked the idea of binding young shoots with wire and forcing the malleable living things to do what the “treasure wanted.” My guide explained that the “national treasure” could convert any species of tree into a model railroad scale plant. Remarkable.

The problem is that companies in general and a 12 year old Google in particular do not respond to the bonsai master’s interventions the way a sprouting maple does.

Let’s face it. Google is not likely to change in a meaningful way. The aircraft carrier is underway. Even a minor course correction takes a long time. Think about the six versions of the Google Search Appliance before Google could hook Google Apps content into the system.

Can the Google oak tree be shaped into a bonsai art work? Not likely, grasshopper.

Google has been chugging along on its “controlled chaos” approach to business for 12 years. If you have worked with juvie offenders, you may have encountered some 12 year olds who are going to grow their own way. Those 12 year olds are on their own often predictable path.

“Opinion: Angry Birds Android Market Snub Shows Google Has to Change” asserts:

What looks initially to be typical press release waffle is in fact a damning indictment of Google’s Android Market. If its own official retail channel is not seen as the “obvious choice” for a major app developer, something needs to be done to make it so – and fast. Perhaps, in hindsight, we shouldn’t have been so surprised at Rovio’s gutsy move. The signs of general dissatisfaction with Android Market have been there for all to see since its launch. Put bluntly, Android Market is an absolute mess. The navigation experience is blighted by a poor filter system that makes it very hard indeed to hone in on quality paid software. Dubious free ringtone and porn apps clog up the Multimedia, Entertainment, and even Games categories.

The author wanting Google to change is probably not going to do too well at bonsai. How is one to miniaturize and tame a 12 year old tree? Sure, the tree can be shaped, but the total control stuff is no longer possible. Make a pear tree look like Donald Duck? No problem. Make the pear tree fit into a dish for the dining room sideboard? Problem.

I do think Google is changing, but the change has little to do with Angry Birds or even government regulators. Google is changing because of its addiction to money. The shift in StreetView policies is less about fear of legal hassles and mostly about the firm’s ability to get needed data from other methods not widely discussed in the blogosphere.

There are several important changes evident to me. Keep in mind that I look at Google in terms of its technical information freely available as open source content. Here’s my checklist, which you may compare with the Angry Birds’ example in the cited article.

First, Google is going consumer. The company’s roots are in brute force search and solving engineering problems that sank other brute force Web indexing companies. This consumer shift may be a turning point for Google. In my opinion, Google is betting the farm on its understanding of the consumer.

Second, Google faces a world in which Facebook and Apple are the hot tickets. Second or third billing is an issue for those who are sensitive to such shallow accolades. With Xooglers filling the ranks at Facebook, the notion that Google is not number one is a bitter pill in my opinion. Angst can manifest itself in interesting ways. Consider the Google TV which a number of people have found an amusing way to test their technical aptitude. The couch spud? Indifferent.

Third, online advertising is ramping up. But the big money is going to talent centric programming available on the Internet. AdWords is a great business, but a new ad business is emerging and Google has to figure out how to play a big part in that world. Adam Carolla may be a former radio DJ, but his growing empire represents an advertising opportunity that does not lend itself to Google’s algorithms at this moment.

The PocketGamer’s write up about Google and Angry Birds is interesting, but it does not apply to the larger forces at work on and within the Google. Google is in the closing innings of what is its worst public relations year in its 12 year history. Buzz, Wave, Germany, Google Books, and Google TV—quite a track record.

Controlled chaos is the method and it is now showing some flaws. And Google will find it difficult to change. There is no bonsai master able to take a 12 year old tree and squish it down to a seven inch living entity. One big tree does not make a forest.

Stephen E Arnold, October 24, 2010

Freebie

Microsoft and Google in the Library Stacks

October 18, 2010

You can read the original or stuff the url into Google Translate. Either way, Microsoft has Google in the library stacks looking for a missing deal. The library? Bibliothèque Nationale de France (BNF). The missing deal? Microsoft seems to have put the Google search solution in a glass case and tossed the key into a dark corner.  “Microsoft meilleur défenseur du Libre que Google pour les livres?” reported:

The agreement with Microsoft is not without consequence. Discussions are held between the BNF and Google to digitize the National Fund, with support from the Senate. But “since Google requires its exclusive partner libraries indexing, having signed an agreement with Microsoft that makes it so logically impossible now for the BNF to sign a partnership with Google to scan, except that it consents to back to one of the most important strategy.”

The problem is that Google wants exclusives. Microsoft offered a different deal. Now Google has to face the reality of Microsoft snatching a victory from Google’s fingers and putting Google in the rare book room.

How will this play out? Microsoft is likely to use this strategy to put the spotlight on Google’s terms for scanning and indexing. My view is that Microsoft should have continued its effort to compete with Google Books and implemented this spotlight tactic years ago. Now I think it is too late.

The comments to the article are interesting as well. Microsoft may have a trampoline to use to bounce on in this market sector.

Stephen E Arnold, October 18, 2010

Freebie

Google and Its Alleged SSD Innovations

October 18, 2010

I have shifted my attention from Google to Facebook. Nevertheless, my full scale Overflight continues to spit out information from open sources about Google. (This Overflight link shows a handful of features in the commercial version.) I wanted to capture what may be old news to my two or three readers in this blog post. As I shift from the uninteresting world of brute force indexing to the more easily manipulated world of social search, some technical innovations at Google remain interesting in a general way.

You will need to navigate to the USPTO, click Search, and then download these documents. I am not providing explicit links to the source documents due to the “free” nature of the blog.

The subject is the usefulness of solid state storage devices as a speed up and cost down method of dealing with the need to fetch and write data. Solid state devices or SSDs are a mixed blessing. There are some performance and failure benefits, but there is also flakiness, particularly with certain vendors’ products.

One example of an SSD for scale. Source: http://commons.wikimedia.org/wiki/File:MicroSD.jpg

Google has been looking for many years – probably as early as 2003 – at ways to get around the hassle of spinning disc failures, heat generation, and size. As silicon fabs push to smaller traces, the cost and availability of SSDs become increasingly attractive.

My Overflight system lit up with a series of patent applications that featured inputs from a Googler by the name of Albert T. Borchers, more easily findable as “Al Borchers.” Don’t get too revved up looking for information. He, like other post Alta Vista hires, is not a high profile type of guy in the Facebook sense of the word. With some poking around you can find some info like this bio at a conference site:

Al Borchers joined Google in 2004 in the Platforms group, developing system software for Google’s servers. In the last few years he has been working on high performance storage devices. He received a Ph.D. in theoretical computer science from the University of Minnesota in 1996, and has worked in industry for many years developing Unix and Linux device drivers and system software.

Read more

Coveo Adds Connectors

October 16, 2010

Coveo has announced new information indexing connectors. Among the new connectors are those for Jive SBS Versions 3.0 to 4.5, support for Microsoft SharePoint 2010, and Microsoft Exchange 2010. Coveo updated its connector for Lotus Notes. In the news release, we learned that Coveo is working with Netezza. Earlier this year we heard that Netezza was hooked into Attivio. Netezza, as you may know, is now part of IBM, a company which has been on a mini-spending spree.

One of the interesting comments in the news story was:

Out of the box, Coveo Information Indexing Connectors seamlessly and securely index enterprise-wide systems and data repositories. Coveo-developed connectors offer superior functionality and integration, including with the native security model of each system. Coveo Connectors feature live monitoring and dynamically index new, deleted and modified documents, ensuring just-in-time access to the timeliest information.

Connectors continue to have a pipeline to our in box. The i2 – Palantir legal matter is about connectors. With the green light turned on for this dust up, connectors are edging from back stage to center stage.

More information about Coveo is available at www.coveo.com.

Stephen E Arnold, October 16, 2010

Freebie

Linguamatics Joins Up with Accelrys

October 11, 2010

Linguamatics, a nifty content processing vendor in the UK, has formed a partnership for “streamlined, high performance text analytics” with Accelrys. Linguamatics will be giving a presentation at the Smart Content Conference in Manhattan later this month, so you can learn about the company first hand, or you can navigate to http://www.linguamatics.com/. The firm’s Web site has been refreshed and you can learn about the firm’s solutions directly.

Accelrys is a company that produces scientific informatics software. If you got a D in biology, you won’t be using Accelrys’ industrial strength analytics and visualization tools any time soon. Chemistry majors, engineers, and molecular biologists will be quite interested in the firm’s solutions.

What does the hook up mean?

According to “Linguamatics and Accelrys Announce Partnership for Streamlined, High-Performance Text Analytics,”

Mutual customers will benefit by embedding powerful natural language querying within more extensive informatics workflows including access via Accelrys web clients. Organizations continue to face the challenge of filtering ever-increasing volumes of text information to gain actionable knowledge. Linguamatics provides the ability to automate document indexing and querying within the I2E software platform in addition to its interactive information extraction capabilities. Embedding I2E within Pipeline Pilot workflows enables further streamlining of the process for high throughput text mining, and provides access to additional content processing, analytics and output display options.

I would not characterize the new capabilities as search or NLP. The companies are moving, like some others, into a data fusion space. Unlike search vendors who announce that they are now involved in Business Intelligence, Linguamatics and Accelrys have industrial strength technology in place to meet the needs of a specific market category. Just my opinion.

Stephen E Arnold, October 11, 2010

Freebie

Flexi-Search from Lucene

October 8, 2010

We have seen and experienced all types of search, but here’s a simple yet smart one that uses Lucene.NET, a direct port of the popular open source Java Lucene project. It has the features you want your search to handle: synonyms, misspellings, prefixes, suffixes, result rankings, weighting, and others. On his blog, JSprunger.com, John Sprunger’s “Getting started with Lucene.NET” describes Lucene-based search as capable of indexing all types of content, “including files, database records, and web pages.” It can be tweaked as required and used for “searching on ASP.NET web site, searching within a desktop app, as a web service search, or Windows service, etc.”

Embraced by biggies like EMC and Cisco, Lucene.NET has a highly abstracted class structure that allows ultimate flexibility, making it possible to change the way search works, to satisfy your clients. The article describes the procedure to create a simple search, illustrated with explanatory examples.
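The article’s own examples use Lucene.NET, and this is not that. As a rough illustration of the idea underneath any Lucene-style search (an inverted index, ranked results, prefix matching), here is a toy sketch in Python. The `TinyIndex` class and its methods are hypothetical names invented for this sketch, and the scoring is plain term counting, far simpler than Lucene’s.

```python
import re
from collections import Counter, defaultdict


class TinyIndex:
    """Toy inverted index: term -> {doc_id: how often the term occurs}."""

    def __init__(self):
        self.postings = defaultdict(Counter)

    def add(self, doc_id, text):
        """Split text into lowercase terms and record each occurrence."""
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[term][doc_id] += 1

    def search(self, query):
        """Return doc ids ranked by summed term counts.

        A query token ending in * matches every indexed term with that prefix.
        """
        scores = Counter()
        for token in query.lower().split():
            if token.endswith("*"):
                prefix = token[:-1]
                terms = [t for t in self.postings if t.startswith(prefix)]
            else:
                terms = [token] if token in self.postings else []
            for term in terms:
                scores.update(self.postings[term])
        return [doc_id for doc_id, _ in scores.most_common()]


if __name__ == "__main__":
    index = TinyIndex()
    index.add("a", "Lucene is a search library")
    index.add("b", "Solr wraps the Lucene search library; search servers speak HTTP")
    print(index.search("search"))  # "b" mentions the term twice, so it ranks first
    print(index.search("serv*"))   # prefix match on "servers"
```

Synonyms, misspellings, and weighting would each be another layer on top of the same term-to-documents map, which is the part a real library handles for you.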

Harleena Singh, October 8, 2010

Freebie

Black Duck Flaps into Open Source Reference

October 6, 2010

Late last year or early this year, I explained to a giant publishing and information company about some of the important trends building in electronic information. I must admit that the audience wished it were someplace else, probably at the golf course or at a sales meeting where smiles and promises worked better than innovation.

I mentioned in passing that the world of open source was gaining momentum. I have documented one facet of open source in this blog. My two or three readers have been as indifferent as the big publishing and information company was.

One outfit, however, either by virtue of executive acumen or simply looking at what’s happening in open source has jumped on the open source information opportunity. That company is Black Duck Software. You can read an interview with one of the firm’s top mallards at “Bill McQuaide, Black Duck Software.”

I learned on October 5, 2010, that Black Duck is on the path of becoming the “Google of open source.” Yep, the Google of open source. Now that moniker is a tough one to shake, so my view is that Black Duck has made a play that seems to me to be pretty darned savvy.

The firm acquired Ohloh.net. Navigate to the Ohloh Web site. You will be able to search a directory of open source software and a directory of developers. Now anyone who has fiddled with open source knows that the PL/1 dudes down the hall are not exactly ready to compile open source and hook it into software with unfamiliar names, often with obscure references to Tolkien, Star Trek, and a high school Latin class.

I did some advice-from-the-pond work on a couple of open source search start ups. I panned these. Google, at the time, had an open source search service. The 20 somethings called me after a couple of failed journalists, mid tier consultants, and unemployed CMS consultants struck out in the knowledge department. The open source search is available as Code Search at http://www.google.com/codesearch. Black Duck and Ohloh went further.

Several observations:

  • Traditional publishing companies are probably going to have to buy, license, or stroke the features of Black Duck. The company has an opportunity to build a robust information service and the big boys who prefer to play golf and head out for an early lunch have missed the boat.
  • Open source software is operating a bit like one of those minor earth tremors in Turkey or one of the “stans.” One day the building is just crumbling. When change occurs, folks look around for information and my hunch is that Black Duck may come up Number One on the Dancing with the Coding Stars.
  • Open source, unlike indexing business information, is pretty much an insiders’ game at this time. The “community”, which is tough to define, can be a really major pain in the bursa scattered in various parts of one’s anatomy.

I suppose I should feel bad that the big information companies missed an opportunity. But, if my memory is correct, less agile outfits just pay lots of money to buy a company with a great opportunity. That is good for the black ducks out there. Geese? Now that’s another story.

Stephen E Arnold, October 6, 2010

Freebie
