Connotate: Marketing by Listing Features

August 6, 2014

Connotate posted a page that lists 51 features. The title of the Web page is “What Connotate Does Better than Scripts, Scrapers, and Toolkits.” The 51 features are grouped into 10 categories. Several are standard content processing operations; for example, scaling, ease of use, and rapid deployment.

Several are somewhat fuzzy. A good example is the category “Efficiency”. Connotate explains this concept with these features:

  • Highly efficient code is automatically generated during Agent training
  • Agents bookmark the final destination and identify links that aren’t necessary, bypassing useless links and arriving at the desired data much faster
  • Optimized navigation also generates less traffic on target websites
  • Supports load balancing
  • Multi-threaded – supports simultaneous execution of multiple Agents on a single system
  • Optimizes resource usage by analyzing clues during runtime about the various intended uses of the extracted data

From my experience with training systems, I know that the process can be quite a job, particularly when the source content is not scientific, technical, or medical information. STM content is somewhat easier because its terminology is less colorful than that of social media content, for example. The deployment of agents that do not trigger a block by a target is a good idea. But load balancing is a different type of efficiency and one that is becoming part of some vendors’ punch lists.

I found the 51 items useful as a thought starter for crafting a listicle.

Stephen E Arnold, August 6, 2014

Exclusive Interview: Miles Kehoe, LucidWorks

January 30, 2013

Miles Kehoe, formerly a senior manager at Verity and then the founder of New Idea Engineering, joined LucidWorks in late 2012. I worked with Miles on a project and found him a top-notch resource for search and for the tough technical area that was our concern.

I was able to interview Miles Kehoe on January 25, 2013. He was forthcoming and offered me insights which I found fresh and practical. For example, he told me:

You know I come from a ‘platform neutral’ background, and I know many of the folks involved with ElasticSearch. Their product addresses many of the shortcomings in Solr 3.x, and a year or two ago that would have been a coup. But now, Solr 4 completely addresses those shortcomings, and then some, with SolrCloud and Zoo Keeper. ES says it doesn’t require a pesky ‘schema’ to define fields; and when you’re playing with a product for the first time, that is kind of nice. On the other hand, folks I know who have attempted production projects with ES tell me there’s no way you want to go into production without a schema. Apache Lucene and Solr enjoy a much larger community of developers. If you check the Wikipedia page, you’ll see that Lucene and Solr both list the Apache Software Foundation as the developer; Elastic Search lists a single developer, who it turns out, has made the vast majority of updates to date. While it is based on Apache Lucene, Elastic Search is not an Apache project. Both products support RESTful API usage, but Elastic requires all transactions to use JSON. Solr supports JSON as well, but goes beyond to support transactions in many formats including XML, Java, PHP, CSV and Python. This lets you write applications to interact with Solr in any language and with any protocol you want to use. But the most noticeable difference is that Solr has an awesome Web Based Admin UI, ES doesn’t. If you’re only writing code, you might not care, but the second a project is handed over to an Admin group they’re bound to notice! It makes me smile every time somebody says ES and “ease of use” in the same sentence – you remember the MS DOS prompt back in 1990? Although early adopters enjoyed that “simplicity”, business people preferred mouse-based systems like the Mac and Windows. 
We’re seeing this play out all over again – busy IT people want an admin UI – they don’t want to spend all day at what amounts to a “web command line”, stitching together URLs and JSON commands.

I found this comment prescient. I learned about a possible issue triggered by ElasticSearch in “Github Search Exposes Passwords Then Crashes.”

I pressed Mr. Kehoe for key points of differentiation in open source search. I pointed out that every vendor is rushing to embrace open source search. Some do it with lights flashing, like IBM, and others operate in a lower-profile manner, like Attivio. He told me:

Just as we have different products and services for our customers, we can customize our engagements to meet our customers’ needs. Some of our customers want to have deep product expertise in-house, and with training, best practice and advisory consulting, and operations/production consulting, we help them come up to speed. We also provide ongoing technical and production support for mission critical applications – just last month an eCommerce site ran into production problems on the Friday afternoon before Christmas. We were able to help them out and have them at full capacity before dinner. Not to dwell on it, but what sets LucidWorks apart is the people. We employ a large number of the team that created and enhances Lucene and Solr including Grant Ingersoll, Steve Rowe and Yonik Seeley. We also have significant expertise on the business side as well. At the top, Paul Doscher grew Exalead from an unknown firm into a major enterprise search player over just a few years; my former business partner Mark Bennett and I have built up deep understanding of search since our Verity days in the early 1990s.

Important information for those analyzing search systems, I believe.

You can read the full text of the interview in the ArnoldIT Search Wizards Speak series. Search Wizards Speak is the largest no-cost, freely available collection of interviews with experts in search and content processing. There are more than 60 interviews available.

Stephen E Arnold, January 30, 2013


IBM Returns to Pure Software Roots As Technology Evolves

December 27, 2012

Since IBM ceased its production of applications and reorganized into two organizations, Middleware and Solutions, in 2011, it has been pumping out infrastructure software and the complementary integration components to go with it. These internal organizational changes have helped the company determine the type of solutions it can offer to companies as the industry itself evolves.

Seeking Alpha’s article “So What Does IBM Mean When It Says It’s In The Solutions Business?” explains what type of solutions IBM will be providing in the future:

“It is not individual packaged products per se, but groups of related software products, services, and systems. And we know at very high level where IBM is going to focus its solutions efforts. IBM has always been about software, services, and systems – although in recent years the first two have taken front stage. The flip side is that some of these solutions areas are overly broad. Smarter Analytics is a catch-all covering the familiar areas of business intelligence and performance management, predictive analytics and analytical decision management, and analytic applications.”

Given the need for sustainable ROI in technology, it is unsurprising that IBM returned to its software roots. IBM seeks opportunities with best-in-class partners, and its association with leading enterprise search companies such as Intrafind is a relationship that seems to be paying off well. Intrafind was an early IBM Pure integrator, and both sides seem to be making the best of the relationship.

Jennifer Shockley, December 27, 2012

Sponsored by, developer of Augmentext

Companies Striving for Success Choose Proven Enterprise Search Software Providers

December 25, 2012

The days of limited mobile app options came to an end a few years ago with the increased popularity of BYOD (bring your own device) work options. A growing demand for products to simplify work processes brought about phenomenal improvements on tablets and mobile devices. In turn, the enterprise app market skyrocketed, not in price but in product offerings. Companies looking to invest in the most beneficial applications for their business will want to weigh their options carefully.

Enterprise Apps Today’s article “Choosing the Right Enterprise Apps for your Business” touches on the importance of all-around support when filtering through application options:

“Today, a hefty proportion of cutting-edge applications can be found on cloud platforms in the form of SaaS (software-as-a-service). While a quick glance at the website of an enterprise software offering will tell a great deal about the maturity of a project, it is hardly the entire story. For the huge investment of time and money that a business expects to make in an enterprise software deployment, it’s important to first ensure that a supporting ecosystem is in place.”

The article offers good advice and guidance on choosing the best applications, but companies striving for success will choose a proven enterprise search software provider. Intrafind offers guidance on strategy, applications and use of enterprise search software that can help businesses make the most of their investment. Financial firms and pharmaceutical industry leaders are just a few examples of the types of enterprise that rely on Intrafind’s capabilities.

Jennifer Shockley, December 25, 2012

Sponsored by, developer of Augmentext

Hybrid Cloud with Cloning Capability May Not Bode Well for Cloud Platform Developers

December 17, 2012

The introduction of hybrid technology comes as no surprise, but one has to wonder how current developers will feel about being cloned in the future. TechCrunch’s article “CloudVelocity Launches With $5M from Mayfield to Bring the Hybrid Cloud to the Enterprise” discusses the introduction of a hybrid cloud and its growing potential, along with its cloud cloning ability.

This new technology could save companies a bundle on initial investments, but smart platform designers may take precautions against cloning in the future. One has to wonder what preparations have already been made, if any. Investors want to be certain the risk of this approach is worth the effort.

“One Hybrid Cloud platform, aims to extend the enterprise data center to the public cloud, by enabling multi-tier applications to run without modification in the cloud and access services that reside in the enterprise data center. In a nutshell, the startup allows enterprises to get the benefits of private clouds in the public cloud. Users can discover, blueprint, clone, and migrate applications between data centers and public clouds. Currently, CloudVelocity supports full server, networking, security and storage integration with AWS but plans to integrate other public clouds.”

The excitement around startups and cloud solutions is great, but corporations are reluctant to take chances with sensitive data. Those enterprises seeking stability in the growing hybrid cloud universe may find some assurance in relying on a mature, capable enterprise provider. Intrafind offers consultative solutions and reliable cloud solutions with secure access.

Jennifer Shockley, December 17, 2012

Sponsored by, developer of Augmentext

Sound the Alarm: Reliable Enterprise Services Are Not Free

December 14, 2012

Sound the alarms! Information Week’s article “Google Apps No Longer Free For Businesses” announced doomsday news to those looking for a free ride, including perks, on the Big G. After six years, Google is finally pushing its premium business apps by eliminating upkeep and new availability for the free version.

Google does have a heart, as it will allow existing free users to continue utilizing the bare-bones services with limited customer service and no new upgrades:

“You get what you pay for because you can’t get what you didn’t pay for. That is, unless you already have it: Companies currently using the free version of Google Apps can continue to do so under the same terms. Individuals will be able to continue using Google’s Web apps, like Drive, Gmail and Docs at no cost through their Google Accounts. Businesses will be expected to pay for Google Apps for Business.”

The only surprise is that Google waited so long to push the remaining ‘free app’ businesses over to the premium side. When it comes to quality there is no such thing as free, and businesses that think they can get free, high-performing enterprise solutions may be better off investing in tried, true and dedicated technologies. The Intrafind search technology is mature, feature rich and offers a worthwhile return on investment – retrieving data when, where and how it is needed.

Jennifer Shockley, December 14, 2012

Sponsored by, developer of Augmentext

Companies Need Reliable Results Not another Plug and Play Experiment

December 10, 2012

The name Google instantly brings internet search, Android and mobile apps to mind, but that is just not enough for the Big G anymore. TechWeek’s article “Google Enterprise: More Than Just Apps” talks about a new device that Google representatives feel will take the enterprise by storm.

So, what is the next big step for Google? World enterprise domination via plug and play technology:

“This involves something called the ‘Google Search Appliance’ – a yellow box that can be plugged into the data center to look through and index business data. Recently launched Commerce Search is a similar project, but based in the cloud and focused on retail. A different part of the Enterprise department deals with geospatial products: Google Maps, Google Earth and the brand new Google Coordinate – the company’s first geo app to provide not just asset tracking, but the workflow management too.”

Of course this updated Google technology will be compatible with Chrome, Android and existing Google apps, but is this plug and play device the right answer for sophisticated enterprise needs? What happens when a change is needed to match unique enterprise requirements? We have found that the mature solutions and dedicated customer service from Intrafind often meet the needs of enterprises with sophisticated requirements. Perhaps a commercial solution built on open source can better match unique enterprise search needs than a plug and play appliance.

Jennifer Shockley, December 10, 2012

Sponsored by, developer of Augmentext

High Quality Research Surrounding Enterprise Search

November 13, 2012

Enterprise search requires companies to tap into their internal knowledge, and it has to be done in a way that makes the process quick and accessible for users. Some high-quality research is being done surrounding the capabilities and necessary features of search applications.

Research article “Designing for Enterprise Search in a Global Organization,” authored by the growing search consultancy Findwise, focuses on design concepts surrounding the company’s attempt at a search application. The company’s goal was to create a search application that provides quick access to all internal information, help users find and discover information, and create possibilities for collaboration.

The second attempt at an application focused on simplicity and design:

“The result was an application that seemed very simple at first glance, but still included all the different functionality needed in order to fulfill the information needs of the organization’s different user groups. The new design was evaluated through usage test and though it included the same functionality as the old search application the results were completely different. Users found it not only easier to use but also easy to discover new information.”

Intrafind was based upon open source technology that was developed in a similar fashion. The advantage, of course, lies in age and wisdom after years of business with well-qualified leadership such as that provided by the Director of Research at Intrafind, Christoph Goller. Goller’s experience in artificial intelligence research, as well as machine learning and neural networks, carries over into his work in scalable information retrieval and search-based applications at Intrafind.

Andrea Hayden, November 13, 2012

Sponsored by, developer of Augmentext

Intrafind Offers Tagging Service Among Other Enterprise Tools

November 12, 2012

We have been increasingly aware of software publisher Intrafind, and decided to take a self-directed tour of the company’s Web site to see what features and tools were offered. We were immediately impressed with the sleek look and easy-to-navigate menus, which steered us through products, solutions, case studies, and consulting links.

Our team found the clear explanations of Intrafind’s products particularly useful. The page for the company’s Tagging Service, for example, details the types of tagging that are provided as well as how the system could be incorporated into a business’s existing infrastructure. Here’s the description from the product page:

“The IntraFind Tagging Service includes an automated generation of metadata / tags based on the processed content. The generated tags can be either inserted into a leading system or can be incorporated into a workflow of any customer-specific use case. The Tagging Service can be provided as an on-premise or cloud solution.

The service consists of different standardized tagging-types that can also be configured if needed: uncontrolled tagging, controlled tagging, the extraction of named entities and the generation of topic metadata.”
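To make the distinction between two of these tagging types concrete, here is a minimal sketch in Python. It is not Intrafind's implementation: the stopword list, the vocabulary, and the function names are hypothetical, and the frequency count stands in for whatever analysis a real service performs. It covers only uncontrolled and controlled tagging, not named entity extraction or topic metadata.

```python
import re
from collections import Counter

# Hypothetical stopword list and vocabulary, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "with"}
CONTROLLED_VOCABULARY = {"enterprise search", "metadata", "cloud"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def uncontrolled_tags(text, limit=3):
    """Uncontrolled tagging: propose the most frequent non-stopword terms."""
    counts = Counter(
        token for token in tokenize(text)
        if token not in STOPWORDS and len(token) > 2
    )
    return [term for term, _ in counts.most_common(limit)]


def controlled_tags(text, vocabulary=CONTROLLED_VOCABULARY):
    """Controlled tagging: keep only terms found in a fixed vocabulary."""
    # Pad with spaces so that a term matches whole words only.
    normalized = " " + " ".join(tokenize(text)) + " "
    return sorted(term for term in vocabulary if f" {term} " in normalized)


if __name__ == "__main__":
    sample = (
        "Enterprise search depends on metadata. "
        "Good metadata makes enterprise search useful in the cloud."
    )
    print("uncontrolled:", uncontrolled_tags(sample))
    print("controlled:", controlled_tags(sample))
```

The uncontrolled function can surface any term that appears in the text, while the controlled function can only return entries from the predefined list, which is the trade-off between coverage and consistency that the two tagging types imply.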

The enterprise data specialist company is located in Germany and has been operating since 2000. The team consists of 25 experts specializing in file systems, databases, document and content management, and Internet content. Intrafind provides everything from introductory analysis to maintenance and support. For more information, steer your browser to the company’s homepage.

Andrea Hayden, November 12, 2012

Sponsored by, developer of Augmentext

Reliable Vendors Offer Customized Enterprise Search

November 7, 2012

Companies utilizing reliable data management software have been reaping the benefits of Big Data, but developers are now scrambling to one-up each other. It seems some software designers are making unrealistic promises, while others proudly announce claims of software reinvention that stem from simple modifications offered by longstanding developers as a given.

SC Magazine’s article “Splunk Claims Speedier Reports with Enterprise 5” stresses how Splunk plans on winning the Big Data race by improving report rendering speed:

“Splunk Enterprise already provided the ability to search, analyze and visualize machine data on tablets, smartphones, laptops and non-flash browsers. In Splunk Enterprise 4.x., skilled users could refine searches to save time, but “most users didn’t have the skill set required.  An ad hoc report on ‘Web Errors broken out by URL and WebServer over the Last Month’ in a large multi-data center web environment across multiple terabytes of data might take 30 minutes to run. With report acceleration in Splunk Enterprise 5, that same report would render in less than two seconds.”

Faster does not necessarily mean better, and big claims in Big Data should be taken with a grain of salt. Intrafind offers businesses enterprise search that is tailor-made to fit specific enterprise information needs. The ability to define requirements, tweak criteria and customize search solutions increases efficiencies and provides a better ROI thanks to more targeted, relevant results.

Jennifer Shockley, November 7, 2012

Sponsored by, developer of Augmentext
