Francois Schiettecatte, FS Consulting
June 1, 2009
Through a mutual contact, I reconnected with François Schiettecatte, a search engine expert with other computer wizard skills in his toolbox. Mr. Schiettecatte worked on a natural language processing project in the late 1990s. He shifted focus and was a co-founder of Feedster.com. He told me that he had contributed to a number of interesting projects and revealed that he was working on a new search and content processing system.
Mr. Schiettecatte consented to an interview. I spoke with him on May 29, and I put the full text of our discussion in the ArnoldIT.com Search Wizards Speak collection. You can find that series of interviews with influential figures in search and content processing here.
Mr. Schiettecatte and I had a lively discussion and he offered some interesting insights into the trajectory of search and retrieval. Let me highlight two of his comments and invite you to read the full text of the discussion here.
In response to a question about the new start ups entering the search and retrieval sector, Mr. Schiettecatte said:
You can apply different search approaches to different data sets, for example traditional search as well as NLP search to the same set of documents. And certain data sets will lend themselves more naturally to one type of search as opposed to another. Of course user needs are key here in deciding what approaches work best for what data. I would also add that we have only begun to tackle search and that there is much more to be done, and new companies are usually the ones willing to bring new approaches to the market.
We then discussed the continuing interest in semantic technology. On this matter, Mr. Schiettecatte offered:
More data to search usually means more possible answers to a search, which means that I have to scan more to arrive at the answer; improved precision will go a long way to address that issue. A more pedestrian way to put this is: “I don’t care if there are about a million results, I just want the one result.” Also, having the search engine take the extra step in extracting data out of the search results and synthesizing that data into a meaningful table/report. This is more complicated but it has the potential to really save time in the long run.
For more information about Mr. Schiettecatte’s most recent project, read the full text of the interview here.
Stephen Arnold, June 2, 2009
Exclusive Interview with Bjorn Laukli, Comperio Search
May 28, 2009
I met Bjorn Laukli about nine years ago. At that time he was the chief technical officer of Fast Search & Transfer. In February 2008 I had Fast Search executives slated to participate in the Search Wizards Speak series here, but the Microsoft deal was underway and the lawyers pulled the plug. I ran into Mr. Laukli at a recent meeting, and I learned that he is now affiliated with Comperio Search, a specialist in the Fast Enterprise Search Platform. I was able to get some of his time on May 26, 2009. The result is a Search Wizards Speak interview. Highlights of the conversation with Mr. Laukli included:
An observation that consolidation in the enterprise search sector will continue. Mr. Laukli said:
I think you will continue to see some consolidations, Microsoft acquiring FAST, Omniture buying Mercado etc. This trend will continue as larger companies are trying to strengthen their position within search, bring in new technologies that will bring value to their overall offering, or increasing their customer base.
He also said about the number of new companies entering the search sector:
Certainly, there are many new companies entering the search and content processing space. I still feel there are opportunities for most of them to succeed as long as they are able to differentiate themselves in some ways. This can be done either by focusing on certain segments of the market, or by delivering new capabilities as an addition to existing platforms.
You can read the full interview on Search Wizards Speak here. More information about Comperio, a Fast Certified Partner, is available from www.comperiosearch.nl.
Stephen Arnold, May 28, 2009
Exclusive Interview: Donna Spencer, Enterprise Systems Expert
April 20, 2009
Editor’s Note: Another speaker for what looks like a stellar conference agreed to an interview with Janus Boye. As you know, the Boye 09 Conference in Philadelphia takes place the first week in May 2009, May 5 to May 7, 2009, to be precise. Attendees can choose from a number of special interest tracks. These cover a range of topics, including strategy and governance, Intranet, Web content management, SharePoint, user experience, and eHealth. Click here for more conference information. Janus Boye spoke with Donna Spencer on April 16, 2009.
Ms. Spencer is a freelance information architect, interaction designer and writer. She plans how to present the things you see on your computer screen, so that they’re easy to understand, engaging and compelling: Things like the navigation, forms, categories and words on intranets, websites, web applications and business systems.
The full text of the interview appears below.
Why is it so hard for organizations to get a grip on user experience design?
I don’t know that this is necessarily true. There are lots of organizations creating awesome user experiences. Of course, there are a lot who aren’t creating great experiences, but it isn’t because they can’t get a grip on user experience, it is because they care more about themselves than about their customers. If they really cared about their customers they’d do stuff to make their experiences great - and that’s possible without even knowing anything formal about user experience. But because they don’t care about their customers, they will fail, as they should…
Is content or visual design most important to the user experience?
Content (or functionality) is ultimately what people visit a website, intranet or application for. So it’s really, really important to get that right. If the core of the product is bad, it isn’t going to work.
But the visual design is often the part that helps people to get to the content. If the layout is poor, the colours and contrast awful and the site looks like it was designed in 1995, that’s going to stop people from even trying.
So both are important, though if I ever had to choose, I’d go for great content.
Is your book on card sorting really going to be released in 2009?
Yes, by the time the conference is on, there should be real, printed books. 150-odd pages of card sorting goodness. I hear that it should be out around 28 April. Really. I promise.
Does Facebook actually offer a better user experience after the redesign?
That’s a really interesting question. I can only speak for myself, but the thing that struck me about the redesign is that all of a sudden Facebook feels like a different beast. It used to be a site where friends were, but also where there were events, and groups and silly apps. Now it just feels like twitter that you can reply to. It feels like they have done a complete turn-around on who they actually are.
So for me the experience is worse. I can get a better idea of what my friends are doing, but I do that via twitter. Now it’s much harder for me to experience groups, events and all the other things we used to do there. I’m definitely using it less.
Why are you speaking at a Philadelphia web conference organized by a company based in Denmark?
Because they rock! But really, their core business overlaps a lot with what I do. I’m interested in the content the conference offers and I think my experience offers a lot to the attendees. Plus I’ve never been to Philly, and travelling to new places is a wonderful learning experience.
Exclusive Interview with MaxxCat
April 15, 2009
I spoke with Jim Jackson on April 14, 2009. Maxxcat is a search and content processing vendor delivering appliance solutions. The full text of the interview appears below:
Why another appliance to address a search and content processing problem?
At MaxxCat, we believe that from the performance and cost perspectives, appliance based computing provides the best overall value. The GSA and Google Mini are the market leaders, but provide only moderate performance at an expensive price point. We believe that by continuously obsessing about performance in the three major dimensions of search (volume of data, speed of retrieval, and crawl/indexing times), our appliances will continue to improve. Software only solutions can not match the performance of our appliances. Nor can software only, or general purpose hardware approaches provide the scaling, high availability or ease of use of a gray-box appliance. From an overall cost perspective, even free software such as Lucene, may end up being more expensive than our drop-in and use appliance.
A second factor that is growing more important is the ease of integration of the appliance. Many of our customers have found unique and unexpected uses for our appliances that would have been very difficult to implement with black box architectures like Google’s. Our entry level appliance can be set up in 3 minutes, comes with a quick start guide that is only 12 pages long, and can be administered from two simple browser pages. That’s it! Conversely, software such as Lucene has to be downloaded, configured, installed, understood, matched with suitable hardware. This is typically followed by a steep learning curve and consulting fees from experts who are involved in getting a working solution, which sometimes doesn’t work, or won’t scale.
But just because the appliance features easy integration, this does not mean that complex tasks cannot be accomplished with it. To aid our customers in integrating our appliances with their computing environments, we expose most of the features of the appliance to a web API. The appliance can be started, stopped, backed up, queried, pointed at content, SNMP monitored, and reported upon by external applications. This greatly eases the burden on developers who wish to customize the output, crawl behavior and integration points with our appliance. Of course this level of customization is available with open source software solutions, but at what price? And most other hardware appliances do not expose the hardware and operating system to manipulation.
Throughput becomes an issue eventually. What are the scaling options you offer?
Throughput is our major concern. Even our entry level appliance offers impressive performance using, for the most part, general purpose hardware. We have developed a micro-kernel architecture that scales from our SB-250 appliance all the way through our 6 enterprise models. Our clustering technology has been built to deliver performance over a wide range of the three dimensions that I mentioned before. Some customers have huge volumes of data that are updated and queried relatively infrequently. Our EX-5700 appliance runs the MaxxCAT kernel in a horizontal, front-facing cluster mode sitting on top of our proprietary SAN; storage heavy, adequate performance for retrieval. Other customers may have very high search volumes on relatively smaller data sets (< 1 Exabyte). In this case, the MaxxCAT kernel runs the nodes in a stacked cluster for maximum parallelism of retrieval. Same operating system, same search hardware, same query language, same configuration files etc, but two very different applications. Both heavy usage cases, but heavy in different dimensions. So I guess the point I am trying to make is that you can say a system scales, but does it scale well in all dimensions, or can you just throw storage on it? The MaxxCAT is the only appliance that we know of that offers multiple clustering paradigms from a single kernel. And by the way, with the simple flick of a switch on one of the two administration screens I mentioned before, the clusters can be converted to H/A, with symmetric load balancing, automatic fault detection, recovery and fail over.
Where did the idea for the MaxxCat solution originate?
Maxxcat was inspired by the growing frustration with the intrinsic limitations of the GSA and Google Mini. We were hearing lamentations in the market place with respect to pricing, licensing, uptime, performance and integration. So…we seized the opportunity to build a very fast, inexpensive enterprise search capability that was much more open, and easier to integrate using the latest web technologies and general purpose hardware. Originally, we had conceived it as a single stand alone appliance, but as we moved from alpha development to beta we realized that our core search kernel and algorithms would scale to much more complex computing topologies. This is why we began work on the clustering, H/A and SAN interfaces that have resulted in the EX-5000 series of appliances.
What’s a use case for your system?
I am going to answer your question twice, for the same price. One of our customers had an application in which they had to continuously scan literally hundreds of millions of documents for certain phrases as part of a service that they were providing to their customers, and marry that data with a structured database. The solution they had in place before working with us was a cobbled together mish mash of SQL databases, expensive server platforms and proprietary software. They were using MS SQLServer to do full text searching, which is a performance disaster. They had queries that were running on very high end Dell quad core servers maxed out with memory that were taking 22 hours to process. Our entry level enterprise appliance is now doing those same queries in under 10 minutes, but the excitement doesn’t stop there. Because our architecture is so open, they were able to structure the output of the MaxxCAT into SQL statements that were fed back into their application and eliminate 6 pieces of hardware and two databases. And now, for the free, second answer. We are working with a consortium of publishers who all have very large volumes of data, but in widely varying formats, locations and platforms. By using a MaxxCAT cluster, we are able to provide these customers, not customers from different divisions of the same company, but different companies, with unified access to their pooled data. So the benefits in both of these cases are performance, economy, time to market, and ease of implementation.
Where did the name “MaxxCat” come from?
There are three (at least) versions of the story, and I do not feel empowered to arbitrate between the factions. The acronym comes from Content Addressable Technology, an old CS/EE term. Most computer memories work by presenting the memory with an address, and the memory retrieves the content. Our system works in reverse: the system is presented with content, and the addresses are found. A rival group, consisting primarily of Guitar Hero players, claims that the name evokes a double x fast version of the Unix ‘cat’ command (wouldn’t MaxxGrep have been more appropriate?). And the final faction, consisting primarily of our low level programmers, claims that the name came from a very fast female cat, named Max, who sometimes shows up at our offices. I would make as many friends as enemies if I were to reveal my own leanings. Meow.
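Editor’s illustration: the “content in, addresses out” idea described above can be sketched with a toy inverted index. This is a minimal sketch for readers, not MaxxCAT’s implementation; the function names (build_index, lookup) and the sample documents are assumptions made up for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document addresses containing it."""
    index = defaultdict(set)
    for address, text in documents.items():
        for word in text.lower().split():
            index[word].add(address)
    return index


def lookup(index, phrase):
    """Return the addresses of documents that contain every word in the phrase."""
    words = phrase.lower().split()
    if not words:
        return set()
    # Start with the addresses for the first word, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data: address -> content.
documents = {
    "doc-1": "fast enterprise search appliance",
    "doc-2": "enterprise content management",
    "doc-3": "fast cat",
}
index = build_index(documents)
print(sorted(lookup(index, "fast enterprise")))
```

An ordinary memory would be asked for "doc-1" and return its text; here the caller supplies words and gets back the addresses where they occur.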
What’s the product line up today?
Our entry level appliance is the SB-250, and starts at a price point of $1,995. It can handle up to several million web pages or documents, depending upon size. None of our appliances have artificial license restrictions based upon silly things like document counts. We then have 6 models of our EX-5000 enterprise appliances that are configured in ever increasing numbers of nodes, storage, and throughput. We really try to understand a customer’s application before making a recommendation, and prefer to do proofs of concept with the customer’s actual data, because, as any good search practitioner can tell you, the devil is in the data.
What is the technical approach of your search and content processing system?
We are most concerned with performance, scalability and ease of use. First of all, we try to keep things as simple as possible, and if complexity is necessary, we try to bury it in the appliance, rather than making the customer deal with it. A note on performance; our approach has been to start with general purpose hardware and a basic Linux configuration. We then threw out most of Linux, and built our operating system that attempts to take advantage of every small detail we know about search. A general purpose Linux machine has been designed to run databases, run graphics applications, handle network routing, sharing and interface to a wide range of devices and so forth. It is sort of good at all of them, but not built from the ground up for any one of them. This fact is part of the beauty of building a hardware appliance dedicated to one function — we can throw out most of the operating system that does things like network routing, process scheduling, user accounting and so forth, and make the hardware scream through only the things that are pertinent to search. We are also obsessive about what may seem to be picayune details to most other software developers. We have meetings where each line of code is reviewed and developers are berated for using one more byte or one more cycle than necessary. If you watch the picoseconds, the microseconds will take care of themselves.
A lot of our development methodology would be anathema to other software firms. We could not care less about portability or platform independence. Object oriented is a wonderful idea, unless it costs one extra byte or cycle. We literally have search algorithms that are so obscure, they take advantage of the endianness of the platform. When we want to do something fast, we go back to Knuth, Salton and Hartmanis, rather than reading about the latest greatest on the net. We are very focused on keeping things small, fast, and tight. If we have a choice between adding a feature or taking one out, it is nearly unanimous to take it out. We are all infected with the joy of making code fast and small. You might ask, “Isn’t that what optimizing compilers do?” You would be laughed out of our building. Optimizing compilers are not aware of the meta algorithms, the operating system threading, the file system structure and the like. We consider an assembler a high level programming tool, sort of. Unlike Microsoft operating systems, which keep getting bigger and slower, we are on a quest to make ours smaller, faster. We are not satisfied yet, and maybe we won’t ever get there. Hardware is changing really fast too, so the opportunities continue.
How has the Google Search Appliance affected the market for your firm’s appliance?
I think that the marketing and demand generation done by Google for the GSA is helping to create demand and awareness for enterprise search, which helps us. Usually, especially on the higher end of the spectrum, people who are considering a GSA will shop a little, or when they come back with the price tag, their boss will tell them “What??? Shop This!”. They are very happy when they find out about us. What we share with Google is a belief in box based search (they advocate a totally closed black box, we have a gray box philosophy where we hide what you don’t need to know about, but expose what you do). Both of our companies have realized the benefits of dedicating hardware to a special task using low cost, mass produced components to build a platform. Google offers massive brand awareness and a giant company (dare I say bureaucracy). We offer our customers a higher performing, lower cost, extensible platform that makes it very easy to do things that are very difficult with the Google Mini or GSA.
What hooks / services does your API offer?
Every function that is available from the browser based user interface is exported through the API. In fact, our front end runs on top of the API, so customers who are so inclined to do so could rewrite or re-organize the management console. Using the API, detailed machine status can be obtained. Things such as core temperature, queries per minute, available disk space, current crawl stats, errors and console logs are all at the user’s fingertips. Furthermore, collections can be added, dropped, scheduled and downloaded through the API. Our configuration and query languages are simple, text based protocols, and users can use text editors or software to generate and manipulate the control structures. Don’t like how fast the MaxxCAT is crawling your intranet, or when? Control it with external scheduling software. We don’t want to build that and make you learn how to use it. Use Unix cron for that if that’s what you like and are used to. For security reasons, do you want to suspend query processing during non-business hours? No problem. Do it from a browser or do it from a mainframe.
We also offer a number of protocol connectors to talk to external systems — HTTP, HTTPS, NFS, FTP, ODBC. And we can import the most common document formats, and provide a mechanism for customers to integrate additional format connectors. We have licensed a very slick technology for indexing ODBC databases. A template can be created to create pages from the database and the template can be included in the MaxxCAT control file. When it is time to update say, the invoice collection, the MaxxCAT can talk directly to the legacy system and pull the required records (or those that have changed or any other SQL selectable parameters), and format them as actual documents prior to indexing. This takes a lot of work off of the integration team. Databases are traditionally tricky to index, but we really like this solution.
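Editor’s illustration: the template idea described above (select rows with SQL, then format each row as a document before indexing) can be sketched in a few lines. This is a rough sketch under stated assumptions, not the MaxxCAT control-file syntax or its ODBC extractor: Python’s built-in sqlite3 module stands in for an ODBC source, and the table, column names and template text are hypothetical.

```python
import sqlite3
from string import Template

# Hypothetical page template: one text document per selected row.
PAGE_TEMPLATE = Template("Invoice $number\nCustomer: $customer\nTotal: $total\n")


def rows_to_documents(connection, query, template):
    """Run a SQL query and render each resulting row into a text document."""
    connection.row_factory = sqlite3.Row  # rows become addressable by column name
    cursor = connection.execute(query)
    return [template.substitute(dict(row)) for row in cursor]


# Stand-in for the legacy system: an in-memory table with a "changed" flag.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE invoices (number TEXT, customer TEXT, total REAL, changed INTEGER)"
)
connection.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [("A-100", "Acme", 250.0, 1), ("A-101", "Globex", 90.5, 0)],
)

# Pull only the records that have changed, as in the invoice example above.
documents = rows_to_documents(
    connection,
    "SELECT number, customer, total FROM invoices WHERE changed = 1",
    PAGE_TEMPLATE,
)
for document in documents:
    print(document)
```

The point of the sketch is the division of labor: the SQL decides which records are pulled, and the template decides what the indexed “page” looks like.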
With respect to customizing output, we emit a standard JSON object that contains the result and provide a simple templating language to format those results. If users want to integrate the results with SSIs or external systems, it is very straightforward to pass this data around, and to manipulate it. This is one area where we excel against Google, which only provides a very clunky XML output format that is server based, and hard to work with. Our appliance can literally become a sub-routine in somebody else’s system.
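Editor’s illustration: consuming a JSON result object and formatting it with a simple template might look like the sketch below. The JSON shape (a "results" list with "title" and "url" fields), the sample URLs and the template are assumptions for the example; the interview does not describe the appliance’s actual result schema or its templating language.

```python
import html
import json
from string import Template

# Hypothetical response body; the real result object may be shaped differently.
RAW_RESPONSE = """
{"query": "invoice",
 "results": [
   {"title": "Invoice A-100", "url": "http://intranet.example/a-100"},
   {"title": "Invoice A-101", "url": "http://intranet.example/a-101"}
 ]}
"""

ROW_TEMPLATE = Template('<li><a href="$url">$title</a></li>')


def render_results(raw_json, row_template):
    """Parse a JSON result object and render each hit as an HTML list item."""
    payload = json.loads(raw_json)
    rows = []
    for hit in payload["results"]:
        # Escape values so titles containing markup cannot break the page.
        safe_hit = {key: html.escape(str(value)) for key, value in hit.items()}
        rows.append(row_template.substitute(safe_hit))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


markup = render_results(RAW_RESPONSE, ROW_TEMPLATE)
print(markup)
```

Because the result is plain data, the same payload could just as easily be passed to a server-side include or another program instead of being rendered as HTML.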
What are new features and functions added since the last point release of your product?
Our 3.2 OS (not yet released) will provide improved indexing performance, a handful of new API methods, and most exciting for us, a template based ODBC extractor that should make pulling data out of SQL databases a breeze for our customers. We also have scheduled toggle-switch H/A, but that may take a little more time to make it completely transparent to the users.
Consolidation and vendors going out of business like SurfRay seem to be a feature of the search sector. How will these business conditions affect your company?
Another strange thing about MaxxCAT, in addition to our iconoclastic development methods is our capital structure. Unlike most technology companies, especially young ones, we live off of revenue, not equity infusions. And we carry no debt. So we are somewhat insulated from the current downturn in the capital markets, and intend to survive on customers, not investors. Our major focus is to make our appliances better and faster. Although we like to be involved in the evaluation process with our customers, in all but the most difficult of cases, we prefer to hand off the implementation to partners who are familiar with our capabilities and who can bring in-depth enterprise search know how into the mix.
Where do I go to get more information?
Visit www.maxxcat.com or email sales@maxxcat.com
Stephen Arnold, April 15, 2009
Bob Boiko, Exclusive Interview
April 9, 2009
The J Boye Conference will be held in Philadelphia, May 5 to May 7, 2009. Attendees can choose from a number of special interest tracks. These include strategy and governance, Intranet, Web content management, SharePoint, user experience, and eHealth. You can get more information about this conference here.
One of the featured speakers is Bob Boiko, author of Laughing at the CIO and a senior lecturer at the University of Washington iSchool. Peter Sejersen spoke with Mr. Boiko about the upcoming conference and information management today.
Why is it better to talk about “Information Management” than “Content Management”?
Content is just one kind of information. Document management, records management, asset management and a host of other “managements” including data management all deal with other worthy forms of information. While the objects differ between managements (CM has content items, DM has file, and so on) the principles are the same. So why not unite as a discipline around information rather than fracture because you call them records and I call them assets?
Who should be responsible for the information management in the organization?
That’s a hard question to answer outside of a particular organizational context. I can’t tell you who should manage information in *your* organization. But it seems to me in general that we already have *Information* Technology groups and Chief *Information* Officers, so they would be a good place to start. The real question is: are the people with the titles ready to really embrace the full spectrum of activities that their titles imply?
What is your best advice to people working with information management?
Again, advice has to vary with the context. I’ve never found two organizations that needed the same specific advice. However, we can all benefit from this simple idea. If, as we all seem to believe, information has value, then our first requirement must be to find that value and figure out how to quantify it in terms of both user information needs and organizational goals. Only then should we go on to building systems that move information from source to destination because only then will we know what the right sources and destinations are.
Your book “Laughing at the CIO” has a catchy title, but have you ever laughed at your CIO yourself?
I don’t actually. But it is always amazing to me how many nervous (and not so nervous) snickers I hear when I say the title. The sad fact is that a lot of the people I interact with don’t see their leadership as relevant. Many (but definitely not all) IT leaders forget or never knew that there is an I to be led as well as a T. It’s not malicious; it has just never been their focus. I gave the book that title in an attempt to make it less ignorable to IT leaders. Once a leader (or would be leader) picks the book up, I hope it helps them build a base of strength and power based on the strategic use of information as well as technology.
Why are you speaking at a Philadelphia web conference organized by a company based in Denmark?
Janus and his crew are dynamite organizers. They know how to make a conference much more than a series of speeches. They have been connecting professionals and leaders with each other and with global talent for a long time. Those Danes get it and they know how to get you to get it too.
Peter Sejersen, J Boye. April 9, 2009
Exclusive Interview with David Pogue
April 8, 2009
This year’s most exciting conference for online professionals in Philadelphia is now only four weeks away. In addition to top notch speakers like David Pogue, the networking opportunities at a J. Boye conference are excellent.
One attendee said, “What I like about the J. Boye Conferences is that they bring together industry experts and practitioners over high-quality content that seems to push participants’ professional limits and gets everyone talking. So if you want to learn – but participate as well – consider joining us in Philadelphia this May.”
Instead of product pitches, the speakers at a J. Boye conference deliver substance. For example, among the newest confirmed case studies are Abercrombie & Fitch, Foreign Affairs and International Trade Canada, Pan American Health Organization, Hanley Wood and Oxford University (UK).
For a preview of what you will experience, here’s an exclusive interview with David Pogue, technology expert and New York Times journalist. Sign up here and secure one of the remaining seats.
Why is Google so much more used than its competitors?
Mostly because it’s better. Fast, good, idiotproof, uncluttered, ubiquitous. There’s also, at this point, a “McDonald’s factor” happening. That is, people know the experience, it’s the same everywhere they go, there’s no risk. They use Google because they’ve always used Google. It would be very hard, therefore, for any rival to gain traction.
David Pogue, one of the featured speakers at JBoye 09 in Philadelphia May 5 to 7, 2009.
When will Gmail become the preferred email solution for organizations?
August 3, 2014. But seriously, folks. Nobody can predict the future of technology. Also, I’m sure plenty of organizations use it already, and it’s only picking up steam. Gmail is becoming truly amazing.
Will Google buy Twitter - and what will it mean if they do?
I don’t know if they’ll buy it; nobody does. It would probably mean very little except a guaranteed survival for Twitter, perhaps with enhancements along the way. That’s been Google’s pattern (for example, when it bought GrandCentral.)
Why is it so hard for organizations to get a grip on user experience design?
The problems include lack of expertise, limited budget (there’s an incentive to do things cheaply rather than properly), and lack of vision. In other words, anything done by committee generally winds up less elegant than something done by a single, focused person who knows what he’s doing.
Why are you speaking at a Philadelphia web conference organized by a Denmark-based company?
Because they obviously have excellent taste.
Stephen Arnold, April 8, 2009
Adhere Solutions: Sticky Solutions and Connectors
March 31, 2009
I like Adhere Solutions’ software. I should. The company was conceived by my son, Erik S. Arnold. He once worked with the goslings, but he flew the coop to Chicago and services clients worldwide with his sticky solutions and connectors technology. Stuart Schram IV, one of ArnoldIT’s top geese, interviewed Erik Arnold. The full text of the conversation appears below. After the interview, you can read the full text of the Adhere Solutions news release about its newest product.
Erik S. Arnold, Adhere Solutions. Quite Googley and reliable I wish to add.
What’s an Adhere?
Adhere Solutions is a Google Enterprise Partner providing products and services that help businesses create solutions based on Google and other cloud computing technologies. We have an experienced team of consultants to help our customers leverage Google’s Enterprise products (Search, Maps, Apps) to create business applications that improve access to information, communication and collaboration. Adhere will complement Google’s enterprise products with other software and services to meet clients’ needs. Using Google as a foundation delivers applications faster and cheaper than traditional enterprise software approaches, while making end users happy. Few managers understand how they can create high-end solutions leveraging Google technologies.
Why are you providing connectors?
Connectors are an important piece of the puzzle to take advantage of Google technologies. For the GSA, they allow users to search across different sources of information inside an enterprise. I call the Google Search Appliance a “SaaS in the Box,” because you can do sophisticated things with it if you leverage its APIs. However, you do have to have a good deal of search expertise to use the advanced capabilities.
Adhere Solutions wants to make it easy for GSA customers to index their enterprise data, and our connectors bridge the gap between the GSA and internal content stored in databases, document management systems, etc. This approach is the same as other enterprise software solutions, but customers are shielded through expensive professional services and setup fees. We want to educate the marketplace that they can use the GSA to perform these functions with connectors for a lower cost.
What’s a typical use case for your software?
Good question. I think that connectors in a search environment are easier to understand. We have a customer at a government agency that wishes to index a Documentum system with Google. Our connector extracts the data from Documentum, processes the data, and feeds it to the GSA. This process takes place on a server that is outside of the GSA.
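The extract, process, and feed steps described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Adhere Solutions' connector: the `SourceDocument` and `FeedRecord` shapes, the `run_connector` function, and the `feed` callback are all hypothetical names, and no real Documentum or Google Search Appliance API is called.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class SourceDocument:
    """A record as it might come out of a content repository (hypothetical shape)."""
    doc_id: str
    title: str
    body: str


@dataclass
class FeedRecord:
    """A normalized record ready to hand to a search index (hypothetical shape)."""
    url: str
    title: str
    text: str


def extract(repository: Iterable[SourceDocument]) -> Iterator[SourceDocument]:
    """Step 1: pull documents out of the source system, skipping empty ones."""
    for doc in repository:
        if doc.body.strip():
            yield doc


def process(doc: SourceDocument, base_url: str) -> FeedRecord:
    """Step 2: collapse whitespace and give the record an addressable URL."""
    return FeedRecord(
        url=f"{base_url}/{doc.doc_id}",
        title=doc.title.strip(),
        text=" ".join(doc.body.split()),
    )


def run_connector(
    repository: Iterable[SourceDocument],
    base_url: str,
    feed: Callable[[FeedRecord], None],
) -> int:
    """Step 3: feed each processed record to the index and return the count sent."""
    sent = 0
    for doc in extract(repository):
        feed(process(doc, base_url))
        sent += 1
    return sent
```

In this sketch the `feed` callback stands in for whatever mechanism delivers records to the search system; passing `some_list.append` is enough to try the pipeline on a workstation, which mirrors the point that the work happens on a server outside the appliance.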
Image source: http://homepages.ius.edu/USTEWART/super_glue.jpg
A major reason for our investment in connectors, though, has to do with improvements to Google Apps. Google recently announced its visualization tools (http://googleenterprise.blogspot.com/2009/03/charts-charts-charts.html), so it is now possible to send selected enterprise data into Google Apps and have access to real-time visualization of your enterprise data. This to me is groundbreaking. I think that it is a very cost-efficient way to create business intelligence applications in a Google interface.
Can you deliver custom connectors?
We can build custom connectors, but we tend to license our connectors from established software vendors. Connecting into enterprise systems is not new; it is just that until now, no one has packaged a high-end connector suite for the GSA. For lack of a better term, Adhere Solutions is more of an integrator than a software company. We use existing high-quality products whenever we can.
How does a connector differentiate you from other GSA specialists?
Adhere Solutions is unique in that everyone involved has many years of enterprise search experience. Our goal as a company is to introduce Google into higher end search procurements. While Google Search Appliance is easy to get up and running, it is not uncommon to need help with basic search tasks. What is easy to Google is not easy for everyone. There are many fine GSA specialists who can help with basic setups, but we see ourselves as unique in delivering Google for high end solutions.
How do people reach you?
Write me: erik at adheresolutions dot com or call. Our number is 800 799 0520.
Here’s the full text of the Adhere Solutions news release:
Adhere Solutions Expands Its All Access Connector Suite For the Google Search Appliance to Include Enterprise Content Management Systems
Businesses now can provide employees greater access to enterprise data through the Google Search Appliance’s popular interface
Chicago, IL — March 31, 2009 — Today, Adhere Solutions, a certified Google Enterprise Partner, announced that its All Access Connector for the Google Search Appliance includes instant connectivity to over 30 popular enterprise content management systems, including EMC Documentum and eRoom, IBM FileNet and Lotus Notes, Interwoven’s TeamSite and WorkSite, Microsoft SharePoint, Open Text, Oracle Stellent, Xerox Docushare and many more.
Adhere Solutions’ connector suite for the Google Search Appliance allows users to find information stored in disparate data sources and applications with Google’s user interface. This relieves users from having to separately search within each application and information repository. The Google Search Appliance combined with the All Access Connector empowers companies to efficiently unify information access and help users quickly find information to effectively perform their job.
“Users don’t particularly know or care about the subtleties of universal search vs. federated search - their mission is not to search, but rather to find. They are also not terribly interested in knowing WHY they cannot search for certain information,” said Dan Keldsen, noted Findability expert, Co-founder and Principal at Information Architected (www.InformationArchitected.com). “If factors in their findability frustrations have been because Google ‘couldn’t get there from here’ - the odds just significantly improved that the Google Search Appliance will be able to search across ALL of your information, rather than the ‘web native’ content Google is known for.”
Indexing connectors for enterprise content management systems are the newest addition to Adhere Solutions’ All Access Connector for the Google Search Appliance, which already includes federated search access to over 5,400 internal and external databases, repositories, subscription content sources, data feeds and business intelligence applications. With this addition Adhere Solutions delivers a suite of secure connectors to reduce the complexity and cost of searching across enterprise data repositories.
“Many organizations struggle with how to unlock their data when they have multiple content and document management solutions dispersed throughout their organization,” said Erik Arnold, Co-founder and President of Adhere Solutions. “We want every manager, IT or otherwise, to know that we enable the Google Search Appliance to provide enterprise search better, cheaper, and faster than other approaches.”
About Adhere Solutions
Adhere Solutions is a Google Enterprise Partner providing products and services that help businesses increase productivity through the accelerated adoption of Google and other technologies. Adhere’s experienced team of consultants helps customers leverage Google’s Enterprise Search products, Google Maps, and Google Apps to create business applications that improve access to information, communication and collaboration.
For more information on Adhere Solutions products or services visit the company’s Web site at www.adheresolutions.com or write info@adheresolutions.com.
###
If you would like more information on this topic, or to schedule an interview with Erik Arnold, please contact Amy DiNorscio at (312) 380-5772 or write to pr@adheresolutions.com
Stuart Schram IV, March 31, 2009
EntropySoft: Exclusive Interview with Nicolas Maquaire, CEO
March 25, 2009
A search engine or content processing system is deaf and dumb without a connector to a content source. Most text processing systems include these software connectors (sometimes called “filters” or “adaptors”) to process flat text such as the ASCII generated by a simple text editor. But plain text makes up a small part of the content stored on an organization’s file servers, workstations, and computers. In order to index content from a legacy AS/400 system running the Ironsides enterprise resource planning system, a specialized software connector is required. Writing these connectors is tricky. EntropySoft is a content integration company. The firm has a strong competency in creating software to perform a range of content manipulations; for example, content transformation of an XML file into a file type required by another business process or enterprise system. Mr. Maquaire spoke with Stephen E. Arnold, ArnoldIT.com on March 24, 2009, about EntropySoft’s software and services.
Nicolas Maquaire, the chief executive officer of EntropySoft, described his company this way:
EntropySoft is a connector factory. We have more than 30 read/write connectors for unstructured data, possibly the biggest portfolio on the market. Our connectors enable most of the features of popular content-centric applications such as Alfresco, IBM FileNet P8, Hummingbird DM, Interwoven TeamSite, IBM Lotus Quickplace, Microsoft SharePoint, etc. The extensive support of features and the size of the connector portfolio make this technology perfect OEM material for many software industries. On top of the read/write connectors, EntropySoft has two technological layers (Content ETL and Content Federation) that are also available as OEM components.
A number of the world’s leading search and content processing companies use EntropySoft’s connectors. Examples include Coveo, Exalead, and Image Integration Systems.
Mr. Maquaire, in an exclusive interview with ArnoldIT.com’s Search Wizards Speak series, said:
The market for content integration is complex. Building a single connector for a specific use case seems nonsensical to us. If you develop many connectors, interoperability then becomes reality. Thanks to its more than 30 (and growing!) connectors, EntropySoft is becoming a one-stop-shopping point for connectivity and interoperability. For the past four years, EntropySoft has acquired valuable knowledge on all popular content-centric systems. EntropySoft connectors have been market-tested for years. EntropySoft connectors are put to work daily in critical business conditions, and EntropySoft’s unique in-house testing system allows fast implementation of customer-driven connector improvements.
You can read the full text of the Maquaire interview on the ArnoldIT.com Web site here. The interview is number 37 in this series. The interviews provide one of the most useful bodies of information about enterprise search and content processing available at this time. The Search Wizards Speak is available as a service to organizations and information professionals worldwide. Knowledge about search and content processing increases the payoff from an investment in information retrieval.
Marc Krellenstein Interview: Inside Lucid Imagination
March 17, 2009
Open source search is gaining more and more attention. Marc Krellenstein, one of the founders of Lucid Imagination, a search and services firm, talked about the company’s technology with Stephen E. Arnold, ArnoldIT.com. Mr. Krellenstein was the innovator behind Northern Light’s search technology, and he served as the chief technical officer for Reed Elsevier, where he was responsible for search.
In an exclusive interview, Mr. Krellenstein said:
I started Lucid in August, 2007 together with three key Lucene/Solr core developers – Erik Hatcher, Grant Ingersoll and Yonik Seeley – and with the advice and support of Doug Cutting, the creator of Lucene, because I thought Lucene/Solr was the best search technology I’d seen. However, it lacked a real company that could provide the commercial-grade support and other services needed to realize its potential to be the most used search software (which is what you’d expect of software that is both the best core technology and free). I also wanted to continue to innovate in search, and believed it is easier and more productive to do so if you start with a high quality, open source engine and a large, active community of developers.
Mr. Krellenstein’s technical team gives the company solid open source DNA. With financial pressures increasing and many organizations expressing dissatisfaction with mainstream search solutions, Lucid Imagination may be poised to enjoy rapid growth.
Mr. Krellenstein added:
I think most search companies that fail do so because they don’t offer decisively better and affordable software than the competition and/or can’t provide high quality support and other services. We aim to provide both and believe we are already working with the best and most affordable software. Our revenue comes not only from services such as training but also from support contracts and from value-add software that makes deploying Lucene/Solr applications easier and makes the applications better.
You can read the full text of the interview on the ArnoldIT.com Web site here. Search Wizards Speak is a collection of 36 candid interviews with movers and shakers in search, content processing, and business intelligence. Instead of reading what consultants say about a company’s technology, read what the people who developed the search and content processing systems say about their systems. Interviews may be reprinted and distributed without charge. Attribution and a back link to ArnoldIT.com and the company whose executive is featured in the interview are required. Stephen E. Arnold provides these interviews as a service to those interested in information retrieval.
Stephen Arnold, March 17, 2009
EveryZing: Exclusive Interview with Tom Wilde, CEO
March 16, 2009
Tom Wilde, CEO of EveryZing, will be one of the speakers at the April 2009 Boston Search Engine Meeting. To meet innovators like Mr. Wilde, click here and reserve your space. Unlike “boat show” conferences that thrive on walk in gawkers, the Boston Search Engine Meeting is content muscle. Click here to reserve your spot.
EveryZing is a “universal search and video SEO (vSEO)” firm, and it recently launched MediaCloud, the Internet’s first cloud-based computing service for generating and managing metadata. Considered the “currency” of multimedia content, metadata includes the speech transcripts, time-stamped tags, categories/topics, named entities, geo-location and tagged thumbnails that comprise the backbone of the interactive web.
With MediaCloud, companies across the Web can post live or archived feeds of video, audio, image and text content to the cloud-based service and receive back a rich set of metadata. Prior to MediaCloud and the other solutions in EveryZing’s product suite — including ezSEARCH, ezSEO, MetaPlayer and RAMP — discovery and publishing of multimedia content had been restricted to the indexing of just titles and tags. Delivered in a software-as-a-service package, MediaCloud requires no software to purchase, install or maintain. Furthermore, customers only pay for the processing they need, while obtaining access to a service that has virtually unlimited scalability to handle even large content collections in near real-time. The company’s core intellectual property and capabilities include speech-to-text technology and natural language processing.
Harry Collier (Infonortics Ltd) and I spoke with Mr. Wilde on March 12, 2009. The full text of our interview with him appears below.
Will you describe briefly your company and its search / content processing technology?
EveryZing originally spun out of BBN Technologies in Cambridge, MA. BBN was truly one of the godfathers of the Internet, and developed the email @ protocol among other breakthroughs. Over the last 20 years, the US Government has spent approximately $100MM with BBN on speech-to-text and natural language processing technologies. These technologies were spun out in 2006 and EveryZing was formed. EveryZing has developed a unique Media Merchandising Engine which is able to connect audio and video content across the web with the search economy. By generating high quality metadata from audio and video clips, processing it with our NLP technology to automatically “tag” the content, and pushing it through our turnkey publishing system, we are able to make this content discoverable across the major search engines.
What are the three major challenges you see in search / content processing in 2009?
1) Indexing and discovery of audio and video content in search; 2) Deriving structured data from unstructured content; 3) Creating better user experiences for search & navigation.
What is your approach to problem solving in search and content processing?
Well, yes, meaning that all three are critical. However, the key is to start with the user expectation. Users expect to be able to find all relevant content for a given key term from a single search box. This is generally known as “universal search”. This requires then that all content formats can be easily indexed by the search engines, be they web search engines like Google or Yahoo, as well as site search engines. Further, users want to be able to alternately search and browse content at will. These user expectations drive how we have developed and deployed our products. First, we have the best audio and video content processing in the world. This enables us to richly markup these files and make them far more searchable. Second, our ability to auto-tag the content makes it eminently more browsable. Third, developing a video search result page that behaves just like a text result page (i.e. keyword in context, sortability, relevance tuning) means users can more easily navigate large video results. Finally, plumbing our meta data through the video player means users can search within videos and jump-to the precise points in these videos that are relevant to their interests. Combining all of the efforts together means we can deliver a great user experience, which in turn means more engagement and consumption for our publishing partners.
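The "search within videos and jump to the precise point" idea in this answer can be illustrated with a small Python sketch. It assumes a speech transcript is available as a list of words with start times; the `TimedWord` and `Hit` types and the `search_transcript` function are hypothetical and say nothing about how EveryZing's own system is built.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimedWord:
    """One transcript word with the time it is spoken (hypothetical shape)."""
    word: str
    start_seconds: float


@dataclass
class Hit:
    """A match: where to jump in the video, plus the keyword in context."""
    start_seconds: float
    context: str


def search_transcript(
    transcript: List[TimedWord], term: str, window: int = 2
) -> List[Hit]:
    """Return one hit per occurrence of `term`, matched case-insensitively."""
    needle = term.lower()
    hits: List[Hit] = []
    for i, timed in enumerate(transcript):
        # Ignore trailing punctuation so "clips." still matches "clips".
        if timed.word.lower().strip(".,!?") == needle:
            lo = max(0, i - window)
            hi = min(len(transcript), i + window + 1)
            context = " ".join(w.word for w in transcript[lo:hi])
            hits.append(Hit(timed.start_seconds, context))
    return hits
```

Each `Hit` carries a timestamp a player could seek to and a short keyword-in-context snippet for a result page, which is the pairing the answer describes for making video results behave like text results.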
Search / content processing systems have been integrated into such diverse functions as business intelligence and customer support. Do you see search / content processing becoming increasingly integrated into enterprise applications?
Yes, absolutely. Enterprises are facing a growing pile of structured and unstructured content, as well as an explosion in multimedia content with the advent of telepresence, Webex, videoconferencing, distance learning etc. At the same time, they face increasing requirements around discovery and compliance that requires them to be able to index all of this content. Search is rapidly gaining the same stature as databases and document management systems as core platforms.
Microsoft acquired Fast Search & Transfer. SAS acquired Teragram. Autonomy acquired Interwoven and Zantaz. In your opinion, will this consolidation create opportunities or shut doors?
Major companies are increasingly looking to vendors with deep pockets and bench strength around support and R&D. This has driven some rapid market consolidation. However, these firms are unlikely to be the innovators, and will continue to make acquisitions to broaden their offerings. There is also a requirement to more deeply integrate search into the broader enterprise IT footprint, and this is also driving acquisitions.
Multi core processors provide significant performance boosts. But search / content processing often faces bottlenecks and latency in indexing and query processing. What’s your view on the performance of your system or systems with which you are familiar?
Yes, CPU power has directly benefited search applications. In the case of EveryZing, our cloud architecture takes advantage of quad-core computing so we can deliver triple threaded processing on each box. This enables us to create multiple quality of service tiers so we can optimize our system for latency or throughput, and do it on a customer by customer basis. This wouldn’t be possible without advances in computing power.
Graphical interfaces and portals (now called composite applications) are making a comeback. Semantic technology can make point and click interfaces more useful. What other uses of semantic technology do you see gaining significance in 2009?
Semantic analysis is core to our offering. Every clip we process is run through our NLP platform, which automatically extracts tags and key concepts. One of the great struggles publishers face today is having the resources to adequately tag and title all of their video assets. They are certainly aware of the importance of doing this, but are seeking more scalable approaches. Our system can use both an unsupervised and a supervised approach to tagging content for customers.
Where can I find more information about your products, services, and research?
Our Web site is www.everyzing.com.


