Clearwell: Another eDiscovery Platform
June 9, 2008
The giant Thomson Reuters owns an outfit called Thomson Litigation Consulting. Thomson Litigation Consulting, in turn, recommends systems to its law firm customers. The consulting unit of Thomson Reuters earned some praise for its recommendation to DLA Piper, a firm that had a need for fast-cycle eDiscovery. You can read the effusive write up as reported on Law.com here:
Clearwell processed all 570,000 e-mail messages and attachments within our deadline of five days, providing enough time for analysis, review and production of the data. Clearwell’s incremental processing capabilities enabled TLC to start the analysis process for initial custodians within 25 minutes. The platform’s communication flow analysis enabled the legal team to quickly find all e-mails sent to specific individuals and to specific organizations (domains) within a confined date range. Clearwell’s organizational discovery automatically identified all variations of a custodian’s e-mail address, ensuring that no data for a custodian was missed.
A happy quack to Thomson Litigation Consulting and to the happy, happy client. With as many as two-thirds of the users of search and content processing systems dissatisfied, it is gratifying to know that there are success stories. The question is, “What’s a Clearwell?” The purpose of this short article is to provide some basic information about this system and make several observations about the niche strategy in search and content processing.
This is a screen shot of the Clearwell interface to see a thread or chain of related emails. The attorney can use the system to move forward and backward in the email chain. A new query can be launched. A point-and-click interface allows the attorney to filter the processed content by project, name, and other filters. The interface automatically saves an attorney’s query.
What’s a Clearwell?
The metaphor implied by the name of the company is to see into a deep, dark pit. The idea is that technology can illuminate what’s hidden.
The company is backed by Sequoia Capital, Redpoint Ventures, DAG Ventures, and Northgate Capital. In short, the firm has “smart money”. “Smart money” opens doors, presumably to secretive outfits like the Thomson Corporation. Clearwell conducted a Webinar with Google, which illustrates the company’s ability to hook up with the heavy hitters in online to educate companies about eDiscovery.
As one of the investors describes the company, Clearwell
delivers a new level of analysis of information contained in corporate document and email systems. As the first e-discovery 2.0 solution, Clearwell is poised to capitalize on this emerging market, which we expect to become a multi-billion dollar industry within the next few years.
In a nutshell, the company bundles content processing, analytics, and work flow into a product that is tailored to the needs of eDiscovery. “eDiscovery” is the term applied to figuring out what’s in the gigabytes of digital email, Word files, and depositions generated in the course of a legal matter. eDiscovery means that a researcher tries to learn what is in the discovered information so the lawyers know what they don’t know.
The company, unlike a generalized enterprise search platform, focuses its technology on specific markets unified by each market’s need to perform eDiscovery. These markets are:
- Corporate security. Think email analysis.
- Law firms. Grinding through information obtained in the discovery process.
- Service providers. Data centers, ISPs, telcos processing content for compliance.
- Government. Generally I associate the government with surveillance and intelligence operations.
Technology
There are more than 300 companies in the text processing business. I track about 12 firms focusing on the eDiscovery angle. I published a short list of some vendors as a general reference to readers of this Web log here.
The key differentiator for Clearwell is that it is a platform; that is, the customer does not have to assemble a random collection of Lego blocks into a system. Clearwell arrives, installs its system, and provides any technical assistance. For law firms in a time crunch, the Clearwell appliance is packaged as a solution that is:
- Transparent, which means another attorney can figure out what produced a particular result
- Easy to use, which means attorneys don’t have to be technical wizards
- Able to handle different types of documents and languages, including misspellings
- Capable of not missing a key document. Missing one is a bad thing when the opposing attorney did not miss it.
How does this work?
Clearwell ships an appliance that can be up and running in less than a half hour, maybe longer if the law firm doesn’t have a full-time system administrator. A graphical administration utility allows the collection or corpus to be identified to the system. Clearwell then processes the content and makes it available to authorized users.
The appliance implements the Electronic Discovery Reference Model which is a methodology supported by about 100 firms. The idea is that EDRM standardizes the eDiscovery process so an opposing attorney has a shot at figuring out where “something” comes from.
As part of the content processing, Clearwell generates entities, metadata, and indexes. One key feature of the system is that Clearwell automatically links emails into threads. An attorney can locate an email of interest and then follow the Clearwell thread through the email processed by the system. Before Clearwell, a human had to make notes about related emails. Other systems provide similar functionality. Brainware, for example, offers similar features, and it is possible to use Recommind and Stratify in this way. The idea is that Clearwell is an “eDiscovery toaster”. Lawyers understand toasters; lawyers don’t understand complex search and content processing systems.
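The threading idea can be illustrated with a minimal Python sketch. This is not Clearwell's method, which is not documented here. It assumes each message is a dictionary with hypothetical `message_id`, `in_reply_to`, and `sent` fields, and it groups messages by following reply links back to the first email in the chain.

```python
from collections import defaultdict


def build_threads(messages):
    """Group messages into threads by following in_reply_to links to a root.

    Each message is a dict with hypothetical keys: message_id, in_reply_to
    (None for a new conversation), and sent (a sortable timestamp string).
    """
    by_id = {m["message_id"]: m for m in messages}

    def root_of(msg):
        seen = set()  # guards against reply loops in damaged data
        while msg.get("in_reply_to") in by_id and msg["message_id"] not in seen:
            seen.add(msg["message_id"])
            msg = by_id[msg["in_reply_to"]]
        return msg["message_id"]

    threads = defaultdict(list)
    # Sorting by send time keeps each thread in reading order.
    for m in sorted(messages, key=lambda m: m["sent"]):
        threads[root_of(m)].append(m["message_id"])
    return dict(threads)


sample = [
    {"message_id": "a", "in_reply_to": None, "sent": "2008-06-01"},
    {"message_id": "b", "in_reply_to": "a", "sent": "2008-06-02"},
    {"message_id": "c", "in_reply_to": "b", "sent": "2008-06-03"},
    {"message_id": "d", "in_reply_to": None, "sent": "2008-06-04"},
]
threads = build_threads(sample)
```

A message whose parent is not in the collection becomes the root of its own thread, which is a simplification. A production system would presumably also use subject lines and quoted text to repair broken chains.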
The technical components of the Clearwell system include:
- Deduplication
- Support for multiple languages
- Entity extraction
- On-the-fly classification
- Canned analytics to count number of references to entities
- Basic and advanced search.
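Two of these components, deduplication and canned entity counts, are easy to sketch in Python. The code below is an illustration under stated assumptions, not a description of how the appliance works: documents are dictionaries with a hypothetical `body` field, duplicates are detected by hashing whitespace- and case-normalized text, and entity references are counted with naive substring matching.

```python
import hashlib
from collections import Counter


def deduplicate(documents):
    """Keep the first document seen for each distinct normalized body."""
    seen, unique = set(), []
    for doc in documents:
        # Collapse whitespace and case so trivial variants hash the same.
        normalized = " ".join(doc["body"].lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def count_entity_references(documents, entities):
    """Count case-insensitive mentions of each entity across documents."""
    counts = Counter()
    for doc in documents:
        text = doc["body"].lower()
        for entity in entities:
            counts[entity] += text.count(entity.lower())
    return counts


docs = [
    {"id": 1, "body": "Acme Corp signed the deal."},
    {"id": 2, "body": "acme corp   signed the DEAL."},
    {"id": 3, "body": "Call Acme Corp about Acme Corp invoices."},
]
unique_docs = deduplicate(docs)
entity_counts = count_entity_references(unique_docs, ["Acme Corp", "deal"])
```

Real entity extraction has to find the names in the first place. Here the list of entities is supplied by hand to keep the example short.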
The system can be configured to allow an authorized user to add a tag or a flag so a particular document can be reviewed by another person. This function is generally described as a “social search” operation. It is little more than an interface to permit user-assigned index terms.
One of the most common requests made of enterprise search systems is a case function; that is, the ability to keep track of information related to a particular matter. Case operations are quite complex, and the major search platforms make it possible for the licensee to code these functions themselves. In effect, mainstream search systems don’t do case management operations out of the box.
Clearwell does. My review of the system identified this function as one of the most useful operations baked into the appliance. Case management means keeping track of who looked at what and when. In addition, the case management system bundles information about content and operations in one tidy package.
The Clearwell case function includes these features:
- Analytics which can be used for time calculations, verifying that a person who was supposed to review a document did in fact open the document
- Ability to handle multiple legal matters
- Function to permit tags and categories to be set for different legal matters
- User management tools
- Audit trails.
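To make the audit trail idea concrete, here is a minimal Python sketch of a case log that records who did what to which document, per legal matter, and reports which assigned reviewers never opened a document. The class names, field names, and the `"open"` action label are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class AuditEvent:
    matter: str
    user: str
    document_id: str
    action: str
    at: datetime


@dataclass
class CaseLog:
    events: list = field(default_factory=list)

    def record(self, matter, user, document_id, action, at=None):
        """Append one audit event, timestamped now unless a time is given."""
        self.events.append(
            AuditEvent(matter, user, document_id, action, at or datetime.now())
        )

    def has_opened(self, matter, user, document_id):
        """True if the user opened the document within the given matter."""
        return any(
            e.matter == matter
            and e.user == user
            and e.document_id == document_id
            and e.action == "open"
            for e in self.events
        )

    def unreviewed(self, matter, assignments):
        """Return the (user, document_id) pairs with no recorded open event."""
        return [
            (user, doc)
            for user, doc in assignments
            if not self.has_opened(matter, user, doc)
        ]


log = CaseLog()
log.record("matter-1", "alice", "doc-7", "open")
log.record("matter-2", "bob", "doc-7", "open")
pending = log.unreviewed("matter-1", [("alice", "doc-7"), ("bob", "doc-7")])
```

Keeping the matter on every event is what lets one log serve multiple legal matters: Bob opened the document, but in a different matter, so he still shows up as pending in `matter-1`.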
Attempting to implement these features with an enterprise search platform is virtually a six-month job, not one that can be accomplished in a day or less.
Observations
Clearwell is an example of how a start up can look at a crowded field like enterprise search and content processing, identify points of pain, and build a business providing a product that makes the pain bearable. Clearwell’s technology, like most search vendors’, is not unique; that is, other companies provide similar functions. What sets the company apart is the packaging of the technology for the target market. Clearwell’s technical acumen is evident in the case management functions and the useful exposure of threaded emails.
Other points that impressed me are:
- An appliance. I like appliances because I don’t have to build anything. Search is such a basic need in organizations, why should I build a search system? I don’t build a toaster.
- Bundled software. Clearwell–unlike Exegy, Google, and Thunderstone–delivers a usable application out of the box. Index Engines comes close with its search-back ups solution. But Clearwell is the leader in the appliance-that-works niche in search at this time.
- Smart money. When investors with a track record bet on a company, I think it’s worth paying attention.
I don’t have a confirmation on the cost of the appliance. My hunch is that it will be competitive with one-year fees from Autonomy, Endeca, and Fast Search (Microsoft) which is to say a six-figure number. If you have solid prices for Clearwell, use the comments section of the Web log to share that information. Please, check out the company at ClearwellSystems.com.
Stephen Arnold, June 9, 2008
Comments
4 Responses to “Clearwell: Another eDiscovery Platform”
FYI – the URL is not clearwell.com
This is a great review – thanks! I agree 100% – the key with these kinds of products is solving a real problem in a usable way. It’s very difficult to make a successful infrastructure product, *unless* you can clearly target a specific problem. In which case you’re no longer selling infrastructure, but rather an application of the infrastructure.
Yikes! Sorry I left off the last part of the url. I fixed it in the write up. Here’s the url for those who want to cut and paste it into their browser: http://www.clearwellsystems.com.
No need to praise me. I just screwed up.
Stephen Arnold, June 10, 2008
[…] This summer I have been asked about email analysis on two different occasions. In order to respond to these requests, I had to grind through my archive of email-related information. I wrote about Clearwell Systems and its approach earlier this year. You can read this essay here. […]
[…] alternative to Clearwell Systems. I have described Clearwell’s approach to content processing here. I am working on a more thorough analysis of Xobni now. My hypothesis is that Xobni is designed for […]