ISYS Search Software: A Case Study about Patent Analysis
March 18, 2008
One of the questions I’m asked is, “What tools do you use to analyze Google’s patent applications, patents, and engineering documents?” The answer is that I have a workhorse tool and a number of proprietary systems and methods. On March 7, 2008, the AIIM conference organizers gave me an opportunity to explain my techniques to about 40 attendees.
Before beginning my talk, I polled the audience for their interest in text analysis. Most of those in the audience were engaged in or responsible for eDiscovery. This buzzword means “taking a collection of documents and using software to [a] determine what the contents of the document collection are and [b] identify important people, places, things, and events in documents.” eDiscovery needs more than keyword, Boolean, and free text search. The person engaged in eDiscovery does not know what’s in the collection, so crafting a useful keyword query is difficult. In the days before rich text processing tools, eDiscovery meant sitting down with a three-ring binder of hard copies of emails, depositions, and other printed material. The lucky researcher began reading, flagging important segments with paper clips or colored bits of paper. The work was time consuming, tedious, and iterative. Humans (specifically, this human) had to cycle through printed materials in an effort to “connect the dots” and identify the substantive information.
You can see a version of the PowerPoint deck used on March 6, 2008, here. The key points in the presentation were:
- Convert the source documents into a machine-manipulable form. This is a content transformation procedure. We use some commercial products, but custom scripts handle most of our transformation work. The reason is that patent applications and patents are complicated and very inconsistent. Off-the-shelf products, whether open source or from third-party vendors, are not easily customized.
- Collections are processed using ISYS Search Software. This system, which we have been using for more than five years, generates an index of the documents in the collection, identifies entities (names of people, for example), and provides a number of useful access points by category. We typically copy claims or sections of the abstract and run additional queries in order to pinpoint similar inventions. In the case of Google’s disclosures, this iterative process is particularly important. Google routinely incorporates other patent applications by reference in its own patent applications. Chasing down these references is difficult without ISYS’s functionality. Keep in mind that other vendors’ systems may work as well, but I have standardized on ISYS in order to minimize the surprises that certain text processing systems spring on me. ISYS has proven to be reliable, fast, and without unexpected “gotchas”. You can learn more about this system here.
- Specific documents of interest are then reviewed by a human who creates an electronic “note card” attached to the digital representation of the source document and to its Portable Document Format instance if available. Having one-click access to patent applications and patents in PDF form is essential. The drawings, diagrams, figures, and equations in the text of a document must be consulted during the human analysis process.
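To make the reference-chasing step concrete, here is a minimal Python sketch. It is not ISYS functionality and not the custom scripts mentioned above; it only illustrates the idea of normalizing extracted text and then flagging application numbers that appear near the word "incorporated". The number patterns, the function names `normalize` and `cited_applications`, the 60-character window, and the sample text are all assumptions made for illustration. Real filings are far less regular.

```python
import re

# Assumed formats only: serial numbers like "11/123,456" and
# publication numbers like "2007/0123456". Real documents vary.
APP_NO = re.compile(r"\b(\d{2}/\d{3},\d{3}|\d{4}/\d{7})\b")


def normalize(raw: str) -> str:
    """Rejoin words hyphenated across line breaks, then collapse whitespace.

    Caveat: this also joins genuine hyphenated compounds split at a line end.
    """
    text = re.sub(r"-\s*\n\s*", "", raw)
    return re.sub(r"\s+", " ", text).strip()


def cited_applications(text: str, window: int = 60) -> list[str]:
    """Return numbers that occur within `window` characters of 'incorporated'."""
    found = []
    for match in APP_NO.finditer(text):
        start = max(0, match.start() - window)
        context = text[start:match.end() + window].lower()
        if "incorporated" in context and match.group(1) not in found:
            found.append(match.group(1))
    return found


# Hypothetical sample text, not taken from any real filing.
sample = """This application is related to U.S. patent appli-
cation Ser. No. 11/123,456, which is incorporated herein by reference.
The remaining paragraphs describe the ranking method in detail and do
not cite other filings. Background material appears in publication
2007/0123456."""

print(cited_applications(normalize(sample)))  # prints ['11/123,456']
```

Each number returned this way would become a new query against the index, which is the iterative loop described in the second step. The fixed character window is a crude stand-in for real sentence or clause detection and will miss references phrased differently.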
The PowerPoint deck is here. What software do you use to analyze patent applications, patents, and engineering documents? Let me know.
Stephen Arnold, March 18, 2008
Comments
3 Responses to “ISYS Search Software: A Case Study about Patent Analysis”
In your piece on Patent Analysis you mentioned that a PowerPoint was available, but I saw no link to it. I would like to take a look if it is available.
Thank You.
The link appears on the Arnoldit.com Web site. It’s the first or second news item.
Stephen Arnold, March 27, 2008 at 4:44 pm Eastern
Hi,
Just came across your interesting post. I have one question: are there any free/open source tools for patent analysis?
Regards,
Ram