SirsiDynix Search Plus Discovery for Libraries
May 24, 2009
Brainware landed a deal to provide search and discovery to SirsiDynix. After a bit of poking around, I learned that SirsiDynix wanted to move beyond key word search and provide users of its library systems with discovery functions. “Discovery”, as used in this sense, refers to giving a person looking for information easy-to-use methods to look for related information and suggested information also germane to the user’s query. Endeca hooked up with Ebsco to provide “guided navigation” to Ebsco customers. Most online public access catalogs and library-centric search systems match the users’ query terms or force the user to search by entering an author’s name. Change, at long last, seems to be coming to the library for search of an institution’s textual information. I wrote about some of the Brainware system’s capabilities in my 2008 study “Beyond Search” for the Gilbane Group here. I also did a short write up about Brainware in this Web log in early 2008 here.
A reader alerted me to an announcement here that SirsiDynix will roll out an enhanced enterprise search and discovery system to over 30 libraries. You can read that announcement here. The system includes such features as:
- Trigram analysis, or “fuzzy logic” which evaluates each trigram in a word to allow for typos, diacritics and more: a first in the library search and discovery market
- “Did you mean” suggestions which are based on terms in the catalog (rather than a generic third-party dictionary)
- Dynamic search suggestions
- Delivery of saved searches through an RSS web feed
- Email and print options for search results
- Built-in “Library Favorites”
- The capability for libraries to define their own “Favorites”, profiles, languages and filters.
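The trigram matching and catalog-based “Did you mean” features in the list above can be illustrated with a small sketch. This is not Brainware’s algorithm, which the announcement does not describe. It assumes a common textbook approach: fold diacritics, split each word into overlapping three-character pieces, and score two words by the Jaccard overlap of their trigram sets. The function names, the sample catalog terms, and the 0.3 threshold are all hypothetical.

```python
import unicodedata


def fold(word):
    """Lowercase a word and strip diacritics (e.g. 'Brontë' becomes 'bronte')."""
    decomposed = unicodedata.normalize("NFKD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def trigrams(word):
    """Return the set of character trigrams, padded so word edges count too."""
    padded = f"  {fold(word)} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two trigram sets, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def did_you_mean(query, catalog_terms, threshold=0.3):
    """Suggest the closest catalog term, or None if nothing is close enough.

    Suggestions come only from terms in the catalog, not a generic dictionary.
    The threshold is an illustrative assumption, not a tuned value.
    """
    best = max(catalog_terms, key=lambda term: similarity(query, term), default=None)
    if best is not None and similarity(query, best) >= threshold:
        return best
    return None


# Hypothetical catalog vocabulary, for illustration only.
catalog_terms = ["shakespeare", "hemingway", "steinbeck", "dickens"]
print(did_you_mean("shakespear", catalog_terms))
```

Because a single typo disturbs only a few trigrams, the misspelled query still shares most of its pieces with the intended catalog term, which is why this style of matching tolerates typos and accent differences.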
You can test the Brainware power “enterprise” service at the Wells County Public Library here.
The library market has been under severe price competition. This information sector is coming under more and more pressure from Google. The world’s largest search provider has been slowly expanding its services, including the controversial Google Books program. So far, specialized vendors of library information systems have been able to maintain their grip on today’s slippery, one lane economic highway. The impact of Google on this market will be interesting to observe.
Stephen Arnold, May 24, 2009
Google and Libraries
May 1, 2009
The USPTO must be clearing backlogs. A flurry of Google patent documents became available. Several were uninteresting (floating data centers, query expansion), but one struck me as having some disruptive potential. I refer to Library Citation Integration, US7526475. You can get the document from the USPTO at http://www.uspto.gov. The abstract stated:
An online search system generates an index of documents using index information received from a library. Some documents have restricted access; some documents may not be available online. The search system provides links to documents in the library as well as other sites based on a search, and may include link resolvers received from the library. The search system provides access links to the link resolvers if an identifier, such as a user identification or IP address, matches an affiliation list from the library.
Why? Think for a moment about the commercial database vendors, the online public access catalog vendors, and the companies building content for institutional use. I thought the pointing function to items in the OCLC system was interesting. This invention gives the Google an opportunity to stomp, should it choose to do so, in some other vineyards. Who will be squashed into fine wine? I don’t drink, so I might not be affected. Those in the library ecosystem might have a different view.
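The affiliation check described in the patent abstract can be sketched in a few lines. This is only an illustration of the idea in the abstract, not Google’s implementation. The affiliation list layout, the resolver address, the user ID, and the function names are all assumptions made for the example.

```python
import ipaddress

# Hypothetical affiliation list supplied by a library: IP ranges plus user IDs.
AFFILIATION = {
    "networks": [ipaddress.ip_network("192.0.2.0/24")],
    "user_ids": {"patron-1001"},
}

# Placeholder link resolver address; a real one would come from the library.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def is_affiliated(ip=None, user_id=None, affiliation=AFFILIATION):
    """Return True if the user ID or the IP address matches the affiliation list."""
    if user_id is not None and user_id in affiliation["user_ids"]:
        return True
    if ip is not None:
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in affiliation["networks"])
    return False


def links_for(doc, ip=None, user_id=None):
    """Always return the public link; add the resolver link only for affiliated users."""
    links = [doc["public_url"]]
    if is_affiliated(ip, user_id):
        links.append(f"{RESOLVER_BASE}?id={doc['id']}")
    return links


doc = {"id": "doc-42", "public_url": "https://catalog.example.org/doc-42"}
print(links_for(doc, ip="192.0.2.17"))   # on the library network: two links
print(links_for(doc, ip="203.0.113.5"))  # outside the library network: one link
```

The point of the sketch is that the same search result can carry different links depending on who is asking, which is the pointing function that matters to vendors in the library ecosystem.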
Stephen Arnold, May 1, 2009
Amsterdam Breathes New Life into Old Information Institution
April 19, 2009
A happy quack to the reader who sent me the link to Andrew Keen’s “Digital Dutch Masterpieces” here. The article points out that libraries can be both old and new media. He wrote:
at the Amsterdam public library. Instead of the dustiness and crustiness of the typical 20th century library, visitors to Amsterdam’s central public library will find not only books, but a restaurant as well as a children’s theatre and a public radio and television studio. The library, which is open every day from 10.00 am to 10.00 pm, also holds a series of cultural festivals – such as the upcoming week of poetry – which it then broadcasts on the Internet. Amsterdam library’s website epitomizes its innovative approach to the 21st curation of knowledge. The website features its own customized search engine, the “aquabrowser”, which has integrated the library’s books, CDs and DVDs as well as a rich archive of Amsterdam’s history and culture. Equally innovatively, the website provides those who use it within the walls of the library itself open access to all its digital content.
I did not resonate with the assertion that the library has a “return on investment”. That phrase has a specific meaning in financial circles. I think that the Amsterdam effort returns significant social value. One hopes other libraries absorb the lessons of this case.
Stephen Arnold, April 19, 2009
Potential Trouble for LexisNexis and Westlaw
March 2, 2009
Most online surfers don’t click to Reed Elsevier’s LexisNexis or Thomson Reuters Westlaw. The reason? These commercial services charge money–quite a lot of money–to access legal documents. Executives at both firms can deliver compelling elevator pitches about the added value each company brings to legal documents. In the pre-crash era, legal indexing was a manual process. Then the cost crunch arrived, so both outfits are trying to slap software against the thorny problem of making sense of court documents, rulings, and assorted effluvia of America’s legal factories. I may write about how these two quasi US outfits have monopolized for fee legal information about American law for lawyers and government agencies. Both Reed and Thomson then turn around and sell access to these documents to the agencies that created them in the first place. I wonder if the good senator is aware of this aspect of commercial online services’ business practices?
What’s the trouble? I bet you thought I was going to mention Google. Wrong. Google is on the edge of indexing legal information in a more comprehensive way. But the right now trouble is Senator Joe Lieberman. Wired reported that the good senator wondered why public documents are not available without a charge. You can read the story “Lieberman Asks, Why Are Court Docs Still Behind Paid Firewall?” here. Senator Lieberman’s question may lead to a hearing. The process could, in my opinion, start a chain reaction that further erodes the revenue Reed Elsevier and Thomson Reuters derive from public documents. Somewhere in the chain, the Google will beef up the legal content in its Uncle Sam service here.
At their core, Reed Elsevier and Thomson Reuters are traditional publishing and information companies. As such, their business model is fragile. Within the present financial pressure cooker, the Lieberman question could blow the lid off these two organizations’ for fee legal business. If government agencies shift to a service provided by Google, Microsoft, or Yahoo, I think these two dead tree outfits will crash to the forest floor.
What is the likelihood of this downside scenario? I would put it at better than 60 percent. Have another view? Share it, please. Set the addled goose straight.
Stephen Arnold, March 2, 2009
Another British Library Fear
January 28, 2009
Nick Farrell’s “British Library Fears Loss of History” reminded me that libraries are struggling for relevance in a Google-centric world. You can find his Register story here. For me the most interesting comment was:
The British Library has established a department dedicated to the collection of all these digital materials which are stored on your computer in the same way that it stores books, newspapers, documents, maps, personal letters.
I find categorical affirmatives quite amusing. The UK is collecting email and mobile data. Now the British Library wants “all” of a couple of types of digital information. Right now, the only outfit in a position to capture “all” information is Google, not a country, a company.
Libraries find themselves asked to provide shelter, job hunting, and coffee shop duties. One library expressed an interest in mobile furniture and off site book storage. The idea was that users of the library did not need some books right away.
The fear is well founded. Google will allay that fear in my opinion.
Stephen Arnold, January 28, 2009
New Google Study Announced
January 21, 2009
In July 2007, I vowed, “No more Google studies.” I was tired. Now I am just about finished with my third analysis of Google’s technology and business strategy. The two are intertwined. My publisher (Harry Collier, Infonortics Ltd.) has posted some preliminary information here about the forthcoming monograph, Google: The Digital Gutenberg. If you are curious how a Web search engine can be a digital Gutenberg, you will find this analysis of Google’s newest information technology useful. None of the information in this monograph has appeared in the more than 1,200 posts on this Web log, in my two previous Google studies, nor in my more than 200 publicly available articles, columns, and talks.
In short, the monograph will contain new information.
If you are involved in traditional media as a distributor, producer, content creator, aggregator, reseller, indexer, or user–you will find the monograph useful. You may get a business idea or two. If you are the nervous type, the monograph will give some ideas on which to chew. This study represents more than one year of research and analysis. I don’t pay much attention to the received wisdom about Google. I do focus almost exclusively on the open source information about Google’s technology using journal articles, presentations, and patent documents. The result is a look at Google that is quite different from the Google is an advertising agency approach that continues to dominate discourse. Even the recent chatter about Google’s semantic technology is old hat if you read my previous Google monographs. In short, I think this third study provides a solid look at what Google will be unveiling in the period between mid 2009 and the end of 2010. Here are the links to my two earlier studies.
- The Google Legacy. Describes how Google’s search system became an application platform. You know this today, but my analysis appeared in early 2005.
- Google Version 2.0. Explores Google’s semantic technology and the company’s innovations that greased the skids for applications, enterprise solutions, and disintermediation of commercial database publishers. A recent podcast broke the old news just a few days ago. Suffice it to say that most pundits were unaware of the scope and scale of Google’s semantic innovations. Cluelessness is reassuring, just not helpful when trying to assess a competitive threat in my opinion.
I don’t have the energy to think about a fourth Google study, but this trilogy does provide a reasonably comprehensive view of Google’s technical infrastructure. I know from feedback from Googlers that the information about some of Google’s advanced technology is not widely known among Google’s rank and file employees. Google’s top wizards know, but these folks are generally not too descriptive about Google’s competitive strengths. Most pundits are happy to get a Google mouse pad or maybe a Google baseball hat. Not me. I track the nitty gritty and look past the glow of the lava lamps. I don’t even like Odwalla strawberry banana juice.
Stephen Arnold, January 21, 2009
Google: Betting on Demographics
January 21, 2009
A reader groused about my poking fun at the British Library research reports. You can read these Swiftian essays here and here. Libraries are in a tough spot. With the financial crisis expanding, libraries are now the go to place to get warm and look for a job. Most libraries depend on a funding authority for money. As those authorities find themselves short of cash, libraries find themselves fighting for enough dough to keep staff and pay the heating and electricity bills. Book and journal acquisitions are lower on the list. Therefore, libraries have to justify the monetary needs. The British Library and the other national libraries are leading a charge for the relevance and importance of buildings stuffed with people looking for work. Oh, yes, these libraries want to collect dead tree outputs of publishers, pictures so these can be placed on Yahoo’s Flickr service, and electronic information so a user can access these data. The problem with this picture is that the Google has become the global library. National libraries are becoming more like branch offices of Google. Now librarians get annoyed when I point out that:
- Google is indexing books, magazines, Web sites, and Web logs
- Google is indexing government information
- Google is offering a job service that few know anything about but you can read about this in my forthcoming study of Google due in April 2009 from Infonortics (I’m sure an entitlement generation blogger will jump on this item and write about it before my study comes out. Imitation is a form of flattery I suppose, but it is more of a character trait of the trophy children in my opinion)
- Google is gathering videos.
What are libraries doing? Well, I don’t think libraries are in step with what users’ information needs are. I think college professors, mayors, and government officials have their views of libraries. Street people hanging out in the Louisville Free Public Library probably have a different view, however.
I thought of this problem when I read the ZDNet article by Zack Whittaker, “Can We Rely Entirely on Google and Wikipedia?” here. The core of the write up is that Mr. Whittaker doesn’t need the library. He’s of the opinion that Google and Wikipedia provide enough information to write “essays and research”. He does a very good job of comparing what’s available for free and what’s available from a library. To be fair, he does point out that his university library has some utility. But the online services are able to deliver “more than the full library”. The best combination is Internet access and access to a university library.
Now what’s this mean? Mr. Whittaker looks to be about 22 or 23 years old. But what about the kids who are 11 or 12 years old? I think the individuals in this younger demographic chunk will be more comfortable with iPhone and netbook form factors. Libraries may be a very foreign experience. But libraries have to shift into gear or their budgets will continue to shrink. That solves the problem for the 11 to 12 year old. The library will be like a lounge, with no fungible information artifacts required. A connection to the network and access are sufficient.
Stephen Arnold, January 21, 2009
Google’s Knol Milestone
January 18, 2009
Everyone in the drainage ditch in Harrod’s Creek, Kentucky, thinks Knol is a Wikipedia clone. This addled goose begs to differ. This addled goose thinks Knol is a way for the Google to obtain “knowledge” about topics and the experts who contribute to a Knol (a unit of knowledge). Sure, Knol can be used like Wikipedia, but the addled goose thinks the Knol is more, much, much more.
At any rate, the Google announced on January 16, 2009, after the goose tucked its head under its wing for the week, that there are now 100,000 Knols. What this goose found interesting was the headline: “100,000th Knol Published.” I love that word “published”. Google emphasizes that it is not a publisher, but it is interesting to me how the word turns up. You can read the story here.
The blog post contains some interesting insights into Knol; for example, people from 197 countries visit Knol “on an average day.” The interface is available in eight languages. Visitors are editing Knols.
Now how long will it take Knol to reach one million entries?
Stephen Arnold, January 18, 2009
European Digital Library Back Online
December 26, 2008
The Inquirer reported here that the digital library sponsored by the European Union is online again. You can read the announcement here in “Euro Library Re-Opens”. More servers and more optimism should help the service which crashed when it first opened. The addled goose asks, “When will the EU lose its appetite for pumping money into infrastructure?” I am now calculating the odds that the EU seeks help from a company able to scale. Google is a long shot, but the Exalead engineers could contribute.
Stephen Arnold, December 26, 2008
Arnold White Study Published
December 8, 2008
Galatea has published Successful Enterprise Search Management by Stephen E. Arnold and Martin White. The authors are widely known for their research and consulting in search and information management. An interview with Martin White is here.
The study approaches the management aspect of search in information-dense environments: Ineffective information access can make the difference between an organization meeting its goals and actually going out of business. Managers spend up to two hours a day searching for information, and more than 50% of the information they obtain has no value to them.
To support its advice, the book outlines case studies and references to specific vendors’ systems while offering practical guidance on how to better manage key elements of enterprise search including planning, preparation, implementation, and adaptation. Specific topics addressed include text mining and advanced content processing, information governance, and the challenges language itself presents.
“This book will be of value to any organization seeking to get the best out of its current search implementation, considering whether to upgrade the implementation or starting the process of specifying and selecting enterprise search software,” co-author Martin White said.
A detailed summary of the contents of the 130 page report is available on the Galatea Web site here. You can order a copy, which costs about US$200, here. A number of the longer essays in the Beyond Search Web log consist of information excised from the final report.
Stephen Arnold, December 8, 2008


