Potential Trouble for LexisNexis and Westlaw

March 2, 2009

Most online surfers don’t click to Reed Elsevier’s LexisNexis or Thomson Reuters Westlaw. The reason? These commercial services charge money–quite a lot of money–to access legal documents. Executives at both firms can deliver compelling elevator pitches about the added value each company brings to legal documents. In the pre-crash era, legal indexing was a manual process. Then the cost crunch arrived, so both outfits are trying to slap software against the thorny problem of making sense of court documents, rulings, and assorted effluvia of America’s legal factories. I may write about how these two quasi US outfits have monopolized for-fee legal information about American law for lawyers and government agencies. Both Reed and Thomson then turn around and sell access to these documents to the agencies that created them in the first place. I wonder if the good senator is aware of this aspect of commercial online services’ business practices?

What’s the trouble? I bet you thought I was going to mention Google. Wrong. Google is on the edge of indexing legal information in a more comprehensive way. But the right-now trouble is Senator Joe Lieberman. Wired reported that the good senator wondered why public documents are not available without a charge. You can read the story “Lieberman Asks, Why Are Court Docs Still Behind Paid Firewall?” here. Senator Lieberman’s question may lead to a hearing. The process could, in my opinion, start a chain reaction that further erodes the revenue Reed Elsevier and Thomson Reuters derive from public documents. Somewhere in the chain, the Google will beef up the legal content in its Uncle Sam service here.

At their core, Reed Elsevier and Thomson Reuters are traditional publishing and information companies. As such, their business model is fragile. Within the present financial pressure cooker, the Lieberman question could blow the lid off these two organizations’ for-fee legal business. If government agencies shift to a service provided by Google, Microsoft, or Yahoo, I think these two dead tree outfits will crash to the forest floor.

What’s the likelihood of this downside scenario? I would put it at better than 60 percent. Have another view? Share it, please. Set the addled goose straight.

Stephen Arnold, March 2, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, Google, Library automation, Microsoft, News, Online (general), Publishing, Yahoo | 6 Comments

Another British Library Fear

January 28, 2009

Nick Farrell’s “British Library Fears Loss of History” reminded me that libraries are struggling for relevance in a Google-centric world. You can find his Register story here. For me the most interesting comment was:

The British Library has established a department dedicated to the collection of all these digital materials which are stored on your computer in the same way that it stores books, newspapers, documents, maps, personal letters.

I find categorical affirmatives quite amusing. The UK is collecting email and mobile data. Now the British Library wants “all” of a couple of types of digital information. Right now, the only outfit in a position to capture “all” information is Google, not a country, a company.

Libraries find themselves asked to provide shelter, job hunting, and coffee shop duties. One library expressed an interest in mobile furniture and off site book storage. The idea was that users of the library did not need some books right away.

The fear is well founded. Google will allay that fear in my opinion.

Stephen Arnold, January 28, 2009

Written by Stephen E. Arnold · Filed Under Library automation, News, Online (general), Technology | 2 Comments

New Google Study Announced

January 21, 2009

In July 2007, I vowed, “No more Google studies.” I was tired. Now I am just about finished with my third analysis of Google’s technology and business strategy. The two are intertwined. My publisher (Harry Collier, Infonortics Ltd.) has posted some preliminary information here about the forthcoming monograph, Google: The Digital Gutenberg. If you are curious how a Web search engine can be a digital Gutenberg, you will find this analysis of Google’s newest information technology useful. None of the information in this monograph has appeared in the more than 1,200 posts on this Web log, in my two previous Google studies, nor in my more than 200 publicly available articles, columns, and talks.

In short, the monograph will contain new information.

If you are involved in traditional media as a distributor, producer, content creator, aggregator, reseller, indexer, or user, you will find the monograph useful. You may get a business idea or two. If you are the nervous type, the monograph will give you some ideas on which to chew. This study represents more than one year of research and analysis. I don’t pay much attention to the received wisdom about Google. I focus almost exclusively on the open source information about Google’s technology, using journal articles, presentations, and patent documents. The result is a look at Google that is quite different from the “Google is an advertising agency” approach that continues to dominate discourse. Even the recent chatter about Google’s semantic technology is old hat if you read my previous Google monographs. In short, I think this third study provides a solid look at what Google will be unveiling in the period between mid 2009 and the end of 2010. Here are the links to my two earlier studies.

The Google Legacy. Describes how Google’s search system became an application platform. You know this today, but my analysis appeared in early 2005.
Google Version 2.0. Explores Google’s semantic technology and the company’s innovations that greased the skids for applications, enterprise solutions, and disintermediation of commercial database publishers. A recent podcast broke the old news just a few days ago. Suffice it to say that most pundits were unaware of the scope and scale of Google’s semantic innovations. Cluelessness is reassuring, just not helpful when trying to assess a competitive threat in my opinion.

I don’t have the energy to think about a fourth Google study, but this trilogy does provide a reasonably comprehensive view of Google’s technical infrastructure. I know from feedback from Googlers that the information about some of Google’s advanced technology is not widely known among Google’s rank and file employees. Google’s top wizards know, but these folks are generally not too descriptive about Google’s competitive strengths. Most pundits are happy to get a Google mouse pad or maybe a Google baseball hat. Not me. I track the nitty gritty and look past the glow of the lava lamps. I don’t even like Odwalla strawberry banana juice.

Stephen Arnold, January 21, 2009

Written by Stephen E. Arnold · Filed Under Enterprise, Google, Library automation, News, Online (general), Publishing, Semantic, Social, Technology, Text analytics, Text processing | Comments Off on New Google Study Announced

Google: Betting on Demographics

January 21, 2009

A reader groused about my poking fun at the British Library research reports. You can read these Swiftian essays here and here. Libraries are in a tough spot. With the financial crisis expanding, libraries are now the go-to place to get warm and look for a job. Most libraries depend on a funding authority for money. As those authorities find themselves short of cash, libraries find themselves fighting for enough dough to keep staff and pay the heating and electricity bills. Book and journal acquisitions are lower on the list. Therefore, libraries have to justify their monetary needs. The British Library and the other national libraries are leading a charge for the relevance and importance of buildings stuffed with people looking for work. Oh, yes, these libraries want to collect dead tree outputs of publishers, pictures so these can be placed on Yahoo’s Flickr service, and electronic information so a user can access these data. The problem with this picture is that the Google has become the global library. National libraries are becoming more like branch offices of Google. Now librarians get annoyed when I point out that:

Google is indexing books, magazines, Web sites, and Web logs
Google is indexing government information
Google is offering a job service that few know anything about but you can read about this in my forthcoming study of Google due in April 2009 from Infonortics (I’m sure an entitlement generation blogger will jump on this item and write about it before my study comes out. Imitation is a form of flattery I suppose, but it is more of a character trait of the trophy children in my opinion)
Google is gathering videos.

What are libraries doing? Well, I don’t think libraries are in step with what users’ information needs are. I think college professors, mayors, and government officials have their views of libraries. Street people hanging out in the Louisville Free Public Library probably have a different view, however.

I thought of this problem when I read the ZDNet article by Zack Whittaker, “Can We Rely Entirely on Google and Wikipedia?” here. The core of the write up is that Mr. Whittaker doesn’t need the library. He’s of the opinion that Google and Wikipedia provide enough information to write “essays and research”. He does a very good job of comparing what’s available for free and what’s available from a library. To be fair, he does point out that his university library has some utility. But the online services are able to deliver “more than the full library”. The best combination is Internet access and access to a university library.

Now what’s this mean? Mr. Whittaker looks to be about 22 or 23 years old. But what about the kids who are 11 or 12 years old? I think the individuals in this younger demographic chunk will be more comfortable with iPhone and netbook form factors. Libraries may be a very foreign experience. But libraries have to shift into gear or their budgets will continue to shrink. That solves the problem for the 11 to 12 year old. The library will be like a lounge, with no fungible information artifacts required. A connection to the network and access are sufficient.

Stephen Arnold, January 21, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, Cloud computing, Library automation, News, Online (general), Search | 2 Comments

Google’s Knol Milestone

January 18, 2009

Everyone in the drainage ditch in Harrod’s Creek, Kentucky, thinks Knol is a Wikipedia clone. This addled goose begs to differ. This addled goose thinks Knol is a way for the Google to obtain “knowledge” about topics and the experts who contribute to a Knol (a unit of knowledge). Sure, Knol can be used like Wikipedia, but the addled goose thinks the Knol is more, much, much more.

At any rate, the Google announced on January 16, 2009, after the goose tucked its head under its wing for the week, that there are now 100,000 Knols. What this goose found interesting was the headline: “100,000th Knol Published.” I love that word “published”. Google emphasizes that it is not a publisher, but it is interesting to me how the word turns up. You can read the story here.

The blog post contains some interesting insights into Knol; for example, people from 197 countries visit Knol “on an average day.” The interface is available in eight languages. Visitors are editing Knols.

Now how long will it take Knol to reach one million entries?

Stephen Arnold, January 18, 2009

Written by Stephen E. Arnold · Filed Under Cloud computing, Database, Google, Library automation, Publishing, Text analytics, Text processing | Comments Off on Google’s Knol Milestone

European Digital Library Back Online

December 26, 2008

The Inquirer reported here that the digital library sponsored by the European Union is online again. You can read the announcement here in “Euro Library Re-Opens”. More servers and more optimism should help the service which crashed when it first opened. The addled goose asks, “When will the EU lose its appetite for pumping money into infrastructure?” I am now calculating the odds that the EU seeks help from a company able to scale. Google is a long shot, but the Exalead engineers could contribute.

Stephen Arnold, December 26, 2008

Written by Stephen E. Arnold · Filed Under Library automation, News, Online (general) | 2 Comments

Arnold White Study Published

December 8, 2008

Galatea has published Successful Enterprise Search Management by Stephen E. Arnold and Martin White. The authors are widely known for their research and consulting in search and information management. An interview with Martin White is here.

The study approaches the management aspect of search in information-dense environments: Ineffective information access can make the difference between an organization meeting its goals and actually going out of business. Managers spend up to two hours a day searching for information, and more than 50% of the information they obtain has no value to them.

To support its advice, the book outlines case studies and references to specific vendors’ systems while offering practical guidance on how to better manage key elements of enterprise search including planning, preparation, implementation, and adaptation. Specific topics addressed include text mining and advanced content processing, information governance, and the challenges language itself presents.

“This book will be of value to any organization seeking to get the best out of its current search implementation, considering whether to upgrade the implementation or starting the process of specifying and selecting enterprise search software,” co-author Martin White said.

A detailed summary of the contents of the 130 page report is available on the Galatea Web site here. You can order a copy, which costs about US$200, here. A number of the longer essays in the Beyond Search Web log consist of information excised from the final report.

Stephen Arnold, December 8, 2008

Written by Stephen E. Arnold · Filed Under Library automation, News, Publishing, Search, Semantic, Text analytics, Text processing | 5 Comments

Arnold on Disintermediation in New Italian Compendium

December 8, 2008

December 2008 is shaping up as a busy book month. I received on December 6, 2008, my copy of “Galassia Web: La Cultura nella Rete”, published by Civita Associazione with the support of Boeing. I contributed a chapter that begins on page 67 and ends on page 80. My contribution was “Giochi di Open Access e altre nuove tecnologie di communicazione: la tentazione disintermediazion”. If your Italian is a bit rusty, the approximate English translation is “The Interplay of Open Access and Other New Technologies.”

The main point of my contribution hinges on disintermediation. Institutions such as museums and libraries want to provide an online catalog and some type of access to the information under their stewardship. But large companies such as Google are slowly aggregating a broad range of content. For now, commercial enterprises have not shown a desire to create an aggregated service that includes indexes, images, music, and other information public institutions have created. The risk is that unless groups of institutions take the lead in aggregation, the commercial service may by default become the library or the museum for Internet users. In short, the disintermediation that ravaged commercial online services and corporate libraries may now have an impact on the information in the control of universities, public agencies, privately-endowed institutions, and governmental entities. I don’t have a timeline, but I make the point that acting in a parochial way may waste time. Action can provide a countermeasure for the forces of disintermediation.

I want to send a happy quack to the publisher, Moira Macpherson, and the editorial team that made this collection of essays a reality. So, here comes, “Quack!”

Stephen Arnold, December 8, 2008

Written by Stephen E. Arnold · Filed Under Library automation, News, Online (general), Publishing | Comments Off on Arnold on Disintermediation in New Italian Compendium

Why Countries Must Compete with Google

November 20, 2008

A happy quack to one of my two or three readers in Australia. The story “Massive EU Online Library Looks to Compete with Google” sparked a number of ideas in my mind. You can find the full text of the Sydney Morning Herald’s story here. The story reported that the European Union will launch what will be called Europeana. The made up word suggests a big collection of European content. For me, the most interesting comment was:

By 2010, the date when Europeana is due to be fully operational, the aim is to have 10 million works available, an impressive number yet a mere drop in the ocean compared to the 2.5 billion books in Europe’s more common libraries. The process of digitalisation is a massive undertaking. Around one percent of the books in the EU’s national libraries are now available in digital form, with that figure expected to grow to four percent in 2012. And even when they are digitalised, they still have to be put online.

My research suggested in 2004 that Google was building a 21st century version of the pre-break up American Telephone & Telegraph system. The Google vision was global and the 19th century telco was giving way to an applications platform that could deliver digital services from Google data centers to any type of network aware device. In speaking with my publisher about the new distributor for my 2007 Google Version 2.0 study, we touched upon the idea that Google is essentially a country. It is not a company.

I won’t repeat the country argument that I explicate in my Google studies. The point is that the European Union has reached the same conclusion. No one is able to fund a start up that will index the European Union members’ information. Google is aiming for global information. The EU is happy with a couple of dozen countries’ information. More importantly, the EU approach will be to act on behalf of almost 24 nations.

That’s a fairly good example of my assertion: A single company cannot compete with Google. I hope you will disagree. I don’t want to say the pledge of allegiance to a kindergarten colors flag and recite such words as “googley nation” or “TCP/IP on everything”. Use the comments section to prove my assertion that Google can now only be challenged by countries.

Stephen Arnold, November 20, 2008

Written by Stephen E. Arnold · Filed Under Business strategy, Google, Library automation, News, Online (general) | 1 Comment

Google: Building Its Knowledge Base a Book at a Time

October 16, 2008

Google does not seem to want to create a Kindle or Sony eBook. “For what does the firm want to scan and index books?” I ask myself. My research suggests that Google is adding to its knowledge base. Books have information, and Google finds that information useful for its various processes. Google’s book search and its sale of books are important, but if my information is correct, Google is getting brain food for its smart software. The company has deals in place that increase the number of publishers participating in its book project. Reuters’ “Google Doubles Book Scan Publisher Partners” provides a run down on how many books Google processes and the number of publishers now participating. The numbers are somewhat fuzzy, but you can read the full text of the story here and judge for yourself. Google’s been involved in legal hassles over its book project for several years. The fifth anniversary of these legal squabbles will be fast upon us. Nary a word in the Reuters story about Google’s knowledge base. Once again the addled goose is the only bird circling this angle. What do you think Google’s doing with a million or more books in 100 languages? Let me know.

Stephen Arnold, October 16, 2008

Written by Stephen E. Arnold · Filed Under Business strategy, Google, Library automation, News, Technology, Text analytics, Text processing | Comments Off on Google: Building Its Knowledge Base a Book at a Time

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
