Library automation – Beyond Search

British Library: The Math of Can Kicking Security Down the Road

Stephen E. Arnold — Tue, 09 Jan 2024 10:55:00 +0000

This essay is the work of a dumb dinobaby. No smart software required.

I read a couple of blog posts about the security issues at the British Library. I am not currently working on projects in the UK. Therefore, I noted the issue and moved on to more pressing matters. Examples range from writing about the antics of the Google to keeping my eye on the new leader of the highly innovative PR magnet, the NSO Group.

Two well-educated professionals kick a security can down the road. Why bother to pick it up? Thanks, MSFT Copilot Bing thing. I gave up trying to get you to produce a big can and big shoe. Sigh.

I read “British Library to Burn Through Reserves to Recover from Cyber Attack.” The weird orange newspaper usually has semi-reliable, actual factual information. The write up reports or asserts (the FT is a newspaper, after all):

The British Library will drain about 40 per cent of its reserves to recover from a cyber attack that has crippled one of the UK’s critical research bodies and rendered most of its services inaccessible.

I won’t summarize what the bad actors took down. Instead, I want to highlight another passage in the article:

Cyber-intelligence experts said the British Library’s service could remain down for more than a year, while the attack highlighted the risks of a single institution playing such a prominent role in delivering essential services.

A couple of themes emerge from these two quoted passages:

Whatever cash the library has, spitting distance of half is going to be spent “recovering,” not improving, enhancing, or strengthening. Just “recovering.”
The attack killed off “most” of the British Library’s services. Not a few. Not one or two. Just “most.”
Concentration for efficiency leads to failure for downstream services. But concentration makes sense, right? Just ask library patrons.

My view of the situation is familiar if you have read other blog posts about Fancy Dan, modern methods. Let me summarize to brighten your day:

First, cyber security is a function that marketers exploit without addressing security problems. Those purchasing cyber security don’t know much. Therefore, the procurement officials are what a falcon might label “easy prey.” Bad for the chihuahua sometimes.

Second, when security issues are identified, many professionals don’t know how to listen. Therefore, a committee decides. Committees are outstanding bureaucratic tools. Obviously the British Library’s managers and committees may know about manuscripts. Security? Hmmm.

Third, a security failure can consume considerable resources in order to return to the status quo. One can easily imagine a scenario months or years in the future when the cost of recovery is too great. Therefore, the security breach kills the organization. Termination can be rationalized by a committee, probably affiliated with a bureaucratic structure further up the hierarchy.

I think the idea of “kicking the security can” down the road is a widespread characteristic of many organizations. Is the situation improving? No. Marketers move quickly to exploit weaknesses of procurement teams. Bad actors know this. Excitement ahead.

Stephen E Arnold, January 9, 2024

Sweden Has a Social Fabric: The Library Pattern

Stephen E. Arnold — Wed, 29 Nov 2023 10:10:00 +0000

This essay is the work of a dumb dinobaby. No smart software required.

If a building is left unlocked and unattended, it’s all but guaranteed that people will trespass, rob, and vandalize. While we expect the worst from humanity, sometimes our faith in the species is restored with amazing stories like this from ZME Science: “A Door At A Swedish Library Was Accidentally Left Open – 446 People Came In, Borrowed 245 Books. Every Single One Was Returned.”

Gothenburg librarian Anna Carin Elf noticed something odd when she went to work one day. The library was supposed to be closed because it was All Saints’ Day. People, however, were browsing shelves, reading, using computers, and playing in the children’s section. A member of the library staff forgot to shut one of the doors. The next day patrons took advantage of the open door and used the facilities.

When Elf saw the library was open she jumped into action:

"As people were coming in and out of the library, one librarian (Elf) walked by and noticed the people using the library. She realized what was happening, called her manager and a colleague, and then announced that the library was closing. The visitors calmly folded their books closed and left. But some left with books.”

When the library was accidentally left open, the people of Gothenburg borrowed 245 books and every single one was returned. It’s wonderful when communities recognize the importance of libraries and decide to respect them. Libraries continue to be an important part of cities as they provide access to information, Internet, books, activities, and more.

Whitney Grace, November 29, 2023

Racy Poetry Now Available

Stephen E. Arnold — Thu, 26 Oct 2023 15:23:17 +0000

This essay is the work of a dumb humanoid. No smart software required.

My hunch is that you either have forgotten or were not aware of the Wife of Bath. Well, let me tell you that was a hot read in the 16th century. Now you can review the pre-1600 manuscripts of Chaucer’s works. Many years ago my professor for a 15 week class in Chaucer was one of the editors of the then standard text of Chaucer’s poetry. I think his name was J.J. Campbell.

Microsoft’s art generator thinks that the Wife of Bath looks like this machine-generated image. I don’t think the dreamy pix matches my reconstruction of the Wife of Bath, who wore red socks and had a method to generate hard cash on demand.

What he did, I think, was get students like me to undertake specific research and write papers about the topic. My assignments involved tracking references to the even more salacious volumes (at least in the 16th century) of the Apocrypha. Imagine the fun that was. The British Library has digitized the manuscripts and books. These are available at this link. How long will it take Alamy, Getty, and other image wizards to suck out the images and charge people for the use of content created centuries ago? Not long. Not long at all. By the way, watch out for friars in the woods.

Stephen E Arnold, October 26, 2023

The Secret Cultural Erosion Of Public Libraries: Who Knew?

Stephen E. Arnold — Fri, 25 Aug 2023 09:05:00 +0000

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

It appears the biggest problem public and school libraries are dealing with is demands to ban controversial gay and trans titles. While some libraries are facing closures or complete withdrawals of funding, most appear to be in decent standing. Karawynn Long unfortunately discovered that is not the case. She spills the printer’s ink in her Substack post: “The Coming [Cultural Erosion] Of Public Libraries” with the cleverly deplorable subtitle “global investment vampires have positioned themselves to suck our libraries dry.”

Before she details how a greedy corporation is bleeding libraries like a leech, Long explains how there is a looming cultural erosion brought on by capitalism. A capitalist economic system is not inherently evil but bad actors exploit it. Long uses a more colorful word to explain libraries’ cultural erosion. In essence the colorful word means when something good deteriorates into crap.

A great example is when corporations use a platform, e.g., Facebook, Twitter, or Amazon, to pit buyers and sellers against each other while the top runs away with heaps of cash.

This ties back to public libraries because they use a digital library app called OverDrive. Library patrons use OverDrive to access copies of digital books, videos, audiobooks, magazines, and other media. It is the only app available to public libraries to manage digital media. Patrons could access OverDrive via an app called Libby or a Web site portal. In May 2023, the Web site portal deleted a feature that allowed patrons to recommend new titles to their libraries.

OverDrive wants to force users to adopt their Libby app. The Libby app has a “notify me” option that alerts users when their library acquires an item. OverDrive’s overlords also want to collect sellable user data, like other companies. Among other details, OverDrive is owned by the global investment firm KKR, Kohlberg Kravis Roberts.

KKR is painted as one of the vilest investment capital companies, dubbed a “vampire capitalist” firm, and it has a fanged hold on the US’s public libraries. OverDrive flaunts its B corporation status but that does not mask the villain lurking behind the curtain:

“As one library industry publication warned in advance of the sale to KKR, ‘This time, the acquisition of OverDrive is a ‘financial investment,’ in which the buyer, usually a private equity firm or other financial sponsor, expects to increase the value of the company over the short term, typically five to seven years.’ We are now three years into that five-to-seven, making it likely that KKR’s timeframe for completing maximum profit extraction is two to four more years. Typically this is accomplished by levying enormous annual “management fees” on the purchased company, while also forcing it (through Board of Director mandates) to make changes to its operations that will result in short-term profit gains regardless of long-term instability. When they believe the short-term gains are maxed out, the investment firm sells off the company again, leaving it with a giant pile of unsustainable debt from the leveraged buyout and often sending it into bankruptcy.”

OverDrive likely plans to sell user data then bleed the public libraries dry until local and federal governments shout, “Uncle!” Amid book bans and rising inflation, public libraries will see a reckoning with their budgets before 2030.

Whitney Grace, August 25, 2023

AI-Search Tool Talpa Burrows Into Library Catalogues

Stephen E. Arnold — Wed, 19 Jul 2023 09:05:00 +0000

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

For a few years now, libraries have been able to augment their online catalogue with enrichment services from Syndetics Unbound, which adds details and imagery to each entry. Now the company is incorporating new AI capabilities, we learn from its write-up, “Introducing Talpa Search.” Talpa is still experimental and is temporarily available to libraries already using Syndetics Unbound.

A book lover in action. Thanks MidJourney. You made me more appealing than I was in 1951 when I got kicked out of the library for reading books for adults, not stuff about Freddy the Pig.

Participating libraries will get a year of the service for free. We cannot know just how much they will be saving, though, since the pricing remains a mystery. Writer Tim Spalding describes how Talpa works:

“First, Talpa queries large language models (from Claude AI and ChatGPT) for books and other media. Critically, every item is checked against true and authoritative bibliographic data, solving the problem of invented answers (called ‘hallucinations’) that such models can fall into. Second, Talpa uses the natural-language abilities of large language models to parse and understand queries, which are then answered using traditional library data. Thus a search for ‘novels about World War II in France’ is broken down into subjects and tags and answered with results from the library’s collection. Our authoritative book data comes from Syndetics Unbound, Bowker and LibraryThing. Surprisingly, Talpa’s ability to find books by their cover design isn’t powered by AI at all, but by the effort of thousands of book lovers who have played LibraryThing’s CoverGuess cover-tagging game since 2010!”
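The checking step Spalding describes can be pictured with a small sketch. This is only an illustration of the general idea (filter model suggestions against trusted records), not Talpa's code: the `Record` type, the `verify_suggestions` function, the tiny catalog, and the made-up third suggestion are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    author: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def verify_suggestions(suggestions, catalog):
    """Keep only the suggestions that match a record in the trusted catalog."""
    index = {(normalize(r.title), normalize(r.author)): r for r in catalog}
    verified = []
    for title, author in suggestions:
        record = index.get((normalize(title), normalize(author)))
        if record is not None:
            verified.append(record)
    return verified


# Hypothetical stand-in for authoritative bibliographic data.
CATALOG = [
    Record("Suite Française", "Irène Némirovsky"),
    Record("All the Light We Cannot See", "Anthony Doerr"),
]

# Hypothetical model output: two titles in the catalog, one invented for this example.
SUGGESTIONS = [
    ("suite française", "Irène Némirovsky"),
    ("All the Light We Cannot See", "Anthony Doerr"),
    ("The Baker of Vichy Nights", "A. N. Author"),
]

verified = verify_suggestions(SUGGESTIONS, CATALOG)
```

Anything the model proposes that has no counterpart in the catalog simply never reaches the patron, which is the sense in which checking against bibliographic data blunts hallucinations. A production system would presumably match on identifiers and fuzzier keys than an exact title and author pair.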

Interesting. If you don’t happen to be part of a library using Syndetics, you can try Talpa out at one of the three libraries linked to in the post. The tool sports a cute mole mascot and, to add a bit of personality, supplies mole facts beneath the search bar. As with many AI tools, the functionality has plenty of room to grow. For example, my search for “weaving velvet” did return a few loom-centered books scattered through the results but more prominently suggested works of fiction or philosophy that simply contained “velvet” in the title. (Including, adorably, several versions of “The Velveteen Rabbit.”) The write-up does not share when the tool will be available more widely, but we hope it will be more refined when it is. Is it AI? Isn’t everything?

Cynthia Murrell, July 19, 2023

Need Research Assistance, Skip the Special Librarian. Go to Elicit

Stephen E. Arnold — Mon, 17 Jul 2023 09:05:00 +0000

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Academic databases are the bedrock of research. Unfortunately most of them are hidden behind paywalls. If researchers get past the paywalls, they encounter other problems with accurate results and access to texts. Databases have improved over the years but AI algorithms make things better. Elicit is a new database marketed as a digital assistant; it has less intelligence than Alexa, Siri, and Google but can comprehend simple questions.

“This is indeed the research library. The shelves are filled with books. You know what a book is, don’t you? Also, you will find that this research library is not used too much any more. Professors just make up data. Students pay others to do their work. If you wish, I will show you how to use the card catalog. Our online public access terminal and library automation system does not work. The university’s IT department is busy moonlighting for a professor who is a consultant to a social media company,” says the senior research librarian.

What exactly is Elicit?

“Elicit is a research assistant using language models like GPT-3 to automate parts of researchers’ workflows. Currently, the main workflow in Elicit is Literature Review. If you ask a question, Elicit will show relevant papers and summaries of key information about those papers in an easy-to-use table.”

Researchers use Elicit to guide their research and discover papers to cite. Researcher feedback stated they use Elicit to answer their questions, find paper leads, and get better exam scores.

Elicit proves its intuitiveness with its AI-powered research tools. Search results contain papers that do not match the keywords but semantically match the query meaning. Keyword matching also allows researchers to narrow or expand specific queries with filters. The summarization tool creates a custom summary based on the research query and simplifies complex abstracts. The citation graph semantically searches citations and returns more relevant papers. Results can be organized and more information added without creating new queries.
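To make the difference between keyword matching and semantic matching concrete, here is a toy sketch. It is not how Elicit works internally: the three-number "embeddings" are hand-assigned for illustration (a real system would get vectors from a language model), and the paper titles, `cosine`, and `keyword_overlap` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, title):
    """Count the words the query and the title share (plain keyword matching)."""
    return len(set(query.lower().split()) & set(title.lower().split()))


# Hypothetical hand-assigned vectors; the dimensions loosely mean (sleep, memory, finance).
PAPERS = {
    "Slumber duration and recall in adults": (0.9, 0.8, 0.0),
    "Sleep mode power savings in laptops": (0.2, 0.0, 0.1),
    "Interest rates and household debt": (0.0, 0.0, 1.0),
}
QUERY = "does sleep improve memory"
QUERY_VECTOR = (0.8, 0.9, 0.0)

# Rank papers by similarity of meaning rather than by shared words.
ranked = sorted(PAPERS, key=lambda t: cosine(QUERY_VECTOR, PAPERS[t]), reverse=True)
```

In this toy data the top-ranked paper shares no words with the query, while the laptop paper shares the word "sleep" yet ranks lower. That is the behavior the paragraph above describes: results that miss the keywords but match the meaning.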

Elicit does have limitations, such as the inability to evaluate information quality. Also, Elicit is still a new tool, so mistakes will be made during the development process. Elicit does warn users about mistakes and advises them to use tried and true, old-fashioned research methods of evaluation.

Whitney Grace, July 16, 2023

What Is the Purpose of a Library? Maybe WiFi?

Stephen E. Arnold — Thu, 13 Jul 2023 09:10:00 +0000

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

The misinformed believe libraries only offer access to free books, DVDs, and public computers in various states of obsoleteness. Libraries are actually a hub for Internet access, including WiFi. The Internet is a necessary tool and many people do not have reliable access due to low income, homelessness, or rural location. KQED reports on how San Francisco is handling WiFi access and homelessness at a local library: “What Happens When Libraries Stop Sharing Wi-Fi?”

San Francisco, California is experiencing record high homelessness. Businesses and people are abandoning the city, crime is running rampant on the streets, and law enforcement’s hands are tied. Homeless people regularly visit libraries to use the computers and the WiFi. Libraries usually keep their WiFi on 24/7 so their parking lots and outside areas are active hotspots.

The Eureka Valley/Harvey Milk Memorial Branch Library turns off its WiFi at night. This is the only San Franciscan library that shuts off its WiFi, because homeless people visited the library after hours and the surrounding neighborhood had an increase in crime. Pulling the WiFi plug is another way San Francisco clears sidewalks and prevents people from sleeping in certain areas.

“‘Neighbors in that area have been dealing with repeated encampments, open-air drug sales and use, harassment of local businesses and all-around problematic situations going on for a decade at this point,’ said [Supervisor Rafael Mandelman]. ‘It reached its nadir in the pandemic in 2020. There were encampments on both sides of the street, the sidewalk was impassable, and the historic AIDS mural had been wildly defaced. Neighbors were being threatened. It was bad.’”

The library faced homeless people camping on the roof, hacking into its electricity, and breaking into a closet. After the library shut off the WiFi, emergency services were called less often in the area. However, the drop cannot be attributed entirely to the WiFi shutdown. Some homeless people in the area found permanent housing, a mural was repainted, and other services were enacted.

WiFi is an essential service, but if the people and places that provide it are harmed, it is ruined for everybody. It is an ethical conundrum, but if crime, drugs, debris, and homeless encampments make an area dangerous, then measures must be taken to resolve the problems.

Whitney Grace, July 13, 2023

Harvard and a Web Archive Tool

Stephen E. Arnold — Thu, 18 May 2023 09:15:00 +0000

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

The Library of Congress has dropped the ball and the Internet Archive may soon be shut down. So it is Harvard to the rescue. At least until people sue the institution. The university’s Library Innovation Lab describes its efforts in, “Witnessing the Web is Hard: Why and How We Built the Scoop Web Archiving Capture Engine.”

“Our decade of experience running Perma.cc has given our team a vantage point to identify emerging challenges in witnessing the web that we believe extend well beyond our core mission of preserving citations in the legal record. In an effort to expand the utility of our own service and contribute to the wider array of core tools in the web archiving community, we’ve been working on a handful of Perma Tools. In this blog post, we’ll go over the driving principles and architectural decisions we’ve made while designing the first major release from this series: Scoop, a high-fidelity, browser-based, single-page web archiving capture engine for witnessing the web. As with many of these tools, Scoop is built for general use but represents our particular stance, cultivated while working with legal scholars, US courts, and journalists to preserve their citations. Namely, we prioritize their needs for specificity, accuracy, and security. These are qualities we believe are important to a wide range of people interested in standing up their own web archiving system. As such, Scoop is an open-source project which can be deployed as a standalone building block, hopefully lowering a barrier to entry for web archiving.”

At Scoop’s core is its “no-alteration principle” which, as the name implies, is a commitment to recording HTTP exchanges with no variations. The write-up gives some technical details on how the capture engine achieves that standard. Aside from that bedrock doctrine, though, Scoop allows users to customize it to meet their unique web-witnessing needs. Attachments are optional and users can configure each element of the capture process, like time or size limits. Another pair of important features is the built-in provenance summary, including preservation of SSL certificates, and authenticity assertion through support for the Web Archive Collection Zipped (WACZ) file format and the WACZ Signing and Verification specification. Interested readers should see the article for details on how to start using Scoop. You might want to hurry, before publishers jump in with their inevitable litigation push.
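The idea behind "no alteration" plus a later authenticity check can be shown in a few lines. This is a generic sketch, not Scoop's API (Scoop is a separate open-source project with its own interface) and not the WACZ signing scheme, which involves more than a bare hash. The `capture`, `digest`, and `is_unaltered` names and the record layout are assumptions made up for this example.

```python
import hashlib


def digest(payload: bytes) -> str:
    """Fingerprint the exact bytes that were received."""
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def capture(url: str, payload: bytes) -> dict:
    """Store the payload byte-for-byte, plus a digest taken at capture time."""
    return {"url": url, "payload": payload, "digest": digest(payload)}


def is_unaltered(record: dict) -> bool:
    """Recompute the digest and compare it with the one recorded at capture."""
    return digest(record["payload"]) == record["digest"]


# Hypothetical capture of a tiny page body.
record = capture("https://example.com/", b"<html><body>Hello</body></html>")
intact = is_unaltered(record)

# Simulate someone editing the stored copy after the fact.
tampered = dict(record, payload=record["payload"].replace(b"Hello", b"Goodbye"))
altered_detected = not is_unaltered(tampered)
```

A hash alone only shows that the stored bytes match what was recorded; it says nothing about who made the capture or when. That gap is what a signing and verification layer is meant to close.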

Cynthia Murrell, May 18, 2023

Libraries: A Target?

Stephen E. Arnold — Tue, 04 Oct 2022 09:15:00 +0000

Reading is FUNdamental. I am not sure that’s an accurate slogan today. “Libraries Across The US Are Receiving Violent Threats” reports:

In the last two weeks, at least a dozen public libraries across the U.S. received threats that resulted in canceled events and system-wide closures. While bomb and active shooter threats to public library systems in Nashville, Fort Worth, Denver, Salt Lake City, and Boston and other cities across the country were ultimately deemed hoaxes, library workers and patrons say they are still reeling in the aftermath.

Nice.

I grew up with the following impressions of libraries:

My mother took me to the library each week so she could return the books she read from the previous week. She checked out books. I am not sure how old I was when I became aware of this library routine. Didn’t everyone go to the library once a week? Not to protest or make threats, but to get books and introduce a child to the “routine”?
My sixth grade teacher, Ms. Costello, awarded a paper “flag” for each book read by a student. On the wall was a list of her students. The flags were pinned after each student’s name. One book received one white flag. Five books were converted to a white flag with a blue border. Ten books received a white flag with a red border. Twenty books were represented by a white flag with a yellow border. Each school year ended with Ms. Costello recognizing the students who read the most books. (Guess who won?) I made many trips to the Prospect Branch Library because I quickly nuked the grade school library of books which interested me.
In high school, wearing my worn out sneakers, my cool plaid shirt, and my blue jeans with cuffs no less, I went to the downtown library which I reached via the bus. In my high school, English teachers assigned essays which had to have footnotes. The reference desk librarians were helpful and showed me the ropes of microfilm newspapers (wow, that technology sucked. Wasn’t there a better way to search?), the Reader’s Guide to Periodical Literature (wow, that print index sucked. Wasn’t there a better way to search and get access to the full text of the article?), and the mysteries of the books behind the reference desk. (Oh, Constance Winchell, I loved you!)
In college, I made the library my home away from home between classes. I had favorite tables at which to work. I loved the Library of Congress cataloging system. I knew exactly where certain book topics were shelved. I worked in the library on and off for a couple of years until I landed a higher paying job, but I learned how to get first crack at books professors put on reserve. I also located the COBOL instruction manuals and used them to do my first computer-based indexing project for a professor named William Gillis. Believe it or not, that project was my ticket to the world of commercial database indexing and my first real job at Halliburton Nuclear in Washington, DC. I indexed nuclear information using good old PDP computers. Exciting? You bet.

Why have I isolated four library experiences?

None require terror threats, political actions, or any behavior other than respect for the professionals who assisted me. My wife has told me that I could have gone to work right after high school and skipped college. She’s wrong. I am not sure I learned too much in my college courses. The bulk of the information was repetitive or something with which I was familiar based on my reading.

What was valuable to me was the opportunity to spend significant time in the university library. Here’s a fun fact: I was thrilled when a college event took place on Friday nights. I knew I would be one of a very few students in the library when the event was underway. Silence, no delays at the photocopy machine, no waiting for a specific card catalog drawer, and no one clogging the space between the shelves.

What’s my view of libraries? Can’t figure it out? Perhaps you should consider what one can achieve by doing the library thing. Online is okay, but it sure isn’t the library thing. I should know because I was involved and maybe instrumental in a number of very successful and widely used commercial databases. I knew paper indexes sucked, and I did something about it.

But libraries. The prime mover for me. Why be afraid of learning, knowledge, information, and different ideas? My answer is that those without a library “backbone” are lost in a digital world in which TikTok information imparts wisdom. Ho ho ho.

Stephen E Arnold, October 4, 2022

Libraries and Google: Who Wins?

Stephen E. Arnold — Wed, 31 Aug 2022 09:10:00 +0000

Google uses various ways to protect users’ accounts, such as authentication through a mobile phone or non-Gmail address. This is a problem for large portions of the American population who don’t have regular access to the Internet. These include ethnic minorities, people with low socioeconomic status, and the elderly. These groups usually rely on public libraries for Internet access. These groups also need welfare and other assistance programs for survival.

Shelly R., a librarian in the Free Library of Philadelphia System, wrote a letter to Google in 2021 about how their security authentication hurts these groups. The letter, which was meant to be private, was picked up by Hacker News. Her description of the services her library system provides is typical of many places in the United States.

People say that libraries are obsolete, but the naysayers are not taking into account the people who need Internet access, help with technology literacy, assistance applying for benefits and jobs, and more. Librarians have one of the most stressful jobs in the country, because they are forced into more roles than helping people research: teacher, therapist, babysitter, and more. The number of roles librarians fill is ridiculous; however, helping people in their community get access to technology is one thing they excel at.

Shelly R. makes a valid point that many groups cannot afford expensive technology or do not know how to use it. They rely on community resources such as the public library for assistance, but security features like Google’s authentication system do not help them.

Online accounts must remain secure to protect users, but people without regular Internet access or technology literacy must be taken into account as well. The Internet is supposed to be a great equalizer, but it does not work when not everyone has equal access.

Shelly R. updated the letter in August 2022, said she spoke with Google’s security team, and things were better for her job. Is that true? We hope so. If only Google would do more to help equalize Internet access. Hey Google, maybe you could donate money or resources to public libraries? You have the power and ability to do so, plus it would be a tax write-off.

Whitney Grace, August 31, 2022