Online Books

June 16, 2020

The Internet Archive has pulled in its digital tentacles. Are there collections of online books that will not attract lawsuits from increasingly stressed “real” publishers?

The answer is, “Sort of.”

For a listing of “over three million free books on the Web”, point your Mother Hen browser at “The Online Books Page.” Some exploration is needed. The categories are not exactly easy to use, but what online index is these days?

The “Search Our Listings” feature lets a user search by author’s last name and title. The problem, as many grade school students know, is that an author’s name can return many listings. To see what I mean, plug in “Plato”. There you go. A list of books that will dissuade some from locating the old guy who argued with Socrates (not the football playing medical doctor from Brazil).

You can also access a feature called “Exclude extended shelves.” Despite the name, the NOT function delivers the goods. Why make Boolean into something that makes little sense?
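For readers who want the NOT logic spelled out, here is a minimal Python sketch. The `Listing` record, its `extended` flag, and the tiny sample catalog are assumptions made for illustration; this is not the Online Books Page's data model or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    author: str
    title: str
    extended: bool  # hypothetical flag: True if the record sits on an "extended shelf"


def search_by_author(listings, last_name, exclude_extended=False):
    """Return listings whose author field contains last_name (case-insensitive).

    With exclude_extended=True the filter behaves like a Boolean NOT:
    author matches AND NOT extended.
    """
    needle = last_name.lower()
    hits = [rec for rec in listings if needle in rec.author.lower()]
    if exclude_extended:
        hits = [rec for rec in hits if not rec.extended]
    return hits


# Hypothetical catalog; the shelf assignments are invented for the example.
catalog = [
    Listing("Plato", "The Republic", extended=False),
    Listing("Plato", "Apology", extended=True),
    Listing("Aristotle", "Poetics", extended=False),
]

everything = search_by_author(catalog, "Plato")
curated_only = search_by_author(catalog, "Plato", exclude_extended=True)
print(len(everything), len(curated_only))
```

The point of the sketch: a checkbox labeled “Exclude extended shelves” is one extra `not` in the filter, which is why the odd label still delivers the goods.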

The new listings option delivers an earthworm result. If you like to browse, this is your Disneyland. Want magazines? Just click “Serials.” This page leads to more pages listing magazines. Some of the journals in the link to the Electronic Journals Library are not free. Well, free is relative, I suppose.

The effort to gather the information is admirable. Polishing, editorial control, and consistent presentation may arrive in the future.

Worth checking into an author with whom one is familiar. Browsing can be interesting. Years ago I told a former client that no firm had a comprehensive index of electronic books. That company’s young and confident managers did not believe me. Flash forward to 2020: the problem still exists. There you go.

Stephen E Arnold, June 16, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general), Reference tool

What Type of Content Is Plentiful? Ever Been to a Cow Barn?

June 9, 2020

DarkCyber enjoyed “Most Tech Content Is Bullshit.” The write up explains:

I saw developers taking other people’s solutions for granted. Not thinking twice about the approach, not bothering about analyzing it.

When developers were asked about the behavior, the article reports four common answers:

It was in some article.
I copy-pasted it from X.
I was doing it in my previous project.
Someone told me so.

Unfortunately, these four points cover the bases for odd, wrong, and off base information.

The logical error is “appeal to authority.” Information issued from someone perceived as authoritative may be accepted readily. Today some people believe just about anything available online.

Why is this human failing taking place? The write up provides four reasons:

We are lazy.
We don’t have time.
It’s comfortable.
We don’t believe in ourselves.

The problem is unlikely to be resolved. There are some minor concerns: Money, the pandemic, civil disturbances, and international tensions. Plus, I want to make clear that search engine optimization and a desire to be perceived as an expert are darned significant factors.

Net net: There’s little likelihood of rapid change. Social distance, wait for a bailout check, and be confident in your children’s future. No big deal. And that online fix for sluggish DNS lookups? Not to worry.

Stephen E Arnold, June 9, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general), Technology

Brave Browsing Sniping

June 9, 2020

DarkCyber noted “The Brave Web Browser Is Hijacking Links, and Inserting Affiliate Codes.” The write up explains that the Brave browser is behaving in a way that is unseemly. The point is that a free Web browser is pitching privacy and at the same time performs some underhanded actions to generate revenue. The explanation of the digital sleight of hand is interesting and illustrates that those “gee, stuff is free” online users assume one thing and may find something different. The write up includes this list and suggestions for accessing Web sites in a non-Brave way. We quote:

There is no good reason to use Brave. Use Chromium — the open-source core of Chrome — with the uBlock Origin ad blocker. [Chromium download, uBO Chrome]

Or use Firefox with uBlock Origin — ‘cos it blocks more ads than the Chromium framework will let anything block. [uBO Firefox]

Or, if you want a really cleaned-out Chrome — ungoogled-chromium, with uBlock Origin. [GitHub]

If you’re on Android, use Firefox with uBlock Origin, or the new Firefox Focus browser. [Mozilla]

Brave is a browser for suckers who want to keep getting played — so it’s a 100% crypto enterprise. As Eich’s pinned tweet still tells us: “Who gets paid? If not you, then you’re ‘product’.” [Twitter]

DarkCyber is not sure if this comment is as ominous as it sounded to one DarkCyber researcher:

Brendan Eich has responded to this post by claiming “David lies about us all the time.” I have pointed out that this is a prima facie defamatory statement, and asked him to detail these claimed lies. [Twitter, archive]

Mr. Eich is the alleged perpetrator of the Brave misdeeds. Online marketing and advertising are fascinating disciplines.

Stephen E Arnold, June 8, 2020

Written by Stephen E. Arnold · Filed Under cybercrime, News, Online (general)

Technology Taps Brakes on Learning? Well, Well, Well

June 8, 2020

DarkCyber is often amused when friends and acquaintances explain that tablets, personal computers, and mobile phones make their children more intelligent. Living in a cloud of unknowing is a delightful mental condition. One can only imagine the disdain with which the information in “Harder to Learn about Science with Modern Technology – Astronomer Royal” will be greeted.

The person dubbed the Big Dog of Astronomy in the UK has some definite ideas; to wit:

“I think paradoxically our high-tech environments may actually be an impediment to sustaining useful enthusiasm in science,” Lord Rees said….

The gadgets that now pervade our lives, smartphones and such like, are baffling black boxes and pure magic to most people. “If you take them apart you find few clues to intricate miniaturized mechanisms and you certainly can’t put them together again. “So, the extreme sophistication of modern technology, wonderful though those benefits are, is ironically an impediment to engaging young people in the basics, with learning how things work.”

I think the summary is that dumbing down goes up as high-technology squeezes learning into an easily controlled digital experience.

This is a positive for some companies. Are you able to identify two or three?

Stephen E Arnold, June 8, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general)

Adulting at Facebook: Filtering Government Generated Content

June 5, 2020

Facebook may have realized that certain nation states are generating weaponized content. My goodness, what an insight, what a flash of brilliance, what a realization about the world of adults! DarkCyber noted “Facebook to Block Ads from State Controlled Media Entities in the U.S.” The write up reports:

Facebook said Thursday (June 5, 2020) it will begin blocking state-controlled media outlets from buying advertising in the U.S. this summer. It’s also rolling out a new set of labels to provide users with transparency around ads and posts from state-controlled outlets. Outlets that feel wrongly labeled can appeal the process.

Some government professionals in Sweden were hip to state actors using social media ads for state owned purposes about a decade ago. Facebook just got the memo maybe? The article adds:

The purpose of labeling these outlets is to give users transparency about any kind of potential bias a state-backed entity may have when providing information to U.S. users.

How many users of social media know that some content is “real,” and other content is “pay to play”? Not too many. DarkCyber has picked up hints that fewer than five percent of online content consumers can figure out provenance as a concept, let alone identify wonky information and data. Infographics? Hey, looks like real numbers, right?

The key point is that adulting is arriving a day late and a dollar short. Does anyone care? Sure, some people. But decades into the Wild West of weaponized content, information and data, slapping an index term on a content object is similar to watching the ocean liner sailing toward the horizon. Missing the boat? Yep.

Stephen E Arnold, June 6, 2020

Written by Stephen E. Arnold · Filed Under Government, News, Online (general)

GoAccess: A Log Analyzer

June 4, 2020

We are updating our tools section of an upcoming National Crime Conference lecture. If you have access to a Web server and a log, you may want to take a look at GoAccess. The software

was designed to be a fast, terminal-based log analyzer. Its core idea is to quickly analyze and view web server statistics in real time without needing to use your browser.

“Analytics without Google” provides additional information about the software and includes helpful pointers. The article states:

What I further liked about GoAccess is I could run it on a separate machine, transferring logs from multiple servers into one place, then creating my necessary dashboards; this isn’t a specific feature of GoAccess, but a feature of the Unix philosophy. This flexibility works well with my seemingly ephemeral Digital Ocean Droplets, which don’t go kaboom on their own, but rather suffer from my own tendencies to erase and start from scratch. GoAccess reminded me how beautiful composable tools are. Its feature set is minimal and it plays nicely with the tools already available to us on a *nix platform. Do one thing and do it well — words of wisdom.
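To make the “logs from multiple servers into one place” idea concrete, here is a rough Python sketch of the sort of tally a log analyzer produces. It is not GoAccess code and does not use the GoAccess command line. It assumes access logs in the widely used Common/Combined Log Format, and the sample lines and the `summarize` helper are invented for illustration.

```python
import re
from collections import Counter

# Prefix shared by Common and Combined Log Format:
# host ident user [time] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def summarize(log_sources):
    """Merge lines from several servers' logs and count paths and statuses."""
    paths, statuses = Counter(), Counter()
    for lines in log_sources:
        for line in lines:
            record = parse_line(line)
            if record is None:
                continue  # skip malformed lines rather than fail
            paths[record["path"]] += 1
            statuses[record["status"]] += 1
    return paths, statuses


# Hypothetical sample lines standing in for logs copied from two servers.
server_a = [
    '203.0.113.5 - - [04/Jun/2020:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '203.0.113.9 - - [04/Jun/2020:10:00:02 +0000] "GET /missing HTTP/1.1" 404 128',
]
server_b = [
    '198.51.100.7 - - [04/Jun/2020:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 512',
]

paths, statuses = summarize([server_a, server_b])
print(paths.most_common(3))
print(dict(statuses))
```

In practice each list would be replaced by an open file handle for a log copied from a server; the function only needs something it can iterate over line by line.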

Worth a look.

Stephen E Arnold, June 4, 2020

Written by Stephen E. Arnold · Filed Under Analytics, News, Online (general)

Bookmarks and the Dynamic Web: Yes, Still a Problem

June 3, 2020

Apparently, bookmarks are a thing. Again. Memex from WorldBrain.io is an open source browser extension that allows users to annotate, search, and organize online information locally. The offline functionality supports both privacy and data ownership. It is available for Chrome, Firefox, and Brave browsers, and now offers a mobile app called Memex Go. The product page lists these features:

Full Text History Search: Automatically indexes websites you visit. Instantly recover anything you’ve seen without upfront work.

Highlights & Annotations: Keep your thoughts organized with their original context.

Tags, Lists & Bookmarks: Quickly organize content via the sidebar or keyboard shortcuts.

Quickly save & organize content on the go: Encrypted sync between your computer, iOS and Android devices.

Your Data and Attention are yours: Memex is offline first & WorldBrain.io introduced a cap on investor returns so we don’t exploit your attention and data to maximize investor profits.

The page illustrates each feature with a dynamic screen shot, so check it out for more details. You can also click here to learn more about their financial philosophy. The Basic version of Memex is free, while the Pro version costs € 2 per month or € 20 per year (after the 14-day free trial). WorldBrain.io hopes its software will contribute to a “well-informed and less polarized global society.” Based in Berlin, the company was founded in 2017.
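How might a local “full text history search” work? A common approach is an inverted index that maps each word to the pages containing it. The Python sketch below illustrates that general technique only; the `HistoryIndex` class, its methods, and the example URLs are hypothetical and say nothing about how Memex is actually built.

```python
import re
from collections import defaultdict


class HistoryIndex:
    """Toy inverted index: word -> set of URLs whose text contains it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add_page(self, url, text):
        """Index the visible text of a visited page under its URL."""
        for word in self._tokens(text):
            self._postings[word].add(url)

    def search(self, query):
        """Return URLs that contain every word in the query (AND semantics)."""
        words = self._tokens(query)
        if not words:
            return set()
        results = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            results &= self._postings.get(word, set())
        return results


# Hypothetical browsing history kept entirely on the local machine.
index = HistoryIndex()
index.add_page("https://example.org/a", "Log analysis with composable Unix tools")
index.add_page("https://example.org/b", "Bookmarks and the dynamic Web")

print(index.search("unix tools"))
print(index.search("bookmarks unix"))
```

Because the index lives in memory on the user's machine, nothing about the browsing history leaves it, which is the privacy argument an offline-first tool makes.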

Cynthia Murrell, June 3, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general), Reference tool

The Good Old Internet Archive Attracts Some Legal Eagle Action

June 2, 2020

Who really owns the Internet Archive? Does the Internet Archive still bundle up tweets and provide them to the august Library of Congress? Are those the caterpillar tracks of the Bezos bulldozer in the roadway near the Internet Archive’s headquarters? DarkCyber does not know the answers to these questions.

What is clear is that the Association of American Publishers (yes, there are still American publishers) is not happy with the Internet Archive. “Publishers File Suit Against Internet Archive for Systematic Mass Scanning and Distribution of Literary Works. Ask Court to Enjoin and Deter Willful Infringement” is reasonably well written, probably because there is a Vassar literature major in the PR chain. The write up states:

… member companies of the Association of American Publishers (AAP) filed a copyright infringement lawsuit against Internet Archive (“IA”) in the United States District Court for the Southern District of New York. The suit asks the Court to enjoin IA’s mass scanning, public display, and distribution of entire literary works, which it offers to the public at large through global-facing businesses coined “Open Library” and “National Emergency Library,” accessible at both openlibrary.org and archive.org. IA has brazenly reproduced some 1.3 million bootleg scans of print books, including recent works, commercial fiction and non-fiction, thrillers, and children’s books.

The AAP, like DarkCyber, finds the self aggrandizing, virtue signaling about pandemics, unemployment, and other contentious social issues tiresome. Amazon and Google are busy waving their hands after recent social turmoil. That helps if one is seeking clicks and some positive corporate CxO stroking.

The AAP’s statement continues:

Despite the self-serving library branding of its operations, IA’s conduct bears little resemblance to the trusted role that thousands of American libraries play within their communities and as participants in the lawful copyright marketplace. IA scans books from cover to cover, posts complete digital files to its website, and solicits users to access them for free by signing up for Internet Archive Accounts. The sheer scale of IA’s infringement described in the complaint—and its stated objective to enlarge its illegal trove with abandon—appear to make it one of the largest known book pirate sites in the world. IA publicly reports millions of dollars in revenue each year, including financial schemes that support its infringement design.

You can read the AAP statement via the link above.

At a time when civility is in short supply, the AAP approaches its legal foe this way:

The lawsuit reflects widespread anger among publishers, authors, and the entire creative community regarding IA’s actions and its response to objections. In an open letter to IA and its Board of Directors, the Authors Guild observed, “You cloak your illegal scanning and distribution of books behind the pretense of magnanimously giving people access to them. But giving away what is not yours is simply stealing, and there is nothing magnanimous about that. Authors and publishers—the rights owners who legally can give their books away—are already working to provide electronic access to books to libraries and the people who need them. We do not need Internet Archive to give our works away for us.”

Yikes. The news release should have carried a trigger warning. Where is that Vassar-powered red pencil when one needs it? After the Google-centric headline, why not add “This content may be disturbing”?

Will publishers succeed in this effort? The flapping of the legal eagles over the Google Books project is less noisy than it was. (When will those angry with Google realize that projects die at Google because staff lose interest?)

Internet Archive may be different. One bright spot: The search and retrieval mechanism for Internet Archive content is darned interesting. Try to find a content object. Great stuff. When a content object cannot be found, does it exist?

The lawsuit is unlikely to consider this question.

Stephen E Arnold, June 2, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general), Publishing

List of Online Libraries

June 2, 2020

One of the DarkCyber researchers spotted a list of online libraries. The source is an unlikely one: Voat and a contributor named Auchtung. The links point to Web sites which provide access to collections. The “collections” are often duplicates; that is, there is redundancy in the list. DarkCyber believes that if one library is taken down, another one can be located. If you are curious about lists of books offered without charge, navigate to this link. Registration may be required. Also, it is possible that some of the information offered on these Web pages is protected by copyright. Just a heads up.

Stephen E Arnold, June 2, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general)

Survey Says, Make the Content Go Away, Please

May 19, 2020

TechRadar states the obvious—“Want to Remove Information About Yourself Online? You’re Not Alone.” The write-up cites a recent Kaspersky survey of over 15,000 respondents. It confirms people are finally taking notice that their personal data has been making its way across the Web. The findings show a high percentage of Internet users have tried to erase personal information online, and for good reason, but many have met with little success. Writer Mike Moore reports:

“Four in five people (82 percent) surveyed in a major study by Kaspersky said they had tried to remove private information which had been publicly available, either from websites or social media channels, recently. However a third (37 percent) of those surveyed had no idea of how to remove details about themselves online. … [The survey] found that over a third (34 percent) of consumers have faced incidents where their private information was accessed by someone who did not have their consent. Of these incidents, over a quarter (29 percent) resulted in financial losses and emotional distress, and more than a third (35 percent) saw someone able to gain access to personal devices without permission. This rises to 39 percent among those aged between 25 and 34, despite younger internet users often being expected to have higher levels of technological literacy. Overall, one in five people say they are concerned about the personal data that organizations are collecting about them and their loved ones.”

The standard recommendation to protect privacy in the first place has been to use a VPN, but even that may be inadequate. A study performed by TechRadar Pro found that nearly half of all VPN services are based in countries that are part of the Fourteen Eyes international surveillance alliance. Looking for alternatives? Moore shares this link to a TechRadar article on what they say are the most secure VPN providers.

Cynthia Murrell, May 19, 2020

Written by Stephen E. Arnold · Filed Under News, Online (general), Privacy


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.