
Avast: Pirate Libraries

July 26, 2016

They are called “pirate libraries,” but one would be better served envisioning Robin Hood than Blackbeard. Atlas Obscura takes a look at these flouters of scientific-journal copyrights in “The Rise of Pirate Libraries.” These are not physical libraries but virtual ones, where researchers and other curious folks can study articles otherwise accessible only through expensive scientific-journal paywalls. Reporter Sarah Laskow writes:

“The creators of these repositories are a small group who try to keep a low profile, since distributing copyrighted material in this way is illegal. Many of them are academics. The largest pirate libraries have come from Russia’s cultural orbit, but the documents they collect are used by people around the world, in countries both wealthy and poor. Pirate libraries have become so popular that in 2015, Elsevier, one of the largest academic publishers in America, went to court to try to shut down two of the most popular, Sci-Hub and Library Genesis.

“These libraries, Elsevier alleged, cost the company millions of dollars in lost profits. But the people who run and support pirate libraries argue that they’re filling a market gap, providing access to information to researchers around the world who wouldn’t have the resources to obtain these materials any other way.”

The development of these illicit repositories traces back to Russia and its surrounds, where academics had a long history of secretly sharing documents under the repressive Soviet Union.  In the 1990s, this tradition began to move online; one of the first pirate-library websites was Lib.Ru. Since then, illegally shared knowledge from more parts of the world has been made available, particularly from Western publishers and universities. Furthermore, the speed with which materials make it online has increased considerably.

Which is more worthy: protecting the stranglehold academic journals have managed to legally establish, and profit from, on research and other information? Or allowing people who possess great curiosity, but who lack deep pockets, to access the latest research? The scholarly pirates have made their choice.

 

 

Cynthia Murrell, July 26, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

There is a Louisville, Kentucky Hidden Web/Dark Web meet up on July 26, 2016. Information is at this link: http://bit.ly/29tVKpx.

 

More Palantir Spotting

June 27, 2016

Trainspotting is a collection of short stories or a novel presented as a series of short stories by Irvine Welsh. The fun lovers in the fiction embrace avocations which seem to be addictive. The thrill is the thing. Now I think I have identified Palantir spotting.

Navigate to “Palantir Seeks to Muzzle Former Employees.” I am not too interested in the allegations in the write up. What is interesting is that the article is one of what appears to be a series of stories about Palantir Technologies enriched with nonpublic documents.


The Thingiverse muzzle might be just the ticket for employees who want to chatter about proprietary information. I assume the muzzle is sanitary and durable, comes in various sizes, and adapts to the jaw movement of the lucky dog wearing the gizmo.

Why use the phrase “Palantir spotting”? It seems to me that making a hobby of an outfit which provides services and software to government entities is unusual. I, for example, lecture about the Dark Web, how to recognize recycled analytics algorithms and their assorted “foibles,” and how to find information in the new, super helpful Google Web search system.

Poking the innards of an outfit with interesting software and some wizards who might be a bit testy is okay if done with some Onion-type or Colbert-like humor. Doing what one of my old employers did in the 1970s to help ensure that company policies remain inside the company is old hat to me.

In the write up, I noted:

The Silicon Valley data-analysis company, which recently said it would buy up to $225 million of its own common stock from current and former staff, has attached some serious strings to the offer. It is requiring former employees who want to sell their shares to renew their non-disclosure agreements, agree not to poach Palantir employees for 12 months, and promise not to sue the company or its executives, a confidential contract reviewed by BuzzFeed News shows. The terms also dictate how former staff can talk to the press. If they get any inquiries about Palantir from reporters, the contract says, they must immediately notify Palantir and then email the company a copy of the inquiry within three business days. These provisions, which haven’t previously been reported, show one way Palantir stands to benefit from the stock purchase offer, known as a “liquidity event.”

Okay, manage information flow. In my experience, money often comes with some caveats. At one time I had lots and lots of @Home goodies which disappeared in a Sillycon Valley minute. The fine print for the deal covered the disappearance. Sigh. That’s life with techno-financial wizards. It seems life has not changed too much since the @Home affair decades ago.

I expect that there will be more Palantir-centric stories. I will try to note these when they hit my steam-powered radar detector in Harrod’s Creek. My thought is that, like the protagonists in Trainspotting, Palantir spotting might have some aftereffects.

I keep asking myself this question:

How do company confidential documents escape the gravitational field of a comparatively secretive company?

Either the Palantir spotters are great data gatherers, or those with access to the documents are making the material available. No answers yet. Just that question about “how.”

Stephen E Arnold, June 27, 2016

An Early Computer-Assisted Concordance

November 17, 2015

An interesting post at Mashable, “1955: The Univac Bible,” takes us back in time to examine an innovative indexing project. Writer Chris Wild tells us about the preacher who realized that these newfangled “computers” might be able to help with a classically tedious and time-consuming task: compiling a book’s concordance, or alphabetical list of key words, their locations in the text, and the context in which each is used. Specifically, Rev. John Ellison and his team wanted to create the concordance for the recently completed Revised Standard Version of the Bible (also newfangled.) Wild tells us how it was done:

“Five women spent five months transcribing the Bible’s approximately 800,000 words into binary code on magnetic tape. A second set of tapes was produced separately to weed out typing mistakes. It took Univac five hours to compare the two sets and ensure the accuracy of the transcription. The computer then spat out a list of all words, then a narrower list of key words. The biggest challenge was how to teach Univac to gather the right amount of context with each word. Bosgang spent 13 weeks composing the 1,800 instructions necessary to make it work. Once that was done, the concordance was alphabetized, and converted from binary code to readable type, producing a final 2,000-page book. All told, the computer shaved an estimated 23 years off the whole process.”

The article is worth checking out, both for more details on the project and for the historic photos. How much time would that job take now? It is good to remind ourselves that tagging and indexing data has only recently become a task that can be taken for granted.
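For a rough sense of the answer, here is a minimal sketch of the concordance idea described above: key words, their locations, and a bit of surrounding context, alphabetized. It is an illustration only, not a reconstruction of the Univac program. The stop-word list, the `build_concordance` function, the context width, and the sample text are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical stop list; a real concordance would need a far longer one.
STOP_WORDS = {"the", "and", "of", "a", "in", "to", "was", "with"}


def build_concordance(text, width=2):
    """Map each key word to a list of (word position, context) entries."""
    words = re.findall(r"[A-Za-z']+", text)
    entries = defaultdict(list)
    for i, word in enumerate(words):
        key = word.lower()
        if key in STOP_WORDS:
            continue  # skip common words, keep only key words
        # Context: up to `width` words on either side of the key word.
        start = max(0, i - width)
        context = " ".join(words[start:i + width + 1])
        entries[key].append((i, context))
    # Alphabetize the key words, as a printed concordance would.
    return dict(sorted(entries.items()))


# Sample text chosen for illustration only.
sample = (
    "In the beginning God created the heaven and the earth. "
    "And the earth was without form."
)
concordance = build_concordance(sample)

for key, hits in concordance.items():
    for position, context in hits:
        print(f"{key:10} {position:3}  {context}")
```

On a modern laptop a loop like this runs over a few hundred thousand words in seconds. Deciding what counts as a key word and how much context to keep, the part that took 13 weeks in 1955, is still where the judgment lies.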

Cynthia Murrell, November 17, 2015


 

Product Hunt Adds Collections to Its Search Results

November 13, 2015

Product Hunt is a website for the cutting-edge consumer, where users share information about the latest and greatest in the tech market. The Next Web tells us, “Product Hunt Now Lets You Follow and Search for Collections.” A “collection” can be established by any user to curate and share groups of products. An example would be a selection of website-building tools, or of the best accessories for charging electronic devices. The very brief write-up reveals:

“Product Hunt, the Web’s favorite destination to discover new apps, gadgets and connected services, has updated its Collections feature, allowing users to follow and search for curated lists. You can now follow any collection you find interesting to receive notifications when new products are added to them. Collections will also show up in search results alongside products. In addition, curators can add comments to products in their collections to describe them or note why they’ve included them in their list.”

So now finding the best of the latest is even easier. It is an important tool for anyone with a need, and the means, to keep in front of the technology curve. Launched in 2013, Product Hunt is based in San Francisco. Its Collections feature was launched last December, and this year the site also added sections specifically for books and for games.

Cynthia Murrell, November 13, 2015


 

Libraries’ Failure to Make Room for Developer Librarians

October 23, 2015

The article titled Libraries’ Tech Pipeline Problem on Geek Feminism explores the lack of diverse developers. The author, a librarian, is extremely frustrated with the approach many libraries have taken. Rather than refocusing their hiring and training practices to emphasize technical skills, many are simply hiring more and more vendors, hardly a solution. The article states,

“The biggest issue I see is that we offer a fair number of very basic learn-to-code workshops, but we don’t offer a realistic path from there to writing code as a job. To put a finer point on it, we do not offer “junior developer” positions in libraries; we write job ads asking for unicorns, with expert- or near-expert-level skills in at least two areas (I’ve seen ones that wanted strong skills in development, user experience, and devops, for instance).”

The options available are that librarians either learn to code in their spare time (not viable) or enter the tech workforce temporarily and bring their skills back after a few years. The second option is also full of drawbacks, especially given that even white women are marginalized in the tech industry. Instead, the article argues that libraries need to make more room for hiring and promoting people with coding skills and interests while also joining coding communities like Code4Lib.

 

Chelsea Kerwin, October 23, 2015


 

Funding Granted for American Archive Search Project

September 23, 2015

Here’s an interesting project: we received an announcement about funding for Pop Up Archive: Search Your Sound. A joint effort of the WGBH Educational Foundation and the American Archive of Public Broadcasting, the venture’s goal is nothing less than to make almost 40,000 hours of Public Broadcasting media content easily accessible. The American Archive, now under the care of WGBH and the Library of Congress, has digitized that wealth of sound and video. Now, the details are in the metadata. The announcement reveals:

“As we’ve written before, metadata creation for media at scale benefits from both machine analysis and human correction. Pop Up Archive and WGBH are combining forces to do just that. Innovative features of the project include:

*Speech-to-text and audio analysis tools to transcribe and analyze almost 40,000 hours of digital audio from the American Archive of Public Broadcasting

*Open source web-based tools to improve transcripts and descriptive data by engaging the public in a crowdsourced, participatory cataloging project

*Creating and distributing data sets to provide a public database of audiovisual metadata for use by other projects.

“In addition to Pop Up Archive’s machine transcripts and automatic entity extraction (tagging), we’ll be conducting research in partnership with the HiPSTAS center at University of Texas at Austin to identify characteristics in audio beyond the words themselves. That could include emotional reactions like laughter and crying, speaker identities, and transitions between moods or segments.”

The project just received almost $900,000 in funding from the Institute of Museum and Library Services. This loot is on top of the grant received in 2013, from the Corporation for Public Broadcasting, that got the project started. But will it be enough money to develop a system that delivers on-point results? If not, we may be stuck with something clunky, something that resembles the old Autonomy Virage, Blinkx, Exalead video search, or Google YouTube search. Let us hope this worthy endeavor continues to attract funding so that, someday, anyone can reliably (and intuitively) find valuable Public Broadcasting content.

Cynthia Murrell, September 23, 2015


A Search Engine for College Students Purchasing Textbooks

August 27, 2015

The article on Lifehacker titled TUN’s Textbook Search Engine Compares Prices from Thousands of Sellers reviews TUN, or the “Textbook Save Engine.” It’s an ongoing issue for college students that tuition and fees are only the beginning of the expenses. Textbook costs alone can skyrocket for students who have no choice but to buy the assigned books if they want to pass their classes. TUN offers students all of the options available from thousands of booksellers. The article says,

“The “Textbook Save Engine” can search by ISBN, author, or title, and you can even use the service to sell textbooks as well. According to the main search page…students who have used the service have saved over 80% on average buying textbooks. That’s a lot of savings when you normally have to spend hundreds of dollars on books every semester… TUN’s textbook search engine even scours other sites for finding and buying cheap textbooks; like Amazon, Chegg, and Abe Books.”

After typing in the book title, you get a list of editions. For example, when I entered Pride and Prejudice, which I had to read for two separate English courses, TUN listed an annotated version, several versions with different forewords (which are occasionally studied in the classroom as well) and Pride and Prejudice and Zombies. After you select an edition, you are brought to the results, laid out with shipping and total prices. A handy tool for students who leave themselves enough time to order their books ahead of the beginning of the class.

Chelsea Kerwin, August 27, 2015


Library Design Improves

June 10, 2015

I like libraries. If you enjoy visiting them as well, navigate to “These Modern Libraries Look Like Alien Spaceships On The Inside.” Among the libraries featured are the Beinecke Rare Book and Manuscript Library (Yale), Bibliotheca Alexandrina, and Biblioteca España.

Stephen E Arnold, June 9, 2015

Reading in the Attention Deficit World

May 12, 2015

The article on Popist titled Telling the Truth with Charts outlines the simplest and most effective way to present information on the waning of book reading among Americans. While the article focuses on the effectiveness of the chart, the information in the chart is disturbing as well: the share of Americans who read zero books rose from 8% in 1987 to 23% in 2014. The article links to another article on The Atlantic titled The Decline of the American Book Lover. That article presents an argument for some hope:

“The percentage of young folks reading for pleasure stopped declining. Last year, the NEA found that 52 percent of 18-24 year-olds had read a book outside of work or school, the same as in the pre-Facebook days of 2002. If book culture were in terminal decline, this is the demographic where you’d expect it to be fading fastest. Perhaps the worst of the fall is over. “

The article demonstrates the connection between education level and reading for pleasure, which may be validation for many teachers and professors. However, there also seems to be a growing tendency among students to read, even homework, without absorbing anything, or in other words, to skim texts instead of paying close attention. This may be the effect of too much TV or Facebook, or even the No Child Left Behind generation entering college. Students are far more interested in their grades than in their education, and just tallying up the numbers of books they or anyone else read is not going to paint an accurate portrait. Similarly, what books are the readers reading? If they are all Twilight and 50 Shades of Grey, do we still celebrate the accomplishment?

Chelsea Kerwin, May 12, 2015


Research Like the Old School

April 24, 2015

There was a time before the Internet when, if you wanted to research something, you had to go to the library, dig through old archives, and check encyclopedias for quick facts. It now seems that all information is at your disposal with a few keystrokes, but search results are often polluted with paid ads, and unless your information comes from a trusted source, you can’t count it as fact.

Lifehacker, like many of us, knows that if you want to get the truth behind a topic, you have to do some old school sleuthing. The article “How To Research Like A Journalist When The Internet Doesn’t Deliver” drills down into tried-and-true research methods that will continue to withstand the sands of time or the wrecking ball (depending on how long libraries remain brick and mortar buildings).

The article pushes using librarians as resources and even goes as far as recommending petitioning government agencies and filing FOIA requests for information. Its claim that some information is only available in person or strictly for other librarians is both true and false. Many libraries are trying to digitize their information but are limited by their budgets. Also, unless the librarian works in a top secret archive, most of the information is readily available to anyone, with or without the MLS degree.

Old school interviews are always great, especially when you have to cite a source. You can always cite your own interview and verify it came straight from the horse’s mouth. One useful way to team the Internet with interviews is tracking down the interviewees.

Lastly, this is the best piece of advice from the article:

“Finally, once you’ve done all of this digging, visited government agencies, libraries, and the offices of the people with the knowledge you need, don’t lose it. Archive everything. Digitize those notes and the recordings of your interviews. Make copies of any material you’ve gotten your hands on, then scan them and archive them safely.”

The Internet is full of false information. Doing a little more legwork to establish credence makes the information safer to use or to claim as the truth.

These tips are useful, even if a little obvious, but they still fail to mention the important step that all librarians know: doing the actual footwork and using proper search methods to find things.

Whitney Grace, April 24, 2015

