Losing the Past Online

December 30, 2014

I read “WWWTXT: The Oldest Internet Archive.” The write up makes clear that archival online content is tough to find. I like the idea that online history is lost. The idea, one might say, is that lack of awareness of the past makes everything new again. Here’s a quote I noted:

(Rehn’s archive was acquired from the now-defunct Deja News, which was acquired by Google in 2001.) These days, the majority of new content he gets is from old BBS archives, either given to him, or found on old floppy disks.

When experts in search are clueless about early information retrieval systems, I thought it was a failure on the part of the expert. Now I see. Those folks have no past to which to refer. Hence, old stuff is innovative. Good to know.

Stephen E Arnold, December 30, 2014

Beyond Search Content Flow

December 22, 2014

To my two or three readers:

We will be reducing the flow of stories from December 18, 2014, to January 1, 2015. Coverage in Beyond Search will be expanded to include the new Cyber OSINT data stream and content about NGIA (next generation information access). I will be moving the IDC/Schubmehl content to the Xenky.com Web site to make ongoing references to the reputation surfing easier to find.

Enjoy the holidays.

Stephen E Arnold, December 22, 2014

Elsevier and Bad Information

December 22, 2014

Years and years ago, a unit of the Courier Journal & Louisville Times created the Business Dateline database. As far as I know, it was the first full text online database to feature corrections. The team believed that most online content contained flaws, and neither the database producers, the publishers, nor the online distributors like LexisNexis invested much effort in accuracy. How many databases followed in our footsteps? Well, not too many. At one time it was exactly zero. But people perceive information from a computer as accurate, based on studies we did at the newspaper and subsequently as part of ArnoldIT’s work.

Flash forward to our go go now. The worm, after several decades, may be turning, albeit slowly. Navigate to “Elsevier Retracting 16 Papers for Faked Peer Review.” Assuming the write up was itself accurate, I noted this passage:

We consider ourselves to have an important role in prevention. We try to put a positive tone to our education material, so it’s not a draconian “we will catch you” – it’s also about the importance of research integrity for science, the perception of science with taxpayers…there are a lot of rewards for doing this the right way.

The questions in my mind are:

  • How many errors are in the LexisNexis online file? What steps are being taken to remove the ones known to be incorrect; for example, technical papers with flawed information?
  • How will Elsevier alert its customers that some information may be inaccurate?
  • What process is in place for other Elsevier properties to correct, minimize, and eliminate errors in print and online content?

I can imagine myself in a meeting with Elsevier’s senior management. My task is to propose specific measures to ensure quality, accuracy, and timeliness in Elsevier’s products. I am not sure my suggestions will be ones that generate a great deal of enthusiasm. Hopefully, I am incorrect.

Stephen E Arnold, December 22, 2014

UK Paintings Catalog: When Every Does Not Mean Every

December 2, 2014

I love headlines like “Every Painting in the UK at Your Fingertips.” The idea is that “images and details of every painting (in tempera or acrylic) in public ownership through the United Kingdom.” Well, obviously the “every” is not every painting. There is an 86 volume set which presumably presents the images and metadata. The digital images are available at Your Paintings. There is a search box and a number of other options. I ran a query for Patrick Heron, an artist whose work I find interesting. There are some of his pictures in the Tate, and he was born . Here’s what I found:

image

Pretty thin. The Patrick Heron entry for the St Ives School offers a bit more information.

image

I am not sure if the BBC index is incomplete. It appears that posting information or links to other UK online sources is not part of the project. Also, the presentation of different search boxes on the BBC site does not make accessing the Your Paintings information easier.

The enthusiasm of the newspaper is admirable. I expect/hope that the service will improve its usability and completeness in the months ahead. The BBC is, as one of my British acquaintances with an Oxford education used to say, “performant.”

Stephen E Arnold, December 2, 2014

Online Accuracy: The Hollywood Sign Approach

November 24, 2014

I read “Why People Keep Trying to Erase the Hollywood Sign from Google Maps.” The write up underscores the fluidity of the notion about accurate online information. Last time I was in Hollywood, I gave my talk at an intel conference and beat a quick path back to Kentucky. For those who think that life has not been lived until one stands at the base of a giant letter, Google Maps, if the write up is correct, may give you an extra workout. Here’s the passage I noted:

Even though Google Maps clearly marks the actual location of the sign, something funny happens when you request driving directions from any place in the city. The directions lead you to Griffith Observatory, a beautiful 1920s building located one mountain east from the sign, then—in something I’ve never seen before, anywhere on Google Maps—a dashed gray line arcs from Griffith Observatory, over Mt. Lee, to the sign’s site. Walking directions show the same thing.

Obviously, in the world of online, this is the only instance of information being modified so it does not match reality. I am comforted, unlike some folks.

Stephen E Arnold, November 24, 2014

Mozilla and Search Changes: Meh

November 20, 2014

You can read the crashing waves of opinions about Mozilla and its falling out of love with the GOOG. “Firefox Drops Google as Default Search Engine…” presents the new, “real” journalism approach; to wit:

Firefox has lost market share in recent years but is still used by roughly 17 percent of web goers.

Juicy factoid. Small percentage in a world in which traffic and eyeballs matter.

You can get the search engine optimization/inside scoop viewpoint in “Mozilla CEO: It Wasn’t Money — Yahoo Was The Better Strategic Partner For Firefox.” I noted this:

The official line from the Mozilla blog post about the deal helps parse what being a good strategic partner seems to be. It praises Yahoo as being “aligned with our values of choice and independence” — which suggests that Firefox was feeling that Google had become too controlling or wanted more control about what was happening within Firefox. Or, perhaps Mozilla felt Google has been less about supporting the web and more about supporting itself than in the past.

My view is not just tepid; it is indifferent. Monopolistic behaviors are the order of the day. Yahoo is no monopoly. Yandex may have a shot as long as it stays on the right side of certain governmental authorities. Baidu is the best of the bunch, but one misstep and I would suggest that life could be viewed through a filter.

As the browser becomes the new operating system, if you are not running what’s mainstream, there may be some challenges ahead. Do you still have an Eagle desktop computer? If so, dig it out, plug in your DEC Rainbow, and let me know how you read this blog post.

Oh, and what about search? It seems to rank right along with the Mozilla attitude toward money in my opinion.

Stephen E Arnold, November 20, 2014

Ah, History and the 20 Somethings

November 16, 2014

I had a conversation last week with a quite assured expert in content processing. I mentioned that I was 70 years old and would not be attending a hippy dippy conference in New York. I elicited a chuckle.

I thought of this gentle dismissal of old stuff when I read “Old Scientific Papers Never Die, They Just Fade Away. Or They Used to.” The main idea of the article seems to be that “old” work can provide some useful factoids for the 20 somethings and 35 year old whiz kids who wear shirts with unclothed females on them. Couple a festive shirt with a tattoo, and you have a microcosm of the specialists inventing the future.

Here’s a passage I noted:

“Our [Googlers] analysis indicates that, in 2013, 36% of citations were to articles that are at least 10 years old and that this fraction has grown 28% since 1990,” say Verstak and co. What’s more, the increase in the last ten years is twice as big as in the previous ten years, so the trend appears to be accelerating.

Quite an insight considering that much of the math used to deliver whizzy content processing is a couple of centuries old. I looked for a reference to Dr. Gene Garfield and did not notice one. Well, maybe he’s too old to be remembered. Should I send a link to the 20 something with whom I spoke? Nah, waste of time.

Stephen E Arnold, November 16, 2014

Will Hand-Carved Type and Printing Return to Strasbourg?

November 12, 2014

I am not hip to the ins and outs of France and its financial situation. I assume the country with more than 200 varieties of cheese and almost as many somewhat obscure search and content processing companies is rolling right along.

I was puzzled by this item: “France Signs a Five-Year National Deal with Elsevier.”

The main point seems to be that Elsevier, part owner of the outstandingly expensive online service LexisNexis, has signed a deal to provide Elsevier content for what strikes me as a reasonable price: €171 697 159.

The article seems to imply that this is not a good deal:

French research is in disarray. Some universities are on the verge of bankruptcy. Others anticipates four meager years. Strangely enough, money is not the problem. The French State actually gives away several billions each year in the form of tax incentives so that private companies fund research (the “Crédit impôt recherche”). This policy has proven dramatically ineffectual : it is actually nothing more than a tool for tax optimization, that does little if nothing to encourage research.

I have confidence that the French know exactly how to maintain their premier position in education, finance, and linguistic excellence. Elsevier, by the way, has one very happy sales person.

Stephen E Arnold, November 12, 2014

Stanford Finds the First Web Site: Guess Who?

November 9, 2014

I read “Stanford Libraries Unearths the Earliest US Website.” Guess which outfit created the first Web site according to the Stanford Wayback Machine? Give up? It was Stanford. Never heard of the Stanford Wayback? Neither had I. Here’s a link. I suppose the original CERN demo page I saw in the mid 1990s does not count. Well, CERN is obviously not Stanford. Tim Berners who? Next Stanford may discover from its Stanford resources that the university invented fire.

Stephen E Arnold, November 9, 2014

Google and Axel Springer: Traffic Means Power

November 6, 2014

In the summer of 2014, Axel Springer acquired 20 percent of the Pertimm-powered Qwant. As you may know, Qwant, which I profile in my current Information Today article, is a Web search engine with features. Believe me, lots of features. What Qwant does not have is traffic. Google’s Eric Schmidt believes the quirky system is a threat. From my lookout on top of the crest of the hill near the hollow in which I live in rural Kentucky, that strikes me as a very rotten red herring.

Axel Springer now understands the difference between the traffic generated by Qwant and other Web search engines and the Google, if I understand “German Publishing Giant Axel Springer Caves in over Google News Snippets Row.” The article reports:

Announcing the free license for Google yesterday, Axel Springer said that traffic to the sites had declined by nearly 40 percent since Google stopped producing snippets and thumbnails on October 23. It also claimed that traffic to the German sites from Google News was down by almost 80 percent.

You can work through the “real” journalistic approach to this point when you read the original article.

What’s important to me is that Google traffic flows are a powerful tool in Google’s negotiating arsenal. Even if you own a search engine, if you are not in Google, you don’t exist. I wonder how Edmund Gustav Albrecht Husserl would view this fact.

Stephen E Arnold, November 6, 2014
