Facebook Users Lack Understanding of Filters: No Big Surprise

March 29, 2015

Let me be clear. I am not a Facebook user. One of the goslings configured the Beyond Search blog to send content to a Facebook page. I, however, do not need a stream of information about my high school and college classmates. At my last reunion, the 50th, I saw only two mobile phones: My wife’s and mine. Obviously central Illinois is not a technology hot spot for the over 70 set.

I read “Many, Many Facebook Users Still Don’t Know That Their News Feeds Are Filtered by an Algorithm.” Big whoop. Most of the MBAs I know are clueless about Google’s personalization functions and don’t have much appetite for understanding that what you see may not be what is available. For these cohorts, a little learning is just fine. Drinking from a spring is okay as long as the water comes from an authentic source like Dasani. Isn’t that Coca-Cola’s outfit?

The write up reveals what strikes me as a no brainer type factoid:

But a majority of everyday Facebook users in a recent study had no idea that Facebook constructs their experience, pushing certain posts into their stream and leaving others out. And worse, many participants blamed themselves, not Facebook’s software, when friends or family disappeared from their news feeds.

The article reports:

While some participants were upset by the idea that Facebook was changing their social experience, more than half of the study participants “came to appreciate the algorithm over the course of the study.” Most came to think that the filtering and ranking software was actually doing a decent job. “Honestly I have nothing to change which I’m surprised!” one said. “Because I came in like ‘Ah, they’re screwing it all!’”

Sigh. Is there a remedy for this lack of understanding? Nope.

Do most online “experts” care? Nah, but some of them charge windmills with their iPad Airs as a shield.

The reality is that a comprehensive understanding of a particular content domain requires good, old fashioned research. The idea is to read, talk to informed individuals, gather additional primary data, analyze what you collect, and then figure out who knows what about a topic.

We are doing this type of grunt work about one facet of the Dark Web. Early results are in. Most of the people writing about the Dark Web are not doing a particularly good job of explaining where the “dark” content lives, how to find it, or what the content reveals about a fundamental shift in online usage for a small but important and interesting group of users worldwide.

If one cannot understand what Facebook is doing, the Dark Web is of zero consequence. If a Google user accepts search results as objective, I am not sure there is much hope for remedial intervention.

Net net: At a time when ease, convenience, short cuts, and distractions are of primary importance, thinking about information is not of much interest to many people.

“Hey, after the NCAA games, let’s binge watch Breaking Bad. We can post our comments on Facebook too!”

Sound fun? Oh, wait. I have to take this call, send an SMS, and post a picture of our pizza to Facebook. Cool.

Stephen E Arnold, March 29, 2015

Bing Books: Chasing a Market

January 9, 2015

Books. Interesting idea. Are books a growth market in the Amazon world? Bing is looking at books. Err, doesn’t Amazon/Goodreads do this? I read “Finding Great Books Just got Easier with Bing Best Sellers Search.” The article provides some suggested searches; for example, best business books. I am not sure how many of the thumb typing crowd are into books. Perhaps Bing can pull new readers with its new service? My hunch is that Bing is likely to generate more sales for Amazon. Publishers will find the Bing thing a step in the right direction.

Stephen E Arnold, January 9, 2015

Losing the Past Online

December 30, 2014

I read “WWWTXT: The Oldest Internet Archive.” The write up makes clear that archival online content is tough to find. I like the idea that online history is lost. The idea, one might say, is that lack of awareness of the past makes everything new again. Here’s a quote I noted:

(Rehn’s archive was acquired from the now-defunct Deja News, which was acquired by Google in 2001.) These days, the majority of new content he gets is from old BBS archives, either given to him, or found on old floppy disks.

When experts in search are clueless about early information retrieval systems, I thought it was a failure on the part of the expert. Now I see. Those folks have no past to which to refer. Hence, old stuff is innovative. Good to know.

Stephen E Arnold, December 30, 2014

Beyond Search Content Flow

December 22, 2014

To my two or three readers:

We will be reducing the flow of stories from December 18, 2014, to January 1, 2015. Coverage in Beyond Search will be expanded to include the new Cyber OSINT data stream and content about NGIA (next generation information access). I will be moving the IDC/Schubmehl content to the Web site to make ongoing references to the reputation surfing easier.

Enjoy the holidays.

Stephen E Arnold, December 22, 2014

Elsevier and Bad Information

December 22, 2014

Years and years ago, a unit of the Courier Journal & Louisville Times created the Business Dateline database. As far as I know, it was the first full text online database to feature corrections. The team believed that most online content contained flaws, and neither the database producers, the publishers, nor the online distributors like LexisNexis invested much effort in accuracy. How many databases followed in our footsteps? Well, not too many. At one time it was exactly zero. But people perceive information from a computer as accurate, based on studies we did at the newspaper and subsequently as part of ArnoldIT’s work.

Flash forward to our go go now. The worm, after several decades, may be turning, albeit slowly. Navigate to “Elsevier Retracting 16 Papers for Faked Peer Review.” Assuming the write up was itself accurate, I noted this passage:

We consider ourselves to have an important role in prevention. We try to put a positive tone to our education material, so it’s not a draconian “we will catch you” – it’s also about the importance of research integrity for science, the perception of science with taxpayers…there are a lot of rewards for doing this the right way.

The questions in my mind are:

  • How many errors are in the LexisNexis online file? What steps are being taken to remove the ones known to be incorrect; for example, technical papers with flawed information?
  • How will Elsevier alert its customers that some information may be inaccurate?
  • What process is in place for other Elsevier properties to correct, minimize, and eliminate errors in print and online content?

I can imagine myself in a meeting with Elsevier’s senior management. My task is to propose specific measures to ensure quality, accuracy, and timeliness in Elsevier’s products. I am not sure my suggestions will be ones that generate a great deal of enthusiasm. Hopefully, I am incorrect.

Stephen E Arnold, December 22, 2014

UK Paintings Catalog: When Every Does Not Mean Every

December 2, 2014

I love headlines like “Every Painting in the UK at Your Fingertips.” The idea is to provide “images and details of every painting (in tempera or acrylic) in public ownership through the United Kingdom.” Well, obviously the “every” is not every painting. There is an 86 volume set which presumably presents the images and metadata. The digital images are available at Your Paintings. There is a search box and a number of other options. I ran a query for Patrick Heron, an artist whose work I find interesting. There are some of his pictures in the Tate. Here’s what I found:


Pretty thin. The Patrick Heron entry for the St Ives School offers a bit more information.


I am not sure if the BBC index is incomplete. It appears that posting information or links to other UK online sources is not part of the project. Also, the presentation of different search boxes on the BBC site does not make accessing the Your Paintings information easier.

The enthusiasm of the newspaper is admirable. I expect/hope that the service will improve its usability and completeness in the months ahead. The BBC is, as one of my British acquaintances with an Oxford education used to say, “performant.”

Stephen E Arnold, December 2, 2014

Online Accuracy: The Hollywood Sign Approach

November 24, 2014

I read “Why People Keep Trying to Erase the Hollywood Sign from Google Maps.” The write up underscores the fluidity of the notion about accurate online information. Last time I was in Hollywood, I gave my talk at an intel conference and beat a quick path back to Kentucky. For those who think that life has not been lived until one stands at the base of a giant letter, Google Maps, if the write up is correct, may give you an extra workout. Here’s the passage I noted:

Even though Google Maps clearly marks the actual location of the sign, something funny happens when you request driving directions from any place in the city. The directions lead you to Griffith Observatory, a beautiful 1920s building located one mountain east from the sign, then—in something I’ve never seen before, anywhere on Google Maps—a dashed gray line arcs from Griffith Observatory, over Mt. Lee, to the sign’s site. Walking directions show the same thing.

Obviously in the world of online this is the only instance of information being modified so it does not match reality. I am comforted, unlike some folks.

Stephen E Arnold, November 24, 2014

Mozilla and Search Changes: Meh

November 20, 2014

You can read the crashing waves of opinions about Mozilla and its falling out of love with the GOOG. “Firefox Drops Google as Default Search Engine…” presents the new, “real” journalism approach; to wit:

Firefox has lost market share in recent years but is still used by roughly 17 percent of web goers.

Juicy factoid. Small percentage in a world in which traffic and eyeballs matter.

You can get the search engine optimization/inside scoop viewpoint in “Mozilla CEO: It Wasn’t Money — Yahoo Was The Better Strategic Partner For Firefox.” I noted this:

The official line from the Mozilla blog post about the deal helps parse what being a good strategic partner seems to be. It praises Yahoo as being “aligned with our values of choice and independence” — which suggests that Firefox was feeling that Google had become too controlling or wanted more control about what was happening within Firefox. Or, perhaps Mozilla felt Google has been less about supporting the web and more about supporting itself than in the past.

My view is not just tepid; it is indifferent. Monopolistic behaviors are the order of the day. Yahoo is no monopoly. Yandex may have a shot as long as it stays on the right side of certain governmental authorities. Baidu is the best of the bunch, but one misstep and I would suggest that life could be viewed through a filter.

As the browser becomes the new operating system, if you are not running what’s mainstream, there may be some challenges ahead. Do you still have an Eagle desktop computer? If so, dig it out, plug in your DEC Rainbow, and let me know how you read this blog post.

Oh, and what about search? It seems to rank right along with the Mozilla attitude toward money in my opinion.

Stephen E Arnold, November 20, 2014

Ah, History and the 20 Somethings

November 16, 2014

I had a conversation last week with a quite assured expert in content processing. I mentioned that I was 70 years old and would not be attending a hippy dippy conference in New York. I elicited a chuckle.

I thought of this gentle dismissal of old stuff when I read “Old Scientific Papers Never Die, They Just Fade Away. Or They Used to.” The main idea of the article seems to be that “old” work can provide some useful factoids for the 20 somethings and 35 year old whiz kids who wear shirts with unclothed females on them. Couple a festive shirt with a tattoo, and you have a microcosm of the specialists inventing the future.

Here’s a passage I noted:

“Our [Googlers] analysis indicates that, in 2013, 36% of citations were to articles that are at least 10 years old and that this fraction has grown 28% since 1990,” say Verstak and co. What’s more, the increase in the last ten years is twice as big as in the previous ten years, so the trend appears to be accelerating.

Quite an insight considering that much of the math used to deliver whizzy content processing is a couple of centuries old. I looked for a reference to Dr. Gene Garfield and did not notice one. Well, maybe he’s too old to be remembered. Should I send a link to the 20 something with whom I spoke? Nah, waste of time.

Stephen E Arnold, November 16, 2014

Will Hand-Carved Type and Printing Return to Strasbourg?

November 12, 2014

I am not hip to the ins and outs of France and its financial situation. I assume the country with more than 200 varieties of cheese and almost as many somewhat obscure search and content processing companies is rolling right along.

I was puzzled by this item: “France Signs a Five-Year National Deal with Elsevier.”

The main point seems to be that Elsevier, part owner of the outstandingly expensive online service LexisNexis, has signed a deal to provide Elsevier content for what strikes me as a reasonable price: €171 697 159.

The article seems to imply that this is not a good deal:

French research is in disarray. Some universities are on the verge of bankruptcy. Others anticipates four meager years. Strangely enough, money is not the problem. The French State actually gives away several billions each year in the form of tax incentives so that private companies fund research (the “Crédit impôt recherche”). This policy has proven dramatically ineffectual : it is actually nothing more than a tool for tax optimization, that does little if nothing to encourage research.

I have confidence that the French know exactly how to maintain their premier position in education, finance, and linguistic excellence. Elsevier, by the way, has one very happy sales person.

Stephen E Arnold, November 12, 2014
