Management Observations about Yahoo from a Real Newspaper

December 16, 2015

I am fascinated when publishers offer management advice and opinions. The newspaper publishing sector has done a bang up job with digital in the last 30 years.

I read the UK newspaper written by “real” journalists online and spotted this article: “Don’t Blame Marissa Mayer: Nobody Was Going to Save Yahoo.”

That’s a great headline from a newspaper. I think it also emphasizes the value a Xoogler has despite the somewhat tarnished performance of the company the Xoogler is “turning around.”

I highlighted a couple of passages as particularly interesting observations.

For example, I highlighted in Yahoo purple:

All of it [Yahoo’s management actions], sadly, has been pretty irrelevant.

I like the “all”. There is nothing quite like a categorical affirmative to add heft to an argument.

I noted:

It would be easy to blame Mayer for this [revenue malaise]; in several ways she has done herself few favors – hiring and firing a chief operating officer who earned $58m in 15 months, cancelling working from home while bringing her newborn son and a full-time nanny to the office, and overseeing an exodus of top executives.

Well, I am not sure about the assertion “it’s not clear that anybody could have saved Yahoo.”

Again a categorical, embracing lots of folks, which does not provide much insight into the Yahoo we know and love.

Too bad for those who rely on generalizations to navigate the tough business climate for information, whether in print or online.

I wonder how newspapers are doing. I assume super peachy. These outfits, including the Telegraph, are paragons of management excellence, organic revenue growth, hefty profits, and keen thinking.

Thank goodness for “real” journalists. These outfits and their professionals will make bang up consultants.

Stephen E Arnold, December 16, 2015

UK Publisher Repositioning

December 15, 2015

I read “How Dennis Publishing Created a New Tech Media Brand.” I was looking forward to a how to, the nuts and bolts of converting a print and online operation into a zippy digital brand.

The write up explains what most folks involved in “real” journalism know: Publishing outfits are good at outputting content and maybe not so good at the organization of the overall operation.

I learned from the write up:

Dennis Publishing wants to be the top destination for technology-related content in the U.K. in the next two years, spurred by the quick success of its new digital brand, Alphr.

I remembered seeing Alphr on iTunes. A podcast about technology ran for a while and then disappeared in October 2015 with nary a peep. I noted the odd ball spelling, which I assume allows the company’s content to be located with a Google or Yandex search.

The write up said:

That decision seems to be paying off. Alphr attracted just under 600,000 unique visitors in the U.K. in November and 1.5 million globally, according to Google Analytics….Dennis claims that the latest data shows that it outstripped Wired U.K., Quartz and Tech Insider in November in terms of shares in U.K. visits across these categories.

So what was the “how”? The write up pointed out that the company:

  • Centralized certain operations
  • Implemented testing procedures for products
  • Kept the same headcount
  • Embraced the Ziff “network” ad sale model from the late 1980s

In short, in 2015, Dennis took steps that other publishers have been forced to adopt for a number of years.

The one thing the new plan did not do was communicate that the podcast, one of those hippy dippy social media things, was not relevant to the firm.

Communication about podcasts, it seems, is not germane to the new digital brand.

Stephen E Arnold, December 15, 2015

CAVE to TDM: New Jargon for Publishers

December 7, 2015

I read “Text and Data Mining: Challenges and Solutions from the Publishers’ Perspective.” The write up summarizes a conference attracting publishers. The perspective of publishers, based on my experience, is survival. I know. I know. Some publishers are in high cotton. The Washington Post is vying to become the US newspaper of record. There are other examples as well, but surging revenues, generous organic growth, and healthy profits are not attributes of most publishing outfits engaged in doing things with dead trees.

The conference focused on text mining; that is, according to the blog focused on the program:

Text mining refers to “the process or practice of examining large collections of written resources in order to generate new information”. I am not an expert in text mining, but I understand that it is about applying specialized software/algorithms/techniques on existing textual information so that it can be read and analyzed by machines in order for them to extract more meaningful information for us, humans. Of course, text mining is no news to the research community, as it seems that it all started back in the ’80s with a methodology titled CAVE (Content Analysis of Verbatim Explanations) but its background goes beyond the scope of this article. What I can tell you is that it is a complex process, involving techniques from areas such as information retrieval, natural language processing, information extraction and data mining – into a single workflow!

I am not sure if this definition hits the core of the concern. From my point of view, publishers have to respond to numerical recipes which operate on content. Included in my view is the landscape of videos whose dialog and imagery are converted to human understandable text, best guesses at what an image represents, and assorted metadata.

What does one do with these outputs of text mining? Numbers are good. But the more important use is that with smart software certain processes can be automated and made intelligent. The use of algorithms to generate news stories is, believe it or not, part of the Associated Press’ bag of cost cutting tricks.

For publishers, like those named in the write up, the future looks challenging. Authors pump out content with or without the ministrations of a “real” publishing company. Then tireless software agents labor away. When something useful (defined by the self adjusting algorithms) becomes evident, an output is generated.

The question, therefore, is not what tools have been and are available. Text mining is decades old. The question is, “How will publishers make informed decisions about increasingly smart systems which supplant older, slower, more expensive ways to find useful nuggets and assemble them into actionable reports, visualizations, or on demand dashboard displays?”

Net net: Conferences are useful. Buzzwords maybe. The publishers have some thrill ahead. I wish I had the publishers’ grasp of trends and text mining technologies.

Stephen E Arnold, December 7, 2015

Hack a Scholarly Journal

December 7, 2015

Scholarly journals and other academic research are usually locked down under a copyright firewall that requires an expensive subscription to access.  Most of the people who want this content are researchers, writers, scientists, students, and other academics.  Most people who steal content usually steal movies, software, books, and material related to pop culture or expensive to buy elsewhere.   Scholarly journals fall into the latter category, but Science Mag shares a new trend for hackers, “Feature: How To Hijack A Journal.”

Journal hacking is not new, but it is gaining traction due to the multimillion-dollar academic publishing industry. Many academic writers pay to publish their papers in a journal, and the fees range in the hundreds of dollars. What happens is something called Web site spoofing, where hackers buy a closely related domain or even hack the actual journal’s domain and create a convincing Web site. The article describes several examples where well-known journals were hijacked, including one the article’s author carried out himself.

How can you check to see if an online journal is the real deal?

“First, check the domain registration data online by performing a WHOIS query. (It’s not an acronym, but rather a computer protocol to look up “who is” behind a particular domain.) If the registration date is recent but the journal has been around for years, that’s the first clue. Also suspicious is if the domain’s country of registration is different from the journal’s publisher, or if the publisher’s name and contact information are kept anonymous by private domain registrars.”
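The checks in that passage can be sketched in a few lines of Python. This is a minimal illustration, not a real WHOIS client: the function name `whois_red_flags`, its parameters, and the two-year "recent" threshold are assumptions made for the example, and the caller is expected to have already looked up the registration details by hand or with a tool of choice.

```python
from datetime import date


def whois_red_flags(domain_created, journal_founded_year, registrant_country,
                    publisher_country, registrant_hidden,
                    today=None, recent_years=2):
    """Return a list of warning strings for a journal's domain record.

    All inputs are values the reader has already pulled from a WHOIS
    lookup and from the journal's own masthead. The 'recent_years'
    threshold is an arbitrary assumption, not a rule from the article.
    """
    today = today or date.today()
    flags = []

    # Clue 1: the domain is new but the journal claims a long history.
    domain_age_years = (today - domain_created).days / 365.25
    journal_age_years = today.year - journal_founded_year
    if domain_age_years < recent_years and journal_age_years > recent_years:
        flags.append("recent registration for an established journal")

    # Clue 2: the domain is registered in a different country
    # from the journal's publisher.
    if registrant_country.strip().upper() != publisher_country.strip().upper():
        flags.append("registrant country differs from publisher country")

    # Clue 3: the registrant's identity is masked by a privacy service.
    if registrant_hidden:
        flags.append("registrant hidden behind a privacy service")

    return flags
```

An empty list does not prove a site is legitimate; it only means none of the three clues from the article fired.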

Sadly, academic journals will be at risk for some time, because many of the publishers never adapted to online publishing, sometimes someone forgets to pay a domain name bill, and they rely on digital object identifiers to map Web addresses to papers.

Scholarly journals are important for academic research, but their publishing models are outdated anyway. Maybe if they were able to keep up, the hacking would not happen as often.

Whitney Grace, December 7, 2015
Sponsored by, publisher of the CyberOSINT monograph

The New Real Journalism: Bezos a WaPo to the Gray Lady

November 29, 2015

I read “Jeff Bezos Says The Washington Post’s Goal Is to Become the New Paper of Record.” As Jack Benny used to say when someone mentioned $1 million, “Yipes.”

My hunch is that the sports at the New York Times probably had other exclamations to share among themselves.

We know that Mr. Bezos seems to have made the overhead reducing cloud computing thing a money maker. We know Mr. Bezos has pulled off a 1950s style rocket ship landing which suggests the visionary inventor of the Tesla has some catching up to do in the space craft landing field. We know Mr. Bezos has lots of money.

I noted this quote, which suggests he knows his achievement factoids as well:

Well, you know, what we’re doing with the Post is we’re working on becoming the new paper of record, Charlie. We’ve always been a local paper, and just this month The Washington Post passed The New York Times in terms of number of viewers online. This is a gigantic accomplishment for the Post team. We’re just gonna keep after that. The reason that that’s working is because we have such a talented team at the Post. It’s all about quality journalism. And even here in the Internet age, in the 21st century, people really care about quality journalism.

What will the New York Times do? Gee, I don’t know. In the third quarter of 2015, the Gray Lady generated $9 million in profit. What do you think building rockets for fun costs? Probably a lot more than real journalism Bezos style.

Stephen E Arnold, November 29, 2015

Axel Springer Snaps Up Business Insider

November 24, 2015

I often find myself at Business Insider, reading about a recent development. That’s why I was intrigued by the article, “Sold! Axel Springer Bets Big on Digital, Buys Business Insider” at re/code. Though for me the name conjures an image of a sensationalistic talk-show host with a bandana and a wide vocal range, Axel Springer is actually a publisher based in Germany, and has been around since 1946. We note that they also own a stake in the Qwant search engine, and their website touts they are now the “leading digital publisher in Europe.” This is one traditional publisher that is taking the world’s shift to the digital realm head on.

Writer Peter Kafka sees a connection between this acquisition and Axel Springer’s failed bid to buy the venerable Financial Times. He writes:

“Axel Springer is a Berlin-based publisher best known as the owner of newspapers Die Welt and Bild. In July, it missed its chance to buy the Financial Times, the august, 127-year-old business news publisher, when it was outbid at the last second by Japan’s Nikkei. Business Insider shares very little in common with the FT, other than they both deal with financial topics: While the FT has built out its own digital operations in recent years, it’s a subscription-based business whose stock-in-trade is sober, restrained reporting. Business Insider is a fast-twitch publisher, pitched at readers who’ve grown up on the Web and based on a free, ad-supported business model. While the site was famous for its you-bet-you’ll-keep-clicking headlines and slideshows, it also did plenty of serious reporting; in the last year it has been on an expansion binge, adding a British outpost, a new tech site and a new gambit that’s supposed to create viral content that lives on platforms like Facebook. Today’s transaction appears to link the FT and BI: Industry executives think Springer’s inability to land the Financial Times made them that much hungrier to get Business Insider.”

Perhaps, but this deal may be a wise choice nevertheless. Digital news and information is here to stay, and Business Insider seems to have figured out the format. We’ll see how Axel Springer leverages that know-how.

Cynthia Murrell, November 24, 2015


An Early Computer-Assisted Concordance

November 17, 2015

An interesting post at Mashable, “1955: The Univac Bible,” takes us back in time to examine an innovative indexing project. Writer Chris Wild tells us about the preacher who realized that these newfangled “computers” might be able to help with a classically tedious and time-consuming task: compiling a book’s concordance, or alphabetical list of key words, their locations in the text, and the context in which each is used. Specifically, Rev. John Ellison and his team wanted to create the concordance for the recently completed Revised Standard Version of the Bible (also newfangled.) Wild tells us how it was done:

“Five women spent five months transcribing the Bible’s approximately 800,000 words into binary code on magnetic tape. A second set of tapes was produced separately to weed out typing mistakes. It took Univac five hours to compare the two sets and ensure the accuracy of the transcription. The computer then spat out a list of all words, then a narrower list of key words. The biggest challenge was how to teach Univac to gather the right amount of context with each word. Bosgang spent 13 weeks composing the 1,800 instructions necessary to make it work. Once that was done, the concordance was alphabetized, and converted from binary code to readable type, producing a final 2,000-page book. All told, the computer shaved an estimated 23 years off the whole process.”

The article is worth checking out, both for more details on the project and for the historic photos. How much time would that job take now? It is good to remind ourselves that tagging and indexing data has only recently become a task that can be taken for granted.
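As a rough answer to the "how much time now" question, here is a minimal sketch of a concordance builder in Python. It is not a reconstruction of the Univac program; the function name `build_concordance`, the stop-word handling, and the fixed context window are assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def build_concordance(text, stop_words=frozenset(), window=3):
    """Map each key word to a list of (position, context) pairs.

    'position' is the word's index in the text, and 'context' is the
    word plus up to 'window' words on either side. Words listed in
    'stop_words' are skipped as keys but still appear in contexts.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    entries = defaultdict(list)
    for i, token in enumerate(tokens):
        key = token.lower()
        if key in stop_words:
            continue
        start = max(0, i - window)
        context = " ".join(tokens[start:i + window + 1])
        entries[key].append((i, context))
    # Alphabetize the key words, as a printed concordance would.
    return dict(sorted(entries.items()))


verse = "In the beginning God created the heaven and the earth"
concordance = build_concordance(verse, stop_words={"in", "the", "and"}, window=2)
for word, hits in concordance.items():
    for position, context in hits:
        print(f"{word:10} {position:3}  {context}")
```

The hard part the quoted passage describes, deciding how much context to keep with each word, is reduced here to a single `window` parameter, which is a simplification.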

Cynthia Murrell, November 17, 2015



No Microfiche Required

November 16, 2015

Longstanding publications are breathing new life into their archives by re-publishing key stories online, we learn from NiemanLab’s article, “Esquire Has a Cold: How the Magazine is Mining its Archives with the Launch of Esquire Classics.” We learn that Esquire has been posting older articles on their Esquire Classics website, timed to coincide with related current events. For example, on the anniversary of Martin Luther King Jr.’s death last April, the site republished a 1968 article about his assassination.

Other venerable publications are similarly tapping into their archives. Writer Joseph Lichterman notes:

“Esquire, of course, isn’t the only legacy publication that’s taking advantage of archival material once accessible only via bound volumes or microfiche. Earlier this month, the Associated Press republished its original coverage of Abraham Lincoln’s assassination 150 years ago…. Gawker Media’s Deadspin has The Stacks, which republishes classic sports journalism originally published elsewhere. For its 125th anniversary last year, The Wall Street Journal published more than 300 archival articles. The New York Times runs a Twitter account, NYT Archives, that resurfaces archival content from the Times. It also runs First Glimpses, a series that examines the first time famous people or concepts appeared in the paper.”

This is one way to adapt to the altered reality of publication. Perhaps with more innovative thinking, the institutions that have kept us informed for decades (or centuries) will survive to deliver news to our great-grandchildren. But will it be beamed directly into their brains? That is another subject entirely.


Cynthia Murrell, November 16, 2015


The Guardian Recycles Binney and Seems to Omit Reddit Link to Original Content

November 12, 2015

I am not a subscriber to the Guardian. Perhaps the online article I viewed a moment ago is spoofed in some way. Anyway, navigate to “NSA Whistleblower Reveals Details of American Spying during Reddit AMA Session.” You can read a recycling of the Reddit Ask Me Anything. The link to the source on Reddit is here. Information has a way of disappearing, so if the link to the AMA is a goner, there’s not much I can do.

The Guardian does take the time to provide links to its articles and to USA Today, an outstanding publication. Heck, yes, that “real” journalism stuff is just better than the original source.

Quick question: I find it interesting that real journalists are aggressively recycling social media content. Why not include an explicit link? Oh, I know. Pride, haste, a misplaced sense of providing “real” information. Pick one.

Stephen E Arnold, November 12, 2015

Ravel, Harvard, and Indigestion for Lexis and Westlaw

October 31, 2015

If you are a lucky online maven with free Lexis and Westlaw access, you do not want to waste your time reading “Harvard Law School Launches ‘Free the Law’ Project with Ravel Law To Digitize US Case Law, Provide Free Access.”

But if you pay hard cash to run queries on certain court documents, you may want to pay attention to the Ravel-Harvard plan to provide access to US case law.

Ravel wants to catch the attention of the big guns at Reed Elsevier and Thomson Reuters. I assume the executives at these companies are on top of the Ravel plan to unravel their money machines.

According to the Harvard write up:

Harvard Law School’s collection comprises 40,000 books containing approximately forty million pages of court decisions, including original materials from cases that predate the U.S. Constitution. It is the most comprehensive and authoritative database of American law and cases available anywhere except for the Library of Congress, containing binding judicial decisions from the federal government and each of the fifty states, from the founding of each respective jurisdiction. The Harvard Law School Library—the largest academic law library in the world—has been collecting these decisions over the past two hundred years.

Where there is legal information and the two leading for fee legal online services, my hunch is that there will be some legal eagles taking flight.

According to Techdirt:

Harvard “owns” the resulting data (assuming what’s ownable), and while there are some initial restrictions that Ravel can put on the corpus of data, that goes away entirely after eight years, and can end earlier if Ravel “does not meet its obligations.” Beyond that, Harvard is making everything available to non-profits and researchers anyway. Ravel is apparently looking to make some money by providing advanced tools for sifting through the database, even if the content itself will be freely available.

What will the professional publishing outfits do to preserve their market? I can think of several actions. Sure, litigation is one route. But taking Harvard to court might generate some bad vibes. Perhaps Reed Elsevier and Thomson Reuters will finally bite the bullet, merge, and then buy out Ravel? We have Walgreen Boots, why not LexisWestlaw? Is that a scary Halloween thought? Let the Department of Justice unravel that deal. Don’t lawyers enjoy that sort of challenge?

Stephen E Arnold, October 31, 2015
