Insight into News Corp.: If True, Amazing Method for Creating Content
August 17, 2011
I don’t know if the information in “Phone Hacking Letter Spells More Trouble for Murdoch and News Corp.” is accurate. Since I don’t “do” news, I look at most of the content available via the Web with some skepticism. Since I am not a “real” anything, I am not qualified to determine what is right and wrong in the rough and tumble world of newspaper publishing. With pressures on publishing companies increasing, the line between marketing and research seems to be fuzzy. In the quest for eyeballs, I am beginning to think that anything goes.
Here is the passage that caught my attention:
Dated March 2, 2007, Mr. Goodman’s letter was sent as a protest against his firing from NotW [News of the World] following his arrest for phone hacking. Goodman, the former royals correspondent, asserts dismay over his dismissal since hacking was “widely discussed” and supported by senior NotW [News of the World] management. At the parliamentary hearing last month, the Murdochs asserted that they did not know the scale of the hacking practice until recently, and said they thought it was restricted to Goodman. Yet immediately after the hearing, two former News International executives contested the younger Murdoch’s claim of ignorance.
Fascinating. Next time I read a “real news” story about publishing, I will make a note to consider the likelihood that some drift may be inserted into what’s stated. I wonder how News Corp. will present the trajectory of MySpace.com or how other “real” information distribution channels will describe certain events. I am certain there is an explanation for the apparent discontinuity.
Stephen E Arnold, August 17, 2011
Sponsored by Pandia.com
Arnold Columns, August 2011
August 15, 2011
Another financial crisis, more executive turnover in search and content processing companies, and a definite dearth of substantive news. Nevertheless, I was able to prepare several columns for my publishers. One outfit, which I shall not name, seems to have lost its grip on its life preserver and has slipped under the water. Hopefully, the outfit will resurface.
Here’s the line up for my August columns, which have been submitted. I have no idea when these will appear in their hard copy or online form. Because this is work for hire, you won’t find the information in my free Web logs Beyond Search, SharePointSemantics, or Inteltrax, however. Such is life in the post crash world of copyright-infused publishing companies.
Enterprise Technology Management (London, England). “Google’s Enterprise Search: From Headliner to Bit Player.” In this essay, I pick up the theme that Google’s push into Android and Google Plus (Google+) has made it clear that the firm has some new priorities. In this shift, search is now becoming more a utility. I highlight what Google is doing and contrast it with what Dassault Exalead has in play. Guess which is performing more effectively? Read the ETM publication to find my answer to this question.
Information Today. Due to a brutal September travel schedule, I was a good little, but underpaid, writer. I submitted my September and my October columns in August. Don’t worry; the information in both is new and definitely important. The September column is “Two Search Innovations: The Snake and the Lion.” I discuss the Canadian teen who created a new approach to determining contextual relevance for short messages and the new metasearch system which uses the full width of today’s modern monitors to display search results. For the October column, I tackle European business intelligence as manifested in Spotter, a firm founded by a female manager wizard who is also a technology ace. This is definitely a must read for those who want more diversity in the male dominated world of search and content processing. This column is called “Business Intelligence: Overcoming the International Blind Spot.” Acquisitions, anyone?
For Information Today’s Newsbreaks, I wrote a longer piece which the editor edited to focus on my views of Google innovation. Cost cutting is interesting. You can find the original on the Information Today Web site.
KMWorld has a stockpile of my columns. These will be running in the next issues. I have lost track of what’s in the queue.
Online Magazine. I am now a “regular” contributing longer essays for each issue of this prestigious publication. I am covering open source from the point of view of an online centric organization. This month’s feature is “Open Source Search: A Digital Technicolor Dream Coat.” The idea is that there seems to be something magical about open source. But is it for real or a theatrical convention before open source goes commercial? The answer to the question appears in my write up for Online.
There are two for fee content tests underway. These services may be killed after their alpha tests. If you want to see additional content produced by the ArnoldIT/Beyond Search team, check out www.patentpoints.com and www.thecardline.com.
Stephen E Arnold, August 15, 2011
Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion
Free Program Removes DRM Controls from PDFs
August 12, 2011
We’ve found a tool that is, perhaps, a bit concerning. Softpedia presents, for free, PDF Drm Removal 1.4.2.0. The developer of the software is listed as Removedrmfromepub.com. The product description reads,
PDF Drm Removal is a professional and reliable application designed to remove DRM protections from PDF files with no quality loss. Just removes the PDF files drm header, no change on the files. Read the PDF on any supported devices!
Interesting and somewhat concerning. We understand there’s controversy over Digital Rights Management controls; some say that they stifle innovation or violate private property rights. Others say the technology unnecessarily locks documents into a format that is bound to become obsolete someday.
However, we necessarily sympathize with publishers and writers, like Stephen E. Arnold, who rely on PDF security to safeguard documents. How else will they protect their work in the digital age?
Cynthia Murrell August 11, 2011
Quote to Note: Newspapers and Android Tablets
August 11, 2011
I read two different news items which informed me that Apple has stopped the sale of Samsung tablets in countries far from Kentucky. Ominous if accurate. Then I came upon “Newspaper Eyes Creating Subsidized Samsung Tablet.” The idea seemed a bit of a reach, but, hey, those newspaper managers are sharp business professionals. I am not sure how many of my neighbors would put down their jug of firewater to browse a tablet, but one never knows until one tries. What caught my attention was a quote to note, allegedly made by “one insider”, a great source for sure. Here the quote perches:
“If it turns out to be a failure, it will be a fantastically interesting failure.” Another source commented: “I would be shocked if it was successful.”
There it is.
Stephen E Arnold, August 11, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Belgium and Google: A Messy Waffle
July 18, 2011
I saw this headline: “Belgian Newspapers Claim Retaliation By Google After Copyright Victory” and I was nervous clicking on the link. SEO news services make me nervous. The idea of any “retaliation” story makes me think of long lists of words on a watch list somewhere.
I clicked on the link, and the story seemed okay, just a bit thin on substantive details. Quoting the Associated Press is in and of itself reason for concern.
Here’s the main idea:
Publishers in Belgium did not want their content indexed by Google. (That strikes me as less than informed, but forget the knowledge value angle.) So publishers get the fluid legal system to notify the Google. Shortly thereafter, some Belgian publishers note that their content is tough to find at the top of a Google results list. Bottom line: Some folks believe Google is jiggling the results to make some Belgian content familiar with the tedium of clicking through lots of pages to find the desired hit.
My view is that accusations are definitely good for “real” news outfits like the publisher of the retaliation story. I also think that considerable care must be taken before yip yapping about why a particular results list does not show what one wants, expects, believes, or hopes will appear.
Google has lots of people working on the search system. I once believed that these teams were coordinated and working like a well oiled robot arm assembling nuclear fuel rods. Now I know that the method is more like “get it working”. Good enough is going to earn a search wizard an A from the Google system.
Messy waffles. Image source: http://eatbakelove-todayistheday.blogspot.com/2011/06/weekend-warrior.html
If there is a difference between publishers’ expectations and what is in a particular result list, I suggest several things:
First, get a trained and expert online searcher to run queries in a methodical manner to verify what is and what is not “findable.” Keep in mind that 99.9 percent of the people who claim to be search experts are not. If you don’t believe me, give Ulla de Stricker a buzz. You can also try Anne Mintz, former director of the Forbes Magazine information center. You can also ping Marydee Ojala, editor of Online. Folks, trust me. These individuals are certifiable online search experts and can get the information needed to put some data behind the hot air. Data needed.
Yellow Pages Trying to Find a Future
July 7, 2011
Search Engine Watch reports, “Search Secures Recognition as Local Business Info Provider.” The article examines information from studies performed by research companies Burke and eMarketer. Each has compiled data on usage for a variety of local-business-information resources. The Yellow Pages performs well when its paper and online ventures are combined, but separately each was roundly beaten by search engines.
Not surprisingly, ad revenues for printed Yellow Pages are expected to dwindle into nothingness:
In the long haul, predicted figures decline to the point of near eclipse. By 2015, the predicted ad investment for print directories is just $5 billion. Meanwhile, search investments continue to rise, with the predicted figures for search ad spending increasing by more than 50 percent to $21.5 billion by 2015.
This must inevitably lead to the extinction of printed phone books. Good news if you’re a tree.
Still, how will the Yellow Pages fare in the long run? Depends on how nimble they are with their business model. The print sector is going to be under increasing cost pressure when print and ink are involved.
Cynthia Murrell, July 7, 2011
The addled goose is the author of The New Landscape of Enterprise Search
Google Books: Chugging Right Along
July 1, 2011
Search Engine Watch’s article “Google-British Library Partnership to Digitize 250,000 Books” reminded us that Google Books is chugging along. With Google working overtime to showcase itself as a portal and really fascinating legal issues surfacing in France, the once high profile Google goal of gathering the world’s information has drifted into the background.
But Google is beavering away with books. Its partner is the British Library. The project will add 250,000 books to the 12 million-plus that Google has already scanned. These books are copyright free, and come from the British Library’s unique works collection. As with previous projects, completed in cooperation with over 40 libraries, Google will make these works available to anyone for non-commercial use, including reproduction and modification. The Library will also keep an archive of the files.
The documents will be searchable, of course. Rumor has it that Google will also assemble language usage and other data from the texts to add to their datasets.
Google keeps looking for angles on digitizing books. The article reported:
Google still has a long way to go if it wants to meet its ambitious aim: digitizing all of the world’s known 129,864,880 books by the end of 2019. This goal, announced by Google in August of last year, involves creating a digital library of over four billion pages. With many major libraries having unique copies of historical texts, partnerships like the one with the British Library are vital to that goal.
My hunch is that in one or more of Google’s legal battles, the issue of books, reuse, copyright, and other issues will be glued to the tar ball Google has become. In the meantime, eBooks are hot and Google’s role in their diffusion is likely to be significant. I just don’t know in what way. The British Library is a big name.
Cynthia Murrell July 1, 2011
You can read more about enterprise search and retrieval in The New Landscape of Enterprise Search, published by Pandia in Oslo, Norway, in June 2011.
The Wages of SEO: Content Free Content
June 28, 2011
In the last two weeks, I have participated in a number of calls about the wrath of Panda. The idea is that sites which produce questionable content like Beyond Search suck. I agree that Beyond Search sucks. The site provides me with a running diary of what I find important in search and content processing. Some search vendors have complained that I cover Autonomy and not other engines. I find Autonomy interesting. It held an IPO, buys companies, manages reasonably well, and is close to generating an annual turnover of $1.0 billion. I don’t pay much attention to Dieselpoint and a number of other vendors because these companies do not strike me as disruptive or interesting.
I paddle away in Harrod’s Creek, oblivious to the machinations of “content farms.” I have some people helping me because I have a number of projects underway, and once I find an article I want to capture, I enlist the help of librarians and other specialists. Other folks are doing similar things, but rely on ads for revenue which I do not do. I have some Google ads, but these allow me to look at Google reports and keep tabs on various Googley functions. The money buys a tank of gas every month. Yippy.
I read “Google’s War on Nonsense.” You should too while I go out to clean the pasture spring. The main point is that a number of outfits pay people to write content that is of questionable value. No big surprise. I noted this passage in the write up:
The insultingly vacuous and frankly bizarre prose of the content farms — it seems ripped from Wikipedia and translated from the Romanian — cheapens all online information. A few months ago, tired of coming across creepy, commodified content where I expected ordinary language, I resolved to turn to mobile apps for e-books, social media, ecommerce and news, and use the open Web only sparingly. I had grown confused by the weird articles I often stumbled on. These prose-widgets are not hammered out by robots, surprisingly. But they are written by writers who work like robots. As recent accounts of life in these words-are-money mills make clear, some content-farm writers have deadlines as frequently as every 25 minutes. Others are expected to turn around reported pieces, containing interviews with several experts, in an hour. Some compose, edit, format and publish 10 articles in a single shift. Many with decades of experience in journalism work 70-hour weeks for salaries of $40,000 with no vacation time. The content farms have taken journalism hackwork to a whole new level.
My take on this approach to information—what I call content free content—is that we are in the midst of a casserole created by Google and its search engine optimization zealots. Each time Google closes a loophole for metatag stuffing or putting white text on a white background, another corner cutter cooks up some other way to confuse and dilute Google’s relevance recipe.
The content free content revolution has been with us for a long time. A Web searcher’s ability to recognize baloney is roughly in line with the Web searcher’s ability to invest the time and effort to fact check, ferret out the provenance of a source, and think critically. Google makes this flaw in its ad machine’s approach with its emphasis on “speed” and “predictive methods.” Speed means that Google is not doing much, if any, old fashioned index look up. The popular stuff is cached and updated when it suits the Google. No search required, thank you. Speed, just like original NASCAR drivers, is a trick. And that trick works. Maybe not for queries like mine, but I don’t count literally. Predictive means that Google uses inputs to create a query, generate good enough results, and have them ready or pushed to the user. Look, it’s magic. Just not to me.
With short cuts in evidence at Google and in the world of search engine optimization, with Web users who are in a hurry and unwilling or unable to check facts, with ad revenue and client billing more important than meeting user needs—we have entered the era of content free content. As lousy as Beyond Search is, at least I use the information in my for fee articles, my client reports, and my monographs.
The problem, however, is that for many people what looks authoritative is authoritative. A Google page that puts a particular company or item at the top of the results list is the equivalent of a Harvard PhD for some. Unfortunately the Math Club folks are not too good with content. Algorithms are flawless, particularly when algorithms generate big ad revenue.
Can we roll back the clock on relevance, reading skills, critical thinking, and the pursuit of knowledge for its own sake? Nope, search is knowledge. SEO is the way into content free content. In my opinion, Google likes this situation just fine.
Stephen E Arnold, June 28, 2011
You can read more about enterprise search and retrieval in The New Landscape of Enterprise Search, published by Pandia in Oslo, Norway, in June 2011.
Gannett a Gone Goose?
June 26, 2011
Physorg.com announces that “USA Today Publisher Gannett Cuts 700 Jobs.” Just what we need now, hundreds more unemployed.
Despite the headline, USA Today itself is not involved. Gannett is the largest news chain in the U.S., and the layoffs will hit a multitude of local papers. The article states:
Like other US newspapers, Gannett has been grappling with declining print advertising revenue, falling circulation and the migration of readers to free news online. Robert Dickey, the head of Gannett’s US Community Publishing division, said in a memo to employees that the layoffs were necessary to ‘align our costs with the current revenue trends.’
Yeah, we’ve heard it before. I’d like to see what the executives make. Also why spare the national rag?
Stephen E Arnold, former vice president at the “old” Pulitzer Prize Courier Journal, hinted that the Courier Journal was once one of the top 25 newspapers in the world.
Since Gannett’s purchase of the paper, the CJ has tumbled from its lofty perch. Steve thought that Barry Senior would have gone ballistic. In his own genteel way he would have put the quality back in the paper. He would have motivated staff, officers, and others by reminding everyone that quality was what made the paper great. He would have passed the word to others, including the US president, various elected officials, and his pals at the New York Times.
When large corporations gobble up local media, the quality goes out and the buzz dies.
Now the future of Gannett becomes visible. Business wizards with systems that release stories online without the value adding that extends, complements and enhances old fashioned high value writing. Where did those databases go? Good question. The gone goose’s gone geese.
Cynthia Murrell June 25, 2011
From the leader in next-generation analysis of search and content processing, Beyond Search.
Will Google Do Real News?
June 23, 2011
Will Google do “real” news? I read “Salon CEO Gingras Resigns to Become Global Head of News Products at Google.” I think this is a fascinating action on the part of Google. In Google: The Digital Gutenberg, published by Infonortics Ltd. in 2009, I looked at Google’s content technology. My focus was not on indexing. I reviewed the parse, tag, chop, and reassemble systems and methods that Google’s wizards had invented. The monograph is available at this link. The monograph may be useful for anyone who wants to understand what happens when “real” journalists get access to the goodies in the Googleplex. In addition to Odwalla beverages, the Google open source documents suggest that snippets of text and facts can be automatically assembled into outputs that one could describe as “reports” or “new information objects.” Sure, a human is needed in some of these processes, but Google uses lots of humans. Its public relations machine and liberal mouse pad distribution policy help keep the myth alive that Google is all math all the time. Not exactly accurate.
The write up says:
The new position as the senior executive overseeing Google News, as well as other products that may be in the pipeline, comes several years after Gingras worked as a consultant at the Mountain View campus, focusing on ways the search giant could improve its news products.
What will come from a “real” journalist getting a chance to learn about some of the auto assembly technology? I offer some ideas in my Digital Gutenberg monograph. Publishers may want to ponder this idea as well. Google is more than search, and we are going to learn more about its intentions in the near future.
Five years ago, when Google was at the top of its game, I would have had little hesitation in giving Google a better than 50 percent chance of success. Now with the Amazon, Apple, and Facebook environment, I am not so sure. Google has been relying more on buying stuff that works and playing a hard game of “Me Too.”
With the most recent reworking of Google News, I find myself turning to Pulse, Yahoo News, and NewsNow.co.uk. Am I alone?
Stephen E Arnold, June 23, 2011
From the leader in next-generation analysis of search and content processing, Beyond Search.