Big Data as Savior of Newspapers? Tell That to NYT Editors

August 7, 2017

This would be ironic. The SmartDataCollective posits, “Is Big Data the Salvation of the Newspaper Industry?” The write-up tells us that several prominent publications are turning to data analysis to boost their bottom lines and, they hope, save themselves from extinction. Writer Rehan Ijaz cites this post from the US Chamber of Commerce Foundation as he describes ways the New York Times and the Financial Times are leveraging data. He quotes publishing pro David Soloff:

The Financial Times, one of our global publisher customers, uses big data analytics to optimize pricing on ads by section, audience, targeting parameters, geography, and time of day. Our friends at the FT sell more inventory because the team knows what they have, where it is and how it should be priced to capture the opportunity at hand. To boot, analytics reveal previously undersold areas of the publication, enabling premium pricing and resulting in found margin falling straight to the bottom line.

What about the venerable New York Times? That paper hired a data scientist in 2014, yet now is slashing staff, we learn from Reuters’ piece, “New York Times Offers Buyouts, Scraps Public Editor Position.” It is, in fact, mostly editors facing unemployment (because clear prose and verified facts are so last century, I suppose). Reporters Jessica Toonkel and Narottam Medhora reveal:

The newspaper said it would eliminate the in-house watchdog position of public editor as it shifts focus to reader comments. ‘Today, our followers on social media and our readers across the internet have come together to collectively serve as a modern watchdog, more vigilant and forceful than one person could ever be,’ publisher Arthur Sulzberger Jr said in a memo, which was reviewed by Reuters.

“Vigilant and forceful?” Is “correct” not a consideration? Professional editors exist for a reason; crowdsourcing will not always suffice. Also, call me old-fashioned, but I think facts should be confirmed before publication. This is an interesting choice for the Times to be making, particularly now, amid the “fake news” commotion.

Cynthia Murrell, August 7, 2017

Bibliophiles Have 25 Million Reasons to Smile

June 6, 2017

The US Library of Congress has released 25 million records from its collection online, and anyone with Internet access is free to use them.

According to a Science Alert article titled “The US Library of Congress Just Put 25 Million Records Online, Free of Charge”:

The bibliographic data sets, like digital library cards, cover music, books, maps, manuscripts, and more, and their publication online marks the biggest release of digital records in the Library’s history.

The Library of Congress has been on a digitization spree for a long time, and users can expect more records to be made available online in the near future. The challenge, however, is retrieving the books or information that a user needs. The web interface is still complicated and not user-friendly. In short, the enterprise search function is a mess. What the Library of Congress really needs is to give bibliophiles a user-friendly and efficient way of accessing its vast collection of knowledge.

Vishal Ingole, June 6, 2017

The Golden Age of Radio as Compared to the Internet

April 3, 2017

Here is an article going out to all those old fogies who remember when radio was the main source of news, entertainment, and communication. Me Shed Society compares the Golden Age of Radio to the continuous information stream known as the Internet in the article, “The Internet Does To The World What Radio Did To The World.”

The author focuses on Marshall McLuhan’s book Understanding Media and its basic idea, “The medium is the message.” There are three paragraphs that the author found provocative and still relevant, especially in today’s media-crazed times. The author suggests that if one were to replace the Hitler references with the Internet or any other influential person or medium, the passage would read just as well. The first paragraph states that Hitler’s rise to power was due in part to the then-new invention of radio and to mass media. The most profound paragraph is the second:

The power of radio to retribalize mankind, its almost instant reversal of individualism into collectivism, Fascist or Marxist, has gone unnoticed. So extraordinary is this unawareness that it is what needs to be explained. The transforming power of media is easy to explain, but the ignoring of this power is not at all easy to explain. It goes without saying that the universal ignoring of the psychic action of technology bespeaks some inherent function, some essential numbing of consciousness such as occurs under stress and shock conditions.

The third paragraph concludes that there should be some way to defend against media fallout, such as education and its foundations in dead tree formats, i.e. print.

Print, however, is falling out of favor, at least when it comes to the mass media, and education is built more on tests and meeting standards than fighting hysteria.  Let us add another “-ism” to this list with the “extreme-ism” that runs rampant on the TV and the Internet.

Whitney Grace, April 3, 2017

Yandex Incorporates Semantic Search

March 15, 2017

Apparently ahead of a rumored IPO launch, Russian search firm Yandex is introducing “Spectrum,” a semantic search feature. We learn of the development from “Russian Search Engine Yandex Gets a Semantic Injection” at the Association of Internet Research Specialists’ Articles Share pages. Writer Wushe Zhiyang observes that, though Yandex claims Spectrum can read users’ minds,  the tech appears to be a mix of semantic technology and machine learning. He specifies:

The system analyses users’ searches and identifies objects like personal names, films or cars. Each object is then classified into one or more categories, e.g. ‘film’, ‘car’, ‘medicine’. For each category there is a range of search intents. [For example] the ‘product’ category will have search intents such as buy something or read customer reviews. So we have a degree of natural language processing, taxonomy, all tied into ‘intent’, which sounds like a very good recipe for highly efficient advertising.

But what if a search query has many potential meanings? Yandex says that Spectrum is able to choose the category and the range of potential user intents for each query to match a user’s expectations as close as possible. It does this by looking at historic search patterns. If the majority of users searching for ‘gone with the wind’ expect to find a film, the majority of search results will be about the film, not the book.

‘As users’ interests and intents tend to change, the system performs query analysis several times a week’, says Yandex. This amounts to Spectrum analysing about five billion search queries.
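For the curious, the majority-rules idea in that description can be sketched in a few lines of Python. This is only an illustration of the general approach as the quote describes it, not Yandex’s actual method: the click history, the intent lists for “film” and “book,” the function names, and the ten-slot results page are all hypothetical, invented here for the example.

```python
from collections import Counter

# Hypothetical history: the category users picked for each past query.
HISTORY = {
    "gone with the wind": ["film", "film", "film", "book", "film", "book"],
}

# Hypothetical search intents per category (the "product" pair is from the quote).
INTENTS = {
    "film": ["watch", "read reviews"],
    "book": ["buy", "read summary"],
    "product": ["buy", "read customer reviews"],
}


def allocate_results(query, slots=10, history=HISTORY):
    """Split result slots across categories in proportion to past choices."""
    counts = Counter(history.get(query.lower(), []))
    total = sum(counts.values())
    if total == 0:
        return {}  # no history for this query, so nothing to allocate
    alloc, remainders = {}, {}
    for category, n in counts.items():
        # Whole slots first; keep the remainder to hand out leftovers fairly.
        alloc[category], remainders[category] = divmod(n * slots, total)
    leftover = slots - sum(alloc.values())
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for category in ranked[:leftover]:
        alloc[category] += 1
    return alloc


def likely_intents(query, history=HISTORY, intents=INTENTS):
    """Return the intents attached to the most frequently chosen category."""
    counts = Counter(history.get(query.lower(), []))
    if not counts:
        return []
    top_category = counts.most_common(1)[0][0]
    return intents.get(top_category, [])


if __name__ == "__main__":
    print(allocate_results("gone with the wind"))
    print(likely_intents("gone with the wind"))
```

With this made-up history, most of the ten slots go to the film and the rest to the book, which matches the “majority of search results” behavior the quote describes. Re-running the tally on fresh history several times a week would be the analogue of the periodic query analysis Yandex mentions.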

Yandex has been busy. The site recently partnered with VKontakte, Russia’s largest social network, and plans to surface public-facing parts of VKontakte user profiles, in real time, in Yandex searches. If the rumors of a plan to go public are true, will these added features help make Yandex’s IPO a success?

Cynthia Murrell, March 15, 2017

Who Knew Hackers Have Their Own Search Engines?

March 3, 2017

Hackers tend to flock to the Internet’s underbelly, the Dark Web, which remains inaccessible unless you have a Tor browser. According to the AIRS Association, hacker search engines are a lot easier to access than you might think. Read about it in “5 Hacker-Friendly Search Engines You Must Use.” The best-known hacker-friendly search engine is Shodan, which can search for Internet-connected devices. While Shodan can find computers, smartphones, and tablets, the results also include traffic lights, license plate readers, and anything else with an Internet connection. The biggest problem, however, is that most of these devices do not have any security:

The main reason that Shodan is considered hacker-friendly is because of the amount and type of information it reveals (like banner information, connection types, etc.). While it is possible to find similar information on a search engine like Google, you would have to know the right search terms to use, and they aren’t all laid out for you.

Other than Shodan, some of the other scary search engines are ZoomEye, I2P, PunkSPIDER, and Censys. These search engines vary in the amount of data they share as well as in their intended purpose, but they all reveal Internet-connected devices. Beginners can use these search engines, but it takes a bit of technical know-how to get results displayed. You need to figure out how to use them before you even enter your first search query, because basic keywords will not get you far.

Hacker search engines are a good tool for finding security holes in your personal network or Web site. Lack of experience will keep most people from using them, but it takes only a small amount of learning for these search engines to become dangerous in the wrong hands.

Whitney Grace, March 3, 2017

New Technologies Meet Resistance in Business

March 3, 2017

Trying to sell a state-of-the-art, next-gen search and content processing system can be tough. In the article, “Most Companies Slow to Adopt New Business Tech Even When It Can Help,” Digital Trends demonstrates that a reluctance to invest in something new is not confined to Search. Writer Bruce Brown cites the Trends vs. Technologies 2016 report (PDF) from Capita Technology Solutions and Cisco. The survey polled 125 ICT [Information and Communications Tech] decision-makers working in insurance, manufacturing, finance, and the legal industry. More in-depth interviews were conducted with a dozen of these folks, spread evenly across those fields.

Most higher-ups acknowledge the importance of keeping on top of, and investing in, worthy technological developments. However, that awareness does not inform purchasing and implementation decisions as one might expect. Brown specifies:

The survey broke down tech trends into nine areas, asking the surveyed execs if the trends were relevant to their business, if they were being implemented within their industry, and more specifically if the specific technologies were being implemented within their own businesses. Regarding big data, for example, 90 percent said it was relevant to their business, 64 percent said it was being applied in their industry, but only 39 percent reported it being implemented in their own business. Artificial intelligence was ranked as relevant by 50 percent, applied in their industry by 25 percent, but implemented in their own companies by only 8 percent. The Internet of Things had 70 percent saying it is relevant, with 50 percent citing industry applications, but a mere 30 percent use it in their own business. The study analyzed why businesses were not implementing new technologies that they recognized could improve their bottom line. One of the most common roadblocks was a lack of skill in recognizing opportunities within organizations for the new technology. Other common issues were the perception of security risks, data governance concerns, and the inertia of legacy systems.

The survey also found the stain of mistrust, with 82 percent of respondents sure that much of what they hear about tech trends is pure hype. It is no surprise, then, that they hesitate to invest resources and impose change on their workers until they are convinced benefits will be worth the effort. Perhaps vendors would be wise to dispense with the hype and just lay out the facts as clearly as possible; potential customers are savvier than some seem to think.

Cynthia Murrell, March 3, 2017

 

Inside Loon Balloons

March 2, 2017

You may have heard about Google X’s Project Loon, which aims to bring Internet access to underserved, rural areas using solar-powered balloons. The post, “Here’s How Google Makes its Giant, Internet-Beaming Balloons,” at Business Insider takes us inside that three-year-old project, describing some of how the balloons are made and used. The article is packed with helpful photos and GIFs. We learn that the team has turned to hot-air-balloon manufacturer Raven Aerostar for their expertise. The write-up tells us:

The balloons fly high in the stratosphere at about 60,000 to 90,000 feet above Earth. That’s two to three times as high as most commercial airplanes. Raven Aerostar creates a special outer shell for the balloons, called the film, that can hold a lot of pressure — allowing the balloons to float in the stratosphere for longer. The film is as thin as a typical sandwich bag. … The film is made of a special formulation of polyethylene that allows it to retain strength when facing extreme temperatures of up to -112 degrees Fahrenheit.

We like the sandwich-bag comparison. The balloons are tested in sub-freezing conditions at the McKinley Climatic Lab; see the article for dramatic footage of one of their test subjects bursting. We also learn about the “ballonet,” an internal compartment in each balloon that controls altitude and, thereby, direction. Each balloon is equipped with a GPS tracker, of course, and all electronics are secured in a tiny basket below.

One caveat is a bit disappointing—users cannot expect to stream high-quality videos through the balloons. Described as “comparable to 3G,” the service should be enough for one to visit websites and check email. That is certainly far better than nothing and could give rural small-business owners and remote workers the Internet access they need.

Cynthia Murrell, March 2, 2017

Visualizing a Web of Sites

February 6, 2017

While the World Wide Web is clearly a web, it has not traditionally been presented visually as such. Digital Trends published an article centered on a new visualization of Wikipedia, “Race through the Wikiverse for your next internet search.” This web-based interactive 3D visualization of the open source encyclopedia is at Wikiverse.io. It was created by Owen Cornec, a Harvard data visualization engineer. It pulls about 250,000 articles from Wikipedia and makes connections between articles based on overlapping content. The write-up tells us:

Of course it would be unreasonable to expect all of Wikipedia’s articles to be on Wikiverse, but Cornec made sure to include top categories, super-domains, and the top 25 articles of the week.

Upon a visit to the site, users are greeted with three options, each of course having different CPU and load-time implications for your computer: “Light,” with 50,000 articles, 1 percent of Wikipedia, “Medium,” 100,000 articles, 2 percent of Wikipedia, and “Complete,” 250,000 articles, 5 percent of Wikipedia.

Will this pave the way for web-visualized search? Or, as the article suggests, become an even more exciting playing field for The Wikipedia Game? Regardless, this advance makes clear the importance of semantic search. Oh, right: perhaps this would be a better link to locate semantic search (it made the 1 percent “Light” cut).

Megan Feil, February 6, 2017

Little New Hampshire Public Library Takes on Homeland Security over Right to Tor

February 3, 2017

The AP article titled “Browse Free or Die? New Hampshire Library Is at Privacy Fore” relates the ongoing battle between the Kilton Public Library of Lebanon, New Hampshire, and Homeland Security. This fierce little library was the first in the nation to use Tor, the location and identity scrambling software with a seriously bad rap. It is true, Tor can be used by criminals, and has been used by terrorists. As this battle unfolds in the USA, France is also scrutinizing Tor. But for librarians, the case is simple:

Tor can protect shoppers, victims of domestic violence, whistleblowers, dissidents, undercover agents — and criminals — alike. A recent routine internet search using Tor on one of Kilton’s computers was routed through Ukraine, Germany and the Netherlands. “Libraries are bastions of freedom,” said Shari Steele, executive director of the Tor Project, a nonprofit started in 2004 to promote the use of Tor worldwide. “They are a great natural ally.”… “Kilton’s really committed as a library to the values of intellectual privacy.

To illustrate a history of action by libraries on behalf of patron privacy, the article briefly lists events surrounding the Cold War, the Patriot Act, and the Edward Snowden leak. It is difficult to argue with librarians. For many of us, they were amongst the first authority figures, they are extremely well read, and they are clearly arguing passionately about an issue that few people fully understand. One of the library patrons spoke about how he is comforted by the ability to use Tor for innocent research that might get him flagged by the NSA all the same. Libraries might become the haven of democracy in what has increasingly become a state of constant surveillance. One argument might go along these lines: if we let Homeland Security take over the Internet and give up intellectual freedom, don’t the terrorists win anyway?

Chelsea Kerwin, February 3, 2017
