Tracking Trends in News Homepage Links with Google BigQuery

October 17, 2019

Some readers may be familiar with the term “culturomics,” a particular application of n-gram-based linguistic analysis to text. The practice arose after a 2010 project that applied such analysis to five million historical books across seven languages. The technique creates n-gram word frequency histograms from the source text. Now the technique has been applied to links found on news organizations’ home pages using Google’s BigQuery platform. Forbes reports, “Using the Cloud to Explore the Linguistic Patterns of Half a Trillion Words of News Homepage Hyperlinks.” Writer Kalev Leetaru explains:

“News media represents a real-time reflection of localized events, narratives, beliefs and emotions across the world, offering an unprecedented look into the lens through which we see the world around us. The open data GDELT Project has monitored the homepages of more than 50,000 news outlets worldwide every hour since March 2018 through its Global Frontpage Graph (GFG), cataloging their links in an effort to understand global journalistic editorial decision-making. In contrast to traditional print and broadcast mediums, online outlets have theoretically unlimited space, allowing them to publish a story without displacing another. Their homepages, however, remain precious fixed real estate, carefully curated by editors that must decide which stories are the most important at any moment. Analyzing these decisions can help researchers better understand which stories each news outlet believed to be the most important to its readership at any given moment in time and how those decisions changed hour by hour.”

The project has now collected more than 134 billion such links. The article describes how researchers have used BigQuery to analyze this dataset with a single SQL query, so navigate there for the technical details. Interestingly, one thing they are looking at is trends across the 110 languages represented by the samples. Leetaru emphasizes this endeavor demonstrates how much faster these computations can be achieved compared to the 2010 project. He concludes:

“Even large-scale analyses are moving so close to real-time that we are fast approaching the ability of almost any analysis to transition from ‘what if’ and ‘I wonder’ to final analysis in just minutes with a single query.”
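
For readers who want a feel for what an n-gram frequency histogram is, here is a minimal Python sketch. It is not the BigQuery query from the Forbes article. The function name `ngram_counts` and the sample link texts are hypothetical, made up for illustration, and the tokenizer is a simple assumption (lowercased runs of word characters).

```python
import re
from collections import Counter


def ngram_counts(texts, n=2):
    """Count word n-grams across an iterable of strings.

    Hypothetical helper for illustration only; tokenization is a
    simplifying assumption (lowercase, runs of word characters).
    """
    counts = Counter()
    for text in texts:
        words = re.findall(r"\w+", text.lower())
        # Slide a window of n words across the token list.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    # Invented sample link texts, not data from the GDELT Project.
    sample_links = [
        "Election results expected tonight",
        "Election results delayed in two states",
        "Markets react to election results",
    ]
    for ngram, count in ngram_counts(sample_links, n=2).most_common(3):
        print(count, ngram)
```

The same counting idea, grouped by language or by outlet, is roughly what a large-scale n-gram analysis has to do. A cloud query engine just does it over billions of rows instead of three.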

Will faster analysis lead to wiser decisions? We shall see.

Cynthia Murrell, October 17, 2019

YouTube: Tidying Up Script Kiddie Crumbs

October 15, 2019

An interesting series of comments flowed on Reddit (Monday, October 14, 2019). You may be able to access the original post and the comments at this link. No guarantees, however. The subject: Alleged Google censorship. The topic: Methods for penetrating other people’s computers.

Is Google actively removing videos which violate the Jello-like terms of service?

DarkCyber hopes so.

YouTube is TV for hundreds of millions around the world.

There is some interesting material available on YouTube.

The post includes links. DarkCyber suggests you do some clicking and form your own conclusion. Google often lacks consistency, so it is difficult to know where the Googley ball is bouncing.

Stephen E Arnold, October 15, 2019

Want to Be Found Via Google?

October 13, 2019

Do you want your business and/or Web site to be at the top of Google’s search results? It is a hard race to the top, but it can be won and Business 2 Community explains how in, “How To Use Google My Business Posts.” Google My Business (GMB) posts are part of the Google My Business profile, one of the many services that Google offers users to optimize their business’s profile. According to the article, Google My Business (GMB) posts are mini-ads for your business or the goods/services you offer.

The GMB posts allow users to publish products, services, events and other information directly to Google’s search and maps. Your content is then placed in front of potential customers. The biggest clincher is that the GMB posts are placed in Google’s many services in real time. That is a big deal! Being able to view and interact with content in real time is part of the augmented reality.

“Google offers four different types of posts to help you promote your business:

• Events, like a wine night or networking event

• Offers, such as sales or discounts

• Product updates, like new merchandise

• Announcements, such as “We’re open late” or “Closed due to inclement weather!”

There are two ways to create a GMB post, on a desktop or mobile device. Videos and photos can also be added to posts. All posts appear in a user’s GMB profile and are live on Google search for seven days, unless an event is more than seven days in the future.

GMB is like Facebook or LinkedIn, except for businesses. How long will it take before it becomes spam filled? Google Ads works too, but that requires some monetary investment.

Whitney Grace, October 13, 2019

Google: Practical, Pragmatic, and Logical

October 3, 2019

I am not sure if this news item about the GOOG is accurate. Nevertheless, it does provide a Googley solution to a thorny problem; namely, where can one get images with which to train a facial recognition system?

One could scrape Yandex Images. One could retain the enterprising Yahoo engineer who hacked accounts for interesting images.


One could snap pix of homeless people. Atlanta. Hmmm. Atlanta?

The allegedly accurate factoids appear in “Google Contractors Reportedly Targeted Homeless People for Pixel 4 Facial Recognition.” I noted:

a Google contractor may be using some questionable methods to get those facial scans, including targeting groups of homeless people and tricking college students who didn’t know they were being recorded. According to several sources who allegedly worked on the project, a contracting agency named Randstad sent teams to Atlanta explicitly to target homeless people and those with dark skin, often without saying they were working for Google, and without letting on that they were actually recording people’s faces.

Legal? Illegal? I don’t care.

The idea and the execution are troubling.

If true, classy. Like the yacht death involving drugs and an alleged person for hire. Somehow Googley.

Stephen E Arnold, October 3, 2019

Will the Real Disintermediating Entity Step Forward?

October 3, 2019

Big Microsoft day. It’s back in the mobile phone business. Sometime next year, probably coincident with a delayed Win 10 update, the Microsoft Surface Dual Screen Folding Android Phone becomes available. You can get the scoop and one view of Microsoft’s “we’re in phones again strategy” in “Microsoft’s Future Is Built on Google Code.” Do I agree? Of course not, that’s my method: Find other ways to look at an announcement.

The write up posits:

Google underpins Microsoft’s browser and mobile OS now.

I noted this statement as well:

… it could come as quite a shock that the CEO of Microsoft doesn’t care that much about operating systems. But there it is, in black and white. Microsoft obviously isn’t abandoning Windows — it announced a new version of it today — but it matters much more to Microsoft that you use its services like Office. That’s where the money is, after all.

Money. A phone that is not here?

But there’s another side to Microsoft. Amazon, the evil enemy, makes it possible to run Microsoft on the AWS platform.

Now who is going to disintermediate whom?

Will Google get frisky and nuke Microsoft’s Android love?

Will Amazon just push MSFT SQLServer and other Microsoft innovations off the AWS platform and suck up the MSFT business?

Will Microsoft find that loving two enemies is more a management hassle than getting a Windows 10 server out the door?

Will Amazon and Google escalate their skirmishes and take actions that miss one enemy and plug the Redmond frenemy?

The stakes are high. Microsoft has done a pivot with a double backflip.

Perfect 10 or broken foot? Enron tried something like Microsoft’s approach. The landing was bumpy. The cloud may not cushion a lousy landing.

Stephen E Arnold, October 3, 2019

Google: A Big Play

October 1, 2019

Google’s walled garden is getting a glass roof. AMP was a good first step, but there is a world of other Internet-enabled services which are not likely to be AMP-lified. What’s the fix? DarkCyber believes that Google wants to become the Internet. Stopping Amazon is not working with the GOOG’s standard line up of services. See “Why Big ISPs Aren’t Happy about Google’s Plans for Encrypted DNS.”

The write up states:

Google and Mozilla are trying to address these concerns by adding support in their browsers for sending DNS queries over the encrypted HTTPS protocol. But major Internet service providers have cried foul. In a September 19 letter to Congress, Big Cable and other telecom industry groups warned that Google’s support for DNS over HTTPS (DOH) “could interfere on a mass scale with critical Internet functions, as well as raise data-competition issues.”

Consider Google’s point of view. Google has user security in mind. Sure, there are others who see benefits in putting Google in a superordinate position with regard to DNS. What happens if Google filters certain addresses? An apology for sure.

The stakes are high. How will Amazon (an ISP of sorts) respond?

This will be interesting.

Stephen E Arnold, October 1, 2019

Google Cemetery

September 27, 2019

Google is mercurial. I assume that others will describe Google in a different way. DarkCyber wants to note this page:


The Google Graveyard does a good job of keeping track of discontinued Google products and services. Each entry includes a brief description, a comment about how long the product lived, and a cheerful icon for the deceased service. Little digital tombstones!

Stephen E Arnold, September 27, 2019

Dumbing Down Search and Making More Money?

September 27, 2019

Google makes changes that benefit Google. Forbes Magazine, the capitalist tool, however, does not understand this simple fact about the world’s largest online advertising outfit.

“Google Makes It More Difficult To Find Old Images” points out that the ad giant made it more difficult for 99 percent of Google Image search users to locate “old” images. Most of Google’s advanced search features don’t get much click love.

As a result, why make the feature available? The benefit of dumbing down Google Image Search is related to several tangential factors; my thoughts are:

  • Legal hassles related to making images findable
  • Cost reduction. If content is not searched, why spend money verifying links and storing pointers?
  • Ads. Clicking a Web page for an image can display a current ad. Clicking an old picture like the one below is unlikely to provide an ad payout for the GOOG.


There are some options:

  • Use Google search operators like those on this list
  • Include a date in the image search string; for example, IBM mainframe 1964
  • Use the Google advanced image search form which is at this link.

What’s Forbes’ take?

I reached out to Google for comment on this story. I have yet to hear back and will update this article if I do.

Yep, the capitalist tool.

Stephen E Arnold, September 27, 2019

Apple, Google, and Avid: The Perils of Complexity and Arrogance

September 27, 2019

Apple wants to make Mac users safe. The technology Apple uses requires passwords. Behind the scenes, Apple’s zeros and ones are beavering away to make Mac use a breeze. The trick? Just stick with Apple.

Google wants to point fingers at Apple iPhone and get Chrome on every Mac computer. Ads, surveillance, and real estate are probably motives. The Googlers are darned confident that their code is just peaches. Imagine the pain and shame of posting an admission of sorts that Google nukes some Macs. See this post. (Bonus time?)

Then there is Avid. To prevent the ethical lads and lasses in Hollywood and other video hot spots from pirating software, dongles are the answer. That’s Avid’s policy. No dongle, no go.

The problem is that none of these confident (maybe arrogant?) outfits think about the unknown dependencies within users’ computers. There are too many users. There are too many combinations of software and dongles.

The solution is to assume that everything will work. But when it doesn’t, the arrogant outfits have to explain that:

  • Their code may not be perfect
  • The security procedures may cause problems
  • The dongle things add complexity.

Will these types of issues become more frequent? Will smart software avoid these problems? Will pigs fly?

Stephen E Arnold, September 27, 2019

Google Cannot Patch Up Its Local News Service

September 27, 2019

A Xoogler had an idea for local news. Patch flopped. “Google Shutters Bulletin, Its Hyperlocal News Experiment” reports:

In a letter to users (obtained by Android Police), Google said in two weeks the Bulletin app will no longer be accessible. Users who shared content on the app will be able to download their posts until November 22nd. One reason Bulletin never took off was that it wasn’t highly publicized, so chances are few people are going to miss this.

With the erosion of daily and weekly newspapers, how does a person get information about a particular city?

Not easily and maybe not at all.

Of what value were those old-fashioned information services? Well, those old-fashioned outfits provided advertising opportunities to zippy outfits like Google.

Who cares? Probably not too many Silicon Valley types. Anyway Google tried.

Stephen E Arnold, September 27, 2019
