Anonymized Location Data: an Oxymoron?

May 13, 2020

Location data. To many the term sounds innocuous, boring really. Perhaps that is why society has allowed apps to collect and sell it with no significant regulation. An engaging (and well-illustrated) piece from Norway’s NRK News, “Revealed by Mobile,” shares the minute details journalists were able to put together about one citizen from location data purchased on the open market. Graciously, this man allowed the findings to be published as a cautionary tale. We suggest you read the article for yourself to absorb the chilling reality. (The link we share above runs through Google Translate.)

Vendors of location data would have us believe the information is completely anonymized and cannot be tied to the individuals who generated it. It is only good for general uses like statistics and regional marketing, they assert. Intending to put that claim to the test, NRK purchased a batch of Norwegian location data from the British firm Tamoco. Their investigation shows anonymization is an empty promise. Though the data is stripped of directly identifying information, buyers are a few Internet searches away from correlating location patterns with individuals. Journalists Trude Furuly, Henrik Lied, and Martin Gundersen tell us:

“All modern mobile phones have a GPS receiver, which with the help of satellite can track the exact position of the phone with only a few meters distance. The position data NRK acquired consisted of a table with four hundred million map coordinates from mobiles in Norway. …

“All the coordinates were linked to a date, time, and specific mobile. Thus, the coordinates showed exactly where a mobile or tablet had been at a particular time. NRK coordinated the mobile positions with a map of Norway. Each position was marked on the map as an orange dot. If a mobile was in a location repeatedly and for a long time, the points formed larger clusters. Would it be possible for us to find the identity of a mobile owner by seeing where the phone had been, in combination with some simple web searches? We selected a random mobile from the dataset.

“NRK searched the address where the mobile had left many points about the nights. The search revealed that a man and a woman lived in the house. Then we searched their Facebook profiles. There were several pictures of the two smiling together. It seemed like they were boyfriend and girlfriend. The man’s Facebook profile stated that he worked in a logistics company. When we searched the company in question, we discovered that it was in the same place as the person used to drive in the morning. Thus, we had managed to trace the person who owned the cell phone, even though the data according to Tamoco should have been anonymized.”
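The first step the journalists describe, finding the place where a phone leaves many points at night, can be sketched in a few lines of Python. This is only an illustration of the idea, not NRK’s actual method. The function name, the night-hour window, the rounding precision, and the sample coordinates are all assumptions invented for the example.

```python
from collections import Counter
from datetime import datetime


def likely_home_cell(points, night_start=22, night_end=6, precision=3):
    """Return the most frequent rounded (lat, lon) cell among night-time points.

    points: iterable of (latitude, longitude, datetime) tuples for ONE device.
    Rounding to 3 decimals groups points into cells roughly 100 m across
    (an assumed granularity, chosen only for illustration).
    """
    cells = Counter()
    for lat, lon, ts in points:
        # Keep only points recorded late in the evening or early in the morning.
        if ts.hour >= night_start or ts.hour < night_end:
            cells[(round(lat, precision), round(lon, precision))] += 1
    if not cells:
        return None
    return cells.most_common(1)[0][0]


# Hypothetical pings for a single "anonymous" device (made-up coordinates).
sample_points = [
    (59.9139, 10.7522, datetime(2020, 1, 6, 23, 15)),   # night
    (59.91412, 10.75231, datetime(2020, 1, 7, 2, 40)),  # night, same cell
    (59.9500, 10.8000, datetime(2020, 1, 7, 9, 5)),     # daytime, ignored
    (59.9139, 10.7522, datetime(2020, 1, 7, 23, 50)),   # night
]

home = likely_home_cell(sample_points)
print(home)
```

Once a cell like this is known, the remaining steps in the quoted passage (address lookup, social media, employer) need no code at all, which is the article’s point.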

The journalists went on to put together a detailed record of that man’s movements over several months. It turns out they knew more about his trip to the zoo, for example, than he recalled himself. When they revealed their findings to their subject, he was shocked and immediately began deleting non-essential apps from his phone. Read the article; you may find yourself doing the same.

Cynthia Murrell, May 12, 2020

Enterprise Document Management: A Remarkable Point of View

March 3, 2020

DarkCyber spotted “What Is an Enterprise Document Management (EDM) System? How to Implement Full Document Control.” The write up is lengthy, running about 4,000 words. There are pictures like this one:


ECM is enterprise content management and in the middle is Enterprise Document Management which is abbreviated DMS, not EDM.

The idea is that documents have to be managed, and DarkCyber assumes that most organizations do not manage their content — regardless of its format — particularly well until the company is involved in a legal matter. Then document management becomes the responsibility of the lawyers.

In order to do any type of document or content management, employees have to follow the rules. The rules are the underlying foundation of the article. A company manufacturing interior panels for an automaker will have to have a product management system, a system to deal with drawings (paper and digital), supplier data, and other bits and pieces to make sure the “door cards” are produced.

The problem is that guidelines often do not translate into consistent employee behavior. One big reason is that the guidelines don’t fit into the work flows and the incentive schemes do not reward the time and effort required to make sure the information ends up in the “system.” Many professionals write something, text it, and move on. Enterprise systems typically do not track fine grained information very well.

Like enterprise search vendors, the “document management” folks try to make workers follow rules, yet workers who may be concerned about becoming redundant, a sick child, an angry boss, or any other perturbation not on the consultant’s checklist ignore many information rules.

There is an association focused on records management. There are companies concerned with content management. There are vendors who focus on images, videos, audio, and tweets.

The myth that an EDM, ECM, or enterprise search system can create an affordable, non invasive, legally compliant, and effective way to deal with the digital fruit cake in organizations is worth lots of money.

The problem is that these systems, methods, guidelines, data lakes, federation technologies, smart software, etc. etc. don’t work.

The article does a good job of explaining what a consultant recommends. The information it presents provides fodder for the marketing animals who are going to help sell systems, training, and consulting.

The reality is that humans generate information and use a range of systems to produce content. Tweets about a missed shipment from a personal mobile phone may be prohibited. Yeah, explain that to the person who got the order in the door and kept the commitment to the customer.

There are conferences, blogs, consulting firms, reports, and BrightPlanet videos about managing information.

The write up states:

There is no use documenting and managing poor workflows, processes, and documentation. To survive in business, you have to adapt, change and improve. That means continuously evaluating your business operations to identify shortfalls, areas for improvements, and strengths for continuous investment. Regular internal audits of your management systems will enable you to evaluate the effectiveness of your Enterprise Document Management solution.

Right. When these silver bullet, pie-in-the-sky solutions cost more than budgeted, employees quit using them, and triage costs threaten the survival of the company — call in the consultants.

Today’s systems do not work with the people actually doing information creation. As a result, most fail to deliver. Sound familiar? It should. You, gentle reader, will never follow the information rules unless you are specifically paid to follow them or given an ultimatum like “do this or get fired.”

Tweet that and let me know if you managed that information.

Stephen E Arnold, March 3, 2020

Blockchain: A Loser in 2020?

December 31, 2019

I recently completed a report about Amazon’s R&D work in blockchain. If you want a free summary of the report, write darkcyber333 at yandex dot com. If not, no problem. You will want to read “Please Blockchain, Prove Me Wrong.” The author likes to use words that are on some online services’ stop lists, but that’s okay. The writer is passionate about the perceived failings of blockchain.

Blockchain is, according to the write up:

a solution looking for a problem.

More proof needed, you gentle but skeptical reader? How about this?

According to Gartner’s Hype Cycle, blockchain is still “sliding into the trough of disillusionment,” meaning the technology is struggling to live up to the expectations created by the hype around it.

There you go. Proof from a marketing company.

DarkCyber’s view is that encryption is likely to continue to toddle forward. Also, the charm of the distributed database continues to woo some people.

There may be hope, and perhaps that is why Amazon has more than a dozen patents related to blockchain technology. We learn from the impassioned analysis:

Blockchain’s purported promise is such that everyone is willingly taking a multi-faceted approach, not giving much thought to the possibility that its potential may, in fact, be limited. Or maybe blockchain is just the first iteration of something far more powerful, a base we can build on to restore our faith in decentralized systems.

To sum up, for a dead duck, there are some feathers afloat. And there are those Amazon patents? Maybe Mr. Bezos is just off base and should stick to bulldozing outfits like mom and pop stores and outfits like FedEx?

Stephen E Arnold, December 31, 2019

Online Consumption of Data: A Mental Architecture Built on Inherent Addictive Patterns??

December 27, 2019

Two items caught my attention. The first explains that more than 80 percent of a sample group use a “second screen” when watching television. Yep, the boob tube and the vast wasteland. Marshall McLuhan, a controversial figure, explained that TV is a kick back and vegetate medium. Punching buttons and formulating a thought for a tweet is hot. The article “88% of Americans Use a Second Screen While Watching TV. Why?” references the factoid that humans are not very adept at multi tasking. Interesting because humans can walk and chew gum, breathe, and think about crossing the street at the same time. But whatever. Also, the write up ignores the McLuhanesque approach that each type of media has its own “construct” or “mental evocation.”

The answer to “Why?” may be as simple as, “Addiction. Just a TV and a computing device.” Can one get the monkey off one’s back? Not easily.

Who can assist another? Consider if this item of information is correct: “70% Parents Cannot Control Their Own Online Activity.” This write up reports:

Around 70 per cent of parents admit that they themselves spend too much time online and 72 per cent feel that internet and mobile device usage in general is impeding family life…

Net net: No wonder information has to be crunchy. Easy to use is becoming a strategy for control. Interesting implications for 2020 and beyond if these two reports are mostly accurate.

Stephen E Arnold, December 27, 2019

Sentiment Analysis: Can a Monkey Do It?

June 27, 2019

Sentiment analysis is a machine learning tool companies are employing to understand how their customers feel about their services and products. It is mainly deployed on social media platforms, including Facebook, Instagram, and Twitter. The Monkey Learn blog details how sentiment analysis is specifically being used on Twitter in the post, “Sentiment Analysis Of Twitter.”

Using sentiment analysis is not a new phenomenon, but there are still individuals unaware of the possible power at their fingertips. Monkey Learn specializes in custom machine learning solutions that include intent, keywords, and, of course, sentiment analysis. The post is a guide on the basics of sentiment analysis: what it is, how it works, and real life examples. Monkey Learn defines sentiment analysis as:

Sentiment analysis (a.k.a opinion mining) is the automated process of identifying and extracting the subjective information that underlies a text. This can be either an opinion, a judgment, or a feeling about a particular topic or subject. The most common type of sentiment analysis is called ‘polarity detection’ and consists in classifying a statement as ‘positive’, ‘negative’ or ‘neutral’.

It also relies on natural language processing (NLP) to understand the information’s context.

Monkey Learn explains that sentiment analysis is important because most of the world’s digital data is unstructured. Machine learning with NLP’s assistance can quickly sort large data sets and detect their polarity. Monkey Learn promises with their sentiment analysis to bring their customers scalability, consistent criteria, and real-time analysis. Many companies are using Twitter sentiment analysis for customer service, brand monitoring, market research, and political campaigns.
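To make “polarity detection” concrete, here is a deliberately tiny sketch in Python. It is a hand-made word-list counter, not Monkey Learn’s product or API, and not a machine learning model. The word lists and the function name are invented for illustration, and a real system would handle negation, sarcasm, and context far better.

```python
import re

# Toy lexicons, invented for this example only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "slow"}


def polarity(text):
    """Classify text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical tweets.
for tweet in [
    "I love this great service",
    "Terrible support and slow shipping",
    "The package arrived on Tuesday",
]:
    print(tweet, "->", polarity(tweet))
```

A counter like this shows why vendors lean on machine learning and NLP: a tweet such as “not good at all” would be scored positive here because the word list knows nothing about negation.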

The article is basically a promotional piece for Monkey Learn, but it does work as a starting guide for sentiment analysis.

Whitney Grace, June 27, 2019

The Challenge of Filtering: One Reason to Hire Human Editors

May 17, 2018

In an effort to keep innocent bystanders and children safe from all the nastiness of the net a decade ago, British officials created the greatest firewall this side of China. The results are finally starting to be seen with clear hindsight and they are not a glowing view of the project. We discovered more of this botched attempt at government oversight from a recent BoingBoing piece, “Britain’s Great Firewall Blocks Access to Official Disney Sites, Internet Safety Guides, VPNs and Coding Sites for Kids.”

According to the story the wall is intended to create a:

“[H]armful content blocklist that UK ISPs are using to keep UK children safe.

“Unsurprisingly, the list is full of embarrassing false positives, including (the official UK site of the Walt Disney Company), as well as Disney’s More awkward: the UK’s largest ISPs are blocking, a website that teaches kids to use the internet safely; also blocked is, which teaches children to write software.”

Despite this kind of gaffe, other nations are not being scared off by firewalls. Perhaps that’s a sign of each government’s views on civil liberties, because the UK is trying to fix its problem while Russia is considering adopting a similar firewall. This is a developing front worth following because the way we interact with other nations is at stake.

Patrick Roland, May 17, 2018

SEO Tips for Featured Snippets

March 26, 2018

We like Google’s Featured Snippets feature, at least when the information it serves up is relevant to the query. That is the tool that places text from, and links to, a site that (ideally) answers the user’s question at the top of search results. Naturally, Search Engine Optimization pros want their clients’ sites to grace these answer boxes as often as possible. That is the idea behind VolumeNine’s blog post, “Featured Snippets in Search: An Overview.” Writer Megan Duffy sees Featured Snippets as an opportunity for those already well-positioned in the search rankings. She explains,

There’s no debate that holding the primary spot on a search engine results page helps drive a ton of traffic. But it takes a long, disciplined approach to climb to the top of an organic search result. The featured snippet provides a bit of a shortcut. The featured snippet is an opportunity for any page ranked in the top ten of results to jump straight to the top with less effort compared to building a page’s search rank from, for example, from eighth to first. Having a featured snippet effectively puts you at search result zero and allows your business to earn traffic as the top search result.

Duffy goes on to make recommendations for maximizing one’s chances of being picked for that Snippet spot. To her credit, she emphasizes that good content is key; we like to see that is still a consideration.

Cynthia Murrell, March 26, 2018

Bigquery Equals Big Data Transfers for Google

March 16, 2018

Google provides hundreds of services for its users; these include YouTube, AdWords, DoubleClick Campaign Manager, and more. Google, however, is mainly used as a search engine and all of the content on its other services is fed into the search algorithm so it can be queried. In order for all of the content to be searchable, it needs to be dumped and mined. That requires a lot of push power, so what does Google use? According to Smart Data Collective’s article, “Big Query Service: Next Big Thing Unveiled By Google On Big Data,” Google uses the BigQuery Data Transfer Service.

Google and big data have not been in the news together for a while, but the BigQuery Data Transfer Service shows how it is moving away from SaaS.  How exactly does this work?

According to a Google’s blog post, the new service automates the migration of data from these apps in BigQuery in a scheduled and managed manner. So good so far, the service will support data transfers from AdWords, DoubleClick Campaign Manager, DoubleClick for Publishers, and YouTube Content and Channel Owner Reports and so forth. As soon as the data gets to BigQuery, users can begin querying on the immediate basis. With the help of Google Cloud Dataprep, users cannot only clean and prep the data for that analysis but also further think of analyzing other data alongside that information kept in BigQuery.

The data moves from the apps within 24 hours and BigQuery customers can schedule their own data deliveries so they occur regularly.  Customers who already use BigQuery are Trivago and Zenith.

The article turns into a press release for other services Google provides related to machine learning and explains how it is the leading company in the industry.  It is simply an advertisement for cloud migration and yet another Google service.

Whitney Grace, March 16, 2018

Come on Google, Stop Delivering Offensive Content

March 14, 2018

Sentiment analytics is notoriously hard to program and leads to more chuckles than accurate results.  Throughout the year, Google, Facebook, and other big names have dealt with their own embarrassing sentiment analytics fiascos and they still continue.  The Verge shares, “Google’s Top Search Results Promote Offensive Content, Again” in an unsurprising headline.

One recent example took an offensive meme from the swathe subreddit when “gender fluid” was queried and made it the first thing displayed.  Yes, it is funny, but stuff like this keeps happening without any sign of stopping:

The slip-up comes just a month after Google briefly gave its “top stories” stamp of approval to two 4chan threads identifying the wrong suspect in the recent Las Vegas mass shooting tragedy. This latest search result problem appears to be related to the company’s snippet feature. Featured snippets are designed to answer queries instantly, and they’ve often provided bad answers in the past. Google’s Home device, for example, used a featured snippet to answer the question ‘are women evil?’ with the horrendously bad answer ‘every woman has some degree of prostitute in her.’

The ranking algorithm was developed to pull the most popular stories and deliver them regardless of their accuracy. Third parties and inaccurate sources can manipulate the ranking algorithm for their own benefit or humor. Google is considered the de facto source of information. That status carries a responsibility to purvey the truth, but there will always be people who take advantage of the news outlets.

Whitney Grace, March 14, 2018

Is Google The Victim or the Aggressor in Prager Case?

March 8, 2018

Courtroom drama is reaching a high point in an interesting case that might have flown under your radar. Online university Prager U is suing YouTube for restricting many of its videos. Seems like an odd choice, until you start to realize just how political this move is and the First Amendment can of worms is spilling all over the place. We learned more in a recent FrontPage Mag story, “Prager U Video: Who Will Google Silence Next?”

According to the video shown, Google claimed that some of the company’s educational five minute videos were not appropriate for children.

“Google and YouTube dominate internet search with over 75% of the market. If you disappear on Google, your ability to voice your opinion disappears too. PragerU is an educational non-profit that has had over 40 of their videos restricted by YouTube. That’s why they have recently filed a lawsuit against the tech giant.”

Prager is claiming that this is a misunderstanding and a violation of their first amendment rights, since they say that their short videos are age appropriate across the board. Google, however, is firing back with a surprising defense: It’s actually Google’s first amendment rights that are being violated. They say: “PragerU’s motion is a radical attempt to rewrite the rules governing online services, one that would transform nearly every decision that service providers make about how content may be displayed on their platforms into a constitutional case to be arbitrated by the courts.”

Grab some popcorn, because this is going to be an interesting fight. Adam Carolla is a semi-partner with Mr. Prager. Mr. Carolla has a podcast, and he can create some traction for issues which interest him. Does anyone remember the patent troll who took on the comedian? The patent troll does, I believe.

Patrick Roland, March 8, 2018
