Israel and Mobile Phone Data: Some Hypotheticals

March 19, 2020

DarkCyber spotted a story in the New York Times: “Israel Looks to Repurpose a Trove of Cell Phone Data.” The story appeared in the dead tree edition on March 17, 2020, and you can access the online version of the write up at this link.

The write up reports:

Prime Minister Benjamin Netanyahu of Israel authorized the country’s internal security agency to tap into a vast, previously undisclosed trove of cell phone data to retrace the movements of people who have contracted the coronavirus and identify others who should be quarantined because their paths crossed.

Okay, cell phone data. Track people. Paths crossed. So what?

Apparently not much.

The Gray Lady does the handwaving about privacy and the fragility of democracy in Israel. There’s a quote about the need for oversight when certain specialized data are retained and then made available for analysis. Standard journalism stuff.

DarkCyber’s team talked about the write up and what the real journalists left out of the story. Remember. DarkCyber operates from a hollow in rural Kentucky and knows zero about Israel’s data collection realities. Nevertheless, my team was able to identify some interesting use cases.

Let’s look at a couple and conclude with a handful of observations.

First, the idea of retaining cell phone data is not exactly a new one. What if these data can be extracted using an identifier for a person of interest? What if a time-series query could extract the geolocation data for each movement of the person of interest captured by a cell tower? What if this path could be displayed on a map? Here’s a dummy example of what the plot for a single person of interest might look like. Please note, these graphics are examples selected from open sources. Examples are not related to a single investigation or vendor. These are for illustrative purposes only.


Source: Standard mobile phone tracking within a geofence. Map with blue lines showing a person’s path. SPIE at https://bit.ly/2TXPBby
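In code terms, the hypothetical time-series extraction is little more than a sort over tower pings. Here is a minimal Python sketch; the record layout, field names, and coordinates are invented for illustration and imply no particular vendor's schema:

```python
from datetime import datetime

# Hypothetical cell tower ping records for one identifier:
# (timestamp, latitude, longitude). All values are invented.
pings = [
    ("2020-03-15T09:10:00", 32.0853, 34.7818),
    ("2020-03-15T08:55:00", 32.0809, 34.7806),
    ("2020-03-15T09:25:00", 32.0892, 34.7895),
]

def movement_path(records):
    """Order a person of interest's pings by time to form a plottable path."""
    ordered = sorted(records, key=lambda r: datetime.fromisoformat(r[0]))
    return [(lat, lon) for _, lat, lon in ordered]

path = movement_path(pings)
```

Feed `path` to any mapping library and the blue-line plot above falls out.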

Useful indeed.

Second, what if the intersection of the paths of two or more individuals can be plotted? Here’s a simulation of such a path intersection:


Source: Map showing the location of a person’s mobile phone over a period of time. Tyler Bell at https://bit.ly/2IVqf7y

Would these data provide a way to identify an individual with a mobile phone who was in “contact” with a person of interest? Would the authorities be able to perform additional analyses to determine who is in either party’s social network?
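The second hypothetical, detecting that two paths “crossed,” reduces to finding ping pairs that are close in both space and time. A minimal sketch, using the standard haversine distance and invented thresholds and data:

```python
from datetime import datetime
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two coordinates."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def contacts(pings_a, pings_b, max_km=0.1, max_minutes=15):
    """Return ping pairs where two phones were close in both space and time."""
    hits = []
    for ta, lat_a, lon_a in pings_a:
        for tb, lat_b, lon_b in pings_b:
            gap = abs((datetime.fromisoformat(ta) -
                       datetime.fromisoformat(tb)).total_seconds()) / 60
            if gap <= max_minutes and \
                    haversine_km(lat_a, lon_a, lat_b, lon_b) <= max_km:
                hits.append((ta, tb))
    return hits

# Invented pings: one near-coincident pair, one distant non-match.
pings_a = [("2020-03-15T09:00:00", 32.0853, 34.7818)]
pings_b = [("2020-03-15T09:05:00", 32.0854, 34.7819),
           ("2020-03-15T11:00:00", 32.2000, 34.9000)]
hits = contacts(pings_a, pings_b)
```

Swap in quarantine-relevant thresholds and this is the skeleton of a contact query.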

Third, could these relationship data be mined so that connections can be further explored?


Source: Diagram of people who have crossed paths visualized via Analyst Notebook functions. Globalconservation.org
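The third hypothetical, exploring connections outward from a person of interest, is ordinary graph traversal of the kind Analyst Notebook visualizes. A breadth-first sketch over an invented contact graph:

```python
from collections import deque

# Hypothetical contact graph: person -> people whose paths crossed theirs.
contact_graph = {
    "poi": {"a", "b"},
    "a": {"poi", "c"},
    "b": {"poi"},
    "c": {"a"},
}

def within_hops(graph, start, hops):
    """Breadth-first walk: everyone reachable from start in <= hops crossings."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen - {start}
```

One hop gives direct contacts; two hops gives the contacts-of-contacts an analyst would explore next.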

Can these data be arrayed on a timeline? Can the routes be converted into an animation that shows a particular person of interest’s movements at a specific window of time?

image

Source: Vertical dots diagram from Recorded Future showing events on a timeline. https://bit.ly/39Xhbex
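The timeline question is, at bottom, a sort and filter over date stamps. A sketch using ISO 8601 strings (which sort lexicographically in chronological order) and invented events:

```python
# Hypothetical (timestamp, description) events for one person of interest.
events = [
    ("2020-03-15T09:25:00", "near market"),
    ("2020-03-15T08:55:00", "leaves residence"),
    ("2020-03-15T09:10:00", "passes tower 17"),
]

def timeline(records, start=None, end=None):
    """Sort events chronologically, optionally clipped to a window of interest."""
    rows = sorted(records, key=lambda r: r[0])
    if start:
        rows = [r for r in rows if r[0] >= start]
    if end:
        rows = [r for r in rows if r[0] <= end]
    return rows
```

Rendering the sorted rows as an animation or a dot diagram is then a display choice, not a data problem.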

These hypothetical displays of data derived from cross correlations, geotagging, and timeline generation based on date stamps seem feasible. If earnest individuals in rural Kentucky can see the value of these “secret” data disclosed in the New York Times’ article, why couldn’t the journalist and the others who presumably read the story?

What’s interesting is that systems, methods, and tools clearly disclosed in open source information are overlooked, ignored, or just not understood.

Now the big question: Do other countries have these “secret” troves of data?

DarkCyber does not know; however, it seems possible. Log files are a routine byproduct of data processes, and data exhaust may have value.

Stephen E Arnold, March 19, 2020

Machine Learning Foibles: Are We Surprised? Nope

March 18, 2020

Eurekalert published “Study Shows Widely Used Machine Learning Methods Don’t Work As Claimed.” Imagine that? The article states:

Researchers demonstrated the mathematical impossibility of representing social networks and other complex networks using popular methods of low-dimensional embeddings.

To put the allegations and maybe mathematical proof in context, there are many machine learning methods and even more magical thresholds the data whiz kids fiddle to generate acceptable outputs. The idea is that as long as the outputs are “good enough”, the training method is okay to use. Statistics is just math with some good old fashioned “thumb on the scale” opportunities.

The article states:

The study evaluated techniques known as “low-dimensional embeddings,” which are commonly used as input to machine learning models. This is an active area of research, with new embedding methods being developed at a rapid pace. But Seshadhri and his coauthors say all these methods share the same shortcomings.

What are the shortcomings?

Seshadhri and his coauthors demonstrated mathematically that significant structural aspects of complex networks are lost in this embedding process. They also confirmed this result empirically by testing various embedding techniques on different kinds of complex networks.

The method discards or ignores information, collapsing an individual into a fuzzy “geometric representation.” Individuals’ social connections are lost in the fuzzification procedures.
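One structural property at issue is triangle density: real social networks are full of closed “friend of a friend” loops, which the authors argue low-dimensional embeddings cannot reproduce at realistic sparsity. Counting those triangles on a toy adjacency list is straightforward; the graph below is invented:

```python
from itertools import combinations

# Toy undirected graph as an adjacency list; a-b-c form one triangle.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

def triangle_count(graph):
    """Count closed triples (u-v, v-w, w-u), each triangle counted once."""
    count = 0
    for u, v, w in combinations(sorted(graph), 3):
        if v in graph[u] and w in graph[u] and w in graph[v]:
            count += 1
    return count
```

The paper's claim, loosely put, is that graphs generated from low-dimensional embeddings come up short on exactly this kind of count.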

Big deal. Sort of. The paper opens the door to many graduate students beavering away on the “accuracy” of machine learning procedures.

Stephen E Arnold, March 18, 2020

The Google: Geofence Misdirection a Consequence of Good Enough Analytics?

March 18, 2020

What a surprise—the use of Google tracking data by police nearly led to a false arrest, we’re told in the NBC News article, “Google Tracked his Bike Ride Past a Burglarized Home. That Made him a Suspect.” Last January, programmer and recreational cyclist Zachary McCoy received an email from Google informing him, as it does, that the cops had demanded information from his account. He had one week to try to block the release in court, yet McCoy had no idea what prompted the warrant. Writer Jon Schuppe reports:

“There was one clue. In the notice from Google was a case number. McCoy searched for it on the Gainesville Police Department’s website, and found a one-page investigation report on the burglary of an elderly woman’s home 10 months earlier. The crime had occurred less than a mile from the home that McCoy … shared with two others. Now McCoy was even more panicked and confused.”

After hearing of his plight, McCoy’s parents sprang for an attorney:

“The lawyer, Caleb Kenyon, dug around and learned that the notice had been prompted by a ‘geofence warrant,’ a police surveillance tool that casts a virtual dragnet over crime scenes, sweeping up Google location data — drawn from users’ GPS, Bluetooth, Wi-Fi and cellular connections — from everyone nearby. The warrants, which have increased dramatically in the past two years, can help police find potential suspects when they have no leads. They also scoop up data from people who have nothing to do with the crime, often without their knowing — which Google itself has described as ‘a significant incursion on privacy.’ Still confused — and very worried — McCoy examined his phone. An avid biker, he used an exercise-tracking app, RunKeeper, to record his rides.”

Aha! There was the source of the “suspicious” data—RunKeeper tapped into his Android phone’s location service and fed that information to Google. The records show that, on the day of the break-in, his exercise route had taken him past the victim’s house three times in an hour. Eventually, the lawyer was able to convince the police his client (still not unmasked by Google) was not the burglar. Perhaps ironically, it was RunKeeper data showing he had been biking past the victim’s house for months, not just proximate to the burglary, that removed suspicion.
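The mechanics behind that “three times in an hour” finding are not mysterious: count location pings inside a fence around the target address. A sketch of the core filter, using a crude square box (a real system would use proper geodesic distance, and all coordinates below are invented):

```python
def passes_near(route_pings, target, radius_deg=0.001):
    """Count pings falling inside a crude square box around a target coordinate."""
    tlat, tlon = target
    return sum(
        1
        for lat, lon in route_pings
        if abs(lat - tlat) <= radius_deg and abs(lon - tlon) <= radius_deg
    )

# Invented cyclist route: three pings near the target, two elsewhere.
route = [
    (29.6516, -82.3248),
    (29.6517, -82.3249),
    (29.6520, -82.3250),
    (29.6600, -82.3300),
    (29.6400, -82.3100),
]
passes = passes_near(route, (29.6516, -82.3248))
```

The same filter that flags a burglar also flags a cyclist who happens to ride past.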

Luck, and a good lawyer, were on McCoy’s side, but the larger civil rights issue looms large. Though such tracking data is anonymized until law enforcement finds something “suspicious,” this case illustrates how easy it can be to attract that attention. Do geofence warrants violate our protections against unreasonable searches? See the article for more discussion.

Cynthia Murrell, March 18, 2020

Math Resources

January 27, 2020

One of the DarkCyber team members spotted a list of available math resources. Some cost money; others are free. Math Vault lists courses, platforms, tools, and question-answering sites. Some are relatively mainstream, like Wolfram Alpha; others are less well publicized, like ProofWiki. You can find the listing at this link.

Kenny Toth, January 26, 2020

Quadratic Equations: A New Method

December 15, 2019

If you deal with quadratic equations, you will want to read “A New Way to Make Quadratic Equations Easy.” The procedure is straightforward and apparently has been overlooked, lost in time, or dismissed as out of step with current teaching methods. Worth a look, but my high school mathematics teacher Ms. Blackburn would not approve. She liked old school methods, including whacking teenage boys on the head with her wooden ruler.
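For the curious, the method described in the article (credited to mathematician Po-Shen Loh) finds the roots as their midpoint, -b/2a, plus or minus an offset, rather than via the memorized quadratic formula. A sketch for the real-root case:

```python
from math import sqrt

def quadratic_roots(a, b, c):
    """Solve ax^2 + bx + c = 0 by the midpoint-and-offset method.

    The two roots average to m = -b / (2a) and multiply to c / a,
    so the roots are m +/- u where u^2 = m^2 - c / a.
    """
    m = -b / (2 * a)
    u = sqrt(m * m - c / a)  # assumes real roots (non-negative discriminant)
    return (m - u, m + u)
```

For x^2 - 5x + 6, the midpoint is 2.5, the offset is 0.5, and the roots are 2 and 3.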

Stephen E Arnold, December 15. 2019

Calculus Made Almost Easy

December 2, 2019

Just a quick tip of the hat to 0a.io. You have to love that URL. Navigate to “Calculus Explained with Pics and Gifs.”

The site provides an overview of calculus. Pictures and animations make it easy to determine if one was sleeping in calculus class or paying attention.

The site went live with the information five years ago. One of the DarkCyber team spotted it and sent along the link. Worth a visit.

Stephen E Arnold, December 2, 2019

Can Machine Learning Pick Out The Bullies?

November 13, 2019

In Walt Disney’s 1942 classic Bambi, Thumper the rabbit was told, “If you can’t say something nice, don’t say nothing at all.”

Poor grammar aside, the thumping rabbit did deliver wise advice to the audience. Then came the Internet and anonymity, and the trolls were released upon the world. Internet bullying is one of the world’s top cyber crimes, along with identity and money theft. Passionate anti-bullying campaigners, particularly individuals who were cyber-bullying victims, want social media Web sites to police their users and prevent the abuse. Trying to police the Internet is like herding cats. It might be possible with the right type of fish, but cats are not herd animals and scatter once the tasty fish is gone.

Technology might have advanced enough to detect bullying, and AI could be the answer. Innovation Toronto wrote, “Machine Learning Algorithms Can Successfully Identify Bullies And Aggressors On Twitter With 90 Percent Accuracy.” AI’s biggest problem is that while algorithms can identify and harvest information, they lack the ability to understand emotion and context. Many bullying actions on the Internet are sarcastic or hidden within metaphors.

Computer scientist Jeremy Blackburn and his team from Binghamton University analyzed bullying behavior patterns on Twitter. They discovered useful information to understand the trolls:

“ ‘We built crawlers — programs that collect data from Twitter via variety of mechanisms,’ said Blackburn. ‘We gathered tweets of Twitter users, their profiles, as well as (social) network-related things, like who they follow and who follows them.’ ”

The researchers then performed natural language processing and sentiment analysis on the tweets themselves, as well as a variety of social network analyses on the connections between users. The researchers developed algorithms to automatically classify two specific types of offensive online behavior, i.e., cyber bullying and cyber aggression. The algorithms were able to identify abusive users on Twitter with 90 percent accuracy. These are users who engage in harassing behavior, e.g., those who send death threats or make racist remarks to other users.
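The article gives no code, but the general shape, features extracted from text and weighted by a trained classifier, can be caricatured in a few lines. The features, weights, and threshold below are invented and are not Blackburn's model:

```python
def aggression_score(tweet, weights):
    """Toy linear classifier: sum the weights of features present in the text."""
    text = tweet.lower()
    return sum(w for feature, w in weights.items() if feature in text)

# Invented feature weights; a real model learns these from labeled examples.
toy_weights = {"idiot": 0.6, "hate": 0.5, "thanks": -0.4}

def is_abusive(tweet, threshold=0.5):
    """Flag a tweet when its aggression score clears the decision threshold."""
    return aggression_score(tweet, toy_weights) >= threshold
```

Real systems add network features (followers, reply patterns) on top of the text score, which is where much of the reported accuracy comes from.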

“‘In a nutshell, the algorithms ‘learn’ how to tell the difference between bullies and typical users by weighing certain features as they are shown more examples,’ said Blackburn.”

Blackburn and his team’s algorithm only detects aggressive behavior; it does not do anything to prevent cyber bullying. The victims still see and are harmed by the comments and bullying users, but it does give Twitter a heads up on removing the trolls.

The anti-bullying algorithm flags bullying only after there are victims. It does little to assist those victims, but it does help prevent future attacks. What steps need to be taken to prevent bullying altogether? Maybe schools need to teach classes on Internet etiquette alongside the Common Core; then again, if it is not on the test, it will not be in a classroom.

Whitney Grace, November 13, 2019

Tech Backlash: Not Even Apple and Goldman Sachs Exempt

November 11, 2019

Times are indeed interesting. Two powerful outfits—Apple (the privacy outfit with a thing for Chinese food) and Goldman Sachs (the we-make-money-every way possible organization) are the subject of “Viral Tweet about Apple Card Leads to Goldman Sachs Probe.” The would-be president’s news machine stated, “Tech entrepreneur alleged inherent bias in algorithms for card.” The card, of course, is the Apple-Goldman revenue-generating credit card. Navigate to the Bloomberg story. Get the scoop.

On the other hand, just look at one of the dozens and dozens of bloggers commenting about this bias, algorithm, big name story. Even more intriguing is that the aggrieved tweeter’s wife had her credit score magically changed. Remarkable how smart algorithms work.

DarkCyber does not want to retread truck tires. We do have three observations:

  1. The algorithm part may be more important than the bias angle. The reason is that algorithms embody bias, and now non-technical and non-financial people are going to start asking questions: Superficial at first and then increasingly on point. Not good for algorithms when humans obviously can fiddle the outputs.
  2. Two usually untouchable companies are now in the spotlight for subjective, touchy feely things with which neither company is particularly associated. This may lead to some interesting information about what’s up in the clubby world of the richest companies in the world. Discrimination maybe? Carelessness? Indifference? Greed? We have to wait and listen.
  3. Even those who may have worked at these firms and who now may be in positions of considerable influence may find themselves between a squash wall and sweaty guests who aren’t happy about an intentional obstruction. Those corporate halls which are often tomb-quiet may resound with stressed voices. “Apple” carts which allegedly sell to anyone may be upset. Cleaning up after the spill may drag the double’s partners from two exclusive companies into a task similar to cleaning sea birds after the gulf oil spill.
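The first observation is easy to illustrate with a toy. A model with no protected attribute as an input can still discriminate if a correlated proxy variable sneaks in. Everything below is invented; no resemblance to the actual Apple Card model is claimed:

```python
def toy_credit_limit(income, category_share):
    """Invented scoring rule: the penalty term rides on a spending category
    that (in this fictional data) correlates with gender -- a proxy variable."""
    return max(0.0, income * 0.2 - category_share * 5000)

# Same household income, different limits: the model never saw 'gender',
# yet its outputs split along the proxy.
limit_a = toy_credit_limit(100_000, 0.1)
limit_b = toy_credit_limit(100_000, 0.9)
```

This is why "the algorithm has no gender input" is not, by itself, a defense.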

Will this issue get news traction? Will it become a lawyer-powered railroad handcar creeping down the line?

Fascinating stuff.

Stephen E Arnold, November 11, 2019

Visual Data Exploration via Natural Language

November 4, 2019

New York University announced a natural language interface for data visualization. You can read the rah rah from the university here. The main idea is that a person can use simple English to create complex machine learning based visualizations. Sounds like the answer to a Wall Street analyst’s prayers.

The university reported:

A team at the NYU Tandon School of Engineering’s Visualization and Data Analytics (VIDA) lab, led by Claudio Silva, professor in the department of computer science and engineering, developed a framework called VisFlow, by which those who may not be experts in machine learning can create highly flexible data visualizations from almost any data. Furthermore, the team made it easier and more intuitive to edit these models by developing an extension of VisFlow called FlowSense, which allows users to synthesize data exploration pipelines through a natural language interface.

You can download (as of November 3, 2019, but no promises the document will be online after this date) “FlowSense: A Natural Language Interface for Visual Data Exploration within a Dataflow System.”
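FlowSense itself uses a semantic parser over a dataflow system. As a caricature of the idea, a few lines can map a restricted English pattern to a chart specification; the grammar and the spec format here are invented, not NYU's:

```python
import re

def parse_viz_command(command):
    """Map 'show <measure> by <dimension>' to a minimal chart specification."""
    match = re.fullmatch(r"show (\w+) by (\w+)", command.strip().lower())
    if match is None:
        raise ValueError("unsupported command: " + command)
    measure, dimension = match.groups()
    return {"mark": "bar", "y": measure, "x": dimension}
```

The research systems replace the regular expression with a grammar covering far more of English, but the output, a machine-readable visualization spec, is the same idea.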

DarkCyber wants to point out that talking to a computer to get information continues to be of interest to many researchers. Will this innovation put human analysts out of their jobs?

Maybe not tomorrow but in the future. Absolutely. And what will those newly-unemployed people do for money?

Interesting question and one some may find difficult to consider at this time.

Stephen E Arnold, November 4, 2019

 

Bias: Female Digital Assistant Voices

October 17, 2019

It was a seemingly benign choice based on consumer research, but there is an unforeseen complication. TechRadar considers, “The Problem with Alexa: What’s the Solution to Sexist Voice Assistants?” From smart speakers to cell phones, voice assistants like Amazon’s Alexa, Microsoft’s Cortana, Google’s Assistant, and Apple’s Siri generally default to female voices (and usually sport female-sounding names) because studies show humans tend to respond best to female voices. Seems like an obvious choice—until you consider the long-term consequences. Reporter Olivia Tambini cites a report UNESCO issued earlier this year that suggests the practice sets us up to perpetuate sexist attitudes toward women, particularly subconscious biases. She writes:

“This progress [society has made toward more respect and agency for women] could potentially be undone by the proliferation of female voice assistants, according to UNESCO. Its report claims that the default use of female-sounding voice assistants sends a signal to users that women are ‘obliging, docile and eager-to-please helpers, available at the touch of a button or with a blunt voice command like “hey” or “OK”.’ It’s also worrying that these voice assistants have ‘no power of agency beyond what the commander asks of it’ and respond to queries ‘regardless of [the user’s] tone or hostility’. These may be desirable traits in an AI voice assistant, but what if the way we talk to Alexa and Siri ends up influencing the way we talk to women in our everyday lives? One of UNESCO’s main criticisms of companies like Amazon, Google, Apple and Microsoft is that the docile nature of our voice assistants has the unintended effect of reinforcing ‘commonly held gender biases that women are subservient and tolerant of poor treatment’. This subservience is particularly worrying when these female-sounding voice assistants give ‘deflecting, lackluster or apologetic responses to verbal sexual harassment’.”

So what is a voice-assistant maker to do? Certainly, male voices could be used and are, in fact, selectable options for several models. Another idea is to give users a wide variety of voices to choose from—not just different genders, but different accents and ages, as well. Perhaps the most effective solution would be to use a gender-neutral voice; one dubbed “Q” has now been created, proving it is possible. (You can listen to Q through the article or on YouTube.)

Of course, this and other problems might have been avoided had there been more diversity on the teams behind the voices. Tambini notes that just seven percent of information- and communication-tech patents across G20 countries are generated by women. As more women move into STEM fields, will unintended gender bias shrink as a natural result?

Cynthia Murrell, October 17, 2019
