The Case for Algorithmic Equity

September 20, 2016

We know that AI algorithms are skewed by the biases of both their creators and, depending on the application, their users. Social activist Cathy O’Neil addresses the broad consequences to society in her book, Weapons of Math Destruction. Time covers her views in its article, “This Mathematician Says Big Data is Causing a ‘Silent Financial Crisis’.” O’Neil studied mathematics at Harvard, worked in quantitative trading at a hedge fund, and launched a targeted-advertising startup. It is fair to say she knows what she is talking about.

More and more businesses and organizations rely on algorithms to make decisions that have big impacts on people’s lives: choices about employment, financial matters, scholarship awards, and where to deploy police officers, for example. Yet, the processes are shrouded in secrecy, and lawmakers are nowhere close to being on top of the issue. There is currently no way to ensure these decisions are anything approaching fair. In fact, the algorithms can create a sort of feedback loop of disadvantage. Reporter Rana Foroohar writes:

Using her deep technical understanding of modeling, she shows how the algorithms used to, say, rank teacher performance are based on exactly the sort of shallow and volatile type of data sets that informed those faulty mortgage models in the run up to 2008. Her work makes particularly disturbing points about how being on the wrong side of an algorithmic decision can snowball in incredibly destructive ways—a young black man, for example, who lives in an area targeted by crime fighting algorithms that add more police to his neighborhood because of higher violent crime rates will necessarily be more likely to be targeted for any petty violation, which adds to a digital profile that could subsequently limit his credit, his job prospects, and so on. Yet neighborhoods more likely to commit white collar crime aren’t targeted in this way.

Yes, unsurprisingly, it is the underprivileged who bear the brunt of algorithmic aftermath; the above is just one example. The write-up continues:

Indeed, O’Neil writes that WMDs [Weapons of Math Destruction] punish the poor especially, since ‘they are engineered to evaluate large numbers of people. They specialize in bulk. They are cheap. That’s part of their appeal.’ Whereas the poor engage more with faceless educators and employers, ‘the wealthy, by contrast, often benefit from personal input. A white-shoe law firm or an exclusive prep school will lean far more on recommendations and face-to-face interviews than a fast-food chain or a cash-strapped urban school district. The privileged… are processed more by people, the masses by machines.’

So, algorithms add to the disparity between how the wealthy and the poor experience life. Compounding the problem, algorithms also allow the wealthy to isolate themselves online as well as in real life, through curated news and advertising that make it ever easier to deny that poverty is even a problem. See the article for its more thorough discussion.

What does O’Neil suggest we do about this? First, she proposes a “Hippocratic Oath for mathematicians.” She also joins the calls for much more thorough regulation of the AI field and for updating existing civil-rights laws to include algorithm-based decisions. Such measures will require the cooperation of legislators, who, as a group, are hardly known for their technical understanding. It is up to those of us who do comprehend the issues to inform them that action must be taken. Sooner rather than later, please.

Cynthia Murrell, September 20, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/

 

Hundreds of Thousands of Patient Records Offered up on the Dark Web

September 19, 2016

Some of us suspected this was coming, despite many assurances to the contrary. Softpedia informs us, “Hacker Selling 651,894 Patient Records on the Dark Web.” Haughtily going by the handle TheDarkOverlord, the hacker responsible is looking to make over seven hundred grand off the data. Reporter Catalin Cimpanu writes:

The hacker is selling the data on The Real Deal marketplace, and he [or she] says he breached these companies using an RDP (Remote Desktop Protocol) bug. TheDarkOverlord has told DeepDotWeb, who first spotted the ads, that it’s ‘a very particular bug. The conditions have to be very precise for it.’ He has also provided a series of screenshots as proof, showing him accessing the hacked systems via a Remote Desktop connection. The hacker also recalls that, before putting the data on the Dark Web, he contacted the companies and informed them of their problems, offering to disclose the bug for a price, in a tactic known as bug poaching. Obviously, all three companies declined, so here we are, with their data available on the Dark Web. TheDarkOverlord says that all databases are a one-time sale, meaning only one buyer can get their hands on the stolen data.

The three databases contain information on patients in Farmington, Missouri; Atlanta, Georgia; and the Central and Midwest areas of the U.S. TheDarkOverlord asserts that the data includes details like contact information, Social Security numbers, and personal facts like gender and race. The collection does not, apparently, include medical history. I suppose that is a relief, for now.

Cynthia Murrell, September 19, 2016

 

Enterprise Technology Perspective on Preventing Security Breaches

September 16, 2016

When it comes to the Dark Web, enterprises want solutions that prevent security breaches. Fortscale released an article, Dark Web — Tor Use is 50% Criminal Activity — How to Detect It, speaking to this audience. The write-up explains that the name of the anonymizer Tor stands for The Onion Router, a reference to the multiple layers used to hide an IP address and therefore the user’s identity. How does security software detect Tor users? We learned:

There are a couple of ways security software can determine if a user is connecting via the Tor network. The first way is through their IP address. The list of Tor relays is public, so you can check whether the user is coming from a known Tor relay. It’s actually a little bit trickier than that, but a quality security package should be able to alert you if user behaviors include connecting via a Tor network. The second way is by looking at various application-level characteristics. For example, a good security system can distinguish the differences between a standard browser and a Tor Browser because among other things, Tor software won’t respond to certain history requests or JavaScript queries.

Many cybersecurity software companies offer solutions that monitor the Dark Web for sensitive data, which is more of a recovery strategy. However, this article highlights the importance of cybersecurity solutions that monitor enterprise system usage to identify users connecting through Tor. While this appears to be a sound strategy for understanding the frequency of Tor-based users, it will be important to know whether these data-producing software solutions facilitate action, such as removing Tor users from the network.
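
The first method in the quoted passage, checking an address against the public relay list, can be sketched roughly as follows. This is a minimal illustration and not Fortscale's implementation. It assumes the relay addresses have already been downloaded into a list of text lines. The function names are hypothetical, and the sample addresses are documentation-range placeholders, not real Tor relays. As the quote warns, real detection is trickier than a simple lookup.

```python
import ipaddress


def load_relay_set(lines):
    """Parse one IP address per line, skipping blanks and '#' comments."""
    relays = set()
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        relays.add(ipaddress.ip_address(text))
    return relays


def is_known_relay(client_ip, relays):
    """Return True if the client address appears in the relay set."""
    return ipaddress.ip_address(client_ip) in relays


# Placeholder data (assumption): stands in for a downloaded relay list.
SAMPLE_RELAY_LINES = [
    "# hypothetical relay list",
    "192.0.2.10",
    "198.51.100.7",
    "2001:db8::1",
]

relays = load_relay_set(SAMPLE_RELAY_LINES)
flagged = is_known_relay("192.0.2.10", relays)
```

Parsing addresses with the standard `ipaddress` module, instead of comparing raw strings, means that differently written forms of the same IPv6 address still match.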

Megan Feil, September 16, 2016

Automated Tools for Dark Web Data Tracking

September 15, 2016

Naturally, tracking stolen data through the dark web is a challenge. Investigators have traditionally infiltrated chatrooms and forums in the effort, a tedious procedure with no guarantee of success. Now, automated tools may give organizations a leg up, we learn from the article, “Tools to Track Stolen Data Through the Dark Web” at GCN. Reporter Mark Pomerleau informs us:

“The Department of Veterans Affairs last month said it was seeking software that can search the dark web for exploited VA data improperly outside its control, distinguish between VA data and other data and create a ‘one-way encrypted hash’ of VA data to ensure that other parties cannot ascertain or use it. The software would also use VA’s encrypted data hash to search the dark web for VA content.”

We learned:

Some companies, such as Terbium Labs, have developed similar hashing technologies.  ‘It’s not code that’s embedded in the data so much as a computation done on the data itself,’ Danny Rogers, a Terbium Labs co-founder, told Defense One regarding its cryptographic hashing.  This capability essentially enables a company or agency to recognize its stolen data if discovered. Bitglass, a cloud access security broker, uses watermarking technology to track stolen data.  A digital watermark or encryption algorithm is applied to files such as spreadsheets, Word documents or PDFs that requires users to go through an authentication process in order to access it.
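
The hashing idea Rogers describes, a computation done on the data itself, might look something like the sketch below. It is not Terbium Labs' or the VA's actual method. The use of SHA-256, the whitespace and case normalization, the function names, and the sample records are all assumptions made for illustration. The point is that the searcher holds only fingerprints, not the sensitive records.

```python
import hashlib


def fingerprint(record: str) -> str:
    """One-way SHA-256 digest of a record after simple normalization."""
    normalized = " ".join(record.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(records):
    """Store only digests so the index does not expose the records."""
    return {fingerprint(record) for record in records}


def find_matches(index, candidate_lines):
    """Return positions of candidate lines whose digest is in the index."""
    return [
        position
        for position, line in enumerate(candidate_lines)
        if fingerprint(line) in index
    ]


# Made-up records (assumption): stand-ins for an organization's data.
owned_records = ["Jane Example, 000-00-0000", "John Sample, 000-00-0001"]
index = build_index(owned_records)

# Made-up text found elsewhere, with different case and spacing.
found_dump = ["unrelated row", "jane  example, 000-00-0000", "another row"]
matches = find_matches(index, found_dump)
```

A plain hash of low-entropy data such as Social Security numbers can be guessed by brute force, so a production system would more likely use a keyed hash (for example, HMAC) with a secret key.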

We’re told such watermarks can even thwart hackers trying to copy-and-paste into a new document, and that Bitglass tests its tech by leaking and following false data onto the dark web. Pomerleau notes that regulations can make it difficult to implement commercial solutions within a government agency. However, government personnel are very motivated to find solutions that will allow them to work securely outside the office.

The article wraps up with a mention of DARPA’s Memex search engine, designed to plumb the even-more-extensive deep web. Law enforcement is currently using Memex, but the software is expected to eventually make it to the commercial market.

Cynthia Murrell, September 15, 2016

Mobile Data May Help Fight Disease

September 14, 2016

Data from smartphones and other mobile devices may give us a new tool in the fight against communicable diseases. Penn State News reports, “Walking and Talking Behaviors May Help Predict Epidemics and Trends.” A recent study, completed by an impressive roster of academics at several institutions, reveals a strong connection between our movements and our communications. So strong, in fact, that a dataset on one can pretty accurately predict the other. The article cites one participant, researcher Dashun Wang of Penn State:

[Wang] added that because movement and communication are connected, researchers may only need one type of data to make predictions about the other phenomenon. For instance, communication data could reveal information about how people move. …

The equation could better forecast, among other things, how a virus might spread, according to the researchers, who report their findings today (June 6) in the Proceedings of the National Academy of Sciences. In the study, they tested the equation on a simulated epidemic and found that either location or communication datasets could be used to reliably predict the movement of the disease.

Perhaps not as dramatic but still useful, the same process could be used to predict the spread of trends and ideas. The research was performed on three databases full of messages from users in Portugal and another (mysteriously unidentified) country and on four years of Rwandan mobile-phone data. These data sets document who contacted whom, when, and where.

Containing epidemics is a vital cause, and the potential to boost its success is worth celebrating. However, let us take note of who is funding this study: The U.S. Army Research Laboratory, the Office of Naval Research, the Defense Threat Reduction Agency and the James S. McDonnell Foundation’s program, Studying Complex Systems. Note the first three organizations in the list; it will be interesting to learn what other capabilities derive from this research (once they are unclassified, of course).

Cynthia Murrell, September 14, 2016

Toshiba Amps up Vector Indexing and Overall Data Matching Technology

September 13, 2016

The article on MyNewsDesk titled Toshiba’s Ultra-Fast Data Matching Technology is 50 Times Faster than its Predecessors relates the bold claims swirling around Toshiba and their Vector Indexing Technology. By skipping the step involving computation of the distance between vectors, Toshiba has slashed the time it takes to identify vectors (they claim). The article states,

Toshiba initially intends to apply the technology in three areas: pattern mining, media recognition and big data analysis. For example, pattern mining would allow a particular person to be identified almost instantly among a large set of images taken by surveillance cameras, while media recognition could be used to protect soft targets, such as airports and railway stations, by automatically identifying persons wanted by the authorities.

In sum, Toshiba technology is able to quickly and accurately recognize faces in the crowd. But the specifics are much more interesting. Current technology takes around 20 seconds to identify an individual out of 10 million, and Toshiba can do it in under a second. The precision rate that Toshiba reports is also outstanding at 98%. The world of Minority Report, where ads recognize and direct themselves to random individuals, seems to be increasingly within reach. Perhaps more importantly, this technology should be of dire importance to the criminal and perceived criminal populations of the world.

Chelsea Kerwin, September 13, 2016

How Collaboration and Experimentation Are Key to Advancing Machine Learning Technology

September 12, 2016

The article on CIO titled Machine Learning “Still a Cottage Industry” conveys the sentiments of a man at the heart of the industry in Australia, Professor Bob Williamson. Williamson is chief scientist of the Data61 group at the Commonwealth Scientific and Industrial Research Organisation (CSIRO). His work in machine learning and data analytics led him to the conclusion that for machine learning to truly move forward, scientists must find a way to collaborate. He is quoted in the article:

“There’s these walled gardens: ‘I’ve gone and coded my models in a particular way, you’ve got your models coded in a different way, we can’t share’. This is a real challenge for the community. No one’s cracked this yet.” A number of start-ups have entered the “machine-learning-as-a-service” market, such as BigML, Wise.io and Precog, and the big names including IBM, Microsoft and Amazon haven’t been far behind. Though these MLaaSs herald some impressive results, Williamson warned businesses to be cautious.

Williamson speaks to the possibility of stagnation in machine learning due to the emphasis on data mining as opposed to experimenting. He hopes businesses will do more with their data than simply look for patterns. It is a refreshing take on the industry from an outsider/insider, a scientist more interested in the science of it all than the massive stacks of cash at stake.

Chelsea Kerwin, September 12, 2016


Revenue Takes a Backseat to Patent Filings at IBM

September 9, 2016

The post on Slashdot titled IBM Has Been Awarded an Average of 24 Patents Per Day So Far in 2016 compares the patent development emphasis of major companies, with IBM coming out on top with 3,617 patent awards so far in 2016, according to a Quartz report. Patents are the byproduct of IBM’s focus on scientific research, as the report finds:

The company is in the middle of a painful reinvention, that sees the company shifting further away from hardware sales into cloud computing, analytics, and AI services. It’s also plugging away on a myriad of fundamental scientific research projects — many of which could revolutionize the world if they can come to fruition — which is where many of its patent applications originate. IBM accounted for about 1% of all US patents awarded in 2015.

Samsung claimed a close second (with just over 3,000 patents), and on the next rung down sits Google (with roughly 1,500 patents for the same period), followed by Intel, Qualcomm, Microsoft, and Apple. Keep in mind, though, that IBM and Samsung have each been awarded about twice as many patents as Google and the others, which makes IBM in particular an unstoppable patent machine. You may well ask, what about revenue? They will get back to you on that score later.

Chelsea Kerwin, September 9, 2016

Is Google Biotech Team Overreaching?

September 9, 2016

Science reality is often inspired by science fiction, and Google’s biotech research division, Verily Life Sciences, is no exception. Business Insider posts, “‘Silicon Valley Arrogance’? Google Misfires as It Strives to Turn Star Trek Fiction Into Reality.” The “Star Trek” reference points to Verily’s Tricorder project, announced three years ago, which set out to create a cancer-early-detection device. Sadly, that hopeful venture may be sputtering out. STAT reporter Charles Piller writes:

Recently departed employees said the prototype didn’t work as hoped, and the Tricorder project is floundering. Tricorder is not the only misfire for Google’s ambitious and extravagantly funded biotech venture, now named Verily Life Sciences. It has announced three signature projects meant to transform medicine, and a STAT examination found that all of them are plagued by serious, if not fatal, scientific shortcomings, even as Verily has vigorously promoted their promise.

Piller cites two projects, besides the Tricorder, that underwhelm. We’re told that independent experts are dubious about the development of a smart contact lens that can detect glucose levels for diabetics. Then there is the very expensive Baseline study—an attempt to define what it means to be healthy and to catch diseases earlier—which critics call “lofty” and “far-fetched.” Not surprisingly, Google being Google, there are also some privacy concerns being raised about the data being collected to feed the study.

There are several criticisms and specific examples in the lengthy article, and interested readers should check it out. There seems to be one central notion, though: that Verily Life Sciences is attempting to approach the human body like a computer when medicine is much, much more complicated than that. The impressive roster of medical researchers on the team seems to provide little solace to critics. The write-up relates:

It’s axiomatic in Silicon Valley’s tech companies that if the math and the coding can be done, the product can be made. But seven former Verily employees said the company’s leadership often seems not to grasp the reality that biology can be more complex and less predictable than computers. They said Conrad, who has a PhD in anatomy and cell biology, applies the confident impatience of computer engineering, along with extravagant hype, to biotech ideas that demand rigorous peer review and years or decades of painstaking work.

Are former employees the most objective source? I suspect ex-Googlers and third-party scientists are underestimating Google. The company has a history of reaching the moon by shooting for the stars, and of enduring a few failures as the price of success. I would not be surprised to see Google emerge on top of the biotech field. (As sci fi fans know, biotech is the medicine of the future. You’ll see.) The real question is how the company will treat privacy, data rights, and patient safety along the way.

Cynthia Murrell, September 9, 2016

Palantir: More Legal Excitement

September 6, 2016

One of the Beyond Search goslings directed my attention to a legal document “Palantir Technologies Inc. (“Palantir”) Sues Defendants Marc L. Abramowitz…” The 20-page complaint asserts that a Palantir investor sucked in proprietary information and then used that information outside the boundaries of Sillycon Valley norms of behavior. These norms apply to the one percent of the one percent in my opinion.

The legal “complaint” points to several patent documents which embodied Palantir’s proprietary information. To locate the documents, one must use the Justia system; specifically, Provisional Application No. 62/072,36, Provisional Application No. 62/066,716, and Provisional Application No. 62/094,888. These provisional applications, I concluded, reveal that Palantir seeks to enter insurance and health care type markets. This information appears to put Palantir Technologies at a competitive disadvantage.

Who is the individual named in the complaint?

Marc Abramowitz, who is associated with an outfit named KT4. KT4 does not have much of an online presence. The sparse information available to me about Abramowitz is that he is a Harvard-trained lawyer and connected to Stanford’s Hoover econo-think unit. Abramowitz’s link to Palantir is that he invested in the company and made visits to the Hobbits’ Palo Alto “shire” a part of his work routine.

Despite the legalese, the annoyance of Palantir with Abramowitz seeps through the sentences.

For me what is interesting is that IBM i2 asserted several years ago that Palantir Technologies improperly tapped into proprietary methods used in the Analyst’s Notebook software product and system. See “i2 and Palantir: Resolved Quietly.”

One new twist is that the Palantir complaint against Abramowitz includes a reference to Abramowitz’s appropriation of the word “Shire.” If you are not in the know in Sillycon Valley, Palantir has referenced its offices as the shire; that is, the firm’s office in Palo Alto.

When I read the document, I did not spot a reference to Hobbits or seeing stones.

When I checked this morning (September 6, 2016), the document was still publicly accessible at the link above. However, Palantir’s complaint about the US Army’s procurement system was sealed shortly after it was filed. This Abramowitz complaint may go away for some folks as well. If you can’t locate the Abramowitz document, you will have to up your legal research game. My hunch is that neither Palantir nor Mr. Abramowitz will respond to your request for a copy.

There are several hypothetical, Tolkienesque cyclones from this dust up between an investor and the Palantir outfit, which is alleged to be a mythical unicorn:

  1. Trust seems to need a more precise definition when dealing with either Palantir or Abramowitz.
  2. Some folks use Tolkien’s jargon and don’t want anyone else to “horn” in on this appropriation.
  3. Filing patents on relatively narrow “new” concepts when one does not have a software engineering track record goes against the accepted norms of innovation.
  4. IBM i2’s team may await the trajectory of this Abramowitz matter more attentively than the next IBM Watson marketing innovation.

Worth monitoring just for the irony molecules in this Palantir complaint. WWTT or What would Tolkien think? Perhaps a quick check of the seeing stone is appropriate.

Stephen E Arnold, September 6, 2016
