Another Robot Finds a Library Home
August 23, 2016
Job automation has its benefits and downsides. Among the benefits: it frees workers to take on other tasks, cuts costs, improves efficiency, and speeds turnaround. The downsides are that it can eliminate jobs and strip the human factor out of customer service. When it comes to libraries, automation and books/research appear to be the antithesis of each other. Automation, better known as robots, is invading libraries once again, and people are up in arms that librarians are going to be replaced.
ArchImag.com shares the story “Robot Librarians Invade Libraries In Singapore” about how the A*Star Research library uses a robot to shelf read. If you are unfamiliar with library lingo, shelf reading means scanning the shelves to make sure all the books are in their proper order. The shelf reading robot has been dubbed AuRoSS. During the night AuRoSS scans books’ RFID tags, then generates a report about misplaced items. Humans are still needed to put materials back in order.
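To make the shelf-reading chore concrete, here is a minimal sketch, in Python, of how a nightly RFID scan could be turned into a misplaced-item report. This is not AuRoSS’s actual software, whose internals are not described; the idea is simply to read tags in shelf order and flag any book that appears after one with a higher call-number rank.

```python
# Minimal sketch (not AuRoSS's actual code): flag books whose scanned shelf
# position does not match their expected call-number order.

def find_misplaced(scanned_tags, catalog_order):
    """scanned_tags: RFID tags in the order the robot read them off the shelf.
    catalog_order: dict mapping tag -> expected shelf rank (call-number order)."""
    misplaced = []
    highest_rank_seen = float("-inf")
    for tag in scanned_tags:
        rank = catalog_order[tag]
        if rank < highest_rank_seen:
            misplaced.append(tag)   # this book sits after one that should follow it
        else:
            highest_rank_seen = rank
    return misplaced

# Example: book "B" belongs between "A" and "C" but was reshelved at the end.
catalog = {"A": 1, "B": 2, "C": 3, "D": 4}
print(find_misplaced(["A", "C", "D", "B"], catalog))  # ['B']
```

A human still has to walk the stacks and put “B” back where it belongs, which is the article’s point.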
The fear, however, is that robots can fulfill the same role as a librarian. Attach a few robotic arms to AuRoSS and it could place the books in the proper places by itself. There already is a robot named Hugh answering reference questions:
New technologies thus seem to be storming libraries. Recall that Hugh, one of the first librarian robots, is set to officially take up his post at the university library in Aberystwyth, Wales, at the beginning of September 2016. Designed to respond to students’ spoken requests, he can tell them where a desired book is stored or show them which shelf holds books on the topic that interests them.
It is going to happen. Robots are going to take over the tasks of some current jobs. Professional research and public libraries, however, will still need someone to teach people the proper way to use materials and find resources. It is not as easy as one would think.
Whitney Grace, August 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden/Dark Web meet up on August 23, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233019199/
Another Day Another Possible Data Breach
August 19, 2016
Has the next Ashley Madison incident happened? International Business Times reports on breached information that has surfaced on the Dark Web. The article, “Fling.com breach: Passwords and sexual preferences of 40 million users up for sale on dark web,” sheds some light on the alleged 40 million records posted on The Real Deal marketplace. One source claims the leaked data is old information. Another source reports a victim who says they never had an account with Fling.com. The article states,
“The leak is the latest in a long line of dating websites being targeted by hackers and follows similar incidents at Ashley Madison, Mate1, BeautifulPeople and Adult Friend Finder. In each of these cases, hundreds of thousands – if not millions – of sensitive records were compromised. While in the case of Ashley Madison alone, the release of information had severe consequences – including blackmail attempts, high-profile resignations, and even suicide. Despite claims the data is five years old, any users of Fling.com are now advised to change their passwords in order to stay safe from future account exploitation.”
Many are asking about the facts of this data breach on the Dark Web: when it happened and whether the records are accurate. We’re not sure if it’s true, but it is sensational. The interesting aspect of this story is the terms of service for Fling.com. The article reveals that those terms release Fling.com from any liability related to users’ information.
Megan Feil, August 19, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden/Dark Web meet up on August 23, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233019199/
USAGov Wants More Followers on Snapchat
August 12, 2016
The article on GCN titled “Tracking the Ephemeral: USAGov’s Plan for Snapchat” portrays the somewhat desperate attempts of the government to reach out to millennials. Perhaps shocking to non-users of the self-immolating picture app, Snapchat claims over a hundred million active users each day, mostly 13-to-34-year-olds. USAGov, a General Services Administration program, plans to use Snapchat and to study the success of its outreach, such as how many followers it gains and how many views its content gets. The article mentions,
“And while the videos and multimedia that make up “Snapchat stories” disappear after just 24 hours, the USAGov team believes the engagement metrics will provide lasting value. Snapchat lets account owners see how many people are watching each story, if they watch the whole story and when and where they stop before it’s over — allowing USAGov to analyze what kind of content works best.”
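For the curious, here is a hypothetical sketch of the sort of engagement arithmetic the USAGov team could run on a story’s per-snap view counts. The article does not show Snapchat’s actual analytics, so the numbers and field names below are invented.

```python
# Hypothetical sketch: completion rate and biggest drop-off point for a
# multi-snap story, given view counts per snap in story order.

def story_dropoff(views_per_snap):
    starts, finishes = views_per_snap[0], views_per_snap[-1]
    completion_rate = finishes / starts if starts else 0.0
    # Largest audience loss between consecutive snaps.
    drops = [(i + 1, views_per_snap[i] - views_per_snap[i + 1])
             for i in range(len(views_per_snap) - 1)]
    worst_snap, worst_loss = max(drops, key=lambda d: d[1])
    return completion_rate, worst_snap, worst_loss

rate, snap, lost = story_dropoff([10000, 8200, 7900, 4100, 3900])
print(f"{rate:.0%} finished; biggest drop after snap {snap}: {lost} viewers lost")
```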
If you are wondering how this plan squares with the Federal Records Act, which stipulates documentation of content, GSA is way ahead of you with a strategy of downloading each story and saving it as a record. All in all, the government comes across as a somewhat clingy boyfriend trying to find out what his ex is up to by using her favorite social media outlet. Not a great look for the US government. But at least they aren’t using ChatRoulette.
Chelsea Kerwin, August 12, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden/Dark Web meet up on August 23, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233019199/
Mixpanel Essay Contains Several Smart Software Gems
August 11, 2016
I read “The Hard Thing about Machine Learning.” The essay explains the history of machine learning at Mixpanel. Mixpanel is a business analytics company. Embedded in the write up are several observations which I thought warranted highlighting.
The first point is the blunt reminder that machine learning requires humans—typically humans with specialist skills—to make smart software work as expected. The humans have to figure out what problem they and the numerical recipes are supposed to solve. Mixpanel says:
machine learning isn’t some sentient robot that does this all on its own. Behind every good machine learning model is a team of engineers that took a long thoughtful look at the problem and crafted the right model that will get better at solving the problem the more it encounters it. And finding that problem and crafting the right model is what makes machine learning really hard.
The second pink circle in my copy of the essay corralled this observation:
The broader the problem, the more universal the model needs to be. But the more universal the model, the less accurate it is for each particular instance. The hard part of machine learning is thinking about a problem critically, crafting a model to solve the problem, finding how that model breaks, and then updating it to work better. A universal model can’t do that.
I think this means that machine learning works on quite narrow, generally tidy problems. Anyone who has worked with the mid-1990s Autonomy IDOL system knows that as new content flows into a properly trained system, that “properly trained” system can start to produce imprecise and off-point outputs. The fix is to retrain the system on a properly narrowed data set. Skip that step, and users scratch their heads because they cannot figure out how a query about computer terminals generated outputs about railroad costs. The culprits are the word “terminal” and the increasingly diverse content flowing into the system.
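A toy illustration of that drift, which assumes nothing about IDOL’s internals: a scorer whose weights were learned when “terminal” only ever meant computing will happily pull railroad documents into a computing query once that content starts flowing in.

```python
# Toy example (not IDOL): weights learned from an all-IT corpus treat
# "terminal" as a computing signal, so a railroad document scores as relevant
# for a computing query once such content enters the collection.

trained_weights = {"terminal": 2.0, "keyboard": 1.5, "mainframe": 1.5}

def score(doc, weights):
    return sum(weights.get(word, 0.0) for word in doc.lower().split())

docs = {
    "it_doc":   "replacing the keyboard on a vt100 terminal",
    "rail_doc": "freight terminal construction costs for the railroad",
}
for name, text in docs.items():
    print(name, score(text, trained_weights))
# Both documents score on "terminal"; retraining on a narrowed, disambiguated
# corpus is what keeps the computing query on point.
```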
The third point received a check mark from this intrepid reader:
Correlation does not imply causation.
Interesting. I think one of my college professors in 1962 made a similar statement. Pricing for Mixpanel begins at $600 per month for four million data points.
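For anyone who skipped that 1962 lecture, a small numerical illustration, which has nothing to do with Mixpanel’s product: two quantities driven by a shared confounder correlate strongly even though neither causes the other.

```python
# Ice-cream sales and sunburn counts both track temperature, so they correlate
# strongly even though neither causes the other. Requires Python 3.10+ for
# statistics.correlation.
import random
import statistics

random.seed(0)
temps = [random.uniform(10, 35) for _ in range(200)]          # the confounder
ice_cream = [3.0 * t + random.gauss(0, 5) for t in temps]
sunburns = [0.8 * t + random.gauss(0, 3) for t in temps]

r = statistics.correlation(ice_cream, sunburns)
print(f"correlation: {r:.2f}")  # high, yet banning ice cream stops no sunburns
```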
Stephen E Arnold, August 11, 2016
Facebook Algorithms: Doing What Users Expect, Maybe
August 9, 2016
I read an AOL-Yahoo post titled “Inside Facebook Algorithms.” With the excitement of algorithms tingeing the air, explanations of smart software make the day so much better.
I learned:
if you understand the rules, you can play them by doing the same thing over and over again
Good point. But how many Facebook users are sufficiently attentive to correlate a particular action with an outcome which may not be visible to the user?
Censorship confusing? It doesn’t need to be. I learned:
Mr. Abbasi [a person whose Facebook post was censored] used several words which would likely flag his post as hate speech, which is against Facebook’s community guidelines. It is also possible that the number of the words flagged would rank it on a scale of “possibly offensive” to “inciting violence”, and the moderators reviewing these posts would allocate most of their resources to posts closer to the former, and automatically delete those in the latter category. So far, this tool continues to work as intended.
There is nothing like a look-up list of words that will trigger censorship. We love word lists. Non-public word lists are not much fun for some.
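As a rough sketch of the look-up-list approach the article describes: count flagged terms, place the post on a severity scale, and route it accordingly. Facebook’s real word list, weights, and thresholds are not public, so everything below is invented for illustration.

```python
# Hypothetical look-up-list triage; the terms, weights, and thresholds are
# placeholders, not Facebook's actual list.
FLAGGED_TERMS = {"terma": 1, "termb": 2, "termc": 3}

def triage(post_text):
    score = sum(FLAGGED_TERMS.get(w, 0) for w in post_text.lower().split())
    if score == 0:
        return "publish"
    if score < 4:
        return "queue for human review"   # the "possibly offensive" end of the scale
    return "auto-delete"                  # the "inciting violence" end of the scale

print(triage("an ordinary post about gardening"))         # publish
print(triage("a post containing terma and termb"))        # queue for human review
print(triage("a post containing termb termc and termc"))  # auto-delete
```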
Now what about algorithms? The examples in the write up are standard procedures for performing brute force actions. Algorithms, as presented in the AOL Yahoo article, seem to be collections of arbitrary rules. Straightforward for those who know the rules.
A “real” newspaper tackled the issue of algorithms and bias. The angle, which may be exciting to some, is “racism.” Navigate to “Is an Algorithm Any Less Racist Than a Human?” Since algorithms are often generated by humans, my hunch is that bias is indeed possible. The write up tells me:
any algorithm can – and often does – simply reproduce the biases inherent in its creator, in the data it’s using, or in society at large. For example, Google is more likely to advertise executive-level salaried positions to search engine users if it thinks the user is male, according to a Carnegie Mellon study. While Harvard researchers found that ads about arrest records were much more likely to appear alongside searches for names thought to belong to a black person versus a white person.
Don’t know the inside rules? Too bad, gentle reader. Perhaps you can search for an answer using Facebook’s search systems or the Wow.com service. Better yet. Ask a person who constructs algorithms for a living.
Stephen E Arnold, August 9, 2016
Auditing Algorithms: The Impossible Dream
August 8, 2016
Remember this lyric:
To dream the impossible dream
To fight the unbeatable foe
To bear with unbearable sorrow
And to run where
the brave dare not go
Yep, killing windmills.
I read “Make Algorithms Accountable.” You may find it online at this link. No promises because the source is a “real journalist’s dream,” the New York Times.
The main point of the write up is to express an impossible dream: those who craft numerical recipes should have their algorithms audited. Not the algorithm assembler, mind you, but the algorithm itself.
I noted this passage:
…advocates for big data due process argue that much more must be done to assure the appropriateness and accuracy of algorithm results. An algorithm is a procedure or set of instructions often used by a computer to solve a problem. Many algorithms are secret.
A conundrum not as difficult as solving death, but a kissing cousin as some in Harrod’s Creek say.
Are algorithms biased? Do squirrels get shot in Harrod’s Creek? In case you are wondering, the answer is yes. The recipe is not the problem. There are cooks in the mix.
I can hear this now:
This is my quest
To follow that star
No matter how hopeless
No matter how far
To fight for the right
Without question or pause
To be willing to march,
march into hell
For that heavenly cause
Stephen E Arnold, August 8, 2016
Google: The Crimea Name Thing Part 2
August 3, 2016
I kid you not. Google is changing place names on its Crimean map. I assume the write up “Here’s Why Google Maps Changed Some Town Names in Crimea—And Is Now Changing Them Back” is the digital equivalent of the tablets Moses toted down from the mountain. The mountain, as I recall, was not in the Crimea, but anything is possible.
The write up informed me:
Google told RBC news website that they were aware of the changes made in accordance with Ukraine’s “decommunization” law and of Russia’s displeasure. Google told RBC they were “actively working to restore the previous version of the names on the Russian version of Google Maps.”
For an outfit with smart software doing many things, I assume that the scripts are busy working their magic. But what if the changes are made by humans? I thought that Google was into the algorithmic approach to objectivity.
What is Google called? Ah, Alphabet. I wonder if smart software or smart people decided this name change. A rose by any other name would smell as sweet. Thank goodness for Snapchat images of flowers. No words required. Perhaps Google will change the name of Harrod’s Creek to Herod’s Sewer? It is indeed possible.
Stephen E Arnold, August 3, 2016
In-Q-Tel and Algorithmia: The Widget Approach to Tasks
July 27, 2016
I noted “Algorithmia Lands In-Q-Tel Deal, Adds Deep Learning Capabilities.” Useful write up, but one important facet of the deal was omitted from the news item. The trend in some government projects is to have an “open” set of software. The idea of running a query for a needed widget on an ecommerce site hath its charms. The goal is cost reduction and reducing the time required to modify an existing software system. This is the idea behind the DCGS (Distributed Common Ground System) widget approach.

The issue I have with brokering algorithms is that most of the algorithms which look like rocket science are actually textbook examples or variations on well known themes. Search and content processing chug along on about 10 methods. When a novel solution, like this insight into Carmichael numbers, becomes available, years may pass before the method can be verified and inserted into the usable-methods folder. Those doing the searching also have to know what an algorithm actually does and what settings are required for the numerical recipe to generate an edible croissant. That human knowledge thing is an issue.
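As an aside, the Carmichael number item linked above is a handy reminder of how much even a “simple” recipe assumes of its user. Here is a short sketch of Korselt’s criterion, the textbook characterization of Carmichael numbers; a widget shopper would still need to know what the routine is doing and why.

```python
# Korselt's criterion: a composite, squarefree n is a Carmichael number exactly
# when p - 1 divides n - 1 for every prime factor p of n.

def prime_factors(n):
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_carmichael(n):
    factors = prime_factors(n)
    if len(factors) < 2:                      # primes and 1 are excluded
        return False
    if any(e > 1 for e in factors.values()):  # must be squarefree
        return False
    return all((n - 1) % (p - 1) == 0 for p in factors)

print([n for n in range(2, 3000) if is_carmichael(n)])  # [561, 1105, 1729, 2465, 2821]
```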
Stephen E Arnold, July 27, 2016
Meet the Company Selling Our Medical Data
July 22, 2016
A company with a long history is getting fresh scrutiny. An article at Fortune reports, “This Little-Known Firm Is Getting Rich Off Your Medical Data.” Writer Adam Tanner informs us:
“A global company based in Danbury, Connecticut, IMS buys bulk data from pharmacy chains such as CVS, doctor’s electronic record systems such as Allscripts, claims from insurers such as Blue Cross Blue Shield and from others who handle your health information. The data is anonymized—stripped from the identifiers that identify individuals. In turn, IMS sells insights from its more than half a billion patient dossiers mainly to drug companies.
“So-called health care data mining is a growing market—and one largely dominated by IMS. Last week, the company reported 2015 net income of $417 million on revenue of $2.9 billion, compared with a loss of $189 million in 2014 (an acquisition also boosted revenue over the year). ‘The outlook for this business remains strong,’ CEO Ari Bousbib said in announcing the earnings.”
IMS Health dates back to the 1950s, when a medical ad man sought to make a buck on drug-sales marketing reports. In the 1980s and ‘90s, the company thrived selling profiles of specific doctors’ prescribing patterns to pharmaceutical marketing folks. Later, it moved into aggregating information on individual patients—anonymized, of course, in accordance with HIPAA rules.
Despite those rules, some are concerned about patient privacy. IMS does not disclose how it compiles its patient dossiers, and it is possible that records could, somehow, someday, become identifiable. One solution would be to allow patients to opt out of contributing their records to the collection, anonymized or not, as marketing data firm Acxiom began doing in 2013.
Of course, it isn’t quite so simple for the consumer. Each health record system makes its own decisions about data sharing, so opting out could require changing doctors. On the other hand, many of us have little choice in our insurance provider, and a lot of those firms also share patient information. Will IMS move toward transparency, or continue to keep patients in the dark about the paths of their own medical data?
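A toy example, which has nothing to do with IMS’s undisclosed process, shows why “stripped of identifiers” is not the same as unlinkable: if two data holders pseudonymize the same identifier in the same deterministic way, their “anonymized” records can still be joined.

```python
# Deterministic hashing removes the name but still lets two "anonymized"
# datasets be joined on the same patient. Data here is invented.
import hashlib

def pseudonym(patient_id):
    return hashlib.sha256(patient_id.encode()).hexdigest()[:12]

pharmacy_records = {pseudonym("SSN-123-45-6789"): {"rx": "statin"}}
insurer_records = {pseudonym("SSN-123-45-6789"): {"zip": "40223", "dob": "1961-04-02"}}

# Same input, same pseudonym, so the two datasets line up record for record.
for key in pharmacy_records.keys() & insurer_records.keys():
    print(key, {**pharmacy_records[key], **insurer_records[key]})
```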
Cynthia Murrell, July 22, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on July 26, 2016.
Information is at this link: http://bit.ly/29tVKpx.
The Potential of AI Journalism
July 12, 2016
Most of us are familiar with the concept of targeted advertising, but are we ready for targeted news? Personalized paragraphs within news stories are one development writer Jonathan Holmes predicts in “AI is Already Making Inroads Into Journalism but Could it Win a Pulitzer?” at the Guardian.
Even now, the internet is full of both clickbait and news articles generated by algorithms. Such software is also producing quarterly earnings reports, obituaries, even poetry and fiction. Now that it has been established that at least some software can write better than some humans, researchers are turning to another question: What can AI writers do that humans cannot? Holmes quotes Reg Chua of Thomson Reuters:
“‘I think it may well be that in the future a machine will win not so much for its written text, but by covering an important topic with five high quality articles and also 500,000 versions for different people.’ Imagine an article telling someone how local council cuts will affect their family, specifically, or how they personally are affected by a war happening in a different country. ‘I think the results might show up in the next couple of years,’ Caswell agrees. ‘It’s something that could not be done by a human writer.’”
The “Caswell” above is David Caswell, a fellow at the University of Missouri’s Donald W Reynolds Journalism Institute. Holmes also describes:
“In Caswell’s system, Structured Stories, the ‘story’ is not a story at all, but a network of information that can be assembled and read as copy, infographics or any other format, almost like musical notes. Any bank of information – from court reports to the weather – could eventually be plugged into a database of this kind. The potential for such systems is enormous.”
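To picture what that might look like in practice, here is a minimal sketch, with invented field names, of a story stored as data and rendered into a reader-specific paragraph. The real Structured Stories system is, of course, far richer than this.

```python
# Minimal sketch of "structured" news: keep the story as data, then render a
# reader-specific paragraph from it. Fields and figures are invented.
story = {
    "event": "council budget vote",
    "cuts": {"libraries": 0.10, "parks": 0.05, "transit": 0.15},
}

def personalized_paragraph(reader):
    service = reader["most_used_service"]
    cut = story["cuts"][service]
    return (f"After the {story['event']}, funding for {service} in "
            f"{reader['neighborhood']} drops by {cut:.0%} next year.")

print(personalized_paragraph({"neighborhood": "Harrod's Creek",
                              "most_used_service": "transit"}))
# One structured record could be rendered into the "500,000 versions" Chua imagines.
```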
The potential is indeed enormous; we are curious to see where this technology is headed. In the meantime, we should all remember not to believe everything we read… was written by a human.
Cynthia Murrell, July 12, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph