Improving Information for Everyone
August 14, 2016
I love it when Facebook and Google take steps to improve information quality for everyone.
I noted “Facebook’s News Feed to Show Fewer Clickbait Headlines.” I thought the Facebook news feed was 100 percent beef. I learned:
The company receives thousands of complaints a day about clickbait, headlines that intentionally withhold information or mislead users to get people to click on them…
Thousands. I am impressed. Facebook is going to do some filtering to help its many happy users avoid clickbait, a concept which puzzles me. I noted:
Facebook created a system that identifies and classifies such headlines. It can then determine which pages or web domains post large amounts of clickbait and rank them lower in News Feed. Facebook routinely updates its algorithm for News Feed, the place most people see postings on the site, to show users what they are most interested in and encourage them to spend even more time on the site.
Clustering methods are readily available. I ask myself, “Why did Facebook provide streams of clickbait in the first place?”
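The mechanics described in the quote are easy enough to prototype. Here is a minimal sketch in Python of that general approach, not Facebook’s actual pipeline: a toy phrase-based flagger, a per-domain clickbait rate, and a score penalty. The phrase list, function names, and penalty value are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical trigger phrases; a production classifier would be trained on
# labeled headlines rather than hand-built.
CLICKBAIT_PHRASES = ("you won't believe", "what happened next", "this one trick")

def is_clickbait(headline: str) -> bool:
    text = headline.lower()
    return any(phrase in text for phrase in CLICKBAIT_PHRASES)

def domain_clickbait_rate(posts):
    """posts: list of (domain, headline) pairs -> {domain: share of flagged headlines}."""
    counts = defaultdict(lambda: [0, 0])  # domain -> [flagged, total]
    for domain, headline in posts:
        counts[domain][0] += int(is_clickbait(headline))
        counts[domain][1] += 1
    return {d: flagged / total for d, (flagged, total) in counts.items()}

def demoted_score(base_score, domain, rates, penalty=0.5):
    """Lower a post's feed score in proportion to its domain's clickbait rate."""
    return base_score * (1 - penalty * rates.get(domain, 0.0))

posts = [("example.com", "You won't believe what happened next"),
         ("news.org", "Senate passes budget bill")]
rates = domain_clickbait_rate(posts)
print(demoted_score(1.0, "example.com", rates))  # 0.5: heavy clickbait domain demoted
print(demoted_score(1.0, "news.org", rates))     # 1.0: left alone
```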
On a related note, the Google released exclusive information to Time Warner, which once owned AOL and now owns a chunk of Hulu. Google’s wizards have identified bad bits, which the company calls “unwanted software.” The Googlers converted the phrase into UwS and then into the snappy term “ooze.”
Fortune informed me:
people bump into 60 million browser warnings for download attempts of unwanted software at unsafe Web pages every week.
Quite a surprise, I assume. Google will definitely talk about “a really big problem.” Alas, Fortune was not able to provide information about what Mother Google will do to protect its users. Obviously the “real” journalists at Fortune did not find the question, “What are you going to do about this?” germane.
It is reassuring to know that Facebook and Google are improving the quality of the information each provides. Analytics and user feedback are important.
Stephen E Arnold, August 13, 2016
No Dark Web Necessary
August 11, 2016
Do increased Facebook restrictions on hate speech and illegal activity send those users straight to the Dark Web? From The Atlantic comes an article entitled, American Neo-Nazis Are on Russia’s Facebook, which hints that this is not always the case. This piece explains that an online group called “United Aryan Front” moved from Facebook to Russia’s version of Facebook: VKontakte. The article describes a shift to cyber racism,
The move to VK is part of the growing tendency of white supremacists to interact in online forums, rather than through real-life groups like the KKK, according to Heidi Beirich, director of the Southern Poverty Law Center’s anti-terror Intelligence Project. Through the early 2000s, skinheads and other groups would host dozens of events per year with hundreds of attendees, she says, but now there are only a handful of those rallies each year. “People online are talking about the same kinds of things that used to happen at the rallies, but now they’re doing it completely through the web,” she said.
It is interesting to consider the spaces people choose, or are forced into, for conducting ill-intentioned activities. Even when Facebook cracks down on it, hate speech, among other activities, is not relegated solely to the Dark Web. While organized online hate speech analogous to rallies may be surging, rallies are not the only avenue for real-world racism. At the core of this article, like many we cover on the Dark Web, is a question about the relationship between place and malicious activity.
Megan Feil, August 11, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden/Dark Web meetup on August 23, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233019199/
Facebook Algorithms: Doing What Users Expect Maybe
August 9, 2016
I read an AOL-Yahoo post titled “Inside Facebook Algorithms.” With the excitement of algorithms tingeing the air, explanations of smart software make the day so much better.
I learned:
if you understand the rules, you can play them by doing the same thing over and over again
Good point. But how many Facebook users are sufficiently attentive to correlate a particular action with an outcome which may not be visible to the user?
Censorship confusing? It doesn’t need to be. I learned:
Mr. Abbasi [a person whose Facebook post was censored] used several words which would likely flag his post as hate speech, which is against Facebook’s community guidelines. It is also possible that the number of the words flagged would rank it on a scale of “possibly offensive” to “inciting violence”, and the moderators reviewing these posts would allocate most of their resources to posts closer to the former, and automatically delete those in the latter category. So far, this tool continues to work as intended.
There is nothing like a word lookup list containing words which will result in censorship. We love word lists. Non-public word lists are not much fun for some.
Now what about algorithms? The examples in the write up are standard procedures for performing brute force actions. Algorithms, as presented in the AOL-Yahoo article, seem to be collections of arbitrary rules. Straightforward for those who know the rules.
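If the process really is a word list plus a severity scale, the routing logic is not much more than the following sketch. The placeholder word list, the thresholds, and the action labels are my assumptions; the real lists are, as noted, not public.

```python
# A minimal sketch of rule-based routing: count flagged words, then send the
# post to a human review queue or delete it automatically. Illustrative only.
FLAGGED_WORDS = {"slur_a", "slur_b", "threat_word"}  # placeholder tokens

def flag_count(post: str) -> int:
    return sum(1 for word in post.lower().split() if word in FLAGGED_WORDS)

def route_post(post: str, review_threshold: int = 1, delete_threshold: int = 3) -> str:
    """Return an action: 'allow', 'human_review' (possibly offensive),
    or 'auto_delete' (treated as inciting violence)."""
    hits = flag_count(post)
    if hits >= delete_threshold:
        return "auto_delete"
    if hits >= review_threshold:
        return "human_review"
    return "allow"

print(route_post("this post contains slur_a and slur_b and threat_word"))  # auto_delete
print(route_post("this post contains slur_a only"))                        # human_review
```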
A “real” newspaper tackled the issue of algorithms and bias. The angle, which may be exciting to some, is “racism.” Navigate to “Is an Algorithm Any Less Racist Than a Human?” Since algorithms are often generated by humans, my hunch is that bias is indeed possible. The write up tells me:
any algorithm can – and often does – simply reproduce the biases inherent in its creator, in the data it’s using, or in society at large. For example, Google is more likely to advertise executive-level salaried positions to search engine users if it thinks the user is male, according to a Carnegie Mellon study. While Harvard researchers found that ads about arrest records were much more likely to appear alongside searches for names thought to belong to a black person versus a white person.
Don’t know the inside rules? Too bad, gentle reader. Perhaps you can search for an answer using Facebook’s search systems or the Wow.com service. Better yet. Ask a person who constructs algorithms for a living.
Stephen E Arnold, August 9, 2016
Facebook vs. LinkedIn for Job Hunters
August 4, 2016
The article on Lifehacker titled Facebook Can Be Just As Important As LinkedIn For Finding a Job emphasizes the importance of industry connections. As everyone knows, trying to find a job online is like trying to date online. A huge number of job postings are scams, schemes, or utter bollocks. Navigating these toads and finding the job equivalent to Prince Charming is frustrating, which is why Facebook might offer a happy alternative. The article states,
“As business site Entrepreneur points out, the role Facebook plays in helping people find jobs shouldn’t be surprising. Any time you can connect with someone who works in your industry, that’s one more person who could potentially help you get a job. Research from Facebook itself shows that both strong and weak ties on the site can lead to jobs… Well, weak ties are important collectively because of their quantity, but strong ties are important individually because of their quality.”
Obviously, knowing someone in the industry you seek to work in is the key to finding and getting a job. But a site like Facebook is much easier to exploit than LinkedIn because more people use it and more people check it. LinkedIn’s endless emails eventually become white noise, but scrolling through Facebook’s Newsfeed is an infinite source of time-wasting pleasure for the bulk of users. Time to put the networking back into social networking, job seekers.
Chelsea Kerwin, August 4, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Facebook Acknowledges Major Dependence on Artificial Intelligence
July 28, 2016
The article on Mashable titled Facebook’s AI Chief: ‘Facebook Today Could Not Exist Without AI’ relates the current conversations involving Facebook and AI. Joaquin Candela, the director of applied machine learning at Facebook, states that “Facebook could not exist without AI.” He uses the examples of the News Feed, ads, and offensive content, all of which rely on AI to deliver a vastly more engaging and personalized experience. He explains,
“If you were just a random number and we changed that random number every five seconds and that’s all we know about you then none of the experiences that you have online today — and I’m not only talking about Facebook — would be really useful to you. You’d hate it. I would hate it. So there is value of course in being able to personalize experiences and make the access of information more efficient to you.”
And we thought all Facebook required was humans and ad revenue. Candela makes it very clear that Facebook is driven by machine learning and personalization. He paints a very bleak picture of what Facebook would look like without AI: completely random ads, unranked News Feeds, and offensive content splashing around like a beached whale. Only in the last few years has computer vision changed Facebook’s process of removing such content. What used to take reports and human raters is now automated.
Chelsea Kerwin, July 28, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Twitter Influential but a Poor Driver of News Traffic
June 20, 2016
A recent report from social analytics firm Parse.ly examined the relationship between Twitter and digital publishers. NiemanLab shares a few details in, “Twitter Has Outsized Influence, but It Doesn’t Drive Much Traffic for Most News Orgs, a New Report Says.” Parse.ly tapped into data from a couple hundred of its clients, a group that includes digital publishers like Business Insider, the Daily Beast, Slate, and Upworthy.
Naturally, news sites that make the most of Twitter do so by knowing what their audience wants and supplying it. The study found there are two main types of Twitter news posts, conversational and breaking, and each drives traffic in its own way. While conversations can engage thousands of users over a period of time, breaking news produces traffic spikes.
Neither of those findings is unexpected, but some may be surprised that Twitter feeds are not inspiring more visits to publishers’ sites. Writer Joseph Lichterman reports:
“Despite its conversational and breaking news value, Twitter remains a relatively small source of traffic for most publishers. According to Parse.ly, less than 5 percent of referrals in its network came from Twitter during January and February 2016. Twitter trails Facebook, Google, and even Yahoo as sources of traffic, the report said (though it does edge out Bing!)”
Still, publishers are unlikely to jettison their Twitter accounts anytime soon, because that platform offers a different sort of value, one that is, perhaps, more important for consumers. Lichterman quotes the report:
“Though Twitter may not be a huge overall source of traffic to news websites relative to Facebook and Google, it serves a unique place in the link economy. News really does ‘start’ on Twitter.”
And the earlier a news organization knows about a situation, the better. That is an advantage few publishers will want to relinquish.
Cynthia Murrell, June 20, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Statistical Translation: Dead Like Marley
June 16, 2016
I read “Facebook Says Statistical Machine Translation Has Reached End of Life.” Hey, it is Facebook. Truth for sure. I learned:
Scale is actually one reason Facebook has invested in its own MT technology. According to Packer [Facebook wizard], there are more than two trillion posts and comments, which grows by over a billion each day. “Pretty clearly, we’re not going to solve this problem with a roomful or even a building-full of human translators,” he quipped, adding that to have even “a hope of solving this problem, we need AI; we need automation.” The other reason is adaptability. “We tried that,” said Packer about using third-party MT, but it “did not work well enough for our needs.” The reason? The language of Facebook is different from what is on the rest of the Web. Packer described Facebook language as “extremely informal. It’s full of slang, it’s very regional.” He said it is also laden with metaphors, idiomatic expressions, and is riddled with misspellings (most of them intentional). Additionally, as in the rest of the world, there is a marked difference in the way different age groups communicate on Facebook.
I wonder if it is time to send death notices to the vendors who use statistical methods? Perhaps I should wait a bit. Predictions are often different from reality.
Stephen E Arnold, June 16, 2016
Facebook AI Explainer
June 10, 2016
Facebook posted a partial explanation of its artificial intelligence system. You can review the document “Introducing DeepText: Facebook’s Text Understanding Engine” and decide if Facebook or IBM is winning the smart software race. The Facebook document states:
In traditional NLP approaches, words are converted into a format that a computer algorithm can learn. The word “brother” might be assigned an integer ID such as 4598, while the word “bro” becomes another integer, like 986665. This representation requires each word to be seen with exact spellings in the training data to be understood. With deep learning, we can instead use “word embeddings,” a mathematical concept that preserves the semantic relationship among words. So, when calculated properly, we can see that the word embeddings of “brother” and “bro” are close in space. This type of representation allows us to capture the deeper semantic meaning of words. Using word embeddings, we can also understand the same semantics across multiple languages, despite differences in the surface form. As an example, for English and Spanish, “happy birthday” and “feliz cumpleaños” should be very close to each other in the common embedding space. By mapping words and phrases into a common embedding space, DeepText is capable of building models that are language-agnostic.
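For readers who want to see what “close in space” means in practice, here is a toy example with made-up three-dimensional vectors. Real embeddings such as DeepText’s are learned from data and run to hundreds of dimensions, so treat this strictly as an illustration of the cosine-similarity arithmetic; the vectors and words below are assumptions.

```python
import math

# Made-up embeddings for illustration; real ones are learned, not hand-set.
embeddings = {
    "brother": [0.81, 0.10, 0.55],
    "bro":     [0.78, 0.14, 0.52],   # deliberately near "brother"
    "invoice": [0.05, 0.92, 0.30],   # unrelated word, far away
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity(embeddings["brother"], embeddings["bro"]))      # ~0.999: close in space
print(cosine_similarity(embeddings["brother"], embeddings["invoice"]))  # ~0.31: far apart
```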
Due to Facebook’s grip on the 18 to 35 demographic, its approach may have more commercial impact than the methods in use at other firms. Just ask IBM Watson.
Stephen E Arnold, June 10, 2016
The Google Knowledge Vault Claimed to Be the Future
May 31, 2016
Back in 2014, I heard rumors that the Google Knowledge Vault was supposed to be the next wave of search. How many times do you hear a company or a product making the claim it is the next big thing? After I rolled my eyes, I decided to research what became of the Knowledge Vault and I found an old article from Search Engine Land: “Google ‘Knowledge Vault’ To Power Future Of Search.” The Google Knowledge Graph was used to supply more information to search results: the summarized information we now recognize at the top of a Google results page. The Knowledge Vault was supposedly the successor and would rely less on third-party information providers.
“Sensationally characterized as ‘the largest store of knowledge in human history,’ Knowledge Vault is being assembled from content across the Internet without human editorial involvement. ‘Knowledge Vault autonomously gathers and merges information from across the web into a single base of facts about the world, and the people and objects in it,’ says New Scientist. Google has reportedly assembled 1.6 billion “facts” and scored them according to confidence in their accuracy. Roughly 16 percent of the information in the database qualifies as ‘confident facts.’”
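The “scored facts” idea is simple to picture. Below is a minimal sketch of a fact store with confidence values and a cutoff for “confident facts.” The triples, the 0.9 threshold, and the field names are assumptions for illustration, not Google’s actual data or criteria.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    subject: str
    predicate: str
    obj: str
    confidence: float  # 0.0 to 1.0, from merging and corroborating sources

# Illustrative triples only.
facts = [
    Fact("Barack Obama", "born_in", "Honolulu", 0.97),
    Fact("Eiffel Tower", "located_in", "Berlin", 0.04),
    Fact("Python", "created_by", "Guido van Rossum", 0.91),
]

CONFIDENT_THRESHOLD = 0.9  # assumed cutoff for a "confident fact"

confident_facts = [f for f in facts if f.confidence >= CONFIDENT_THRESHOLD]
print(f"{len(confident_facts)}/{len(facts)} facts pass the confidence bar")
```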
Knowledge Vault was also supposed to give Google a leg up in the mobile search market and even be the basis for artificial intelligence applications. It was a lot of hoopla, but I did a bit more research and learned from Wikipedia that Knowledge Vault was nothing more than a research paper.
Since 2014, Google, Apple, Facebook, and other tech companies have concentrated their efforts and resources on developing artificial intelligence and integrating it within their products. While Knowledge Vault was a red herring, the predictions about artificial intelligence were correct.
Whitney Grace, May 31, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
DGraph Labs Startup Aims to Fill Gap in Graph Database Market
May 24, 2016
The article on GlobeNewsWire titled Ex-Googler Startup DGraph Labs Raises US$1.1 Million in Seed Funding Round to Build Industry’s First Open Source, Native and Distributed Graph Database names Bain Capital Ventures and Blackbird Ventures as the main investors in the startup. Manish Jain, founder and CEO of DGraph, worked on Google’s Knowledge Graph Infrastructure for six years. He explains the technology,
“Graph data structures store objects and the relationships between them. In these data structures, the relationship is as important as the object. Graph databases are, therefore, designed to store the relationships as first class citizens… Accessing those connections is an efficient, constant-time operation that allows you to traverse millions of objects quickly. Many companies including Google, Facebook, Twitter, eBay, LinkedIn and Dropbox use graph databases to power their smart search engines and newsfeeds.”
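For those who have not worked with a graph database, here is a minimal sketch of the adjacency-list idea behind Jain’s description: relationships are stored directly on each node, so following a connection is a cheap lookup rather than a join. This is an illustration only, not DGraph’s distributed engine; the class, node names, and relationship labels are all assumed.

```python
from collections import defaultdict, deque

class Graph:
    def __init__(self):
        self.edges = defaultdict(list)  # node -> list of (relationship, neighbor)

    def add_edge(self, src, relationship, dst):
        self.edges[src].append((relationship, dst))

    def neighbors(self, node, relationship=None):
        # Relationships are first-class: filter by edge label if one is given.
        return [n for rel, n in self.edges[node] if relationship in (None, rel)]

    def traverse(self, start, max_depth=2):
        """Breadth-first traversal out to max_depth hops from start."""
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            yield node, depth
            if depth < max_depth:
                for nxt in self.neighbors(node):
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, depth + 1))

g = Graph()
g.add_edge("alice", "follows", "bob")
g.add_edge("bob", "follows", "carol")
print(list(g.traverse("alice")))  # [('alice', 0), ('bob', 1), ('carol', 2)]
```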
Graph databases have many applications, including the Internet of Things, behavior analysis, medical and DNA research, and AI. So what is DGraph going to do with its fresh funds? Jain wants to focus on forging a talented team of engineers and developing the company’s core technology. He notes in the article that this sort of work is hardly the typical obstacle faced by a startup, but rather the focus of major tech companies like Google or Facebook.
Chelsea Kerwin, May 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph