Facebook Acknowledges Major Dependence on Artificial Intelligence

July 28, 2016

The article on Mashable titled Facebook’s AI Chief: ‘Facebook Today Could Not Exist Without AI’ relates the current conversations involving Facebook and AI. Joaquin Candela, the director of applied machine learning at Facebook, states that “Facebook could not exist without AI.” He uses the examples of the News Feed, ads, and offensive content, all of which involve AI stimulating a vastly more engaging and personalized experience. He explains,

“If you were just a random number and we changed that random number every five seconds and that’s all we know about you then none of the experiences that you have online today — and I’m not only talking about Facebook — would be really useful to you. You’d hate it. I would hate it. So there is value of course in being able to personalize experiences and make the access of information more efficient to you.”

And we thought all Facebook required was humans and ad revenue. Candela makes it very clear that Facebook is driven by machine learning and personalization. He paints a very bleak picture of what Facebook would look like without AI: completely random ads, unranked News Feeds, and offensive content splashing around like a beached whale. Only in the last few years has computer vision changed Facebook’s process of removing such content. What used to take reports and human raters is now automated.

Chelsea Kerwin, July 28, 2016

Sponsored by, publisher of the CyberOSINT monograph

Twitter Influential but a Poor Driver of News Traffic

June 20, 2016

A recent report from a social analytics firm examined the relationship between Twitter and digital publishers. Nieman Lab shares a few details in “Twitter Has Outsized Influence, but It Doesn’t Drive Much Traffic for Most News Orgs, a New Report Says.” The firm tapped into data from a couple hundred of its clients, a group that includes digital publishers like Business Insider, the Daily Beast, Slate, and Upworthy.

Naturally, news sites that make the most of Twitter do so by knowing what their audience wants and supplying it. The study found there are two main types of Twitter news posts, conversational and breaking, and each drives traffic in its own way. While conversations can engage thousands of users over a period of time, breaking news produces traffic spikes.

Neither of those findings is unexpected, but some may be surprised that Twitter feeds are not inspiring more visits to publishers’ sites. Writer Joseph Lichterman reports:

“Despite its conversational and breaking news value, Twitter remains a relatively small source of traffic for most publishers. According to, less than 5 percent of referrals in its network came from Twitter during January and February 2016. Twitter trails Facebook, Google, and even Yahoo as sources of traffic, the report said (though it does edge out Bing!)”

Still, publishers are unlikely to jettison their Twitter accounts anytime soon, because that platform offers a different sort of value. One that is, perhaps, more important for consumers. Lichterman quotes the report:

“Though Twitter may not be a huge overall source of traffic to news websites relative to Facebook and Google, it serves a unique place in the link economy. News really does ‘start’ on Twitter.”

And the earlier a news organization knows about a situation, the better. That is an advantage few publishers will want to relinquish.



Cynthia Murrell, June 20, 2016


Statistical Translation: Dead Like Marley

June 16, 2016

I read “Facebook Says Statistical Machine Translation Has Reached End of Life.” Hey, it is Facebook. Truth for sure. I learned:

Scale is actually one reason Facebook has invested in its own MT technology. According to Packer [Facebook wizard], there are more than two trillion posts and comments, which grows by over a billion each day. “Pretty clearly, we’re not going to solve this problem with a roomful or even a building-full of human translators,” he quipped, adding that to have even “a hope of solving this problem, we need AI; we need automation.” The other reason is adaptability. “We tried that,” said Packer about using third-party MT, but it “did not work well enough for our needs.” The reason? The language of Facebook is different from what is on the rest of the Web. Packer described Facebook language as “extremely informal. It’s full of slang, it’s very regional.” He said it is also laden with metaphors, idiomatic expressions, and is riddled with misspellings (most of them intentional). Additionally, as in the rest of the world, there is a marked difference in the way different age groups communicate on Facebook.

I wonder if it is time to send death notices to the vendors who use statistical methods? Perhaps I should wait a bit. Predictions are often different from reality.

Stephen E Arnold, June 16, 2016

Facebook AI Explainer

June 10, 2016

Facebook posted a partial explanation of its artificial intelligence system. You can review the document “Introducing DeepText: Facebook’s Text Understanding Engine” and decide if Facebook or IBM is winning the smart software race. The Facebook document states:

In traditional NLP approaches, words are converted into a format that a computer algorithm can learn. The word “brother” might be assigned an integer ID such as 4598, while the word “bro” becomes another integer, like 986665. This representation requires each word to be seen with exact spellings in the training data to be understood. With deep learning, we can instead use “word embeddings,” a mathematical concept that preserves the semantic relationship among words. So, when calculated properly, we can see that the word embeddings of “brother” and “bro” are close in space. This type of representation allows us to capture the deeper semantic meaning of words. Using word embeddings, we can also understand the same semantics across multiple languages, despite differences in the surface form. As an example, for English and Spanish, “happy birthday” and “feliz cumpleaños” should be very close to each other in the common embedding space. By mapping words and phrases into a common embedding space, DeepText is capable of building models that are language-agnostic.
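The "close in space" idea can be sketched in a few lines of Python. This is only a toy illustration, not DeepText: the three-dimensional vectors below are invented by hand, whereas real embeddings are learned from data and have hundreds of dimensions. The names `EMBEDDINGS`, `cosine_similarity`, and `nearest` are hypothetical.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
# Real systems learn these vectors from large text collections.
EMBEDDINGS = {
    "brother": [0.90, 0.80, 0.10],
    "bro": [0.85, 0.75, 0.20],
    "invoice": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean 'close'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def nearest(word):
    """Return the other vocabulary word whose vector is closest to `word`."""
    others = (w for w in EMBEDDINGS if w != word)
    return max(
        others,
        key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]),
    )
```

With these made-up numbers, `nearest("bro")` picks "brother" rather than "invoice", which is the behavior integer IDs such as 4598 and 986665 cannot provide, since the distance between two arbitrary IDs carries no meaning.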

Due to Facebook’s grip on the 18 to 35 demographic, its approach may have more commercial impact than the methods in use at other firms. Just ask IBM Watson.

Stephen E Arnold, June 10, 2016

The Google Knowledge Vault Claimed to Be the Future

May 31, 2016

Back in 2014, I heard rumors that the Google Knowledge Vault was supposed to be the next wave of search.  How many times do you hear a company or a product making the claim it is the next big thing?  After I rolled my eyes, I decided to research what became of the Knowledge Vault and I found an old article from Search Engine Land: “Google ‘Knowledge Vault’ To Power Future Of Search.” Google Knowledge Graph was used to supply more information to search results, what we now recognize as the summarized information at the top of Google search results.  The Knowledge Vault was supposedly the successor and would rely less on third party information providers.

“Sensationally characterized as ‘the largest store of knowledge in human history,’ Knowledge Vault is being assembled from content across the Internet without human editorial involvement. ‘Knowledge Vault autonomously gathers and merges information from across the web into a single base of facts about the world, and the people and objects in it,’ says New Scientist. Google has reportedly assembled 1.6 billion “facts” and scored them according to confidence in their accuracy. Roughly 16 percent of the information in the database qualifies as ‘confident facts.’”

Knowledge Vault was also supposed to give Google a one up in the mobile search market and even be the basis for artificial intelligence applications.  It was a lot of hoopla, but I did a bit more research and learned from Wikipedia that Knowledge Vault was nothing more than a research paper.

Since 2014, Google, Apple, Facebook, and other tech companies have concentrated their efforts and resources on developing artificial intelligence and integrating it within their products.  While Knowledge Vault was a red herring, the predictions about artificial intelligence were correct.


Whitney Grace, May 31, 2016

DGraph Labs Startup Aims to Fill Gap in Graph Database Market

May 24, 2016

The article on GlobeNewsWire titled Ex-Googler Startup DGraph Labs Raises US$1.1 Million in Seed Funding Round to Build Industry’s First Open Source, Native and Distributed Graph Database names Bain Capital Ventures and Blackbird Ventures as the main investors in the startup. Manish Jain, founder and CEO of DGraph, worked on Google’s Knowledge Graph Infrastructure for six years. He explains the technology,

“Graph data structures store objects and the relationships between them. In these data structures, the relationship is as important as the object. Graph databases are, therefore, designed to store the relationships as first class citizens… Accessing those connections is an efficient, constant-time operation that allows you to traverse millions of objects quickly. Many companies including Google, Facebook, Twitter, eBay, LinkedIn and Dropbox use graph databases to power their smart search engines and newsfeeds.”
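What "relationships as first class citizens" means can be sketched with a small in-memory structure. This is a minimal illustration under my own assumptions and says nothing about how DGraph is built: the `Graph` class, its methods, and the "follows" relation are hypothetical, and the constant-time claim here holds only in the average case of a Python dictionary lookup.

```python
from collections import defaultdict, deque


class Graph:
    """Toy directed graph that stores labeled relationships explicitly."""

    def __init__(self):
        # node -> {relation label -> set of neighboring nodes}
        self._edges = defaultdict(dict)

    def add_edge(self, source, relation, target):
        """Record that `source` is linked to `target` by `relation`."""
        self._edges[source].setdefault(relation, set()).add(target)

    def neighbors(self, node, relation):
        """Look up a node's neighbors directly: two dictionary lookups."""
        return self._edges.get(node, {}).get(relation, set())

    def reachable(self, start, relation):
        """Breadth-first traversal following one relation label."""
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in self.neighbors(current, relation):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)
        return seen
```

Because each hop is a direct lookup rather than a join over a table of rows, a traversal costs time proportional to the nodes and edges it actually visits, which is the property the quote is pointing at.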

The many applications of graph databases include the Internet of Things, behavior analysis, medical and DNA research, and AI. So what is DGraph going to do with its fresh funds? Jain wants to focus on forging a talented team of engineers and developing the company’s core technology. He notes in the article that this sort of work is hardly the typical obstacle faced by a startup, but rather the focus of major tech companies like Google or Facebook.


Chelsea Kerwin, May 24, 2016


Facebook: The Telegraph Newspaper Thinks You Have Social Media by the Throat

May 19, 2016

I read “Sorry Google, We Just Don’t Want to Be Friends with You.” What struck me is that Facebook, which is not the focal point of the write up, is one beneficiary of this write up. The centerpiece of the article is a series of comments about Google’s social media belly flops: Wave, Buzz, Google +. The article omitted the wonderful Orkut service, which was embraced by certain users in Brazil.

Google is not the sole company to be dumped in a bourbon barrel filled with sour mash. Apple and even a nod to Facebook’s slops make the focus on Google less eye watering.

But the winner is Facebook. The Telegraph states:

Facebook has neutralized threats such as Instagram and WhatsApp by buying them before they really surface.

Okay, Facebook is the big dog. The write up makes one last snap at the GOOG:

Perhaps it should just accept that we don’t want to be friends with Google.

I am not sure I want to be friends with the Telegraph.

Stephen E Arnold, May 19, 2016

Facebook and Law Enforcement in Cahoots

May 13, 2016

Did you know that Facebook combs your content for criminal intent? American Intelligence Report reveals, “Facebook Monitors Your Private Messages and Photos for Criminal Activity, Reports them to Police.” Naturally, software is the first entity to scan content, using keywords and key phrases to flag items for human follow-up. Of particular interest are “loose” relationships. Reporter Kristan T. Harris writes:

“Reuters’ interview with the security officer explains, Facebook’s software focuses on conversations between members who have a loose relationship on the social network. For example, if two users aren’t friends, only recently became friends, have no mutual friends, interact with each other very little, have a significant age difference, and/or are located far from each other, the tool pays particular attention.

“The scanning program looks for certain phrases found in previously obtained chat records from criminals, including sexual predators (because of the Reuters story, we know of at least one alleged child predator who is being brought before the courts as a direct result of Facebook’s chat scanning). The relationship analysis and phrase material have to add up before a Facebook employee actually looks at communications and makes the final decision of whether to ping the authorities.

“’We’ve never wanted to set up an environment where we have employees looking at private communications, so it’s really important that we use technology that has a very low false-positive rate,’ Sullivan told Reuters.”
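The two-part test described in the quote, where relationship signals and phrase matches both "have to add up" before a person looks, can be sketched as follows. This is a hypothetical illustration and not Facebook's system: the `Pair` fields, the thresholds, the `min_signals` parameter, and the placeholder phrase list are all my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """Hypothetical summary of how two users relate to each other."""
    are_friends: bool
    mutual_friends: int
    messages_exchanged: int
    age_gap_years: int


# Placeholder list; the article says real phrases come from past chat records.
FLAGGED_PHRASES = ("example flagged phrase",)


def looseness(pair):
    """Count how many 'loose relationship' signals are present."""
    signals = [
        not pair.are_friends,
        pair.mutual_friends == 0,
        pair.messages_exchanged < 5,   # assumed cutoff for "very little"
        pair.age_gap_years >= 10,      # assumed cutoff for "significant"
    ]
    return sum(signals)


def needs_human_review(pair, message, min_signals=2):
    """Flag only when the relationship is loose AND a phrase matches."""
    text = message.lower()
    phrase_hit = any(phrase in text for phrase in FLAGGED_PHRASES)
    return phrase_hit and looseness(pair) >= min_signals
```

Requiring both conditions is one simple way to keep false positives down, which is the concern Sullivan raises: a flagged phrase between long-time friends, or an innocuous chat between strangers, never reaches a human reviewer in this sketch.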

Uh-huh. So, one alleged predator has been caught. We’re told potential murder suspects have also been identified this way, with one case awash in 62 pages of Facebook-based evidence. Justice is a good thing, but Harris notes that most people will be uncomfortable with the idea of Facebook monitoring their communications. She goes on to wonder where this will lead; will it eventually be applied to misdemeanors and even, perhaps, to “thought crimes”?

Users of any social media platform must understand that anything they post could eventually be seen by anyone. Privacy policies can be updated without notice, and changes can apply to old as well as new data. And, of course, hackers are always lurking about. I was once cautioned to imagine that anything I post online I might as well be shouting on a public street; that advice has served me well.


Cynthia Murrell, May 13, 2016


Billions in VC Funding Continues Rinse and Repeat Process

May 12, 2016

In the tech world, the word billion may be losing meaning for some. Pando published a recent editorial called, While the rest of tech struggles, so far VCs have raised more this quarter than in past three years. This piece calls attention to the seemingly never-ending list of VC firms raising ever-more funds. For example, Accel announced their funds were at $2 billion, Founders Fund raised $1 billion in new funds, and Andreessen Horowitz currently works to achieve another $1.5 billion. The author writes,

“It was hard to put that [recent fundraising rounds] in context. I mean, yeah. These are major funds. Is it news that they raised a collective $4.5 billion more at some point? Doesn’t mean they’ll invest it any more quickly. All it means is that the two will still be around for another ten years, which we kinda already guessed. It’s staggeringly hard for a venture fund to actually go out of business, even when it wasn’t some of the first money in Facebook or, in the case of Marc Andreessen, sits on its board. [Disclosure: Marc Andreessen, Founders Fund and Accel are all investors in Pando.]”

As the author wonders, asking Pitchbook if it’s a “bigger quarter than usual”, our eyebrows are not raised by this thought, nor by easy money, bubbles, or unicorns. Nah, this is just routine in Sillycon Valley.


Megan Feil, May 12, 2016


Facebook: Complaining and Explaining

May 10, 2016

A stiff upper lip type told me eons ago: “Never complain, never explain.” I just read “Facebook Denies Claims It Suppressed Conservative and Controversial News on Its Trending Topics Sidebar.” If the write up is accurate, Facebook may be explaining and complaining.

I read:

The company had been accused of encouraging the humans that run its “Trending Topics” sidebar to suppress conservative stories and those from right-of-centre outlets. But the company has “found no evidence that the anonymous allegations are true”, according to a post from its head of search Tom Stocky.

The write up quoted Facebook as an “it” which said:

It said also that it does not “insert stories artificially into trending topics, and do not instruct our reviewers to do so”. While it’s possible for reviewers to stick certain topics together – such as #StarWars and #maythefourthbewithyou – a topic must already be trending for it to be added to the panel, Mr Stocky claimed.

Interesting. Why would a giant in social media let humans interfere with smart software? But why would Jeff Bezos buy a newspaper?

Alas, no answers and certainly no complaining or explaining from Harrod’s Creek.

Stephen E Arnold, May 10, 2016
