Academic Sees Facebook Chasing Amazon
September 17, 2018
I assume the one trillion dollar Amazon poobah and the world’s richest hombre will not friend Facebook. The assumption is that the information in “How Facebook AI May Help to Change the Way We Shop Online in the Future” is accurate. The author is an accounting instructor at Villanova University. We do not have that type of expert in Harrod’s Creek. We do have some fast money guys from the health care outfits down the road, however.
The main point of the write up is that Facebook has some smart software which will change the way people shop. Maybe not in Harrod’s Creek, but certainly in a big city where the economic action is. (Keep in mind that insurance fraud is a core competency of some in the Commonwealth of Kentucky.)
I learned that:
The most powerful algorithm is called FBLearner Flow: Facebook could use its massive data on user preferences to anticipate the products that consumers want before consumers even realize it, and could work with retailers on predictive shipping.
Facebook also has DeepText and DeepFace. The trio of smart software adds up to a potential threat to Amazon.
The dismal performance of some facial recognition and image analysis systems is not a problem for the Facebook wizards. I learned:
DeepFace is used to identify people in photos and suggest that users tag people they know. In reality, DeepFace can recognize any face in any photograph on its own. This facial recognition algorithm is actually 97 percent accurate, incredibly even higher than humans who fall a close second at 96 percent accuracy, and the FBI at 85 percent.
The write up suggests that Facebook’s technology could, maybe, possibly edge toward mind control.
Whatever.
My thought is that Facebook can snag more ad revenue. I think that Facebook ad gains might come at the expense of the Google. Google, unlike Amazon, seems to be drifting with Loon balloons, employee push back, and electric scooter investments. Amazon’s ads are just fly wheeling up and trying to build Mr. Bezos’ much beloved momentum.
Our research suggests that Amazon is implementing a game plan that once was associated with the pre 2006 Google; that is, a number of large scale plays for core business expansions. These range from policeware to back office financial services to replacing existing retail infrastructure with the Amazon equivalent of old school retail.
Ads, therefore, will be a billion dollar plus business at Amazon. Are those product listings ads or objective product summaries? That’s a question to ponder.
But ads may not become much more than just another Amazon revenue stream.
Facebook has to find revenue. Amazon, thanks to the cleverness of the happy Amazonians, is wallowing in revenue streams. Some employees may be unhappy, but most customers are thrilled with Amazon’s gentle approach to vendor lock in.
Net net: Facebook will do ads. Facebook will do smart software. Facebook will also have to figure out how to dodge the bullets regulators are now stuffing into regulatory weapons to tackle the “we’re sorry, we’ll do better” approach to business.
Stephen E Arnold, September 15, 2018
Tips for Dealing with Content Stealers
September 17, 2018
Content on the Internet gets stolen. It is a Reddit trademark, i.e., memes. If you inhabit any part of the fandom community (visit 4chan if you want to have nightmares), then copyrighted content is plagiarized in fanfiction, fanart, and fan videos. Some cases of fan content have become Internet legends. What do you do, however, if you do not want your content to be stolen, and how do you prevent it?
Hacker News Hosted at Y Combinator has a post about, “Dealing with a Competitor Who Is Scraping My Content and Ranking Higher.” Here is the situation:
“One of my competitors has been scrapping my site and providing service to their users without paying anything. And now they surpassed me on Google ranking. I’ve created a website which gets around 5K unique hits every day. It’s a free service for users but I’ve to pay a monthly fee to a third party service provider. Because my site is free for users and doesn’t require users to register it’s been very hard to keep up with this guy. If I change certain things, they counter it immediately and make it work. And they use several proxies to send the request, it’s virtually impossible to block based on IP. Please suggest, if there is anything I’m missing that can be done.”
There are some suggestions such as trap streets, which are similar to paper towns: mapmakers include fake streets and towns to identify whether anyone steals their work. Google used something similar when it suspected that Bing was stealing its search results. The proof was in the search results pudding: Bing did scrape the results.
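For a Web site, the trap street idea can be sketched in a few lines of Python. This is a minimal illustration, not what Google or the Hacker News poster did: the secret key, the "zz-" prefix, and the function names are all made up for the example. The site owner seeds the published data with fake entries only he can regenerate, then checks whether a competitor's pages contain them.

```python
import hashlib
import hmac

# Hypothetical private key; only the site owner can regenerate the traps.
SECRET_KEY = b"replace-with-a-private-key"


def make_trap_entry(index: int) -> str:
    """Derive a fake but repeatable entry from the secret key."""
    digest = hmac.new(SECRET_KEY, f"trap-{index}".encode(), hashlib.sha256)
    return "zz-" + digest.hexdigest()[:10]


def seed_with_traps(real_entries, trap_count=3):
    """Return the list to publish plus the trap entries hidden in it."""
    traps = [make_trap_entry(i) for i in range(trap_count)]
    return list(real_entries) + traps, traps


def find_copied_traps(suspect_text: str, traps):
    """List the trap entries that turn up in someone else's content."""
    return [trap for trap in traps if trap in suspect_text]
```

A trap entry that appears on the other site is hard to explain as coincidence, which is the point of the paper town. It proves copying; it does not stop it.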
Google also might be helpful in getting them delisted, because the competitor is stealing the content which is illegal. Another suggestion is that the original content writer have the competitor license the original content as an extra revenue stream. Will any of these work?
Whitney Grace, September 17, 2018
High School Science Club Management Method: Protecting the In Crowd Culture
September 17, 2018
I read a quasi news / semi MBA write up with the clicky title “A Wave of News Leaks Is Triggering a Crackdown at Google and Causing Fears That the Culture Is Being Openly Destroyed.” You know my procedure. First, I check out the loaded words in a write up. Not too tough because “crackdown” and “fear” are front and center. But the keeper is “destroyed.” Not damaged. Destroyed. Yikes.
There is one word I hoped the write up would define. It is “culture.” This may be one of the “I will know it when I encounter it” terms. Here in Harrod’s Creek we have culture. When one wants to shoot a squirrel in a neighbor’s tree, one pulls the trigger. Squirrels are fair game no matter where they are. Neighbors? Hah. Should have spotted the critter first and nailed it.
I don’t think too many Googlers or Alphabeters think about squirrels. I assume that if a squirrel were to find itself in need at the Googleplex, a squad of high technology wizards with minors in animal husbandry would rush to aid the furry creature. Another group of Googlers with degrees in food science would debate the virtues of frying versus grilling the animal. A third group of Googlers might make signs and protest improper intervention into the life of the confused rat like creature.
Ah, Google.
The write up, however, does not address these issues, whether squirrely or not.
I learned:
The increasingly heated and contentious atmosphere within Google mirrors the highly politicized nature of the country. As on the political stage, behavior within Google that was once considered unthinkable is now occurring with increasing regularity.
Okay, no definition of culture. Not too much about the destroy and fear thing.
Maybe crackdown? I noted after a bit of chatter about employees who send Twitter messages during meetings:
“People who leak are hated” internally, one source said. “There’s a reasonably open culture that many feel is being openly destroyed…. “There’s a perception that if you leak you’re destroying communication,” the source said.”
None of that anonymous stuff. This is a “source.” Helpful.
There’s a new security measure too:
Google informed employees in a weekly email update that the TGIF meetings would no longer be available to be streamed on individual laptops. Instead, employees who were not at the main event at a cafe in Google’s Mountain View, Calif. campus, would need to show up at special designated locations within its satellite offices to watch a feed of the proceedings. Anyone working from home or wishing to tune in from their desk while working was now out of luck.
Okay, the high school science club management method of restricting meetings and keeping the non sciclub types out. Insiders only. But only insiders who are actually inside something.
So much for the legions of remote employees, those traveling, or the hapless consultants who are “sort of like” employees.
I will keep looking for more HSSCM methods. These are useful and informative. Almost as nifty as leaked videos and real time Twitter messages, the follow up real news stories, and the wild and crazy apologia which Silicon Valley pundits contribute to the Gray Lady.
And the squirrel? Not qualified to be a Google target yet. And what is “culture” anyway? If it is not defined, can it be destroyed?
Stephen E Arnold, September 17, 2018
Machine Learning Frameworks: Why Not Just Use Amazon?
September 16, 2018
A colleague sent me a link to “The 10 Most Popular Machine Learning Frameworks Used by Data Scientists.” I found the write up interesting despite the author’s failure to define the word popular and the bound phrase data scientists. But few folks in an era of “real” journalism fool around with my quaint notions.
According to the write up, the data come from an outfit called Figure Eight. I don’t know the company, but I assume their professionals adhere to the basics of Statistics 101. You know the boring stuff like sample size, objectivity of the sample, sample selection, data validity, etc. Like information in our time of “real” news and “real” journalists, some of these annoying aspects matter when churning out data in which an old geezer like me can have some confidence. You know like the 70 percent accuracy of some US facial recognition systems. Close enough for horseshoes, I suppose.
Here’s the list. My comments about each “learning framework” appear in italics after each “learning framework’s” name:
- Pandas — an open source, BSD-licensed library
- Numpy — a package for scientific computing with Python
- Scikit-learn — another BSD licensed collection of tools for data mining and data analysis
- Matplotlib — a Python 2D plotting library for graphics
- TensorFlow — an open source machine learning framework
- Keras — a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano
- Seaborn — a Python data visualization library based on matplotlib
- Pytorch & Torch
- AWS Deep Learning AMI — infrastructure and tools to accelerate deep learning in the cloud. Not to be annoying but defining AMI as Amazon Machine Image might be useful to some
- Google Cloud ML Engine — neural-net-based ML service with a typically Googley line up of Googley services.
Stepping back, I noticed a handful of what I am sure are irrelevant points which are of little interest to “real” journalists creating “real” news.
First, notice that the list is self referential with python love. Frameworks depend on other python loving frameworks. There’s nothing inherently bad about this self referential approach to whipping up a list, and it makes it a heck of a lot easier to create the list in the first place.
Second, the information about Amazon is slightly misleading. In my lecture in Washington, DC on September 7, I mentioned that Amazon’s approach to machine learning supports Apache MXNet and Gluon, TensorFlow, Microsoft Cognitive Toolkit, Caffe, Caffe2, Theano, Torch, PyTorch, Chainer, and Keras. I found this approach interesting, but of little interest to those creating a survey or developing an informed list about machine learning frameworks; for example, Amazon is executing a quite clever play. In bridge, I think the phrase “trump card” suggests what the Bezos momentum machine has cooked up. Notice the past tense because this Amazon stuff has been chugging along in at least one US government agency for about four, four and one half years.
Third, Google comes in dead last. What about IBM? What about Microsoft and its CNTK? Ah, another acronym, but I as a non real journalist will reveal that this acronym means Microsoft Cognitive Toolkit. More information is available in Microsoft’s wonderful prose at this link. By the way, the Amazon machine learning spinning momentum thing supports the CNTK. Imagine that? Right, I didn’t think so.
Net net: The machine learning framework list may benefit from a bit of refinement. On the other hand, just use Amazon and move down the road to a new type of smart software lock in. Want to know more? Write benkent2020 @ yahoo dot com and inquire about our for fee Amazon briefing about machine learning, real time data marketplaces, and a couple of other mostly off the radar activities. Have you seen Amazon’s facial recognition camera? It’s part of the Amazon machine learning initiative, and it has some interesting capabilities.
Stephen E Arnold, September 16, 2018
Google: Making Cross Correlation Factually Fluffy General Tso Dish?
September 15, 2018
I read “Google China Prototypes Links Searches to Phone Number”. This is one of those write ups which offers some possibly accurate information attributed to anonymous sources or “sources familiar with the project.”
Nifty. Nothing like anonymity.
But for the moment, let’s assume that queries from mobile devices are explicitly linked to a specific mobile device. What’s the big deal?
According to the write up:
“This is very problematic from a privacy point of view, because it would allow far more detailed tracking and profiling of people’s behavior,” said Cynthia Wong, senior internet researcher with Human Rights Watch. “Linking searches to a phone number would make it much harder for people to avoid the kind of overreaching government surveillance that is pervasive in China.”
What I find interesting is that the Google China service would be a joint venture. That means that both parties to the deal can perform cross correlation from unencrypted, direct data flows from the individuals who use the alleged Google China search system.
From my point of view, those with these data streams could:
- Identify patterns of behavior related to queries, locations, and purchases in real time
- Use the sensor devices on the mobile devices to discern useful items of information; for example, pass codes
- Make relationship maps which reveal who calls whom or who texts whom under what circumstances as part of the normal data stream.
I can visualize a scenario featuring Cambridge Analytica style reports about who is interested in what, when, and under what “circumstances” an action, event, or idea occurs.
I don’t have any anonymous sources to cite. I am just thinking about how difficult it is to determine what’s accurate and what’s a confection spun by a fictionist like Alastair Reynolds.
Speculation is fun and easy. Maybe not accurate? Slight downside. But “real” journalism is almost as interesting as revelation space. But if the write up is accurate, Google may be operating in a galaxy far, far away from Harrod’s Creek and its outmoded ideas about objective search.
Stephen E Arnold, September 15, 2018
Google and the Right to Be Mostly Removed from an Index
September 14, 2018
Yeah, the deletion thing.
I am able to recall some exciting “deletion” events over my 50 year working career. Let me recount one amusing deletion event. The year is 1980 (give or take a year or two). The topic was the Capital Holding IBM mainframe system running the mission critical IBM CICS (Customer Information Control System). The CICS system and its many components were designed to make it theoretically impossible to delete a record when a high priority process was running in memory. Yes, gentle reader, in memory with data not yet written to disc. The technically fascinating Capital Holding computer center and its mainframes are no more, and on that day in 1980 neither was the data which, according to the IBM CICS manual, could not be deleted.
Yeah, well.
I did not work at Capital Holding; I worked at the Louisville Courier Journal database unit, and we supported our electronic products on IBM MVS TSO systems at Bell Labs. Close enough for horseshoes, right? I sat in the meeting for an hour and contributed one comment, “Fiddling with live CICS processes by deleting a record is not a good idea. Find a work around. I have to go.” I left the wizards of the insurance business to sort out the reality of what happens when you poke around in an IBM in memory process. By the way, you can kill an AS/400 database process and the data with an ill advised delete.
At Capital Holding, one of the Job Control Language crew managed to issue a command and trash the database and whatever else was in memory at the time.
Yeah, well.
In retrospect, this was a useful reminder to me that one does not remove things from an index. One finds a way to leave the thing in the index and make sure the thing does not show up in a query. To the outsider, the data are gone. To someone who knows how the “gone” was implemented, the data are still in the index, probably on disc somewhere, and maybe on a tape in an Iron Mountain cave too. But “gone” means that the managers and lawyers in carpetland can demonstrate the datum is indeed gone.
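The "leave it in the index, hide it from the query" approach can be sketched in Python. This is a toy under stated assumptions, not a description of how Google or anyone else implements removal: the class name TinyIndex and its methods are invented for the illustration. Suppressing a record only adds its id to a filter list; the postings and the stored text are untouched.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index in which 'deleted' records are only hidden."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of docs containing it
        self.docs = {}                    # doc id -> original text
        self.suppressed = set()           # doc ids hidden from queries

    def add(self, doc_id, text):
        """Store the text and index each whitespace separated term."""
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def suppress(self, doc_id):
        """Make a record 'gone' for searchers without touching the index."""
        self.suppressed.add(doc_id)

    def search(self, term):
        """Return matching doc ids, minus anything on the suppressed list."""
        hits = self.postings.get(term.lower(), set())
        return sorted(hits - self.suppressed)
```

After a call to suppress, a search no longer returns the record, yet anyone who can read postings or docs directly still finds it. Making the datum truly disappear means rewriting the postings, which is the rebuild work mentioned further on.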
Yeah, well. Like the internal Google video, gone is relative.
I thought of this when I read “Google Digs In Heels Over Global Expansion of EU’s Right to Be Forgotten.” The write up does not explain that stuff in an index may never really go away. I don’t think the EU cares, and I know that users who want information about people who want certain information to never be displayed don’t care about how. The goal is to have the information disappear.
Yeah, well.
Google may have some business, political, social, and economic reasons to stop this deletion demand.
From my rural Kentucky redoubt, I wonder if the Google wants to figure out how to delete information from an index without creating more work, more computational costs, and more headaches when the CICS behavior surfaces somewhere in the sensitive plant that is the global Google computing infrastructure. Of course, one can rebuild the indexes and really make the datum disappear, but rebuilds are interesting. Really expensive too when measured in terms of machine time, lost uptime, etc., etc.
The write up does a good job of explaining the non technical aspects of the issue.
I am sitting here wondering if Google when forced to delete lots of stuff from its indexes is concerned about the specific methodology of removing and removing and removing from a dynamic, distributed index.
IBM asserted that its delete function could not operate when a CICS process was chugging along.
Yeah, well.
Stephen E Arnold, September 14, 2018
Microsoft Pushes Forward in Its Effort to Be More Like Amazon
September 14, 2018
Amazon is making progress in the policeware sector. The company has contracts and some strong supporters in the US government and elsewhere. Like Amazon, Microsoft covets the JEDI cloud contract. The deal is an important one because the Department of Defense has been a unit which uses and likes Microsoft’s software.
If Amazon snags the JEDI contract, Amazon may be in a position to chew into Microsoft’s revenue streams flowing from the DoD and maybe other US governmental entities. Microsoft already faces a loss of young users to the Chrome approach to personal computing.
What to do?
One answer seems to be to buy another “make it easy to program” tool for machine learning and other red hot buzzwords.
We learned in “Microsoft Acquires AI startup Lobe to Help People Make Deep Learning Models without Code” that Microsoft acqui-hired more programming tools capabilities. The company Lobe, it seems, offers a partial answer to the Amazon SageMaker software systems.
Perhaps this and other acquisitions will give Microsoft a way to scale Mt. JEDI, but we think that’s a long shot for three reasons:
- Amazon has been preparing for policeware for about a decade; Microsoft is late to the game
- Amazon has a hardware and software platform which creates a policeware ecosystem when combined with the Amazon content streams and archives; Microsoft does not have this “vision” and “reality”
- Amazon’s policeware tactic is part of a larger strategy to become more than a provider of cloud resources and some interesting applications.
What’s the big picture? If you want to discuss a presentation of my Amazon policeware lecture delivered in Prague and Washington, DC in the last three months, let me know by writing benkent2020 at yahoo dot com. If not, well, that’s okay with me and probably Amazon, an outfit which does not emit too many chunks of information about this SageMaker thing.
Stephen E Arnold, September 14, 2018
How Many Lawsuits Can Fit on the Shoulders of the Google?
September 14, 2018
Google users who disabled its location tracking services are very upset, because Google is still tracking them. According to Gizmodo, there is now, “A Lawsuit Over Google’s Sneaky Location Tracking Could Be A Game-Changer.” Google is not apologetic about its sneaky tactics and has changed its location policy. California resident Napoleon Patacsil is upset enough to take Google to court. Patacsil wants a judge to grant his case a class-action status so other Google users can join.
Google fooled users by making it seem very simple to opt out of location tracking, but it is not:
“A slider control on the Location History section seemed to state that this was a one-stop shop to prevent Google hanging onto your location data. A support page for the feature read, “With Location History off, the places you go are no longer stored.” But that wasn’t entirely true. In order to fully opt-out of having your location activity stored by Google, you have to also pause the Web & Activity control as well. This is acknowledged if a user digs deeper into Google’s product documentation.”
Google responded by changing the wording in its location policy, stating that some of its services will continue to track users. Patacsil’s case includes evidence showing how Google continued to track user information, even when the option was turned off. The argument is that this violates California privacy laws and an individual’s privacy expectations.
“The biggest question the courts will have to consider is whether or not Google met its legal obligation to obtain consent from its users. Does burying all of the information a user needs deep within separate documents on separate web pages adequately inform a user about what they are agreeing to? If all that information is collected in one document located at a separate portal, would that qualify as sufficient explanation of a company’s policies?”
If the lawsuit does gain traction, then others could jump on the bandwagon. Google still does not admit any fault, but has agreed not to misrepresent its privacy practices anymore. Google is probably going to wait for this to blow over. The company spurs too much of California’s economy to lose its business license.
Whitney Grace, September 2, 2018
Facebook: The Old Is Newish Again
September 14, 2018
Social media giant, Facebook, has been making a very public effort to clean up its act and establish a greater sense of security for users. As this campaign is underway, more troubling news recently came out regarding the platform. We learned all the disturbing information from a recent article in The Verge, “How Autocratic Governments Use Facebook Against Their Own Citizens.”
According to the story:
“Armed groups use Facebook to find opponents and critics, some of whom have later been detained, killed or forced into exile, according to human rights groups and Libyan activists…Swaggering commanders boast of their battlefield exploits and fancy vacations, or rally supporters by sowing division and ethnic hatred. Forged documents circulate widely, often with the goal of undermining Libya’s few surviving national institutions.”
While this is indeed interesting news, we’d say that it is not just limited to autocratic regimes. Take, for example, the news that the US government would like to start wiretapping Facebook’s messenger app. Clearly, some governments are using social media for more overt evil; however, we can’t imagine a nation in the world that would overlook this powerful tool and not consider ways to use it for its own good.
Patrick Roland, September 14, 2018
Quote to Note: The Election. Yes, That Election
September 13, 2018
I read CNBC’s “Leaked Video Shows Upset Alphabet Executives Responding to President Trump’s Election in Company-Wide Meeting.”
Here’s the quote I noted. I even circled it with my True Blue marker which I usually reserve for IBM Watson factoids:
Myself as an immigrant and a refugee, I certainly find the election deeply offensive and I know many of you do too. I think it’s a very stressful time and conflicts with many of our values.
Who made this statement? According to the “real” news source CNBC, the answer is Sergey Brin, one of the founders of Google.
The “real” news outlet reports “real” news from Google. The article reproduces text of an alleged statement made by a Google spokesperson (yes, this is similar to an anonymous source so keep this in mind). That statement asserts:
At a regularly scheduled all hands meeting, some Google employees and executives expressed their own personal views in the aftermath of a long and divisive election season. For over 20 years, everyone at Google has been able to freely express their opinions at these meetings. Nothing was said at that meeting, or any other meeting, to suggest that any political bias ever influences the way we build or operate our products. To the contrary, our products are built for everyone, and we design them with extraordinary care to be a trustworthy source of information for everyone, without regard to political viewpoint.
Interesting stuff. Is it possible that an algorithm, particularly a Clever one like Google’s, could be influenced by individuals who set thresholds, sequence the order of numerical recipes, and feed inputs into a system? Could these individuals make the math perform like a circus pony?
Here in Harrod’s Creek we know that companies like Google let the math do the talking. Well, there are exceptions like company wide meetings and anonymous statements.
Stephen E Arnold, September 13, 2018