Monkeys Cause System Failure
July 28, 2015
Nobody likes to talk about his or her failures. Admitting to failure proves that you failed at a task in the past and it is a big blow to the ego. Failure admission is even worse for technology companies, because users want to believe technology is flawless. On Microsoft’s Azure Blog, Heather Nakama posted “Inside Azure Search: Chaos Engineering” and she explains that software engineers are aware that failure is unavoidable. Rather than trying to prevent failure, they welcome potential failure. Why? It allows them to test software and systems to prevent problems before they develop.
Nakama mentions it is not a sustainable model to account for every potential failure and to test the system every time it is upgraded. Azure Search borrowed chaos engineering from Netflix to resolve the issue, and it is run by a bunch of digital monkeys:
“As coined by Netflix in a recent excellent blog post, chaos engineering is the practice of building infrastructure to enable controlled automated fault injection into a distributed system. To accomplish this, Netflix has created the Netflix Simian Army with a collection of tools (dubbed “monkeys”) that inject failures into customer services.”
Azure Search basically unleashes a Search Chaos Monkey into its system to wreak havoc, then learns about system weaknesses and repairs accordingly. There are several chaos levels: low, medium, and high, with each resulting in more possible damage. At each level, Search Chaos Monkey is given more destructive tools to “play” around with. The high levels are the most valuable to software engineers, because they demonstrate the largest and worst failures.
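For readers who want to see the idea in miniature, here is a rough Python sketch of level-based fault injection. It is not Azure Search's or Netflix's code: the `ToyService` class, the fault names, and the `FAULTS_BY_LEVEL` table are all invented for illustration, on the assumption that each chaos level simply unlocks a larger set of faults.

```python
import random

# Hypothetical fault catalog: each chaos level unlocks more destructive faults.
FAULTS_BY_LEVEL = {
    "low": ["add_latency"],
    "medium": ["add_latency", "kill_replica"],
    "high": ["add_latency", "kill_replica", "kill_all_replicas"],
}


class ToyService:
    """A stand-in for a distributed service with several replicas."""

    def __init__(self, replicas=3):
        self.alive = [True] * replicas
        self.latency_ms = 10

    def is_available(self):
        # The service answers as long as at least one replica is alive.
        return any(self.alive)


def inject_fault(service, fault, rng):
    """Apply one named fault to the toy service."""
    if fault == "add_latency":
        service.latency_ms += 100
    elif fault == "kill_replica":
        service.alive[rng.randrange(len(service.alive))] = False
    elif fault == "kill_all_replicas":
        service.alive = [False] * len(service.alive)
    else:
        raise ValueError(f"unknown fault: {fault}")


def run_chaos(service, level, rounds, seed=0):
    """Inject `rounds` random faults allowed at `level`; return the fault log."""
    rng = random.Random(seed)
    log = []
    for _ in range(rounds):
        fault = rng.choice(FAULTS_BY_LEVEL[level])
        inject_fault(service, fault, rng)
        log.append(fault)
    return log


service = ToyService()
print(run_chaos(service, "medium", rounds=5), service.is_available())
```

The point of the sketch is the shape of the loop: pick a fault the current level permits, apply it, record it, and then check what the service looks like afterward.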
While letting a bull loose in a china shop is bad, because you lose your merchandise, letting a bunch of digital monkeys loose in a computer system is actually beneficial. It remains true that you can learn from failure. I just hope that the digital monkeys do not have digital dung.
Whitney Grace, July 28, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Google’s Chauvinistic Job Advertising Delivery
July 28, 2015
I thought we were working to get more women into the tech industry, not fewer. That’s why it was so disappointing to read, “Google Found to Specifically Target Men Over Women When It Comes to High-Paid Job Adverts” at IBTimes. It was a tool dubbed AdFisher, developed by some curious folks at Carnegie Mellon and the International Computer Science Institute, that confirmed the disparity. Knowing that internet-usage tracking determines what ads each of us sees, the researchers wondered whether such “tailored ad experiences” were limiting employment opportunities for half the population. Reporter Alistair Charlton writes:
“AdFisher works by acting as thousands of web users, each taking a carefully chosen route across the internet in such a way that an ad-targeting network like Google Ads will infer certain interests and characteristics from them. The programme then records which adverts are displayed when it later visits a news website that uses Google’s ad network. It can be set to act as a man or woman, then flag any differences in the adverts it is shown.
“Anupam Datta, an associate professor at Carnegie Mellon University, said in the MIT Technology Review: ‘I think our findings suggest that there are parts of the ad ecosystem where kinds of discrimination are beginning to emerge and there is a lack of transparency. This is concerning from a societal standpoint.’”
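The experiment described in the quote can be imitated with a toy simulation. The Python sketch below is only an illustration: `toy_ad_network` is a made-up ad server whose probabilities are invented, not measured, and the permutation test is one plain way to flag a difference between two groups of simulated profiles, not necessarily the method AdFisher itself uses.

```python
import random


def toy_ad_network(profile_gender, rng):
    """Hypothetical ad server: True if a high-paying job ad is shown.
    The probabilities are invented for illustration, not real data."""
    p = 0.30 if profile_gender == "male" else 0.10
    return rng.random() < p


def collect(gender, agents, rng):
    """Simulate `agents` browser profiles and record whether each saw the ad."""
    return [toy_ad_network(gender, rng) for _ in range(agents)]


def permutation_p_value(a, b, rng, trials=2000):
    """Share of random relabelings whose rate gap is at least the observed gap."""
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        gap = abs(sum(left) / len(left) - sum(right) / len(right))
        if gap >= observed:
            hits += 1
    return hits / trials


rng = random.Random(42)
men = collect("male", 500, rng)
women = collect("female", 500, rng)
p_value = permutation_p_value(men, women, rng)
print(sum(men), sum(women), p_value)
```

A small p-value here only says the toy server treats the two groups differently, which is true by construction. The real study had to infer that from ads served by a system it could not inspect.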
Indeed it is, good sir. The team has now turned AdFisher’s attention to Microsoft’s Bing; will that search platform prove to be just as chauvinistic? For Google’s part, they say they’re looking into the study’s methodology to “understand its findings.” It remains to be seen what sort of parent the search giant will be; will it simply defend its algorithmic offspring, or demand it mend its ways?
Cynthia Murrell, July 28, 2015
PageRank: Viewed through the Linear Algebra Sunglasses
July 27, 2015
I urge you to read and work through the examples in “The $25,000,000,000,000 Eigenvector: The Linear Algebra behind Google.” The write up was a tour de force when it became available in 2006. The article explains Google PageRank’s ability to display “the good stuff.” The idea is that the system and method set forth in the PageRank patent finds the most “relevant” Web pages matching a query.
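For those who prefer code to sunglasses, the eigenvector idea can be sketched in a few lines of Python. The ranking vector is the dominant eigenvector of the damped link matrix, and power iteration approximates it by repeatedly passing each page's score along its outbound links. The four-page `toy_web` graph is made up for this illustration, and the damping value of 0.85 is a commonly quoted figure assumed here; neither is taken from the article's worked examples or from Google's production system.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a dict of {page: [pages it links to]}."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page with no outbound links spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C collects links from A, B, and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph page C ends up on top because three pages link to it, and page D ends up at the bottom because nothing links to it.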
I won’t trouble you with references to some work by Dr. Jon Kleinberg for the CLEVER system. I won’t peel back any of the wrappers which have been layered around the PageRank system after the company went public in 2004.
I will call your attention to a Wall Street Journal article which strikes me as reasonably accurate. The story is “How Google Skewed Search Results.” If you cannot access this document, check out the also acceptable article “Google May Be Hurting Users by Manipulating Search Results, Says Study.”
Fancy math is great in the classroom. Is it possible that good old human intervention works in some situations? And if manipulation does achieve a desired goal, what’s the point of belaboring the fancy math, precedent work, or the subsequent wrapper code?
Stephen E Arnold, July 27, 2015
Unemployed in Search or Content Processing? Go for Data Science
July 27, 2015
I read an amazing write up. The title of this gem of high school counseling is “7 Skills/Attitudes to Become a Better Data Scientist.” What does one need to be a better data scientist? Better Python or R programming methods? Sharper mathematical intuition? Ability to do the least upper bound (sup) and greatest lower bound (inf) of a set of real numbers in your head, without paper, and none of that Mathematica software? Wrong.
What you need is intellectual curiosity, an understanding of business, the ability to communicate (none of the Cool Hand Luke pithiness), knowledge of more than one programming language, knowledge of SQL, participation in competitions, and a habit of reading articles like “7 Skills and Attitudes.”
Yep, follow these tips and you too can be a really capable data scientist. Why wait? Act now. Read the “7 Skills” article. Nah, don’t worry about such silly notions as data integrity or statistical procedures. Talk to someone, anyone and you will be 14.28 percent of the way to your goal.
Stephen E Arnold, July 27, 2015
PowerPoint Enabled Big Data Presenters Rejoice
July 27, 2015
Navigate to “A Plethora of Big Data Infographics.” Note that the original write up misspells “plethora” as “pletora” but, as many in Big Data say, “it is close enough for horseshoes.”
I quit browsing after a baker’s dozen of these puppies. If you want to be an expert in Big Data, these charts will do the trick. I would steer clear of a person with a PhD in statistics, however.
Stephen E Arnold, July 27, 2015
Instagram’s Search Feature Is A Vast Improvement
July 27, 2015
Instagram apparently knows more about your life than you or your friends. The new search overhaul comes with new features that reveal more information than you ever expected to get from Instagram. VentureBeat reviews the new search feature and explains how it works: “Hands-On: Instagram’s New Search And Explore Features Are A Massive Improvement.”
Many of the features are self-explanatory, but have improved interactivity and increased the amount of eye candy.
- Users can Explore Posts, which are random photos from all over Instagram and they can be viewed as a list or thumbnails.
- The Discover People feature suggests possible people for users to follow. According to the article, it dives deep into your personal social network and suggests people you never thought Instagram knew about.
- Curated Collections offer content based on pre-selected categories that pull photos from users’ uploads.
Trending tags is another new feature:
“Trending Tags is Instagram’s attempt at gauging the platform’s pulse. If you’ve ever wondered what most people on Instagram are posting about, trending tags has the answer. These seemed very random and oddly insightful.”
Instagram is quickly becoming a more popular social media platform than Facebook and Twitter for some people. Its new search feature makes it more appealing to users and increases information discovery. Rest assured, you will be spending hours on it.
Whitney Grace, July 27, 2015
Data Companies Poised to Leverage Open Data
July 27, 2015
Support for open data, government datasets freely available to the public, has taken off in recent years; the federal government’s launch of Data.gov in 2009 is a prominent example. Naturally, some companies have sprung up to monetize this valuable resource. The New York Times reports, “Data Mining Start-Up Enigma to Expand Commercial Business.”
The article leads with a pro bono example of Enigma’s work: a project in New Orleans that uses that city’s open data to identify households most at risk for fire, so the city can give those folks free smoke detectors. The project illustrates the potential for good lurking in sets of open data. But make no mistake, the potential for profits is big, too. Reporter Steve Lohr explains:
“This new breed of open data companies represents the next step, pushing the applications into the commercial mainstream. Already, Enigma is working on projects with a handful of large corporations for analyzing business risks and fine-tuning supply chains — business that Enigma says generates millions of dollars in revenue.
“The four-year-old company has built up gradually, gathering and preparing thousands of government data sets to be searched, sifted and deployed in software applications. But Enigma is embarking on a sizable expansion, planning to nearly double its staff to 60 people by the end of the year. The growth will be fueled by a $28.2 million round of venture funding….
“The expansion will be mainly to pursue corporate business. Drew Conway, co-founder of DataKind, an organization that puts together volunteer teams of data scientists for humanitarian purposes, called Enigma ‘a first version of the potential commercialization of public data.’”
Other companies are getting into the game, too, leveraging open data in different ways. There’s Reonomy, which supplies research to the commercial real estate market. Seattle-based Socrata makes data-driven applications for government agencies. Information discovery company Dataminr uses open data in addition to Twitter’s stream to inform its clients’ decisions. Not surprisingly, Google is a contender with its Sidewalk Labs, which plumbs open data to improve city living through technology. Lohr insists, though, that Enigma is unique in the comprehensiveness of its data services. See the article for more on this innovative company.
Cynthia Murrell, July 27, 2015
Semantic Promotions and a Nutrition Free Exercise
July 26, 2015
I saw a link to an item called “5 Basic Steps to Make Sure You Hit Page 1 on Google.” I followed it to this message:
One link pointed to this page:
But this image was the message.
Intrigued I chased the other urls in the post and located this write up:
What happened? I am not sure how the link bait promising number one on Google leads to “Semantic Search: The Future of Marketing.” There must be an informing hand somewhere.
I looked at the write up. Like many odd duck semantic search honks, the article explains that semantic search allows software to understand the context of a search. I am not sure how software would have navigated the original message but broken html is not the focus of the future of marketing. Or is it?
The write up zips through schemas, provides an example of structured data testing courtesy of the Google, dips into the knowledge graph thing, mentions Google’s direct answers, and more. Just briefly and in an earthworm manner. The write up strings out screen shots in order to explain the future of marketing I assume.
The conclusion is an interesting collection of “opportunities.” One of them is to be “mobile friendly.” Got it.
Now if we go back to the starting point we see that this is a collection of links and digital pabulum, an insufficient comestible for the addled goose. My hunch is that others may want some more substantial victuals. On the other hand, SEO experts may find the article’s information a feast at the Golden Arches.
Stephen E Arnold, July 26, 2015
Forbes and Some Big Data Forecasts
July 26, 2015
Short honk: For fee, mid tier consultants have had their thunder stolen. Forbes, the capitalist tool, wants to make certain its readers know how juicy Big Data is as a market. Navigate to “Roundup Of Analytics, Big Data & Business Intelligence Forecasts And Market Estimates, 2015.”
The write up summarizes the eye watering examples of spreadsheet fever’s impact on otherwise semi-rational MBAs, senior managers, and used car sales professionals. IDC, without the inputs of Dave Schubmehl, comes up with a spectacular number: $125 billion in 2015.
Sounds good, right?
The data will find their way into innumerable PowerPoint presentations. Snag ‘em while you can.
Stephen E Arnold, July 26, 2015
Hewlett Packard: A Fashion Forward Management Moment
July 26, 2015
I read “HP Bans T Shirts at Work, and Employees Are Furious.” The write up explains:
several teams within HP’s 100,000-employee-strong Enterprise Services division were sent a confidential memo cracking down on casual dress in the workplace, because higher-ups in the company are concerned that customers visiting the offices will be put off by dressed-down developers, reports The Register.
I enjoy the management antics of HP. I recall the dust up with the board of directors. There is the MBA play of splitting the company in half in hopes of doubling the “value” for someone, maybe a banker or a senior manager. And, who can forget, HP’s purchase of Autonomy, the subsequent mea culpa, and the long flights of legal eagles. A deft touch for sure.
The write up states:
An HP spokesperson said the company does not have a global dress code but had no immediate comment on the report of the memo about the Enterprise group dress code.
Organized to a T shirt like this one:
Stephen E Arnold, July 26, 2015