Monkeys Cause System Failure

July 28, 2015

Nobody likes to talk about his or her failures.  Admitting to failure proves that you failed at a task in the past and it is a big blow to the ego.  Failure admission is even worse for technology companies, because users want to believe technology is flawless.  On Microsoft’s Azure Blog, Heather Nakama posted “Inside Azure Search: Chaos Engineering” and she explains that software engineers are aware that failure is unavoidable.  Rather than trying to prevent failure, they welcome potential failure.  Why?  It allows them to test software and systems to prevent problems before they develop.

Nakama mentions it is not a sustainable model to account for every potential failure and to test the system every time it is upgraded.  Azure Search borrowed chaos engineering from Netflix to resolve the issue and it is run by a bunch of digital monkeys

“As coined by Netflix in a recent excellent blog post, chaos engineering is the practice of building infrastructure to enable controlled automated fault injection into a distributed system.  To accomplish this, Netflix has created the Netflix Simian Army with a collection of tools (dubbed “monkeys”) that inject failures into customer services.”

Netflix basically unleashes a Search Chaos Monkey into its system to wreck havoc, then Netflix learns about system weaknesses and repairs accordingly.  There are several chaos levels: high, medium, and low, with each resulting in more possible damage.  At each level, Search Chaos Monkey is given more destructive tools to “play” around with.  The high levels are the most valuable to software engineers, because it demonstrates the largest and worst diagnostic failures.

While letting a bull loose in a china shop is bad, because you lose your merchandise, letting a bunch of digital monkeys loose in a computer system is actually beneficial.  It remains true that you can learn from failure.  I just hope that the digital monkeys do not have digital dung.

Whitney Grace, July 28, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

 

Comments

Comments are closed.

  • Archives

  • Recent Posts

  • Meta