Imagine a monkey getting into the data center that holds all of your business's critical information. The results could be ripped cables, destroyed devices, power failures and a lot more. The challenge for IT managers is to design the systems they are responsible for so that they keep working despite these monkeys, even though no one knows when they will arrive or what they will destroy!
Netflix turned this idea into a tool called Chaos Monkey to test the resiliency and recoverability of its infrastructure. It is part of a larger family of open source tools called the Simian Army, which is built to wreak havoc on live production environments by introducing network delays, taking data center segments offline, or identifying security vulnerabilities. The idea is to simulate major failures on purpose, so that you find out whether the systems are robust enough to survive them before a real failure does it for you.
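To make the idea concrete, here is a toy sketch in Python. It is not Netflix's code and does not use any real Chaos Monkey API; the names `Instance`, `Service`, `chaos_monkey` and `run_experiment` are made up for illustration. It assumes a service backed by a few interchangeable instances, lets a "monkey" switch off one live instance at random, and then checks that the service still answers.

```python
import random


class Instance:
    """A hypothetical server that can be switched off by the monkey."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Service:
    """Sends each request to the first instance that still responds."""

    def __init__(self, instances):
        self.instances = instances

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # fail over to the next instance
        raise RuntimeError("no healthy instances left")


def chaos_monkey(instances, rng):
    """Switch off one randomly chosen live instance; return it, or None."""
    alive = [i for i in instances if i.alive]
    if not alive:
        return None
    victim = rng.choice(alive)
    victim.alive = False
    return victim


def run_experiment(n_instances=3, kills=2, seed=0):
    """Kill instances one at a time and probe the service after each kill.

    Returns the number of surviving instances, or raises RuntimeError
    as soon as the service can no longer answer a request.
    """
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    instances = [Instance(f"node-{i}") for i in range(n_instances)]
    service = Service(instances)
    for _ in range(kills):
        chaos_monkey(instances, rng)
        service.handle("ping")  # raises if the failure was not survivable
    return sum(i.alive for i in instances)


if __name__ == "__main__":
    print("survivors:", run_experiment())
```

With three instances and two kills the service keeps answering. Ask the monkey for as many kills as there are instances and the experiment fails, which is exactly the kind of weakness you would rather discover in a test than during a real outage.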
This content was originally published for my TechTuesday’s initiative on LinkedIn.