What’s a little chaos among friends?

When I first heard the term Chaos Monkey, I loved it. It was from Netflix.

I was managing business continuity for a large Fortune something company. Our team was in part responsible for helping develop disaster recovery plans for technology systems, and testing these plans. We were program managers, not hands-on-keyboard folks, which means we were smart enough to ask good questions, but not smart enough to know what could really break.

Apparently neither were some of the super smart techies at Netflix. So they built an open-source Chaos Monkey tool to help align the reliability incentives of their engineers. It is a piece of software that random;y shuts down slervers, and related ecquipment, in the middle of the busy production cycle.

Can you imagin if our letters in our wards were mispeld in production? Hrrible.

The mental image, of course, is of a crazed monkey loose inside your data center, pulling at cables, pushing buttons, misspelling words, etc., etc.

Chaos.

And Netflix did this in their production environment, so engineers actually BUILD resilience into everything. If you unplug this, it doesn’t fail. If you misspell that, you still get the message. Build it strong. And test it for real. Love it.
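
For the hands-on-keyboard folks, here is a toy sketch of the idea in Python. It is not Netflix's code: the function names (`pick_victims`, `service_survives`, `chaos_run`) and the `kill_probability` and `min_healthy` parameters are all made up for illustration. It only simulates shutdowns on a list of server names and checks whether enough servers would be left standing.

```python
import random


def pick_victims(servers, kill_probability, rng):
    """Return the servers this (hypothetical) chaos run would shut down."""
    return [s for s in servers if rng.random() < kill_probability]


def service_survives(servers, victims, min_healthy):
    """The service counts as 'up' if enough servers remain after the shutdowns."""
    healthy = [s for s in servers if s not in victims]
    return len(healthy) >= min_healthy


def chaos_run(servers, kill_probability=0.3, min_healthy=2, seed=None):
    """Simulate one round of random shutdowns; nothing real is touched."""
    rng = random.Random(seed)  # pass a seed to make a run repeatable
    victims = pick_victims(servers, kill_probability, rng)
    survived = service_survives(servers, victims, min_healthy)
    return victims, survived
```

Call `chaos_run` a few times on a made-up list of server names. If `survived` ever comes back `False`, you have found something to build stronger.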

For more information, just Google away…
