Cynefin and Software Testing – Chaos

Posted in Software Testing

If you are in the domain of Chaos, then action and consequence appear to be completely disassociated. If there were rules that were working for you, they’re useless to you now (and more likely detrimental if you’re still clinging to them).

There are those who thrive in chaos. Often they are dictators and despots, as this is the sort of environment that best suits their interests, though not everyone adept in this domain has such designs.

Chaos and disorder are often confused for one another. It’s easy to see why this happens, but the distinction is important. Disorder arises from your perception of the environment around you (or, more accurately, from a lack of ability to perceive it appropriately). In a state of (inauthentic) disorder, you act according to a comfortable or familiar pattern regardless of the information you have to hand. If you were to gather more information about the current situation (or pay attention to the data already available), you might find that it is appropriate to act in a completely different manner.

In Chaos, you are aware that previously useful patterns do not work and that links between cause and effect are not known, as opposed to ignored. By way of example, let’s consider a hypothetical despot who wants to seize power. They might create an unpredictable situation by destabilising what is established, e.g. by staging a fake coup, then, having ‘saved’ the populace from the apparent machinations of a convenient scapegoat, dismissing anyone in a position of authority. They then award themselves emergency powers in order to ‘restabilise’ the situation.

They essentially destroy the existing constraints that guide and govern the status quo, causing maximum uncertainty as previous norms become unpredictable and quite possibly fatal, then provide new constraints that must necessarily be followed.

It is safe to say that chaos tends not to arise intentionally. More frequently you arrive there by accident and then do whatever is necessary to exit as rapidly as possible. Let’s look at an example from a testing perspective. Imagine you have a fairly large software deployment being made to an existing system. You roll the software out and, shortly after, the system stops working. Transactions fail and now you’re losing money. Are you in chaos? No. There are procedures for this sort of thing. You look at the severity of the problem and decide that rolling the code back is the best course of action. You roll back the code — and the problem persists! Okay, now you really are in uncharted territory.

Your boss is now asking pointed questions. Did the rollback really happen? Yep. It’s good. Why does the problem persist? We don’t know yet. Get it fixed. How? I don’t care, just get it done.

In this situation, any action is novel. You take action, look at what happens and keep taking action until some sort of pattern emerges – Act->Sense->Respond.
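To make the Act->Sense->Respond loop concrete, here is a minimal sketch in Python. Everything in it is invented for illustration — the candidate actions, the `observe` function, and the names `act_sense_respond`, `found` and `trail` are hypothetical, not part of Cynefin or any real incident-response tool. The point is only the shape of the loop: try something, observe the effect, keep a record, and stop when a pattern (here, an improvement) emerges.

```python
def act_sense_respond(actions, observe):
    """Try candidate actions in turn until one visibly helps.

    Returns the first action whose observed effect is 'improved',
    plus the full history of (action, effect) pairs tried so far.
    """
    history = []
    for action in actions:            # Act: take a novel action
        effect = observe(action)      # Sense: what actually happened?
        history.append((action, effect))
        if effect == "improved":      # Respond: amplify what works
            return action, history
    return None, history              # no pattern emerged yet


# Simulated system for the sketch: only restarting the cache helps
# (a purely hypothetical diagnosis, standing in for the unknown cause).
def observe(action):
    return "improved" if action == "restart cache" else "no change"


found, trail = act_sense_respond(
    ["flush queue", "restart cache", "failover db"], observe)
# found is "restart cache"; trail records the two attempts made.
```

In real chaos the `observe` step is the hard part — you rarely get a clean "improved" signal — but the discipline of recording each action and its effect is what lets a pattern surface at all.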

Chaos can be useful in small doses. To give you an example in software, Netflix’s Chaos Monkey is a tool that runs inside the Netflix ecosystem and randomly kills processes to see how systems cope with unpredictable failure. Engineers responsible for the areas affected must learn to provide sufficiently good monitoring, error handling and disaster recovery measures to ensure that Chaos Monkey does not cause any significant interruption to service or otherwise cause lasting harm.
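As a toy illustration of the idea (not the real Chaos Monkey, which operates on cloud instances), here is a sketch in Python where "services" are plain dicts, "killing" one flips its state, and a supervisor loop stands in for the recovery measures engineers are expected to have in place. The names `kill_random_service` and `supervise` are invented for this sketch.

```python
import random

def kill_random_service(services, rng=random):
    """Pick a running service at random and mark it as down."""
    running = [name for name, up in services.items() if up]
    if not running:
        return None
    victim = rng.choice(running)
    services[victim] = False
    return victim

def supervise(services):
    """Restart every downed service, returning the names restarted.

    Stands in for the monitoring/recovery machinery that must absorb
    the random failure without lasting harm.
    """
    restarted = [name for name, up in services.items() if not up]
    for name in restarted:
        services[name] = True
    return restarted

services = {"api": True, "billing": True, "recommendations": True}
victim = kill_random_service(services)   # inject a random failure
recovered = supervise(services)          # recovery absorbs it
# All services are back up, and recovered names the one that was killed.
```

The design point is the same as Netflix’s: the failure injection is deliberately random, so the only winning strategy is recovery machinery that works for any service, not a patch for one known weak spot.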

Straying into chaos accidentally is either transitory or catastrophically damaging (imagine if Netflix were somehow unable to recover from something Chaos Monkey did). A blog post on Cynefin dynamics and constraints is probably a good place to explore this further, but history is full of examples of companies that were once the dominant player in their field until the field suddenly changed on them. I covered some of these in my post on the Obvious domain (see ‘Cliff of Complacency’).
