A common idiom in our industry is, “You never can predict how the user will use your product once they get it in their hands.” If you’ve ever watched a stakeholder use a website or web application for the first time, you may know this firsthand. I can’t count the number of times I’ve seen a user seemingly forget how to use websites on a mobile device, or try to use it in a way that makes you think, “But no one would actually do that in real life!”
The thing is, you never really do know what a user may do in the moment. They might be in a frantic state of mind, trying to accomplish something too quickly, and don’t tap or type the way a calm, focused user might. This fits right into the all-too-common scenario of designing and developing for the best case scenario, and not thinking about edge cases or “what happens if things don’t happen perfectly in this order?” As developers, we tend to build things thinking that everything will be understood, that the user is rational and should just know that tapping around too quickly might cause something weird to happen. It can even affect those who might make accidental taps or clicks when not giving an app full attention - how many times have you accidentally tapped on a mobile device when you were walking and talking while also trying to reply to a tweet or email.
Building out tools to help us test the unpredictable aren’t entirely new. In 2012, Netflix had open-sourced their internal service Chaos Monkey, which “terminates virtual machine instances and containers that run inside of your production environment.” In plain language, it’s a service that tears down servers at random to ensure an entire system doesn’t violently collapse during a failure. Our development communities also remind us to not just design for “the happy path”, but how can we actually detect for unpredicted points of failure in our interfaces the way we can with our server architectures?
If a hundred monkeys at typewriters can write the works of Shakespeare, then one monkey should surely be able to find bugs and problems in our interfaces.
Bring in the monkeys
Monkey testing is a method of testing that generates random user input - clicks, swipes, entering input - with the sole purpose of finding issues with, or entirely breaking, your application. Unlike unit and acceptance testing, where you are writing test cases that occur in a specific order or set of conditions, which ultimately creates bias in how your interface is tested. Developers have less control over how a monkey test will execute, and since they are random every time they are run, you’ll never be testing for just one scenario, but rather an infinite combination of interactions.
Although this type of testing is available for most technology stacks, things built for the web haven’t necessarily got there yet. For example, the Android SDK has a UI Exerciser Monkey that handles most interface-level and system-level events. As web developers have begun to think more critically about performance and stress testing, some of these ideas have finally made it over to the world of the web in the form of Gremlins.js, a JavaScript-based monkey testing library written by the team at Marmelab.