Do-It-Yourself Stress Testing

Earlier we wrote about stress testing, featuring Blazemeter where you could learn how to do crash your site without worrying about the infrastructure. So why did I even bother to write this post about the do-it-yourself approach? We have a complex frontend app, where it would be nearly impossible to simulate all the network activities faithfully during a long period of time. We wanted to use a browser-based testing framework, namely WebdriverI/O with some custom Node.js packages on Blazemeter, and it proved to be quicker to start to manage the infrastructure and have full control of the environment. What happened in the end? Using a public cloud provider (in our case, Linode), we programmatically launched the needed number of machines temporarily, provisioned them to have the proper stack, and the WebdriverI/O test was executed. With Ansible, Linode CLI and WebdriverIO, the whole process is repeatable and scalable, let’s see how!

Infrastructure phase

Any decent cloud provider has an interface to provision and manage cloud machines from code. Given this, if you need an arbitrary number of computers to launch the test, you can have it for 1-2 hours (100 endpoints for a price of a coffee, how does this sound?).

There are many options to dynamically and programmatically create virtual machines for the sake of stress testing. Ansible offers dynamic inventory, however the cloud provider of our choice wasn’t included in the latest stable version of Ansible (2.7) by the the time of this post. Also the solution below makes the infrastructure phase independent, any kind of provisioning (pure shell scripts for instance) is possible with minimal adaptation.

Let’s follow the steps at the guide on the installation of Linode CLI. The key is to have the configuration file at ~/.linode-cli with the credentials and the machine defaults. Afterwards you can create a machine with a one-liner:

linode-cli linodes create --image "linode/ubuntu18.04" --region eu-central --authorized_keys "$(cat ~/.ssh/id_rsa.pub)"  --root_pass "$(date +%s | sha256sum | base64 | head -c 32 ; echo)" --group "stress-test"

Given the specified public key, password-less login will be possible. However this is far from enough before the provisioning. Booting takes time, SSH server is not available immediately, also our special situation is that after the stress test, we would like to drop the instances immediately, together with the test execution to minimize costs.

Waiting for machine booting is a slightly longer snippet, the CSV output is robustly parsable:

## Wait for boot, to be able to SSH in.
while linode-cli linodes list --group=stress-test --text --delimiter ";" --format 'status' --no-headers | grep -v running
do
  sleep 2
done

However the SSH connection is likely not yet possible, let’s wait for the port to be open:

for IP in $(linode-cli linodes list --group=stress-test --text --delimiter ";" --format 'ipv4' --no-headers);
do
  while ! nc -z $IP 22 < /dev/null > /dev/null 2>&1; do
    sleep 1
  done
done

You may realize that this is overlapping with the machine booting wait. The only benefit is that separating the two allows more sophisticated error handling and reporting.

Afterwards, deleting all machines in our group is trivial:

for ID in $(linode-cli linodes list --group=stress-test --text --delimiter ";" --format 'id' --no-headers);
do
  linode-cli linodes delete "$ID"
done

So after packing everything in one script, also to put an Ansible invocation in the middle, we end up with stress-test.sh:

#!/bin/bash

LINODE_GROUP="stress-test"
NUMBER_OF_VISITORS="$1"

NUM_RE='^[0-9]+$'
if ! [[ $NUMBER_OF_VISITORS =~ $NUM_RE ]] ; then
  echo "error: Not a number: $NUMBER_OF_VISITORS" >&2; exit 1
fi

if (( $NUMBER_OF_VISITORS > 100 )); then
  echo "warning: Are you sure that you want to create $NUMBER_OF_VISITORS linodes?" >&2; exit 1
fi

echo "Reset the inventory file."
cat /dev/null > hosts

echo "Create the needed linodes, populate the inventory file."
for i in $(seq $NUMBER_OF_VISITORS);
do
  linode-cli linodes create --image "linode/ubuntu18.04" --region eu-central --authorized_keys "$(cat ~/.ssh/id_rsa.pub)" --root_pass "$(date +%s | sha256sum | base64 | head -c 32 ; echo)" --group "$LINODE_GROUP" --text --delimiter ";"
done

## Wait for boot.
while linode-cli linodes list --group="$LINODE_GROUP" --text --delimiter ";" --format 'status' --no-headers | grep -v running
do
  sleep 2
done

## Wait for the SSH port.
for IP in $(linode-cli linodes list --group="$LINODE_GROUP" --text --delimiter ";" --format 'ipv4' --no-headers);
do
  while ! nc -z $IP 22 < /dev/null > /dev/null 2>&1; do
    sleep 1
  done
  ### Collect the IP for the Ansible hosts file.
  echo "$IP" >> hosts
done
echo "The SSH servers became available"

echo "Execute the playbook"
ansible-playbook -e 'ansible_python_interpreter=/usr/bin/python3' -T 300 -i hosts main.yml

echo "Cleanup the created linodes."
for ID in $(linode-cli linodes list --group="$LINODE_GROUP" --text --delimiter ";" --format 'id' --no-headers);
do
  linode-cli linodes delete "$ID"
done

Provisioning phase

As written earlier, Ansible is just an option, however a popular option to provision machines. For such a test, even a bunch of shell command would be sufficient to setup the stack for the test. However, after someone tastes working with infrastructure in a declarative way, this becomes the first choice.

If this is your first experience with Ansible, check out the official documentation. In a nutshell, we just declare in YAML how the machine(s) should look, and what packages it should have.

In my opinion, a simple playbook like this below, is readable and understandable as-is, without any prior knowledge. So our main.yml is the following:

- name: WDIO-based stress test
  hosts: all
  remote_user: root

  tasks:
    - name: Update and upgrade apt packages
      become: true
      apt:
        upgrade: yes
        update_cache: yes
        cache_valid_time: 86400

    - name: WDIO and Chrome dependencies
      package:
        name: "{{ item }}"
        state: present
      with_items:
         - unzip
         - nodejs
         - npm
         - libxss1
         - libappindicator1
         - libindicator7
         - openjdk-8-jre

    - name: Download Chrome
      get_url:
        url: "https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb"
        dest: "/tmp/chrome.deb"

    - name: Install Chrome
      shell: "apt install -y /tmp/chrome.deb"

    - name: Get Chromedriver
      get_url:
        url: "https://chromedriver.storage.googleapis.com/73.0.3683.20/chromedriver_linux64.zip"
        dest: "/tmp/chromedriver.zip"

    - name: Extract Chromedriver
      unarchive:
        remote_src: yes
        src: "/tmp/chromedriver.zip"
        dest: "/tmp"

    - name: Start Chromedriver
      shell: "nohup /tmp/chromedriver &"

    - name: Sync the source code of the WDIO test
      copy:
        src: "wdio"
        dest: "/root/"

    - name: Install WDIO
      shell: "cd /root/wdio && npm install"

    - name: Start date
      debug:
        var=ansible_date_time.iso8601

    - name: Execute
      shell: 'cd /root/wdio && ./node_modules/.bin/wdio wdio.conf.js --spec specs/stream.js'

    - name: End date
      debug:
        var=ansible_date_time.iso8601

We install the dependencies for Chrome, Chrome itself, WDIO, and then we can execute the test. For this simple case, that’s enough. As I referred to earlier:

ansible-playbook -e 'ansible_python_interpreter=/usr/bin/python3' -T 300 -i hosts main.yml

What’s the benefit over the shell scripting? For this particular use-case, mostly that Ansible makes sure that everything can happen in parallel and we have sufficient error-handling and reporting.

Test phase

We love tests. Our starter kit has WebdriverIO tests (among many other type of tests), so we picked it to stress test the full stack. If you are familiar with JavaScript or Node.js the test code will be easy to grasp:

const assert = require('assert');

describe('podcasts', () => {
    it('should be streamable', () => {
        browser.url('/');
        $('.contact .btn').click();

        browser.url('/team');
        const menu = $('.header.menu .fa-bars');
        menu.waitForDisplayed();
        menu.click();
        $('a=Jobs').click();
        menu.waitForDisplayed();
        menu.click();
        $('a=Podcast').click();
        $('#mep_0 .mejs__controls').waitForDisplayed();
        $('#mep_0 .mejs__play button').click();
        $('span=00:05').waitForDisplayed();
    });
});

This is our spec file, which is the essence, alongside with the configuration.

Could we do it with a bunch of requests in jMeter or Gatling? Almost. The icing on the cake is where we stress test the streaming of the podcast. We simulate a user who listens the podcast for 10 seconds. For for any frontend-heavy app, realistic stress testing requires a real browser, WDIO provides us exactly this.

The WebdriverIO test execution - headless mode deactivated

Test execution phase

After making the shell script executable (chmod 750 stress-test.sh), we are able to execute the test either:

with one visitor from one virtual machine: ./stress-test.sh 1
with 100 visitors from 100 virtual machines for each: ./stress-test.sh 100

with the same simplicity. However, for very large scale tests, you should think about some bottlenecks, such as the capacity of the datacenter on the testing side. It might make sense to randomly pick a datacenter for each testing machine.

The test execution consists of two main parts: bootstrapping the environment and executing the test itself. If bootstrapping the environment takes too high of a percentage, one strategy is to prepare a Docker image, and instead of creating the environment again and again, just use the image. In that case, it’s a great idea to check for a container-specific hosting solution instead of standalone virtual machine.

Would you like to try it out now? Just do a git clone https://github.com/Gizra/diy-stress-test.git!

Result analysis

For such a distributed DIY test, analyzing the results could be challenging. For instance, how would you measure requests/second for a specific browser-based test, like WebdriverI/O?

For our case, the analysis happens on the other side. Almost all hosting solutions we encounter support New Relic, which could help a lot in such an analysis. Our test was DIY, but the result handling was outsourced. The icing on the cake is that it helps to track down the bottlenecks too, so a similar solution for your hosting platform can be applied as well.

However what if you’d like to somehow gather results together after such a distributed test execution?

Without going into detail, you may study the fetch module of Ansible, so you can gather a result log from all the test servers and have it locally in a central place.

Conclusion

It was a great experience that after we faced some difficulty with a hosted stress test platform; in the end, we were able to recreate a solution from scratch without much more development time. If your application also needs special, unusual tools for stress-testing, you might consider this approach. All the chosen components, such as Linode, WebdriverIO or Ansible are easily replaceable with your favorite solution. Geographically distributed stress testing, fully realistic website visitors with heavy frontend logic, low-cost stress testing – it seems now you’re covered!