How to Install and Use Gremlin in a Docker Container

Introduction

Gremlin is a simple, safe, and secure way to use Chaos Engineering to improve system resilience. You can use Gremlin with Docker in two ways:

  1. Run the Gremlin daemon directly on the host and attack Docker containers.
  2. Run the official Gremlin Docker container and attack the host or neighboring containers.

This page explores #2. It walks through:

  • How to install Docker
  • How to create an htop container for monitoring
  • How to create an NGINX container to attack using Gremlin
  • How to gather your Gremlin client credentials
  • How to run the official Gremlin container
  • How to create a CPU Attack from the Gremlin container against the host
  • How to create a CPU Attack and a Blackhole Attack from the Gremlin container against an NGINX container

Prerequisites

Before you begin, you’ll need:

  • An Ubuntu 16.04 server
  • A Gremlin account

Step 1 – Installing Docker

Add Docker’s official GPG key:

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Set up the stable repository:

$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Update the apt package index:

$ sudo apt-get update

Make sure you are about to install from the Docker repo instead of the default Ubuntu 16.04 repo:

$ apt-cache policy docker-ce

Install the latest version of Docker CE:

$ sudo apt-get install docker-ce

Docker should now be installed, the daemon started, and the process enabled to start on boot. Check that it is running:

$ sudo systemctl status docker

Add your user to the docker group, replacing <USER> with your username (the change takes effect the next time you log in):

$ sudo usermod -aG docker <USER>
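
To confirm that the group change has reached your current session, you can compare the output of id -nG against the group name. The following is a minimal sketch, not part of the official install steps; the helper name has_group is made up for this example:

```shell
# Return 0 if group name $1 appears in the space-separated group list $2.
# has_group is a hypothetical helper name used only in this sketch.
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG prints the names of every group the current session belongs to.
if has_group docker "$(id -nG)"; then
    echo "This session is in the docker group."
else
    echo "This session is not in the docker group yet."
fi
```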

Step 2 – Create an htop container for monitoring

Htop is an interactive process viewer for UNIX. We’ll use it to monitor the progress of our attacks.

First create the Dockerfile for your htop container:

$ vim Dockerfile

Add the following to the Dockerfile:

FROM alpine:latest
RUN apk add --update htop && rm -rf /var/cache/apk/*
ENTRYPOINT ["htop"]
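
If you would rather not open an editor, the same three lines can be written from the shell with a here-document. This is only a sketch of that alternative; the function name write_htop_dockerfile and its optional directory argument are made up for this example:

```shell
# Write the htop Dockerfile into directory $1 (defaults to the current directory).
# write_htop_dockerfile is a hypothetical helper name used only in this sketch.
write_htop_dockerfile() {
    # The quoted 'EOF' delimiter keeps the shell from expanding anything inside.
    cat > "${1:-.}/Dockerfile" <<'EOF'
FROM alpine:latest
RUN apk add --update htop && rm -rf /var/cache/apk/*
ENTRYPOINT ["htop"]
EOF
}

# Usage, from the directory where you want the Dockerfile:
#   write_htop_dockerfile
```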

Build the Dockerfile and tag the image:

$ sudo docker build -t htop .

Run htop inside a container. With --pid=host, the container shares the host’s PID namespace, so htop shows the host’s processes:

$ sudo docker run -it --rm --pid=host htop

To exit htop, enter q.

Next we will create an NGINX container and monitor it directly by joining the container’s pid namespace.

Step 3 – Create an NGINX container to attack

First, create a directory for the HTML page that NGINX will serve:

$ mkdir -p ~/docker-nginx/html
$ cd ~/docker-nginx/html

Create a simple HTML page:

$ vim index.html

Paste in this content:

<html>
    <head>
        <title>Docker nginx tutorial</title>
        <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">
    </head>
    <body>
        <div class="container">
            <h1>Hello it is your container speaking</h1>
            <p>This nginx page was created by your Docker container.</p>
            <p>Now it’s time to create a Gremlin attack.</p>
        </div>
    </body>
</html>

Create a container using the nginx Docker image:

$ sudo docker run -l service=nginx --name docker-nginx -p 80:80 -d -v ~/docker-nginx/html:/usr/share/nginx/html nginx

Make sure the docker-nginx container is running:

$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND                       CREATED                 STATUS                   PORTS                         NAMES
352609a67e95        nginx               "nginx -g 'daemon of…"        33 seconds ago          Up 32 seconds       0.0.0.0:80->80/tcp              docker-nginx
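
Seeing the container in docker ps does not prove that it is serving your page. One way to check, sketched below, is to fetch the page with curl and look for the heading from index.html. The function name serves_tutorial_page is made up for this example, and the usage line assumes you run it on the Docker host, where port 80 is published:

```shell
# Succeed if the HTML read from stdin contains the heading from index.html.
# serves_tutorial_page is a hypothetical helper name used only in this sketch.
serves_tutorial_page() {
    grep -q 'Hello it is your container speaking'
}

# Usage on the Docker host:
#   curl -fsS http://localhost/ | serves_tutorial_page && echo "NGINX is serving the page"
```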

Step 4 – Gather your Gremlin client credentials

The Gremlin daemon (gremlind) connects to the Gremlin backend and waits for attack orders from you. When it receives attack orders, it uses the CLI (gremlin) to run the attack.

To connect gremlind to the Gremlin backend, you need your client credentials. (This is NOT the same as the email/password credentials you use to access the Gremlin Web App.) Read the Gremlin Docs to see how to find your client credentials in the Gremlin Web App.

With the credentials in hand, set them via these environment variables:

$ export GREMLIN_TEAM_ID="<YOUR_TEAM_ID>"
$ export GREMLIN_TEAM_CERTIFICATE_OR_FILE="<YOUR_PEM_ENCODED_TEAM_CERTIFICATE or PATH_TO_FILE>"
$ export GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="<YOUR_PEM_ENCODED_TEAM_PRIVATE_KEY or PATH_TO_FILE>"

That’s enough configuration for this tutorial, but feel free to read about other configuration options in the Gremlin Docs.
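
Because the later docker run commands pass these variables straight through, an unset variable only shows up as a daemon that cannot connect. A small pre-flight check can catch that earlier. This is a sketch rather than anything Gremlin provides; the function name check_gremlin_env is made up for this example:

```shell
# Report each Gremlin credential variable that is unset or empty.
# Returns 0 when all three are set, 1 otherwise.
# check_gremlin_env is a hypothetical helper name used only in this sketch.
check_gremlin_env() {
    missing=0
    for name in GREMLIN_TEAM_ID \
                GREMLIN_TEAM_CERTIFICATE_OR_FILE \
                GREMLIN_TEAM_PRIVATE_KEY_OR_FILE; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, before starting any Gremlin container:
#   check_gremlin_env && echo "credentials are set"
```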

Step 5 – Run the Gremlin Daemon in a Container

Use docker run to pull the official Gremlin Docker image and run the Gremlin daemon:

$ sudo docker run -d \
    --net=host \
    --pid=host \
    --cap-add=NET_ADMIN \
    --cap-add=SYS_BOOT \
    --cap-add=SYS_TIME \
    --cap-add=KILL \
    -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
    -e GREMLIN_TEAM_CERTIFICATE_OR_FILE="${GREMLIN_TEAM_CERTIFICATE_OR_FILE}" \
    -e GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="${GREMLIN_TEAM_PRIVATE_KEY_OR_FILE}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log/gremlin:/var/log/gremlin \
    -v /var/lib/gremlin:/var/lib/gremlin \
    gremlin/gremlin daemon

Make sure to pass in the three environment variables you set in Step 4. If you don’t, the Gremlin daemon cannot connect to the Gremlin backend.

Use docker ps to see all running Docker containers:

$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                NAMES
7167cacb2536        gremlin/gremlin     "/entrypoint.sh daem…"   40 seconds ago      Up 39 seconds                            practical_benz
fb58b77e5ef8        nginx               "nginx -g 'daemon of…"   10 minutes ago      Up 10 minutes       0.0.0.0:80->80/tcp   docker-nginx
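
Several of the following commands need a container ID copied from this listing. As a sketch of how to avoid copying it by hand, the function below reads docker ps output on stdin and prints the ID of each row whose IMAGE column matches. The name id_for_image is made up for this example:

```shell
# Read `docker ps` table output on stdin and print the CONTAINER ID of every
# row whose IMAGE column equals $1. The header row is skipped.
# id_for_image is a hypothetical helper name used only in this sketch.
id_for_image() {
    awk -v img="$1" 'NR > 1 && $2 == img { print $1 }'
}

# Usage:
#   sudo docker ps | id_for_image nginx
#   sudo docker ps | id_for_image gremlin/gremlin
```

This relies on the ID and image name being the first two whitespace-separated fields, as in the listing above. docker ps also has its own --filter and --format options if you prefer not to parse the table.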

Jump into your Gremlin container with an interactive shell (replace 7167cacb2536 with the real ID of your Gremlin container):

$ sudo docker exec -it 7167cacb2536 /bin/bash

From within the container, check out the available attack types:

# gremlin help attack-container
Usage: gremlin attack-container CONTAINER TYPE [type-specific-options]

Type "gremlin help attack-container TYPE" for more details:

  blackhole # An attack which drops all matching network traffic
  cpu   # An attack which consumes CPU resources
  io    # An attack which consumes IO resources
  latency # An attack which adds latency to all matching network traffic
  memory  # An attack which consumes memory
  packet_loss # An attack which introduces packet loss to all matching network traffic
  shutdown  # An attack which forces the target to shutdown
  dns   # An attack which blocks access to DNS servers
  time_travel # An attack which changes the system time.
  disk    # An attack which consumes disk resources
  process_killer  # An attack which kills the specified process

Then exit the container.

Step 6 – Run a CPU Attack against the host from a Gremlin Container

Use the Gremlin CLI (gremlin) to run a CPU attack from within a Gremlin container:

$ sudo docker run -d \
    --net=host \
    --pid=host \
    --cap-add=NET_ADMIN \
    --cap-add=SYS_BOOT \
    --cap-add=SYS_TIME \
    --cap-add=KILL \
    -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
    -e GREMLIN_TEAM_CERTIFICATE_OR_FILE="${GREMLIN_TEAM_CERTIFICATE_OR_FILE}" \
    -e GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="${GREMLIN_TEAM_PRIVATE_KEY_OR_FILE}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log/gremlin:/var/log/gremlin \
    -v /var/lib/gremlin:/var/lib/gremlin \
    gremlin/gremlin attack cpu

This attack will hog a single CPU core for 60 seconds. (That’s the default setting for CPU attacks.)

Monitor the progress of the attack using the htop container you created earlier:

$ sudo docker run -it --rm --pid=host htop

If you have set up the Gremlin Slackbot, it will also notify your team via Slack.

Step 7 – Run a CPU Attack against the NGINX container from a Gremlin Container

In this step we will run gremlin attack-container to target the NGINX container by its ID and run a CPU attack against it. The attack will hog one CPU core for 60 seconds.

Before running the attack, use htop to monitor the docker-nginx container (replace f291a040a6aa with your docker-nginx container’s ID):

$ sudo docker run -it --rm --pid=container:f291a040a6aa htop

You will see the following:

  1  [                                                                           0.0%]   Tasks: 3, 0 thr; 1 running
  2  [|                                                                          0.7%]   Load average: 0.72 0.41 0.21 
  Mem[|||||||||||||||||||||||||                                            141M/3.86G]   Uptime: 00:30:34
  Swp[                                                                          0K/0K]

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command          
   47 root       20   0  4488  2236   932 R  0.0  0.1  0:00.07 htop
    1 root       20   0 32428  5180  4504 S  0.0  0.1  0:00.03 nginx: master process nginx -g daemon off;
    8 101        20   0 32900  2476  1448 S  0.0  0.1  0:00.00 nginx: worker process

Run the following to create a CPU attack against the docker-nginx container (replace f291a040a6aa with your docker-nginx container’s ID):

$ sudo docker run -d -it \
    --cap-add=NET_ADMIN \
    -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
    -e GREMLIN_TEAM_CERTIFICATE_OR_FILE="${GREMLIN_TEAM_CERTIFICATE_OR_FILE}" \
    -e GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="${GREMLIN_TEAM_PRIVATE_KEY_OR_FILE}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    gremlin/gremlin attack-container f291a040a6aa cpu

View the progress of the attack using the htop container you created earlier:

$ sudo docker run -it --rm --pid=container:f291a040a6aa htop

You will see the following:

  1  [|                                                                            0.7%]   Tasks: 4, 1 thr; 2 running
  2  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||          100.0%]   Load average: 0.30 0.33 0.19 
  Mem[|||||||||||||||||||||||||                                              163M/3.86G]   Uptime: 00:32:09
  Swp[                                                                            0K/0K]

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command          
   51 root       20   0 15456 13696  4112 S 99.0  0.3  0:11.25 gremlin attack cpu
   70 root       20   0  4488  1988   948 R  0.0  0.0  0:00.04 htop
    1 root       20   0 32428  5180  4504 S  0.0  0.1  0:00.03 nginx: master process nginx -g daemon off;
    8 101        20   0 32900  2476  1448 S  0.0  0.1  0:00.00 nginx: worker process

Step 8 – Run a Blackhole Attack against the NGINX container from the Gremlin Container

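A Blackhole attack drops all matching network traffic. Run the following to drop traffic between the docker-nginx container and google.com for 60 seconds (replace f291a040a6aa with your docker-nginx container’s ID):

$ sudo docker run -it \
    --cap-add=NET_ADMIN \
    -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
    -e GREMLIN_TEAM_CERTIFICATE_OR_FILE="${GREMLIN_TEAM_CERTIFICATE_OR_FILE}" \
    -e GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="${GREMLIN_TEAM_PRIVATE_KEY_OR_FILE}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    gremlin/gremlin attack-container f291a040a6aa blackhole -h google.com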

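The attack runs in the foreground. To watch the container while it runs, open a second terminal and start the htop container you created earlier:

$ sudo docker run -it --rm --pid=container:f291a040a6aa htop

In the terminal running the attack, you will see the following output: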

Attacking container 'f291a040a6aa' with command ["attack", "blackhole", "-h", "google.com"] ...
Spawning sidecar container 'gremlin-f291a040a6aa' based on 'gremlin/gremlin:latest' for attack ...
Setting up blackhole gremlin with guid '0df1ccf5-0801-11e8-9acf-0242fe3ba0bc' for 60 seconds
Setup successfully completed
Running blackhole gremlin with guid '0df1ccf5-0801-11e8-9acf-0242fe3ba0bc' for 60 seconds
Dropping all egress traffic to 172.217.12.174
Dropping all ingress traffic from 172.217.12.174
Dropping all ingress traffic from 172.217.11.46
Dropping all egress traffic to 172.217.11.46
Dropping all egress traffic to 172.217.10.110
Dropping all ingress traffic from 172.217.10.110
Reverting impact!
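
The output lists every blackholed address twice, once for egress and once for ingress. If you save the output to a file, a short filter can reduce it to the unique addresses that google.com resolved to. This is a sketch that assumes the log lines keep the "Dropping all ... traffic" wording shown above; the function name blackholed_addresses and the file name attack.log are made up for this example:

```shell
# Read Gremlin blackhole output on stdin and print each blackholed address once.
# blackholed_addresses is a hypothetical helper name used only in this sketch.
blackholed_addresses() {
    # The address is the last field of each "Dropping all ..." line.
    awk '/^Dropping all (egress|ingress) traffic/ { print $NF }' | LC_ALL=C sort -u
}

# Usage, with the attack output saved to a file named attack.log:
#   blackholed_addresses < attack.log
```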

Conclusion

You’ve run Gremlin in a Docker container and validated that it works by running the “Hello World” of Chaos Engineering, a CPU attack, against the host. You have also run a CPU attack and a Blackhole attack from a Gremlin container against an NGINX container.

Gremlin can run other kinds of attacks too, such as the latency, packet loss, DNS, shutdown, and time travel attacks listed in Step 5. Try running some!

Check out the Gremlin Blog for more ideas on how to do Chaos Engineering on your application infrastructure.