How to Install and Use Gremlin on Azure
This tutorial walks through how to install Gremlin on an Ubuntu 16.04 server in Microsoft Azure and run a CPU attack.
Before you begin, you’ll need:
- A Microsoft Azure account
- An Ubuntu 16.04 server
- A Gremlin account
Step 1 - Installing the Gremlin Daemon and CLI
First, ssh into your server and add the Gremlin Debian repository:
$ echo "deb https://deb.gremlin.com/ release non-free" | sudo tee /etc/apt/sources.list.d/gremlin.list
Import the repo’s GPG key:
$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys C81FC2F43A48B25808F9583BDFF170F324D41134 9CDB294B29A5B1E2E00C24C022E8EF3461A50EF6
Then install the Gremlin daemon and CLI:
$ sudo apt-get update && sudo apt-get install -y gremlind gremlin
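Before moving on, you may want to confirm that the CLI actually landed on your PATH. The helper below is a minimal sketch and not part of Gremlin: check_commands is a hypothetical name, and it relies only on the shell's built-in command -v.

```shell
# Report whether each named command is on PATH.
# Returns non-zero if at least one of them is missing.
check_commands() {
    cc_missing=0
    for cc_cmd in "$@"; do
        if command -v "$cc_cmd" >/dev/null 2>&1; then
            echo "found: $cc_cmd"
        else
            echo "missing: $cc_cmd" >&2
            cc_missing=1
        fi
    done
    return "$cc_missing"
}

# Example (run on the server after the install step):
#   check_commands gremlin
# To ask dpkg for the package status instead:
#   dpkg -s gremlin gremlind
```

If the check reports the command as missing, rerun the apt-get step and look for repository or key errors in its output.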
Step 2 - Validating the Install
Run the following command to confirm gremlin has everything it needs to function:
Note: DO NOT run this command on production hosts
$ gremlin syscheck
The CLI will walk through its library of attack types and run some mock attacks:
Checking resource gremlins ...
Checking CPU gremlin ...
Attack on cpu_1 completed successfully
CPU gremlin OK
...
The full syscheck may take a few minutes, so please be patient!
Step 3 - Configuring the Gremlin Daemon
The Gremlin daemon (gremlind) connects to the Gremlin backend and waits for attack orders from you. When it receives attack orders, it uses the CLI (gremlin) to run the attack.
To connect gremlind to the Gremlin backend, you need your client credentials. (These are NOT the same as the email/password credentials you use to access the Gremlin Web App.) Read the Client Auth docs to see how to find your client credentials in the Web App.
With the credentials in hand, it’s time to configure the daemon. As with most daemons, you can configure gremlind either by configuration file or environment variables. Let’s use the configuration file.
Add these configuration options to the daemon’s configuration file:
$ echo 'GREMLIN_TEAM_ID="<INSERT_YOUR_TEAM_ID>"' | sudo tee -a /etc/default/gremlind
$ echo 'GREMLIN_TEAM_CERTIFICATE_OR_FILE="file:///var/lib/gremlin/gremlin.cert"' | sudo tee -a /etc/default/gremlind
$ echo 'GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="file:///var/lib/gremlin/gremlin.key"' | sudo tee -a /etc/default/gremlind
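Appending to the file adds a duplicate line every time the commands above are rerun. If you expect to change these values, a small helper that replaces an existing line can keep the file tidy. This is a sketch under assumptions: set_option is a hypothetical name, and it assumes each option sits on its own line in the form KEY="value".

```shell
# Set KEY="VALUE" in FILE: drop any existing KEY= line, then append the new one.
set_option() {
    so_file=$1
    so_key=$2
    so_value=$3
    so_tmp=$(mktemp) || return 1
    # Copy every line except a previous assignment of this key.
    # grep exits non-zero when nothing is copied, which is fine here.
    grep -v "^${so_key}=" "$so_file" > "$so_tmp" 2>/dev/null || true
    printf '%s="%s"\n' "$so_key" "$so_value" >> "$so_tmp"
    # Write back with cat so the target keeps its owner and permissions.
    cat "$so_tmp" > "$so_file"
    rm -f "$so_tmp"
}

# Example (needs a root shell, since /etc/default/gremlind is root-owned):
#   set_option /etc/default/gremlind GREMLIN_TEAM_ID "<INSERT_YOUR_TEAM_ID>"
```

Running the helper twice with the same key leaves a single line holding the latest value.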
Then add your PEM-encoded certificate and key to two new files (/var/lib/gremlin/gremlin.cert and /var/lib/gremlin/gremlin.key, respectively) and set the ownership and permissions on the files so that only gremlind can access them:
$ sudo chown gremlin:gremlin /var/lib/gremlin/gremlin.*
$ sudo chmod 600 /var/lib/gremlin/gremlin.*
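Because a private key with loose permissions is easy to overlook, it can help to check the result. The following is a minimal sketch that assumes GNU stat (the default on Ubuntu); check_mode is a hypothetical helper name.

```shell
# Succeed only if FILE has exactly the permission bits EXPECTED (for example 600).
check_mode() {
    cm_expected=$1
    cm_file=$2
    # GNU stat: %a prints the permission bits in octal.
    cm_actual=$(stat -c '%a' "$cm_file") || return 1
    if [ "$cm_actual" = "$cm_expected" ]; then
        echo "ok: $cm_file is $cm_actual"
    else
        echo "unexpected mode $cm_actual on $cm_file (wanted $cm_expected)" >&2
        return 1
    fi
}

# Example (run as a user that can read /var/lib/gremlin):
#   check_mode 600 /var/lib/gremlin/gremlin.cert
#   check_mode 600 /var/lib/gremlin/gremlin.key
```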
Optionally, give the Gremlin daemon a custom ID so it’s easy to find in the Web App later:
$ echo 'GREMLIN_IDENTIFIER="my-first-gremlin-host"' | sudo tee -a /etc/default/gremlind
That’s enough configuration for this tutorial, but feel free to read about other configuration options in the Gremlin Docs.
Restart the daemon to apply the configuration changes:
$ sudo systemctl restart gremlind
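A restart returns before the service has necessarily settled, so you may want to poll until systemd reports it as active. This is a sketch: wait_until is a hypothetical helper, and the example relies on systemctl is-active --quiet, which exits with status 0 when the unit is active.

```shell
# Run a command up to ATTEMPTS times, sleeping DELAY seconds between failed tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
    wu_attempts=$1
    wu_delay=$2
    shift 2
    wu_n=1
    while [ "$wu_n" -le "$wu_attempts" ]; do
        if "$@"; then
            return 0
        fi
        wu_n=$((wu_n + 1))
        # Only sleep when another attempt is still to come.
        if [ "$wu_n" -le "$wu_attempts" ]; then
            sleep "$wu_delay"
        fi
    done
    return 1
}

# Example: check up to 5 times, one second apart.
#   wait_until 5 1 systemctl is-active --quiet gremlind
```

If the service never becomes active, sudo systemctl status gremlind is a reasonable next place to look.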
Now you’re ready to run attacks using the Gremlin Web App.
Step 4 - Creating Attacks
The “Hello World” of Chaos Engineering is the CPU Resource Attack. To create one, first click the Attack Category dropdown and select Resource. Then, in the Gremlin Attack dropdown, select CPU.
Next, you can choose how many CPU cores the attack should consume, and for how long. The default is to hog a single core for 60 seconds.
Finally, it’s time to target the host you just configured. If you have many hosts running the Gremlin daemon, you can filter through them here, choosing to run the attack only on some subset of hosts. Since you’re only attacking a single host for now, just tick the checkbox next to the host. (If you don’t see your host in the list, search for its $GREMLIN_IDENTIFIER in the search bar.)
As soon as you click Create New Attack, your host’s gremlind will pick up the attack order and start to chew up your CPU. You can see the attack’s progress on the Attacks page.
On your host, run top to check the impact of the Gremlin Attack:
$ top
top - 06:26:47 up 7 days, 7:00, 1 user, load average: 0.28, 0.07, 0.02
Tasks: 105 total, 1 running, 104 sleeping, 0 stopped, 0 zombie
%Cpu(s): 79.7 us, 20.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1016120 total, 127140 free, 93956 used, 795024 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 712192 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
23768 gremlin   20   0   13268  11136   3576 S 99.3  1.1   0:14.05 gremlin
23766 root      20   0   40388   3600   3072 R  0.3  0.4   0:00.03 top
    1 root      20   0   37760   5760   3940 S  0.0  0.6   0:13.74 systemd
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S  0.0  0.0   0:01.28 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S  0.0  0.0   0:06.14 rcu_sched
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh
    9 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0
   10 root      rt   0       0      0      0 S  0.0  0.0   0:04.09 watchdog/0
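If you'd rather read a single number than scan the whole screen, the %Cpu(s) summary line can be reduced to a busy percentage (100 minus idle). This is a sketch that assumes the top layout shown above, with an "id" field for idle time, and a locale that uses a dot as the decimal separator; cpu_busy_from_top_line is a hypothetical helper name.

```shell
# Print 100 minus the idle percentage from a "%Cpu(s): ..." line as printed by top.
cpu_busy_from_top_line() {
    printf '%s\n' "$1" | awk '{
        # Find the "id," label; the number just before it is the idle percentage.
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^id,?$/) {
                printf "%.1f\n", 100 - $(i - 1)
                exit
            }
        }
    }'
}

# Example with a live reading (top in batch mode, one iteration):
#   cpu_busy_from_top_line "$(top -bn1 | grep '%Cpu(s)')"
```

With the summary line from the output above, where idle is 0.0, the helper prints 100.0, which matches a single-core host fully consumed by the attack.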
When your attack is complete, it will move to Completed Attacks on the Attacks page.
Step 5 - Halting a CPU resource attack using the Gremlin Control Panel
You can halt any attack at any time from the Attacks page. Just find your attack and click the red Halt button next to it.
You’ve installed Gremlin on a Microsoft Azure server running Ubuntu 16.04 and tested it by running the “Hello World” of Chaos Engineering, the CPU Resource attack. You now have the tools to explore additional Gremlin attacks, including those that impact State and Network.
You can also explore the Gremlin Blog for more ideas on how to do Chaos Engineering on your application infrastructure.