How to Install and Use Gremlin on Ubuntu 16.04

Introduction

Gremlin’s “Resilience as a Service” makes it easy to find weaknesses in your system before they cause problems for your customers. Gremlin is a simple, safe and secure way to use Chaos Engineering to improve system resilience.

Gremlin’s main advantages are:

  • Simple: Instead of crafting Chaos Engineering experiments by hand, you get a range of easy-to-use attacks that can be automated. Gremlin provides a Control Panel, an API and a CLI.
  • Safe: All attacks have a halt feature, so any experiment can be terminated within seconds. Gremlin provides role-based access control (RBAC). Companies are the top-level organizational unit in Gremlin, and all resources, including Clients, Users and Templates, are associated with a Company.
  • Secure: Gremlin attacks are generated on the control plane, and clients make outbound SSL calls to poll for attacks. Gremlin does not require root privileges on any machine in your infrastructure. Gremlin provides secure command execution, security auditing, multi-factor authentication (MFA) and SAML SSO.

Prerequisites

Before you begin this tutorial, you’ll need the following:

  • An Ubuntu 16.04 server
  • A Gremlin account
  • The apt-transport-https package, so that apt can install Gremlin from our repo over HTTPS.
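If apt-transport-https is not yet on your server, a small helper like the one below can install it only when it is missing. This is a minimal sketch: ensure_pkg is a hypothetical function name, not part of Gremlin, and it assumes dpkg-query, apt-get and sudo are available, as they normally are on Ubuntu 16.04.

```shell
# ensure_pkg is a hypothetical helper, not part of Gremlin.
# It installs a package with apt-get only if dpkg-query does not
# report it as fully installed.
ensure_pkg() {
    pkg="$1"
    if dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
        echo "$pkg is already installed"
    else
        sudo apt-get update && sudo apt-get install -y "$pkg"
    fi
}
```

To use it for this prerequisite, run: ensure_pkg apt-transport-https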

Step 1 - Installing Gremlin

In this step, you’ll add the Gremlin repository and install the Gremlin packages.

First, SSH into your host and add the Gremlin repo:

ssh username@your_server_ip

echo "deb https://deb.gremlin.com/ release non-free" | sudo tee /etc/apt/sources.list.d/gremlin.list
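Before moving on, you may want to confirm that the repository entry was written as expected. The sketch below assumes the file path and repository line used in the command above; repo_configured is a hypothetical helper name.

```shell
# repo_configured is a hypothetical helper. It succeeds only if the given
# file (default: the path used above) contains the exact Gremlin repo line.
repo_configured() {
    list_file="${1:-/etc/apt/sources.list.d/gremlin.list}"
    grep -qxF "deb https://deb.gremlin.com/ release non-free" "$list_file" 2>/dev/null
}
```

For example: repo_configured && echo "Gremlin repo is configured"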

Import the GPG key:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys C81FC2F43A48B25808F9583BDFF170F324D41134 9CDB294B29A5B1E2E00C24C022E8EF3461A50EF6
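To check that a key was imported, you can search the listing printed by apt-key finger for its fingerprint. This is a rough sketch: has_fingerprint is a hypothetical helper, and it assumes the fingerprint you pass appears in that listing (gpg prints fingerprints in space-separated groups, so the helper strips spaces before matching).

```shell
# has_fingerprint is a hypothetical helper. It reads a key listing on
# standard input and succeeds if the given fingerprint appears in it,
# ignoring the spaces gpg inserts between fingerprint groups.
has_fingerprint() {
    tr -d ' ' | grep -qiF "$1"
}
```

For example: apt-key finger | has_fingerprint C81FC2F43A48B25808F9583BDFF170F324D41134 && echo "key present"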

Then install the Gremlin client and daemon:

sudo apt-get update && sudo apt-get install -y gremlin gremlind

Step 2 - Validating Installation

Run the following command to confirm you have all the necessary components installed on your host for Gremlin to function correctly:

gremlin syscheck

You’ll see the output from the Gremlin checks in your console:

Checking resource gremlins ...
Checking CPU gremlin ...
Attack on cpu_1 completed successfully
CPU gremlin OK
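If you run syscheck on many hosts, it can help to reduce the output to the names of the checks that passed. The sketch below assumes every passing check prints a line of the form "NAME gremlin OK", as in the sample output above; passed_checks is a hypothetical helper name.

```shell
# passed_checks is a hypothetical helper. It reads syscheck output on
# standard input and prints the name of each check whose line ends in
# " gremlin OK" (the format shown in the sample output above).
passed_checks() {
    sed -n 's/ gremlin OK$//p'
}
```

For example: gremlin syscheck | passed_checks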

Step 3 - Registering with Gremlin

You’ll need to register with the Gremlin control plane to create a new Gremlin client session. Gremlin offers tags, a feature allowing you to apply custom labels to a Gremlin client. You can tag more than one Gremlin client with the same label, then use it to view a filtered list of Gremlin clients that share that particular tag. In addition to filtering, the Gremlin Control Panel and API allow you to initiate an action across multiple Gremlin clients with the same tag. Identifying groups of Gremlin clients and administering all of them at once reduces the time required to manage hosts.

Start by initializing Gremlin and assigning tags with the following command. Substitute your own tag names and values for the tag_name and tag_value placeholders:

gremlin init \
    --tag tag_name1=tag_value1 \
    --tag tag_name2=tag_value2

This will create a new Gremlin client with the tags tag_name1 and tag_name2 applied.

Example: Adding tags to a Gremlin client

Suppose that you have a collection of hosts and you want to tag them by their service, service-version and service-type.

Initialize Gremlin and assign tags with the following command:

gremlin init \
    --tag service=api \
    --tag service-version=1.0.0 \
    --tag service-type=http

This will set tags as service=api, service-version=1.0.0 and service-type=http. If you are using an AWS EC2 instance, Gremlin will also auto-populate tags for region and zone.
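When the same tags have to be applied on several hosts, you could generate the init command from a list of name=value pairs instead of typing it each time. The following is a minimal sketch that uses only the --tag option shown above; gremlin_init_cmd is a hypothetical helper that prints the command rather than running it, so you can review it first.

```shell
# gremlin_init_cmd is a hypothetical helper. It prints a "gremlin init"
# command with one --tag option per name=value argument and warns about
# arguments that are not in name=value form.
gremlin_init_cmd() {
    printf 'gremlin init'
    for pair in "$@"; do
        case "$pair" in
            *=*) printf ' --tag %s' "$pair" ;;
            *)   echo "skipping malformed tag: $pair" >&2 ;;
        esac
    done
    printf '\n'
}
```

For example, gremlin_init_cmd service=api service-version=1.0.0 service-type=http prints the same command as the example above on a single line.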

You’ll then be prompted to input your Team ID and Team Secret. You can obtain these credentials from Gremlin Settings in the Gremlin Control Panel.

Log in to the Gremlin Control Panel using your Company name and sign-on credentials. These details were emailed to you when you signed up to start using Gremlin.

Next, click on your name and select Settings in the Gremlin Control Panel.

You will find your Team ID on the left under your company name. Click to generate your Team Secret. We recommend storing your Team Secret somewhere safe, since it is only available once. If you lose your Team Secret, you will be able to reset it.

Back in the terminal where gremlin init is running, paste your Team ID and then your Team Secret when prompted:

Please input your Team ID:

Please input your Team Secret:

You are now ready to create attacks using the Gremlin Control Panel.

Step 4 - Creating attacks using the Gremlin Control Panel

Log in to the Gremlin Control Panel using your Company name and sign-on credentials. These details were emailed to you when you created your Gremlin account.

Select Create Attack in the Gremlin Control Panel.

Example: The “Hello World” of Chaos Engineering (a CPU attack)

You can use the Gremlin Control Panel or the Gremlin API to trigger Gremlin attacks. You can view the available range of Gremlin Attacks in Gremlin Help.

The “Hello World” of Chaos Engineering is the CPU Resource Attack. To create a CPU Resource Attack, select “Resource” and then “CPU” in the dropdown menu.

The CPU Resource Attack will consume CPU resources based on the settings you select. The most popular settings for a CPU Resource Attack are pre-selected: a default attack will utilize 1 core for 60 seconds. Before you can run the Gremlin attack, you will need to click either Exact, to choose the hosts to run the attack on, or the Random attack option.

Click Exact and select a Gremlin Client in the list.

Your attack will begin to run, and you will be able to view its progress via Gremlin Attacks in the Gremlin Control Panel.

On your server, run top to check the impact of the Gremlin Attack:

top

top - 06:26:47 up 7 days,  7:00,  1 user,  load average: 0.28, 0.07, 0.02
Tasks: 105 total,   1 running, 104 sleeping,   0 stopped,   0 zombie
%Cpu(s): 79.7 us, 20.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1016120 total,   127140 free,    93956 used,   795024 buff/cache
KiB Swap:        0 total,        0 free,        0 used.   712192 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND     
23768 gremlin   20   0   13268  11136   3576 S 99.3  1.1   0:14.05 gremlin     
23766 root      20   0   40388   3600   3072 R  0.3  0.4   0:00.03 top         
    1 root      20   0   37760   5760   3940 S  0.0  0.6   0:13.74 systemd     
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd    
    3 root      20   0       0      0      0 S  0.0  0.0   0:01.28 ksoftirqd/0 
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S  0.0  0.0   0:06.14 rcu_sched   
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh      
    9 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0 
   10 root      rt   0       0      0      0 S  0.0  0.0   0:04.09 watchdog/0  
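To capture the same reading without the interactive display, you can run top in batch mode and extract the idle percentage from its summary line. This is a sketch under the assumption that the summary uses the "%Cpu(s): ... id" format shown above; cpu_idle is a hypothetical helper name.

```shell
# cpu_idle is a hypothetical helper. It reads top output on standard input
# and prints the idle percentage from the last "%Cpu(s)" summary line.
cpu_idle() {
    awk -F',' '
        /^%Cpu/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ / id/) {
                    idle = $i
                    gsub(/[^0-9.]/, "", idle)   # keep only the number
                }
            }
        }
        END { if (idle != "") print idle }
    '
}
```

For example: top -b -n 2 -d 1 | cpu_idle. Two iterations are requested so that the value comes from a fresh one-second sample rather than from the first frame. During the attack shown above, the value should be close to 0.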

When your attack is complete, it will move to Completed Attacks.

Step 5 - Halting a CPU resource attack using the Gremlin Control Panel

You can stop a Gremlin Attack at any time using the Gremlin Control Panel. Navigate to Gremlin Attacks and click on the red “Halt” button.
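After halting, you can confirm on the server that the attack process is gone. The sketch below assumes the attack runs as a process named exactly gremlin, as in the top output from Step 4; gremlin_attack_procs is a hypothetical helper name.

```shell
# gremlin_attack_procs is a hypothetical helper. It reads the output of
# "ps -eo user,pcpu,comm" on standard input, prints the rows whose command
# is exactly "gremlin", and returns a non-zero status if there are none.
gremlin_attack_procs() {
    awk '$3 == "gremlin" { print; found = 1 } END { exit !found }'
}
```

For example: ps -eo user,pcpu,comm | gremlin_attack_procs || echo "no gremlin attack process running"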

Conclusion

You’ve installed Gremlin on a server running Ubuntu 16.04 and validated that Gremlin works by running the “Hello World” of Chaos Engineering, the CPU Resource Attack. You now have the tools to explore additional Gremlin Attacks, including attacks that impact State and Network.

Gremlin’s Developer Guide is a great resource and reference for using Gremlin to do Chaos Engineering. You can also explore the Gremlin Blog for more information on how to use Chaos Engineering with your application infrastructure.