Using Ansible: how to structure YAML and projects
I find Ansible very easy to jump into, but the many different ways to structure a project can make it hard to work out how best to structure things. This is partly because Ansible has changed over time in ways that substantially altered the best way to accomplish things. I am going to briefly explain the YAML file format, since it rarely seems to get covered, and then go over the methods I use for structuring Ansible projects as a whole.
Ansible is a frequent part of my projects
To see how I have put Ansible to use, check out my writeup on building a Kubernetes cluster on DigitalOcean (Cluster built with ansible). That was just one use for it. I also have an Ansible playbook for building my development machine (Build an Ubuntu dev machine) so that I can quickly restore or rebuild it if I have a major issue, and other playbooks for tasks like Docker image builds and Raspberry Pi installation.
How the YAML format works
YAML and JSON are mostly interoperable; YAML is essentially a simplified way of writing the same data. Think of YAML as JSON without the braces and brackets, plus a few other changes.
There are dictionaries and arrays
These are the YAML building blocks; any YAML structure is composed of the two used in combination. As I mentioned, it is like JSON without the braces and brackets, which can make it a bit harder to tell where one structure ends and the next begins, but also keeps the syntax very clean.
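To make the comparison concrete, here is a small sketch of one structure written in YAML, with the equivalent JSON shown in the comments. The key names and values are made up purely for illustration.

```yaml
# Hypothetical data, invented for this example.
server:          # a dictionary
  name: web1
  ports:         # an array nested inside the dictionary
    - 80
    - 443

# The same structure as JSON:
# {
#   "server": {
#     "name": "web1",
#     "ports": [80, 443]
#   }
# }
```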
Levels of indentation define object structure -
Arrays or dictionaries at a given level of nesting must be indented at least as far as the current block level. Since there are no braces or brackets, YAML takes the Python approach, where indentation is critical to defining structure. Indentation must be made of spaces, not tabs.
Nested Dictionaries are dependent on greater nesting
A nested dictionary MUST have greater indentation than its parent key to indicate that it is nested. A dictionary is a set of key-value pairs, just like a JSON object.
key: value
key2:
  subkey: value
  subkey3:
    nestedkey: value
Arrays do not require greater nesting than the defining key
An array does not need to be indented beyond its key, so both of these are valid:
arr1:
- member1
- member2
arr2:
  - member1
  - member2
Arrays use a dash (-) to denote each member, rather than the brackets ([ ]) and commas that JSON uses. Since both forms are valid you will run into both; just remember that the difference is only a style preference.
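The two building blocks combine freely, and one combination worth recognizing is an array whose members are dictionaries, since that is the shape of an Ansible task list. A minimal sketch with made-up keys:

```yaml
# Hypothetical keys and values, for illustration only.
users:
  - name: alice        # the dash starts a new array member
    shell: /bin/bash   # same indentation as "name", so same dictionary
  - name: bob
    shell: /bin/zsh

# Equivalent JSON:
# {"users": [{"name": "alice", "shell": "/bin/bash"},
#            {"name": "bob", "shell": "/bin/zsh"}]}
```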
How Ansible code is structured
It can be confusing looking at different examples because there are many ways to do it, but here are the building blocks and how they fit together.
Play -
A play is the block level above tasks. It specifies hosts and attaches one or more tasks to run on them. If you wanted to run just a single task, the play would contain that one task, with a module and keywords defining what it does on those hosts.
---
- hosts: groupname
  # other options (become, gather_facts, ...)
  tasks:
    # Array of tasks
The nice thing about Ansible's structure, however, is that you can break tasks off into task files or roles, so that the play specifies the target hosts just once and the tasks/roles are then attached based on some logic.
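As a rough sketch of that split, a play can pull its tasks from a separate file with import_tasks. The file names playbook.yml and tasks/install.yml are placeholders made up for this example, not names from a real project.

```yaml
---
# playbook.yml (hypothetical file name)
- hosts: groupname
  become: true
  tasks:
    # Static import of a task file; the path is relative to this playbook.
    - name: Run the install tasks
      import_tasks: tasks/install.yml
```

The task file itself then holds only the array of tasks, with no hosts line.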
Tasks -
A task uses a module, module options, and keywords to define an action to execute. The target hosts are determined by the containing play. If the task is in a task file, the “tasks:” heading does not need to be specified either; the file is just a list of module/keyword combinations.
- name: Check if ripgrep is installed
  command: dpkg-query -W ripgrep
  register: check_deb
In this instance there is a module (command), a keyword (register), and the name. The name is technically a keyword too, but it appears at the start of most tasks, so the distinction is not important.
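A registered result is usually consumed by a later task. The sketch below is one possible follow-up, not necessarily how the original playbook continues: it assumes a Debian-based host where ripgrep is available from apt, and the failed_when and changed_when lines are additions so that a missing package does not abort the play.

```yaml
- name: Check if ripgrep is installed
  command: dpkg-query -W ripgrep
  register: check_deb
  failed_when: false    # a non-zero exit just means "not installed"
  changed_when: false   # a query never changes the host

- name: Install ripgrep only when the check did not find it
  apt:
    name: ripgrep
    state: present
  become: true
  when: check_deb.rc != 0   # rc is the command's exit code
```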
Keywords
These are modifiers available for use in tasks. They are not module specific, but may alter the task/module behavior.
Some keywords are used constantly, like name
The example above has already used two: naming a task (name) and registering the result (register) are both keywords. Other commonly used ones include loop, when, vars, tags, and become.
Keywords add generic functionality to modules
Keywords provide generally useful functionality that can apply to most modules. They allow looping over input, saving output, making a task conditional on various factors, and more.
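To show several keywords working together, here is a sketch of a single task that loops, escalates privileges, runs conditionally, and carries a tag. The package names and the tag are placeholders, and apt is used only as a familiar module.

```yaml
- name: Install a few command line tools
  apt:                    # module
    name: "{{ item }}"    # module option, filled in by the loop
    state: present
  loop:                   # keyword: run once per list member
    - tmux
    - htop
  become: true            # keyword: run with elevated privileges
  when: ansible_facts['os_family'] == "Debian"   # keyword: condition
  tags:                   # keyword: lets --tags select this task
    - tools
```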
Module
Modules specify the action for a task, which is modified by module options and keywords. This example uses the file module.
file:
  path: "{{ tmp_dir }}"
  state: absent
There are a huge number of modules, so the module documentation index (Ansible All Modules Index) is your friend. Subkeys specify module options (path and state in this case), but these vary from module to module. Before putting together your own setup to accomplish a task, it is always best to check the module index first.
Playbook -
A file containing plays set up to run tasks. You can put everything (plays, tasks, vars …) in monolithic playbooks or break things up using roles. The latter is preferred because it takes advantage of Ansible's modular structure: you can set up reusable tasks, split out variables, files, host information and so forth, and keep a clear organization.
To run a playbook -
ansible-playbook playbook.yml
Block -
An arbitrary grouping that allows you to set keywords for all tasks contained within the block. It provides a level of organization above tasks but below roles/playbooks. This example adds a when condition, so the grouped tasks are executed only when it is true.
- name: "Install Helm"
  block:
    - name: "Task 1"
      get_url:
        url: https://raw.githubusercontent.com/helm/helm/master/scripts/get
        dest: "{{ tmp_dir }}/get_helm.sh"
    - name: "Task 2"
      shell: "{{ tmp_dir }}/get_helm.sh"
  when: helm_exists.rc > 0
Roles -
A set of tasks with all the supporting vars, files, and handlers organized into separate files but readily available. The tasks are also in their own file. The tasks from a role are then called in by an organizing playbook. Each category (tasks, vars, handlers) gets its own folder, and the files inside are assumed to contain just that type of code.
The location specifies what the file contents will be
Files in the tasks folder do not need a tasks: heading; each is assumed to be an array of tasks. Files in the vars folder do not need a vars: object; each is assumed to be an object containing variables. And so forth.
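As a sketch of what that looks like on disk, here are two files from a hypothetical role named role1, separated by --- markers. The value given to tmp_dir is invented for the example.

```yaml
---
# roles/role1/vars/main.yml : a bare dictionary, no "vars:" key
tmp_dir: /tmp/role1

---
# roles/role1/tasks/main.yml : a bare array of tasks, no "tasks:" key
- name: Remove the temporary directory
  file:
    path: "{{ tmp_dir }}"
    state: absent
```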
Top level Organization for Ansible -
Top level directory and file structure -
main_inventory      - the primary inventory file. Can have many of these.
ansible.cfg         - general config file
group_vars/
    vars1           - files for group specific variables
    vars2
host_vars/
    host1           - files for host specific variables
    host2
site.yml            - a master playbook
webservers.yml      - other playbooks, optionally
roles/              - roles; can be imported/included into any playbook.
    role1/
        tasks/
        vars/
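For a sense of what goes into those files, here is a sketch of a YAML-format inventory and a matching group variables file. The host names, group name, and variable are all invented, and the inventory is given a .yml extension here on the assumption that the YAML inventory plugin is left at its default settings. Inventories can also be written in INI format.

```yaml
---
# main_inventory.yml (hypothetical name; host names are placeholders)
all:
  children:
    webservers:
      hosts:
        web1.example.com:
        web2.example.com:

---
# group_vars/webservers : variables applied to every host in the group
http_port: 8080
```

A playbook would then be pointed at this inventory with ansible-playbook -i main_inventory.yml site.yml.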
I am not a fan of the roles naming convention
For a role to have a default tasks, vars, handlers, or other file, that file must be called ‘main’ in its directory. This leads to lots of files all called ‘main’. I mostly give them other names and explicitly import/include them by those names.
Importing roles in a playbook
You just specify the hosts in the main playbook, along with the role/tasks to import and run for those hosts.
- hosts: nodes
  # Old role import syntax
  roles:
    - { role: tools, task: reset }
    - { role: common, task: all }
  # New role import syntax; import or include. These are tasks.
  tasks:
    - name: Import a specific task file from role
      import_role:
        name: myrole
        tasks_from: taskfile
    - name: Include a specific task file from role
      include_role:
        name: other_role/or/path
        tasks_from: othertaskfile
There are different ways to add roles; I am demonstrating three here. Roles can also be included conditionally based on variables, previous output, or tags.
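A minimal sketch of a conditional include, assuming a hypothetical variable named install_tools that may or may not be defined:

```yaml
- hosts: nodes
  tasks:
    - name: Include the role only when the variable is set to true
      include_role:
        name: myrole
      # default(false) covers the undefined case; bool normalizes
      # string values such as "true" passed on the command line.
      when: install_tools | default(false) | bool
```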
Then run the playbook to use those roles
When using roles, the job of the playbook ends up being to designate which hosts to target and to import the roles to run on them. This makes it easy to set up playbooks with different host targets running different roles on those targets.
Tags can allow more control when calling playbooks
For more specific control, I sometimes set up tags so I can run parts of a playbook, as opposed to setting up playbooks for each use case.
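As a sketch, tagging looks like this. The tag names and tasks are made up, and tmp_dir is assumed to be defined elsewhere; the --tags option of ansible-playbook is what selects tagged tasks at run time.

```yaml
- hosts: nodes
  tasks:
    - name: Install packages
      apt:
        name: ripgrep
        state: present
      become: true
      tags:
        - packages

    - name: Remove the temporary directory
      file:
        path: "{{ tmp_dir }}"
        state: absent
      tags:
        - cleanup

# Run only the tasks tagged "cleanup":
#   ansible-playbook playbook.yml --tags cleanup
```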