This serves as a monolithic repository for the Grafeas Group Ansible configuration scripts.
- Python 3.8.3 or higher
- Create a virtualenv with `python3 -m venv ./venv` and activate it (i.e., `source ./venv/bin/activate`)
- Install `pip-tools` (`pip install pip-tools`)
- Sync up python dependencies with `pip-sync ./dependencies/requirements.txt` (or, if running molecule, use `./dependencies/dev-requirements.txt` instead)
Executing these steps from the shell would look like this:
```
~/src/ansible-workspace $ python3 -m venv ./venv
~/src/ansible-workspace $ source ./venv/bin/activate
(venv) ~/src/ansible-workspace $ pip install --upgrade pip
(venv) ~/src/ansible-workspace $ pip install pip-tools
(venv) ~/src/ansible-workspace $ pip-sync ./dependencies/requirements.txt
(venv) ~/src/ansible-workspace $ hash -r # optional: make your current shell notice the new commands in your PATH
```
Now you're ready to go!
By convention in the Ansible ecosystem, `site.yml` is the default build-the-world playbook. In our case, some of that work is already done for us: see our packer repository for the scripts we use to build our base VM template, which is where the Ansible in this repository picks things up.
While there are other playbooks, any playbook can be invoked like so:
```
(venv) ~/src/ansible-workspace $ export ANSIBLE_REMOTE_USER='my_server_username'
(venv) ~/src/ansible-workspace $ export ANSIBLE_REMOTE_PORT='9039' # Some alternate SSH port, if relevant
(venv) ~/src/ansible-workspace $ ssh-add ~/.ssh/id_rsa
(venv) ~/src/ansible-workspace $ ansible-playbook ./site.yml --inventory ./inventory --ask-become-pass
```
This makes a few assumptions:
- SSH connectivity to the target servers is defined in the inventory (IP address, port, etc.)
- SSH private key authentication is set up (perhaps with `ssh-agent`?)
- SSH username is defined in the environment variable `ANSIBLE_REMOTE_USER` or in the inventory as an Ansible var, `ansible_user`
- The account password for the remote user is known and can be typed when prompted for a sudo password
```
~/src/ansible-workspace/
├── dev-requirements.in
├── dev-requirements.txt
├── get_password_hash.yml
├── inventory/
├── README.md
├── requirements.in
├── requirements.txt
├── roles/
├── site.yml
└── upgrade_bots.yml
```
Overall, this repository uses the `requirements.txt` approach to tracking python dependencies, facilitated with `venv` and `pip-tools`. Dependencies in `requirements.txt` (and `dev-requirements.txt`) should represent a snapshot of all resolved versions of packages and all dependencies underneath. Since that is quite cumbersome to manually curate, we use `pip-tools` to manage it.
This means your entire development environment can be installed with `pip install -r ./dev-requirements.txt` or, for just the Ansible runtime requirements, `pip install -r ./requirements.txt`.
For best results, a good `requirements.txt` should pin each package it depends on as precisely as possible. `pip-tools` provides the `pip-compile` command to that end: it takes a generalized list of packages from an input file (`requirements.in`) and outputs the file one might otherwise have curated by hand as `requirements.txt`. The output file contains resolved versions of all packages declared in the input file plus all of their underlying dependencies, with each entry pinned to one exact version. Passing the `--generate-hashes` flag to `pip-compile` goes one step further than version constraints by also including package checksums, which are verified before each respective python package is installed.
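As an illustration only (this package list is hypothetical, not this repository's actual contents, and the hashes are abridged), a one-line input file compiles into a fully pinned, hashed output along these lines:

```
# requirements.in -- hand-curated direct dependencies
ansible

# requirements.txt -- generated by pip-compile --generate-hashes; do not edit by hand (abridged)
ansible==2.9.10 \
    --hash=sha256:...
jinja2==2.11.2 \
    --hash=sha256:...
```

Only the `.in` file is edited by humans; the `.txt` file is regenerated whenever the `.in` file changes.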
The output of `pip-compile` is a file that retains full backward compatibility with `pip install -r`. The benefit is a much smaller, intention-revealing file containing only the direct dependencies (`requirements.in`).
To modify the dependencies, edit the input file(s) (`dev-requirements.in` if it's specific to developing ansible code but not needed at runtime, `requirements.in` otherwise) and run `pip-compile --generate-hashes` on the pairings like so:

```
(venv) ~/src/ansible-workspace $ pip-compile --generate-hashes --output-file requirements.txt requirements.in
```
The resulting changes to the input and output files should be tracked in the repository.
The drawback of using `pip install -r`, even though the output of `pip-compile` is 100% compliant with it, is the case where extra dependencies linger. Say a library used to depend on a certain python package and now it doesn't: keeping that package installed adds exposure in case a CVE is published for it, and, especially in a fast-growing codebase, how do you know you're not still using the orphaned library? This is why installing dependencies from the `requirements.txt` file is recommended through `pip-sync`, another part of the `pip-tools` suite.
Running `pip-sync requirements.txt` (or `dev-requirements.txt`) will install all of the dependencies in the requirements file, even going so far as to uninstall any extra python packages in the venv that aren't listed in it. This makes `pip-sync` a blunt instrument that causes pain early, in case a package you're relying on gets removed unexpectedly, in exchange for greater long-term benefit: the smaller list in the `.in` file is much easier to curate than the entire snapshot in the `.txt` file.
Ansible playbooks are YAML files that have a specific structure.
```yaml
---
- hosts: botservers
  gather_facts: false
  tasks:
    - name: 'Check if servers are online'
      ping:
```
In the above, exceedingly simple example, we see that the playbook is a YAML document consisting of an array of objects. There can be one or more of these objects, but each one must contain the `hosts:` key pointing at a specific host or host group from the inventory (`all` for every one of the hosts in the inventory). In this case, it skips some basic feature detection with the `gather_facts: false` line, and it executes the `ping` module in order to determine if the servers in that host group are online. See the official Ansible docs for more.
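For a sense of how a larger playbook is typically shaped, here is a sketch that applies roles rather than inline tasks (the role names here are hypothetical, not necessarily the ones in this repository):

```yaml
---
- hosts: botservers
  become: true     # escalate with sudo; pairs with --ask-become-pass
  roles:
    - common       # hypothetical role: baseline packages and accounts
    - python_bots  # hypothetical role: deploy the TranscribersOfReddit bots
```

Each entry under `roles:` pulls in that role's tasks, handlers, and variables from its directory under `./roles/`.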
Our playbooks include the following:
- `get_password_hash.yml` -- A helper for typing in a password locally, when prompted, to have it spit out a well-formed hash for use in creating accounts with the `user` module
- `site.yml` -- A build-the-world playbook that provisions the servers from scratch in an idempotent way
- `upgrade_bots.yml` -- A targeted playbook that only upgrades the python bots for TranscribersOfReddit to new versions
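Any of these can be substituted for `site.yml` in the invocation shown earlier; for example, to run only the bot upgrades (assuming the same environment setup as above):

```
(venv) ~/src/ansible-workspace $ ansible-playbook ./upgrade_bots.yml --inventory ./inventory --ask-become-pass
```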
Roles are a normalized structure for reusable Ansible components, similar to what a "cookbook" is in the Chef config management system. Technically, everything in a role could be put into one massive playbook, but using roles allows for clearer intent and organization via the filesystem instead of a YAML structure with lots of scrolling.
For the sake of this monolithic repository, roles are split up into subdirectories like `./roles/{role_name}/`. Each role may contain any or all of these directories (relative to the root of the role):
- `defaults/` -- Default Ansible variables. Variables put in `main.yml` are the lowest-precedence items, generally for applying a sane default if no overrides are given.
- `handlers/` -- A representation of the `handlers:` section of a playbook, reading `main.yml` within if it exists.
- `meta/` -- Metadata about the role, including dependencies on other roles, author information, and supported platforms. All of this is intended to be parsed and acted upon by automation (e.g., platform support).
- `molecule/` -- Used for automated testing of just this role in isolation with Molecule and Docker.
- `tasks/` -- The actual actions taken to provision a target server, representing the `tasks:` section of a playbook and reading the `main.yml` file implicitly.
- `vars/` -- Ansible variables that may be programmatically included from a task in the role; `main.yml` is automatically included and is often used to override default variable values set in roles this role depends on. Generally it's fine to leave this directory alone.
- `README.md` -- Just the readme outlining the role's purpose, dependencies, and other information for general usage.
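Putting those pieces together, a hypothetical role named `my_role` would be laid out like this:

```
./roles/my_role/
├── defaults/
│   └── main.yml
├── handlers/
│   └── main.yml
├── meta/
│   └── main.yml
├── molecule/
├── tasks/
│   └── main.yml
├── vars/
│   └── main.yml
└── README.md
```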
More information can be found in the official Ansible documentation on roles.
In the context of this monolithic repository, the Ansible inventory is kept as a submodule pointing to a private repository so that implementation-specific secrets may remain secret for security purposes, yet the majority of the implementation can still adhere to the open source ideals of Grafeas Group.
The function of an Ansible inventory is to provide a manifest of target servers in order to...
- include connection information for each server
- semantically organize each server into one or more groups which can be referenced from a playbook (e.g., `webservers` and `botservers`)
- set server- or group-specific Ansible variable values (e.g., the MySQL connection string for `botservers` to use might be different than the one for `webservers` to use)
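A sketch of an INI-style inventory showing all three functions at once (every hostname, address, port, and value below is hypothetical, not taken from our private inventory):

```ini
[webservers]
web1 ansible_host=203.0.113.20 ansible_user=my_server_username

[botservers]
bot1 ansible_host=203.0.113.10 ansible_port=9039

[botservers:vars]
mysql_connection_string=mysql://bots:secret@db.example.internal/bots
```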
Since this is a different repository and is subject to change, see the other repository's README for more information on function, directory structure, and overall how to use it.
For this repository we make use of git submodules. While they are generally considered an antipattern outside of very specific use cases, they prove very useful here: the sensitive parts of the code base stay in a private repository while everything else remains publicly available.
To clone the repository and all submodules, run `git clone --recursive {clone-url}` instead of just `git clone {clone-url}`. If you're like me, however, you've likely already cloned the repository and just need to pull in the submodule content. Here's how that should go:
```
~/src/ansible-workspace $ git submodule init
Submodule 'ansible-secrets' (https://github.com/GrafeasGroup/ansible-secrets) registered for path 'inventory'
~/src/ansible-workspace $ git submodule update
Cloning into 'inventory'...
remote: Enumerating objects: 17, done.
remote: Counting objects: 100% (17/17), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 17 (delta 0), reused 17 (delta 0), pack-reused 0
Receiving objects: 100% (17/17), 4.89 KiB | 0 bytes/s, done.
Checking connectivity... done.
Submodule path 'inventory': checked out '4e7f0f76fe3393aff5149709bf1dc5b452f6c8ff'
```
Submodules are tracked in the parent repository as a reference to a particular commit, not a rolling pointer to a particular branch. This means they need to be updated from time to time. To do that, all that needs to be done is:
```
~/src/ansible-workspace $ cd ./inventory
~/src/ansible-workspace/inventory $ git pull origin master
From https://github.com/GrafeasGroup/ansible-secrets.git
 * branch            master     -> FETCH_HEAD
First, rewinding head to replay your work on top of it...
Fast-forwarded master to bcfdc30e5ed9d863b7374c79cac0aae51d8e4fb8.
~/src/ansible-workspace/inventory $ cd ..
~/src/ansible-workspace $ git add ./inventory
~/src/ansible-workspace $ git commit -m 'Updates submodule reference to latest commit on master branch'
```