One thing that you may or may not include is a setting that defines how SSH is used to connect to machines Ansible is going to configure. Before we do that, we need to spend a bit of time talking about security and Ansible. Like almost all things related to Linux (or
*nix in general), Ansible is not an integrated system, instead relying on different services that already exist. To connect to systems it manages and to execute commands, Ansible relies on
SSH (in Linux) or other systems such as WinRM or PowerShell on Windows. We are going to focus on Linux here, but remember that quite a bit of information about Ansible is completely system-independent.
SSH is a simple but extremely robust protocol that allows us to transfer data (via SFTP, SCP, and so on) and execute commands on remote hosts through a secure channel. Ansible uses SSH directly, connecting and then executing commands and transferring files. This, of course, means that for Ansible to work, it is crucial that SSH works.
There are a couple of things that you need to remember when using
SSH to connect:
- The first is the key fingerprint, as seen from the Ansible control node (server). When establishing a connection for the first time, SSH requires the user to verify and accept the keys that the remote system presents. This is designed to prevent MITM attacks and is a good tactic in everyday use. But if we are in the position of having to configure freshly installed systems, all of them will require us to accept their keys. This is time-consuming and complicated to do once we start using playbooks, so one of the first things you will probably do is disable key checking when logging into machines. Of course, this should only be used in a controlled environment, since it lowers the security of the whole Ansible system.
- The second thing you need to know is that Ansible runs as a normal user. Having said that, maybe we do not want to connect to the remote systems as the current user. Ansible solves that by having a variable that can be set on individual computers or groups that indicates what username the system is going to use to connect to this particular computer. After connecting, Ansible allows us to execute commands on the remote system as a different user entirely. This is something that is commonly used since it enables us to reconfigure the machine completely and change users as if we were at the console.
- The third thing that we need to remember is the keys – SSH can log in by using interactive authentication (a password) or by using pre-shared keys that are exchanged once and then reused to establish the SSH session. There is also ssh-agent, which can be used to authenticate sessions.
Although we can use fixed passwords inside inventory files (or special key vaults), this is a bad idea. Luckily, Ansible enables us to script a lot of things, including copying keys to remote systems. This means that we are going to have some playbooks that are going to automate deployment of new systems, and these will enable us to take control of them for further configuration.
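As mentioned for the first point, host key checking can be switched off globally. One way to do this is in ansible.cfg (a sketch; again, only do this in a controlled environment, since it weakens protection against MITM attacks):

```ini
[defaults]
host_key_checking = False
```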
To sum this up, the Ansible steps for deploying a system will probably start like this:
- Install the core system and make sure that the SSH service is up and running.
- Define a user that has admin rights on the system.
- From the control node, run a playbook that will establish the initial connection and copy the local SSH key to the remote system.
- Use the appropriate playbooks to reconfigure the system securely, and without the need to store passwords locally.
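The key-copying step can itself be automated. As a sketch (the remote username, key path, and playbook name here are assumptions, not from the original text), a minimal playbook using the authorized_key module might look like this:

```yaml
# copy-keys.yaml - hypothetical example: push the control node's
# public key to freshly installed hosts.
# Run with: ansible-playbook copy-keys.yaml --ask-pass
- hosts: all
  remote_user: root
  tasks:
    - name: Install the local public key for the root user
      authorized_key:
        user: root
        state: present
        key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
```

Running it with --ask-pass means the password is typed once interactively and never stored; after this, all further connections can use the key.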
Now, let’s dig deeper.
Every reasonable manager will tell you that in order to do anything, you need to define the scope of the problem. In automation, this means defining the systems that Ansible is going to work on. This is done through an inventory file (a different inventory can also be supplied at runtime with the -i option).
Hosts can be grouped or individually named. In text format, that can look like this:
[servers]
srv1.local
srv2.local
srv3.local

[workstations]
wrk1.local
wrk2.local
wrk3.local
The same inventory in YAML format looks like this:

all:
  children:
    servers:
      hosts:
        srv1.local:
        srv2.local:
        srv3.local:
    workstations:
      hosts:
        wrk1.local:
        wrk2.local:
        wrk3.local:
    production:
      hosts:
        srv1.local:
      children:
        workstations:
We created another group called Production that contains all the workstations and one server.
Anything that is not part of the default or standard configuration can be included individually in the host definition or in the group definition as variables. Every Ansible command has some way of giving you flexibility in terms of partially or completely overriding all the items in the configuration or inventory.
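For example, a variable can be attached to a single host directly in the inventory (the values here are hypothetical):

```ini
[servers]
srv1.local ansible_user=admin ansible_port=2222
srv2.local
srv3.local
```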
Ranges can also be used to shorten the inventory:

[servers]
srv[1:3].local

[workstations]
wrk[1:3].local
IP ranges can also be used. So, for instance,
10.0.0.0/24 would be written down as follows:
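The inventory entry for that subnet would look something like this (a sketch; adjust the range to the hosts you actually have):

```ini
[servers]
10.0.0.[1:254]
```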
There are two predefined default groups that can also be used: all and ungrouped. As their names suggest, if we reference all in a playbook, it will be run on every host we have in our inventory. ungrouped will reference only those systems that are not part of any group.
Ungrouped references are especially useful when setting up new computers – if they are not in any group, we can consider them new and set them up to be joined to a specific group.
These groups are defined implicitly and there is no need to reconfigure them or even mention them in the inventory file.
We mentioned that the inventory file can contain variables. Variables are useful when we need to have a property that is defined for a group of computers – a user, a password, or a setting specific to that group. Let’s say that we want to define a user that is going to be used on the servers group:
- First, we define a group:
- Then, we define the variables that are going to be used for the whole group:
[servers:vars]
ansible_user=Ansibleuser
ansible_connection=ssh
Note that the password is not present and that a playbook using this inventory will fail unless the password is supplied separately or the keys have been exchanged beforehand. For more on variables and their use, consult the Ansible documentation.
Now that we’ve created our first practical Ansible task, it’s time to talk about how to make Ansible do many things at once using a more structured approach. It’s important to be able to create a single task, or a couple of tasks, and combine them through a concept called a playbook, which can include multiple tasks/plays.
Working with playbooks
Once we’ve decided how to connect to the machines we plan to administer, and once we have created the inventory, we can start actually using Ansible to do something useful. This is where playbooks start to make sense.
In our examples, we’ve configured four CentOS 7 systems, gave them consecutive addresses in the range of 10.0.0.1 to 10.0.0.4, and used them for everything.
Ansible is installed on the system with the IP address
10.0.0.1, but as we already said, this is completely arbitrary. Ansible has a minimal footprint on the system that is used as a control node and can be installed on any system, as long as it has connectivity to the rest of the network we are going to manage. We simply chose the first computer in our small network. One more thing to note is that the control node can manage itself through Ansible. This is possible, but not a good idea. Depending on your setup, you will want to test not only playbooks, but also individual commands, before they are deployed to other machines – doing that on your control node is not wise.
Now that Ansible is installed, we can try and do something with it. There are two distinct ways that Ansible can be run. One is by running a playbook, a file that contains tasks that are to be performed. The other way is by using a single task, sometimes called ad hoc execution. There are reasons to use Ansible either way – playbooks are our main tool, and you will probably use them most of the time. But ad hoc execution also has its advantages, especially if we are interested in doing something that we need done once, but across multiple servers. A typical example is using a simple command to check the version of an installed application or the application’s state. For a one-off check like that, we are not going to write a playbook.
To see if everything works, we are going to start by simply using ping to check if the machines are online.
Ansible likes to call itself radically simple automation, and the first thing we are going to do proves that.
We are going to use a module named ping that tries to connect to a host, verifies that there is a usable Python environment on it, and returns a message if everything is OK. Do not confuse this module with the
ping command in Linux; we are not pinging through a network; we are only checking communication from the control node to the servers we are trying to control. We will use a simple
ansible command to ping all the defined hosts by issuing the following command:
ansible all -m ping
The following is the result of running the preceding command:
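For a single host, the output looks roughly like this (an illustrative sample; exact fields can vary between Ansible versions):

```
srv1.local | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
```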
This particular module has no mandatory parameters or options, so we just need to run it in order to get a result. The result itself is interesting; it is structured data and contains a few things other than just the result of the command.
If we take a closer look at this, we will see that Ansible returned one result for each host in the inventory. The first thing we can see is the final result of the command –
SUCCESS means that the task itself ran without a problem. After that, we can see the data that was returned –
ansible_facts contains information that the module returns, and it is used extensively when writing playbooks. Data that is returned this way can vary. In the next section, we will show a much bigger dataset, but in this particular case, the only thing that is shown is the location of the Python interpreter. After that, we have the
changed variable, which is an interesting one.
When Ansible runs, it tries to detect whether it ran correctly and whether it has changed the system state. In this particular task, the command that ran is just informative and does not change anything on the system, so the system state was unchanged.
In other words, this means that whatever was run did not install or change anything on the system. States will make more sense later when we need to check if something was installed or not, such as a service.
Let’s do something similar, but this time with an argument, such as an ad hoc command that we want to be executed on remote hosts. So, type in the following command:
ansible all -m shell -a "hostname"
The following is the output:
Here, we called another module called
shell. It simply runs whatever is given as a parameter as a shell command. What is returned is the local hostname. This is functionally the same as what would happen if we connected to each host in our inventory using
SSH, executed the command, and then logged out.
For a simple demonstration of what Ansible can do, this is OK, but let’s do something more complex. We are going to use a module called
yum that is specific to CentOS/Red Hat to check if there is a web server installed on our hosts. The web server we are going to check for is going to be
lighttpd since we want something lightweight.
When we talked about states, we touched on a concept that is both a little confusing at first and extremely useful once we start using it. When calling a command like this, we are declaring a desired state, so the system itself will change if the state is not the one we are demanding. This means that, in this example, we are not actually testing if
lighttpd is installed – we are telling Ansible to check for it and, if it is not installed, to install it. Even this is not completely true – the module takes two arguments: the name of the service and the state it should be in. If the state on the system we are checking is the same as the state we requested when invoking the module, we are going to get
changed: false since nothing changed. But if the state of the system is not the same, Ansible will make the current state of the system the same as the state we requested.
To prove this, we are going to make sure that the service is not installed – absent, in Ansible terms. Remember that if the service was installed, this will uninstall it. Type in the following command:
ansible all -m yum -a "name=lighttpd state=absent"
Then, we can say that we want it present on the system. Ansible is going to install the services as needed:
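That command mirrors the previous one, with the opposite state:

```
ansible all -m yum -a "name=lighttpd state=present"
```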
Here, we can see that Ansible simply checked and installed the service since it wasn’t there. It also provided us with other useful information, such as what changes were done on the system and the output of the command it performed. Information was provided as an array of variables; this usually means that we will have to do some string manipulation in order to make it look nicer.
Now, let’s run the command again:
ansible all -m yum -a "name=lighttpd state=present"
This should be the result:
As we can see, there were no changes here since the service is installed.
These were all just starting examples so that we could get to know Ansible a little bit. Now, let’s expand on this and create an Ansible playbook that’s going to install KVM on our predefined set of hosts.
Now, let’s create our first playbook and use it to install KVM on all of our hosts. For our playbook, we used an excellent example from a GitHub repository created by Jared Bloomer, which we changed a bit since we already have our options and inventory configured. The original files are available at https://github.com/jbloomer/Ansible—Install-KVM-on-CentOS-7.git.
This playbook will show everything that we need to know about automating simple tasks. We chose this particular example because it shows not only how automation works, but also how to create separate tasks and reuse them in different playbooks. Using a public repository has the added benefit that you will always get the latest version, but it may differ significantly from the one presented here:
- First, we created our main playbook – the one that will get called – and named it main.yaml. The hosts variable defines what part of the inventory this playbook is going to be performed on – in our case, all the hosts. We can override this (and all the other variables) at runtime, but it helps to limit the playbook to just the hosts we need to control. In our particular case, this is actually all the hosts in our inventory, but in production, we will probably have more than one group of hosts.
The next variable is the name of the user that is going to perform the task. What we did here is not recommended in production since we are using a superuser account to perform tasks. Ansible is completely capable of working with non-privileged accounts and elevating rights when needed, but as in all demonstrations, we are cutting corners to make things easier to understand.
Now comes the part that is actually performing our tasks. In Ansible, we declare roles for the system. In our example, there are two of them. Roles are really just tasks to be performed, and that will result in a system that will be in a certain state. In our first role, we are going to check if the system supports virtualization, and then in the second one, we will install KVM services on all the systems that do.
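As a rough sketch (the exact file contents and the name of the second role are assumptions based on the description above, not the repository's exact code), the main playbook could look like this:

```yaml
# main.yaml - hypothetical reconstruction of the top-level playbook
- hosts: all
  remote_user: root        # not recommended in production
  roles:
    - checkVirtualization  # fail early if the CPU cannot virtualize
    - installKVM           # hypothetical name for the second role
```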
- When we downloaded the script from GitHub, it created a few folders. In the one named roles, there are two subfolders, each containing a file; one is called checkVirtualization and the other installs the KVM packages. You can probably already see where this is heading. First, let’s see what the checkVirtualization role does:
This task simply calls a shell command and tries to grep for the lines containing virtualization parameters for the CPU. If it finds none, it fails.
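A minimal version of such a check (a sketch, not the repository's exact task) could be:

```yaml
# roles/checkVirtualization/tasks/main.yaml - illustrative sketch
- name: Verify that the CPU supports virtualization
  shell: grep -E -q 'vmx|svm' /proc/cpuinfo
```

grep exits with a non-zero status if neither the Intel (vmx) nor AMD (svm) flag is present, which makes the task, and with it the play on that host, fail.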
- Now, let’s see the other task:
The first part is a simple loop that will install five different packages if they are not present. We are using the package module here, which is a different approach than the one we used in our first demonstration of how to install packages. The module that we used earlier in this chapter is called yum and is specific to CentOS as a distribution. The package module is a generic module that will translate to whatever package manager a specific distribution is using. Once we’ve installed all the packages we need, we need to make sure that libvirtd is enabled and started.
We are using a simple loop to go through all the packages that we are installing. This is not necessary, but it is a better way to do things than copying and pasting individual commands since it makes the list of packages that we need much more readable.
Then, as the last part of the task, we verify if the KVM has loaded.
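Put together, a hedged sketch of what this role might contain (the exact package list is an assumption; typical CentOS 7 package names are shown):

```yaml
# roles/installKVM/tasks/main.yaml - illustrative sketch
- name: Install the KVM packages
  package:
    name: "{{ item }}"
    state: present
  loop:
    - qemu-kvm
    - libvirt
    - libvirt-python
    - libvirt-client
    - virt-install

- name: Make sure libvirtd is enabled and started
  service:
    name: libvirtd
    state: started
    enabled: yes

- name: Verify that the KVM module is loaded
  shell: lsmod | grep -i kvm
```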
As we can see, the syntax for the playbook is a simple one. It is easily readable, even by somebody who has only minor knowledge of scripting or programming. We could even say that having a firm understanding of how the Linux command line works is more important.
- In order to run a playbook, we use the ansible-playbook command, followed by the name of the playbook. In our case, we’re going to use the ansible-playbook main.yaml command. Here are the results:
- Here, we can see that Ansible breaks down everything it did on every host, change by change. The end result is a success:
Now, let’s check if our freshly installed KVM cluster is working.
- We are going to start virsh and list the active VMs on all the parts of the cluster:
Having finished this simple exercise, we have a running KVM on four machines and the ability to control them from one place. But we still have no VMs running on the hosts. Next, we are going to show you how to create a CentOS installation inside the KVM environment, but we are going to use the most basic method to do so. We are going to do two things: first, we are going to download a minimal ISO image for CentOS from the internet. Then, we are going to call virt-install to create a VM from it. This book will show you different ways to accomplish this task; downloading from the internet is one of the slowest:
- As always, Ansible has a module dedicated to downloading files. The parameters it expects are the URL where the file is located and the location of the saved file:
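That module is get_url. A sketch of such a task follows; the mirror URL is an assumption (pick one near you), while the destination path matches the one used in the virt-install command later:

```yaml
# Illustrative sketch of downloading the ISO to every host
- hosts: all
  tasks:
    - name: Download the CentOS 7 minimal ISO
      get_url:
        url: http://mirror.centos.org/centos/7/isos/x86_64/CentOS-7-x86_64-Minimal-1810.iso
        dest: /var/lib/libvirt/boot/CentOS-7-x86_64-Minimal-1810.iso
```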
- After running the playbook, we need to check if the files have been downloaded:
- Since we are not automating this and instead creating a single task, we are going to run it in a local shell. The command to run for this would be something like the following:
ansible all -m shell -a "virt-install --name=COS7Core --ram=2048 --vcpus=4 --cdrom=/var/lib/libvirt/boot/CentOS-7-x86_64-Minimal-1810.iso --os-type=linux --os-variant=rhel7 --disk path=/var/lib/libvirt/images/cos7vm.dsk,size=6"
- Without a kickstart file or some other kind of preconfiguration, this VM makes no sense since we will not be able to connect to it or even finish the installation. In the next task, we will remedy that using cloud-init.
Now, we can check if everything worked:
Now, we are going to wipe our KVM cluster and start again, but this time with a different configuration: we are going to deploy the cloud version of CentOS and reconfigure it using cloud-init.
Using Ansible and cloud-init for automation and orchestration
Cloud-init is one of the more popular ways of machine deployment in private and hybrid cloud environments. This is because it enables machines to be quickly reconfigured in a way that enables just enough functionality to get them connected to an orchestration environment such as Ansible.
More details can be found at cloud-init.io, but in a nutshell, cloud-init is a tool that enables the creation of special files that can be combined with VM templates in order to rapidly deploy them. The main difference between cloud-init and unattended installation scripts is that cloud-init is more or less distribution-agnostic and much easier to change with scripting tools. This means less work during deployment, and less time from the start of deployment until machines are online and working. On CentOS, this can be accomplished with kickstart files, but these are not nearly as flexible as cloud-init.
Cloud-init works using two separate parts: one is the distribution file for the operating system we are deploying. This is not the usual OS installation file, but a specially configured machine template intended to be used as a cloud-init image.
The other part of the system is the configuration file, which is compiled – or, to be more precise, packed – from a special YAML text file that contains the configuration for the machine. This configuration is small and ideal for network transmission.
These two parts are intended to be used as a whole to create multiple instances of identical virtual machines.
- First, we distribute a machine template that is completely identical for all the machines that we are going to create. This means having one master copy and creating all the instances out of it.
- Then, we pair the template with a specially crafted file that is created using cloud-init. Our template, regardless of the OS it uses, is capable of understanding different directives that we can set in the cloud-init file and will be reconfigured. This can be repeated as needed.
Let’s simplify this even more: if we need to create 100 servers that will have four different roles using the unattended installation files, we would have to boot 100 images and wait for them to go through all the installation steps one by one. Then, we would need to reconfigure them for the task we need. Using cloud-init, we are booting one image in 100 instances, but the system takes only a couple of seconds to boot since it is already installed. Only critical information is needed to put it online, after which we can take over and completely configure it using Ansible.
We are not going to dwell too much on cloud-init’s configuration; everything we need is in this example:
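A minimal user-data file matching the directives discussed here might look like the following (the username and the key are placeholders; the real file must start with the #cloud-config marker):

```yaml
#cloud-config
package_upgrade: true
users:
  - name: ansible
    lock_passwd: false
    shell: /bin/bash
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh_authorized_keys:
      - ssh-rsa AAAAB3... ansible@control-node
```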
As always, we will explain what’s going on step by step. One thing we can see from the start is that it uses straight YAML notation, the same as Ansible. The first directive is here to make sure that our machine is updated as it enables automatically updating the packages on the cloud instance.
Setting lock_passwd to false means that we are going to permit using a password to log in. If nothing is configured, the default is to permit logging in only using SSH keys, with password login disabled completely.
Then, we set the shell this user is going to use, and what needs to be added to the /etc/sudoers file for them. In this particular case, we are giving this user complete control over the system.
The last thing is probably the most important. This is the public
SSH key that we have on our system. It’s used to authorize the user when they’re logging in. There can be multiple keys here, and they are going to end up in the
SSHD configuration to enable users to perform a passwordless login.
There are plenty more variables and directives we can use here, so consult the
cloud-config documentation for more information.
After we have created this file, we need to convert it into an
.iso file that is going to be used for installation. The command to do this is
cloud-localds. We are using our YAML file as one parameter and the
.iso file as another.
After running cloud-localds config.iso config.yaml, we are ready to begin our deployment.
The next thing we need is the cloud image itself. We are going to get it from https://cloud.centos.org/centos/7/images.
There are quite a few files here denoting all the available versions of the CentOS image. If you need a specific version, pay attention to the numbers denoting the month/year of the image release. Also, note that images come in two flavors – compressed and uncompressed.
Images are in
qcow2 format and intended to be used in the cloud as a disk.
In our example, on the Ansible machine, we created a new directory called
/clouddeploy and saved two files into it: one that contains the OS cloud image and
config.iso, which we created using cloud-localds.
- First, we are going to copy the cloud image and our configuration onto our KVM hosts. After that, we are going to create a machine out of these and start it:
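Based on the description that follows, a sketch of such a playbook (the image file name and the VM name are assumptions) might be:

```yaml
# Illustrative sketch - deploy a cloud image plus its cloud-init ISO
- hosts: cloudhosts
  tasks:
    - name: Copy the cloud image and the cloud-init ISO to the KVM host
      copy:
        src: "/clouddeploy/{{ item }}"
        dest: /var/lib/libvirt/images/
      loop:
        - CentOS-7-x86_64-GenericCloud.qcow2
        - config.iso

    - name: Create and import the VM from the copied image
      command: >
        virt-install --name cloudvm --memory 1024 --vcpus 1
        --disk /var/lib/libvirt/images/CentOS-7-x86_64-GenericCloud.qcow2
        --disk /var/lib/libvirt/images/config.iso,device=cdrom
        --import --graphics none --noautoconsole --os-variant rhel7
```

The --import flag tells virt-install to boot straight from the existing disk image instead of running an installation, and attaching config.iso as a CD-ROM is what lets cloud-init find its configuration on first boot.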
Since this is our first complicated playbook, we need to explain a few things. In every play or task, there are some things that are important. A name is used to simplify running the playbook; this is what is going to be displayed when the playbook runs. This name should be explanatory enough to help, but not too long in order to avoid clutter.
After the name, we have the business part of each task – the name of the module being called. In our example, we are using three distinct ones: copy is used to copy files between hosts, command executes commands on the remote machine, and virt contains the commands and states needed to control the virtual environment.
You will notice when reading this that src denotes a local directory, while dest denotes a remote one. This is by design. To simplify things, copy works between the local machine (the control node running Ansible) and the remote machine (the one being configured). Directories will get created if they do not exist, and copy will apply the appropriate permissions.
After that, we are running a command that will work on local files and create a virtual machine. One important thing here is that we are basically running the image we copied; the template is on the control node. At the same time, this saves disk space and deployment time – there is no need to copy the machine from local to remote disk and then duplicate it on the remote machine once again; as soon as the image is there, we can run it.
Back to the important part – the local installation. We are creating a machine with 1 GB of RAM and one CPU, using the disk image we just copied. We’re also attaching our config.iso file as a virtual CD/DVD. We are then importing this image and using no graphic terminal.
- The last task is starting the VM on the remote KVM host. We will use the following command to do so:
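Assuming the VM was named cloudvm (a placeholder name), the ad hoc equivalent would be:

```
ansible cloudhosts -m virt -a "name=cloudvm state=running"
```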
We can also check this using the command line:
ansible cloudhosts -m shell -a "virsh list --all"
The output of this command should look something like this:
We can also check the DHCP leases on the default network:

ansible cloudhosts -m shell -a "virsh net-dhcp-leases --network default"
This verifies that our machines are running correctly and that they are connected to their local network on the local KVM instance. Elsewhere in this book, we will deal with KVM networking in more detail, so it should be easy to reconfigure machines to use a common network, either by bridging adapters on the KVMs or by creating a separate virtual network that will span across hosts.
Another thing we wanted to show is the machine status on all the hosts. The point is that we are not using the shell module this time; instead, we are relying on the virt module to show how to use it from the command line. There is only one subtle difference here. When we are calling the shell (or command) modules, we are passing parameters that are going to get executed. These modules basically just spawn another process on the remote machine and run it with the parameters we provided.
In contrast, the
virt module takes the variable declaration as its parameter since we are running
command=info. When using Ansible, you will notice that, sometimes, variables are just states. If we wanted to start a particular instance, we would just add
state=running, along with an appropriate name, and Ansible would make sure that the VM is running. Let’s type in the following command:
ansible cloudhosts -m virt -a "command=info"
The following is the expected output:
There is only one thing that we haven’t covered yet – how to install multi-tiered applications. Pushing the definition to its simplest extreme, we are going to install a LAMP server using a simple playbook.