Machine ID with Ansible
Ansible is a common tool for managing fleets of Linux hosts via SSH. In order to connect to the hosts, it requires a form of authentication. Machine ID can be used to provide short-lived certificates to Ansible that allow it to connect to SSH nodes enrolled in Teleport in a secure and auditable manner.
In this guide, you will configure the Machine ID agent, tbot, to produce
credentials and an OpenSSH configuration, and then configure Ansible to use
these to connect to your SSH nodes through the Teleport Proxy Service.
Prerequisites
You will need the following tools to use Teleport with Ansible.
- A running Teleport cluster version 17.0.0-dev or above. If you want to get started with Teleport, sign up for a free trial or set up a demo environment.
- The tctl admin tool and tsh client tool. Visit Installation for instructions on downloading tctl and tsh.
- ssh OpenSSH tool
- ansible >= 2.9.6
- Optional: jq to process JSON output
- tbot must already be installed and configured on the machine that will run Ansible. For more information, see the deployment guides.
- If you followed the above guide, note the --destination-dir=/opt/machine-id flag, which defines the directory where SSH certificates and OpenSSH configuration used by Ansible will be written. In particular, you will be using the /opt/machine-id/ssh_config file in your Ansible configuration to define how Ansible should connect to Teleport Nodes.
- To check that you can connect to your Teleport cluster, sign in with tsh login, then verify that you can run tctl commands using your current credentials. For example:
$ tsh login --proxy=teleport.example.com --user=email@example.com
$ tctl status
# Cluster teleport.example.com
# Version 17.0.0-dev
# CA pin sha256:abdc1245efgh5678abdc1245efgh5678abdc1245efgh5678abdc1245efgh5678
If you can connect to the cluster and run the tctl status command, you can use your current credentials to run subsequent tctl commands from your workstation. If you host your own Teleport cluster, you can also run tctl commands on the computer that hosts the Teleport Auth Service for full permissions.
Step 1/4. Configure RBAC
As Ansible will use the credentials produced by tbot to connect to the SSH
nodes, you first need to configure Teleport to grant the bot access. This is
done by creating a role that grants the necessary permissions and then assigning
this role to a Bot.
In this example, access will be granted to all SSH nodes for the username root. Ensure that you set this to a username that is available across your SSH nodes and that will have the appropriate privileges to manage your nodes.
Create a file called role.yaml with the following content:
kind: role
version: v6
metadata:
  name: example-role
spec:
  allow:
    # Allow login to the user 'root'.
    logins: ['root']
    # Allow connection to any node. Adjust these labels to match only nodes
    # that Ansible needs to access.
    node_labels:
      '*': '*'
Replace example-role with a descriptive name related to your use case.
For production use, you should use labels to restrict this access to only the hosts that Ansible needs to access. This is known as the principle of least privilege, and it reduces the damage that exfiltrated credentials can do.
Use tctl create -f ./role.yaml to create the role.
Now, use tctl bots update to add the role to the Bot. Replace example
with the name of the Bot you created in the deployment guide and example-role
with the name of the role you just created:
$ tctl bots update example --add-roles example-role
Step 2/4. Configure the tbot output
Now, tbot needs to be configured with an output that will produce the
credentials and SSH configuration that Ansible needs. For SSH,
we use the identity output type.
Outputs must be configured with a destination. In this example, the directory
destination will be used. This will write these credentials to a specified
directory on disk. Ensure that this directory can be written to by the Linux
user that tbot runs as, and that it can be read by the Linux user that Ansible
will run as.
Modify your tbot configuration to add an identity output:
outputs:
  - type: identity
    destination:
      type: directory
      # For this guide, /opt/machine-id is used as the destination directory.
      # You may wish to customize this. Multiple outputs cannot share the same
      # destination.
      path: /opt/machine-id
If operating tbot as a background service, restart it. If running tbot in
one-shot mode, it must be executed before you attempt to execute the Ansible
playbook.
You should now see several files under /opt/machine-id:
- ssh_config: this can be used with Ansible or OpenSSH to configure them to use the Teleport Proxy Service with the correct credentials when making connections.
- known_hosts: this contains the Teleport SSH host CAs and allows the SSH client to validate a host's certificate.
- key-cert.pub: this is an SSH certificate signed by the Teleport SSH user CA.
- key: this is the private key needed to use the SSH certificate.
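As a quick sanity check before moving on, the following Python sketch reports which of the files listed above are missing from the destination directory. The function name missing_outputs is hypothetical, and the file list is taken only from this guide; tbot may write additional files that the sketch ignores.

```python
from pathlib import Path

# File names taken from the list in this guide; tbot may write other files too.
EXPECTED_FILES = ("ssh_config", "known_hosts", "key-cert.pub", "key")


def missing_outputs(destination_dir):
    """Return the expected file names that are absent from destination_dir."""
    base = Path(destination_dir)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    # /opt/machine-id is the destination directory used throughout this guide.
    missing = missing_outputs("/opt/machine-id")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Run it as the Linux user that Ansible will run as, so that a permissions problem on the directory shows up here rather than during the playbook run.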
Next, Ansible will be configured to use these files when making connections.
Step 3/4. Configure Ansible
Create a folder named ansible where all Ansible files will be collected.
$ mkdir -p ansible
$ cd ansible
Create a file called ansible.cfg. We will configure Ansible
to run the OpenSSH client with the configuration file generated
by Machine ID, /opt/machine-id/ssh_config. Note, example.com here is the
name of your Teleport cluster.
[defaults]
host_key_checking = True
inventory=./hosts
remote_tmp=/tmp
[ssh_connection]
scp_if_ssh = True
ssh_args = -F /opt/machine-id/ssh_config
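If Ansible later appears to ignore the Machine ID configuration, it can help to confirm which file ssh_args actually points at. The sketch below parses the text of an ansible.cfg with Python's standard configparser module and returns the path given to -F. The function name is hypothetical, and the sketch assumes the [ssh_connection] layout shown above.

```python
import configparser
import shlex


def ssh_config_from_ansible_cfg(cfg_text):
    """Return the path passed to -F in [ssh_connection] ssh_args, or None."""
    # Interpolation is disabled so that '%' characters in ssh_args are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    ssh_args = parser.get("ssh_connection", "ssh_args", fallback="")
    tokens = shlex.split(ssh_args)
    for i, token in enumerate(tokens):
        if token == "-F" and i + 1 < len(tokens):
            return tokens[i + 1]
    return None


if __name__ == "__main__":
    sample = "[ssh_connection]\nssh_args = -F /opt/machine-id/ssh_config\n"
    print(ssh_config_from_ansible_cfg(sample))
```

To check a real file, read its contents first, for example with open("ansible.cfg").read(), and pass the resulting string to the function.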
You can then create an inventory file called hosts. This should refer to the
hosts using their hostname as registered in Teleport, with the name of your
Teleport cluster appended. For example, if your cluster is
called teleport.example.com and your host is called node1, the entry in
hosts would be node1.teleport.example.com.
You can generate an inventory file for all your nodes that meets this requirement with a script like the following:
# Source tsh env to get the name of the current Teleport cluster.
$ eval "$( tsh env )"
# You can modify the `tsh ls` command to filter nodes based on their labels.
$ tsh ls --format=json | jq --arg cluster "$TELEPORT_CLUSTER" -r '.[].spec.hostname + "." + $cluster' > hosts
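Since jq is optional, the same transformation can be sketched in Python. The function below assumes the JSON shape implied by the jq filter above (a list of objects with a spec.hostname field); the function name and the sample data are hypothetical.

```python
import json


def inventory_lines(tsh_ls_json, cluster):
    """Build inventory entries of the form <hostname>.<cluster>."""
    nodes = json.loads(tsh_ls_json)
    return [f"{node['spec']['hostname']}.{cluster}" for node in nodes]


if __name__ == "__main__":
    # Hypothetical sample standing in for the output of `tsh ls --format=json`.
    sample = '[{"spec": {"hostname": "node1"}}]'
    for line in inventory_lines(sample, "teleport.example.com"):
        print(line)
```

With real data, pass the output of tsh ls --format=json as the first argument and the cluster name as the second, then write the returned lines to the hosts file.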
Not seeing Nodes?
When Teleport's Auth Service receives a request to list Teleport Nodes (e.g., to
display Nodes in the Web UI or via tsh ls), it only returns the Nodes that the
current user is authorized to view.
For each Node in the user's Teleport cluster, the Auth Service applies the following checks in order and, if one check fails, hides the Node from the user:
- None of the user's roles contain a deny rule that matches the Node's labels.
- At least one of the user's roles contains an allow rule that matches the Node's labels.
If you are not seeing Nodes when expected, make sure that your user's roles
include the appropriate allow and deny rules as documented in the
Teleport Access Controls Reference.
Step 4/4. Run a playbook
Finally, let's create a simple Ansible playbook, playbook.yaml. The example
playbook below runs hostname on all hosts.
- hosts: all
  remote_user: root
  tasks:
    - name: "hostname"
      command: "hostname"
From the folder ansible, run the Ansible playbook:
$ ansible-playbook playbook.yaml
# PLAY [all] *****************************************************************************************************************************************
# TASK [Gathering Facts] *****************************************************************************************************************************
#
# ok: [terminal]
#
# TASK [hostname] ************************************************************************************************************************************
# changed: [terminal]
#
# PLAY RECAP *****************************************************************************************************************************************
# terminal : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
You are all set. You have provided your machine with short-lived certificates tied to a machine identity that can be rotated, audited, and controlled with all the familiar Teleport access controls.
Troubleshooting
If Ansible cannot connect, you may see an error like this one:
example.host | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname node-name: Name or service not known",
    "unreachable": true
}
You can examine and tweak patterns matching the inventory hosts in ssh_config.
Try the SSH connection using ssh_config with verbose mode to inspect the error:
$ ssh -vvv -F /opt/machine-id/ssh_config root@node-name.example.com
If ssh works, try running the playbook with verbose mode on:
$ ansible-playbook -vvv playbook.yaml
If your hostnames contain uppercase characters (like MYHOSTNAME), note that
Teleport's internal hostname matching is case sensitive by default, which can
also lead to this error.
If this is the case, you can work around it by enabling case-insensitive routing at the cluster level.
- Self-hosted Teleport
Edit your /etc/teleport.yaml config file on all servers running the Teleport auth_service, then restart Teleport on each.
auth_service:
  case_insensitive_routing: true
- Managed Teleport Enterprise/Cloud
Run tctl edit cluster_networking_config to add the following specification, then save and exit.
spec:
  case_insensitive_routing: true
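To find out whether your inventory is affected by the case-sensitivity issue described above, a short Python sketch can list the entries that contain uppercase characters. It assumes a flat, INI-style inventory like the hosts file from Step 3, and the function name is hypothetical.

```python
def hosts_with_uppercase(inventory_text):
    """Return inventory entries that contain at least one uppercase character."""
    flagged = []
    for line in inventory_text.splitlines():
        entry = line.strip()
        # Skip blank lines, comments, and INI-style group headers such as [web].
        if not entry or entry.startswith(("#", ";", "[")):
            continue
        # The host name is the first whitespace-separated token on the line.
        host = entry.split()[0]
        if host != host.lower():
            flagged.append(host)
    return flagged


if __name__ == "__main__":
    sample = "node1.teleport.example.com\nMYHOSTNAME.teleport.example.com\n"
    print(hosts_with_uppercase(sample))
```

If the function returns an empty list for your hosts file, the unresolved hostname error most likely has a different cause.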
Next steps
- Read the configuration reference to explore all the available configuration options.