Connections to Linux systems are all based on Secure Shell (hereafter: SSH). SSH supports multiple underlying cryptographic algorithms to provide secure authentication. While password authentication is still supported, the main recommendation is to use key-based authentication to prevent password brute forcing. Something else you can do to increase the level of control you have as a system administrator over your systems is to use SSH Certificate Authorities.

This blog post is partly a collection of other blog posts by other authors, which helped me a great deal in understanding how SSH Certificate Authorities (hereafter: CAs) work. I also provide a simple implementation that will hopefully show the value of using SSH certificates on top of normal SSH key authentication. I could not find a comprehensive resource that covered SSH CAs from start to finish and also included an implementation. This is my attempt.

Throughout this blog post I am using elliptic-curve-based cryptographic algorithms. I expect that everything I am doing also translates to RSA-based keys.

Prerequisite knowledge for this blog post is how to set up public-private key pairs and how to use them to connect to a remote host. There are a lot of sources around that explain how to set them up and how they work.

The rest of this blog post consists of the following parts:

SSH Host CA - Fixing TOFU
SSH User CA - Adding Access Administration and RBAC
Conclusion
Sources

1. SSH Host CA - Fixing Trust on First Use (TOFU)

A familiar problem when using SSH keys is that when you first connect to a certain host, the host key presented by the remote system is unknown and therefore untrusted. Thus, you are presented with the question: Do you want to trust this key? Generally this is treated as a “please press yes to continue” prompt, but it is of critical importance when connecting to hosts. What you do by pressing ‘yes’ is Trust on First Use (hereafter: TOFU).

Ideally you would like to verify this key out of band of the SSH protocol. You care, because you want to make sure you are connecting to the right host the first time and avoid Man In the Middle attacks (hereafter: MITM). From then onwards you can trust that the connection remains secure.
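
An out-of-band check boils down to comparing fingerprints obtained over two different channels. The following is a minimal sketch on a throwaway key in a temporary directory; the file names demo_host_key and seen_by_client.pub are made up for this illustration. On a real server you would print the fingerprint of the key under /etc/ssh and compare it with what your client shows on first connect.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; demo_host_key is a hypothetical name, not a real host key.
workdir=$(mktemp -d)
ssh-keygen -q -t ed25519 -N "" -C "demo-host" -f "$workdir/demo_host_key"

# Fingerprint as printed "on the host" (the trusted channel).
fp_on_host=$(ssh-keygen -lf "$workdir/demo_host_key.pub" | awk '{print $2}')

# Fingerprint of the key "seen by the client" (here simply a copy of the same key).
cp "$workdir/demo_host_key.pub" "$workdir/seen_by_client.pub"
fp_on_client=$(ssh-keygen -lf "$workdir/seen_by_client.pub" | awk '{print $2}')

if [ "$fp_on_host" = "$fp_on_client" ]; then
    echo "fingerprints match: $fp_on_host"
else
    echo "fingerprint MISMATCH, do not trust this host" >&2
    exit 1
fi
```

Doing this by hand for every host is exactly the part that does not scale.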

One solution would be to manually verify the SSH keys. You could fetch them over the network using ssh-keyscan, but a better way is to print the fingerprints on the host itself using ssh-keygen -lf << key_file >>.pub. As you might expect, this does not scale very well. This problem can be solved neatly using a host certificate. Visually, this looks like this:

SSH Host CA

Note that this certificate is issued for the hostname of the machine (its principal) and has a limited period of validity. In detail, this looks like this:

ssh-keygen -L -f /etc/ssh/ssh_host_ed25519_key-cert.pub
/etc/ssh/ssh_host_ed25519_key-cert.pub:
        Type: ssh-ed25519-cert-v01@openssh.com host certificate
        Public key: << public key of host certificate >>
        Signing CA: << public key of host CA >>
        Key ID: "<< host.example.com >>"
        Serial: 0
        Valid: from << start of validity >> to << end of validity >>
        Principals:
                << hostname >>
        Critical Options: (none)
        Extensions: (none)

However, a general issue with trust is that you run into some form of chicken-and-egg problem. To sign the public key of the remote host, you would first need to get that public key. However, you do not yet have a trusted connection to fetch it over. You thus need to pre-generate the key in order to avoid TOFU.

The Basics

If you already have a connection to an existing host and want to implement a host CA, the following commands might be of value. Note that this does not fix TOFU.

We first generate a key-pair to use as Host CA:

ssh-keygen -t ed25519 -f my_host_ca -C <<tag>>

Next, we can use this host CA to sign the host key of the server with which we want to connect. First get the key from the host you would like to connect to, sign the key, update the ssh config and put it back. In more detail:

Fetch the host key (/etc/ssh/ssh_host_<< crypto_algorithm >>_key.pub) from the server.
Sign the server host key:

ssh-keygen -s my_host_ca \
-I <<domain>> \
-h -n <<domain>> \
-V +<< validity>> \
<< keyname >>.pub

NB: Sign the host key of the algorithm that the server presents during the connection. E.g. when the connection uses ed25519 host keys, you sign the ed25519 host key.

Put the server certificate on the server.
Add the following line to the sshd_config on the server to use the host certificate:

/etc/ssh/sshd_config

# Trust given host certificate
HostCertificate /etc/ssh/<< keyname >>-cert.pub

Add the public key of the host CA to the known_hosts file on the client:

~/.ssh/known_hosts

@cert-authority <<hostname of server >> <<content my_host_ca.pub>>

If you already had a previous connection to the host, do not forget to remove its old entry from the known_hosts file (for example with ssh-keygen -R << hostname >>).

Now you can SSH to the remote server. Use the following to verify whether the host certificate is presented:

ssh -vv <<hostname>> 2>&1 | grep "Server host certificate"

It will show the following if the SSH certificate is configured correctly:

debug1: Server host certificate: ... valid from ...  to ...
debug2: Server host certificate hostname: << server hostname>>
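
The signing steps above can be tried out without touching a real server. The following is a minimal sketch in a temporary directory: host.example.com, the 52-week validity and the locally generated stand-in for the server's host key are assumptions for this illustration only.

```shell
#!/bin/sh
set -eu

# Sandbox only: every file name and host.example.com are placeholders.
workdir=$(mktemp -d)
cd "$workdir"

# 1. Host CA key pair (in practice the private half stays on a protected machine).
ssh-keygen -q -t ed25519 -N "" -C "demo-host-ca" -f my_host_ca

# 2. Stand-in for the host key you would fetch from the server.
ssh-keygen -q -t ed25519 -N "" -C "demo-host" -f ssh_host_ed25519_key

# 3. Sign the host public key: -h = host certificate, -n = principal (hostname),
#    -V = validity interval, -I = key ID.
#    This writes ssh_host_ed25519_key-cert.pub next to the key.
ssh-keygen -q -s my_host_ca -I host.example.com -h -n host.example.com \
    -V +52w ssh_host_ed25519_key.pub

# 4. Inspect the resulting certificate.
cert_info=$(ssh-keygen -L -f ssh_host_ed25519_key-cert.pub)
echo "$cert_info"

# 5. The line a client would add to ~/.ssh/known_hosts.
known_hosts_line="@cert-authority host.example.com $(cat my_host_ca.pub)"
echo "$known_hosts_line"
```

The hostname in the known_hosts line has to match the principal given with -n, otherwise the client will not accept the certificate for that host.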

Towards Production Proof

The best way to sign the host certificate is to integrate it with the first configuration of your system, if you want to bypass TOFU completely.

An example would be to sign the host key in a cloud-init workflow when you are creating a new virtual machine (VM) in the cloud. A good example can be found on the blog of Torsten Bøgh Köster.

Ideally, you do not want the private key of the Host CA to exist on the VM at any point in time (as he also mentions later in his blog). A way around this is pre-generating the host keys for the system outside the VM and using cloud-init only to write the resulting key and certificate. This eliminates the need to sign the key ad hoc on the VM and keeps the CA private key from ever touching its disk.

My implementation of this is available in an Ansible role on GitHub.
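
To make the pre-generation idea concrete, here is a rough sketch. It is not the code of the Ansible role. The directory names ca and bundle, the hostname vm01.example.com and the file sshd_config.snippet are hypothetical: ca stands for the trusted signing machine and bundle for the files that would be handed to the new VM.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: "ca/" is the signing machine, "bundle/" goes to the VM.
workdir=$(mktemp -d)
mkdir "$workdir/ca" "$workdir/bundle"
new_host="vm01.example.com"   # placeholder hostname

# The CA key pair lives only on the signing machine.
ssh-keygen -q -t ed25519 -N "" -C "demo-host-ca" -f "$workdir/ca/my_host_ca"

# Pre-generate the VM's host key on the signing machine and sign it there.
ssh-keygen -q -t ed25519 -N "" -C "$new_host" \
    -f "$workdir/bundle/ssh_host_ed25519_key"
ssh-keygen -q -s "$workdir/ca/my_host_ca" -I "$new_host" -h -n "$new_host" \
    -V +52w "$workdir/bundle/ssh_host_ed25519_key.pub"

# sshd_config lines the VM needs in order to present the certificate.
printf '%s\n' \
    'HostKey /etc/ssh/ssh_host_ed25519_key' \
    'HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub' \
    > "$workdir/bundle/sshd_config.snippet"

# Only the host key pair, its certificate and the snippet leave the signing machine.
bundle_files=$(ls "$workdir/bundle")
echo "$bundle_files"
```

The CA private key never ends up in the bundle. The host private key does, so the channel used to deliver it to the VM has to be protected.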

In short

The problems solved by this solution are:

You can bypass TOFU, by having a clearer way to authenticate hosts.
Your ~/.ssh/known_hosts file becomes less of a mess as multiple hosts can be signed by the same CA.

The problems created by this solution are:

You have to maintain some form of Public-Key-Infrastructure (hereafter: PKI).

2. SSH User CA - Adding Access Administration and Role-Based Access Control (RBAC)

Another common problem when administering a number of Linux systems is access administration and controlling the level of access. Using the more traditional plain SSH key-based authentication, you allow a key to give access as a certain user, but that does not give you full Role-Based Access Control (hereafter: RBAC).

Additionally, in a multi-user scenario you need to (manually) track all the keys in all the users' ~/.ssh/authorized_keys files, which might be doable for one or a couple of systems, but does not scale. It will probably also lead to headaches.

Instead, what you can do is sign the public key of an SSH key pair with a trusted User Certificate Authority. That CA then provides a layer on top of the simple key authentication by adding permissions and principals to the key the user is authenticating with. This allows for a far more centralized process of administration across multiple systems. Visually, this looks like this:

SSH User CA

An example of such a signed key is the following:

ssh-keygen -L -f ~/.ssh/my_public_key-cert.pub

~/.ssh/my_public_key-cert.pub:
        Type: ssh-ed25519-cert-v01@openssh.com user certificate
        Public key: << Public key associated >>
        Signing CA: << Public key of the user CA >>
        Key ID: "<< generally email of user >>"
        Serial: 0
        Valid: from << start of validity >> to << end of validity >>
        Principals:
                << list of allowed principals, like >>
                foo
                bar
        Critical Options: (none)
        Extensions:
                << permissions of this key like the following >>
                permit-X11-forwarding
                permit-agent-forwarding
                permit-port-forwarding
                permit-pty
                permit-user-rc

The principals are an overlay over the users that exist on the system. By pointing AuthorizedPrincipalsFile at a per-user file, you can map the principals in a certificate to users on the system: when the file that belongs to the user you are logging in as lists at least one of the principals in the certificate, the login is allowed.
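
The lookup can be mimicked in a few lines of shell. This is an illustration of the matching rule only, not how sshd is implemented. It assumes one file per login user that lists the accepted principals, one per line; the function name may_login and the temporary directory are made up for this sketch.

```shell
#!/bin/sh
set -eu

# Illustration only: a stand-in for a per-user principals directory.
principals_dir=$(mktemp -d)
printf 'foo\nbar\n' > "$principals_dir/johndoe"     # principals accepted for johndoe
printf 'foo\n'      > "$principals_dir/marktwane"   # principals accepted for marktwane

# may_login LOGIN_USER PRINCIPAL...
# Succeeds if any certificate principal is listed in the file for LOGIN_USER.
may_login() {
    login_user=$1
    shift
    for principal in "$@"; do
        if grep -qxF -- "$principal" "$principals_dir/$login_user" 2>/dev/null; then
            return 0
        fi
    done
    return 1
}

# A certificate that carries only the principal "bar":
if may_login johndoe bar; then echo "bar -> johndoe: allowed"; fi
if ! may_login marktwane bar; then echo "bar -> marktwane: denied"; fi
```

A certificate with principal foo would pass for both users in this sketch, while one with only bar passes for johndoe alone.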

Additionally, the validity of these certificates can be time-restricted, but certificates that are still valid have to be actively revoked through a revocation file (the RevokedKeys option in sshd_config).
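
As a sketch of what revocation can look like, ssh-keygen can build a key revocation list (KRL) with -k and query it with -Q. Everything below runs in a temporary directory; the names alice, bob, my_user_ca and revoked_keys are placeholders for this illustration.

```shell
#!/bin/sh
set -eu

# Sandbox only: all names are placeholders.
workdir=$(mktemp -d)
cd "$workdir"

ssh-keygen -q -t ed25519 -N "" -C "demo-user-ca" -f my_user_ca
ssh-keygen -q -t ed25519 -N "" -C "alice" -f alice_key
ssh-keygen -q -t ed25519 -N "" -C "bob" -f bob_key

# Distinct serial numbers (-z) make individual certificates easy to tell apart.
ssh-keygen -q -s my_user_ca -I alice@example.com -n foo -z 1 -V +1d alice_key.pub
ssh-keygen -q -s my_user_ca -I bob@example.com -n foo -z 2 -V +1d bob_key.pub

# Build a KRL that revokes alice's certificate.
ssh-keygen -q -k -f revoked_keys alice_key-cert.pub

# -Q exits non-zero when a queried certificate is revoked.
if ssh-keygen -Q -f revoked_keys alice_key-cert.pub >/dev/null; then
    alice_status="ok"
else
    alice_status="revoked"
fi
if ssh-keygen -Q -f revoked_keys bob_key-cert.pub >/dev/null; then
    bob_status="ok"
else
    bob_status="revoked"
fi
echo "alice: $alice_status, bob: $bob_status"
```

On a server, the resulting file would be referenced with RevokedKeys in sshd_config. Distributing the updated file to every host is part of the PKI maintenance mentioned later.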

Setting it up

Again, we need to generate an SSH key pair that we will use as a CA:

ssh-keygen -t ed25519 -f my_user_ca -C <<tag>>

Next, we can add the public key of the trusted user CA to the external system, by copying the public key of the key-pair. You can point to that file later in the SSH config.

You can add the public key of the User CA to a file (split multiple keys by newlines):

/etc/ssh/ssh_trusted_user_cas
<< user ca public key >>
<< a second user ca public key >>

Additionally, you should probably set up some principals to sign certificates with. The easiest way to do this is to set up a directory with one file per user on the system, named after that user, with the principals that may log in as that user as its content (this is what the %u in the configuration below expands to):

Folder structure:

user@machine ~ tree /etc/ssh/auth_principals
/etc/ssh/auth_principals
├── johndoe
└── marktwane

0 directories, 2 files

File contents:

/etc/ssh/auth_principals/johndoe

foo
bar

/etc/ssh/auth_principals/marktwane

foo

Add the following to the configuration file:

/etc/ssh/sshd_config
TrustedUserCAKeys /etc/ssh/ssh_trusted_user_cas
...
...
AuthorizedPrincipalsFile /etc/ssh/auth_principals/%u

In this case, when you sign a certificate with principal foo, its holder is allowed to log in as johndoe or marktwane.
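
Signing a user key works much like signing a host key, only without -h. The following is a minimal sandbox sketch: the file names, the key ID user@example.com, the one-day validity and the choice to allow only a terminal are assumptions for this illustration.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# User CA and a stand-in for the user's own key pair (names are placeholders).
ssh-keygen -q -t ed25519 -N "" -C "demo-user-ca" -f my_user_ca
ssh-keygen -q -t ed25519 -N "" -C "user@example.com" -f user_key

# Sign the user's public key:
#   -I  key ID (shows up in the server's auth log)
#   -n  comma-separated principals placed in the certificate
#   -V  validity interval, here one day
#   -O  clear drops all default permissions, permit-pty re-enables a terminal
ssh-keygen -q -s my_user_ca -I user@example.com -n foo -V +1d \
    -O clear -O permit-pty user_key.pub

# Inspect the resulting user_key-cert.pub.
cert_info=$(ssh-keygen -L -f user_key-cert.pub)
echo "$cert_info"
```

When the certificate sits next to the private key under the -cert.pub name, ssh picks it up automatically when that key is used as the identity.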

An additional bonus of using certificates is that you get a very nice audit trail in /var/log/auth.log. The log messages are very clear on why something was or was not allowed. For example:

Jan 22 11:46:04 << hostname >> sshd[310761]: Accepted publickey for << user >> from << ip >> port <<port>> ssh2: << alg >>-CERT SHA256:<<sha256>> ID << cert name >> (serial 0) CA << alg >> SHA256:<< sha256 >>
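
Because the key ID and serial are part of every such line, they are easy to extract for reporting. The sketch below runs on a made-up sample line that follows the format above; the hostname, user, IP address and fingerprints in it are placeholders, not real log data.

```shell
#!/bin/sh
set -eu

# Made-up sample line in the format shown above; all values are placeholders.
sample='Jan 22 11:46:04 host1 sshd[310761]: Accepted publickey for johndoe from 192.0.2.10 port 50022 ssh2: ED25519-CERT SHA256:AAAA ID user@example.com (serial 0) CA ED25519 SHA256:BBBB'

# Pull out the login user, the certificate key ID and the serial.
summary=$(printf '%s\n' "$sample" | sed -n -E \
    's/.*Accepted publickey for ([^ ]+) from .* ID (.+) \(serial ([0-9]+)\) CA .*/user=\1 id=\2 serial=\3/p')
echo "$summary"
```

Pointed at the real log file instead of the sample variable, the same expression gives a quick overview of which certificate was used to log in as which user.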

In short

The problems solved by this solution are:

You can limit access by time
You can limit access by role
You can control what SSH arguments a user can pass.
You get a better audit trail

The problems created by this solution are:

You have to maintain some form of PKI.
You have to track what certificates are out there (to be able to revoke them, or are forced to rotate your CA)

3. Conclusion

In this blog post, I have gone into how to set up host certificates and user certificates. They have a couple of advantages. You can bypass TOFU, limit access by time and role, clean up your known_hosts file, control what SSH arguments are allowed and get a better audit trail.

However, it introduces a new problem: having to manage PKI. Managing PKI is a complex problem and requires good maintenance and a solid process. As you might have noticed, I have not developed a fully production-proof solution, which would include a process for rotating the CAs (among other things).

Some would argue that having certificates that are permanently valid is good enough. I feel that takes away some of the advantages you could get by setting this up. My recommendation is to set up an internal process to renew your PKI once in a while, so that revocation is not your only option when a certificate is compromised.

I hope this blog helped you out in understanding SSH CAs. If so, please let me know!

4. Sources

The excellent manuals of OpenBSD's OpenSSH, primarily:

man ssh-keygen (CERTIFICATES section)
man sshd_config