
Kill Your SSH Keys: Production SSH Certificate Authentication with step-ca

Diagram of step-ca SSH certificate authority issuing short-lived SSH certificates to a developer laptop via OIDC, used to authenticate against a fleet of Linux servers

Every infrastructure team I've worked with has the same dirty secret. There's a directory somewhere — usually on a shared drive, sometimes in a 1Password vault if the team is feeling responsible — that contains the SSH public keys of everyone who's ever needed access to a server. Some of those people left two years ago. Their keys are still in ~/.ssh/authorized_keys on at least one box. Probably more. Nobody knows which ones.

This is the silent technical debt that nobody talks about until an auditor asks. And by then, the answer is always the same: a frantic Ansible playbook to scrape and rewrite authorized_keys files across the fleet, followed by a promise to "do better next time" that nobody keeps.

SSH certificates solve this. Not "improve the situation" — actually solve it. The right way to manage SSH access in 2026 isn't a better key distribution tool. It's to stop using long-lived keys entirely and switch to certificates issued by your own CA, valid for a few hours, signed against an identity you actually trust.

This guide walks through standing up step-ca as your internal SSH certificate authority, configuring your Linux fleet to trust it, and wiring it up to OIDC so engineers log into servers using the same SSO they already use for everything else. By the end you should have a setup you can actually deploy in production.

What SSH certificates actually fix

OpenSSH has supported certificate-based authentication since version 5.4, which shipped in 2010. That's not a typo. The feature has been quietly sitting there for more than fifteen years while everyone kept emailing id_ed25519.pub files to each other.

A certificate is just a public key with metadata attached and a signature from a trusted CA. For SSH, that metadata includes:

One or more principal names (the usernames the certificate is allowed to log in as)
A validity window (start and end timestamps)
Optional restrictions (which source IPs, which commands, force-command, no-port-forwarding, and so on)
An identity string for audit logs

When the client presents a certificate to sshd, the server checks the signature against its trusted user CA. If it's signed by the right CA, hasn't expired, and the principal matches the user logging in, access is granted. No authorized_keys lookup. No file to maintain. The server doesn't need to know anything about individual users — it only needs to trust the CA.
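To make the metadata concrete, here is a minimal sketch that uses nothing but ssh-keygen: it creates a throwaway CA and user key in a temp directory, signs the user key, and prints the resulting certificate. The function name, the alice principal, and the eight-hour window are all illustrative choices of mine, not anything step-ca requires.

```shell
#!/bin/sh
# Throwaway demo: sign a user key with a local CA and print the certificate.
# Everything lives in a temp directory; never reuse these keys for real access.
demo_ssh_cert() {
    dir=$(mktemp -d) || return 1

    # 1. A CA key pair and a user key pair (no passphrases, demo only).
    ssh-keygen -q -t ed25519 -N '' -C 'demo-ca' -f "$dir/ca" || return 1
    ssh-keygen -q -t ed25519 -N '' -C 'demo-user' -f "$dir/id_demo" || return 1

    # 2. Sign the user's public key: key ID for audit logs (-I),
    #    one principal (-n), valid for eight hours from now (-V).
    ssh-keygen -q -s "$dir/ca" -I 'alice@example.com' -n alice -V +8h \
        "$dir/id_demo.pub" || return 1

    # 3. Print the fields sshd evaluates: type, signing CA, key ID,
    #    validity window, principals, critical options, extensions.
    ssh-keygen -L -f "$dir/id_demo-cert.pub"
    status=$?

    rm -rf "$dir"
    return "$status"
}
```

Calling demo_ssh_cert prints a "user certificate" with alice under Principals and a Valid: line eight hours wide. step-ca automates exactly this signing step, with the identity coming from your IdP instead of a command-line flag.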

Flip the model and the same thing applies to hosts. Instead of every developer's known_hosts file silently growing for years and you ignoring the "host key changed" warning when a box is rebuilt, the host presents a certificate signed by the host CA. Your laptop trusts the host CA. End of TOFU.

That's the whole shift. One trust anchor instead of N×M key relationships.

What step-ca brings to the table

step-ca is the open-source CA from Smallstep. It does X.509 and SSH out of the same daemon, supports ACME (so you can replace internal Let's Encrypt setups with it), and crucially has a clean OIDC provisioner. That means you can wire it to Google Workspace, Authentik, Keycloak, or any other identity provider you're already running, and engineers get SSH certificates by completing a normal OAuth flow. No new password to remember. Offboarding is close to automatic: the second IT disables their Google account, they can no longer get a new SSH cert, and any certificate they already hold expires within hours.

It's not the only option. Teleport is the bigger, more enterprise-flavored choice with session recording and a web UI. HashiCorp Vault has an SSH secrets engine that does roughly the same thing. HashiCorp's relicense in 2023 pushed a lot of people toward step-ca for new deployments because it's still Apache 2.0, and the daemon is one binary with no clustering required for small to medium fleets.

For a typical company with 10–500 Linux hosts and an SSO provider, step-ca is the right call.

Architecture

You need three things running:

The CA box. A small VPS or VM, ideally on its own, that runs step-ca. This is the thing that holds the user CA and host CA private keys. Treat it like you'd treat your password manager — small, hardened, backed up, not exposed to the public internet.
The hosts in your fleet. Each runs sshd configured to trust certificates signed by the CA's user CA key. Each one also has its own SSH host certificate signed by the host CA key, renewed automatically by a systemd timer.
The clients. Engineer laptops with the step CLI installed. To log in to a server they run step ssh login once, which opens an OIDC flow, drops a short-lived cert in the SSH agent, and from that point until the cert expires they just run ssh user@host like nothing changed.

The mental model is worth getting clear on before touching configs: you are running a private PKI. Two CA keys (one for users, one for hosts). All trust descends from those two keys. Lose them, and you rebuild everything. Rotate them, and every host needs new configuration. Take this seriously.

Prerequisites

A clean Linux VM dedicated to running the CA. 1 vCPU, 1 GB of RAM, 10 GB disk. Ubuntu 24.04 LTS, AlmaLinux 9, or Debian 12 — pick what your team already runs.
A DNS record pointing at it. Internal record if you have internal DNS, otherwise a normal A record. The hostname matters because the CA's API serves on TLS, and clients connect by name. I'll use ca.example.com throughout.
Network reachability. Hosts and clients need to be able to reach the CA on TCP 9000 (the default port). If you have a VPN or WireGuard mesh between offices and infrastructure, put the CA inside that. If you don't, and the CA has to sit on a public address, firewall it down to the source ranges that actually need it. The API is TLS-protected by design, but that is not a reason to let the whole internet talk to it.
An OIDC provider you control. Google Workspace, Authentik, Keycloak, Okta, Azure AD — all work. You'll need to create an OAuth client with localhost as a redirect URI.

⚠️ Do not run step-ca on the same box as your bastion host or any other production service. The CA's private keys are the keys to your kingdom. Single-purpose box, smaller attack surface.

Step 1 — Install step-ca on the CA box

I prefer Docker for this. The binary install works fine too, but containerizing means the systemd boundary is clean and the config is in one directory you can rsync.

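SSH into the CA box as a user with sudo rights and switch to a root shell (sudo -i); the commands in this guide write to /opt and /etc. Install Docker if you haven't already: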

curl -fsSL https://get.docker.com | sh && systemctl enable --now docker

Create a working directory and pull the image:

mkdir -p /opt/step-ca && cd /opt/step-ca && docker pull smallstep/step-ca:latest

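Generate a strong password for the CA keys. Save a copy somewhere safe (password manager). The CA needs it every time it starts. The official image runs as an unprivileged step user (UID 1000 at the time of writing), so hand the directory to that UID or the container won't be able to read the password or write the PKI:

openssl rand -base64 32 > /opt/step-ca/password.txt && chmod 600 /opt/step-ca/password.txt && chown -R 1000:1000 /opt/step-ca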

Step 2 — Initialize the CA with SSH support

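This is the part where you create the root and intermediate CA keys. Once done, do not lose them. Inside the official image STEPPATH is /home/step, so the config, certs, and secrets directories land directly under /opt/step-ca on the host. Treat that whole directory as a crown-jewel asset.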

docker run -it --rm -v /opt/step-ca:/home/step smallstep/step-ca:latest step ca init --ssh

You'll get a series of prompts. Reasonable answers:

PKI name: your company name. Goes into the cert subject.
DNS names: ca.example.com and the box's internal IP. Both, comma-separated. Clients must connect using one of these.
Address: :9000 (default).
Provisioner name: admin is fine for now. You'll add an OIDC provisioner later.
Password: paste the contents of password.txt. Or pick "generate one" and write it down.

When it finishes, capture the root CA fingerprint that gets printed. You need that fingerprint on every client to establish initial trust. Save it.
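If the fingerprint scrolls away before you copy it, it can be recomputed from the root certificate. The native way is step certificate fingerprint on the root cert. The sketch below does the same with plain openssl, assuming the fingerprint is the SHA-256 of the certificate in lowercase hex without colons; cert_fingerprint is a helper name I made up.

```shell
#!/bin/sh
# Print the SHA-256 fingerprint of a PEM certificate as lowercase hex
# without colons.
cert_fingerprint() {
    # openssl prints "sha256 Fingerprint=AB:CD:..."; keep the hex only.
    openssl x509 -in "$1" -noout -fingerprint -sha256 |
        cut -d= -f2 | tr -d ':' | tr 'A-F' 'a-f'
}
```

With the Docker layout used here, the call would be cert_fingerprint /opt/step-ca/certs/root_ca.crt. Compare the result against the value printed by step ca init before distributing it.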

Look at what was generated:

ls /opt/step-ca/secrets/ && ls /opt/step-ca/certs/

You should see root_ca_key, intermediate_ca_key, ssh_host_ca_key, and ssh_user_ca_key in the secrets directory. Those four files are the entirety of your trust hierarchy.

Step 3 — Run step-ca as a service

Create a systemd unit so the CA starts on boot and restarts on failure:

cat > /etc/systemd/system/step-ca.service << 'EOF'
[Unit]
Description=step-ca certificate authority
After=docker.service
Requires=docker.service

[Service]
Restart=always
ExecStartPre=-/usr/bin/docker rm -f step-ca
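# Arguments after the image name replace the image's default command, so pass the full step-ca invocation.
ExecStart=/usr/bin/docker run --rm --name step-ca -p 9000:9000 -v /opt/step-ca:/home/step smallstep/step-ca:latest step-ca --password-file /home/step/password.txt /home/step/config/ca.json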
ExecStop=/usr/bin/docker stop step-ca

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload && systemctl enable --now step-ca && systemctl status step-ca

Verify it's listening:

ss -tlnp | grep 9000
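A listening socket only proves the container is up. step-ca also serves a /health endpoint, so a slightly stronger check is to poll that over TLS. This is a rough sketch: wait_for_ca is a hypothetical helper, and the retry count and two-second pause are arbitrary.

```shell
#!/bin/sh
# Poll the CA's /health endpoint until it reports ok or we give up.
# Usage: wait_for_ca URL ROOT_CERT [TRIES]
wait_for_ca() {
    url=$1 root=$2 tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        # --cacert pins the CA's own root instead of disabling verification.
        body=$(curl -fsS --max-time 5 --cacert "$root" "$url/health" 2>/dev/null)
        case $body in
            *\"status\"*\"ok\"*) return 0 ;;
        esac
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}
```

On the CA box that would look like wait_for_ca https://ca.example.com:9000 /opt/step-ca/certs/root_ca.crt && echo "CA is healthy". It also doubles as a cheap monitoring probe.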

⚠️ The password file lives unencrypted on disk. This is unavoidable for unattended startup. Compensate by hardening the host: full disk encryption, restricted sudo, no shared logins, audit logging via auditd or Wazuh.

Step 4 — Add an OIDC provisioner

This is the part that makes the whole thing pleasant to use. You're going to tell step-ca that anyone who can prove they're a member of your Google Workspace / Authentik / whatever instance can request a short-lived SSH certificate.

Create the OAuth client in your IdP first. Set the redirect URI to http://127.0.0.1. The step CLI listens on a random local port; Google accepts any port on 127.0.0.1 for desktop-type clients, but not every provider does, so check whether yours needs a wildcard or a fixed callback port. Note the client ID and client secret.

Then on the CA box:

docker exec -it step-ca step ca provisioner add google --type=OIDC --ssh --client-id=YOUR_CLIENT_ID --client-secret=YOUR_CLIENT_SECRET --configuration-endpoint=https://accounts.google.com/.well-known/openid-configuration --domain=example.com

Replace --domain=example.com with whatever email domain your team uses. This restricts certificate issuance to users in that domain only.

For Authentik or Keycloak, swap the configuration endpoint URL — it's the .well-known/openid-configuration URL of your IdP.

Restart the CA to pick up the new provisioner:

systemctl restart step-ca

Step 5 — Get the SSH user CA public key

Hosts in your fleet need to be told which public key signs valid user certificates. Pull it from the CA:

docker exec step-ca step ssh config --roots

That prints a single line starting with ecdsa-sha2-nistp256 or similar. Save it. You'll push it to every host in the next step.

Step 6 — Configure sshd on a host to trust the user CA

Pick one Linux host to test on first. Don't roll this across your fleet until you've verified it works end to end, and ⚠️ keep your existing SSH session open while you do this — if you misconfigure sshd, you don't want to be locked out.

On the host, write the user CA public key to a file sshd can read:

echo "ecdsa-sha2-nistp256 AAAA...your-actual-key..." > /etc/ssh/ca_user_key.pub && chmod 644 /etc/ssh/ca_user_key.pub

Tell sshd to trust certificates signed by that key:

echo "TrustedUserCAKeys /etc/ssh/ca_user_key.pub" >> /etc/ssh/sshd_config.d/10-step-ca.conf

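Validate the config before restarting. On Ubuntu the service unit is called ssh, and the sshd alias may not exist, so use systemctl restart ssh there: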

sshd -t && systemctl restart sshd

If sshd -t exits silently, the config is valid. If it prints anything, fix it before restarting. From this point onward, the host accepts both the existing authorized_keys entries and SSH certificates signed by your CA. You haven't removed anything yet — you've added a new path.
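One wrinkle with the echo >> approach: run it twice and the directive is in the file twice. For anything you plan to re-run, a small idempotent helper is safer. This is a sketch, with ensure_line being my own name for it.

```shell
#!/bin/sh
# Append LINE to FILE only if an identical line is not already present.
ensure_line() {
    file=$1 line=$2
    touch "$file" || return 1
    # -x matches whole lines, -F treats the text literally (no regex).
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Usage would be ensure_line /etc/ssh/sshd_config.d/10-step-ca.conf "TrustedUserCAKeys /etc/ssh/ca_user_key.pub". Afterwards, sshd -T | grep -i trustedusercakeys shows whether the running configuration actually picked the directive up.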

Step 7 — Issue a host certificate so clients can verify the host

This is the half people skip. Don't skip it. Without a host certificate, clients still get the "is this fingerprint correct?" TOFU prompt the first time they connect, and your known_hosts file keeps growing forever.

On the host, install the step CLI:

wget -O step.deb https://dl.smallstep.com/cli/docs-ca-install/latest/step-cli_amd64.deb && dpkg -i step.deb && rm step.deb

For RHEL-family use the .rpm from the same URL pattern.

Bootstrap the host's trust in the CA. You need the CA URL and the root fingerprint you saved earlier:

step ca bootstrap --ca-url https://ca.example.com:9000 --fingerprint YOUR_ROOT_FINGERPRINT

Issue a host certificate using the existing SSH host key. The CLI asks which provisioner to use; pick the admin provisioner from Step 2 and enter the password you chose during init:

step ssh certificate --host --sign "$(hostname -f)" /etc/ssh/ssh_host_ecdsa_key.pub

With --sign, step should write the certificate next to the public key, as /etc/ssh/ssh_host_ecdsa_key-cert.pub (confirm with ls /etc/ssh/*-cert.pub). Tell sshd to present it:

echo "HostCertificate /etc/ssh/ssh_host_ecdsa_key-cert.pub" >> /etc/ssh/sshd_config.d/10-step-ca.conf && sshd -t && systemctl restart sshd

⚠️ Host certificates expire. Default lifetime is 30 days. You need a systemd timer to renew them automatically:

cat > /etc/systemd/system/step-ssh-host-renew.service << 'EOF'
[Unit]
Description=Renew SSH host certificate

[Service]
Type=oneshot
ExecStart=/usr/bin/step ssh renew --force /etc/ssh/ssh_host_ecdsa_key-cert.pub /etc/ssh/ssh_host_ecdsa_key
ExecStartPost=/bin/systemctl restart sshd
EOF

cat > /etc/systemd/system/step-ssh-host-renew.timer << 'EOF'
[Unit]
Description=Renew SSH host certificate daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF

systemctl daemon-reload && systemctl enable --now step-ssh-host-renew.timer
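A timer that silently fails is worse than no timer, so it's worth checking the expiry date on the certificate now and then. The sketch below pulls the expiry out of ssh-keygen -L output; cert_expiry is a hypothetical helper, and it assumes the "Valid: from X to Y" line format that current OpenSSH prints.

```shell
#!/bin/sh
# Read `ssh-keygen -L` output on stdin and print the certificate's expiry
# timestamp (the value after "to" on the "Valid:" line). Exits non-zero
# if no bounded validity line is found (for example "Valid: forever").
cert_expiry() {
    awk '$1 == "Valid:" && $2 == "from" { print $5; found = 1 }
         END { exit (found ? 0 : 1) }'
}
```

On a host: ssh-keygen -L -f /etc/ssh/ssh_host_ecdsa_key-cert.pub | cert_expiry. If the date stops moving forward after the daily run, look at journalctl -u step-ssh-host-renew.service.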

Step 8 — Configure a client to use the CA

On your laptop (or wherever you SSH from), install step the same way. Bootstrap the same trust:

step ca bootstrap --ca-url https://ca.example.com:9000 --fingerprint YOUR_ROOT_FINGERPRINT

Tell your local SSH client to recognise host certificates signed by the host CA. Get the host CA public key from the CA:

step ssh config --host --roots

Add it to your known_hosts as a CA marker:

echo "@cert-authority *.example.com $(step ssh config --host --roots)" >> ~/.ssh/known_hosts

Now log in via the OIDC provisioner. This opens a browser window, you complete the SSO flow, and a short-lived certificate lands in your SSH agent:

step ssh login your-email@example.com --provisioner google

SSH into the test host:

ssh your-username@hostname.example.com

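No password prompt. No "verify fingerprint" prompt. No authorized_keys entry on the server. The cert in your agent is signed by the user CA, the host trusts the user CA, you're in. One catch: the login name has to be one of the principals in the certificate, and the account has to exist on the host. With the OIDC provisioner the principals are derived from your email address, typically the part before the @. Run step ssh list to see the cert in the agent; piping step ssh list --raw into step ssh inspect shows its principals and expiry.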
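The most common first-login failure is a username that isn't a principal in the certificate. As a rough illustration of the mapping, assuming the simple case where the principal is the email's local part, the helper below derives the login name. login_from_email is my own name, and step-ca may normalize the name further (for example with dots or uppercase), so treat the certificate itself as the source of truth.

```shell
#!/bin/sh
# Simplified sketch: print the part of an email address before the last "@".
# This approximates the login name; the real principals are whatever the
# issued certificate lists.
login_from_email() {
    printf '%s\n' "${1%@*}"
}
```

So ssh "$(login_from_email alice@example.com)@hostname.example.com" connects as alice, which only works if an alice account exists on that host.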

Step 9 — Decide on certificate lifetimes

Default is 16 hours for user certs. That's a sensible starting point. Engineers run step ssh login once per workday and forget about it.

For higher-security environments, drop it to 4 or 8 hours. For lower-security internal stuff, you can push it to 24. Anything longer than 24 hours defeats most of the point — you're approaching the security profile of static keys again.

Set it per provisioner in the CA config:

docker exec step-ca step ca provisioner update google --ssh-user-min-dur=15m --ssh-user-max-dur=16h --ssh-user-default-dur=16h && systemctl restart step-ca

Step 10 — Roll out to the fleet with Ansible

Once you've validated on one host, the rollout is a small Ansible play. The pieces:

Install step CLI on the host.
Bootstrap trust with step ca bootstrap.
Write /etc/ssh/ca_user_key.pub with the user CA public key (template it from step ssh config --roots).
Drop the sshd config snippet at /etc/ssh/sshd_config.d/10-step-ca.conf.
Issue an initial host certificate.
Install the renewal timer.
Restart sshd.

That's a half-day of work for someone who knows their Ansible. Run it against staging first.
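After the play has run, it helps to prove from a laptop that every host really accepts a certificate login. A minimal sketch, assuming a plain text file with one hostname per line; check_fleet and the file format are my own invention, not part of step or Ansible.

```shell
#!/bin/sh
# Try a non-interactive login on each host listed in a file (one per line)
# and print the hosts that refuse it. Returns non-zero if any host failed.
check_fleet() {
    hosts_file=$1 failed=0
    while IFS= read -r host; do
        # Skip blank lines and comments.
        case $host in ''|'#'*) continue ;; esac
        # BatchMode forbids prompts, StrictHostKeyChecking=yes rejects hosts
        # without a trusted host key or certificate, and -n keeps ssh from
        # swallowing the rest of the host list on stdin.
        if ! ssh -n -o BatchMode=yes -o ConnectTimeout=5 \
                -o StrictHostKeyChecking=yes "$host" true; then
            printf 'FAILED %s\n' "$host"
            failed=1
        fi
    done < "$hosts_file"
    return "$failed"
}
```

Run step ssh login first, then check_fleet hosts.txt. Bear in mind that while your old key is still in authorized_keys, a success may come from the key rather than the certificate; ssh -v on one host shows which method was used.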

Hardening notes

A few things to do that the documentation tends to undersell:

Move the CA inside your VPN. Public internet exposure is unnecessary if you already have WireGuard or Tailscale across your environments. The CA only needs to be reachable from hosts and engineer laptops.
Back up the CA data directory (/opt/step-ca in this setup) daily. Encrypted, offsite. The root CA key isn't needed at runtime but you need it if you ever have to issue a new intermediate. The intermediate key plus the SSH CA keys are needed every time the CA starts. Losing them means rebuilding from scratch and reconfiguring every host.
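Don't leave the JWK admin provisioner active in production longer than you need it. Step 7 used it to issue host certificates, so decide first how new hosts will get theirs. Once that is covered and OIDC is working, remove it: step ca provisioner remove admin. Use OIDC for every human login.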
Audit logging. step-ca logs every cert issuance. Ship those logs to your SIEM (Wazuh, Loki, Graylog). Now you have an audit trail showing exactly who got an SSH cert for which host at which time. This is the artefact your auditor actually wants.
Remove the legacy authorized_keys files once you're confident in the cert-based flow. Until you delete them, your old keys are still valid and the certificate system is just an additional path. The whole point is the eventual deletion.
2FA at the IdP layer, not at SSH. Don't try to add OTP to sshd — it's painful. Enforce 2FA on the OIDC provider instead. The SSH cert flow inherits that requirement automatically.
Use ECDSA or Ed25519 for the SSH CA keys. Default is fine; mentioning it because RSA still creeps in when people copy old configs.
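Before deleting legacy authorized_keys files you need to know where they are. A small inventory sketch, assuming keys live in the usual per-user locations; list_authorized_keys is a hypothetical helper, and it will miss keys if AuthorizedKeysFile in sshd_config points somewhere else.

```shell
#!/bin/sh
# List every non-empty authorized_keys file under the given directories,
# prefixed with the number of entries (lines that are not blank or comments).
list_authorized_keys() {
    find "$@" -type f \( -name authorized_keys -o -name authorized_keys2 \) \
        -size +0c 2>/dev/null |
    while IFS= read -r f; do
        # grep -c exits non-zero when the count is 0, hence the "|| true".
        n=$(grep -cvE '^[[:space:]]*(#|$)' "$f" || true)
        printf '%s %s\n' "$n" "$f"
    done
}
```

Run as root, list_authorized_keys /home /root gives you the per-host list to review. An empty result on every box is the end state this whole guide is aiming for.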

When SSH certificates are not the right answer

A few honest tradeoffs:

Tiny teams with two servers and three people. The setup overhead doesn't pay back. Just use a password manager with proper key hygiene.
You don't have an OIDC provider. You can still use step-ca with the JWK provisioner (one-time tokens) or the X5C provisioner (device certs). It works but the operational experience is much less pleasant than OIDC. Get the IdP first.
You have a Windows-heavy fleet. Windows OpenSSH supports certificates but the tooling story is rougher. PuTTY only gained OpenSSH certificate support in version 0.78 (late 2022). Test thoroughly on Windows clients before promising anything.
You need full session recording and command audit. That's a Teleport-shaped problem, not a step-ca-shaped one. step-ca issues certs and stops there.

For everyone else — companies running 10 to several hundred Linux hosts with an existing IdP — this is the way to do SSH access in 2026.