What I wish was covered in DNSSEC tutorials

Posted on January 1, 2024

Over the Christmas break I enabled DNSSEC on several of my domains. For a technology that’s been around over two decades and viable to actually deploy for at least one, it was surprisingly hard to find a comprehensive guide on how to do it. In this post I’ll share what I learned from numerous tutorials, blogs, and RFCs.

Who is this for?

This post is focused on authoritative DNS. If you own a domain name then you have an authoritative DNS server. This is different from a recursive DNS server, which looks up names for you in other people’s domains. If you want to use DNSSEC for the latter, simply configure any validating resolver over a secure transport and you’re done.

In many cases authoritative DNS is provided by your domain registrar, who will have a 1-click process for enabling DNSSEC. If that applies to you, go find that button and click it; you don’t need any tutorials. I did this for all of my Cloudflare registered domains and it worked perfectly. If your DNS provider and registrar are separate entities, it’ll take more than 1 click, but the process is similar.

Alternatively you may run your own DNS server, either publicly or as a hidden primary feeding a third-party host. In this case you probably express your DNS records in BIND-style zone files. I do this for a few domains, and the rest of this post focuses on how to sign those zone files to make them secure.

Getting started

After some introductory reading, I find the best way to get familiar with a technology is to do something with it, end-to-end. I run NSD, an excellent authoritative DNS server from the makers of the popular Unbound, so I followed a tutorial for setting up DNSSEC on NSD. Skeptical of the 2014 date and references to vintage software versions, I didn’t publish any of the outputs yet… but it’s useful to have been through the process once, and inspected the various artifacts, before trying it in prod. A similar tutorial exists for DNSSEC on BIND.

At the end of the tutorial you’ll have:

A signed copy of your zone containing the original records plus some new DNSKEY, RRSIG, and NSEC3 records. We’ll come back to each of these record types.
A set of keys to be saved and reused when making future signatures of the zone.
A DS record to be provided to your domain registrar for publishing into the parent zone.

You can mostly ignore the distinction between zone signing key and key signing key; simply pass both to every invocation of ldns-signzone, in either order. It’ll be important to keep them straight when we get to key rotation though.

Applying modern best practices

We’ll make some modifications to the commands to follow best practices for 2024.

Cryptography

Cryptography is always evolving. The 2014 tutorial uses SHA-1 hashing, which is considered broken, and 1024 bit RSA signatures, which are probably still ok for DNS but definitely not the best choice. (DNSSEC signs records; it does not encrypt them.) A good replacement in 2024 is SHA-256 hashing with ECDSA signatures; I passed ECDSAP256SHA256 as the algorithm to ldns-keygen and dropped the key length because it’s unused with ECDSA. You could also use 2048 bit RSA, but ECDSA signatures are smaller (which reduces the potential for abuse in amplification attacks) and the algorithm is sufficiently well supported that other players are using it in production.

Each invocation of ldns-keygen outputs a public+private keypair. The public key will later be included into signed zones via a DNSKEY record.

Preventing zone enumeration

While DNS records are public for anyone who knows or can guess a name, allowing just anyone to enumerate all the DNS records in a zone may be undesirable. DNSSEC originally used NSEC records to prove the non-existence of a name, but this worked by specifying a range of non-existent names between every name that does exist, and so made it trivial for an adversary to “walk” the zone and enumerate its contents. NSEC3 was developed as an alternative and uses one-way hashes to make it more difficult to recover the original names.

It bears noting that NSEC3 doesn’t prevent enumeration of a zone; it just makes it harder. An adversary can still walk the zone and recover all the NSEC3 records via O(N) requests (N is the size of the zone), but must then guess or brute-force the names in order to reverse the hashes. This guessing phase is fully offline so cannot be detected or rate limited by the DNS server operator.

NSEC3 provided several defenses against cracking the hashes, but these are today regarded as ineffective and should be disabled. The first is the use of a random salt in the hashing. This only helps if the salt changes during the online phase of the attack, which is unlikely in practice. The second is to iteratively take multiple hashes in order to increase the CPU cost for an attacker… but unfortunately this also increases the cost for legitimate resolvers to validate your responses. I replaced the tutorial’s command to sign zones with ldns-signzone -n -t 0 $ZONE $ZSK $KSK, which still enables NSEC3 but omits any salt and specifies zero additional hashing iterations (which confusingly means 1 iteration).
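To make the hashing concrete, here is a minimal Python sketch of the NSEC3 owner-name hash as I understand it from RFC 5155: SHA-1 over the lowercase wire-format name plus the salt, repeated for each additional iteration, then encoded as base32hex. The function names nsec3_hash and name_to_wire are my own and not part of any tool mentioned in this post. It also illustrates why the offline guessing phase is cheap: an attacker only has to run this function over a list of candidate names.

```python
import base64
import hashlib

# Map the standard base32 alphabet onto the "extended hex" alphabet used by NSEC3.
_B32_TO_B32HEX = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    b"0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def name_to_wire(name: str) -> bytes:
    """Encode a domain name as lowercase, length-prefixed labels (DNS wire format)."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"


def nsec3_hash(name: str, salt_hex: str = "", iterations: int = 0) -> str:
    """Hash a name the way NSEC3 does; iterations counts *additional* rounds."""
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(name_to_wire(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    encoded = base64.b32encode(digest).translate(_B32_TO_B32HEX)
    return encoded.decode("ascii").lower()


if __name__ == "__main__":
    # No salt and zero additional iterations, matching the signing options above.
    print(nsec3_hash("www.example.com"))
```

With an empty salt and iterations set to 0 the loop body never runs, so the name is hashed exactly once, which is the "zero means one" behaviour described above.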

Of course I was interested to see whether my well-researched best practices match what others do in production. I tested one of my Cloudflare domains, secured by their 1-click DNSSEC process.

$ dig +dnssec -t aaaa no-such-host.sigiworld.com.
...
no-such-host.sigiworld.com. 1800 IN NSEC \000.no-such-host.sigiworld.com. RRSIG NSEC TYPE65283
...

What?! They’re still using NSEC? On the other hand, ldns-walk fails to enumerate the zone, just like it would with NSEC3. Something else must be going on….

It turns out Cloudflare is generating NSEC records on the fly, rather than serving pre-signed records covering a range of names. This allows each returned record to cover a “range” of just a single invalid name, and thus reveal no more information than a similar query in an unsigned zone. It’s possible to apply this technique in a self-hosted environment using CoreDNS, but the tradeoff is that you need to make the signing key available to the DNS server and it breaks zone-transfers to third-party servers (unless they explicitly also support this mode of operation, but few do). For this reason I’ll stick with NSEC3.

Signature lifetime

In the signed zone you’ll notice an RRSIG record corresponding to each record set (each combination of name and type) in the original zone. Anyone with access to the public DNSKEY, which is also published in the zone, can use the RRSIG to validate that a DNS answer is authentic. The RRSIGs have a time-bounded validity and ldns-signzone specifies a default of 30 days. Consequently, signing a zone is not a one-time operation but a periodic task to keep signatures fresh.
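If you want to check how much validity a published signature has left, the expiration field of an RRSIG in zone-file form is a UTC timestamp written as YYYYMMDDHHmmSS. The following Python sketch parses that field and reports the remaining time; the function name rrsig_time_left is hypothetical, and it assumes you have already extracted the expiration field from the record yourself.

```python
from datetime import datetime, timedelta, timezone


def rrsig_time_left(expiration: str, now: datetime) -> timedelta:
    """Return how long an RRSIG stays valid, given its YYYYMMDDHHmmSS expiration field."""
    expires = datetime.strptime(expiration, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return expires - now


if __name__ == "__main__":
    # Illustrative timestamp only, not taken from a real zone.
    left = rrsig_time_left("20240131000000", datetime.now(timezone.utc))
    print(f"signature expires in {left.days} days" if left > timedelta(0) else "signature has expired")
```

A negative result means the signature has already expired and validating resolvers will reject the answer.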

If you are relying on zone transfers to propagate changes from a primary to a secondary server, it is important to increment the zone’s serial number when re-signing. Note that the serial number is part of the SOA record, which is also signed. Thus it’s important to increment the serial number in the unsigned zone before the signing operation.

My loop for producing updated signatures looks something like:

1. Increment the serial number in the unsigned zone file.
2. Sign the zone and publish it to the primary server.
3. Wait for either new zone content to become available, or sufficient time to elapse since the last signature, then go back to step 1.
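The decision logic in this loop can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it uses the common YYYYMMDDNN serial convention (your zone may use a different scheme), the function names next_serial and should_resign are hypothetical, and the 7-day re-sign interval is an arbitrary example that should stay comfortably below the signature validity period.

```python
from datetime import date, datetime, timedelta


def next_serial(current: int, today: date) -> int:
    """Return a YYYYMMDDNN-style serial that is strictly greater than the current one."""
    base = int(today.strftime("%Y%m%d")) * 100
    return max(base, current + 1)


def should_resign(zone_changed: bool, last_signed: datetime, now: datetime,
                  resign_after: timedelta = timedelta(days=7)) -> bool:
    """Re-sign when the content changed or the last signatures are getting old."""
    return zone_changed or (now - last_signed) >= resign_after


if __name__ == "__main__":
    print(next_serial(2023122501, date(2024, 1, 1)))  # first change of the day
    print(next_serial(2024010100, date(2024, 1, 1)))  # second change on the same day
```

Using max() keeps the serial increasing even if the zone is re-signed several times on the same day or the current serial is already ahead of today's date.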

Establishing a chain of trust

All of the steps above may be completed, and the signed zone even published into the global DNS, but we have not yet achieved any security because a rogue intermediary could simply drop the RRSIG records (or produce their own signatures using their own key). This is where the DS record comes in: it goes in the parent zone, gets returned to resolvers alongside the NS records, and says, in effect, “only accept signatures from this key”.

Each DS record corresponds to one DNSKEY record, but not every DNSKEY has a DS. If a DNSKEY is known (and trusted) via some other mechanism, a DS is not necessary. The most common example is the DNSKEY for the zone signing key: its RRSIG is made with the DNSKEY of the key signing key, and because the key signing key is validated by a DS in the parent zone, a chain of trust exists without the zone signing key needing to have a separate DS.
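To see how a DS record is tied to its DNSKEY, here is a rough Python sketch of the two values that link them, based on my reading of RFC 4034 and RFC 4509: the key tag (a 16-bit checksum over the DNSKEY RDATA, for algorithms other than 1) and the SHA-256 digest over the owner name in wire format followed by the DNSKEY RDATA. All function names here are hypothetical, and the public key bytes are a placeholder rather than a real key.

```python
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Encode a domain name as lowercase, length-prefixed labels (DNS wire format)."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """Build DNSKEY RDATA: 16-bit flags, 8-bit protocol, 8-bit algorithm, then the key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """Compute the key tag: sum the RDATA as 16-bit words, then fold the carry."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += (byte << 8) if i % 2 == 0 else byte
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_digest_sha256(owner: str, rdata: bytes) -> str:
    """Compute the DS digest (digest type 2): SHA-256 over owner name plus DNSKEY RDATA."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()


if __name__ == "__main__":
    # Flags 257 marks a key signing key; algorithm 13 is ECDSAP256SHA256.
    rdata = dnskey_rdata(257, 3, 13, bytes(64))  # placeholder 64-byte key, not a real one
    print(key_tag(rdata), 13, 2, ds_digest_sha256("example.com", rdata))
```

The printed fields follow the order of a DS record's data (key tag, algorithm, digest type, digest), which is why a resolver holding the DS can recognise the matching DNSKEY in the child zone.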

The order of first signing a zone and only then publishing the DS record is crucially important. You should also wait for the zone’s highest TTL to elapse between these steps, to avoid validation failures caused by cached data. To disable DNSSEC, follow the steps in reverse: unpublish the DS, wait for the TTL, and then drop the signatures. Getting the order wrong will cause an outage.

Key rotation

It’s good practice to rotate keys periodically, although nothing in DNSSEC forces you to do so. Unlike the signatures made with a key, the keys themselves don’t have a lifetime (at least as far as the protocol is aware). The root zone has a fairly elaborate (and auditable) process for rotating keys, and every operator should have at least a plan.

An important principle is that validators should accept any valid path through the signature hierarchy. This means we can publish multiple RRSIGs for a record, or multiple DNSKEY/DS pairs, depending on how we want to orchestrate the rollover.

My plan, derived from RFC 4641, is as follows.

Zone signing key rollover

Generate a new key with ldns-keygen -a ECDSAP256SHA256 $ZONE.
Add the new DNSKEY into the unsigned zone manually. This is one case where we can’t rely on ldns-signzone to add all the required keys for us.
Sign and publish the zone, and wait for one TTL period to elapse.
Swap the old DNSKEY into the unsigned zone, removing the new one. Update the signer to use the new key. These two operations must be done atomically.
Sign and publish the zone, and wait for one TTL period to elapse.
Remove the old DNSKEY from the unsigned zone.

This sequence ensures that each record will only ever have one RRSIG at a point in time, but that its corresponding DNSKEY will be available even if key lookups are cached.
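The caching argument can be checked with a small model. The Python sketch below is my own simplification, not something taken from RFC 4641: each stage records which DNSKEYs are published and which key is signing, and each transition is assumed to be separated by at least one TTL. A rollover is treated as safe if a resolver holding the previous stage's DNSKEY set can still validate the current signatures, and a resolver holding the previous stage's signatures can still find their key in the current DNSKEY set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    description: str
    published: frozenset  # DNSKEYs present in the signed zone at this stage
    signer: str           # key producing the RRSIGs at this stage


ZSK_ROLLOVER = [
    Stage("before rollover", frozenset({"old"}), "old"),
    Stage("new key pre-published", frozenset({"old", "new"}), "old"),
    Stage("signer switched to new key", frozenset({"old", "new"}), "new"),
    Stage("old key removed", frozenset({"new"}), "new"),
]


def rollover_is_safe(stages) -> bool:
    """Check that signatures validate even when keys or signatures are one stage stale."""
    for prev, cur in zip(stages, stages[1:]):
        if cur.signer not in prev.published:  # cached DNSKEY set, fresh signatures
            return False
        if prev.signer not in cur.published:  # cached signatures, fresh DNSKEY set
            return False
    return all(stage.signer in stage.published for stage in stages)


if __name__ == "__main__":
    print(rollover_is_safe(ZSK_ROLLOVER))
    # Swapping keys in one step fails the check: cached data cannot be validated.
    naive = [ZSK_ROLLOVER[0], Stage("direct swap", frozenset({"new"}), "new")]
    print(rollover_is_safe(naive))
```

The model ignores real-world details such as differing TTLs per record type, so treat it as a way to reason about the ordering rather than a verification tool.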

Key signing key rollover

Generate a new key with ldns-keygen -k -a ECDSAP256SHA256 $ZONE.
Sign the zone with both the old and new keys, e.g. ldns-signzone -n -t 0 $ZONE $ZSK $OLD_KSK $NEW_KSK
Publish the zone and wait for one TTL period to elapse.
Replace the DS record in the parent zone with the new one.
Wait for one TTL period to elapse.
Stop signing with the old key.

This sequence results in duplicate RRSIGs during the rollover (but only for the zone’s DNSKEYs, not for all the other records). It only requires one interaction with the parent zone. If your registrar supports multiple DS records, you could alternatively pre-publish a DNSKEY/DS pair and follow a process more like the one for zone signing key rollover. I believe mine does, so I might try that in the future.

Testing

Once your zone is fully secured (or at any step for debugging), you can test that DNSSEC is working correctly with the Verisign Labs DNSSEC Debugger. DNSViz provides the same information in graphical form (watch out for the color of the arrows!). A tool called Zonemaster provides a deeper analysis of your DNS, and is available from the Swedish Internet Foundation or the CZ NIC.

Happy resolving!