Eliminating cloud IPv4 costs with IPv6 and 464XLAT

Posted on December 22, 2024

The big 3 cloud providers charge for public IPv4 addresses. Going IPv6-only avoids that cost, but sometimes it’s still necessary to access services that are IPv4-only. In this post we’ll look at how to bridge the gap with 464XLAT, a technology originally developed for internet service providers.

We’ll focus on the case of an IPv6 client in the cloud making connections to remote IPv4 services. If you instead want to make a server in the cloud accessible by IPv4 clients, there are other solutions for that. Of course, both can be combined, even in the same VM.

Problem statement

IPv4 addresses are scarce and cost real money to acquire. By passing these costs on to customers, cloud providers both recoup their investment and encourage efficiency in address usage.

One of the best ways to be efficient is to run an IPv6-only infrastructure. Unfortunately, there are still some important services (GitHub, sigh) that support legacy IP only. Accessing those requires a bridge that translates IPv6 on the client side to IPv4 on the server side.

Legacy IP Only: This product does not support the current generation of the
Internet Protocol, IPv6.

Introducing 464XLAT

The ISP industry faces the same scarcity as cloud providers, and often turns to CGNAT to conserve addresses. But dual-stack networks are more complex than single-stack networks, and for larger providers the private IPv4 space also becomes scarce. Sharding the network into multiple routing domains enables reuse of the private addresses, but adds yet more complexity.

A simpler approach is to build an IPv6-only network, and then add IPv4 just within the customer premises (private addresses) and on the external side of the NAT (public addresses). 464XLAT makes that possible. It requires two components, a CLAT and a PLAT.

CLAT

The customer-side translator, called a CLAT, converts IPv4 traffic to IPv6 by mapping each IPv4 address to an IPv6 counterpart. A CLAT typically runs on a home or small business router.

Consider a customer who has been allocated 2001:db8:12:3400::/56 and has a dual-stack main network using 2001:db8:12:3400::/64 and 192.168.1.0/24.

To translate the source address of an outgoing packet the CLAT needs some dedicated space, let’s say 2001:db8:12:3401::/96, which is non-overlapping with the customer’s primary /64 but still within their /56 and is globally routable. It’s also sized so that the IPv6 prefix (96 bits) concatenated with an IPv4 address (32 bits) yields a valid IPv6 address (128 bits). The CLAT takes a local address like 192.168.1.5, which is c0a80105 in hex, and combines it with the prefix to make 2001:db8:12:3401::c0a8:105. For convenience we can write it as 2001:db8:12:3401::192.168.1.5, but that’s just another notation for the same thing; it’s simply an IPv6 address and may be routed without any knowledge of IPv4.

To translate the destination address, the CLAT needs a bit of provider-specific information called the NAT64 prefix. Let’s assume it’s 64:ff9b::/96, which is actually reserved for the purpose, though a provider could also choose a different prefix out of their own IP space. It needs to be routable within the provider’s network. The CLAT combines the prefix with a destination address like 198.51.100.7 to make 64:ff9b::198.51.100.7, also known as 64:ff9b::c633:6407.
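
Both mappings are plain bit concatenation, so they’re easy to reproduce at a shell prompt. As a quick illustration (the printf trick is just for show, not part of any CLAT tooling), the two example addresses work out like this:

$ printf '2001:db8:12:3401::%02x%02x:%02x%02x\n' 192 168 1 5
2001:db8:12:3401::c0a8:0105

$ printf '64:ff9b::%02x%02x:%02x%02x\n' 198 51 100 7
64:ff9b::c633:6407

(The first result keeps a leading zero, but it’s the same address as 2001:db8:12:3401::c0a8:105.)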

The freshly minted IPv6 packet gets sent out, magic happens, and a response comes back. The CLAT undoes the translation and forwards the IPv4 packet back to 192.168.1.5, which is none the wiser but happy to see a reply from 198.51.100.7.

As a side note, the whole operation of the CLAT is stateless. It relies on the fact that the IPv6 space is so vast it can contain the entire IPv4 space (many times over!) within the address range commonly allocated to a single subnet or server (the ubiquitous /64).

PLAT

On the provider side, the aforementioned magic is actually just the common NAT64 gateway. In 464XLAT it’s referred to as a PLAT, because it’s the provider-side translator and who doesn’t love a weird acronym/abbreviation hybrid.

NAT64 behaves much like conventional IPv4 NAT, but on the internal side, instead of accepting IPv4 traffic from private addresses, it accepts IPv6. It replaces the source IPv6 address with its own public IPv4, and converts the destination IPv6 address to IPv4 by stripping off the NAT64 prefix. Like a conventional NAT, it’s stateful: it needs to keep track of which TCP and UDP ports are in use by which clients so it can reverse the conversion on the return traffic.
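
To make the statefulness concrete, suppose the PLAT’s public address is 203.0.113.10 (made up for illustration) and the client from the CLAT example opens an HTTPS connection; the port numbers below are equally hypothetical:

  CLAT to PLAT:    [2001:db8:12:3401::c0a8:105]:41392 -> [64:ff9b::c633:6407]:443
  PLAT to server:  203.0.113.10:6021 -> 198.51.100.7:443

The PLAT records that external port 6021 belongs to that particular IPv6 source, so when the reply from 198.51.100.7 arrives it can rebuild the IPv6 packet and send it back toward the CLAT.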

Building a CLAT in Linux

To give an IPv6-only VM in the cloud access to the IPv4 internet, we’ll run a CLAT on the VM (translating only its own traffic) and rely on a public NAT64 service for PLAT duties.

The Linux networking stacks for IPv4 and IPv6 are largely separate entities, and there’s no built-in way to rewrite packets from one protocol to the other the way one might rewrite addresses in conventional NAT. Instead, we need to consume a packet and create a new one of the other protocol. Several programs can do this: Jool is an out-of-tree kernel module and TAYGA is a userspace implementation. Both are available in Debian, and for this example we’ll pick Jool.

Prepare the environment

First we’ll need an IPv6-only VM. I’m using Google Compute Engine, because I work at Google and it’s the platform I’m most familiar with.

The default VPC networks in GCE are IPv4-only, so it’s necessary to create a custom one in the cloud console. For mine I selected an MTU of 1500 (this will be relevant later), no private IPv6 range, and a subnet type of IPv6 (single-stack) with IPv6 access type External. I then created a VM running Debian 12 (bookworm) on that network.
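
For reference, a roughly equivalent setup from the CLI might look like the sketch below. I used the console, so treat the flag values as assumptions to verify against current gcloud documentation (IPv6-only subnets are a relatively new feature, and the network and subnet names here are placeholders).

gcloud compute networks create v6only-net --subnet-mode=custom --mtu=1500
gcloud compute networks subnets create v6only-subnet \
    --network=v6only-net --region=us-west1 \
    --stack-type=IPV6_ONLY --ipv6-access-type=EXTERNAL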

Upon login, we can confirm that the IPv4 address is merely a link-local one and that external IPv6 connectivity works but IPv4 does not.

root@hexagon:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 02:56:58:e6:00:91 brd ff:ff:ff:ff:ff:ff
    altname enp0s4
    inet 169.254.1.2/32 metric 100 scope global dynamic ens4
       valid_lft 3488sec preferred_lft 3488sec
    inet6 2600:1900:4041:173:0:1::/128 scope global dynamic noprefixroute
       valid_lft 3491sec preferred_lft 3491sec
    inet6 fe80::56:58ff:fee6:91/64 scope link
       valid_lft forever preferred_lft forever

root@hexagon:~# curl http://ipv6.myip.wtf/text
2600:1900:4041:173:0:1::

root@hexagon:~# curl http://ipv4.myip.wtf/text
curl: (28) Failed to connect to ipv4.myip.wtf port 80 after 130660 ms: Couldn't connect to server

Install and configure Jool

First install Jool and some dependencies.

apt update && apt -y install jool-tools jool-dkms linux-headers-cloud-amd64 bind9-dnsutils

Then load the kernel module.

modprobe jool_siit

If this fails, it’s likely because the module wasn’t built automatically upon package installation. I’ve seen this happen if the linux-headers- package is missing or if the suffix on that package name doesn’t match that of the installed linux-image- package.

We need to tell the VM to forward traffic to Jool. This can be done either through iptables rules or by running Jool in another network namespace and routing traffic through it. Since many people already have other tools vying for iptables control, I think a namespace is less intrusive.

Create a namespace called clat and a veth(4) pair to link it to the default namespace.

ip netns add clat
ip link add name jool type veth peer name world
ip link set dev world netns clat

The veth interface is named jool in the default network namespace, and world in the clat namespace, representing what it connects to.

Bring up both ends of the link. The one in the default namespace is easy.

ip link set up dev jool

But running the same command for world instead of jool will return the error Cannot find device "world". That’s because we moved the world interface into the clat namespace, so any programs that manipulate it need to run in that namespace as well. Fortunately the ip tool provides a helper to start a program in another network namespace… in this case, just another invocation of ip itself.

ip netns exec clat ip link set up dev world

Add some static link-local addresses to both ends of the link. This will make it easier to specify routing nexthops later, since we won’t need to look up the autogenerated link-local addresses that change on every reboot.

ip netns exec clat ip -6 addr add fe80::1 dev world
ip -6 addr add fe80::2 dev jool

Our VM needs a private IPv4 address to use for the local end of any outgoing IPv4 connections. We could choose any address, but since the IANA reserved a specific /29 (192.0.0.0/29) for this purpose, let’s pick 192.0.0.2 from that block. Jool will take 192.0.0.1 and the VM will point the IPv4 default route at Jool.

ip -4 addr add 192.0.0.2/29 dev jool
ip netns exec clat ip -4 addr add 192.0.0.1/29 dev world
ip -4 route add default via 192.0.0.1 dev jool mtu 1480

The MTU on the default route is reduced by 20 bytes relative to the VM’s external interface, because the IPv6 header is 20 bytes larger than the IPv4 header. This ensures that any packets sent to Jool will still fit across the VM’s primary uplink post-translation. Be sure to double-check the MTU of the primary network, as it’s not always 1500.

We also need an IPv6 address that 192.0.0.2 will get translated to. In the ISP example this was the 2001:db8:12:3401::/96 prefix, capable of holding any IPv4 address, but it could have been smaller as only 192.168.1.0/24 needed to be translated. On this VM, we know we’re only translating 192.0.0.2 so a single IPv6 counterpart will do. GCE actually allocates a /96 to each VM and assigns the first address in that block to the public interface, so we can simply look up that address and increment it by 1. Configure that in the clat namespace, along with routes on both sides.

ip netns exec clat ip -6 addr add 2600:1900:4041:173:0:1::1 dev world
ip netns exec clat ip -6 route add default via fe80::2 dev world
ip -6 route add 2600:1900:4041:173:0:1::1 via fe80::1 dev jool
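
Those commands hardcode the address. If you turn this into a script, a rough sketch for deriving it instead (assuming the interface is ens4 and, as GCE does here, that its global address is the all-zeros first address of the /96, so that simply appending 1 after the trailing :: is valid):

VM6=$(ip -6 addr show dev ens4 scope global | awk '/inet6/ {split($2, a, "/"); print a[1]; exit}')
CLAT6="${VM6}1"    # 2600:1900:4041:173:0:1:: becomes 2600:1900:4041:173:0:1::1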

Routing won’t actually work yet because Linux disables forwarding by default. But enabling forwarding will disable acceptance of route advertisements, which the VM needs to maintain its default route to the internet. So we first need to tell it “yes, really accept route advertisements, even though you’re a router”, and then enable forwarding.

sysctl -w net.ipv6.conf.ens4.accept_ra=2
sysctl -w net.ipv6.conf.all.forwarding=1

Finally, we need the NAT64 prefix of a PLAT. The standard 64:ff9b::/96 won’t do because the traffic needs to leave the cloud provider and traverse the internet. We can find one by querying the DNS64 service of one of the public NAT64 providers for any IPv4-only hostname.

$ dig +short -t aaaa ipv4.wtfismyip.com. @2a00:1098:2c::1
2a00:1098:2b::1:8e2c:d7a1
2a01:4f8:c2c:123f:64:5:8e2c:d7a1
2a00:1098:2c::5:8e2c:d7a1

We have options and any choice should be fine. Test it by making an HTTP request through the NAT64.

$ curl http://[2a00:1098:2b::1:8e2c:d7a1]/text
46.235.231.114

Lop off the last 32 bits to get a /96. Use a subnet calculator if needed (or the one-liner shown after the commands below). Configure Jool to translate external addresses to that prefix, and to translate 192.0.0.2 to the corresponding local address that we determined above.

ip netns exec clat jool_siit instance add --netfilter --pool6 2a00:1098:2b::1:0:0/96
ip netns exec clat jool_siit eamt add 192.0.0.2 2600:1900:4041:173:0:1::1
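
Incidentally, the lopping-off can be done without a subnet calculator. Here’s a throwaway Python one-liner (purely a convenience, nothing Jool-specific) applied to the AAAA answer we picked:

$ python3 -c 'import ipaddress, sys; a = ipaddress.ip_address(sys.argv[1]); print(ipaddress.ip_network((int(a) >> 32 << 32, 96)))' 2a00:1098:2b::1:8e2c:d7a1
2a00:1098:2b::1:0:0/96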

One more command is needed to make ICMP behave properly. If one of the IPv6 hops on the path to the PLAT sends back a “packet too big” or a “hop limit exceeded” message, the CLAT won’t be able to map it back to IPv4 because its source address is not within the NAT64 prefix. We can tell it to simply fill in 192.0.0.8, which is reserved for that purpose.

ip netns exec clat jool_siit global update rfc6791v4-prefix 192.0.0.8/32

Testing it out

It should now be possible to access IPv4-only services.

$ curl http://ipv4.myip.wtf/text
46.235.231.114

Congratulations! No more IPv4 charges in the cloud! Note that all the commands above, except the package installation, affect the ephemeral state of the running system. You’ll probably want to stick them in a shell script to run on every boot.
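
One hedged way to do that: put the commands in a script (the path and unit name below are placeholders) and run it from a oneshot systemd unit so it executes on every boot.

cat >/etc/systemd/system/clat.service <<'EOF'
[Unit]
Description=464XLAT CLAT setup
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/sbin/clat-up.sh

[Install]
WantedBy=multi-user.target
EOF
systemctl enable clat.service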

A note on DNS64

Since public NAT64 providers generally also offer a DNS64 service, it’s worth asking: why not just follow their 3-step instructions? That’s certainly easier; it avoids the need to install packages or configure a CLAT.

However, I don’t want to send all DNS requests through a third party. Sending a small amount of legacy IP traffic, most of which will be encrypted, is much better for privacy than sending 100% of DNS requests in plain text. Furthermore, as of 2024 all of the listed providers are in Europe while I’m on the US west coast. Querying such a far-flung DNS64 would be quite a latency hit, and it would affect all DNS lookups, including the ones for native IPv6 hosts that are hopefully the majority.

The privacy and latency issues are solvable by running a DNS64 resolver locally on the VM, and that might be simpler and more efficient than the CLAT: it avoids a kernel module and a local translation. But it requires applications to be IPv6-aware and to use DNS; it won’t work for literals like 8.8.8.8. It’s a tradeoff, and it’s also possible to combine a local CLAT + local DNS64 if you wish.
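
As a sketch of the DNS64 half, with unbound as the local resolver it’s only a couple of lines; this assumes the Debian package’s conf.d include is in place and reuses the public NAT64 prefix we found earlier:

cat >/etc/unbound/unbound.conf.d/dns64.conf <<'EOF'
server:
    module-config: "dns64 validator iterator"
    dns64-prefix: 2a00:1098:2b::1:0:0/96
EOF
systemctl restart unbound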

Caveats and looking ahead

For some reason, ICMP ping doesn’t work through several of the public NAT64 gateways that I tried. That’s not inherent to 464XLAT; evidently ICMP translation is either not enabled by the operators of those gateways or blocked due to abuse. Keep in mind that it’s a free service, generously made available by an individual, so we’re still getting more than we’re paying for.

In my next post I’ll show how to run your own NAT64 gateway/PLAT. While that obviously requires at least 1 IPv4 address, it still allows many VMs to share it, and you’ll know who to yell at if ICMP doesn’t work. And unlike traditional NAT, the gateway doesn’t have to run at the same cloud provider.