Eliminating cloud IPv4 costs with IPv6 and 464XLAT
The big three cloud providers charge for public IPv4 addresses. Going IPv6-only avoids that cost, but sometimes it’s still necessary to access services that are IPv4-only. In this post we’ll look at how to bridge the gap with 464XLAT, a technology originally developed for internet service providers.
We’ll focus on the case of an IPv6 client in the cloud making connections to remote IPv4 services. If you instead want to make a server in the cloud accessible by IPv4 clients, there are other solutions for that. Of course, both can be combined, even in the same VM.
Problem statement
IPv4 addresses are scarce and cost real money to acquire. By passing these costs on to customers, cloud providers both recoup their investment and encourage efficiency in address usage.
One of the best ways to be efficient is to run an IPv6-only infrastructure. Unfortunately, there are still some important services (GitHub, sigh) that support legacy IP only. Accessing those requires a bridge that translates IPv6 on the client side to IPv4 on the server side.
Introducing 464XLAT
The ISP industry faces the same scarcity as cloud providers, and often turns to carrier-grade NAT (CGNAT) to conserve addresses. But dual-stack networks are more complex than single-stack networks, and for larger providers the private IPv4 space also becomes scarce. Sharding the network into multiple routing domains enables reuse of the private addresses, but adds yet more complexity.
A simpler approach is to build an IPv6-only network, and then add IPv4 just within the customer premises (private addresses) and on the external side of the NAT (public addresses). 464XLAT makes that possible. It requires two components, a CLAT and a PLAT.
CLAT
The customer side translator, called a CLAT, converts IPv4 traffic to IPv6 by mapping each IPv4 address to an IPv6 counterpart. A CLAT typically runs on a home or small business router.
Consider a customer who has been allocated 2001:db8:12:3400::/56 and has a dual-stack main network using 2001:db8:12:3400::/64 and 192.168.1.0/24.
To translate the source address of an outgoing packet the CLAT needs some dedicated space, let’s say 2001:db8:12:3401::/96, which is non-overlapping with the customer’s primary /64 but still within their /56 and is globally routable. It’s also sized so that the IPv6 prefix (96 bits) concatenated with an IPv4 address (32 bits) yields a valid IPv6 address (128 bits). The CLAT takes a local address like 192.168.1.5, which is c0a80105 in hex, and combines it with the prefix to make 2001:db8:12:3401::c0a8:105. For convenience we can write it as 2001:db8:12:3401::192.168.1.5, but that’s just another notation for the same thing; it’s simply an IPv6 address and may be routed without any knowledge of IPv4.
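To see the concatenation concretely, Python’s ipaddress module can do the bit math (this is just an illustration, not part of the setup):
$ python3 -c "import ipaddress as i; print(i.IPv6Address(int(i.IPv6Address('2001:db8:12:3401::')) | int(i.IPv4Address('192.168.1.5'))))"
2001:db8:12:3401::c0a8:105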
To translate the destination address, the CLAT needs a bit of provider-specific information called the NAT64 prefix. Let’s assume it’s 64:ff9b::/96, which is actually reserved for the purpose, though a provider could also choose a different prefix out of their own IP space. It needs to be routable within the provider’s network. The CLAT combines the prefix with a destination address like 198.51.100.7 to make 64:ff9b::198.51.100.7, also known as 64:ff9b::c633:6407.
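The two spellings really are interchangeable; asking Python to parse the dotted form and print it back yields the canonical hex form (again, just an illustration):
$ python3 -c "import ipaddress; print(ipaddress.ip_address('64:ff9b::198.51.100.7'))"
64:ff9b::c633:6407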
The freshly minted IPv6 packet gets sent out, magic happens, and a response comes back. The CLAT undoes the translation and forwards the IPv4 packet back to 192.168.1.5, which is none the wiser but happy to see a reply from 198.51.100.7.
As a side note, the whole operation of the CLAT is stateless. It relies on the fact that the IPv6 space is so vast it can contain the entire IPv4 space (many times over!) within the address range commonly allocated to a single subnet or server: the ubiquitous /64, which holds 2^32 complete copies of the 32-bit IPv4 space.
PLAT
On the provider side, the aforementioned magic is actually just the common NAT64 gateway. In 464XLAT it’s referred to as a PLAT, because it’s the provider side translator, and who doesn’t love a weird acronym/abbreviation hybrid.
NAT64 behaves much like conventional IPv4 NAT, but on the internal side, instead of accepting IPv4 traffic from private addresses, it accepts IPv6. It replaces the source IPv6 address with its own public IPv4, and converts the destination IPv6 address to IPv4 by stripping off the NAT64 prefix. Like a conventional NAT, it’s stateful: it needs to keep track of which TCP and UDP ports are in use by which clients so it can reverse the conversion on the return traffic.
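Stripping the prefix amounts to discarding the top 96 bits; a quick sketch of what happens to the destination address:
$ python3 -c "import ipaddress; print(ipaddress.ip_address(int(ipaddress.ip_address('64:ff9b::c633:6407')) & 0xffffffff))"
198.51.100.7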
Building a CLAT in Linux
To give an IPv6-only VM in the cloud access to the IPv4 internet, we’ll run a CLAT on the VM (translating only its own traffic) and rely on a public NAT64 service for PLAT duties.
The Linux networking stacks for IPv4 and IPv6 are largely separate entities, and there’s no built-in way to rewrite packets from one protocol to the other the way one might rewrite addresses in conventional NAT. Instead, we need to consume a packet and create a new one of the other protocol. Several programs can do this: Jool is an out-of-tree kernel module and TAYGA is a userspace implementation. Both are available in Debian, and for this example we’ll pick Jool.
Prepare the environment
First we’ll need an IPv6-only VM. I’m using Google Compute Engine, because I work at Google and it’s the platform I’m most familiar with.
The default VPC networks in GCE are IPv4-only, so it’s necessary to create a custom one in the cloud console. For mine I selected an MTU of 1500 (will be relevant later), no private IPv6 range, and a subnet type of IPv6 (single-stack) with IPv6 access type External. I then created a VM running Debian 12 (bookworm) on that network.
Upon login, we can confirm that the IPv4 address is merely a link-local one and that external IPv6 connectivity works but IPv4 does not.
root@hexagon:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 02:56:58:e6:00:91 brd ff:ff:ff:ff:ff:ff
altname enp0s4
inet 169.254.1.2/32 metric 100 scope global dynamic ens4
valid_lft 3488sec preferred_lft 3488sec
inet6 2600:1900:4041:173:0:1::/128 scope global dynamic noprefixroute
valid_lft 3491sec preferred_lft 3491sec
inet6 fe80::56:58ff:fee6:91/64 scope link
valid_lft forever preferred_lft forever
root@hexagon:~# curl http://ipv6.myip.wtf/text
2600:1900:4041:173:0:1::
root@hexagon:~# curl http://ipv4.myip.wtf/text
curl: (28) Failed to connect to ipv4.myip.wtf port 80 after 130660 ms: Couldn't connect to server
Install and configure Jool
First install Jool and some dependencies.
apt update && apt -y install jool-tools jool-dkms linux-headers-cloud-amd64 bind9-dnsutils
Then load the kernel module.
modprobe jool_siit
If this fails, it’s likely because the module wasn’t built automatically upon package installation. I’ve seen this happen if the linux-headers- package is missing or if the suffix on that package name doesn’t match that of the installed linux-image- package.
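To diagnose, dkms status shows whether the jool module was built against the running kernel, and once the matching headers are installed, dkms autoinstall retries the build:
dkms status
dkms autoinstall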
We need to tell the VM to forward traffic to Jool. This can be done either through iptables rules or by running Jool in another network namespace and routing traffic through it. Since many people already have other tools vying for iptables control, I think a namespace is less intrusive.
Create a namespace called clat and a veth(4) pair to link it to the default namespace.
ip netns add clat
ip link add name jool type veth peer name world
ip link set dev world netns clat
The veth interface is named jool in the default network namespace, and world in the clat namespace, representing what it connects to.
Bring up both ends of the link. The one in the default namespace is easy.
ip link set up dev jool
But running the same command for world instead of jool will return an error: Cannot find device "world". That’s because we moved the world interface into the clat namespace, so any programs that manipulate it need to run in that namespace as well. Fortunately the ip tool provides a helper to start a program in another network namespace… in this case, just another invocation of ip itself.
ip netns exec clat ip link set up dev world
Add some static link-local addresses to both ends of the link. This will make it easier to specify routing nexthops later, since we won’t need to look up the autogenerated link-local addresses that change on every reboot.
ip netns exec clat ip -6 addr add fe80::1 dev world
ip -6 addr add fe80::2 dev jool
Our VM needs a private IPv4 address to use for the local end of any outgoing IPv4 connections. We could choose any address, but since the IANA reserved a specific /29 (192.0.0.0/29) for this purpose, let’s pick 192.0.0.2 from that block. Jool will take 192.0.0.1 and the VM will point the IPv4 default route at Jool.
ip -4 addr add 192.0.0.2/29 dev jool
ip netns exec clat ip -4 addr add 192.0.0.1/29 dev world
ip -4 route add default via 192.0.0.1 dev jool mtu 1480
The MTU on the default route is reduced by 20 bytes relative to the VM’s external interface, because the IPv6 header is 20 bytes larger than the IPv4 header. This ensures that any packets sent to Jool will still fit across the VM’s primary uplink post-translation. Be sure to double-check the MTU of the primary network, as it’s not always 1500.
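On this VM the uplink is ens4, so a quick check:
$ cat /sys/class/net/ens4/mtu
1500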
We also need an IPv6 address that 192.0.0.2 will get translated to. In the ISP example this was the 2001:db8:12:3401::/96 prefix, capable of holding any IPv4 address, but it could have been smaller as only 192.168.1.0/24 needed to be translated. On this VM, we know we’re only translating 192.0.0.2 so a single IPv6 counterpart will do. GCE actually allocates a /96 to each VM and assigns the first address in that block to the public interface, so we can simply look up that address and increment it by 1. Configure that in the clat namespace, along with routes on both sides.
ip netns exec clat ip -6 addr add 2600:1900:4041:173:0:1::1 dev world
ip netns exec clat ip -6 route add default via fe80::2 dev world
ip -6 route add 2600:1900:4041:173:0:1::1 via fe80::1 dev jool
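If you’re scripting this, the address can be derived instead of hard-coded. A sketch, assuming the uplink is ens4 and carries exactly one global address:
# look up the interface's global IPv6 address and strip the prefix length
UPLINK=$(ip -6 -br addr show dev ens4 scope global | awk '{print $3}' | cut -d/ -f1)
# increment it by one to get the CLAT-side counterpart
CLAT=$(python3 -c "import ipaddress, sys; print(ipaddress.IPv6Address(int(ipaddress.IPv6Address(sys.argv[1])) + 1))" "$UPLINK")
ip netns exec clat ip -6 addr add "$CLAT" dev world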
Routing won’t actually work yet because Linux disables forwarding by default. But enabling forwarding will disable acceptance of route advertisements, which the VM needs to maintain its default route to the internet. So we first need to tell it “yes, really accept route advertisements, even though you’re a router”, and then enable forwarding.
sysctl -w net.ipv6.conf.ens4.accept_ra=2
sysctl -w net.ipv6.conf.all.forwarding=1
Finally, we need the NAT64 prefix of a PLAT. The standard 64:ff9b::/96 won’t do because the traffic needs to leave the cloud provider and traverse the internet. We can find one by querying the DNS64 service of one of the public NAT64 providers for any IPv4-only hostname.
$ dig +short -t aaaa ipv4.wtfismyip.com. @2a00:1098:2c::1
2a00:1098:2b::1:8e2c:d7a1
2a01:4f8:c2c:123f:64:5:8e2c:d7a1
2a00:1098:2c::5:8e2c:d7a1
We have options and any choice should be fine. Test it by making an HTTP request through the NAT64.
$ curl http://[2a00:1098:2b::1:8e2c:d7a1]/text
46.235.231.114
Lop off the last 32 bits to get a /96. Use a subnet calculator if needed, or a quick one-liner like the sketch below (shown with the first answer above).
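$ python3 -c "import ipaddress; a = ipaddress.ip_address('2a00:1098:2b::1:8e2c:d7a1'); print(ipaddress.IPv6Network((int(a) >> 32 << 32, 96)))"
2a00:1098:2b::1:0:0/96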
Configure Jool to translate external addresses to that prefix, and to translate 192.0.0.2 to the corresponding local address that we determined above.
ip netns exec clat jool_siit instance add --netfilter --pool6 2a00:1098:2b::1:0:0/96
ip netns exec clat jool_siit eamt add 192.0.0.2 2600:1900:4041:173:0:1::1
One more command is needed to make ICMP behave properly. If one of the IPv6 hops on the path to the PLAT sends back a “packet too big” or a “hop limit exceeded” message, the CLAT won’t be able to map it back to IPv4 because its source address is not within the NAT64 prefix. We can tell it to simply fill in 192.0.0.8, which is reserved for that purpose.
ip netns exec clat jool_siit global update rfc6791v4-prefix 192.0.0.8/32
Testing it out
It should now be possible to access IPv4-only services.
$ curl http://ipv4.myip.wtf/text
46.235.231.114
Congratulations! No more IPv4 charges in the cloud! Note that all the commands above, except the package installation, affect only the ephemeral state of the running system; you’ll probably want to stick them in a shell script to run on every boot, as sketched below.
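A minimal sketch of such a script, collecting the commands from this walkthrough (the interface name, IPv6 addresses, and NAT64 prefix are the ones from my VM; substitute your own):
#!/bin/sh
# clat-up.sh: recreate the CLAT on every boot
set -e

modprobe jool_siit

# namespace and veth pair
ip netns add clat
ip link add name jool type veth peer name world
ip link set dev world netns clat
ip link set up dev jool
ip netns exec clat ip link set up dev world

# static link-local addresses for stable nexthops
ip netns exec clat ip -6 addr add fe80::1 dev world
ip -6 addr add fe80::2 dev jool

# private IPv4 plumbing (192.0.0.0/29 is reserved for this)
ip -4 addr add 192.0.0.2/29 dev jool
ip netns exec clat ip -4 addr add 192.0.0.1/29 dev world
ip -4 route add default via 192.0.0.1 dev jool mtu 1480

# IPv6 counterpart address and routes
ip netns exec clat ip -6 addr add 2600:1900:4041:173:0:1::1 dev world
ip netns exec clat ip -6 route add default via fe80::2 dev world
ip -6 route add 2600:1900:4041:173:0:1::1 via fe80::1 dev jool

# keep accepting RAs despite forwarding
sysctl -w net.ipv6.conf.ens4.accept_ra=2
sysctl -w net.ipv6.conf.all.forwarding=1

# Jool translation rules
ip netns exec clat jool_siit instance add --netfilter --pool6 2a00:1098:2b::1:0:0/96
ip netns exec clat jool_siit eamt add 192.0.0.2 2600:1900:4041:173:0:1::1
ip netns exec clat jool_siit global update rfc6791v4-prefix 192.0.0.8/32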
A note on DNS64
Since public NAT64 providers generally also offer a DNS64 service, it’s worth asking: why not just follow their three-step instructions? That’s certainly easier; it avoids the need to install packages or configure a CLAT.
However, I don’t want to send all DNS requests through a third party. Sending a small amount of legacy IP traffic, most of which will be encrypted, is much better for privacy than sending 100% of DNS requests in plain text. Furthermore, as of 2024 all of the listed providers are in Europe while I’m on the US west coast. Querying such a far-flung DNS64 would be quite a latency hit, and would affect all DNS lookups, including the ones for native IPv6 hosts that are hopefully the majority.
The privacy and latency issues are solvable by running a DNS64 resolver locally on the VM, and that might be simpler and more efficient than the CLAT: it avoids a kernel module and a local translation. But it requires applications to be IPv6 aware and to use DNS; it won’t work for literals like 8.8.8.8. It’s a tradeoff, and it’s also possible to combine a local CLAT + local DNS64 if you wish.
Caveats and looking ahead
For some reason, ICMP ping doesn’t work through several of the public NAT64 gateways that I tried. That’s not inherent to 464XLAT; evidently it’s either not configured by the operators of those gateways or blocked due to abuse. Keep in mind that each is a free service, generously made available by an individual, so we’re still getting more than we’re paying for.
In my next post I’ll show how to run your own NAT64 gateway/PLAT. While that obviously requires at least one IPv4 address, it still allows many VMs to share it, and you’ll know who to yell at if ICMP doesn’t work. And unlike traditional NAT, the gateway doesn’t have to run at the same cloud provider.