IPv6 (only) in the modern home
Sep 22, 2018
16 minute read

A few months ago I started experimenting with an IPv6 only network at home. There were several reasons why I wanted to try this out which included:

  • Wanting to be able to access individual devices from the outside without needing to manually handle port forwarding on the shared home router,1
  • Needing to VPN my traffic anyway to circumvent traffic shaping, DPI, and the recording or hijacking of DNS queries,2
  • Realizing that running a dual stack setup meant duplicating firewall rules,3 extra configuration options, and maintaining routing tables for both stacks.

Network design

             Colocation facility
+------------------------------------------+
| Reputable ISP                            |
|  ^    ^                                  |
|  |    |          +-------------------+   |
|  |    |          |                   |   |
|  |    +------->  |  Trusted server   |   |
|  |               |                   |   |
|  |               +-------------------+   |
+------------------------------------------+
   |     +-------------------+
   |     |                   |
   +---> | Internet exchange |
         |                   |
         +--------+----------+
                  ^
                  |
                  v
         Unreliable consumer ISP
                  ^
                  |DOCSIS based Coaxial
                  |connection (shared)
+------------------------------------------+
|                 v                        |
|          +------+-----------+            |
|          | Media PC as home +<-> Ethernet|
| Wifi <-->+ router           |            |
|          +------------------+            |
+------------------------------------------+
                   Home

As you can see in the figure above, the idea is to tunnel the traffic between the Media PC at home, and the trusted server in the colocation facility, which is used for internet access.

The colocation based server has a /48 IPv6 prefix assigned to it, so a /56 from that prefix can be routed to the media PC acting as a router / gateway at home. Because the home connection has no native IPv6 connectivity, the VPN tunnel is established over IPv4: the media PC connects to the IPv4 address of the server.

Network implementation

There are many ways in which such a scheme could have been implemented, starting with the choice of VPN software, network configuration protocols and management / monitoring.

VPN software

The three main contenders in this space are OpenVPN, IPSec based solutions such as StrongSwan, and Wireguard.

OpenVPN

OpenVPN is a project which implements secure tunneling through a custom protocol that can be configured to run over UDP or TCP. It has two modes: static key mode and TLS mode. In static key mode OpenVPN allows the administrator to specify the HMAC {send, receive} and DATA {encrypt, decrypt} keys manually in the configuration file. In TLS mode, bidirectional authentication using X.509 certificates is supported, for example verifying that client certificates are signed by the appropriate VPN certificate authority. This mode generates the HMAC keying material using OpenSSL and exchanges it inside the TLS session. The packets related to session management and the data packets themselves are multiplexed over the same connection.

Each packet which is transmitted contains:

  • A (keyed) Hash based Message Authentication Code (HMAC) over (explicit IV, encrypted envelope). The outer HMAC protects the IV and encrypted envelope from being forged (i.e. makes it harder to perform a denial of service attack by flooding invalid packets).
  • The explicit IV, which is the value used to initialize the encryption routine.
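
The "authenticate before doing anything else" idea behind the outer HMAC can be sketched in a few lines of Python. This is only an illustration and not OpenVPN's wire format: the function names, the choice of SHA-256, and the 16-byte IV are assumptions, and the encrypted envelope is treated as opaque bytes.

```python
import hashlib
import hmac
import os

HMAC_LEN = 32  # SHA-256 digest size; the hash choice is an assumption


def seal(hmac_key: bytes, iv: bytes, envelope: bytes) -> bytes:
    """Prefix (iv || envelope) with an HMAC computed over both."""
    tag = hmac.new(hmac_key, iv + envelope, hashlib.sha256).digest()
    return tag + iv + envelope


def open_packet(hmac_key: bytes, packet: bytes, iv_len: int = 16):
    """Return (iv, envelope) if the HMAC verifies, else None.

    The check runs before any decryption would be attempted, so forged
    packets are discarded cheaply.
    """
    if len(packet) < HMAC_LEN + iv_len:
        return None
    tag, body = packet[:HMAC_LEN], packet[HMAC_LEN:]
    expected = hmac.new(hmac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return body[:iv_len], body[iv_len:]


key = os.urandom(32)
iv = os.urandom(16)
packet = seal(key, iv, b"already-encrypted bytes")
# Flip one bit in the last byte to simulate a forged or corrupted packet.
forged = packet[:-1] + bytes([packet[-1] ^ 0x01])
```

Here `open_packet(key, packet)` hands back the IV and envelope, while `open_packet(key, forged)` returns `None` without touching the envelope at all.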

More information can be found at the OpenVPN security overview page4.

IPSec

Internet Protocol Security (IPSec) is actually a collection of protocols defined in various RFCs5, and is very widely supported across many operating systems, devices, and vendors. The quality of support varies because there are numerous ways in which IPSec can be used (i.e. which authentication modes are supported, which cipher suites are used, etc), and many of the implementations available in commercial hardware are closed source and confidential, so it is difficult to assess the quality of the implementation in a white box manner.

The protocol itself consists of several components, which can briefly be examined to familiarize ourselves with how this scheme actually works.

A Security Association (SA) is like a connection in the sense that it maintains the state of the current relationship between two IPSec endpoints. It records which algorithms are being used for the packet encryption, and which MAC function is being used for the integrity checks. On a given system running IPSec, different packets can be processed differently (i.e. sent with different SAs / different parameters) through the use of the Security Parameter Index (SPI), which is just an index into the list of security associations the host is maintaining.

The Encapsulating Security Payload (ESP) packet is the packet which actually carries the payload,6 and consists of the following fields:

  • Security Parameters Index (SPI): used to identify which SA is used to encrypt / verify this packet,
  • Monotonically increasing sequence number,
  • Payload data itself,
  • Padding which is used to match the cipher block size,
  • Pad length,
  • Next header,
  • Integrity check value,

There are some interesting choices here. The SPI and the sequence number are sent in the clear, while the padding, pad length, and next header form a trailer which is encrypted together with the payload (they are only readable when ESP is used with a null cipher). This means that a passive observer is able to:

  • Link flows together even if they have gone through a NAT, or easily count the number of packets which have been sent and received due to the monotonic nature of the sequence number.
  • Know the length of each packet to within the padding, because the size of the encrypted payload is visible on the wire.
  • Know what the payload type is from the next header value, but only when encryption is disabled.
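
To make the passive observer's view concrete, here is a small Python sketch that reads the two fields which RFC 4303 places unencrypted in the first eight bytes of every ESP packet: the SPI and the sequence number. The `EspHeader` type and the sample packets are made up for illustration; everything after the header is treated as opaque bytes.

```python
import struct
from typing import NamedTuple


class EspHeader(NamedTuple):
    spi: int
    sequence: int


def parse_esp_header(packet: bytes) -> EspHeader:
    """Read the SPI and sequence number from the start of an ESP packet."""
    if len(packet) < 8:
        raise ValueError("ESP packet shorter than its 8-byte header")
    # Two 32-bit unsigned integers in network byte order.
    spi, sequence = struct.unpack("!II", packet[:8])
    return EspHeader(spi, sequence)


# Hypothetical captured packets: 8-byte header followed by opaque bytes.
captured = [
    struct.pack("!II", 0xC0FFEE01, 1) + b"\x00" * 32,
    struct.pack("!II", 0xC0FFEE01, 2) + b"\x00" * 48,
]
headers = [parse_esp_header(p) for p in captured]
```

Both packets carry the same SPI, so an observer can tie them to one SA, and the sequence numbers 1 and 2 reveal their order and count.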

The establishment of an SA is done through the use of the Internet Key Exchange (IKE) protocol, which authenticates the two endpoints (for example with X.509 certificates) and runs a Diffie-Hellman exchange to agree on a shared session secret which is then used to bootstrap the rest of the system.

The role of StrongSwan in this scheme is as an IKE daemon which performs the key establishment and registers the SA with the Linux Kernel. The kernel then handles the actual encryption and decryption of packets based on the rules set up in the SA database. This database can be queried using the ip xfrm tool7.

Wireguard

Wireguard is a modern VPN protocol designed by Jason A. Donenfeld8, with the aims of being high performance, simple, and cryptographically sound. The most attractive feature of this scheme is the level of simplicity the protocol and its implementation achieves.

Wireguard aims to be as stateless as possible, and appears stateless to the user. Inside, however, there are timers and related session metadata which need to be kept about remote peers, but this is kept to a minimum. The concept of a peer is used instead of a server and client, because this is a clearer abstraction of what is really going on. When we connect two networks together, or connect a laptop to our home network for example, we want to establish a tunnel, and it does not matter which side is the client or the server. The protocol itself has a notion of initiator and responder for the purposes of the handshake, but apart from that the peers are equals.

The Wireguard handshake is designed to leak as little information as possible, and part of this is achieved by exchanging the peers’ static public keys (and optionally a pre-shared symmetric key) out of band. It has done away with the complexities of X.509 and instead builds cleanly upon Curve25519 key agreement. A key design goal is to not expose the existence of a Wireguard endpoint to a host which doesn’t know that it is running there. This means that handshake packets are only responded to if the other party knows the public key of that Wireguard peer. The reasoning behind this is that if a remote peer knows the public key, then it knows about the existence of the server and can thus expect a reply from initiating the handshake.

If the responder to the handshake is under load, for example from a denial of service attack, it can choose to send back a cookie that must be responded to before any expensive operations are performed. This means that we can enforce IP based rate limiting as spoofed UDP packets with fake source IP addresses can be eliminated.
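
The reason a cookie defeats spoofed source addresses is that it is bound to the address it was sent to. The toy model below shows only that binding, using a keyed BLAKE2s hash from Python's standard library. It is a simplification under my own assumptions, not the real message format: the class and method names are invented, and in the actual protocol the cookie is delivered encrypted and is used to key a second MAC on the next handshake message rather than being echoed back directly.

```python
import hashlib
import hmac
import os


class CookieChecker:
    """Toy model of a cookie bound to the sender's source address."""

    def __init__(self) -> None:
        # Random secret known only to the responder. A real
        # implementation would rotate it regularly.
        self._secret = os.urandom(32)

    def make_cookie(self, source_ip: str) -> bytes:
        """Derive a 16-byte cookie from the secret and the source address."""
        return hashlib.blake2s(
            source_ip.encode(), key=self._secret, digest_size=16
        ).digest()

    def check(self, source_ip: str, cookie: bytes) -> bool:
        """Accept the cookie only if it was issued to this source address."""
        return hmac.compare_digest(self.make_cookie(source_ip), cookie)


checker = CookieChecker()
cookie = checker.make_cookie("192.0.2.10")
```

A sender that really sits at 192.0.2.10 receives the cookie and can present it again. A sender spoofing that address never sees the reply, and a cookie issued to a different address fails the check.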

The handshake initiation message in an authenticated exchange between peers carries an encrypted timestamp, which suppresses replay attacks: any handshake with a timestamp lower than or equal to the greatest one previously seen from that peer is dropped. (Data packets are protected separately by a per-session counter.) A consequence of an attacker forging a high timestamp value (which requires knowing the private key) is that the legitimate peer will no longer be able to complete handshakes using its own timestamps, which should be an indication that the key has been compromised.
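
The "drop anything that is not newer than the greatest timestamp seen so far" rule is simple enough to sketch directly. The class name is hypothetical and plain integers stand in for the real timestamp format.

```python
class ReplayFilter:
    """Accept a message only if its timestamp beats every earlier one."""

    def __init__(self) -> None:
        self._greatest = None  # greatest timestamp accepted so far

    def accept(self, timestamp: int) -> bool:
        if self._greatest is not None and timestamp <= self._greatest:
            return False  # replayed or stale message
        self._greatest = timestamp
        return True


peer = ReplayFilter()
# 100 and 101 are fresh, the second 101 and 99 are replays, 250 models a
# forged far-future timestamp, after which the honest 102 is rejected.
results = [peer.accept(t) for t in (100, 101, 101, 99, 250, 102)]
```

The last two values show the failure mode described above: once a far-future timestamp has been accepted, the legitimate peer's own timestamps are rejected.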

Looking at the practical aspects of this protocol, the Wireguard implementation in Linux is done through a Loadable Kernel Module (LKM) which can be configured using the userspace wg utility and is enabled on an existing tunnel interface that can be configured through standard utilities such as ip. This is a major feature for usability as compared to other solutions which try to do everything themselves and are hard to configure properly because of this.

Choosing a VPN solution

After considering the above options, Wireguard was chosen for its small attack surface, modern cryptography, and ease of configuration. IPSec is unnecessarily complicated and does not provide the same security guarantees as Wireguard. It has many moving parts, such as the kernel XFRM interface and the userspace daemons which perform the key exchange, and too many configuration parameters to really understand how the system is functioning. OpenVPN was not seriously considered either, because of its poor performance (it has to context switch to userspace to process packets), its reliance on a TLS stack, and its complicated configuration.

VPN peer configuration

We will not go into the contents of the Wireguard configuration file here; you can read more about that in the Wireguard man page.9 More interesting is the surrounding configuration of the interface on the Linux side.

Firstly, tunnel specific endpoint addresses are chosen from the Unique Local Address (ULA) range. This clearly identifies the endpoints of the tunnel with private addresses that are not globally routed. Seeing one of these addresses immediately tells us that the traffic is on the tunnel interface, which helps when debugging. This is less sloppy than assigning public IP addresses to the endpoints and not being sure which way our packets are being routed.

Machine        Tunnel endpoint
Colo           fd00:cafe:f00d::1
Home gateway   fd00:cafe:f00d::2

Then appropriate rules are added on the colo machine to forward traffic for the routed prefix to the VPN endpoint, so that it reaches the home gateway. Below is a table showing these addresses.

Route            Address
Routed to colo   2001:db8:beef::/48
Routed to home   2001:db8:beef:100::/56
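
As a quick sanity check of the addressing plan, Python's standard `ipaddress` module can confirm that the prefixes nest as intended. The /64 below is the one used on the home LAN later in this post; the variable names are just for illustration.

```python
import ipaddress

colo = ipaddress.ip_network("2001:db8:beef::/48")
home = ipaddress.ip_network("2001:db8:beef:100::/56")
lan = ipaddress.ip_network("2001:db8:beef:101::/64")

# Each delegated prefix must sit inside the one it was carved from.
home_in_colo = home.subnet_of(colo)
lan_in_home = lan.subnet_of(home)

# Every 8 extra prefix bits multiply the number of subnets by 256.
slash56_per_48 = 2 ** (56 - 48)
slash64_per_56 = 2 ** (64 - 56)
```

So the /48 leaves room for 256 sites like this one, and the home /56 leaves room for 256 separate /64 networks.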

To achieve the routing on the incoming path, the ip -6 route utility was used on the colo machine:

# ip -6 route add 2001:db8:beef:100::/56 dev vpn0

This adds an entry to the default IPv6 routing table which sends packets destined for the home gateway to the Wireguard interface. Wireguard takes care of sending the packets to the right peer because the AllowedIPs directive tells it that packets for 2001:db8:beef:100::/56 should go to the peer with the public key of the home gateway. This way routing works without any other protocol to discover which address maps to which peer, avoiding the complexity of running an active routing protocol such as BGP.
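
Conceptually, AllowedIPs acts as a table from destination prefix to peer key, consulted with a longest-prefix match. The Python sketch below models that lookup. It is not how the kernel module implements it; the key strings are placeholders, and the second peer is hypothetical and only there to show the selection.

```python
import ipaddress
from typing import Optional

# Hypothetical peer table; the key strings stand in for real public keys.
ALLOWED_IPS = {
    "home-gateway-pubkey": ["2001:db8:beef:100::/56", "fd00:cafe:f00d::2/128"],
    "laptop-pubkey": ["2001:db8:beef:200::/64"],
}


def select_peer(destination: str) -> Optional[str]:
    """Return the peer whose AllowedIPs entry most specifically matches."""
    addr = ipaddress.ip_address(destination)
    best_peer, best_len = None, -1
    for peer, prefixes in ALLOWED_IPS.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            if addr in net and net.prefixlen > best_len:
                best_peer, best_len = peer, net.prefixlen
    return best_peer
```

With this table, `select_peer("2001:db8:beef:101::42")` picks the home gateway, while an address outside every listed prefix yields `None` and the packet would not be sent to any peer.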

On the home gateway side, a more manual configuration methodology was chosen instead of the wg-quick utility, to allow more control over how the system is configured, allowing some customizations to be made for this specific setup.

[Unit]
Description=VPN service
After=network.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/ip link add dev vpn0 type wireguard # ①
ExecStart=/usr/bin/ip addr add fd00:cafe:f00d::2 peer fd00:cafe:f00d::1 \
    dev vpn0
ExecStart=/usr/bin/wg setconf vpn0 /etc/wireguard/vpn0.conf # ②
ExecStart=/usr/bin/ip link set up vpn0
ExecStart=-/usr/bin/ip -6 route flush table vpn # ③
ExecStart=/usr/bin/ip -6 route add fd00:cafe:f00d::1 dev vpn0 \
    table vpn
ExecStart=/usr/bin/ip -6 route add default via fd00:cafe:f00d::1 \
    dev vpn0 table vpn
ExecStart=/usr/bin/ip -6 route add 2001:db8:beef:101::/64 dev enp1s0 table vpn
ExecStart=/usr/bin/ip -6 rule add from 2001:db8:beef:101::/64 lookup vpn # ④
ExecStart=/usr/bin/ip6tables -t mangle -A FORWARD -o vpn0 -p tcp \
    --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1280 # ⑤
ExecStart=/usr/bin/bash -c "echo 1 >/proc/sys/net/ipv6/conf/all/forwarding" # ⑥

ExecStop=/usr/bin/ip link del vpn0 # ⑦
ExecStop=/usr/bin/ip -6 route flush table vpn
ExecStop=/usr/bin/ip6tables -t mangle -F

[Install]
WantedBy=multi-user.target

① – First start by setting up a new Wireguard interface and assigning our endpoint addresses to the newly created interface.

② – Using the wg tool, configuration parameters are passed to the kernel via netlink, which configures the Wireguard kernel module.

③ – A new routing table10 called vpn was created to keep all our routes related to the VPN in one place. The table is first flushed, then a route to the colo endpoint and a default route via that endpoint are added, along with another route to send traffic destined to local machines back to enp1s0, which is the network interface for the home side of the gateway.

④ – The ip rule command is used to make sure that all traffic originating from our home clients is sent to the VPN endpoint by matching on the source IP and directing that traffic to the appropriate routing table.

⑤ – There was an interesting bug where some TCP sessions were experiencing what seemed to be spurious packet loss. After some digging with wireshark it became apparent that big packets were not being acknowledged, which eventually led to the realization that there was an MTU issue. Clients were setting the MTU themselves, and it was not immediately apparent how to make clients use a lower MTU network-wide without manual reconfiguration. Thus this hack is used to limit the packet size of TCP sessions: the gateway rewrites the Maximum Segment Size (MSS) option in SYN packets to a lower value whilst it is forwarding traffic.

⑥ – Enable IP forwarding, because the whole point of all this is to forward packets :).

⑦ – On the teardown side the unit deletes the network interface, which takes its state with it, then flushes the routing table so there will be no error when the unit is restarted, and flushes the ip6tables mangle table so that the system does not end up with duplicate rules there.
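
To see where the MSS clamp from step ⑤ sits relative to the tunnel, the arithmetic can be written out. This is a back-of-the-envelope sketch that assumes the consumer ISP path carries 1500-byte IPv4 packets, which I have not verified for this link. The header sizes are the standard ones, and the 32 bytes of Wireguard overhead are the 16-byte data message header plus the 16-byte authentication tag.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
TCP_HEADER = 20  # without TCP options
WIREGUARD_OVERHEAD = 32  # 16-byte data header + 16-byte auth tag


def tunnel_mtu(underlay_mtu: int) -> int:
    """Largest inner packet that fits when the tunnel runs over IPv4/UDP."""
    return underlay_mtu - IPV4_HEADER - UDP_HEADER - WIREGUARD_OVERHEAD


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload for an IPv6 packet of the given MTU."""
    return mtu - IPV6_HEADER - TCP_HEADER


# Assumption: the ISP path carries 1500-byte IPv4 packets.
inner_mtu = tunnel_mtu(1500)
largest_safe_mss = tcp_mss(inner_mtu)
```

Under that assumption the tunnel can carry 1440-byte IPv6 packets, which allows an MSS of up to 1380, so the value of 1280 used in the unit leaves some headroom. For comparison, `tcp_mss(1280)` is 1220, the MSS that keeps every packet within the IPv6 minimum MTU.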

Network configuration

Now that the tunnel is up and operational for communicating with the internet, the machines at home must be configured to get their own addresses. In IPv6 there are two ways in which addresses and network settings can be configured. The first is DHCPv6, which is a stateful mechanism where a server assigns each client an address and provides it with configuration parameters such as DNS servers (the default router is still learned from router advertisements). The second mechanism is Stateless Address Autoconfiguration (SLAAC), in which router advertisements are periodically multicast. Clients are then free to pick their own addresses within the advertised prefix.
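
One classic way for a SLAAC client to pick its address is to derive the lower 64 bits from its MAC address (the modified EUI-64 scheme). Many clients use random or privacy addresses instead, so treat the sketch below as one option among several. The function name and the MAC address are made up for illustration.

```python
import ipaddress


def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC address.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return net[int.from_bytes(interface_id, "big")]


address = eui64_address("2001:db8:beef:101::/64", "52:54:00:12:34:56")
```

For the example MAC address this gives 2001:db8:beef:101:5054:ff:fe12:3456, with no server involved in the choice.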

Initially a SLAAC based solution using radvd was deployed to advertise the prefix, router, and DNS servers. This worked very well but one day there was a need to perform a network PXE boot and the only way that seemed possible to do that was to switch to DHCPv6. Luckily now the support for IPv6 in dnsmasq is quite mature and it supports both router advertisements and DHCPv6, so that is the solution which was deployed. The configuration is refreshingly simple:

interface=enp1s0
dhcp-range=2001:db8:beef:101::100,2001:db8:beef:101::ffff,slaac
dhcp-option=option6:dns-server,[2001:db8:123::]
enable-ra # option6:dns-server from DHCPv6 is used for RDNSS.
dhcp-authoritative

Another benefit of DHCPv6 being a stateful system is that it can be used to update DNS records to have records for the connected machines, making it easier to SSH back in to a machine at home, for example.

NAT64

Once this was all set up, user devices were finally able to access websites after connecting to the wifi or ethernet network. However, there is still a small number of websites which are unreachable because they only support IPv4! To resolve this, a NAT64 solution was deployed on the colocation server to allow IPv6 hosts to reach IPv4 addresses.

This works by using a special DNS server implementing DNS64 (BIND9 supports this, and that is what is used here). When a name has no AAAA record, the DNS64 server synthesizes one from the corresponding A record by embedding the IPv4 address in the special prefix 64:ff9b::/96. The NAT64 daemon is then able to use the information encoded in the address to recreate an IPv4 packet and send it along, and to do the corresponding mapping in the reverse direction.

For example, for a host that has no AAAA record but only an A record pointing to 8.8.8.8, the DNS64 server would return a AAAA record for 64:ff9b::808:808.
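
The address mapping itself is just bit manipulation: the 32 bits of the IPv4 address become the last 32 bits of an address inside 64:ff9b::/96. Here is a minimal Python sketch of both directions. The function names are mine, and this only models the address translation, not what BIND9 or Tayga do internally.

```python
import ipaddress

NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def synthesize(ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of the NAT64 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return NAT64_PREFIX[int(v4)]


def extract(ipv6: str) -> ipaddress.IPv4Address:
    """Recover the IPv4 address from a NAT64 address."""
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in NAT64_PREFIX:
        raise ValueError("not a NAT64 address")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


mapped = synthesize("8.8.8.8")
original = extract("64:ff9b::808:808")
```

`mapped` prints as 64:ff9b::808:808 and `original` as 8.8.8.8. Python also accepts the mixed notation, so `ipaddress.IPv6Address("64:ff9b::8.8.8.8")` is the same address as `mapped`.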

Tayga11 is the NAT64 daemon which was chosen, because it seems to be the only working option for a Linux host at this point in time. There was some initial confusion when it chose to silently drop packets larger than 1280 bytes, and this seems to be hardcoded in the source code. However, after some more MTU fiddling, it started working decently. This would be a component to look at implementing in a cleaner and more robust way in the future.

Issues

The main issue that I encountered was end host applications not supporting IPv6. For example, when I installed Android Studio, the initial download of the SDK was not able to proceed for some reason. I did not debug it, but it worked when I connected to another network with IPv4. Another case was a website that I was using to order PCBs online, which redirected me to a web server running directly on an IPv4 address, but I managed to use the site by putting the 64:ff9b:: prefix in front of the address.12

Future work

There is still a lot left to be desired in this setup including but not limited to: monitoring, automatic handling of link failures such as switching to HSPA+ backup, and an integrated single sign-on system. However, some good progress has been made with the core system, and it has been proven to be stable for several months of full time use now.

I would really like to replace Tayga with a custom component which handles larger MTUs and is more configurable in the way it functions (at the moment it maps to a private subnet and that subnet is NAT’ed again via iptables).

On the authentication side, it would be great to be able to use a mechanism like EAP-TLS to mutually authenticate AP and client for WiFi clients, and use 802.1X for wired clients. However it would take some significant time and effort to set that all up, so that could be a topic for the next post :).

With respect to the monitoring aspect, for the moment I am the only human using this infrastructure, but once I add more machines and users, monitoring and automatic fault detection / mitigation will become a higher priority action item.


  1. The router itself is rather unstable and runs some binary blobs which are not easily understandable / customizable, so doing away with it entirely would be a good thing to do. However, due to external constraints that is not possible for now.
  2. There’s an excellent post by Benjojo, The ISPs sharing your DNS query data, which highlights the fact that due to the nature of plain old RFC1035 DNS sending queries in plaintext, anyone able to passively listen to your packets can know which hosts you are communicating with. Even whilst there is important work going on implementing things like encrypted SNI in TLS, the DNS lookup still leaks the hostname you are trying to reach.
  3. On Linux, iptables only configures the firewall subsystem for IPv4 packets. You still need to run the rules through ip6tables, and then have to maintain different policies for ICMPv6 vs ICMPv4, different source address ranges, etc.
  4. Security overview of OpenVPN
  5. Wikipedia’s list of RFCs pertaining to IPSec
  6. There is also the Authentication Header (AH) packet, but we will only look at ESP here since that has stronger security guarantees.
  7. IP XFRM man page
  8. Wireguard website
  9. Wireguard man page
  10. This was done by adding a new line for the vpn table, with its own table number, to /etc/iproute2/rt_tables.
  11. TAYGA - NAT64 for Linux
  12. Fun fact: you can append the bytes of an IPv4 address to an IPv6 prefix using this notation: 64:ff9b::8.8.8.8. Other fun fact: you can omit IPv4 octets that are zero, so 127.0.0.1 can be written as 127.1