The Ascendance of nftables

The Sun sets on iptables (image by fdecomite, CC BY 2.0)

iptables is the default Linux firewall and packet manipulation tool. If you’ve ever been responsible for a Linux machine (aside from an Android phone perhaps) then you’ve had to touch iptables. It works, but that’s about the best thing anyone can say about it.

At Red Hat we’ve been working hard to replace iptables with its successor, nftables, which has actually been around for years but for various reasons was unable to completely replace iptables. Until now.

What’s Wrong With iptables?

iptables is slow. It processes rules linearly, which was fine in the days of 10/100 Mbit Ethernet. But we can do better, and nftables does: it uses maps and concatenations to touch packets as little as possible for a given action.
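To make the difference concrete, here is a toy Python model of linear rule matching versus a single lookup in a map keyed on a concatenation of address and port. It is only a sketch: the rule data and function names are made up for illustration, and real nftables lookups happen in the kernel, not like this.

```python
# Toy model only: real nftables lookups happen in the kernel, not in Python.
# The addresses, ports, and function names below are illustrative assumptions.
from typing import Optional, Tuple

# A linear, iptables-style ruleset: each rule is checked in order.
LINEAR_RULES = [
    ("192.0.2.10", 22, "accept"),
    ("192.0.2.10", 80, "accept"),
    ("192.0.2.20", 443, "accept"),
    ("198.51.100.7", 25, "drop"),
]

# A map keyed on the concatenation (address, port), in the spirit of an
# nftables verdict map: one lookup yields the verdict.
VERDICT_MAP = {(addr, port): verdict for addr, port, verdict in LINEAR_RULES}


def linear_verdict(addr: str, port: int) -> Tuple[Optional[str], int]:
    """Return (verdict, number of rules examined) for a linear scan."""
    for examined, (rule_addr, rule_port, verdict) in enumerate(LINEAR_RULES, start=1):
        if rule_addr == addr and rule_port == port:
            return verdict, examined
    return None, len(LINEAR_RULES)


def map_verdict(addr: str, port: int) -> Optional[str]:
    """Return the verdict with a single keyed lookup, or None if no entry."""
    return VERDICT_MAP.get((addr, port))


if __name__ == "__main__":
    print(linear_verdict("198.51.100.7", 25))
    print(map_verdict("198.51.100.7", 25))
```

In the linear version the cost grows with the position of the matching rule; in the map version it does not depend on how many entries exist.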

Most of nftables’ intelligence is in the userland tools rather than the kernel, reducing the possibility of downtime due to kernel bugs. iptables puts most of its logic in the kernel and you can guess where that leads.

When adding or updating even a single rule, iptables must read the entire existing table from the kernel, make the change, and send the whole thing back. iptables also requires locking workarounds to prevent parallel processes from stomping on each other or returning errors. Updating an entire table requires some synchronization across all CPUs, meaning the more CPUs you have, the longer it takes. These issues cause problems in container orchestration systems (like OpenShift and Kubernetes) where 100,000 rules and 15-second iptables-restore runs are not uncommon. nftables can update one or many rules without touching any of the others.
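A back-of-the-envelope way to see why this hurts at scale is to count how many rules cross the userland/kernel boundary per update. The Python sketch below is a toy cost model with hypothetical function names; it is not a measurement of real iptables or nftables behaviour and ignores locking and CPU synchronization entirely.

```python
# Toy cost model: counts rules moved between userland and the kernel.
# Function names are hypothetical; nothing here measures real tools.


def whole_table_update(existing_rules: int, new_rules: int = 1) -> int:
    """iptables-style: read every rule out, then write every rule back."""
    read_out = existing_rules
    written_back = existing_rules + new_rules
    return read_out + written_back


def incremental_update(existing_rules: int, new_rules: int = 1) -> int:
    """nftables-style: only the changed rules are sent."""
    return new_rules


if __name__ == "__main__":
    for size in (100, 10_000, 100_000):
        print(size, whole_table_update(size), incremental_update(size))
```

With 100,000 existing rules, adding one rule moves roughly 200,000 rules in the whole-table model and one rule in the incremental model.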

iptables requires duplicate rules for IPv4 and IPv6 packets and for multiple actions, which just makes the performance and maintenance problems worse. nftables allows the same rule to apply to both IPv4 and IPv6 and supports multiple actions in the same rule, keeping your ruleset small and simple.

If you’ve ever had to log or debug iptables, you know how awful that can be. nftables allows logging and other actions in the same rule, saving you time, effort, and cirrhosis of the liver. It also provides the “nft monitor trace” command to watch how rules apply to live packets.

nftables also uses the same netlink API infrastructure as other modern kernel systems like /sbin/ip, the Wi-Fi stack, and others, so it’s easier to use in other programs without resorting to command-line parsing and execing random binaries.

Finally, nftables has integrated set support with consistent syntax rather than requiring a separate tool like ipset.

What about eBPF?

You might have heard that eBPF will replace everything and give everyone a unicorn. It might, if/when it gets enhancements for accountability, traceability, debuggability, auditability, and broad driver support for XDP. But nftables has been around for years and has most (all?) of these things today.

nftables Everywhere

I’d like to highlight the great work by members of my team to bring nftables over the finish line:

  • Phil Sutter is almost done with compat versions of arptables and ebtables and has been adding test cases everywhere. He also added a JSON interface to libnftables (much like /sbin/ip) for easier programmatic use, which firewalld will use in the near future.
  • Eric Garver updated firewalld (the default firewall manager on Fedora, RHEL, and other distros) to use nftables by default. This change alone will seamlessly flip the nftables switch for countless users. It’s a huge deal.
  • Florian Westphal figured out how to make nftables and iptables NAT coexist in the kernel. He also fixed up the iptables compat commands and handles the upstream releases to make sure we can actually use this stuff.
  • And of course the upstream netfilter community!
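To give a feel for what the JSON interface makes possible, here is a small Python sketch that builds a single rule as a JSON document. It uses the "inet" family so one rule covers both IPv4 and IPv6, and it logs and accepts in the same rule. The key names follow my reading of the libnftables JSON schema and should be checked against the libnftables-json documentation for your version; the table name "filter", the chain name "input", the log prefix, and the helper function are all assumptions for illustration.

```python
# Sketch of a libnftables-style JSON payload built in Python.
# Assumptions: an "inet" table named "filter" with a chain named "input"
# already exists; key names should be verified against libnftables-json(5).
import json


def build_log_and_accept_rule(port: int) -> dict:
    """Build a JSON-ready dict that adds one rule: match a TCP port, log, accept."""
    return {
        "nftables": [
            {
                "add": {
                    "rule": {
                        "family": "inet",  # one rule for both IPv4 and IPv6
                        "table": "filter",
                        "chain": "input",
                        "expr": [
                            {
                                "match": {
                                    "op": "==",
                                    "left": {
                                        "payload": {"protocol": "tcp", "field": "dport"}
                                    },
                                    "right": port,
                                }
                            },
                            # Two actions in the same rule: log, then accept.
                            {"log": {"prefix": "ssh-in: "}},
                            {"accept": None},
                        ],
                    }
                }
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_log_and_accept_rule(22), indent=2))
```

A program can build and inspect a structure like this directly instead of assembling and parsing command-line strings.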

Thanks, iptables; it’s been a nice ride. But nftables is better.