Iptables REDIRECT vs. DNAT vs. TPROXY

As I get closer to the task itself (extracting the transparent proxy support from iptables so that it is available from nftables as well), I keep meeting different solutions that serve similar purposes, and the differences between them are not trivial.

What they have in common is that we do not want the clients to connect directly to a service (server), but to have a network entity (node) in between.

The following figure of my test topology will be useful for discussing the differences.

         +---+ .1           .2 +---+
 (proxy) | X |-----------------| Z | (server)
         +---+   10.0.4.0/24   +---+
           | .101
           |
           | 10.0.3.0/24
           |
           | .1
         +---+
         | Y | (client)
         +---+

Now let's look at the different possibilities, grouped by the iptables target that makes them available.

There are some solutions for redirecting traffic with the help of the Linux kernel and iptables. These are DNAT, REDIRECT and TPROXY.

DNAT

This target in the iptables nat table provides destination NAT.

This is a well-known concept; if you are familiar with basic networking, you have probably met it. Low-cost home routers usually call it port forwarding.

What it does is change the destination address (and optionally the destination port) to given values before the routing decision is made, so that the routing decision is based on the new parameters. An important point here is that it actually modifies the IP (and TCP) header and requires connection tracking to work, because the reply packets have to be matched and translated back.
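To make the header rewriting and the role of connection tracking more tangible, here is a toy model in Python. It is only a sketch of the idea, not how netfilter is implemented; the names Packet and ToyDnat are made up for this illustration, and the addresses are taken from the figure above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: tuple  # (ip, port)
    dst: tuple  # (ip, port)


class ToyDnat:
    """Toy model of one DNAT rule plus the connection tracking it relies on."""

    def __init__(self, match_port, new_dst):
        self.match_port = match_port
        self.new_dst = new_dst
        self.conntrack = {}  # (client, new destination) -> original destination

    def prerouting(self, pkt):
        """Rewrite the destination before routing and remember the mapping."""
        if pkt.dst[1] != self.match_port:
            return pkt
        self.conntrack[(pkt.src, self.new_dst)] = pkt.dst
        return replace(pkt, dst=self.new_dst)

    def reply(self, pkt):
        """Put the original destination back as the source of a reply."""
        original_dst = self.conntrack.get((pkt.dst, pkt.src))
        if original_dst is None:
            return pkt
        return replace(pkt, src=original_dst)


nat = ToyDnat(match_port=80, new_dst=("10.0.4.2", 80))
request = Packet(src=("10.0.3.1", 41088), dst=("10.0.3.101", 80))
forwarded = nat.prerouting(request)   # what Z receives
answer = nat.reply(Packet(src=forwarded.dst, dst=forwarded.src))  # what Y receives
print(forwarded)
print(answer)
```

Without the conntrack entry, the reply would leave X with the address of Z as its source, and Y would not recognize it as belonging to its connection.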

See an example use-case

Regarding the figure above, we might want to make node Z reachable from the internet without giving it a public IP address. A reason for this can be the lack of sufficient addresses or security considerations (although NAT is not considered to be a security solution, as far as I know). The result is a service that runs in a private network and is accessible through a public gateway.

Let's see an example with iptables:

[X]$ iptables --table nat --append PREROUTING --protocol tcp --dport 80 --jump DNAT --to-destination 10.0.4.2

This command causes every incoming TCP packet that arrives at X for port 80 to be forwarded towards Z with a modified IP header.

Note that this solution needs IP forwarding to be enabled in the kernel, as actual routing is done.
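On Linux the IPv4 forwarding switch is exposed as /proc/sys/net/ipv4/ip_forward (it can be turned on with sysctl -w net.ipv4.ip_forward=1). A small Python check could look like the sketch below; the helper name forwarding_enabled is my own, and the path parameter exists only so the function can be tried on an ordinary file.

```python
from pathlib import Path

# Standard procfs location of the IPv4 forwarding switch on Linux.
IP_FORWARD = Path("/proc/sys/net/ipv4/ip_forward")


def forwarding_enabled(path=IP_FORWARD):
    """Return True if the file at `path` contains 1, i.e. forwarding is on."""
    return Path(path).read_text().strip() == "1"


# Only query the real switch where procfs provides it.
if IP_FORWARD.exists():
    print("IPv4 forwarding enabled:", forwarding_enabled())
```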

Who knows who?

In this scenario, Z knows that the sender of the request is Y, and thinks that Y sent the packet directly to it. Z will therefore address its response packets to Y. Connection tracking is necessary to translate the source address of the response from that of Z back to that of X.

Y, on the other hand, does not know that it is communicating with Z; it believes that X receives and replies to its request.

Who configures this solution?

When DNAT is used, it is configured by the administrator of the service, who wants to hide Z behind X for whatever reason.

REDIRECT

This iptables target cannot be associated with a well-known networking solution. It is like a special DNAT rule where the new destination address is always the IP address of the receiving interface.

Here, incoming packets matching the rule have their destination address changed to the receiving interface’s address and optionally their destination port changed to a specific or a random port (depending on the command). Similar to DNAT, the IP header is modified, and so is the transport layer header when the port changes.
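The rewrite itself is simple enough to write down as a function. The following Python snippet is only a toy model of what happens to the destination of a matching packet; the function name redirect and its parameters are invented for this sketch.

```python
def redirect(dst, in_iface_addr, to_port=None):
    """Toy REDIRECT: swap the destination IP for the receiving interface's
    address and, if `to_port` is given, replace the destination port too."""
    _, port = dst
    return (in_iface_addr, port if to_port is None else to_port)


# A packet for port 80 arrives on X's client-facing interface (10.0.3.101).
print(redirect(("10.0.3.101", 80), "10.0.3.101", to_port=8080))
# A packet that was on its way to Z through X ends up at X itself.
print(redirect(("10.0.4.2", 80), "10.0.3.101", to_port=8080))
```

In both cases the packet is delivered to port 8080 on X, whatever address it was originally sent to.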

See an example use-case

For instance, you have a Flask server listening on port 8080, but the standard HTTP port is 80, so requests arrive at that port. The following iptables rule will redirect all TCP packets with a destination port of 80 to port 8080.

[X]$ iptables --table nat --append PREROUTING --protocol tcp --dport 80 --jump REDIRECT --to-ports 8080

What is its benefit over DNAT? When I want to redirect traffic on the local host, DNAT needs the destination address to be spelled out, which makes it hard to maintain if the interface addresses can change. REDIRECT does not need a specific IP address to work, so it is more flexible.

Note that using REDIRECT leaves node Z untouched, so the service has to run on X.

Who knows who?

In this scenario Y does not necessarily know who it is communicating with, X knows Y and Z is not part of the communication at all.

Who configures this solution?

REDIRECT is also configured by the administrator of the service; the users know nothing about it.

TPROXY

This solution is different from the other two in several aspects.

First, let’s see what a proxy is in general. Proxies are nodes or pieces of software that stand between the client and the service. The client connects to the proxy, which then connects to the server through a distinct connection. This method can be used for various purposes, such as hiding the client's identity from the server, activity logging, or response caching.

The first significant difference can be spotted here. While DNAT and REDIRECT have everything done in the kernel, being a proxy means running specific software that does the task; the kernel “only” needs to support it.

A proxy can be transparent or non-transparent.

An example of a non-transparent proxy is when you set the address and other data of the proxy server in your web browser or any other client. In this case you know about the proxy service and you explicitly configure your device to use it.

On the other hand, a transparent proxy is invisible to the client. No configuration is needed on the OS of the client; instead, the network is configured to steer the traffic through the proxy.

See an example use-case

Establishing a transparent proxy is a bit more difficult than the other two solutions. There is documentation available here. A detailed description of what happens is out of scope for this post (I will probably write a separate one about it later).

However, I’d like to describe what can be seen at different points of the network without any config, after point 1, and after point 2. As this is my main interest, it will be a bit more detailed than the sections before.

To test this functionality I used this program.

Without any config

After the tcprdr program is compiled, the following commands should be run on the different nodes (the command on Y should be run last).

[Z]$ nc --listen --local-port=80

[X]$ ./tcprdr 50080 10.0.4.2 80

[Y]$ telnet 10.0.3.101 50080

This solution does not require kernel support. Without the -t or -T flags, tcprdr does not set the IP_TRANSPARENT option on any of the sockets, so it basically copies bytes from one socket to another.

So far this is nothing special; if you run ss (the successor of netstat) on X, you see the following:

[X]$ ss --tcp --numeric --processes
State  Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB  0       0       10.0.3.101:50080     10.0.3.1:41088     users:(("tcprdr",pid=460,fd=4))
ESTAB  0       0       10.0.4.1:46190       10.0.4.2:80        users:(("tcprdr",pid=460,fd=5))

So tcprdr copies bytes from the first socket to the second and vice versa.
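The byte copying can be sketched in a few lines of Python. This is not tcprdr's code, only a minimal relay in the same spirit that handles a single connection; the function names pipe and relay_once are my own.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src reports end of stream."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    try:
        dst.shutdown(socket.SHUT_WR)  # pass the end of stream along
    except OSError:
        pass  # the other side is already gone


def relay_once(listener, target):
    """Accept one client on `listener` and relay it to `target` (host, port)."""
    client, _ = listener.accept()
    upstream = socket.create_connection(target)
    # One thread per direction: server -> client here, client -> server below.
    back = threading.Thread(target=pipe, args=(upstream, client))
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()
```

On X, the equivalent of the tcprdr command above would be a listening socket bound to port 50080 and a call like relay_once(listener, ("10.0.4.2", 80)). Both sockets use addresses of X itself, which is exactly what the ss output shows.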

With policy routing (point 1.)

[Z]$ nc --listen --local-port=80
# A static route back to Y is also necessary (described later).
[Z]$ ip route add 10.0.3.1 via 10.0.4.1

[X]$ ./tcprdr -T 50080 10.0.4.2 80

[Y]$ telnet 10.0.3.101 50080

Adding the -T flag to tcprdr enables the IP_TRANSPARENT option on the outgoing socket. The output of ss is the following:

[X]$ ss --tcp --numeric --processes
State  Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB  0       0       10.0.3.101:50080     10.0.3.1:41144     users:(("tcprdr",pid=533,fd=4))
ESTAB  0       0       10.0.3.1:41144       10.0.4.2:80        users:(("tcprdr",pid=533,fd=5))

This is still without the TPROXY support of the kernel.

The difference here is that X uses the IP and TCP source parameters from Y as its own, so Z receives packets with the source address of Y. The static route on Z is necessary for Z to be able to respond to these packets.

The policy routing makes it possible for tcprdr to handle the packets that come back in response. This may be described in a separate post later.
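Judging from the ss output, the outgoing socket is bound to the address and port of Y before it connects to Z. Below is a Python sketch of how an application can do that. It is my reading of the mechanism rather than tcprdr's actual code; the function names are invented, setting the option requires elevated privileges (CAP_NET_ADMIN), and 19 is the Linux value of IP_TRANSPARENT, used as a fallback in case the Python build does not export the constant.

```python
import socket

# Linux value of IP_TRANSPARENT from <linux/in.h>, used as a fallback when
# this Python build does not export the constant.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def make_transparent(sock):
    """Allow `sock` to use addresses that are not configured on this host."""
    sock.setsockopt(socket.SOL_IP, IP_TRANSPARENT, 1)


def connect_as(client_addr, server_addr):
    """Connect to server_addr with the non-local client_addr as the source."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        make_transparent(sock)
        sock.bind(client_addr)     # e.g. ("10.0.3.1", 41144), the address of Y
        sock.connect(server_addr)  # e.g. ("10.0.4.2", 80), the address of Z
    except OSError:
        sock.close()
        raise
    return sock
```

Without the option, the bind call fails, because 10.0.3.1 is not an address of X. With it, the connection only completes if the replies from Z are routed back through X and delivered locally, which is what the static route and the policy routing are for.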

With TPROXY support (point 2.)

Adding the iptables rule makes it possible for the proxy application (tcprdr in our case) to receive packets whose destination port differs from the one the listening socket is bound to. Application-level support is also necessary: the -t flag sets the IP_TRANSPARENT option on the listening socket. This makes the following scenario possible.

[Z]$ nc --listen --local-port=80

[X]$ ./tcprdr -t -T 50080 10.0.4.2 80

[Y]$ telnet 10.0.3.101 80

The sockets on X are the following now:

[X]$ ss --tcp --numeric --processes
State  Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB  0       0       10.0.3.101:80        10.0.3.1:33104     users:(("tcprdr",pid=634,fd=4))
ESTAB  0       0       10.0.3.1:33104       10.0.4.2:80        users:(("tcprdr",pid=634,fd=5))

[X]$ ss --tcp --numeric --processes --listening
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       20      0.0.0.0:50080        0.0.0.0:*          users:(("tcprdr",pid=560,fd=3))

As the example shows, X receives packets destined to a port that it is not listening on. The TPROXY target makes this possible.
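This post does not list the rules themselves. Assuming the setup follows the recipe in the kernel's TPROXY documentation (and that point 1 and point 2 refer to its first two sections), the commands can be collected as in the Python sketch below. It only builds and prints the command lines and does not run anything; the helper name tproxy_setup is my own, and the mark 1 and routing table 100 are the values used in that documentation.

```python
def tproxy_setup(dport, on_port, mark=1, table=100):
    """Return the assumed setup commands as argv lists (nothing is executed)."""
    mark_s, table_s = str(mark), str(table)
    ipt = ["iptables", "-t", "mangle"]
    return [
        # Assumed point 1: mark packets that belong to an existing local
        # socket and deliver marked packets locally via policy routing.
        ipt + ["-N", "DIVERT"],
        ipt + ["-A", "PREROUTING", "-p", "tcp", "-m", "socket", "-j", "DIVERT"],
        ipt + ["-A", "DIVERT", "-j", "MARK", "--set-mark", mark_s],
        ipt + ["-A", "DIVERT", "-j", "ACCEPT"],
        ["ip", "rule", "add", "fwmark", mark_s, "lookup", table_s],
        ["ip", "route", "add", "local", "0.0.0.0/0", "dev", "lo", "table", table_s],
        # Assumed point 2: hand new connections for `dport` to the proxy
        # listening on `on_port`, without touching the packet headers.
        ipt + ["-A", "PREROUTING", "-p", "tcp", "--dport", str(dport),
               "-j", "TPROXY", "--tproxy-mark", f"{mark:#x}/{mark:#x}",
               "--on-port", str(on_port)],
    ]


for argv in tproxy_setup(dport=80, on_port=50080):
    print(" ".join(argv))
```

With dport=80 and on_port=50080 this matches the scenario above: Y dials port 80, and the connection is accepted by the socket listening on port 50080.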

Why is it different from REDIRECT? Because TPROXY does not modify the IP or transport layer header; it only passes the packet to the proxy's socket without any modification. It also does not require connection tracking, as the local port of the connection socket will be the original destination port.
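A practical consequence for the proxy application is that it can read the original destination directly from the accepted socket. A minimal Python sketch (the helper name original_destination is made up for this example):

```python
import socket


def original_destination(conn):
    """Address the client dialed, as seen on an accepted connection.

    With TPROXY the headers are untouched, so the local address of the
    accepted socket is the address and port the client actually targeted.
    """
    return conn.getsockname()
```

In the scenario above, this would return port 80 for the accepted connection, although the listening socket is bound to port 50080. With REDIRECT the local address is already the rewritten one, so a proxy would have to ask connection tracking for the original destination instead (as far as I know through the SO_ORIGINAL_DST socket option).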

The benefit of this function over the first two is that no exact port matching is necessary, so the users do not have to explicitly send SSH traffic to port 50080 to reach the service, just to mention an example.

There is also a fourth scenario, when only the -t flag is set; in that case X uses its own address to communicate with Z.

To sum this up a bit: the IP_TRANSPARENT socket option makes it possible to assign an IP address to a socket regardless of whether that address is assigned to any of the network interfaces on our machine. None of these scenarios requires connection tracking or the IP forwarding option to be set in the kernel, because the packets are not forwarded; their payload is only copied from one socket to another.

To decide which socket should be made transparent, the circumstances have to be known. For now it is enough to know that both are possible, and that the iptables support is only necessary for the transparent listening socket to work.

Who knows who?

The answer differs between the described scenarios.

In the first case, both Y and Z know X but not each other.

In the second one, Y knows X and Z knows Y but not X, as the source address of the socket between X and Z is that of Y.

From this point of view, the third option is the same as the second one, only the destination port from Y can vary.

In the fourth scenario, we have the same knowledge as in the first, only the destination port from Y can vary.

Who configures this solution?

There is another difference from the other solutions here. A proxy is something that the service provider of the client configures to track and/or improve the internet access of the client.

Written on May 2, 2018