Nicolas Le Manchet

Docker userland proxy

I was trying to make my task queue based on Redis more resilient to network errors. Both my application and Redis use TCP keepalive to detect disconnects on long lived connections.

To do that, I was simulating network partitions by dropping traffic to/from the container Redis runs in. This is easily done by entering the Redis process network namespace and adding an iptables rule:

nsenter -t <Redis PID> --net iptables -A INPUT -j DROP

This actually worked and made my application and Redis unable to talk with each other, but not detect the disconnect if no traffic was happening (as TCP keepalive should). The reason is related to how Docker publishes ports to the host.

When publishing a port from a docker container using the default bridged network, the containerized process does not actually bind on the host. The bind is done by a Docker process called the Docker userland proxy which runs on the host as root.

When an application connects to the containerized Redis, it in fact connects to another process which forwards packets to Redis. Thus my application did not detect the disconnect because the TCP keepalive probes between it and the userland proxy were successful, only the traffic between the proxy and Redis was dropped.