How Do Containers Communicate in Kubernetes? A Deep Dive into CNI and Network Namespaces
This article explains the fundamentals of container networking in Kubernetes, covering network namespaces, veth pairs, bridges, CNI plugins such as Flannel and Calico, their routing modes, and practical command‑line examples that illustrate intra‑host and cross‑host communication.
In Kubernetes, network connectivity between containers is essential, and Kubernetes relies on a plug‑in architecture rather than implementing its own container network. The basic principles are that any pod can communicate directly with any other pod across nodes without NAT, nodes can talk to pods, and each pod has an independent network stack shared by its containers.
Container Network Basics
A Linux container’s network stack is isolated in its own network namespace, which includes a network interface, a loopback device, a routing table, and iptables rules. Implementing container networking depends on several Linux networking features:
Network Namespace : isolates independent network protocol stacks.
Veth Pair : a pair of virtual Ethernet devices that connect different network namespaces; traffic sent on one end appears on the other.
Iptables/Netfilter : Netfilter runs in the kernel to apply filtering, modification, or dropping rules; iptables runs in user space to manage Netfilter rule tables.
Bridge : a layer‑2 virtual device similar to a switch that forwards frames based on learned MAC addresses.
Routing : Linux maintains a routing table to decide where IP packets should be sent.
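The routing step above can be illustrated with a toy longest-prefix-match lookup. This is a sketch of the rule the kernel applies, not kernel code; the table entries mirror the container routes shown later in this article:

```python
import ipaddress

# Toy routing table modeled on the container routes shown below:
# each entry is (destination CIDR, next hop); None means directly connected.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "172.17.0.1"),  # default route
    (ipaddress.ip_network("172.17.0.0/16"), None),       # on-link subnet
]

def lookup(dst: str):
    """Return the next hop for dst using longest-prefix match."""
    dst_ip = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in ROUTES if dst_ip in net]
    # The most specific prefix (largest prefix length) wins.
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

print(lookup("172.17.0.2"))  # None -> delivered directly on the local subnet
print(lookup("8.8.8.8"))     # 172.17.0.1 -> sent to the default gateway
```

A destination inside `172.17.0.0/16` matches both entries, but the /16 is more specific than the /0 default, so the packet is delivered directly rather than via the gateway.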
Same‑Host Communication
On a single host, Docker creates a <code>docker0</code> bridge. Containers attached to this bridge communicate via a veth pair: one end resides in the container’s namespace, the other end appears as a virtual interface on the host.
<code>docker run -d --name c1 hub.pri.ibanyu.com/devops/alpine:v3.8 /bin/sh</code> <code>docker exec -it c1 /bin/sh</code>
<code>/ # ifconfig</code>
<code>eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02</code>
<code> inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0</code>
<code>/ # route -n</code>
<code>Destination Gateway Genmask Flags Metric Ref Use Iface</code>
<code>0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0 0 eth0</code>
<code>172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0</code>
The <code>eth0</code> interface inside the container is one end of the veth pair. The host side can be inspected with:
<code>ifconfig</code>
<code>docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500</code>
<code> inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255</code>
<code>veth20b3dac: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500</code>
The host‑side veth device (<code>veth20b3dac</code>) is attached to the <code>docker0</code> bridge. When one container pings another, an ARP broadcast resolves the destination MAC address, the bridge forwards the frame to the matching veth peer, and the packet reaches the target container.
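The bridge’s forwarding behavior can be sketched as a small simulation. This is an illustration of MAC learning and flooding, not Docker’s or the kernel’s actual implementation; the port and MAC names are taken from or modeled on the example above:

```python
# Minimal sketch of how a learning bridge like docker0 forwards frames
# between its veth ports (illustrative only, not kernel code).
class Bridge:
    def __init__(self, ports):
        self.ports = set(ports)  # host-side veth names attached to the bridge
        self.fdb = {}            # forwarding database: MAC -> port, learned

    def forward(self, in_port, src_mac, dst_mac):
        # Learn: the source MAC is reachable via the ingress port.
        self.fdb[src_mac] = in_port
        # Known unicast goes out one port; unknown/broadcast floods the rest.
        if dst_mac in self.fdb:
            return [self.fdb[dst_mac]]
        return sorted(self.ports - {in_port})

br = Bridge(["veth20b3dac", "vethA", "vethB"])
# An ARP broadcast floods to every other port...
print(br.forward("veth20b3dac", "02:42:ac:11:00:02", "ff:ff:ff:ff:ff:ff"))
# ...the reply teaches the bridge where the peer lives...
br.forward("vethA", "02:42:ac:11:00:03", "02:42:ac:11:00:02")
# ...so subsequent traffic to that MAC is forwarded to exactly one port.
print(br.forward("veth20b3dac", "02:42:ac:11:00:02", "02:42:ac:11:00:03"))
```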
Cross‑Host Communication
By default, containers on different hosts cannot reach each other via IP. The Kubernetes Container Network Interface (CNI) provides a standard API for plugging in network solutions such as Flannel, Calico, Weave, and Contiv. CNI plugins create their own bridge (usually <code>cni0</code>) to replace <code>docker0</code>.
CNI supports three implementation modes:
Overlay : uses tunnels (e.g., VXLAN, IPIP) to encapsulate the entire pod network and deliver it across hosts, independent of the underlying network.
Host‑gw (layer‑3 routing) : each node installs routes to other pod CIDRs; works only when nodes share the same L2 network. Implemented by Flannel host‑gw and Calico BGP.
Underlay : pods and hosts are on the same L3 network; no tunnel is needed, but it relies on the underlying network’s capabilities.
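One practical consequence of the overlay mode is reduced pod MTU: each VXLAN‑encapsulated frame carries extra outer headers on the wire, so the pod interface MTU must be smaller than the underlay MTU. A quick arithmetic sketch:

```python
# Why overlay (VXLAN) mode shrinks the pod MTU: the inner Ethernet frame
# is wrapped in outer headers before it goes on the wire.
OUTER_ETHERNET = 14  # outer Ethernet header
OUTER_IPV4     = 20  # outer IPv4 header
OUTER_UDP      = 8   # UDP header (VXLAN rides on UDP)
VXLAN_HEADER   = 8   # VXLAN header carrying the 24-bit VNI

overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
underlay_mtu = 1500
pod_mtu = underlay_mtu - overhead
print(overhead, pod_mtu)  # 50 bytes of overhead -> pod MTU 1450
```

This is why VXLAN devices on a standard 1500‑byte Ethernet underlay are commonly configured with an MTU of 1450. Host‑gw and underlay modes add no encapsulation and avoid this cost.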
Flannel Host‑gw Example
When a pod on <code>node1</code> sends traffic to a pod on <code>node2</code>, the routing rule looks like:
<code>10.244.1.0/24 via 10.168.0.3 dev eth0</code>
This directs packets for the <code>10.244.1.0/24</code> CIDR to the next‑hop IP <code>10.168.0.3</code> (the other node), where the <code>cni0</code> bridge forwards them to the destination pod.
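Host‑gw works only when the next‑hop node is on the same L2 segment, because the route points straight at the other node with no encapsulation. That precondition can be sketched as a subnet‑membership check; the addresses follow the example route above, and the mismatching address is a made‑up counterexample:

```python
import ipaddress

# Host-gw precondition: the next-hop node must be directly reachable,
# i.e. inside the local interface's subnet (no routing hop in between).
def next_hop_is_onlink(local_iface_cidr: str, next_hop: str) -> bool:
    iface = ipaddress.ip_interface(local_iface_cidr)
    return ipaddress.ip_address(next_hop) in iface.network

# node1's eth0, and node2 from the route "10.244.1.0/24 via 10.168.0.3"
print(next_hop_is_onlink("10.168.0.2/24", "10.168.0.3"))  # True -> host-gw OK
print(next_hop_is_onlink("10.168.0.2/24", "10.200.1.3"))  # False -> needs a tunnel
```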
Calico Architecture
Calico CNI plugin : invoked by kubelet to set up the pod’s network namespace.
Felix : runs on each node to program routing rules and FIB entries.
BIRD : a BGP daemon that distributes routing information among nodes.
confd : manages configuration templates.
Calico does not create a bridge; instead, it creates a veth pair for each pod and installs a host‑side route, e.g.:
<code>10.92.77.163 dev cali93a8a799fe1 scope link</code>
When the pod’s IP packet reaches the host, this route sends it to the corresponding veth interface, from which it traverses the underlying network. Calico’s default mode is a node‑to‑node mesh in which every node runs a BGP client that peers with every other node, so the number of BGP sessions grows quadratically with node count. For larger clusters, a Route Reflector (RR) topology is recommended, which sharply reduces the number of sessions.
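The scaling difference between a full mesh and route reflectors is easy to quantify. A full mesh of n nodes needs n·(n−1)/2 BGP sessions, while peering every node with a small set of reflectors grows roughly linearly (the reflector count of 2 below is an illustrative choice):

```python
# BGP session counts: full node-to-node mesh vs. route-reflector topology.
def full_mesh_sessions(n: int) -> int:
    # every pair of nodes peers directly
    return n * (n - 1) // 2

def rr_sessions(n: int, r: int) -> int:
    # each non-reflector node peers with every reflector, and the
    # reflectors form a small full mesh among themselves
    return (n - r) * r + full_mesh_sessions(r)

for n in (10, 100, 1000):
    print(n, full_mesh_sessions(n), rr_sessions(n, 2))
```

At 100 nodes the mesh already needs 4950 sessions versus 197 with two reflectors; at 1000 nodes the gap is 499500 versus 1997.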
In environments where nodes are not on the same L2 segment, Calico can fall back to IPIP overlay mode, adding a tunnel device (<code>tunl0</code>) and a rule such as:
<code>10.92.203.0/24 via 10.100.1.2 dev tunl0</code>
This encapsulates pod traffic in an IPIP tunnel, allowing it to traverse L3 networks before being decapsulated on the destination node.
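The IPIP encapsulation itself is simple: the original pod IP packet becomes the payload of a new IPv4 packet (IP protocol number 4) addressed between the two nodes. A toy sketch of the wrapping, with the checksum left zero and the node addresses taken from the example route:

```python
import struct

# Toy sketch of IPIP encapsulation (not Calico's implementation): wrap the
# inner pod packet in a fresh outer IPv4 header between the two nodes.
def ipip_encap(inner_packet: bytes, src: str, dst: str) -> bytes:
    total_len = 20 + len(inner_packet)  # outer header + payload
    ver_ihl = (4 << 4) | 5              # IPv4, header length 5 * 4 bytes
    header = struct.pack(
        "!BBHHHBBH4s4s",
        ver_ihl, 0, total_len, 0, 0,
        64,                             # TTL
        4,                              # protocol 4 = IP-in-IP
        0,                              # checksum left 0 in this sketch
        bytes(map(int, src.split("."))),
        bytes(map(int, dst.split("."))),
    )
    return header + inner_packet

inner = b"\x45\x00" + b"\x00" * 18 + b"pod payload"
outer = ipip_encap(inner, "10.100.1.1", "10.100.1.2")
print(len(outer) - len(inner))  # 20 bytes of outer IPv4 header
```

Note the overhead is only the 20‑byte outer IP header, versus 50 bytes for VXLAN, since IPIP carries no outer UDP, VXLAN, or inner Ethernet framing.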
Choosing a networking solution depends on the deployment scenario: public clouds often use the cloud provider’s CNI or Flannel host‑gw for simplicity, while private data‑center environments may benefit from Calico’s BGP‑based routing for better performance and flexibility.
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes original technical articles. We focus on operations transformation and aim to accompany you throughout your operations career.