How do you restrict network traffic between namespaces in a Kubernetes cluster? Whether you’re running a multi-tenant cluster with strict isolation guarantees or simply want to introduce a layer of control, locking down namespaces within a cluster is a common desire for Kubernetes operators.
In this guide, we’ll show you how to prevent traffic between namespaces using Linkerd’s traffic policies. We’ll accomplish this in a way that uses Linkerd’s secure service identities, is fully zero-trust compatible, and scales automatically as new workloads are introduced to the cluster.
The traffic policy (or “server authorization”) feature introduced in Linkerd 2.11 gives Kubernetes users fine-grained control over which types of traffic are permitted within a cluster. Like Kubernetes’s built-in NetworkPolicy support, Linkerd’s traffic policies allow you to do things like restrict network connections between services according to a set of rules. Unlike NetworkPolicies, Linkerd’s traffic policies are built on the secure workload identities provided by Linkerd’s automatic mTLS feature, which means that not only do you get authenticity, confidentiality, and integrity for your communication, but you can also build authorization policy on top of that same workload identity.
Finally, since Linkerd operates at Layer 7 in the OSI model, its traffic policies give you a lot more expressivity than NetworkPolicies, especially around how you identify workloads and namespaces—something we’ll take full advantage of in this article.
Let’s go through a quick overview of how Linkerd’s traffic policies work. As of Linkerd 2.11, traffic policy is inbound policy, meaning it governs the behavior of traffic when it is received by meshed pods. It does not govern traffic when it is sent by meshed pods—that’s coming in a later release—nor does it govern traffic to unmeshed pods. (Linkerd can only control the workloads running on its data plane, after all.)
In Linkerd 2.11, we have three basic tools for policy configuration: default policies, which can be set cluster-wide or per namespace, workload, or pod via annotation; the Server resource, which describes a port on a set of pods; and the ServerAuthorization resource, which describes which clients are allowed to send traffic to a given Server.
Together, these tools allow you to specify a wide range of behaviors, from “everything is allowed” to “port 8080 on service Foo is only allowed to receive mTLS traffic from services that use the Bar service account”, to lots more. In this guide, we’re only going to focus on a subset of these features—just enough to isolate namespaces. (For the full set of policy features, be sure to read the complete policy docs.)
In this article, our goal is to use Linkerd’s traffic policies to deny traffic between namespaces while allowing traffic within a namespace. Along the way, we’ll also make some exceptions for traffic that must be allowed across namespaces, such as health checks and metrics scrapes, as shown in the diagram below.
The way to accomplish this with Linkerd will be a set of policy configuration objects per namespace. For a given namespace, our configuration will be: an annotation setting the default policy to deny; a Server for each application port, plus one for the proxy’s admin port; and two ServerAuthorizations: one allowing all traffic to the admin port (so health checks and metrics scrapes still work), and one allowing traffic to the application ports only from within the namespace.
Total config required: one annotation, one Server CR per port plus one for the exceptions, and two ServerAuthorization CRs. Not too bad!
Let’s look at a concrete example with our favorite application, Emojivoto.
We’re going to assume you have Linkerd installed—see our Getting Started guide if you haven’t done that already. Install the Emojivoto application and add it to Linkerd in one fell swoop:
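Assuming the linkerd CLI is installed and pointed at your cluster, that one-liner might look like this (the manifest URL is the one used in the official Emojivoto guide):

```shell
# Fetch the Emojivoto manifest, inject the Linkerd proxy into each
# workload, and apply the result to the cluster.
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
  | linkerd inject - \
  | kubectl apply -f -
```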
Optionally, you can set up a port forward to the app’s web server, at which point you can visit http://localhost:8080 and see Emojivoto in all its glory.
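The port forward might look like the following, assuming the standard Emojivoto manifest, which exposes the frontend as the web-svc service on port 80:

```shell
# Forward local port 8080 to the web frontend in the emojivoto namespace.
kubectl -n emojivoto port-forward svc/web-svc 8080:80
```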
At this point, Emojivoto is running, but traffic is unrestricted by default. If we attach our Linkerd cluster to Buoyant Cloud and look at, say, the emoji service, we’ll see that traffic to emoji is unrestricted:
The emoji service, like all other services in this namespace, is using the all-unauthenticated default policy (Linkerd’s default, er, default policy), and thus all traffic to all ports is currently allowed. Let’s start locking things down. First, we’ll annotate the namespace with a default policy that denies all traffic.
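Using the config.linkerd.io/default-inbound-policy annotation that Linkerd 2.11 reads, that step is a sketch like this:

```shell
# Set the namespace-wide default inbound policy to deny.
# Pods only pick this up when they are (re)created.
kubectl annotate namespace emojivoto \
  config.linkerd.io/default-inbound-policy=deny
```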
Now the default policy for the namespace is deny. Note this doesn’t actually change anything for existing pods—they need to restart before they read the annotation. However, at this point, any new pods will actually be unable to start, since this policy also prevents liveness and readiness checks from completing. So before we get into application pods, let’s fix that.
We’ll do that by creating an exception to the deny rule. We’ll make a Server called admin that matches Linkerd’s admin port on every pod, as well as a corresponding ServerAuthorization called admin-everyone that allows all traffic to that port.
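Those two resources might look like the following sketch. On injected pods, the proxy’s admin port (4191) is named linkerd-admin, and the empty podSelector matches every pod in the namespace:

```yaml
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: admin
spec:
  podSelector:
    matchLabels: {}      # every meshed pod in the namespace
  port: linkerd-admin    # the proxy's admin port (4191)
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: emojivoto
  name: admin-everyone
spec:
  server:
    name: admin
  client:
    unauthenticated: true  # allow all clients, including unmeshed ones
```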
Conveniently, Linkerd’s admin port doesn’t just handle health checks for the pod, it’s also where the metrics data is provided. This opens up the pods for observability (e.g. via the linkerd-viz extension, or through Buoyant Cloud) as well.
We can use the linkerd authz command to ensure these policies are in place for our workloads:
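For example, against the emoji deployment (output omitted here, since it depends on your cluster):

```shell
# List the Server/ServerAuthorization pairs that apply to the emoji workload.
linkerd authz -n emojivoto deploy/emoji
```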
Similarly, if we look at our Buoyant Cloud dashboard for the emoji service, we now see the admin and admin-everyone policy components, as well as the (correct) all-unauthenticated default policy—remember again that we need to restart the pod for the new default policy to take effect.
At this point, new pods can be created since health, readiness, and liveness checks are all unrestricted. Phew!
Before we restart our pods to apply our deny default policy, however, we need to explicitly authorize the actual application traffic.
To do that, per our plan above, we’re going to add one Server per application port. For the Emojivoto app, there are only two such ports: a gRPC port used by the emoji and voting pods, and an http port used by the web workload. Let’s create some Servers, which we’ll just call grpc and http, to capture them:
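A sketch of those two Servers follows. We rely on the port names (grpc and http) from the Emojivoto manifest, and we add a label of our own choosing, emojivoto/api: app, so that a single ServerAuthorization can select both Servers:

```yaml
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: grpc
  labels:
    emojivoto/api: app   # our own label, used to select both Servers at once
spec:
  podSelector:
    matchLabels: {}      # any pod in the namespace with a port named "grpc"
  port: grpc
  proxyProtocol: gRPC
---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: http
  labels:
    emojivoto/api: app
spec:
  podSelector:
    matchLabels: {}      # any pod in the namespace with a port named "http"
  port: http
  proxyProtocol: HTTP/1
```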
Note that we’ve used a global match on the pod selector, so any workload that’s added to the namespace later and uses one of these ports will automatically be included in this rule. (If we wanted, we could be more restrictive and lock these policies down to specific workloads by changing the spec section.)
Finally, with those Servers defined, we’ll create a single ServerAuthorization called ns-only that allows access to them from meshed workloads in the same namespace:
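A sketch of that ServerAuthorization, assuming the grpc and http Servers share a label such as emojivoto/api: app (a label of our own choosing, not part of the Emojivoto manifest):

```yaml
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: emojivoto
  name: ns-only
spec:
  server:
    selector:
      matchLabels:
        emojivoto/api: app   # selects the grpc and http Servers
  client:
    meshTLS:
      identities:
        # any service account in the emojivoto namespace
        - "*.emojivoto.serviceaccount.identity.linkerd.cluster.local"
```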
Here, the client section is the crucial one: it allows incoming traffic only from meshed clients whose identity ends in emojivoto.serviceaccount.identity.linkerd.cluster.local, which matches exactly the identities in the current namespace.
Our handy linkerd authz command tells us that these Server and ServerAuthorization pairs are active for our workloads:
With all that in place, we’re finally ready to roll all our pods and pick up the deny default annotation. At that point, our jail is complete.
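Rolling the pods is a one-liner:

```shell
# Restart every deployment so new pods pick up the deny default policy.
kubectl -n emojivoto rollout restart deploy
```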
Let’s verify our policy by taking one last look at our Buoyant Cloud policy analysis page, which now looks like this:
As you can see, we’ve allowed all traffic to port 4191 (via admin and admin-everyone); traffic to port 8080 is allowed only if it is mTLS’d from an identity in the emojivoto namespace (via grpc and ns-only); and all other traffic is denied (via default:deny). If we were to look at the web service (which uses http) rather than the emoji service, we’d see a similar thing, but with our http Server rather than the grpc Server active.
Finally, we can verify that traffic is still working by looking at the logs for the vote-bot workload:
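For example:

```shell
# Tail the vote-bot container's logs; steady voting output (and no
# connection errors) means same-namespace traffic is still authorized.
kubectl -n emojivoto logs deploy/vote-bot -c vote-bot --tail=20
```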
And that’s it! Congratulations, you’ve put Emojivoto in namespace jail!
Seeing the policy analysis is one thing, but actually seeing a failure is another. Let’s check that no traffic is allowed from outside the namespace by creating a different Emojivoto installation in a new namespace, and pointing its vote-bot deployment to the original Emojivoto. If we’ve set our policy up correctly, this new vote-bot should have its traffic denied. (The other components in the emojivoto-2 namespace will also be installed, but we’ll just be ignoring them.)
We’ll do this with a little sed magic:
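One way to do it, assuming the manifest refers to services by fully qualified names (e.g. web-svc.emojivoto):

```shell
# Install a second copy of Emojivoto into emojivoto-2. Only the
# "namespace:" fields are rewritten; fully qualified service references
# like web-svc.emojivoto still point at the original namespace.
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
  | sed 's/namespace: emojivoto/namespace: emojivoto-2/' \
  | linkerd inject - \
  | kubectl apply -f -
```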
Due to the way this manifest is structured, this sed command installs all components in the emojivoto-2 namespace, but crucially, still instructs the workloads to communicate with the original, emojivoto namespace.
So at this point, we have a vote-bot in the emojivoto namespace that is happily running, and a vote-bot in the emojivoto-2 namespace that should have its requests denied. We can see failures in the logs for this new vote-bot:
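For example (the exact error messages will vary):

```shell
# The vote-bot in emojivoto-2 should be logging request failures, since
# its traffic to the original emojivoto namespace is now denied.
kubectl -n emojivoto-2 logs deploy/vote-bot -c vote-bot --tail=20
```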
We can also confirm this by installing the linkerd-viz extension, and querying the server authorization stats for the target pod:
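For example, using linkerd viz stat against the Server resources in the namespace (the exact columns shown depend on your Linkerd version):

```shell
# Show per-Server traffic stats, including unauthorized request counts.
linkerd viz stat srv -n emojivoto
```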
The last entry reveals that some inbound traffic is being denied. To get the exact identity of the denied traffic, you can tail the pod’s linkerd-proxy logs (omitted here for brevity).
Finally, Buoyant Cloud emails us with an alert, telling us that traffic was denied somewhere in the system. If this were an unexpected incident, we’d want to investigate further.
In this guide, we’ve shown you an example application of Linkerd’s traffic policies: locking down a namespace so that all communication within a namespace is allowed, but communication from outside the namespace is denied by default, with only certain exceptions permitted.
This is just the tip of the iceberg for Linkerd’s policies. Want to learn more? Read through our extensive traffic policy documentation and hop into the Linkerd Community Slack, where you’ll find plenty of helpful Buoyant folks ready to help.
Finally, don’t forget to register for our hands-on workshop on Linkerd’s policies, to be held on Thursday, January 13th, 2022, at 9am PT / 12pm ET / 6pm CET. Hope to see you there!