Go directly to namespace jail: Locking down network traffic between Kubernetes namespaces

Alejandro Pedraza

Dec 14, 2021

How do you restrict network traffic between namespaces in a Kubernetes cluster? Whether you’re running a multi-tenant cluster with strict isolation guarantees or simply want to introduce a layer of control, locking down namespaces within a cluster is a common desire for Kubernetes operators.

In this guide, we’ll show you how to prevent traffic between namespaces using Linkerd’s traffic policies. We’ll accomplish this in a way that uses Linkerd’s secure service identities, is fully zero-trust compatible, and scales automatically as new workloads are introduced to the cluster.

What are traffic policies?

The traffic policy (or “server authorization”) feature introduced in Linkerd 2.11 gives Kubernetes users fine-grained control of which types of traffic are permitted within a cluster. Like Kubernetes’s built-in NetworkPolicy support, Linkerd’s traffic policies allow you to do things like restrict network connections between services according to a set of rules. Unlike NetworkPolicies, Linkerd’s traffic policies are built on the secure workload identities provided by Linkerd’s automatic mTLS feature, which means that not only do you get authenticity, confidentiality, and integrity for your communication, but you can also build authorization policy on top of that same workload identity.

Finally, since Linkerd operates at Layer 7 in the OSI model, its traffic policies give you a lot more expressivity than NetworkPolicies, especially around how you identify workloads and namespaces—something we’ll take full advantage of in this article.

How do traffic policies work?

Let’s go through a quick overview of how Linkerd’s traffic policies work. As of Linkerd 2.11, traffic policy is inbound policy, meaning it governs the behavior of traffic when it is received by meshed pods. It does not govern traffic when it is sent by meshed pods—that’s coming in a later release—nor does it govern traffic to unmeshed pods. (Linkerd can only control the workloads running on its data plane, after all.)

In Linkerd 2.11, we have three basic tools for policy configuration:

  1. A default-inbound-policy Kubernetes annotation, which determines the default policy for a set of pods. This default policy captures a basic behavior, like “deny all” or “allow only in-cluster mTLS”.
  2. A Server CRD that selects over an arbitrary set of pods in a namespace and specifies a specific port on those pods.
  3. A ServerAuthorization CRD that selects over one or more Server CRs and specifies which types of traffic are permitted to the corresponding ports and pods.

Together, these tools allow you to specify a wide range of behaviors, from “everything is allowed” to “port 8080 on service Foo is only allowed to receive mTLS traffic from services that use the Bar service account”, to lots more. In this guide, we’re only going to focus on a subset of these features—just enough to isolate namespaces. (For the full set of policy features, be sure to read the complete policy docs.)
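
As a quick taste of the first tool, here's a hedged sketch of what the annotation looks like when applied to a single workload rather than a whole namespace. The deployment name and policy value here are purely illustrative, and note that changing the pod template triggers a rollout:


# Illustrative only: set a default inbound policy on one workload by annotating
# its pod template (rather than annotating the namespace as we do later).
kubectl -n emojivoto patch deploy web -p \
  '{"spec":{"template":{"metadata":{"annotations":{"config.linkerd.io/default-inbound-policy":"all-authenticated"}}}}}'
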

Restricting traffic to namespaces

In this article, our goal is to use Linkerd’s traffic policies to deny traffic between namespaces while allowing traffic within a namespace. Along the way, we’ll carve out a few exceptions for traffic that must be allowed across namespaces, such as health checks and metrics scrapes, as shown in the diagram below.

With Linkerd, we’ll accomplish this using a small set of policy configuration objects per namespace. For a given namespace, our configuration will be:

  1. A default policy of deny, set across the namespace. This will lock down all traffic to that namespace.
  2. One Server CR for each application port used by services in the namespace. This will allow us to enable traffic to these ports.
  3. A single ServerAuthorization CR that allows access to the Server CRs, from all identities in this namespace.
  4. Finally, one Server and ServerAuthorization pair to handle exceptions to the “no non-namespace traffic” rule, e.g. for health checks, metrics scrapes, etc.

Total config required: one annotation, one Server CR per port plus one for the exceptions, and two ServerAuthorization CRs. Not too bad!

Let’s look at a concrete example with our favorite application, Emojivoto.

Setup

We’re going to assume you have Linkerd installed—see our Getting Started guide if you haven’t done that already. Install the Emojivoto application and add it to Linkerd in one fell swoop:

linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -

Optionally, you can set up a port forward to the app’s web server, at which point you can visit http://localhost:8080 and see Emojivoto in all its glory.


kubectl -n emojivoto port-forward svc/web-svc 8080:80 &
# now visit http://localhost:8080


Let the isolation begin!

At this point, Emojivoto is running, but traffic is unrestricted by default. If we attach our Linkerd cluster to Buoyant Cloud and look at, say, the emoji service, we’ll see that traffic to emoji is unrestricted:

The emoji service, like all other services in this namespace, is using the all-unauthenticated default policy (Linkerd’s default, er, default policy), and thus all traffic to all ports is currently allowed. Let’s start locking things down. First, we’ll annotate the namespace with a default policy that denies all traffic.

kubectl annotate ns emojivoto config.linkerd.io/default-inbound-policy=deny
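
If you want to sanity-check that the annotation landed, you can read it back (this is purely optional):


# Read the annotation back from the namespace; it should show "deny".
kubectl get ns emojivoto -o yaml | grep default-inbound-policy
# config.linkerd.io/default-inbound-policy: deny
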

Now the default policy for the namespace is deny. Note this doesn’t actually change anything for existing pods—they need to restart before they read the annotation. However, at this point, any new pods will actually be unable to start, since this policy also prevents liveness and readiness checks from completing. So before we get into application pods, let’s fix that.

We’ll do that by creating an exception to the deny rule. We’ll make a Server called admin that matches Linkerd’s admin port on every pod, as well as a corresponding ServerAuthorization called admin-everyone that allows all traffic to that port.


cat << EOF | kubectl apply -f -
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: admin
spec:
  port: linkerd-admin
  podSelector:
    matchLabels: {} # every pod
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: emojivoto
  name: admin-everyone
spec:
  server:
    name: admin
  client:
    unauthenticated: true
EOF

Conveniently, Linkerd’s admin port doesn’t just handle health checks for the pod; it’s also where metrics data is served. This opens up the pods for observability (e.g. via the linkerd-viz extension, or through Buoyant Cloud) as well.
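
If you're curious, you can poke at that admin port yourself. Assuming the proxy's admin port is at its default of 4191, something like the following (purely exploratory) shows the liveness endpoint and the Prometheus-style metrics:


# Assumes the proxy admin port is the default 4191.
kubectl -n emojivoto port-forward deploy/web 4191:4191 &
sleep 2  # give the port-forward a moment to establish
curl -s http://localhost:4191/live            # liveness endpoint used by kubelet probes
curl -s http://localhost:4191/metrics | head  # Prometheus-style proxy metrics
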

We can use the linkerd authz command to ensure these policies are in place for our workloads:


$ linkerd authz -n emojivoto deploy
SERVER   AUTHORIZATION
 admin  admin-everyone

Similarly, if we look at our Buoyant Cloud dashboard for the emoji service, we now see the admin and admin-everyone policy components, as well as the (still current) all-unauthenticated default policy—remember again that we need to restart the pod for the new default policy to take effect.



At this point, new pods can be created, since liveness and readiness checks are unrestricted. Phew!

Before we restart our pods to apply our deny default policy, however, we need to explicitly authorize the actual application traffic.

To do that, per our plan above, we’re going to add one Server per application port. For the Emojivoto app, there are only two such ports: a gRPC port used by the emoji and voting pods, and an http port used by the web workload. Let’s create some Servers, which we’ll just call grpc and http, to capture them:


cat << EOF | kubectl apply -f -
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: grpc
  labels:
    app: emojivoto-srv
spec:
  podSelector:
    matchLabels: {} # every pod
  port: grpc
  proxyProtocol: gRPC
---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: http
  labels:
    app: emojivoto-srv
spec:
  podSelector:
    matchLabels: {} # every pod
  port: http
  proxyProtocol: HTTP/1
EOF

Note that we’ve used a catch-all pod selector, so any workload that’s added to the namespace later and uses one of these ports will automatically be covered by these rules. (If we wanted, we could be more restrictive and lock these policies down to specific workloads by changing the spec section, as in the sketch below.)
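
For example, a more restrictive variant of the http Server might select only the web pods. Here's a sketch, assuming the web pods carry the app: web-svc label from the Emojivoto manifest (double-check the labels in your cluster):


cat << EOF | kubectl apply -f -
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: emojivoto
  name: http-web-only    # illustrative name
  labels:
    app: emojivoto-srv   # so the ns-only ServerAuthorization below still matches it
spec:
  podSelector:
    matchLabels:
      app: web-svc       # assumes the Emojivoto web pods carry this label
  port: http
  proxyProtocol: HTTP/1
EOF
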

Finally, with those Servers defined, we’ll create a single ServerAuthorization called ns-only that allows access to them from meshed workloads in the same namespace:


cat << EOF | kubectl apply -f -
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: emojivoto
  name: ns-only
spec:
  server:
    selector:
      matchLabels:
        app: emojivoto-srv
  client:
    meshTLS:
      identities:
      - "*.emojivoto.serviceaccount.identity.linkerd.cluster.local"
EOF

Here, the client section is the crucial one: it allows incoming traffic only from meshed clients whose identity ends in emojivoto.serviceaccount.identity.linkerd.cluster.local, which matches exactly the identities in the current namespace.
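
If you wanted to tighten this further, you could enumerate specific service accounts instead of using a namespace-wide wildcard. Here's a sketch that only lets the web workload talk to the grpc Server, assuming (as in the Emojivoto manifest) that it runs under a service account named web:


cat << EOF | kubectl apply -f -
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: emojivoto
  name: web-to-grpc    # illustrative name
spec:
  server:
    name: grpc
  client:
    meshTLS:
      serviceAccounts:
      - name: web      # assumes the web deployment uses the "web" service account
EOF
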

Our handy linkerd authz command tells us that these Server and ServerAuthorization pairs are active for our workloads:


$ linkerd authz -n emojivoto deploy
SERVER   AUTHORIZATION
 admin  admin-everyone
  grpc         ns-only
  http         ns-only

With all that in place, we’re finally ready to roll all our pods so they pick up the deny default policy. At that point, our jail is complete.


$ kubectl -n emojivoto rollout restart deploy
deployment.apps/vote-bot restarted
deployment.apps/emoji restarted
deployment.apps/voting restarted
deployment.apps/web restarted

Let’s verify our policy by taking one last look at our Buoyant Cloud policy analysis page, which now looks like this:

As you can see, we’ve allowed all traffic to port 4191 (via admin and admin-everyone); traffic to port 8080 is allowed only if it is mTLS’d from an identity in the emojivoto namespace (via grpc and ns-only); and all other traffic is denied (via default:deny). If we were to look at the web service (which uses http) rather than the emoji service, we’d see a similar picture, but with our http Server rather than the grpc Server active.

Finally, we can verify that traffic is still working by looking at the logs for the vote-bot workload:


$ kubectl -n emojivoto logs -f -l app=vote-bot -c vote-bot
✔ Voting for :wave:
✔ Voting for :no_good_woman:
✔ Voting for :world_map:
✔ Voting for :bowing_man:
...

And that’s it! Congratulations, you’ve put Emojivoto in namespace jail!

Can we break in?

Seeing the policy analysis is one thing, but actually seeing a failure is another. Let’s check that no traffic is allowed from outside the namespace by creating a different Emojivoto installation in a new namespace, and pointing its vote-bot deployment to the original Emojivoto. If we’ve set our policy up correctly, this new vote-bot should have its traffic denied. (The other components in the emojivoto-2 namespace will also be installed, but we’ll just be ignoring them.)

We’ll do this with a little sed magic:


# mangles the emojivoto.yml file to install it in the emojivoto-2 namespace
curl -sL https://run.linkerd.io/emojivoto.yml \
  | sed 's/name: emojivoto/name: emojivoto-2/; s/namespace: emojivoto/namespace: emojivoto-2/' \
  | linkerd inject - \
  | kubectl apply -f -

Due to the way this manifest is structured, this sed command installs all components in the emojivoto-2 namespace but, crucially, still instructs the workloads to communicate with the original emojivoto namespace.
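
You can see that cross-namespace targeting for yourself by inspecting the new vote-bot's environment. This assumes the manifest points vote-bot at the web service via a WEB_HOST environment variable; check your copy of the manifest if the output looks different:


# Assumes vote-bot locates the web service via a WEB_HOST env var.
kubectl -n emojivoto-2 get deploy vote-bot -o yaml | grep -A1 WEB_HOST
# expect something like:
#   - name: WEB_HOST
#     value: web-svc.emojivoto:80
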

So at this point, we have a vote-bot in the emojivoto namespace that is happily running, and a vote-bot in the emojivoto-2 namespace that should have its requests denied. We can see failures in the logs for this new vote-bot:


$ kubectl -n emojivoto-2 logs -f -l app=vote-bot -c vote-bot
unexpected end of JSON input
unexpected end of JSON input
...

We can also confirm this by installing the linkerd-viz extension, and querying the server authorization stats for the target pod:


linkerd viz install | kubectl apply -f -
linkerd viz check # wait for containers to come up

Then:


$ linkerd viz authz -n emojivoto deployment/web
SERVER  AUTHZ           SUCCESS     RPS  LATENCY_P50  LATENCY_P95  LATENCY_P99
admin   admin-everyone  100.00%  1.2rps          1ms          2ms          2ms
http    ns-only          90.00%  2.0rps          2ms          3ms          4ms
http    [UNAUTHORIZED]        -  1.0rps            -            -            -

The last entry reveals that some inbound traffic is being denied. To find the exact identity behind the denied traffic, you can tail the pod’s linkerd-proxy logs.
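
The command looks something like this; the actual log output is omitted here for brevity, and the grep pattern is just a guess at what the proxy logs for denied connections, so adjust it to whatever your proxy version actually emits:


# Tail the proxy sidecar's logs on the target pod and look for denied requests.
kubectl -n emojivoto logs -f deploy/web -c linkerd-proxy | grep -i unauthorized
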

Finally, Buoyant Cloud emails us with an alert, telling us that traffic was denied somewhere in the system. If this were an unexpected incident, we’d want to investigate further.

Conclusion

In this guide, we’ve shown you an example application of Linkerd’s traffic policies: locking down a namespace so that all communication within a namespace is allowed, but communication from outside the namespace is denied by default, with only certain exceptions permitted.

This is just the tip of the iceberg for Linkerd’s policies. Want to learn more? Read through our extensive traffic policy documentation and hop into the Linkerd Community Slack, where you’ll find plenty of helpful Buoyant folks ready to help.

Finally, don’t forget to register for our hands-on workshop on Linkerd’s policies, to be held on Thursday, January 13th, 2022, at 9am PT / 12pm ET / 6pm CET. Hope to see you there!

