Aug 7, 2024
In this guide we'll walk you through a task that is increasingly common in the Kubernetes space: migrating an existing Istio deployment to Linkerd. We'll start with a general overview of our recommended strategy for approaching this task, and then dig into some of the gory details.
The good news is that most of the time, this is a pretty straightforward task that primarily consists of "removing lots of unnecessary Istio configuration". But as with all such changes, it can get a little hairy depending on the specifics of what your application does and how tightly you've (possibly accidentally) built dependencies on Istio's behavior. Happily, there is an incremental way to approach your migration which can help reduce overall risk—we'll talk about that below. Either way, be sure to read through carefully and think through your plan and strategy before you dive right in.
If you're reading this guide, you're probably already aware of several reasons why you would migrate from Istio to Linkerd! But let's go through them anyway.
At the macro level, both service meshes provide a very similar level of functionality. Both Linkerd and Istio offer mutual TLS between all meshed pods, multi-cluster communication, "golden metrics" for workload health, and a host of reliability features such as retries, timeouts, circuit breaking, and more. As of the time of this writing (mid-2024), the biggest feature differences between the two are Istio's support for egress traffic monitoring and control, and its built-in handling of ingress traffic. But Linkerd is a fast-moving project and both of those are on the near-term roadmap—by the time you read this, those features may already be available!
If the feature set is so similar, why migrate? The primary reason comes down to operational simplicity. Istio is notoriously complex; Linkerd, by contrast, is built with simplicity, especially operational simplicity, as an explicit goal. This means that Linkerd has a predictable and understandable operational model and security surface area; can survive sustained, long-term operation without human intervention; can gracefully handle unforeseen situations without help; and (perhaps most importantly to you)—has an extremely low cost of ownership.
You'll see those differences in practice as you go through this guide. You'll be throwing away configuration and simplifying your service mesh configuration, sometimes dramatically. Let's get started!
In order to talk about migrations, we need to talk a bit about how Istio and Linkerd are configured, and how their approaches to configuration differ… but we also need to talk just a touch about Gateway API.
The Gateway API project within Kubernetes provides a vendor-independent way to talk about routing configuration, whether for ingress, egress, or within the cluster. It’s important for this discussion because both Istio and Linkerd support Gateway API, and therefore using Gateway API is a way to make migration quite a bit simpler. However, Gateway API can’t express everything that the Istio and Linkerd native APIs can – in particular, Gateway API can’t express authorization policy at this point. Thus, even with the Gateway API, we'll still need to make use of some mesh-specific configuration resources, especially around authorization policy.
Istio provides ingress and egress functionality as well as mesh functionality, and it supports Gateway API as well as defining several of its own configuration resources. In very broad strokes, when configuring Istio using the Istio API, Istio Gateway resources define points of ingress and egress, and VirtualService resources define routing rules. VirtualServices attach to Istio Gateways. (Note that the Istio Gateway resource is not the same as the Gateway resource from Gateway API.)
In Istio, authorization policy is configured with the Istio AuthorizationPolicy, PeerAuthentication, RequestAuthentication, and JWTRule resources.
Linkerd also supports Gateway API, and has its own (minimal) set of configuration CRDs. In contrast to Istio, Linkerd primarily relies on Gateway API types for advanced routing configuration, and otherwise is largely "zero config"—there is no VirtualService or Gateway equivalent necessary in Linkerd.
Linkerd manages authorization policy with its Server, AuthorizationPolicy, MeshTLSAuthentication, and NetworkAuthentication resources. (Note that the Linkerd AuthorizationPolicy is not the same resource as the Istio AuthorizationPolicy resource! They have the same name and (roughly) the same role, but not the same format.)
You can configure Istio with either the Istio API or Gateway API – but on the Linkerd side, you’ll need to use Gateway API. If you’re already using Gateway API with Istio, great! You may not actually need to change your service mesh configuration at all.
If you’re using the Istio API, there’s some translation to be done. Fortunately, Gateway API includes a tool, ingress2gateway, intended to help with migrations from other APIs to Gateway API – and one of the “other APIs” it understands is the Istio API. It’s not perfect, but it’s a great place to start.
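For instance, a run along these lines (reading the Istio resources from your current cluster context; the output filename is just an example) dumps the translated Gateway API resources to a file so you can review them before applying anything:

ingress2gateway print --providers istio -n faces > faces-gateway-api.yaml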
Before migrating, take stock of which Istio features you actually have enabled. If you installed Istio with Helm, the values for the Istio charts (the chart defaults along with any overrides you supplied) are a record of the specific configuration settings applied to each Istio component, indicating which features have been enabled for your Istio installation. You can see a dump of these combined (computed) values for any Istio chart release by doing helm get values <release> -n istio-system -a. If you want to see which of these values you specified intentionally, remove the trailing -a to get only the user-specified values.
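For example, assuming the common case where Istio was installed from the istio-base and istiod charts into istio-system (your release names may differ):

helm list -n istio-system                   # find the Istio release names
helm get values istiod -n istio-system -a   # all computed values for the istiod release
helm get values istiod -n istio-system      # only the values you set yourself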
If you installed with istioctl, you would have enabled options with istioctl install --set whether you chose a profile (--set profile=demo), enabled components (--set components.cni.enabled=true), or overrode Istio settings (--set meshConfig.enableTracing=true). See Istio config documentation for the full set of options. To find out which options you enabled, you can use istioctl analyze for a detailed list.
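If you no longer have the original istioctl install command handy, note that istioctl also saves its configuration into the cluster as an IstioOperator resource, conventionally named installed-state, which you can dump as another record of what was enabled (this assumes a default istioctl installation into istio-system):

kubectl get istiooperator installed-state -n istio-system -o yaml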
If you find that you've enabled additional features Istio provides that Linkerd does not, you'll want to evaluate whether you really need those extra features. If you do, you'll need to look at other tools from the community to support your use case, or possibly delay your migration.
Two Istio features worth calling out specifically are ingress and egress. Ingress is the operation of safely providing access from outside the cluster to workloads inside the cluster; egress is its counterpart, safely providing access to things outside the cluster from inside the cluster. Ingress is a problem that every cloud-native application must solve; egress, in many situations, is not tightly managed. Workloads that provide ingress are called ingress controllers; those managing egress are called egress controllers. Ingress controllers that support Gateway API are often called gateway controllers, but we’ll just stick with “ingress controller” for the moment.
Istio includes its own ingress controller (the Istio “ingress gateway”). As of the time of this writing (mid-2024), Linkerd does not provide its own ingress feature (though one is scheduled for an upcoming release) but instead works with any existing ingress controller. This means there are two options to consider for ingress: keep the Istio ingress gateway in place and mesh it with Linkerd, or replace it with another ingress controller of your choice.
The Istio ingress gateway lives in a Deployment named istio-ingressgateway in the istio-system namespace. It has a single container named istio-proxy, and getting it working with Linkerd is almost as simple as annotating it for Linkerd injection – however, in order to prevent "double meshing", Linkerd currently deliberately won’t inject a workload with a container named istio-proxy. Normally this is a helpful failsafe, but in the case of Istio's ingress it gets in our way, so we need to do something a little silly: we need to change the name of istio-ingressgateway’s single container first. You can do this with kubectl edit, or if you want to do it from the command line you can use kubectl patch:
kubectl patch deploy -n istio-system istio-ingressgateway \
--type=json \
--patch='[ { "op": "replace", "path": "/spec/template/spec/containers/0/name", "value": "istio-gateway" } ]'
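You can confirm that the rename took effect before going any further (a quick check using kubectl's JSONPath output; it should print istio-gateway):

kubectl get deploy -n istio-system istio-ingressgateway \
  -o jsonpath='{.spec.template.spec.containers[0].name}'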
Once that happens, you can edit the istio-ingressgateway Deployment further to add the linkerd.io/inject=enabled annotation to the Deployment’s Pod template. Again, you can use kubectl edit for this, or kubectl patch:
kubectl patch deploy -n istio-system istio-ingressgateway \
--type=merge \
--patch 'spec: { template: { metadata: { annotations: { linkerd.io/inject: enabled } } } }'
(We use two different patch strategies intentionally: merge patches are simpler, but changing the container’s name with a merge patch is tricky.)
At this point you should have a working Istio ingress gateway, meshed with Linkerd.
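One way to verify this, assuming the gateway pods carry the usual istio=ingressgateway label, is to check that the restarted pods now include a linkerd-proxy container and that Linkerd considers their proxies healthy:

kubectl get pods -n istio-system -l istio=ingressgateway \
  -o jsonpath='{range .items[*]}{.metadata.name}: {.spec.containers[*].name}{"\n"}{end}'
linkerd check --proxy -n istio-system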
Linkerd's egress observability and control is scheduled for Linkerd 2.17. If you make heavy use of Istio’s egress controls, you’ll need to wait for that version to be released—and if it's out already and we just haven't updated this paragraph, please let us know!
Another specific thing worth calling out is metrics. Both Istio and Linkerd use Prometheus as the underlying storage mechanism for metrics, but they don’t use the same names for their metrics and they don’t necessarily cover all the same things. For example, Istio's istio_requests_total and Linkerd's request_total and response_total count roughly the same traffic, but with different metric names and different label sets.
There is no clever way to manage metrics during a migration: any significant metrics pipeline is going to need work when moving from Istio to Linkerd. The Linkerd proxy metrics documentation is a critical resource here: it provides the details about the various metrics produced by the Linkerd proxy that one needs to build a metrics pipeline, including how the golden metrics (request volume, success rate, and latency) are derived from the proxy's request and response counters and latency histograms.
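If you just want a quick sanity check that the golden metrics are flowing for a namespace once it's meshed with Linkerd, the linkerd-viz extension can summarize them per workload (this assumes you've installed the viz extension; faces is just the example namespace used later in this guide):

linkerd viz install | kubectl apply -f -   # if the viz extension isn't installed yet
linkerd viz stat deploy -n faces           # success rate, request rate, and latency per Deployment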
Both Istio and Linkerd have powerful (and complex) metrics from which a variety of pipelines can be constructed, and a full exploration of migrating between the two is outside the scope of this guide.
Linkerd uses the Gateway API resources to configure advanced routing and route-based policy – so if you’re already using Gateway API with Istio, you’re in great shape. If not, the ingress2gateway tool from the Gateway API project can help out. ingress2gateway started as a tool to migrate from Ingress resources to Gateway API, but has been expanded to handle (among other things) Istio Gateways and VirtualServices.
ingress2gateway has some significant limitations, though. It can only translate what Gateway API itself can express, so Istio-specific features like fault injection are silently dropped; it doesn't handle authorization policy at all; and its output for east-west routes needs some adjustment before Linkerd will accept it. We'll see examples of these below.
As an example, here’s a simple setup using Istio to support the Faces demo. First, we have a Gateway describing the point of ingress into our cluster:
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: faces-gateway
  namespace: faces
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 8080
      name: http
      protocol: HTTP
    hosts:
    - "*"
Next, we have a VirtualService that routes /gui/ to the Faces GUI:
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: faces-gui
  namespace: faces
spec:
  hosts:
  - "*"
  gateways:
  - faces-gateway
  http:
  - match:
    - uri:
        prefix: /gui/
    rewrite:
      uri: /
    route:
    - destination:
        host: faces-gui
and another to route /face/ to the face workload:
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: faces
  namespace: faces
spec:
  hosts:
  - "*"
  gateways:
  - faces-gateway
  http:
  - match:
    - uri:
        prefix: /face/
    route:
    - destination:
        host: face
Finally, let’s use a third VirtualService to split traffic sent to the color Service 50/50 between the color workload and the color2 workload:
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: color-split
  namespace: faces
spec:
  hosts:
  - "color.faces.svc.cluster.local"
  http:
  - route:
    - destination:
        host: color
      weight: 50
    - destination:
        host: color2
      weight: 50
All of this is 100% supportable by Gateway API, and ingress2gateway manages it almost correctly: the one problem is the color VirtualService. Linkerd requires east-west routing rules like this one to conform to the GAMMA initiative, and ingress2gateway doesn’t quite do that.
Here’s the raw ingress2gateway output (note that I’ve deleted empty status fields and the Kubectl last-applied-configuration annotation):
$ ingress2gateway -n faces print
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: faces-gateway
  namespace: faces
spec:
  gatewayClassName: istio # Edit for your gateway controller!
  listeners:
  - name: http-protocol-wildcard-ns-wildcard
    port: 8080
    protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: faces-idx-0
  namespace: faces
spec:
  hostnames:
  - '*'
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: faces-gateway
  rules:
  - backendRefs:
    - name: face
      namespace: faces
      weight: 0
    matches:
    - path:
        type: PathPrefix
        value: /face/
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: faces-gui-idx-0-prefix-match
  namespace: faces
spec:
  hostnames:
  - '*'
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: faces-gateway
  rules:
  - backendRefs:
    - name: faces-gui
      namespace: faces
      weight: 0
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          replacePrefixMatch: /
          type: ReplacePrefixMatch
    matches:
    - path:
        type: PathPrefix
        value: /gui/
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: color-split-idx-0
  namespace: faces
spec:
  hostnames:
  - color.faces.svc.cluster.local
  rules:
  - backendRefs:
    - name: color
      namespace: faces
      weight: 50
    - name: color2
      namespace: faces
      weight: 50
and here’s what that final HTTPRoute needs to look like for Linkerd:
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: color-split-idx-0
  namespace: faces
spec:
  parentRefs:
  - name: color
    kind: Service
    group: core
    port: 80
  rules:
  - backendRefs:
    - name: color
      namespace: faces
      weight: 50
    - name: color2
      namespace: faces
      weight: 50
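After fixing up the route, it's worth applying it and confirming that Linkerd accepts it; recent Linkerd versions record status conditions on routes their policy controller recognizes (the filename here is hypothetical, and the exact conditions you see will vary by version):

kubectl apply -f color-split-httproute.yaml
kubectl get httproute color-split-idx-0 -n faces \
  -o jsonpath='{range .status.parents[*]}{.controllerName}: {.conditions[*].reason}{"\n"}{end}'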
So VirtualService rules for east-west routing will need some massaging. Likewise, suppose we add another VirtualService to inject delays when talking to the smiley workload:
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: smiley-delay
  namespace: faces
spec:
  hosts:
  - "smiley.faces.svc.cluster.local"
  http:
  - fault:
      delay:
        percentage:
          value: 100
        fixedDelay: 5s
    route:
    - destination:
        host: smiley
This is not something that Gateway API can currently support, and ingress2gateway will ignore the fault stanza, producing:
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: smiley-delay-idx-0
  namespace: faces
spec:
  hostnames:
  - smiley.faces.svc.cluster.local
  rules:
  - backendRefs:
    - name: smiley
      namespace: faces
      weight: 0
In summary, just remember to review ingress2gateway’s output and test as you go! And, as noted above, remember that you’ll need to decide what to do about ingress – this is best done before you do anything else.
Of course, if you’re using Gateway API to configure Istio, you may be able to completely skip this step and just use the Gateway API resources you already have. Again, read the configuration carefully, and carefully check out what’s supported and what’s not.
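In either case, it's worth making sure the Gateway API CRDs you need actually exist in the target cluster before you apply anything (depending on your Linkerd version and install options, Linkerd may have installed some of them for you):

kubectl get crd | grep gateway.networking.k8s.io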
As mentioned before, ingress2gateway doesn’t support authorization policy, because Gateway API doesn’t yet support it. An additional complexity is that Linkerd and Istio approach authorization policy differently.
In Linkerd, authorization policy starts with the choice of a “default inbound policy”, which has the effect of choosing which traffic is denied by default. This can be defined by annotating namespaces or pods, or by creating Server resources. Linkerd AuthorizationPolicy resources then allow specifying traffic that is allowed when the default would deny it. Ultimately, the default inbound policy denies traffic, and AuthorizationPolicy resources allow traffic.
By contrast, Istio AuthorizationPolicy resources can specify traffic to allow (the ALLOW rule), traffic to deny (the DENY rule), or traffic where a custom external authorization provider will make the decision (the CUSTOM rule). Istio AuthorizationPolicies can also take action based on JWTs. There is currently no Linkerd equivalent to this CUSTOM or JWT functionality, so users of those policies should consider how valuable they are to the application’s overall security and compliance posture before migrating.
As an example, in Istio you might see a setup like this to restrict what traffic is allowed in the faces namespace:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: restrict-faces
  namespace: faces
spec:
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account
        - cluster.local/ns/faces/sa/default
    to:
    - operation:
        methods: ["GET"]
This will restrict traffic even though it’s an ALLOW rule: as long as there are no authorization resources that match a workload, Istio will permit all traffic, but once any authorization resources match the workload, every request must match some ALLOWed rule (or a CUSTOM rule where the external authorization server allows the traffic) or the traffic will be denied. The end result is that any meshed traffic from the Istio ingress gateway, or from the default ServiceAccount in the faces namespace, will be allowed – but all other traffic will be denied.
The Linkerd equivalent is rather different. First, you’ll need to annotate the faces namespace with
config.linkerd.io/default-inbound-policy: deny
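From the command line, that's the following (note that the default inbound policy is read when the proxy is injected, so you may need to restart existing meshed workloads in the namespace for the change to take effect):

kubectl annotate namespace faces config.linkerd.io/default-inbound-policy=deny
kubectl rollout restart deploy -n faces   # pick up the new default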
Then you’ll need to apply authorization policy resources to describe the policy:
---
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: ingress-or-faces
  namespace: faces
spec:
  identities:
  - "ingress-sa.ingress-ns.serviceaccount.identity.linkerd.cluster.local"
  - "default.faces.serviceaccount.identity.linkerd.cluster.local"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: ingress-or-faces
  namespace: faces
spec:
  targetRef:
    kind: Namespace
    name: faces
  requiredAuthenticationRefs:
  - group: policy.linkerd.io
    kind: MeshTLSAuthentication
    name: ingress-or-faces
Here, we use two different resources: the MeshTLSAuthentication describes which identities will be accepted by the AuthorizationPolicy, and the AuthorizationPolicy describes what those identities are allowed to do.
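If you have the linkerd-viz extension installed, you can also confirm that the traffic you expect is actually being authorized (deploy/face here is just one of the Faces workloads):

linkerd viz authz -n faces deploy/face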
One more note for both meshes: liveness and readiness probes always come from the kubelet itself and cannot be meshed, so you have to take extra precautions. Istio provides a Pod annotation that allows probes to be excluded from policies, whereas Linkerd requires you to define network-based authorization policies to explicitly authorize the probes.
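To give a flavor of what that looks like on the Linkerd side, here's a sketch of one common shape for it: an HTTPRoute that matches only the probe paths on a workload's Server, authorized for unauthenticated traffic from the cluster network. Everything here (names, labels, port, paths, CIDR, and the API versions) is illustrative; check the Linkerd policy documentation for the exact pattern for your Linkerd release.

---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: face-http            # illustrative: a Server for the face workload's HTTP port
  namespace: faces
spec:
  podSelector:
    matchLabels:
      app: face
  port: http
---
apiVersion: policy.linkerd.io/v1beta3
kind: HTTPRoute
metadata:
  name: face-probes
  namespace: faces
spec:
  parentRefs:
  - name: face-http           # attach the route to the Server above
    kind: Server
    group: policy.linkerd.io
  rules:
  - matches:
    - path:
        value: /ready         # illustrative probe paths
    - path:
        value: /live
---
apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  name: kubelet-probes
  namespace: faces
spec:
  networks:
  - cidr: 10.0.0.0/8          # illustrative: the network the kubelet probes arrive from
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: face-probes
  namespace: faces
spec:
  targetRef:                  # authorize only the probe route, not the whole Server
    group: policy.linkerd.io
    kind: HTTPRoute
    name: face-probes
  requiredAuthenticationRefs:
  - group: policy.linkerd.io
    kind: NetworkAuthentication
    name: kubelet-probes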
The safest way to migrate from Istio to Linkerd is to spin up an entirely new cluster running your application with Linkerd, then gradually shift traffic from the Istio cluster to the Linkerd cluster. This multicluster migration does the best job of preserving secure communications across your entire application all the time, but of course it can be extremely operationally challenging if you’re not already accustomed to a multicluster world.
It’s also possible to do a single-cluster migration by taking advantage of the fact that Istio and Linkerd can coexist perfectly well within the same cluster, as long as they don’t both try to operate in the same namespace at the same time. You will need to think carefully about the order in which you migrate application namespaces because of this: at the boundary between the two meshes (for example, if a workload that’s still meshed with Istio calls a workload newly meshed with Linkerd) you will not have mTLS.
The overall sequence for a multicluster migration goes like this:
Steps marked “(Istio API only)” can be skipped if you’re configuring Istio with Gateway API.
For a single-cluster migration, things are a bit different.
Steps marked “(Istio API only)” can be skipped if you’re configuring Istio with Gateway API.
Migrating from Istio to Linkerd can be done safely and, in some cases, easily – in fact, we have a migration demo post that shows how to tackle a simple migration start to finish! Like all such tasks, the exact steps depend on the details, especially how invested you are in the specific implementations of features that Istio provides.
Of course, we're always available to help! Buoyant Enterprise for Linkerd is our enterprise distribution of Linkerd designed for sustained production use—it's the same distribution of Linkerd that we run in our own production systems, and you can get it up and running in under 5 minutes. Give us a shout and we'll help guide you through your Istio to Linkerd migration.