Buoyant’s Linkerd Production Runbook
A guide to running the world's most advanced service mesh in production
Note: Buoyant Cloud provides fully automated upgrades of control plane and data plane components. If you're using Buoyant Cloud, you may be able to skip this section.
Linkerd is designed for safe, in-place upgrades with no application downtime when upgrading between consecutive stable versions, for example from 2.11.1 to 2.12.0. (Upgrades that skip stable versions are sometimes possible but are not guaranteed; see the version-specific release notes for details.)
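As a rough illustration of the "consecutive stable versions" rule, the sketch below checks a planned upgrade path before anything is applied. It is not part of Linkerd: the function names are hypothetical, and it assumes that stable versions look like 2.11.1 (optionally prefixed with stable-) and that "consecutive" means the same major version with the minor version increasing by at most one. The version-specific release notes remain the authority.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse '2.11.1' or 'stable-2.11.1' into (major, minor, patch)."""
    core = version.strip()
    if core.startswith("stable-"):
        core = core[len("stable-"):]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_consecutive_upgrade(current: str, target: str) -> bool:
    """Return True if target is an upgrade within the same or the next minor version.

    Assumption (not from the Linkerd docs): "consecutive" is approximated
    as same major version and a minor version difference of at most one.
    """
    cur = parse_version(current)
    tgt = parse_version(target)
    if tgt <= cur:
        return False  # same version or a downgrade, not an upgrade
    if tgt[0] != cur[0]:
        return False  # a major version change is out of scope for this sketch
    return tgt[1] - cur[1] <= 1


if __name__ == "__main__":
    for current, target in [("2.11.1", "2.12.0"), ("2.10.2", "2.12.0")]:
        ok = is_consecutive_upgrade(current, target)
        verdict = "consecutive" if ok else "check the release notes first"
        print(f"{current} -> {target}: {verdict}")
```

A check like this fits naturally in a pre-upgrade script or CI step, where a skipped version can be flagged for manual review rather than blocked outright.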
Note that, due to constraints that Kubernetes imposes, true zero-downtime upgrades are only possible if application components can themselves be “rolled” with zero downtime, as upgrading the data plane involves rolling injected workloads.
Upgrading Linkerd is done in two stages: control plane first, then data plane. To accomplish this, Linkerd’s data plane proxies are compatible with a control plane that is one stable version ahead; e.g. 2.11.1 data plane proxies can safely function with a 2.12.0 control plane.
Upgrading the control plane can be done via the linkerd upgrade command or via Helm. In either case, this will trigger a rolling deploy of control plane components, which should allow critical components to be upgraded without downtime. (Note that, in the event something does go wrong, Linkerd’s data plane proxies will continue functioning even if the control plane is unreachable; however, they will not receive service discovery updates.)
Once the control plane has been updated, the proxy-injector component will start injecting data plane proxies from the corresponding (newer) version. Since Kubernetes treats pods as immutable, upgrading the data plane requires rolling application components. Fortunately, because of the forward compatibility between data plane proxy and control plane described above, these data plane upgrades can be done “lazily”: it is not necessary to immediately roll data plane deployments after upgrading the control plane.
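To make the "lazy" data plane upgrade concrete, the following sketch sorts workloads by how far their proxy version lags behind the control plane. It is a minimal illustration under stated assumptions, not a Linkerd tool: the function name, the bucket names, and the workload names are hypothetical, the proxy versions are supplied by hand rather than read from the cluster, and the compatibility window is taken to be "same major version, at most one minor version behind the control plane".

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse '2.11.1' or 'stable-2.11.1' into (major, minor, patch)."""
    core = version.strip()
    if core.startswith("stable-"):
        core = core[len("stable-"):]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def plan_data_plane_rolls(
    control_plane: str, proxies: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group workloads by proxy version relative to the control plane.

    current          -- proxy already matches the control plane version
    roll_lazily      -- proxy is older but within the assumed window
    unsupported_skew -- anything else (too old, or newer than the control plane)
    """
    cp = parse_version(control_plane)
    plan: Dict[str, List[str]] = {
        "current": [],
        "roll_lazily": [],
        "unsupported_skew": [],
    }
    for workload, proxy_version in sorted(proxies.items()):
        px = parse_version(proxy_version)
        if px == cp:
            plan["current"].append(workload)
        elif px < cp and px[0] == cp[0] and cp[1] - px[1] <= 1:
            plan["roll_lazily"].append(workload)
        else:
            plan["unsupported_skew"].append(workload)
    return plan


if __name__ == "__main__":
    # Hypothetical workloads and proxy versions, for illustration only.
    observed = {"web": "2.12.0", "api": "2.11.1", "batch": "2.10.2"}
    for bucket, workloads in plan_data_plane_rolls("2.12.0", observed).items():
        print(f"{bucket}: {', '.join(workloads) or '-'}")
```

Workloads in the roll_lazily bucket can wait for their next regular deploy, while anything in unsupported_skew deserves attention before the control plane moves further ahead.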
Thus, our recommended steps for upgrading are:
As with all modifications of critical system software, extreme care should be taken during the upgrade process. In our experience, human error is almost always the source of software failures.
Version-specific upgrade notes are published in the Linkerd Upgrade documentation.