Kubernetes Upgrade Skew Calculator
Check Kubernetes version skew before a minor upgrade, including API server HA windows, node lag, kubectl fit, staged hops, and blockers.
Introduction:
A Kubernetes minor upgrade is not just a question of which release should run next. The cluster spends part of the maintenance window with mixed component versions, and those temporary combinations must stay inside the supported version skew policy. The policy defines how far the API server, kubelet, kube-proxy, controller-manager, scheduler, cloud-controller-manager, and kubectl may differ while they continue to talk to each other.
The API server is the reference point because every other core component depends on it. In a highly available control plane, the oldest and newest API servers can both be reachable through the same load balancer during a rolling upgrade. A controller or client that looks valid against the newest API server may still be too new for the oldest API server it can hit.
- Version skew: The supported minor-version distance between Kubernetes components that operate together.
- Rollout window: The mixed API server range that exists while control-plane nodes are upgraded one at a time.
- Steady state: The final version window after every API server has reached the target minor.
- Node lag: The allowed distance between kubelet or kube-proxy and the API server. Upstream policy lets node components at v1.25 or newer trail the API server by up to three minors, while older node components are limited to two.
The common mistake is checking only the desired final version. Upgrade risk often appears earlier, when one API server is already new, another is still old, nodes have not been drained yet, and automation uses a kubectl image from a CI runner or admin shell. Version skew policy answers whether those combinations are supported while the upgrade is in progress.
Minor versions should also be treated as a path, not just a destination. Kubernetes upgrade guidance expects clusters to move through each minor release in sequence, and removed APIs, webhooks, add-ons, storage drivers, and provider-managed components can add work that a skew check cannot discover. A clean skew result is a gate for component compatibility, not approval for the whole maintenance plan.
A practical upgrade inventory records ranges rather than single numbers: the oldest and newest API server minors, node-pool kubelet range, kube-proxy range, control-plane component minors, and the kubectl version used for the runbook. Those ranges reveal whether the target can be reached in one maintenance window, needs staged minor hops, or is blocked by lagging components.
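As a minimal sketch of turning collected version strings into the ranges described above, the helper below reduces a list of versions to an oldest and newest minor. The function names and the sample versions are hypothetical; the strings could come, for example, from the `kubeletVersion` field in each node's status.

```python
import re


def minor_of(version: str) -> int:
    """Extract the minor number from a version string such as 'v1.35.2'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(2))


def minor_range(versions: list[str]) -> tuple[int, int]:
    """Reduce version strings to an inclusive (oldest, newest) minor pair."""
    minors = [minor_of(v) for v in versions]
    if not minors:
        raise ValueError("no versions supplied")
    return min(minors), max(minors)


# Hypothetical kubelet versions gathered from three node pools.
kubelet_versions = ["v1.33.4", "v1.35.1", "v1.34.7"]
print(minor_range(kubelet_versions))  # (33, 35)
```

Recording the pair rather than a single number keeps the oldest node pool visible, which is usually the value that decides whether a target is reachable.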
How to Use This Tool:
- Enter a Cluster label that will make the exported matrix and JSON easy to match to a maintenance ticket.
- Choose Policy profile. Use Upstream Kubernetes policy for modern self-managed clusters, Legacy node skew guard when kubelet or kube-proxy versions below v1.25 are in scope, Managed platform one-minor nodes for stricter provider runbooks, and Lockstep platform only when the platform requires matching component minors.
- Set Target control-plane minor to the Kubernetes minor you are planning, such as 36 for v1.36.
- Enter the current API server range. In an HA cluster, use the oldest and newest API server minors that can receive traffic during the rollout.
- Fill in the control-plane component minors, the kubelet range, the kube-proxy range, and the kubectl client minor used by the runbook or automation.
- Open Advanced when cloud-controller-manager should be checked separately or when the node drain batch size should appear in the upgrade order notes.
- Read Skew Matrix first. Then use Upgrade Order, Compatibility Chart, Policy Notes, and Command Notes to build the evidence trail for the change plan.
Treat Blocked rows as preflight work. Treat a staged-hop warning as a sequencing issue: plan each intermediate minor, verify health after that hop, and rerun the check before moving again.
Interpreting Results:
Skew Matrix compares each current component range with the allowed range during the target API rollout and the expected range after API convergence. Ready means the whole current range fits. Partial means there is overlap, but at least one instance sits outside the allowed window. Blocked means the component has no safe overlap or violates a hard rule such as HA API server skew.
- kube-apiserver blockers usually mean the current HA API server range is already too wide, the target is older than the newest API server, or the target requires staged minor hops.
- kube-controller-manager, kube-scheduler, and cloud-controller-manager must not be newer than any reachable API server. A load-balanced control plane can narrow their safe version to the oldest API server minor.
- kubelet results are driven by node lag and by the rule that kubelets cannot be newer than the API server they talk to.
- kube-proxy is checked against both the API server window and the kubelet relationship on the same node.
- kubectl warnings matter for CI jobs, maintenance containers, and admin workstations because a mixed API server range can shrink the client window.
Upgrade Order converts the matrix into a maintenance sequence. It starts with inventory, names skew blockers, stages multi-minor targets, and includes node drain batches when the advanced setting is used. Command Notes gives a compact command checklist for collecting version evidence.
Do not read a clean matrix as a workload-safety result. Deprecated APIs, CRDs, admission webhooks, etcd support, CNI and CSI add-ons, CoreDNS, backups, and provider-specific checks still need separate review before production maintenance.
Technical Details:
Kubernetes version skew is a set of directional rules, not a single distance value. API server instances in an HA cluster have their own maximum spread. Components that communicate with API servers must fit the API range they can actually reach, and node-side components have additional lag rules because worker nodes often move after the control plane.
The rollout window is different from the steady-state window. During an HA API server upgrade, a cluster moving to v1.36 may expose both v1.35 and v1.36 API servers. Once every API server has converged, the steady-state API window becomes v1.36 only. Components that contact the load balancer must satisfy the temporary mixed window before the rollout can proceed.
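The distinction between the two windows can be sketched as follows. This is an illustrative model, not the tool's own code: `api_windows` is a hypothetical name, minors are plain integers, and the HA spread limit is deliberately not checked here.

```python
def api_windows(current_oldest: int, current_newest: int, target: int):
    """Return (rollout_window, steady_window) as inclusive (low, high) pairs.

    Assumption: during the rollout the oldest current API server and the
    target can both be reachable, so the rollout window spans both.
    """
    if current_oldest > current_newest:
        raise ValueError("oldest API minor cannot exceed newest")
    if target < current_newest:
        raise ValueError("target is older than the newest API server")
    rollout = (current_oldest, target)
    steady = (target, target)  # every API server has converged
    return rollout, steady


rollout, steady = api_windows(35, 35, 36)
print(rollout)  # (35, 36)
print(steady)   # (36, 36)
```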
Rule Core:
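Allowed component windows are built from the API window and the selected profile's lag values. Forward lag, meaning how far a component may run ahead of the API server, is normally zero. The exception is kubectl, which under upstream policy can be one minor older or newer than the API server and must satisfy that limit against every reachable API server.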
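One way to read this rule is that a component must sit within its lag of the newest API server without exceeding the oldest API server plus any forward allowance. The sketch below encodes that reading; the function name and parameters are hypothetical, and the tool's exact arithmetic is not shown on this page.

```python
def allowed_window(api_window, back_lag, forward_lag=0):
    """Compute the inclusive (low, high) minor window for a component.

    api_window  -- (oldest, newest) reachable API server minors
    back_lag    -- how many minors the component may trail the API server
    forward_lag -- how many minors it may run ahead (zero for most components)

    Returns None when no minor satisfies every reachable API server.
    """
    api_oldest, api_newest = api_window
    low = api_newest - back_lag      # must not trail the newest API server too far
    high = api_oldest + forward_lag  # must not outrun the oldest API server
    return (low, high) if low <= high else None


rollout = (35, 36)  # mixed API servers while moving to v1.36
print(allowed_window(rollout, back_lag=3))                 # kubelet: (33, 35)
print(allowed_window(rollout, back_lag=1))                 # controllers: (35, 35)
print(allowed_window(rollout, back_lag=1, forward_lag=1))  # kubectl: (35, 36)
```

Under this reading, a wider API range always narrows the component window from both sides, which is why a mixed control plane is the stricter case.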
| Component | Rule reflected in the check | Planning consequence |
|---|---|---|
| kube-apiserver | Newest and oldest HA API servers must stay within the selected API skew limit. Upstream policy uses one minor. | Upgrade control-plane nodes one at a time without widening the live API range beyond the allowed window. |
| controller-manager, scheduler, cloud-controller-manager | They must not be newer than reachable API servers and may lag by the profile's control-plane allowance. | A load balancer that can route to an older API server can force these components to stay on the older minor until API convergence. |
| kubelet | Kubelets cannot be newer than API servers and may trail by the profile's node lag. | The oldest node pool can block the target even when newer node pools are already close to the control plane. |
| kube-proxy | Kube-proxy follows API-server skew and must also remain within the profile's kubelet relationship window. | Check add-on upgrade order against both API server and node versions, especially in managed or mixed node pools. |
| kubectl | Kubectl may be older or newer by the profile's client lag. Upstream policy uses plus or minus one minor. | Mixed API server versions narrow the usable client range because the client must fit every reachable API server. |
Range comparison uses inclusive boundaries. A range passes when both its oldest and newest values sit inside the allowed window. It warns when the current range overlaps the allowed window but includes out-of-window versions. It fails when there is no overlap or when a hard rule, such as API server HA skew, is broken.
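The inclusive comparison described above can be sketched in a few lines. The status names follow the Ready, Partial, and Blocked labels used in the matrix; the function itself is a hypothetical illustration and does not model hard rules such as the HA API server spread.

```python
def classify(current, allowed):
    """Compare an inclusive (low, high) range with an allowed window.

    Returns "Ready" when the whole range fits, "Partial" when it overlaps
    but extends outside the window, and "Blocked" when there is no overlap
    or no allowed window exists at all.
    """
    if allowed is None:
        return "Blocked"
    cur_low, cur_high = current
    low, high = allowed
    if low <= cur_low and cur_high <= high:
        return "Ready"
    if cur_high < low or cur_low > high:
        return "Blocked"
    return "Partial"


print(classify((33, 35), (33, 35)))  # Ready
print(classify((33, 35), (35, 35)))  # Partial
print(classify((32, 33), (35, 35)))  # Blocked
```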
| Policy profile | Main lag settings | When it fits |
|---|---|---|
| Upstream Kubernetes policy | API HA 1, control plane 1, kubelet 3, kube-proxy 3, kubectl 1. | Modern clusters following the public upstream skew policy. |
| Legacy node skew guard | Node and kube-proxy lag are narrowed to 2 when versions below v1.25 are in scope. | Older clusters or upgrade paths that still include pre-v1.25 node-side components. |
| Managed platform one-minor nodes | Node, kube-proxy, and kubelet relationship lag are limited to 1. | Provider runbooks that keep nodes close to the control plane. |
| Lockstep platform | All lag values are 0. | Vendor platforms that require component minors to match during the planned operation. |
For example, targeting v1.36 from API servers at v1.35 creates an upstream rollout window of v1.35 to v1.36. Kubelets from v1.33 to v1.35 can fit the modern upstream node window, but the same node range fails under a managed one-minor node profile because v1.33 and v1.34 are too old for that stricter policy.
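The example above can be reproduced with a short sketch. The kubelet lag values are taken from the profile table; the function names are hypothetical, and the window arithmetic assumes a kubelet may trail the newest API server by the profile lag but may not be newer than the oldest one.

```python
# Kubelet lag per profile, as listed in the policy profile table.
KUBELET_LAG = {
    "Upstream Kubernetes policy": 3,
    "Legacy node skew guard": 2,
    "Managed platform one-minor nodes": 1,
    "Lockstep platform": 0,
}


def kubelet_window(api_window, profile):
    """Inclusive (low, high) kubelet minors for a given API window."""
    api_oldest, api_newest = api_window
    return (api_newest - KUBELET_LAG[profile], api_oldest)


def too_old(kubelet_range, api_window, profile):
    """List the kubelet minors that fall below the allowed window."""
    low, _ = kubelet_window(api_window, profile)
    first, last = kubelet_range
    return [minor for minor in range(first, last + 1) if minor < low]


rollout = (35, 36)  # API servers at v1.35 moving to v1.36
print(too_old((33, 35), rollout, "Upstream Kubernetes policy"))        # []
print(too_old((33, 35), rollout, "Managed platform one-minor nodes"))  # [33, 34]
```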
The results are calculated from the entered version numbers and do not connect to the cluster. Copy and download actions export the matrix, policy notes, command checklist, and JSON that the user chooses to take from the page.
Limitations:
This is a version-skew planner, not a live cluster audit. It does not inspect Kubernetes objects, discover API server pods, read node versions, validate admission webhooks, check etcd support, scan deprecated APIs, or verify provider-managed add-ons.
- Run deprecated API checks before each minor hop, especially when workloads use older API groups or custom resources.
- Follow managed-service and distribution runbooks when they are stricter than upstream policy.
- Confirm backups, rollback procedures, workload disruption budgets, and node capacity before draining or replacing nodes.
- Use current patch releases for the source and target minors when the platform allows it.
Worked Examples:
Single-hop upstream upgrade. A cluster has API servers at v1.35, kubelets from v1.33 to v1.35, kube-proxy from v1.33 to v1.35, and a v1.35 kubectl client. Targeting v1.36 under the upstream profile should produce a v1.35 to v1.36 API rollout window and a node window that still includes v1.33.
Managed profile with older nodes. The same inputs under Managed platform one-minor nodes should flag the older node-side versions. The target may still be valid, but the maintenance plan needs node or add-on work before the API rollout can be treated as clean.
Skipped minor target. If the newest API server is v1.34 and the target is v1.36, the result should name a staged path through v1.35 before v1.36. Each hop needs its own preflight and health verification.
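A staged path like this one is simply every minor between the newest API server and the target, taken in order. The helper below is a hypothetical sketch of that listing.

```python
def staged_hops(newest_api_minor: int, target_minor: int) -> list[int]:
    """List each minor that must be reached, in order, to arrive at the target."""
    if target_minor < newest_api_minor:
        raise ValueError("target is older than the newest API server")
    return list(range(newest_api_minor + 1, target_minor + 1))


print(staged_hops(34, 36))  # [35, 36]
print(staged_hops(35, 36))  # [36]
```

A result with more than one entry signals a multi-hop plan, and the skew check should be rerun with fresh inventory before each entry.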
Client drift during HA rollout. If API servers are reachable at v1.35 and v1.36, kubectl must fit both. A client that is acceptable against v1.36 alone can still be outside the mixed API server window.
FAQ:
Why does an HA control plane narrow other component windows?
Components behind a load balancer can reach the oldest or newest API server. They must be valid against every API server they can contact, so a mixed control plane can narrow the safe range.
Why does kube-proxy fail when kubelet passes?
Kube-proxy has two checks. It must fit the API server window and remain close enough to the kubelet on the same node. Passing one relationship does not guarantee the other.
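The two relationships can be evaluated separately, as in the sketch below. The function and parameter names are hypothetical, and the lag values in the example are an illustrative mix chosen to make the two checks diverge, not the settings of any named profile.

```python
def kube_proxy_checks(proxy, kubelet, api_window, api_lag, kubelet_lag):
    """Evaluate both kube-proxy relationships for a single node.

    proxy, kubelet -- minors running on the same node
    api_window     -- (oldest, newest) reachable API server minors
    api_lag        -- how far kube-proxy may trail the API server
    kubelet_lag    -- how far kube-proxy may differ from the kubelet
    """
    api_oldest, api_newest = api_window
    fits_api = api_newest - api_lag <= proxy <= api_oldest
    near_kubelet = abs(proxy - kubelet) <= kubelet_lag
    return {"api_window": fits_api, "kubelet_relationship": near_kubelet}


# kube-proxy at v1.33 fits the API window but sits two minors from the kubelet.
result = kube_proxy_checks(proxy=33, kubelet=35, api_window=(35, 36),
                           api_lag=3, kubelet_lag=1)
print(result)  # {'api_window': True, 'kubelet_relationship': False}
```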
Can a clear skew matrix still hide upgrade risk?
Yes. Version skew does not check removed APIs, add-on support, storage drivers, provider limits, backups, or workload disruption. Treat it as one gate in the upgrade runbook.
What should I do when the target is more than one minor ahead?
Plan each intermediate minor as its own upgrade step. After each hop, verify cluster health, update the inventory, and rerun the skew check before continuing.
Glossary:
- Minor version: The middle number in a Kubernetes version such as v1.36.
- Version skew: The supported difference between Kubernetes component minor versions while those components run together.
- API server range: The oldest and newest kube-apiserver minors reachable during the modeled rollout.
- Legacy node guard: The narrower kubelet and kube-proxy lag applied when pre-v1.25 node-side versions are involved.
- Staged hop: An intermediate minor upgrade required before the selected target can be reached.
References:
- Version Skew Policy, Kubernetes.
- Upgrading kubeadm clusters, Kubernetes.
- Deprecated API Migration Guide, Kubernetes.