Container Image Pull Time Calculator
Calculate online container image pull time from image size, cache share, node count, bandwidth, pull lanes, and unpack speed for rollout planning.
Introduction:
Container image pull time is the delay between scheduling a workload and having the image ready on the nodes that need to run it. The delay matters most during rollouts, emergency patches, node replacements, and autoscaling bursts, because a pod cannot start until its required image is present and usable on the node. A slow pull can make a healthy application look stuck even when the scheduler, probes, and application code are fine.
The main question is not only how large the image is. Pull time also depends on how many nodes need the image, how much of the image is already cached, how many pulls run at the same time, how much bandwidth the registry path can sustain, and how long the runtime spends unpacking layers on each node. A 950 MiB image can start quickly on warm nodes and painfully slowly on cold nodes behind a narrow registry path.
Image caches change the planning picture. If a node already has the exact digest or reusable layers, the runtime may need only a manifest or metadata check before it can start the container. Cold nodes still have to fetch compressed data, verify or prepare layers, and write unpacked data to local storage. Those two populations should not be averaged away, because one cold node pool can dominate the rollout window.
An estimate is still a model. It does not prove that credentials, registry availability, admission policy, image architecture, or storage health are correct. It helps size the pull portion of a rollout, compare likely bottlenecks, and decide whether pre-pulling, smaller images, better layer reuse, or registry mirrors are worth doing before a release window.
Technical Details:
A container image pull starts from a reference such as a tag or digest, resolves metadata from a registry, and then downloads any layers that are missing from the node's local image store. Container image layers are reusable, so a node that already has shared base layers may transfer only the changed application layers. A digest gives the cleanest planning target because it names one immutable image version rather than a tag that can later point somewhere else.
The modeled time is built from cold nodes, not total nodes. Cached nodes are rounded from the cache percentage and removed from the transfer count. The remaining nodes are grouped into pull waves based on the active pull lane count. Each wave downloads its share of compressed data, spends local time unpacking and setting up touched layers, optionally subtracts a download/unpack overlap credit, and optionally adds a retry reserve.
The conversion constant 8.388608 converts mebibytes to megabits (1 MiB = 1,048,576 bytes = 8,388,608 bits = 8.388608 megabits). The effective bandwidth for a wave is the lower of the shared registry bandwidth and any per-lane cap multiplied by the number of nodes in that wave. If no per-lane cap is set, the shared registry bandwidth is the only network limit in the model.
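The wave math described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions; the function name, parameter names, and exact rounding are illustrative, not the tool's source:

```python
MBITS_PER_MIB = 8.388608  # 1 MiB = 8,388,608 bits = 8.388608 megabits

def pull_time_s(size_mib, nodes, cached_pct, bw_mbps, lanes,
                unpack_mib_s, layers, layer_setup_s, per_lane_cap_mbps=0):
    """Estimate total rollout pull time in seconds (simplified model)."""
    # Cached nodes are rounded out of the cold population first.
    cold = nodes - round(nodes * cached_pct / 100)
    total = 0.0
    while cold > 0:
        wave = min(cold, lanes)
        bw = bw_mbps
        if per_lane_cap_mbps > 0:  # a per-lane cap limits each active lane
            bw = min(bw, per_lane_cap_mbps * wave)
        # All nodes in the wave share the registry path for downloads.
        network = wave * size_mib * MBITS_PER_MIB / bw
        # Unpack and layer setup happen per node, in parallel within the wave.
        local = size_mib / unpack_mib_s + layers * layer_setup_s
        total += network + local
        cold -= wave
    return total
```

With the first worked example's inputs (950 MiB, 24 nodes, 35% cached, 800 Mbps, 6 lanes, 120 MiB/s, 18 layers, 0.4 s/layer), this sketch lands near 3 min 25 s.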
Modeled Inputs and Bounds:
| Input | Accepted range or rule | How it changes the estimate |
|---|---|---|
| Compressed image size | Positive MiB or GiB value. | Sets the base payload before the new layer share is applied. |
| Nodes needing image | Whole number from 1 upward. | Defines the rollout population before cached nodes are subtracted. |
| Cached nodes | 0% to 100%, rounded to a node count. | Reduces cold nodes, registry transfer volume, and pull wave count. |
| New layer share | 0% to 100%. | Models reusable layers by shrinking per-node transfer and touched layer count. |
| Registry bandwidth | Positive Mbps value. | Caps aggregate download speed for the active pull lanes. |
| Parallel pull lanes | Whole number from 1 upward. | Sets how many cold nodes can be in a wave before the queue advances. |
| Unpack throughput | Positive MiB/s value. | Converts compressed transfer size into per-node unpack time. |
| Layer setup cost | Non-negative seconds per touched layer. | Adds local filesystem, verification, and metadata overhead. |
Advanced Timing Controls:
| Control | Rule | When to use it |
|---|---|---|
| Per-lane cap | 0 disables the cap; positive values limit each active lane in Mbps. | Use it when nodes, proxies, or registry connections are individually throttled. |
| Retry reserve | 0% to 200%, added to every wave after base timing. | Use it when logs show transient registry, network, or layer retry delays. |
| Download/unpack overlap | 0% to 90% of the smaller network or unpack phase. | Use it only when the runtime extracts earlier layers while later layers still download. |
| Manifest check | Non-negative seconds per node. | Use it when digest or manifest lookup latency is visible even for cached images. |
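The overlap and reserve rules can be read as a per-wave correction. The sketch below is one plausible composition (overlap credit first, then the reserve as a percentage of the adjusted wave time); the exact order inside the tool is an assumption:

```python
def adjust_wave_s(network_s, local_s, overlap_pct=0.0, retry_reserve_pct=0.0):
    """Apply the overlap credit and retry reserve to one wave's base timing."""
    # Overlap subtracts up to 90% of the smaller phase from the wave total.
    overlap_credit = min(overlap_pct, 90) / 100 * min(network_s, local_s)
    base = network_s + local_s - overlap_credit
    # Assumption: the 0-200% retry reserve scales the adjusted base time.
    return base * (1 + min(retry_reserve_pct, 200) / 100)
```

For a wave with 60 s of network time and 15 s of local time, 30% overlap and a 10% reserve give 77.55 s instead of the unadjusted 75 s.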
Everyday Use & Decision Guide:
Start with the rollout batch, not the whole cluster. Put the image tag or digest in Image reference, enter the compressed size reported by the registry or image listing, and set Nodes needing image to the nodes that may actually schedule the workload during this rollout or scale-up.
Use Cached nodes and New layer share to separate warm nodes from cold work. If the exact digest has already been pre-pulled, cached nodes can be high. If only a base image is shared, keep cached nodes lower and reduce New layer share only for the layers you believe are reusable by digest.
- Use measured registry throughput for Registry bandwidth. Link speed is often higher than the sustained pull path through mirrors, proxies, or private registries.
- Set Parallel pull lanes to the real concurrency allowed by kubelet, runtime, registry limits, and rollout policy. Extra lanes do not remove bytes from the registry path.
- Raise Unpack throughput only when node logs show fast extraction. Slow disks, snapshotters, and many small filesystem changes can make local time dominate.
- Open Advanced when logs show a per-connection throttle, repeated retries, partial overlap between download and unpack, or visible manifest lookup time.
- Check Input audit in Pull Metrics before using the result in a release plan.
The estimate is a good fit for comparing rollout options before a deployment window. It is not a registry health check, credentials test, admission controller check, or proof that the image exists for every node architecture. If the result looks acceptable but a live rollout still stalls, compare the model against kubelet events and pull logs before changing the application.
Use Optimization Brief to pick the next experiment. If Pre-pull cache saves the most time, warm the digest before deployment. If Layer reuse or Image trim wins, focus on build structure. If Registry path dominates, a closer mirror or pull-through cache may help more than tuning pod rollout settings.
Step-by-Step Guide:
Work from the artifact and node population first, then add timing details from pull logs where you have them.
- Enter Image reference with the tag or digest used for the rollout. The same value appears in exports so the estimate stays tied to the deployment artifact.
- Set Compressed image size in MiB or GiB. Use compressed registry size, not expanded filesystem size, because the network phase transfers compressed layers.
- Set Nodes needing image, Cached nodes, and New layer share. Watch the summary change to show how many nodes pull data and how much leaves the registry path.
- Enter Registry bandwidth, Parallel pull lanes, Unpack throughput, Layer count, and Layer setup cost. The Total rollout pull time and Dominant bottleneck rows update immediately.
- Open Advanced if you need Per-lane cap, Retry reserve, Download/unpack overlap, or Manifest check. Invalid or blank numeric entries are normalized to safe bounded values rather than producing a negative pull plan.
- Review Wave Queue Ledger to see each wave's nodes, transfer size, network time, local time, reserve, and finish time. Large wave counts may be condensed in the table while the total model still uses all waves.
- Open Pull Phase Stack and Cache Sensitivity Curve when you need a visual check. Export chart images or chart CSV when the estimate needs to go into a rollout review.
- Use Copy JSON, Download JSON, CSV, DOCX, or chart exports only after the inputs match the deployment plan you intend to communicate.
Interpreting Results:
Total rollout pull time is the headline estimate, but it should be read with Nodes pulling layers, Registry transfer volume, and Dominant bottleneck. A short total with a high transfer volume may still be risky if the registry path is shared with other workloads. A long total with low transfer volume often points to unpack throughput, layer setup, or artificial lane limits.
The color behind the status badge follows elapsed time: under 5 minutes, 5 to under 15 minutes, and 15 minutes or more. That is a planning cue, not an operational service level objective. A 4 minute pull can be unacceptable for a tight incident rollback, while a 20 minute pre-warm can be fine if it finishes before traffic moves.
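The three badge bands map directly to elapsed seconds; a trivial sketch (the function name and return strings are illustrative):

```python
def badge_tier(total_seconds):
    """Map total pull time to the three planning bands described above."""
    if total_seconds < 5 * 60:
        return "under 5 minutes"
    if total_seconds < 15 * 60:
        return "5 to under 15 minutes"
    return "15 minutes or more"
```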
| Output cue | What it means | Useful follow-up |
|---|---|---|
| registry/network bound | Download time exceeds local unpack/setup time by more than the model's balance margin. | Check registry throughput, mirrors, pull-through cache, lane caps, and image size. |
| unpack/setup bound | Local extraction and per-layer work dominate the wave time. | Check node disk speed, snapshotter behavior, layer count, and setup cost. |
| balanced phases | Network and local work are close enough that neither side clearly dominates. | Use Optimization Brief or Cache Sensitivity Curve to test likely improvements. |
| 0 pull wave(s) | All nodes are modeled as cached or no new layer transfer remains. | Confirm the exact digest and manifest check assumption before treating this as instant startup. |
| Retry reserve is nonzero | Extra time is being added to every wave, so the result includes a reliability cushion. | Record why the reserve is present so future estimates do not double-count retries. |
A low estimate does not mean the rollout will succeed. It only says the modeled pull path is short under the entered assumptions. Verify the image reference, digest, registry credentials, architecture support, and recent pull logs before using the result as a release gate.
Worked Examples:
Cold rollout with partial cache:
A 950 MiB image rolling to 24 nodes with 35% cached nodes leaves 16 nodes pulling layers after rounding. With 800 Mbps shared registry bandwidth, 6 pull lanes, 120 MiB/s unpack throughput, 18 layers, and 0.4 sec/layer setup, Total rollout pull time is about 3 min 25 sec. Wave Queue Ledger shows three waves: two six-node waves and one four-node wave. Dominant bottleneck reads Registry/network, so improving the registry path or reducing transfer size is more promising than only raising unpack speed.
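This example can be reproduced with a short standalone script, assuming the wave model described under Technical Details (variable names are illustrative):

```python
MBITS_PER_MIB = 8.388608

size_mib, nodes, cached_pct = 950, 24, 35
bw_mbps, lanes, unpack, layers, setup = 800, 6, 120, 18, 0.4

cold = nodes - round(nodes * cached_pct / 100)   # 24 - 8 = 16 cold nodes
waves = []
while cold > 0:
    waves.append(min(cold, lanes))               # waves of 6, 6, and 4 nodes
    cold -= waves[-1]

total = 0.0
for wave in waves:
    network = wave * size_mib * MBITS_PER_MIB / bw_mbps  # shared 800 Mbps path
    local = size_mib / unpack + layers * setup           # unpack + layer setup
    total += network + local

print(waves, round(total))  # three waves totaling roughly 205 s (~3 min 25 s)
```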
Warm nodes with small application delta:
An 80-node group with a 1.8 GiB image, 75% cached nodes, and 20% new layer share has 20 cold nodes and about 369 MiB per uncached node. With 1200 Mbps, 12 lanes, 180 MiB/s unpack throughput, 10% retry reserve, 30% overlap, and a 1.2 sec/node manifest check, Total rollout pull time is about 1 min 5 sec. The Cache Sensitivity Curve is useful here because a modest cache miss change can move several nodes into or out of the cold population.
Fully cached image:
A 600 MiB image across 10 nodes with 100% cached nodes produces 0 pull wave(s), Registry transfer volume of 0 MiB, and a total equal to the entered manifest check, such as 0.8 sec. That does not prove the image can run. It means the model found no uncached layer transfer, so the next check belongs in Kubernetes events, registry credentials, and the exact image digest used by the workload.
Per-lane cap surprise:
A 900 MiB image going to 30 nodes with 10% cached nodes leaves 27 cold nodes. With 2000 Mbps shared bandwidth and 10 lanes, the rollout might look wide enough at first. Setting Per-lane cap to 100 Mbps means each 10-node wave can use only 1000 Mbps, not the full shared bandwidth. The estimate lands around 5 min 16 sec with registry/network bound, and the first fix to test is the lane cap or registry path, not the layer setup cost.
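The surprising part of this example is the effective-bandwidth rule, which a one-function sketch makes explicit (the function name is illustrative):

```python
def wave_bandwidth_mbps(shared_mbps, per_lane_cap_mbps, nodes_in_wave):
    """Effective bandwidth: the lower of the shared path and cap times lanes."""
    if per_lane_cap_mbps <= 0:  # 0 disables the per-lane cap
        return shared_mbps
    return min(shared_mbps, per_lane_cap_mbps * nodes_in_wave)
```

Here `wave_bandwidth_mbps(2000, 100, 10)` returns 1000: a 10-node wave uses 1000 Mbps, not the full 2000 Mbps shared path, and the final 7-node wave drops to 700 Mbps.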
FAQ:
Should I use compressed size or unpacked image size?
Use compressed size for Compressed image size. The network phase transfers compressed layers from the registry path, while Unpack throughput separately models local decompression and extraction time.
Why does increasing pull lanes sometimes save little time?
More lanes reduce queueing only when concurrency is the limit. If Registry bandwidth or Per-lane cap already controls the wave, extra lanes can change wave shape without removing much total transfer time.
What does 100% cached mean in the result?
It means the model found no uncached layer transfer. If Manifest check is greater than zero, the total can still show that lookup time, and you should still confirm that the exact image reference is present on the nodes.
Why does the tool round cached nodes?
Cached nodes is entered as a percentage, but nodes are whole machines. The model rounds the percentage to a cached node count, subtracts that from Nodes needing image, and uses the remaining cold nodes for pull waves.
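The rounding step can be written out as a sketch (not the tool's source):

```python
def cold_nodes(total_nodes, cached_pct):
    """Round the cached percentage to whole nodes, then subtract."""
    return total_nodes - round(total_nodes * cached_pct / 100)
```

For the first worked example, `cold_nodes(24, 35)` returns 16, because 35% of 24 is 8.4 nodes, rounded to 8 cached nodes.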
Where does the calculation run?
The calculation and exports run in the browser. Treat image references and rollout assumptions as operational details if you place them in shared URLs, copied JSON, downloaded CSV files, DOCX exports, or chart images.
Glossary:
- Image reference: The registry path, tag, or digest that identifies the image being pulled.
- Digest: A content-based identifier that pins an image version more tightly than a mutable tag.
- Layer: A reusable compressed part of an image that may already exist on a node or need to be downloaded.
- Cold node: A node that still needs to pull one or more layers for the modeled image.
- Pull wave: A group of cold nodes that pull at the same time under the configured lane count.
- Registry bandwidth: The sustained aggregate throughput available from the registry, mirror, or pull-through cache to the nodes.
- Unpack throughput: The effective node-side speed for decompression, extraction, and snapshot setup.