Cache TTL Hit Rate Calculator
Calculate cache TTL hit rate online from request reuse, unique-key volume, invalidation loss, and latency inputs to tune origin load and freshness tradeoffs.
Introduction:
Cache TTL hit rate estimates how often repeat requests can be served from a cache while a stored response is still fresh. The useful unit is not just total traffic. It is traffic per cache key, because a busy route with thousands of one-off keys can miss more often than a smaller route with the same few keys repeated all day.
Time to live, usually shortened to TTL, is the freshness window attached to a cached object. Longer TTLs usually improve reuse and reduce origin work, but they can also keep old data visible for longer when the origin changes. Shorter TTLs protect freshness, but they push more requests back to the origin and reduce the latency benefit of the cache.
Hit rate also depends on cache-key shape. Query strings, cookies, headers, locale choices, device variants, and user-specific values can split traffic into many separate objects. When those values do not actually change the response, they reduce hits without improving correctness.
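The calculator takes the normalized key count as an input rather than computing it, but as an illustration of what key normalization can mean, here is a hypothetical sketch that strips common tracking parameters and sorts the remainder. The `TRACKING_PREFIXES` list is an assumption for the example, not a rule from this tool or any CDN.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical normalization rules: drop common tracking parameters and
# sort what remains, so equivalent requests collapse to one cache key.
TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")

def normalize_cache_key(url: str) -> str:
    parts = urlsplit(url)
    kept = sorted(
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.startswith(TRACKING_PREFIXES)
    )
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")

# Two URLs that differ only in tracking noise map to the same key,
# so they count as one entry in Unique keys per minute, not two.
a = normalize_cache_key("/product/42?utm_source=mail&color=red")
b = normalize_cache_key("/product/42?color=red&utm_campaign=spring")
```

Whether a parameter is safe to strip depends on whether it ever changes the response, which is exactly the correctness check described above.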
A TTL hit-rate estimate is a planning model, not a replacement for CDN logs or live cache analytics. It is strongest when the request rate, normalized key count, invalidation rate, and latency gap come from measured traffic or a realistic launch forecast. It is weakest when traffic is bursty, personalized, or governed by cache rules that the model does not represent.
Technical Details:
A cache hit needs three things at the same time: the incoming request must map to a stored cache key, the stored object must still be inside its freshness lifetime, and the object must not have been removed by a purge, bypass rule, or invalidation. The model treats request reuse as an average per-key arrival rate and then applies a TTL formula to estimate the chance that a repeated request lands inside the freshness window.
Fixed TTL and sliding TTL model different cache behaviors. A fixed TTL starts when the object is filled and expires after that duration even if more hits happen. A sliding TTL refreshes the expiry timer on access, so hot keys can stay resident longer. Many HTTP and CDN caches behave closer to fixed freshness for response reuse, while application caches sometimes use sliding expiry.
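The worked examples later in this article are consistent with a standard renewal approximation for Poisson per-key arrivals. A sketch of that approximation (the tool's exact implementation is not published here): each miss under a fixed TTL fills the object for T minutes and collects an expected λT hits, while under a sliding TTL a request hits whenever the gap since the previous request to the same key is under T.

```python
import math

def per_key_rate(requests_per_min: float, unique_keys_per_min: float) -> float:
    """Average requests per key per minute (reuse intensity)."""
    return requests_per_min / unique_keys_per_min

def fixed_hit_rate(lam: float, ttl_min: float) -> float:
    """Fixed TTL: each miss fills for T minutes and collects ~lam*T hits,
    so the modeled hit rate is lam*T / (1 + lam*T)."""
    return (lam * ttl_min) / (1.0 + lam * ttl_min)

def sliding_hit_rate(lam: float, ttl_min: float) -> float:
    """Sliding TTL: expiry resets on access, so a request hits whenever the
    exponential inter-arrival gap is under T, i.e. 1 - exp(-lam*T)."""
    return 1.0 - math.exp(-lam * ttl_min)
```

With the first worked example's inputs (12,000 requests/min over 1,800 keys/min, 15 minute fixed TTL), `fixed_hit_rate(per_key_rate(12000, 1800), 15)` lands near 99.0% before the invalidation penalty, matching the article's numbers.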
Effective hit rate drives the origin and latency readouts. Miss rate is the remainder after effective hits. Origin pulls are the cacheable request rate multiplied by miss rate, and blended latency is the weighted average of cache latency on hits and origin latency on misses. The displayed latency saved is the difference between origin-only latency and that blended estimate.
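Assuming effective hit rate is the modeled hit rate scaled down by the invalidation penalty, which matches the worked examples later in the article, the readouts above can be sketched as:

```python
def readouts(requests_per_min: float, modeled_hit: float, penalty: float,
             cache_latency_ms: float, origin_latency_ms: float):
    """Derive the four readouts from a modeled (pre-penalty) hit rate."""
    effective = modeled_hit * (1.0 - penalty)       # post-invalidation hit rate
    miss = 1.0 - effective
    origin_pulls = requests_per_min * miss          # requests/min to origin
    blended = effective * cache_latency_ms + miss * origin_latency_ms
    saved = origin_latency_ms - blended             # vs. origin-only latency
    return effective, origin_pulls, blended, saved
```

Feeding in the first worked example (12,000 requests/min, modeled hit rate 100/101, 5% penalty, 35 ms cache, 220 ms origin) reproduces roughly 94.1% effective, 713 origin pulls/min, and 46 ms blended latency.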
The target TTL solver works backward from the selected target hit rate. If the target is higher than the post-invalidation ceiling, it is marked unreachable because no TTL can recover hits removed by purges, bypasses, or manual clears. Otherwise, the solver returns the TTL needed under the selected fixed or sliding model.
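A minimal sketch of that backward solve, assuming the fixed model h = λT/(1+λT) and sliding model h = 1 − e^(−λT) that the worked examples are consistent with. `None` stands in for the "target blocked" state:

```python
import math

def target_ttl(lam: float, target: float, penalty: float,
               sliding: bool = False):
    """TTL (minutes) needed to reach a target *effective* hit rate, or None
    when the target sits at or above the post-invalidation ceiling."""
    ceiling = 1.0 - penalty
    if target >= ceiling:
        return None                      # no TTL can recover purged hits
    needed = target / ceiling            # modeled rate required pre-penalty
    if sliding:
        return -math.log(1.0 - needed) / lam
    return needed / (lam * (1.0 - needed))
```

The article's examples follow directly: a 90% target at 5% penalty and λ ≈ 6.67 needs about 2.7 minutes, an 80% target at λ = 0.2 needs about 26.7 minutes, and a 98% target at 5% penalty is blocked.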
| Quantity | How it affects the estimate | Practical boundary |
|---|---|---|
| Requests per minute | Raises hit opportunity when the requests repeat the same normalized keys. | Use cacheable traffic after routing, auth bypass, and no-store rules. |
| Unique keys per minute | Divides traffic into separate object populations. More unique keys lower reuse intensity. | Count normalized cache keys rather than raw URLs when query or header normalization is applied. |
| TTL | Defines the reuse window for a filled object. | A zero-minute TTL produces no modeled hits. |
| Invalidation penalty | Subtracts hit opportunity after the TTL model. | A 5% penalty caps the effective hit rate below 95%, even with very high reuse. |
| Origin change interval | Sets the freshness comparison used by the exposure label. | Use the object class update cadence, not the deployment schedule for unrelated assets. |
| Cache and origin latency | Turn hit rate into blended latency and estimated milliseconds saved. | Origin latency should be at least the cache latency for the model to describe a useful cache path. |
Freshness exposure is a guardrail, not a proof of stale content. It scales the TTL against the origin change interval and weights the result by the effective hit rate. The label changes at 35% and 70%, which makes long TTLs on frequently changing, high-hit objects stand out before the hit-rate number becomes the only decision cue.
| Freshness label | Exposure range | How to read it |
|---|---|---|
| Freshness steady | < 35% | The TTL is small compared with the change interval, or the effective hit rate is low enough that stale exposure stays limited. |
| Freshness watch | 35% to < 70% | TTL changes should be checked against real update cadence before they are treated as safe. |
| High freshness pressure | ≥ 70% | The modeled TTL may keep many repeat requests on cached content while origin data is likely to change. |
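The exact exposure formula is not published. One reading consistent with the description and with the high-pressure worked example later in the article is TTL divided by the origin change interval, capped at 1, then weighted by effective hit rate; both the cap and the scaling are assumptions here.

```python
def freshness_exposure(ttl_min: float, change_interval_min: float,
                       effective_hit_rate: float) -> float:
    """Assumed reading: TTL scaled by the change interval (capped at 1),
    weighted by how often cached responses are actually served."""
    ratio = min(ttl_min / change_interval_min, 1.0)
    return ratio * effective_hit_rate

def freshness_label(exposure: float) -> str:
    """Thresholds from the table above: 35% and 70%."""
    if exposure < 0.35:
        return "Freshness steady"
    if exposure < 0.70:
        return "Freshness watch"
    return "High freshness pressure"
```

Under this reading, a 50 minute TTL against a 45 minute change interval at ~97.8% effective hit rate lands in the high-pressure band, matching the last worked example.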
Everyday Use & Decision Guide:
Start with one cache population, such as a CDN behavior, API route, rendered product-card fragment, or application cache namespace. Put its label in Cache name, then enter the cacheable request rate that actually reaches that rule. Exclude traffic that is known to bypass cache because private responses, authorization, no-store headers, or routing rules make it a different population.
The most important estimate is often Unique keys per minute. If every request has a session cookie, raw tracking query, or user-specific header in the key, a high request rate can still produce weak reuse. If the response is genuinely shared and the cache key can be narrowed safely, the same TTL can produce a much higher hit rate.
- Use Fixed TTL after fill for most CDN-style freshness planning.
- Use Sliding TTL on access only when the cache really extends expiry after each hit.
- Set Target hit rate to the operating goal you would use in a capacity or latency review.
- Use Invalidation penalty for purge churn, bypass loss, manual clears, or deployment invalidations that erase would-be hits.
- Set Origin change interval from the object class being cached, not from the route name alone.
- Enter measured Origin latency and Cache latency when possible, because the same hit rate has different value on a 40 ms origin and a 600 ms origin.
Cache Snapshot is the first readout for a single setting. TTL Scenario Grid shows nearby TTL choices, including zero, half, current, longer, change-interval, and target-TTL candidates. TTL Guardrail Ledger turns the result into checks for target runway, origin pressure, freshness budget, invalidation drag, and latency payoff. TTL Hit Curve is useful when a small TTL increase gives most of the latency gain and later increases mostly add freshness risk.
A high effective hit rate does not prove the cache rule is correct. Check whether the cache key includes only values that change the response, whether origin changes are covered by purge rules, and whether the result would still be acceptable after a deployment or product-data update. If target blocked appears, lower the target or reduce invalidation loss before simply extending TTL.
Step-by-Step Guide:
Build the model from traffic shape first, then tune TTL only after the key count and invalidation assumptions are credible.
- Name the cache population with Cache name. Use a route, object class, CDN behavior, or cache tier name that will still make sense in exported rows.
- Enter Requests per minute for cacheable traffic and Unique keys per minute for distinct normalized cache keys. Recheck these two numbers before interpreting any target result.
- Set the current TTL and choose TTL behavior. Pick fixed TTL unless the cache refreshes expiry on access.
- Set Target hit rate, Origin change interval, and Invalidation penalty. If purges are rare or already reflected in miss data, leave the penalty at zero.
- Enter representative hit and miss latency with Cache latency and Origin latency. If origin latency is lower than cache latency, correct the measurements before using the saved-latency result.
- Read Cache Snapshot for effective hit rate, origin pulls, target TTL, blended latency, origin load reduction, and freshness exposure.
- Open TTL Scenario Grid and TTL Guardrail Ledger before changing production cache headers. The grid shows the nearby tradeoffs, and the ledger names the first caution to resolve.
- Use table exports, chart downloads, or JSON only after the inputs match the same cache population. Sharing a result with mixed routes or mixed key rules can make the TTL recommendation misleading.
Interpreting Results:
Effective hit rate is the headline value because it includes the invalidation penalty. Modeled hit rate before invalidation can look excellent while purge churn still removes enough hits to miss the target. The difference between those two values is the hit-rate cost of invalidations, bypasses, and manual clears.
Origin pulls converts miss rate back into request volume. A 94% hit rate on 12,000 requests per minute still sends roughly 720 requests per minute to the origin. That may be fine for static assets and dangerous for an expensive API, so read origin pulls alongside hit rate.
Target runway answers whether the selected target is reachable and how much TTL is needed. When the current TTL is already above the target TTL, the extra margin may be better spent on freshness. When the target TTL is far above the current TTL, the cache key, purge behavior, or object class may need review before the number becomes a production setting.
Blended latency is a weighted average, not a percentile. It helps compare cache policy options, but it does not show tail latency, fill collapse, backend saturation, or regional variance. Treat it as a planning estimate, then compare the chosen TTL with real cache logs, origin metrics, and user-facing latency percentiles.
Freshness exposure should slow down TTL increases when the object changes often. A high freshness-pressure label does not prove users are seeing stale data, but it says the model is depending on cached responses during a large share of the origin update window.
Worked Examples:
Product-card edge cache
The default example uses 12,000 requests per minute, 1,800 unique keys per minute, a 15 minute fixed TTL, a 90% target, 5% invalidation penalty, 220 ms origin latency, and 35 ms cache latency. Reuse intensity is about 6.67 requests per key per minute. The effective hit rate lands near 94.1%, origin pulls are about 713/min, blended latency is about 46 ms, and the target TTL is only about 2.7 minutes. That result suggests the current TTL has target cushion, so freshness and purge coverage deserve attention before chasing a higher hit rate.
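These numbers can be reproduced end to end with the Poisson fixed-TTL approximation the example values are consistent with; this is a sketch for checking the arithmetic, not the tool's published code.

```python
# Product-card edge cache example, computed from first principles.
lam = 12000 / 1800                       # ~6.67 requests per key per minute
modeled = (lam * 15) / (1 + lam * 15)    # fixed 15 minute TTL -> ~0.990
effective = modeled * (1 - 0.05)         # 5% invalidation penalty -> ~0.941
origin_pulls = 12000 * (1 - effective)   # ~713 requests/min to origin
blended = effective * 35 + (1 - effective) * 220   # ~46 ms
needed = 0.90 / 0.95                     # modeled rate needed for 90% target
target_ttl_min = needed / (lam * (1 - needed))     # ~2.7 minutes
```

The gap between the 15 minute TTL and the ~2.7 minute target TTL is the "target cushion" the example describes.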
High traffic with weak key reuse
A route with 1,200 cacheable requests per minute but 6,000 unique keys per minute has only 0.2 requests per key per minute. With a 10 minute fixed TTL and 5% invalidation penalty, the effective hit rate is about 63.3% and origin pulls stay near 440/min. Extending TTL toward the roughly 26.7 minutes needed for an 80% target may be less useful than fixing key fragmentation first.
Sliding TTL application cache
An application cache with 4,000 requests per minute, 2,000 unique keys per minute, a 5 minute sliding TTL, and a 2% invalidation penalty reaches about 98.0% effective hit rate under the sliding model. The same numbers should not be copied to a fixed-freshness CDN rule without changing the behavior setting, because sliding expiry keeps hot objects alive in a way fixed HTTP freshness usually does not.
Blocked hit-rate target
A 98% target with a 5% invalidation penalty is unreachable in the model because the post-invalidation ceiling is 95%. Raising TTL cannot solve that. The useful next check is purge frequency, bypass logic, cache-key duplication, and whether the target should be set against effective hit rate rather than the pre-invalidation hit rate.
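The ceiling check itself is simple arithmetic; a minimal sketch:

```python
def post_invalidation_ceiling(penalty: float) -> float:
    """Best possible effective hit rate once invalidation loss is applied."""
    return 1.0 - penalty

def target_blocked(target: float, penalty: float) -> bool:
    """True when no TTL, however long, can reach the target."""
    return target >= post_invalidation_ceiling(penalty)
```

A 98% target with a 5% penalty is blocked, while a 90% target with the same penalty still has 5 points of headroom.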
Freshness pressure despite strong hits
A busy cache with 8,000 requests per minute, 1,000 unique keys per minute, a 50 minute fixed TTL, and a 45 minute origin change interval can show about 97.8% effective hit rate while freshness exposure reaches the high-pressure label. The hit-rate result is operationally attractive, but the TTL is long enough compared with update cadence that explicit purge coverage or a shorter TTL should be checked before rollout.
Responsible Use Note:
Cache names, request rates, unique-key counts, origin latency, and purge assumptions can reveal traffic levels and service architecture. Use neutral labels when sharing exports outside the team, and avoid putting sensitive endpoint names, customer identifiers, or unreleased launch volumes into copied tables or URLs.
The calculation runs in the browser session after the page loads. The page does not test the live cache or call the origin service, and exported rows can still contain operational details that deserve the same handling as performance logs.
FAQ:
Why can a longer TTL stop helping?
Once most repeat requests already hit cache, more TTL adds little latency or origin relief. The remaining misses may come from new keys, invalidations, bypasses, or key fragmentation rather than expiry.
Why does sliding TTL show a higher hit rate than fixed TTL?
Sliding TTL refreshes expiry when a key is accessed, so repeated hits can keep a hot object alive. Fixed TTL expires from the fill time and does not extend just because the object was read again.
What should I enter for unique keys?
Use the number of distinct normalized cache keys per minute for the population being modeled. If a CDN strips tracking parameters or ignores most cookies, count the post-normalization keys rather than raw request URLs.
Does high hit rate mean the content is safe to cache longer?
No. Hit rate measures reuse, not correctness. Check freshness exposure, origin change cadence, purge coverage, and whether personalized or restricted responses can enter the shared cache.
Why is my target unreachable?
The target is unreachable when the invalidation penalty leaves too little hit-rate ceiling. For example, a 5% penalty makes any effective target at or above 95% impossible in the model.
Can this replace CDN cache analytics?
No. Use it for planning and what-if comparisons, then confirm with real cache logs, hit and miss status, origin request volume, purge history, and latency percentiles from the same cache population.
Glossary:
- TTL: Time to live, the freshness window during which a cached object can be reused without being treated as expired.
- Cache key: The normalized identifier used to decide whether an incoming request matches a stored object.
- Hit rate: The share of cacheable requests served from cache rather than sent to origin.
- Invalidation penalty: The modeled share of would-be hits lost to purges, bypass rules, manual clears, or other cache removal events.
- Origin pull: A request that misses cache and has to reach the origin or upstream service.
- Blended latency: The weighted average of cache-hit latency and origin-miss latency under the modeled hit rate.
- Freshness exposure: A planning cue that compares TTL with the origin change interval and weights it by effective hit rate.