API Pagination Window Calculator
Plan an API pagination window from item count, page size, rate limits, latency, workers, retry reserve, and payload risk checks.
A long API export is not just a question of how many records exist. The practical question is how those records move through a page boundary, a provider limit, a request allowance, and the client's own waiting time. A list with 125,000 records may be harmless when it can be pulled in large cursor pages under a generous quota, or disruptive when the endpoint caps each page, returns bulky payloads, and starts throttling the client halfway through the job.
Pagination turns one large collection into a sequence of smaller responses. Page size controls the amount requested at once, while an API page cap controls the amount the service will actually allow. Many APIs also expose a continuation value, next link, page number, offset, or incremental checkpoint so the client knows where to resume. Those mechanisms sound similar, but they behave differently when records are inserted, removed, or reordered while the export is still running.
| Pagination style | Best fit | Planning risk |
|---|---|---|
| Cursor or next-token | Large lists where the service returns a continuation value after each page. | Tokens can expire or become invalid when filters, ordering, selected fields, or authorization context changes. |
| Offset or page number | Small, stable lists where jumping to a known position is useful. | Deep offsets can slow down, hit provider depth limits, or skip and duplicate records when the collection changes. |
| Incremental window | Exports sorted by a stable timestamp or ID, with checkpoints saved after successful pages. | Late-arriving records and non-monotonic sort keys can leave gaps unless the overlap and retry policy are deliberate. |
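The cursor style in the table above can be sketched as a simple loop. This is a minimal illustration, not any provider's client: `fetch_page` is a hypothetical callable that takes the current cursor (or `None` for the first page) and returns the records plus the next cursor, and the page shape is assumed.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Assumed page shape: (records, next_cursor); next_cursor is None on the last page.
Page = Tuple[List[Dict], Optional[str]]


def iterate_cursor_pages(
    fetch_page: Callable[[Optional[str]], Page],
    start_cursor: Optional[str] = None,
) -> Iterator[Page]:
    """Yield (records, next_cursor) pairs until the service stops returning a cursor."""
    cursor = start_cursor
    while True:
        records, next_cursor = fetch_page(cursor)
        # The caller can persist next_cursor as a checkpoint after handling records.
        yield records, next_cursor
        if next_cursor is None:
            break
        cursor = next_cursor


def fake_fetch(cursor: Optional[str]) -> Page:
    """Stand-in for a real endpoint: two pages, then no further cursor."""
    pages = {
        None: ([{"id": 1}, {"id": 2}], "c1"),
        "c1": ([{"id": 3}], None),
    }
    return pages[cursor]
```

Saving the yielded cursor only after a page has been processed successfully lets a restarted job resume from the last good page, provided the provider keeps the token valid.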
The retrieval window also has a pacing side. A rate limit describes how many requests may be sent in a time window. Latency describes how long each page takes to return. Concurrency lets several page requests wait at the same time, but it does not increase the provider's request allowance. Extra workers help only while the client is latency-bound; once the quota is slower than the page-response wait, more workers mainly increase queue pressure and the chance of 429 responses.
Payload size is the quieter limit. A fast run can still be a poor plan when every page carries heavy records, wrapper overhead, or uncompressed response bodies. Large payloads affect disk space, network transfer, parser memory, downstream queue depth, and retry cost. A safer export often comes from smaller partitions with stored checkpoints rather than one maximum-size pull that has to restart from the beginning.
A pagination estimate is therefore a preflight model, not a provider guarantee. It helps choose a page size, budget an execution window, spot quota or payload risks, and decide when the job needs sharding, but the live run still has to obey provider headers, token expiry, retry backoff, and the service's own pagination rules.
How to Use This Tool:
Start with the collection size and endpoint rules, then use the review rows and charts to decide whether the modeled export window is safe enough to run.
- Set Total items to the number of records in the export window. If the API only gives an approximate count, enter the best current estimate and treat Modeled retrieval window as a planning value rather than a promise.
- Enter Page size and API page cap. When API page cap is greater than zero and lower than the requested page size, Page requests is calculated from the cap and Window Review asks you to lower the sent limit.
- Add the pacing values with Rate limit, Average page latency, and Concurrent workers. The Window Metrics tab shows both Rate-limit window and Latency window, and the slower one becomes the active ceiling.
- Choose Pagination style. Use cursor mode for next-token or next-link responses, offset mode for page-number or offset lists, and incremental mode for stable timestamp or ID windows.
- Use Average item size and the advanced Response overhead field when transfer volume matters. Payload estimate should be checked before large backfills, not only after timing looks acceptable.
- Add Retry reserve when 429 responses, transient 5xx errors, lock waits, or bounded backoff are likely. The reserve multiplies the final modeled window instead of changing the clean-run page count.
- Read Window Review and Shard Plan before using the number. Fix cap warnings, split very large payloads, and compare the current worker count with the single-worker, retry-reserved, and burst scenarios.
The result is ready to use when the summary badge says window ready and the review rows match the provider behavior you expect. A review plan badge means at least one input or assumption needs attention first.
Interpreting Results:
Modeled retrieval window is the headline timing estimate. Read its detail text before acting on it: a rate-limit ceiling means the request allowance dominates, while a latency ceiling means response time and worker count still affect the result.
Page requests, Payload estimate, and Effective item rate are the main sanity checks. A short window can still be risky when the payload is measured in many GiB, the provider caps page size below the planned request, or the retry path is not idempotent.
- When Rate-limit window is slower, adding Concurrent workers will not shorten the modeled time unless the provider allowance also increases.
- When Latency window is slower, better connection reuse, lower page latency, or a modest worker increase can help until the rate limit becomes slower.
- Page Size Tradeoff identifies the fastest tested page size, but the fastest candidate may still be too bulky for memory, streaming, or retry safety.
- Shard Plan compares timing scenarios. It does not prove that the API will accept the burst worker count or the same token in parallel requests.
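The point at which extra workers stop helping can be found with a short sweep. The sketch below assumes the two ceilings are computed as described under Technical Details; the function names and the `max_workers` bound are illustrative, not part of the tool.

```python
import math
from typing import Optional


def rate_limit_window(requests: int, rate_per_min: float) -> float:
    """Seconds needed to send all requests at the allowed pace."""
    return requests * 60.0 / rate_per_min


def latency_window(requests: int, workers: int, latency_s: float) -> float:
    """Seconds needed when requests run in worker-sized batches."""
    return math.ceil(requests / workers) * latency_s


def workers_until_rate_bound(
    requests: int, rate_per_min: float, latency_s: float, max_workers: int = 64
) -> Optional[int]:
    """Smallest worker count at which the rate limit becomes the slower ceiling.

    Returns None when the run stays latency-bound up to max_workers.
    """
    rate_window = rate_limit_window(requests, rate_per_min)
    for workers in range(1, max_workers + 1):
        if latency_window(requests, workers, latency_s) <= rate_window:
            return workers
    return None
```

For 250 requests at 480 requests per minute and 0.18 seconds per page, the sweep returns 2: one worker needs 45 seconds, two need 22.5 seconds, and the 31.25-second rate-limit window is already the slower ceiling from that point on.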
Do not treat a clean estimate as proof that a production pull will complete on schedule. Run a small sample and compare observed latency, response size, throttle headers, and retry frequency with the values entered here.
Technical Details:
A pagination plan starts by converting records into page requests. The requested page size is rounded down to a whole number and compared with any known endpoint maximum. A partial final page still counts as a request, so the request count uses a ceiling division rather than ordinary division.
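As a minimal sketch of that conversion, the helper names below are hypothetical, and flooring the cap the same way as the page size is an assumption rather than documented tool behavior:

```python
import math


def effective_page_size(page_size: float, api_page_cap: float = 0) -> int:
    """Round the requested size down, keep it at one or more, then apply any positive cap."""
    size = max(1, math.floor(page_size))
    cap = math.floor(api_page_cap)  # assumption: the cap is floored like the page size
    return min(size, cap) if cap > 0 else size


def page_requests(total_items: float, page_size: float, api_page_cap: float = 0) -> int:
    """Ceiling division, so a partial final page still counts as one request."""
    items = max(0, math.floor(total_items))
    size = effective_page_size(page_size, api_page_cap)
    return (items + size - 1) // size
```

Integer arithmetic is used for the ceiling so that very large item counts do not pick up floating-point rounding.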
The timing model compares two ceilings for the same request count. The rate-limit ceiling spreads requests across the allowed requests per minute. The latency ceiling groups requests into worker batches and multiplies those batches by average page latency. The larger ceiling controls the base window, and the retry reserve is applied after that comparison.
Formula Core
| Quantity | Unit | Technical role |
|---|---|---|
| Total items | records | Rounded down to a whole count and used only for the selected export window. |
| Page size | records per request | Rounded down to a whole number and never allowed below one record. |
| API page cap | records per request | A cap of zero means no known cap; a positive cap limits the effective page size. |
| Rate limit | requests per minute | Converted to requests per second and divided into the request count; provider headers should override estimates when they differ. |
| Concurrent workers | in-flight page requests | Rounded down to a whole number and never allowed below one worker. |
| Retry reserve | percent | Clamped from 0% to 300% and applied to the final modeled window. |
With 125,000 records, a 500-record effective page size, 480 requests per minute, 180 ms page latency, and 4 workers, the request count is 250. The rate-limit ceiling is 31.25 seconds. The latency ceiling is 63 worker batches times 0.18 seconds, or 11.34 seconds. The modeled clean-run window is therefore 31.25 seconds because the rate limit is slower. A 20% retry reserve raises the modeled window to 37.5 seconds.
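The arithmetic in this example can be reproduced with a short function. It is a sketch of the model as described on this page, not the tool's source code; the function name, parameter names, and returned keys are illustrative.

```python
import math


def modeled_window(
    total_items: int,
    page_size: int,
    rate_per_min: float,
    latency_ms: float,
    workers: int,
    retry_reserve_pct: float = 0.0,
) -> dict:
    """Compare the rate-limit and latency ceilings, then apply the retry reserve."""
    requests = (total_items + page_size - 1) // page_size  # ceiling division
    rate_window = requests * 60.0 / rate_per_min
    batches = math.ceil(requests / workers)
    latency_window = batches * latency_ms / 1000.0
    base = max(rate_window, latency_window)
    reserve = min(max(retry_reserve_pct, 0.0), 300.0)  # clamp to 0..300 percent
    return {
        "requests": requests,
        "rate_window_s": rate_window,
        "batches": batches,
        "latency_window_s": latency_window,
        "clean_window_s": base,
        "reserved_window_s": base * (100.0 + reserve) / 100.0,
        "ceiling": "rate-limit" if rate_window >= latency_window else "latency",
    }


plan = modeled_window(125_000, 500, 480, 180, 4, retry_reserve_pct=20)
```

With these inputs, `plan` holds 250 requests, 63 worker batches, a 31.25-second clean window, and a 37.5-second reserved window, matching the figures above.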
| Rule | Boundary | Effect on interpretation |
|---|---|---|
| Page-size cap | API page cap > 0 and requested Page size > cap | The effective page size is reduced to the cap and the review recommends lowering the sent limit. |
| Rate-limited run | Rate-limit window >= Latency window | Request allowance controls the modeled time. |
| Latency-limited run | Latency window > Rate-limit window | Worker count and average page latency can still change the modeled time. |
| Deep offset warning | Offset mode and more than 100 page requests | The review recommends cursor or keyset pagination when the provider supports it. |
| Payload review | Payload estimate above 5 GiB | Confirm response streaming, compression, disk space, and retry behavior. |
| Very large payload warning | Payload estimate above 1 TiB | Split the export into smaller windows before one continuous run. |
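The two payload boundaries in the table can be expressed as a small classifier. The payload formula here is an assumption, since this page does not spell it out: items times average item size, plus one response overhead per page request. The state labels are also illustrative.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def payload_estimate_bytes(
    total_items: int,
    avg_item_bytes: int,
    page_requests: int,
    overhead_bytes_per_response: int = 0,
) -> int:
    """Assumed formula: record bytes plus per-response wrapper overhead."""
    return total_items * avg_item_bytes + page_requests * overhead_bytes_per_response


def payload_state(payload_bytes: int) -> str:
    """Map an estimate onto the review boundaries: above 1 TiB, above 5 GiB, or neither."""
    if payload_bytes > TIB:
        return "split"   # split the export into smaller windows
    if payload_bytes > 5 * GIB:
        return "review"  # confirm streaming, compression, disk space, retries
    return "ok"
```

For example, 125,000 records of 2 KiB each with 512 bytes of overhead on 250 responses stays far below the 5 GiB review boundary.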
The page-size comparison tests 50, 100, 250, 500, 1000, the current effective size, twice the current size, and four times the current size. Positive API caps limit those candidates before timing is calculated. When two candidates finish in the same modeled time, the candidate with fewer page requests is favored.
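A rough sketch of that comparison follows. It assumes average page latency stays the same for every candidate size, which real endpoints rarely honor, and the function names are hypothetical.

```python
import math
from typing import Tuple


def window_seconds(
    total_items: int, page_size: int, rate_per_min: float, latency_s: float, workers: int
) -> Tuple[int, float]:
    """Return (page requests, clean-run window in seconds) for one candidate size."""
    requests = (total_items + page_size - 1) // page_size
    rate_window = requests * 60.0 / rate_per_min
    latency_window = math.ceil(requests / workers) * latency_s
    return requests, max(rate_window, latency_window)


def best_page_size(
    total_items: int,
    current_size: int,
    api_page_cap: int,
    rate_per_min: float,
    latency_s: float,
    workers: int,
) -> int:
    """Pick the fastest candidate; ties go to the candidate with fewer page requests."""
    candidates = {50, 100, 250, 500, 1000, current_size, 2 * current_size, 4 * current_size}
    if api_page_cap > 0:
        # A positive cap limits every candidate before timing is calculated.
        candidates = {min(size, api_page_cap) for size in candidates}
    scored = []
    for size in sorted(candidates):
        requests, window = window_seconds(total_items, size, rate_per_min, latency_s, workers)
        scored.append((window, requests, size))
    return min(scored)[2]
```

With 125,000 records, a current size of 500, a cap of 1,000, 480 requests per minute, 0.18 seconds of latency, and 4 workers, the capped candidate of 1,000 wins because it halves the request count under a rate-limited run.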
The fetch-time curve is a straight projection from zero items to the modeled window. It is useful for planning checkpoints and status updates, but it does not model uneven page arrivals, retry backoff, provider pauses, token expiration, or shifting offsets.
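A straight projection of this kind reduces to one proportion. The sketch below uses a hypothetical helper name and is only meant for comparing expected progress with observed checkpoints.

```python
def expected_items_at(elapsed_s: float, total_items: int, modeled_window_s: float) -> int:
    """Linear projection: items expected after elapsed_s seconds of the modeled window."""
    if modeled_window_s <= 0:
        return total_items
    # Clamp so the projection never goes below zero or past the full collection.
    fraction = min(max(elapsed_s / modeled_window_s, 0.0), 1.0)
    return int(total_items * fraction)
```

Halfway through a 37.5-second window for 125,000 records, the projection expects 62,500 items; a live job that is well behind that figure suggests the latency, throttling, or payload inputs need refreshing.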
Accuracy Notes:
The calculation runs from the numbers entered on the page. It does not call the target API, inspect live response headers, validate a cursor, or know whether provider tokens will remain valid for the entire job.
- Update Average page latency and Average item size from a recent sample before planning a large backfill.
- Keep filters, sorting, selected fields, authorization context, and page-size policy stable when comparing runs.
- Treat provider documentation, rate-limit headers, and Retry-After values as authoritative when they conflict with the model.
- Use smaller partitions when a failed export would be expensive to replay from the beginning.
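Honoring Retry-After in a live client means handling both forms the header can take: a delay in seconds or an HTTP date. The following is a minimal sketch using only the Python standard library; the function name is illustrative, and a real client would still apply its own upper bound on the wait.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds, or None when the header cannot be parsed."""
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

Passing `now` explicitly keeps the function testable; in production it defaults to the current UTC time.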
Worked Examples:
Rate-limited order backfill
For 125,000 records at Page size 500, API page cap 1,000, Rate limit 480 req/min, Average page latency 180 ms, and Concurrent workers 4, Page requests is 250 and Modeled retrieval window is about 31.3 seconds. The result is rate-limited, so doubling workers would not shorten this clean-run estimate.
Requested size above the provider cap
If the same export requests 2,000 records per page while API page cap is 1,000, the request count is calculated from 1,000 records per page. Window Review reports the page-size mismatch, and the practical correction is to lower the sent limit before deploying the job.
Offset depth review
For 60,000 records at 500 records per page, offset mode produces 120 page requests. That crosses the offset warning boundary, so Window Review recommends cursor or keyset pagination if the provider offers it. If offset is the only option, a smaller date or ID window reduces the depth of each run.
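The offset boundary and the suggested split can be sketched as follows. The 100-page threshold comes from this page; splitting into equal partitions that each stay at or under the threshold is an assumption about how one might size the smaller windows.

```python
def offset_depth_warning(style: str, page_requests: int, threshold: int = 100) -> bool:
    """True when offset mode needs more page requests than the review boundary."""
    return style == "offset" and page_requests > threshold


def partitions_needed(page_requests: int, threshold: int = 100) -> int:
    """Number of date or ID windows so each run stays at or under the threshold."""
    return max(1, (page_requests + threshold - 1) // threshold)
```

For the 120-request example, the warning fires and two partitions of about 60 requests each would keep every run below the boundary.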
Retry-heavy endpoint
Adding a 20% Retry reserve to the default 31.3-second plan raises Modeled retrieval window to about 37.5 seconds. The reserve does not model the exact backoff sequence, but it keeps a clean-run estimate from being mistaken for a realistic throttled job window.
Advanced Tips:
- Use Endpoint or resource to label the job when comparing multiple collections in copied rows or JSON output.
- Set Response overhead when responses include large envelopes, metadata, link arrays, or uncompressed wrappers.
- Compare Fetch Time Curve with the real job's checkpoint cadence; large gaps between expected and observed progress usually mean latency, throttling, or payload assumptions are stale.
- Read Page Size Tradeoff with the provider cap in mind. A larger page is not safer if it increases memory pressure or makes retries too expensive.
- Keep the Retry-reserved window in Shard Plan close to the real runbook when the endpoint often returns 429 or transient 5xx responses.
FAQ:
Does concurrency increase the API rate limit?
No. Concurrent workers changes how many page requests can wait at once, but Rate limit still controls the allowed request pace.
Why was my page size reduced?
A positive API page cap lower than the requested Page size becomes the effective page size. The review flags the mismatch so the real request limit can be corrected.
Why does offset mode warn after 100 pages?
Offset mode is flagged when Page requests exceeds 100 because deep offset pulls are more likely to hit page-depth limits or shift when records change during the run.
Why does the modeled window differ from the real export?
The estimate uses average latency, a fixed request allowance, and a straight progress curve. Real exports can vary because of server load, cache state, compression, network path, changing data, token expiry, and retry backoff.
What happens when total items is zero?
Window Review reports an empty retrieval scope. Some APIs still require one confirmation request to learn that the collection is empty, so use provider behavior when planning that final check.
Glossary:
- Cursor: A continuation value or next link returned by an API so the next request can resume from a service-defined position.
- Offset: A page number or item position supplied by the client to skip earlier records in a collection.
- Page cap: The maximum number of records an endpoint will allow in one page response.
- Rate limit: The allowed number of requests in a time window, often scoped to an account, token, endpoint, or IP address.
- Latency ceiling: The time implied by page-response latency and the number of workers waiting on page responses at once.
- Retry reserve: An added percentage of time used to keep the plan from assuming every request succeeds on the first attempt.