API pagination window inputs

  • Total items: Enter the record count to retrieve from the collection.
  • Page size: Use the page size you will send as limit, page[size], per_page, or equivalent.
  • API page cap: Set the endpoint maximum so the modeled request count never assumes an invalid limit.
  • Rate limit (req/min): Enter allowed page requests per minute after any shared-client allowance.
  • Average page latency (ms): Use a measured average or a conservative sample from staging logs.
  • Concurrent workers (workers): Use the number of page requests that can be in flight at once.
  • Average item size (bytes/item): Use a sampled response payload divided by item count.
  • Pagination style: Choose cursor when the API returns next tokens; choose offset for page/offset collections.
  • Endpoint or resource: Optional label for exports, copied rows, and the JSON payload.
  • Response overhead (KiB/page): Use this for metadata envelopes, headers, or gzip-disabled wrappers.
  • Retry reserve (%): Enter 0 for a clean run; add 5-25% when retries are common.

A long API export is not just a question of how many records exist. The practical question is how those records move through a page boundary, a provider limit, a request allowance, and the client's own waiting time. A list with 125,000 records may be harmless when it can be pulled in large cursor pages under a generous quota, or disruptive when the endpoint caps each page, returns bulky payloads, and starts throttling the client halfway through the job.

Pagination turns one large collection into a sequence of smaller responses. Page size controls the amount requested at once, while an API page cap controls the amount the service will actually allow. Many APIs also expose a continuation value, next link, page number, offset, or incremental checkpoint so the client knows where to resume. Those mechanisms sound similar, but they behave differently when records are inserted, removed, or reordered while the export is still running.

API pagination styles and planning risks
| Pagination style | Best fit | Planning risk |
| --- | --- | --- |
| Cursor or next-token | Large lists where the service returns a continuation value after each page. | Tokens can expire or become invalid when filters, ordering, selected fields, or authorization context changes. |
| Offset or page number | Small, stable lists where jumping to a known position is useful. | Deep offsets can slow down, hit provider depth limits, or skip and duplicate records when the collection changes. |
| Incremental window | Exports sorted by a stable timestamp or ID, with checkpoints saved after successful pages. | Late-arriving records and non-monotonic sort keys can leave gaps unless the overlap and retry policy are deliberate. |
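
The skip-and-duplicate risk of offset paging can be reproduced with a small in-memory simulation. This is only a sketch: the `walk` helper, the integer-ID store, and the single mutation after the first page are illustrative assumptions, not the behavior of any particular provider.

```python
def walk(store, page_size, mutate, use_keyset):
    """Read every page of `store` (a set of integer IDs).

    `mutate` is called once, right after the first page, to simulate a
    write that lands while the export is still running (hypothetical setup).
    """
    seen, offset, last_id, mutated = [], 0, None, False
    while True:
        ordered = sorted(store)
        if use_keyset:
            # Keyset: resume strictly after the last ID that was returned.
            page = [r for r in ordered if last_id is None or r > last_id][:page_size]
        else:
            # Offset: resume at a numeric position in the current ordering.
            page = ordered[offset:offset + page_size]
        if not page:
            return seen
        seen.extend(page)
        offset += len(page)
        last_id = page[-1]
        if not mutated:
            mutate(store)
            mutated = True


ids = range(1, 7)
print(walk(set(ids), 3, lambda s: s.discard(2), use_keyset=False))  # record 4 is skipped
print(walk(set(ids), 3, lambda s: s.add(0), use_keyset=False))      # record 3 is read twice
print(walk(set(ids), 3, lambda s: s.discard(2), use_keyset=True))   # all six records, once each
```

Deleting an already-read record shifts later records toward the front, so the next offset jumps past one of them. Inserting a record ahead of the read position shifts them back, so one is returned again. The keyset walk is unaffected by either change because it resumes from a value rather than a position.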

The retrieval window also has a pacing side. A rate limit describes how many requests may be sent in a time window. Latency describes how long each page takes to return. Concurrency lets several page requests wait at the same time, but it does not increase the provider's request allowance. Extra workers help only while the client is latency-bound; once the quota is slower than the page-response wait, more workers mainly increase queue pressure and the chance of 429 responses.
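
The point where extra workers stop helping can be seen by sweeping the worker count against a fixed quota. The sketch below assumes the two ceilings described on this page (requests spread over the per-minute allowance, and worker batches multiplied by average latency); the function names and sample numbers are illustrative.

```python
def ceilings(page_requests, requests_per_minute, latency_ms, workers):
    """Return (rate_limit_seconds, latency_seconds) for one plan."""
    rate_s = page_requests * 60 / requests_per_minute
    batches = -(-page_requests // workers)  # ceiling division
    latency_s = batches * latency_ms / 1000
    return rate_s, latency_s


def sweep(page_requests, requests_per_minute, latency_ms, worker_counts):
    """List (workers, modeled_seconds, active_ceiling) for each worker count."""
    rows = []
    for workers in worker_counts:
        rate_s, latency_s = ceilings(page_requests, requests_per_minute, latency_ms, workers)
        bound = "rate limit" if rate_s >= latency_s else "latency"
        rows.append((workers, max(rate_s, latency_s), bound))
    return rows


for row in sweep(250, 480, 180, (1, 2, 4, 8)):
    print(row)
```

With 250 page requests, 480 requests per minute, and 180 ms latency, one worker is latency-bound at 45 seconds. Two workers already reach the 31.25-second rate-limit ceiling, and four or eight workers leave the modeled time unchanged.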

Figure: API pagination window inputs and active time ceiling. Records and page limits determine request count. Rate limit and latency produce competing time ceilings, and the slower ceiling sets the modeled retrieval window.

Payload size is the quieter limit. A fast run can still be a poor plan when every page carries heavy records, wrapper overhead, or uncompressed response bodies. Large payloads affect disk space, network transfer, parser memory, downstream queue depth, and retry cost. A safer export often comes from smaller partitions with stored checkpoints rather than one maximum-size pull that has to restart from the beginning.
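
A rough payload check can be done before any timing work. The sketch below follows the payload formula given under Technical Details (item bytes plus per-page overhead in KiB); the 2,048 bytes per item and 4 KiB per page are assumed sample values, and `human` is a hypothetical formatting helper.

```python
def payload_bytes(total_items, effective_page_size, avg_item_bytes, overhead_kib_per_page):
    """Estimate transfer volume: item bytes plus per-page wrapper overhead."""
    page_requests = -(-total_items // effective_page_size)  # ceiling division
    return total_items * avg_item_bytes + page_requests * overhead_kib_per_page * 1024


def human(n):
    """Format a byte count with binary units."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.2f} {unit}"
        n /= 1024


estimate = payload_bytes(125_000, 500, 2_048, 4)
print(estimate, human(estimate))
```

For these sample inputs the estimate is 257,024,000 bytes, or about 245 MiB. Most of that comes from the records themselves; the per-page overhead only starts to matter when pages are small or envelopes are large.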

A pagination estimate is therefore a preflight model, not a provider guarantee. It helps choose a page size, budget an execution window, spot quota or payload risks, and decide when the job needs sharding, but the live run still has to obey provider headers, token expiry, retry backoff, and the service's own pagination rules.

How to Use This Tool:

Start with the collection size and endpoint rules, then use the review rows and charts to decide whether the modeled export window is safe enough to run.

  1. Set Total items to the number of records in the export window. If the API only gives an approximate count, enter the best current estimate and treat Modeled retrieval window as a planning value rather than a promise.
  2. Enter Page size and API page cap. When API page cap is greater than zero and lower than the requested page size, Page requests is calculated from the cap and Window Review asks you to lower the sent limit.
  3. Add the pacing values with Rate limit, Average page latency, and Concurrent workers. The Window Metrics tab shows both Rate-limit window and Latency window, and the slower one becomes the active ceiling.
  4. Choose Pagination style. Use cursor mode for next-token or next-link responses, offset mode for page-number or offset lists, and incremental mode for stable timestamp or ID windows.
  5. Use Average item size and the advanced Response overhead field when transfer volume matters. Payload estimate should be checked before large backfills, not only after timing looks acceptable.
  6. Add Retry reserve when 429 responses, transient 5xx errors, lock waits, or bounded backoff are likely. The reserve multiplies the final modeled window instead of changing the clean-run page count.
  7. Read Window Review and Shard Plan before using the number. Fix cap warnings, split very large payloads, and compare the current worker count with the single-worker, retry-reserved, and burst scenarios.

The result is ready to use when the summary badge says window ready and the review rows match the provider behavior you expect. A review plan badge means at least one input or assumption needs attention first.

Interpreting Results:

Modeled retrieval window is the headline timing estimate. Read its detail text before acting on it: a rate-limit ceiling means the request allowance dominates, while a latency ceiling means response time and worker count still affect the result.

Page requests, Payload estimate, and Effective item rate are the main sanity checks. A short window can still be risky when the payload is measured in many GiB, the provider caps page size below the planned request, or the retry path is not idempotent.

  • When Rate-limit window is slower, adding Concurrent workers will not shorten the modeled time unless the provider allowance also increases.
  • When Latency window is slower, better connection reuse, lower page latency, or a modest worker increase can help until the rate limit becomes slower.
  • Page Size Tradeoff identifies the fastest tested page size, but the fastest candidate may still be too bulky for memory, streaming, or retry safety.
  • Shard Plan compares timing scenarios. It does not prove that the API will accept the burst worker count or the same token in parallel requests.

Do not treat a clean estimate as proof that a production pull will complete on schedule. Run a small sample and compare observed latency, response size, throttle headers, and retry frequency with the values entered here.

Technical Details:

A pagination plan starts by converting records into page requests. The requested page size is rounded down to a whole number and compared with any known endpoint maximum. A partial final page still counts as a request, so the request count uses a ceiling division rather than ordinary division.

The timing model compares two ceilings for the same request count. The rate-limit ceiling spreads requests across the allowed requests per minute. The latency ceiling groups requests into worker batches and multiplies those batches by average page latency. The larger ceiling controls the base window, and the retry reserve is applied after that comparison.

Formula Core

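\[
\begin{aligned}
\text{Effective page size} &=
  \begin{cases}
    \min(\text{requested page size},\ \text{API page cap}) & \text{if API page cap} > 0 \\
    \text{requested page size} & \text{otherwise}
  \end{cases} \\
\text{Page requests} &= \left\lceil \frac{\text{total items}}{\text{effective page size}} \right\rceil \\
\text{Rate-limit seconds} &= \frac{\text{page requests}}{\text{requests per minute}} \times 60 \\
\text{Latency seconds} &= \left\lceil \frac{\text{page requests}}{\text{workers}} \right\rceil \times \frac{\text{average latency ms}}{1000} \\
\text{Modeled window} &= \max(\text{rate-limit seconds},\ \text{latency seconds}) \times \left(1 + \frac{\text{retry reserve percent}}{100}\right) \\
\text{Payload bytes} &= \text{total items} \times \text{average item bytes} + \text{page requests} \times \text{response overhead KiB} \times 1024
\end{aligned}
\]
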
API pagination variables and exact roles
| Quantity | Unit | Technical role |
| --- | --- | --- |
| Total items | records | Rounded down to a whole count and used only for the selected export window. |
| Page size | records per request | Rounded down to a whole number and never allowed below one record. |
| API page cap | records per request | A cap of zero means no known cap; a positive cap limits the effective page size. |
| Rate limit | requests per minute | Converted to seconds for the request count; provider headers should override estimates when they differ. |
| Concurrent workers | in-flight page requests | Rounded down to a whole number and never allowed below one worker. |
| Retry reserve | percent | Clamped from 0% to 300% and applied to the final modeled window. |

With 125,000 records, a 500-record effective page size, 480 requests per minute, 180 ms page latency, and 4 workers, the request count is 250. The rate-limit ceiling is 31.25 seconds. The latency ceiling is 63 worker batches times 0.18 seconds, or 11.34 seconds. The modeled clean-run window is therefore 31.25 seconds because the rate limit is slower. A 20% retry reserve raises the modeled window to 37.5 seconds.
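
The same arithmetic can be written as a short Python sketch. It follows the formulas and rounding rules described in this section, but the `Plan` dataclass, the `model` function, and the returned keys are illustrative names, and the sketch assumes a positive rate limit.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    total_items: int
    page_size: int
    api_page_cap: int          # 0 means no known cap
    requests_per_minute: float  # assumed to be greater than zero
    latency_ms: float
    workers: int
    retry_reserve_pct: float = 0.0


def model(plan: Plan) -> dict:
    """Compute request count, both ceilings, and the modeled window."""
    items = max(0, int(plan.total_items))
    size = max(1, int(plan.page_size))
    if plan.api_page_cap > 0:
        size = min(size, int(plan.api_page_cap))
    workers = max(1, int(plan.workers))
    reserve = min(300.0, max(0.0, plan.retry_reserve_pct))

    page_requests = -(-items // size)  # ceiling division
    rate_s = page_requests * 60 / plan.requests_per_minute
    latency_s = -(-page_requests // workers) * plan.latency_ms / 1000
    base_s = max(rate_s, latency_s)
    return {
        "effective_page_size": size,
        "page_requests": page_requests,
        "rate_s": rate_s,
        "latency_s": latency_s,
        "ceiling": "rate limit" if rate_s >= latency_s else "latency",
        "window_s": base_s * (100 + reserve) / 100,  # reserve applied last
    }


result = model(Plan(125_000, 500, 1_000, 480, 180, 4, retry_reserve_pct=20))
print(result)
```

Run against the example above, the sketch gives 250 page requests, a 31.25-second rate-limit ceiling, an 11.34-second latency ceiling, and a 37.5-second window after the 20% reserve.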

API pagination warning and review rules
| Rule | Boundary | Effect on interpretation |
| --- | --- | --- |
| Page-size cap | API page cap > 0 and requested Page size > cap | The effective page size is reduced to the cap and the review recommends lowering the sent limit. |
| Rate-limited run | Rate-limit window >= Latency window | Request allowance controls the modeled time. |
| Latency-limited run | Latency window > Rate-limit window | Worker count and average page latency can still change the modeled time. |
| Deep offset warning | Offset mode and more than 100 page requests | The review recommends cursor or keyset pagination when the provider supports it. |
| Payload review | Payload estimate above 5 GiB | Confirm response streaming, compression, disk space, and retry behavior. |
| Very large payload warning | Payload estimate above 1 TiB | Split the export into smaller windows before one continuous run. |
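
The rules in this table can be expressed as a small checker. This is a sketch under assumptions: the `review` function, its argument names, and the message wording are hypothetical, and reporting only the more severe of the two payload findings is a choice made here, not something the table specifies.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def review(requested_size, api_page_cap, rate_s, latency_s,
           style, page_requests, payload_bytes):
    """Return review findings using the boundaries from the table above."""
    findings = []
    if api_page_cap > 0 and requested_size > api_page_cap:
        findings.append("page-size cap: lower the sent limit to the cap")
    if rate_s >= latency_s:
        findings.append("rate-limited: request allowance controls the modeled time")
    else:
        findings.append("latency-limited: workers and page latency still matter")
    if style == "offset" and page_requests > 100:
        findings.append("deep offset: prefer cursor or keyset pagination if available")
    # Assumption: report only the more severe payload finding.
    if payload_bytes > TIB:
        findings.append("very large payload: split the export into smaller windows")
    elif payload_bytes > 5 * GIB:
        findings.append("payload review: confirm streaming, compression, disk space, retries")
    return findings


for finding in review(2_000, 1_000, 31.25, 11.34, "offset", 120, 6 * GIB):
    print(finding)
```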

The page-size comparison tests 50, 100, 250, 500, 1000, the current effective size, twice the current size, and four times the current size. Positive API caps limit those candidates before timing is calculated. When two candidates finish in the same modeled time, the candidate with fewer page requests is favored.
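
A minimal sketch of that comparison is shown below. It assumes that a positive cap clamps each candidate to the cap (filtering out over-cap candidates would give the same set in the examples here), and the function names are illustrative.

```python
def candidates(effective_size, api_page_cap):
    """Candidate page sizes: fixed steps plus 1x, 2x, and 4x the current size."""
    raw = [50, 100, 250, 500, 1000,
           effective_size, 2 * effective_size, 4 * effective_size]
    if api_page_cap > 0:
        raw = [min(c, api_page_cap) for c in raw]  # assumption: clamp to the cap
    return sorted(set(raw))


def best_page_size(total_items, effective_size, api_page_cap,
                   requests_per_minute, latency_ms, workers):
    """Pick the fastest candidate; ties go to the one with fewer page requests."""
    def score(size):
        requests = -(-total_items // size)
        rate_s = requests * 60 / requests_per_minute
        latency_s = -(-requests // workers) * latency_ms / 1000
        return (max(rate_s, latency_s), requests)

    return min(candidates(effective_size, api_page_cap), key=score)


print(candidates(500, 1_000))
print(best_page_size(125_000, 500, 1_000, 480, 180, 4))
```

For the 125,000-record example with a 1,000-record cap, the fastest candidate is 1,000 records per page. As the interpretation notes say, that result is about timing only and says nothing about memory or retry cost.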

The fetch-time curve is a straight projection from zero items to the modeled window. It is useful for planning checkpoints and status updates, but it does not model uneven page arrivals, retry backoff, provider pauses, token expiration, or shifting offsets.
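
Because the curve is a straight line, the expected elapsed time at any checkpoint is a simple proportion. The helper names below are hypothetical, and the 37.5-second window is taken from the retry-reserved example in this section.

```python
def expected_seconds(items_done, total_items, modeled_window_s):
    """Linear projection: elapsed time expected after `items_done` records."""
    if total_items <= 0:
        return 0.0
    items_done = min(max(items_done, 0), total_items)
    return modeled_window_s * items_done / total_items


def drift_seconds(observed_s, items_done, total_items, modeled_window_s):
    """Positive drift means the live run is behind the straight-line projection."""
    return observed_s - expected_seconds(items_done, total_items, modeled_window_s)


print(expected_seconds(25_000, 125_000, 37.5))      # projected time at the first checkpoint
print(drift_seconds(9.0, 25_000, 125_000, 37.5))    # how far a 9-second observation lags
```

A growing positive drift across checkpoints is a hint that the latency, throttling, or payload inputs no longer match the live run.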

Accuracy Notes:

The calculation runs from the numbers entered on the page. It does not call the target API, inspect live response headers, validate a cursor, or know whether provider tokens will remain valid for the entire job.

  • Update Average page latency and Average item size from a recent sample before planning a large backfill.
  • Keep filters, sorting, selected fields, authorization context, and page-size policy stable when comparing runs.
  • Treat provider documentation, rate-limit headers, and Retry-After values as authoritative when they conflict with the model.
  • Use smaller partitions when a failed export would be expensive to replay from the beginning.
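
When a live run does receive a Retry-After value, the client has to turn it into a wait time. The sketch below handles the two forms the header can take, a number of seconds or an HTTP date, using only the Python standard library. The function name and the choice to return None for unparseable values are assumptions for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Return the wait in seconds implied by a Retry-After value, or None."""
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


fixed_now = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
print(retry_after_seconds("120"))
print(retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now=fixed_now))
```

A wait derived this way should replace, not add to, whatever pause the planning model assumed for that request.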

Worked Examples:

Rate-limited order backfill

For 125,000 records at Page size 500, API page cap 1,000, Rate limit 480 req/min, Average page latency 180 ms, and Concurrent workers 4, Page requests is 250 and Modeled retrieval window is about 31.3 seconds. The result is rate-limited: the latency ceiling is only 11.34 seconds, so doubling workers would not shorten this clean-run estimate.

Requested size above the provider cap

If the same export requests 2,000 records per page while API page cap is 1,000, the request count is calculated from 1,000 records per page. Window Review reports the page-size mismatch, and the practical correction is to lower the sent limit before deploying the job.

Offset depth review

For 60,000 records at 500 records per page, offset mode produces 120 page requests. That crosses the offset warning boundary, so Window Review recommends cursor or keyset pagination if the provider offers it. If offset is the only option, a smaller date or ID window reduces the depth of each run.
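
One way to size those smaller windows is to split the collection evenly so that no window needs more than 100 page requests. The sketch below uses record-position ranges as a stand-in for real date or ID boundaries, which depend on the provider's filters; the `offset_windows` name and the even split are assumptions.

```python
def offset_windows(total_items, page_size, max_pages=100):
    """Split a collection into equal ranges that each need at most `max_pages` pages."""
    if total_items <= 0:
        return []
    pages = -(-total_items // page_size)          # ceiling division
    windows = max(1, -(-pages // max_pages))
    per_window = -(-total_items // windows)
    return [(start, min(start + per_window, total_items))
            for start in range(0, total_items, per_window)]


for start, end in offset_windows(60_000, 500):
    print(start, end, -(-(end - start) // 500), "pages")
```

For the 60,000-record example this yields two windows of 30,000 records, each needing 60 page requests, which stays under the warning boundary.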

Retry-heavy endpoint

Adding a 20% Retry reserve to the default 31.3-second plan raises Modeled retrieval window to about 37.5 seconds. The reserve does not model the exact backoff sequence, but it keeps a clean-run estimate from being mistaken for a realistic throttled job window.

Advanced Tips:

  • Use Endpoint or resource to label the job when comparing multiple collections in copied rows or JSON output.
  • Set Response overhead when responses include large envelopes, metadata, link arrays, or uncompressed wrappers.
  • Compare Fetch Time Curve with the real job's checkpoint cadence; large gaps between expected and observed progress usually mean latency, throttling, or payload assumptions are stale.
  • Read Page Size Tradeoff with the provider cap in mind. A larger page is not safer if it increases memory pressure or makes retries too expensive.
  • Keep the Retry-reserved window in Shard Plan close to the real runbook when the endpoint often returns 429 or transient 5xx responses.

FAQ:

Does concurrency increase the API rate limit?

No. Concurrent workers changes how many page requests can wait at once, but Rate limit still controls the allowed request pace.

Why was my page size reduced?

A positive API page cap lower than the requested Page size becomes the effective page size. The review flags the mismatch so the real request limit can be corrected.

Why does offset mode warn after 100 pages?

Offset mode is flagged when Page requests exceeds 100 because deep offset pulls are more likely to hit page-depth limits or shift when records change during the run.

Why does the modeled window differ from the real export?

The estimate uses average latency, a fixed request allowance, and a straight progress curve. Real exports can vary because of server load, cache state, compression, network path, changing data, token expiry, and retry backoff.

What happens when total items is zero?

Window Review reports an empty retrieval scope. Some APIs still require one confirmation request to learn that the collection is empty, so use provider behavior when planning that final check.

Glossary:

Cursor
A continuation value or next link returned by an API so the next request can resume from a service-defined position.
Offset
A page number or item position supplied by the client to skip earlier records in a collection.
Page cap
The maximum number of records an endpoint will allow in one page response.
Rate limit
The allowed number of requests in a time window, often scoped to an account, token, endpoint, or IP address.
Latency ceiling
The time implied by page-response latency and the number of workers waiting on page responses at once.
Retry reserve
An added percentage of time used to keep the plan from assuming every request succeeds on the first attempt.
