Log Sampling Rate Calculator
Calculate the log sampling rate needed to fit a daily storage target, with event-size math, protected reserves, burst planning, and keep-policy cues.
Introduction
High-volume logging becomes a storage problem before it becomes a search problem. Access logs, debug messages, health checks, retries, and routine success events can arrive by the thousands every second, while the records people need during an incident may be only a small part of the stream. Sampling gives teams a way to keep baseline telemetry without paying to retain every routine event forever.
A log sampling rate is the share of sampling-eligible log events that is kept before the stream reaches the retained storage tier. A 10% rate means the sampled portion is expected to keep roughly one event in ten across a large enough volume. It does not guarantee that every rare error, unusual user journey, audit event, or billing record appears in the saved data.
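To put a number on that last caution, the sketch below estimates how often a rare event vanishes entirely. The function name `miss_probability` is illustrative, and the arithmetic assumes every event is kept or dropped independently, which consistent or hash-based samplers may not satisfy.

```python
def miss_probability(keep_rate: float, occurrences: int) -> float:
    """Chance that none of `occurrences` independent events survives sampling."""
    if not 0.0 <= keep_rate <= 1.0:
        raise ValueError("keep_rate must be between 0 and 1")
    if occurrences < 0:
        raise ValueError("occurrences must be non-negative")
    return (1.0 - keep_rate) ** occurrences


# At a 10% keep rate, a single rare error is dropped 90% of the time,
# and five occurrences are all dropped about 59% of the time.
print(f"{miss_probability(0.10, 1):.2f}")
print(f"{miss_probability(0.10, 5):.2f}")
```

Under these assumptions, even a handful of repeats is more likely to be lost than seen, which is why rare categories are usually given separate keep rules.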
| Control | Typical use | Main caution |
|---|---|---|
| Full retention | Security, audit, billing, and incident-critical records that must be complete. | Cost grows directly with event count, event size, and index overhead. |
| Filtering | Drop known noise such as repetitive health checks or low-value debug messages. | A broad filter can remove the only clue for a future investigation. |
| Probabilistic sampling | Keep a representative baseline from routine high-volume logs. | Rare events can disappear unless they have separate always-keep rules. |
| Consistent sampling | Keep or drop related events together by trace ID, request ID, user, pod, or another stable attribute. | The same sampling key and rate must be used where consistent grouping matters. |
The basic storage arithmetic is simple, but the planning answer changes quickly. Event rate decides how many records arrive in a day. Average event size controls the bytes attached to each record. Burst planning raises the modeled rate above the measured average. Indexes, labels, metadata, and vendor accounting can add overhead. A daily target then forces a choice: keep everything, lower the sampleable volume, or redesign which categories stay at full fidelity.
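The arithmetic in that paragraph can be sketched as a single function. The name `modeled_daily_bytes` and its parameters are illustrative, and the sketch assumes that burst scales the event rate and that overhead scales the event size multiplicatively, which matches the worked examples later on this page.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GIB = 1024 ** 3


def modeled_daily_bytes(events_per_sec: float, bytes_per_event: float,
                        burst_multiplier: float = 1.0,
                        index_overhead: float = 0.0) -> float:
    """Modeled stored bytes per day before any sampling is applied."""
    planned_rate = events_per_sec * burst_multiplier        # burst raises the rate
    stored_size = bytes_per_event * (1.0 + index_overhead)  # overhead expands size
    return planned_rate * stored_size * SECONDS_PER_DAY


daily = modeled_daily_bytes(18_000, 950)
print(f"{daily / BYTES_PER_GIB:.2f} GiB/day")
```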
Good sampling plans separate log categories before applying a probability. Routine access logs and repetitive success messages can tolerate sampling when aggregate trends are the goal. Security findings, audit trails, payment records, and low-volume errors usually need deterministic keep rules, because a statistically fair sample can still miss the single record that matters later.
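A minimal sketch of that ordering, deterministic rules first and probability second, might look like the following. The category names in `ALWAYS_KEEP` and the function `should_keep` are hypothetical; a real pipeline would match on its own fields and use its collector's sampling mechanism.

```python
import random

# Hypothetical category names; real pipelines would match on their own fields.
ALWAYS_KEEP = {"security", "audit", "payment", "error"}


def should_keep(category: str, keep_rate: float, rng: random.Random) -> bool:
    """Apply deterministic keep rules first, then the sampling probability."""
    if category in ALWAYS_KEEP:
        return True                      # never left to chance
    return rng.random() < keep_rate      # routine traffic is sampled


rng = random.Random(42)  # seeded only so the sketch is repeatable
events = ["access"] * 1000 + ["audit"] * 3
kept = [c for c in events if should_keep(c, 0.10, rng)]
print(kept.count("audit"), "audit events kept of 3")
print(kept.count("access"), "access events kept of 1000")
```

All three audit events survive regardless of the seed, while the access count varies around one in ten.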
The most common mistake is treating the daily storage target as the only constraint. A plan also needs a realistic event size, a peak or average traffic assumption, and a clear definition of which records bypass sampling. Without those inputs, a keep rate can fit the storage budget while still producing an investigation gap.
How to Use This Tool:
Use the calculator when you know the approximate incoming rate and want a sampling percentage sized to a storage budget for one stream, service, index, or environment.
- Enter Incoming log rate before sampling and choose the matching unit. Use events per second, minute, hour, or day from collector counters, usage pages, or recent log metrics.
- Enter Average event size in bytes, KiB, or MiB per event. If your vendor already reports stored bytes, keep Index overhead neutral unless you are deliberately modeling extra labels, indexes, or metadata.
- Set Target daily volume to the retained daily budget for this stream. Do not use an account-wide quota unless this stream owns the whole quota.
- Open Advanced for rollout assumptions. Use Full-fidelity reserve for always-kept records, Burst multiplier for peak planning, and Index overhead for stored-byte expansion not already included in event size.
- Check the summary badge and Sampleable keep rate. If the badge says needs input, make the incoming rate, event size, target volume, and burst multiplier positive values before trusting the output.
- Use Sampling Budget for retained and dropped volume, Keep Policy for rollout guidance, and Sampling Budget Curve to see how retained GiB/day changes across candidate keep rates.
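The first two steps rely on unit conversion before any budget math happens. A small sketch of that normalization follows; the dictionary keys and function names are illustrative, not the tool's own identifiers.

```python
SECONDS_PER = {"second": 1, "minute": 60, "hour": 3_600, "day": 86_400}
BYTES_PER = {"bytes": 1, "KiB": 1024, "MiB": 1024 ** 2}


def to_events_per_second(rate: float, per: str) -> float:
    """Convert a rate counted per second, minute, hour, or day."""
    return rate / SECONDS_PER[per]


def to_bytes(size: float, unit: str) -> float:
    """Convert an average event size given in bytes, KiB, or MiB."""
    return size * BYTES_PER[unit]


print(to_events_per_second(1_080_000, "minute"))  # same stream as 18,000 per second
print(to_bytes(1.2, "KiB"))
```

Entering a per-minute counter as a per-second rate inflates the modeled volume sixtyfold, so it is worth confirming the unit before reading the keep rate.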
For a handoff, record the stream label, input units, Sampleable keep rate, Agent sampling value, Overall retained rate, retained daily volume, and any protected categories that must stay outside the probability rule.
Interpreting Results:
Sampleable keep rate is the probability to apply to records outside the full-fidelity reserve. Overall retained rate is the total share of modeled bytes kept after the reserve and sampled portion are combined. When the reserve is greater than zero, these two percentages usually differ.
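The relationship between the two percentages can be sketched as below, assuming the reserve is expressed as a share of modeled bytes. The function name `overall_retained_rate` is illustrative.

```python
def overall_retained_rate(reserve: float, sampleable_keep_rate: float) -> float:
    """Share of modeled bytes kept once reserve and sample are combined."""
    return reserve + (1.0 - reserve) * sampleable_keep_rate


# With a 5% reserve and a 32.40% sampleable keep rate, the overall rate is higher
# than the sampling probability because the reserve is kept in full.
print(f"{overall_retained_rate(0.05, 0.3240):.2%}")
```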
| Status badge | Boundary | What to do next |
|---|---|---|
| needs input | Required numeric inputs are zero or invalid | Fix Incoming log rate, Average event size, Target daily volume, or Burst multiplier. |
| budget conflict | Target daily volume is less than protected volume | Lower the reserve, raise the target, or route protected records to a separate retention path. |
| full fidelity fits | Sampleable keep rate is at least 99.999% | Keep all modeled events unless another cost or privacy policy requires filtering. |
| sampling planned | >= 10% and below full fidelity | Use the rate after exact keep rules for errors, audit, security, and other protected categories. |
| tight budget | >= 1% and < 10% | Check whether routine trends still have enough events and protect rare incidents separately. |
| sparse baseline | < 1% | Treat the sample as a thin baseline; do not rely on it for rare-event discovery. |
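The boundaries in the status table can be read as an ordered set of checks. The sketch below is one way to express them; the function `status_badge`, its parameters, and the order of evaluation are assumptions rather than the tool's actual code.

```python
def status_badge(inputs_valid: bool, target_bytes: float,
                 protected_bytes: float, keep_rate: float) -> str:
    """Map a solved plan to the status labels in the table above."""
    if not inputs_valid:
        return "needs input"
    if target_bytes < protected_bytes:
        return "budget conflict"
    if keep_rate >= 0.99999:
        return "full fidelity fits"
    if keep_rate >= 0.10:
        return "sampling planned"
    if keep_rate >= 0.01:
        return "tight budget"
    return "sparse baseline"


print(status_badge(True, 120.0, 0.0, 0.0872))
print(status_badge(True, 120.0, 137.6, 0.0))
```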
A retained-volume match is not proof that the policy is safe. After rollout, compare the modeled retained daily volume with actual ingest or index counters, then confirm that critical categories appear through deterministic rules rather than chance.
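A post-rollout comparison can be as simple as a relative difference. In the sketch below, the function name, the sample counter values, and the 10% tolerance are all hypothetical illustrations.

```python
def volume_drift(modeled_gib: float, actual_gib: float) -> float:
    """Relative difference between observed and modeled retained volume."""
    if modeled_gib <= 0:
        raise ValueError("modeled_gib must be positive")
    return (actual_gib - modeled_gib) / modeled_gib


# Hypothetical counters: the plan modeled 120 GiB/day and ingest reported 138.
drift = volume_drift(120.0, 138.0)
print(f"{drift:+.1%}")
if abs(drift) > 0.10:   # the 10% tolerance is an arbitrary illustration
    print("revisit event size, burst, or overhead assumptions")
```

A volume check like this says nothing about whether critical categories arrived, so it complements rather than replaces a review of the deterministic keep rules.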
Technical Details:
Daily log volume is the product of event frequency, bytes per event, and time. Sampling changes only the portion of the stream that is eligible for probability-based dropping. Records held in a full-fidelity reserve are counted before the probability is solved, so a large reserve can consume the target even when the sampleable traffic is dropped completely.
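The effect of the reserve can be seen by sweeping candidate keep rates, in the spirit of a budget curve. This is a sketch under the assumption that the reserve is a share of raw volume kept in full; `retained_gib` is an illustrative name.

```python
def retained_gib(raw_gib: float, reserve: float, keep_rate: float) -> float:
    """Retained GiB/day: the reserve is kept whole, the rest is sampled."""
    protected = raw_gib * reserve
    sampleable = raw_gib * (1.0 - reserve)
    return protected + sampleable * keep_rate


raw = 1375.97      # GiB/day, the default stream before sampling
target = 120.0     # GiB/day
for keep_rate in (0.0, 0.05, 0.10, 0.25, 1.0):
    kept = retained_gib(raw, reserve=0.10, keep_rate=keep_rate)
    flag = "over target" if kept > target else "fits"
    print(f"keep {keep_rate:>5.0%}: {kept:8.2f} GiB/day ({flag})")
```

With a 10% reserve, the plan is over target even at a 0% keep rate, because the protected volume alone is about 137.6 GiB/day.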
The calculation uses binary storage units for display. 1 GiB is 1,073,741,824 bytes, and 1 TiB is 1,024 GiB. The burst multiplier is applied to the incoming event rate before daily volume is calculated, and index overhead expands the estimated stored bytes when the average event size does not already include that overhead.
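For display, binary units divide by 1,024 at each step. The helper below is an illustrative sketch of that formatting, not the tool's own routine.

```python
def format_binary(num_bytes: float) -> str:
    """Render a byte count with binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:,.2f} {unit}"
        value /= 1024


print(format_binary(1_073_741_824))          # exactly one GiB
print(format_binary(18_000 * 950 * 86_400))  # the default stream for one day
```

Vendors that bill in decimal gigabytes (1,000,000,000 bytes) will report a slightly larger number for the same byte count.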
Formula Core:
The keep rate is solved by letting the protected volume use the daily target first, then assigning the remaining budget to the sampleable volume.
| Symbol | Meaning | Unit or range |
|---|---|---|
| E | Incoming event rate after unit conversion | Events per second |
| B | Burst multiplier | Greater than 0 |
| S | Average event size after unit conversion | Bytes per event |
| O | Index overhead as a fraction | 0 to 3, matching 0% to 300% |
| R | Full-fidelity reserve as a fraction | 0 to 0.95 |
| T | Target daily volume after unit conversion | Bytes per day |
| K_sampleable | Sampleable keep rate before percent formatting | 0 to 1 |
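The expression that combines these symbols is not shown in the text, so the following is a reconstruction inferred from the symbol table and checked against the worked examples on this page. Treat it as a sketch: V (modeled daily volume) and P (protected volume) are helper symbols introduced here, and the clamp to the 0 to 1 range is assumed from the stated range of K_sampleable.

```latex
V = E \cdot B \cdot S \cdot (1 + O) \cdot 86{,}400, \qquad P = R \cdot V

K_{\text{sampleable}} = \min\!\left(1,\ \max\!\left(0,\ \frac{T - P}{(1 - R)\, V}\right)\right)
```

When T is smaller than P the numerator is negative and the keep rate clamps to zero, which corresponds to the budget conflict status.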
With the default values, 18,000 events/sec at 950 bytes/event produces about 1,375.97 GiB/day before sampling. A 120 GiB/day target with no reserve gives 120 / 1,375.97 = 0.0872, so the Sampleable keep rate is about 8.72% and the Agent sampling value is 0.0872.
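The default calculation can be reproduced with a short solver. This is a sketch under the assumptions that burst and overhead multiply the raw volume and that the result is clamped to the 0 to 1 range; `sampleable_keep_rate` and its parameter names are illustrative.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GIB = 1024 ** 3


def sampleable_keep_rate(events_per_sec, bytes_per_event, target_bytes,
                         burst=1.0, overhead=0.0, reserve=0.0):
    """Solve the keep rate after the reserve has used its share of the target."""
    raw = events_per_sec * burst * bytes_per_event * (1.0 + overhead) * SECONDS_PER_DAY
    protected = raw * reserve
    sampleable = raw - protected
    if sampleable <= 0:
        return 0.0
    # Clamp so an oversized target never yields a rate above 100%.
    return min(1.0, max(0.0, (target_bytes - protected) / sampleable))


rate = sampleable_keep_rate(18_000, 950, 120 * BYTES_PER_GIB)
print(f"keep rate {rate:.2%}, agent value {rate:.4f}")
```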
The one-in-N cue is 1 / K_sampleable when the keep rate is above zero. It is a planning shorthand, not a promise about the exact position of retained records. Hash-based and trace-aware samplers may keep related records together, while independent random sampling can split related events unless the collector supports a consistent sampling key.
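One common way to get consistent decisions is to hash a stable key and compare the result to a threshold. The sketch below uses SHA-256 from the Python standard library purely as an illustration; it is not the algorithm of any particular collector, and `keep_by_key`, `one_in_n`, and the sample key are hypothetical.

```python
import hashlib


def keep_by_key(sampling_key: str, keep_rate: float) -> bool:
    """Deterministically keep or drop everything that shares `sampling_key`."""
    digest = hashlib.sha256(sampling_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")   # integer in [0, 2**64)
    threshold = int(keep_rate * 2 ** 64)         # keep buckets below this value
    return bucket < threshold


def one_in_n(keep_rate: float) -> float:
    """Planning shorthand: roughly one event in N survives."""
    if keep_rate <= 0:
        raise ValueError("one-in-N is undefined for a zero keep rate")
    return 1.0 / keep_rate


trace_events = ["trace-abc123"] * 4   # four records from one hypothetical request
decisions = {keep_by_key(key, 0.0872) for key in trace_events}
print(len(decisions) == 1)            # the whole trace shares one decision
print(f"about one in {one_in_n(0.0872):.1f}")
```

Because the decision depends only on the key and the rate, every component that hashes the same key at the same rate reaches the same verdict.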
Accuracy Notes:
The model is a capacity estimate, not a vendor bill. Real retained volume can differ because of compression, field indexing, parsing failures, ingest filters, archive routing, billing rounding, retention tiers, and traffic patterns that do not match the measured average.
- Use a peak-aware Burst multiplier when the daily target must survive traffic spikes rather than average days.
- Keep the same event-size method across comparisons; mixing line bytes, compressed bytes, and billed bytes can distort the keep rate.
- Check low keep rates against incident response needs before rollout, especially below 10%.
- Review full-fidelity rules separately when the status is budget conflict; changing the sampling probability cannot reduce protected volume.
Worked Examples:
Busy access stream. A stream at 18,000 events/sec with 950 bytes/event, no reserve, no burst uplift, and a 120 GiB/day target produces about 1.34 TiB/day before sampling. Sampleable keep rate is 8.72%, Agent sampling value is 0.0872, and the status is tight budget.
Reserve conflict. The same stream with a 10% Full-fidelity reserve protects about 137.60 GiB/day before any sampling. Because that protected volume is already above the 120 GiB/day target, the status changes to budget conflict and probability changes cannot make the modeled stream fit.
Peak and overhead plan. A service at 4,200 events/sec with 1.2 KiB/event, a 1.4x burst multiplier, 25% index overhead, a 5% reserve, and a 260 GiB/day target models about 726.75 GiB/day raw. Sampleable keep rate is about 32.40%, so the policy keeps roughly one in 3.1 sampleable events after the reserve.
Suspiciously low rate. A 75,000 events/sec stream at 1,400 bytes/event with 20% overhead and an 80 GiB/day target needs a 0.79% Sampleable keep rate. Before using that sparse baseline, recheck whether the event rate was entered in the right unit, whether the target belongs only to this stream, and whether low-volume error categories have deterministic keep rules.
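The unit recheck in the last example can be sketched by re-solving the same inputs under each unit the 75,000 figure might have meant. The helper `keep_rate_for` is illustrative and assumes no reserve and no burst uplift.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GIB = 1024 ** 3
SECONDS_PER = {"second": 1, "minute": 60, "hour": 3_600}


def keep_rate_for(rate, per, bytes_per_event, overhead, target_gib):
    """Keep rate with no reserve or burst, for a rate counted per `per`."""
    events_per_sec = rate / SECONDS_PER[per]
    raw = events_per_sec * bytes_per_event * (1.0 + overhead) * SECONDS_PER_DAY
    return min(1.0, target_gib * BYTES_PER_GIB / raw)


# Re-solve the last example under each unit the 75,000 figure might have meant.
for per in SECONDS_PER:
    rate = keep_rate_for(75_000, per, 1_400, 0.20, 80)
    print(f"75,000 events per {per}: keep {rate:.2%}")
```

A sub-1% result that jumps to a comfortable rate under a different unit is a strong hint to confirm the source counter before rollout.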
FAQ:
What value should I put into a collector or agent?
Use Agent sampling value when the collector expects a decimal probability, and use Sampleable keep rate when it expects a percentage.
Why can protected logs exceed the target?
Full-fidelity reserve is counted before sampling. If that reserved volume is larger than Target daily volume, the calculator reports budget conflict.
Does the calculator upload my logs?
No. It works from the numbers and optional stream label you enter; it does not read, parse, or upload log files.
Why does my vendor bill not match retained daily volume exactly?
The estimate can differ from billing because vendors may apply compression, indexes, metadata charges, archive rules, ingest filters, daily quotas, or rounding that are not captured by a single overhead percentage.
Should errors and audit logs use the same sampling rate?
Usually no. Keep critical categories with deterministic rules, then apply the calculated probability to routine sampleable logs.
What does a needs-input status mean?
At least one required numeric value is missing or invalid. Make Incoming log rate, Average event size, Target daily volume, and Burst multiplier greater than zero.
Glossary:
- Incoming log rate: The unsampled event frequency converted to events per second.
- Average event size: The expected bytes per log event before optional overhead is added.
- Target daily volume: The retained storage budget for the modeled stream.
- Full-fidelity reserve: The share of raw volume kept before probability sampling is applied.
- Sampleable keep rate: The probability applied to logs outside the full-fidelity reserve.
- Overall retained rate: The total share of modeled raw volume kept after reserve and sampling are combined.
- Agent sampling value: The sampleable keep rate expressed as a decimal probability.
- Index overhead: Extra stored bytes for labels, metadata, indexes, or similar accounting overhead.