Syslog Storage Calculator
Calculate online syslog storage from EPS, event size, retention, compression, index overhead, copies, reserve, and disk budget for SIEM capacity planning.
Introduction:
Syslog storage planning estimates how much usable disk a searchable log tier needs for events sent by network devices, servers, appliances, applications, and forwarders. The basic pressure comes from events per second and average message size. Retention then decides how many days of that flow stay searchable before older data leaves the tier or moves to archive.
The estimate matters because syslog traffic can grow quietly. A firewall rule change, verbose authentication logging, debug messages, or a new fleet of switches can raise daily ingest without changing the number of collectors. A retention target that worked for 50 EPS can become too small when the same index starts receiving 500 EPS or when message size doubles after structured fields are added.
Good sizing separates searchable retention from archive retention. Searchable storage is the faster tier used for investigations, dashboards, and alert follow-up. Archive storage can be cheaper and slower, but it is not a substitute when operators need to search the last 30, 90, or 180 days without restoring data first.
A syslog storage estimate is not a log-quality review. It cannot prove that messages are parsed correctly, that timestamps are synchronized, that severity values are meaningful, or that compliance retention is enough. It gives a defensible capacity model so storage, retention, and copy assumptions can be checked before a collector, SIEM index, or search tier fills up.
Technical Details:
Syslog capacity starts with a message stream. RFC 5424 defines a structured syslog message with a header, structured data, and an optional message body, while older BSD-style syslog is less uniform. For storage planning, the exact format matters mainly through average retained message size. A short router notice and a structured firewall event can have very different byte counts even if both are one syslog event.
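As a rough illustration of how format drives byte count, the sketch below measures two sample messages modeled on the examples in RFC 3164 and RFC 5424. The helper names `event_size_bytes` and `average_event_size` are hypothetical, and UTF-8 encoding is an assumption about how the messages arrive.

```python
# Illustrative messages modeled on the examples in RFC 3164 and RFC 5424.
bsd_notice = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
rfc5424_event = (
    "<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 "
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
    "An application event log entry"
)


def event_size_bytes(message: str) -> int:
    """Byte length of one message (UTF-8 encoding is assumed)."""
    return len(message.encode("utf-8"))


def average_event_size(messages: list[str]) -> float:
    """Mean byte length across a sample of messages."""
    if not messages:
        raise ValueError("sample must contain at least one message")
    return sum(event_size_bytes(m) for m in messages) / len(messages)


sample = [bsd_notice, rfc5424_event]
for msg in sample:
    print(event_size_bytes(msg), "bytes")
print("average:", average_event_size(sample))
```

A real sample should come from the retained source mix rather than two hand-picked messages; the point here is only that both lines count as one event each while differing widely in size.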
Events per second is a sustained rate, not a brief burst. A peak multiplier can be applied when busy-hour traffic should be included in the retained-window model. Retention then multiplies daily ingest by days kept, and storage factors such as compression, index overhead, replicas, and reserve turn raw retained bytes into usable searchable storage.
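The rate selection described above can be sketched as follows. The function name, the mode strings, and the choice to apply the peak multiplier in both modes are assumptions for illustration, not the calculator's confirmed behavior.

```python
def planning_eps(mode, log_sources=0, eps_per_source=0.0,
                 aggregate_eps=0.0, peak_multiplier=1.0):
    """Sustained EPS used for sizing, after any peak multiplier.

    mode is "fleet" (source count times per-source EPS) or "aggregate"
    (a measured total). Both labels are hypothetical.
    """
    if mode == "fleet":
        # Source count is treated as a whole number.
        base_eps = int(log_sources) * eps_per_source
    elif mode == "aggregate":
        base_eps = aggregate_eps
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return base_eps * peak_multiplier


print(planning_eps("fleet", log_sources=250, eps_per_source=0.8))
print(planning_eps("aggregate", aggregate_eps=1200, peak_multiplier=1.5))
```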
Formula Core:
The model first chooses the syslog rate, then applies byte, retention, and storage assumptions in order.
Index overhead is calculated as a percentage of compressed retained data before stored copies are multiplied. This keeps the searchable primary copy, its index files, and any replica or clustered copies distinct enough to audit.
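One way to write the chain out, reconstructed from the description above and consistent with the worked examples further down. The symbol names are labels chosen here, not the calculator's own: $m_{\text{peak}}$ is the peak multiplier, $s$ the average event size in bytes, $d$ the whole retained days, $r$ the compression ratio, $p_{\text{index}}$ and $p_{\text{reserve}}$ the index overhead and planning reserve as fractions, and $c$ the stored copies.

$$
\begin{aligned}
\text{EPS}_{\text{plan}} &= \text{EPS}_{\text{base}} \times m_{\text{peak}} \\
B_{\text{day}} &= \text{EPS}_{\text{plan}} \times s \times 86{,}400 \\
B_{\text{raw}} &= B_{\text{day}} \times d \\
B_{\text{comp}} &= B_{\text{raw}} / r \\
B_{\text{index}} &= B_{\text{comp}} \times p_{\text{index}} \\
B_{\text{copies}} &= \left(B_{\text{comp}} + B_{\text{index}}\right) \times c \\
B_{\text{required}} &= B_{\text{copies}} \times \left(1 + p_{\text{reserve}}\right)
\end{aligned}
$$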
Inputs and Bounds:
| Input | Accepted rule | Sizing effect |
|---|---|---|
| Ingest input | Choose Devices and EPS per device or Measured aggregate EPS. | Controls whether rate is derived from source count or entered from a collector, parser, or SIEM counter. |
| Log sources and Average EPS per source | Positive values in fleet mode; source count is treated as a whole number. | Multiplied together to produce base EPS before any peak multiplier. |
| Aggregate EPS | Positive measured total in aggregate mode. | Used directly as the base syslog rate. |
| Average event size | At least 1 byte per event. | Scales raw daily ingest linearly. |
| Searchable retention | At least 1 day; fractional input is rounded down to whole retained days. | Multiplies raw daily ingest before compression and indexing. |
| Compression ratio | At least 1:1. | Divides raw retained bytes before index overhead is added. |
| Index overhead | 0% to 500%. | Adds searchable index, metadata, parsed fields, or acceleration storage as a percentage of compressed data. |
| Stored copies | At least 1; fractional input is rounded down. | Multiplies the compressed-plus-index footprint for primary and replica copies. |
| Planning reserve | 0% to 300%. | Adds usable space for filesystem headroom, merges, rollover, bursts, and sizing variance. |
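The bounds above can be expressed as a small validation pass. This is a minimal sketch: the function name and most message strings are hypothetical, and rounding fractional days and copies down before the bound check is an assumed order. Only the compression message text is taken from this page.

```python
import math


def validate_inputs(eps, event_size_bytes, retention_days, compression_ratio,
                    index_overhead_pct, stored_copies, reserve_pct):
    """Return a list of validation messages; an empty list means inputs pass."""
    errors = []
    if eps <= 0:
        errors.append("EPS must be positive.")
    if event_size_bytes < 1:
        errors.append("Average event size must be at least 1 byte.")
    # Fractional days are rounded down before the bound check (assumed order).
    if math.floor(retention_days) < 1:
        errors.append("Searchable retention must be at least 1 day.")
    if compression_ratio < 1:
        errors.append("Compression ratio must be at least 1:1.")
    if not 0 <= index_overhead_pct <= 500:
        errors.append("Index overhead must be between 0% and 500%.")
    # Fractional copies are rounded down before the bound check (assumed order).
    if math.floor(stored_copies) < 1:
        errors.append("Stored copies must be at least 1.")
    if not 0 <= reserve_pct <= 300:
        errors.append("Planning reserve must be between 0% and 300%.")
    return errors


print(validate_inputs(200, 700, 180, 0.8, 35, 1, 20))
```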
Status Rules:
| Status | Rule | Meaning |
|---|---|---|
| needs input | A required value is zero, negative, or outside its accepted range. | Fix the validation banner before using the estimate. |
| over disk | Required usable storage is greater than the optional usable disk budget. | The modeled searchable tier does not fit the configured capacity. |
| over target | The estimate fits the usable disk budget but exceeds Budget target. | The plan may fit the volume while leaving less operating headroom than requested. |
| headroom watch | Budget target use is at least 80%. | Growth, burst traffic, or low compression could push the tier close to the target line. |
| capacity planned | Inputs are valid and no disk or target warning applies. | The estimate is internally consistent under the entered assumptions. |
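Read top to bottom, the status rules suggest a precedence order. The sketch below assumes that order (needs input, then over disk, over target, headroom watch, capacity planned); the function name and parameters are hypothetical, and the calculator may evaluate the rules differently.

```python
def classify_status(required_bytes, inputs_valid=True,
                    disk_budget_bytes=None, budget_target_pct=100.0):
    """Map an estimate to a status label, checking the strictest rule first."""
    if not inputs_valid:
        return "needs input"
    if disk_budget_bytes is None:
        # The disk budget is optional; without it no disk or target warning applies.
        return "capacity planned"
    if required_bytes > disk_budget_bytes:
        return "over disk"
    target_bytes = disk_budget_bytes * budget_target_pct / 100
    target_use = required_bytes / target_bytes
    if target_use > 1:
        return "over target"
    if target_use >= 0.8:
        return "headroom watch"
    return "capacity planned"


TIB = 2**40
print(classify_status(12.17 * TIB, disk_budget_bytes=13 * TIB, budget_target_pct=85))
```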
Everyday Use & Decision Guide:
Start with measured aggregate EPS when a collector, parser, broker, or SIEM input already reports a stable sustained rate. Use the fleet mode when a project is earlier and the best available numbers are device count plus average EPS per source. Keep the first run to one searchable tier, index, tenant, or environment so the result maps to a real storage decision.
Use measured message size when possible. A sample of retained syslog records is better than a default guess because structured firewall events, VPN logs, Linux auth messages, and application logs can vary widely. The default hint of 400-900 bytes is useful for a first network-syslog estimate, but a procurement plan should replace it with local evidence.
- Set Searchable retention to the hot or searchable window, not a long cold archive requirement.
- Use Compression ratio as raw-to-stored compression. A value of 2 means compressed raw data is about half the original size.
- Set Stored copies to the total searchable copy count. Use 2 for primary plus one replica.
- Raise Peak multiplier only when the retained window must absorb sustained busy-hour or seasonal load, not a brief spike that the platform buffers elsewhere.
- Enter Usable disk budget and Budget target when the plan must fit an existing volume, virtual appliance, or storage pool.
The main false-confidence risk is treating Required usable storage as a complete platform design. It includes searchable copies and reserve under the entered assumptions, but it does not prove parser health, query speed, index-shard layout, backup coverage, or compliance retention. It also does not read live logs, so bad EPS or message-size inputs will produce a tidy but wrong result.
After the summary shows capacity planned, compare Required usable storage, Disk budget use, and the Sizing Guardrails actions. If the status shows over disk, over target, or headroom watch, adjust retention, ingest scope, compression evidence, copy count, reserve, or storage allocation before sharing the plan.
Step-by-Step Guide:
Choose the rate path first, then add retention, storage factors, and budget checks.
- Set Ingest input. Choose Devices and EPS per device for source-count planning, or Measured aggregate EPS when collector or SIEM counters already give a sustained total rate.
- If fleet mode is selected, enter Log sources and Average EPS per source. The summary badge should show the derived EPS value after any peak multiplier.
- If aggregate mode is selected, enter Aggregate EPS. Use the total rate for the same searchable store being sized.
- Enter Average event size and Searchable retention. Check Raw daily ingest and Raw retained volume in Storage Footprint before adding storage factors.
- Set Compression ratio, Index overhead, Stored copies, and Planning reserve. Watch Compressed retained data, Index overhead, Searchable copies, and Required usable storage update.
- Open Advanced when the estimate needs a handoff label, a Peak multiplier, a fixed Usable disk budget, or a Budget target percentage.
- If the banner says Syslog storage inputs need review, fix the listed field before copying results. Common fixes are positive EPS, event size above zero, retention of at least one day, compression of at least 1:1, and a budget target from 1% to 100%.
- Use Retention Runway Curve to compare alternate retention windows, Storage Component Mix to see compressed data, index overhead, and reserve shares, and JSON when the plan needs a machine-readable handoff.
Interpreting Results:
Required usable storage is the headline capacity number. It combines compressed retained data, index overhead, stored copies, and planning reserve. Raw daily ingest is the better number for finding noisy sources, while Daily required bytes in the JSON output shows the storage cost of one more retained day under the current assumptions.
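Because the modeled storage scales linearly with retained days, the cost of one more day can be sketched as the same chain with retention left out. The function name is hypothetical, and treating this as the meaning of Daily required bytes is an assumption based on the description above.

```python
SECONDS_PER_DAY = 86_400


def daily_required_bytes(eps, event_size_bytes, compression_ratio,
                         index_overhead_pct, stored_copies, reserve_pct):
    """Usable bytes that one extra retained day adds under fixed assumptions."""
    raw_daily = eps * event_size_bytes * SECONDS_PER_DAY
    compressed = raw_daily / compression_ratio
    with_index = compressed * (1 + index_overhead_pct / 100)
    return with_index * stored_copies * (1 + reserve_pct / 100)


# Default fleet inputs: 200 EPS, 700 bytes, 2:1, 35% index, 1 copy, 20% reserve.
per_day = daily_required_bytes(200, 700, 2, 35, 1, 20)
print(f"{per_day / 2**30:.2f} GiB per retained day")
```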
Do not overread a clean status. capacity planned means the entered numbers pass the calculator's guardrails. It does not mean the compression ratio has been proven, the log messages are valuable, or the platform can search the retained data at the expected speed.
| Output cue | Read it as | Check next |
|---|---|---|
| Required usable storage | The modeled capacity needed for the searchable tier. | Compare it with actual usable storage, not vendor raw capacity. |
| Planning EPS | The sustained EPS after source math and peak multiplier. | Verify it against collector or SIEM counters for the same store. |
| Compression assumption | A storage-saving assumption, not a guarantee. | Index a representative sample when the ratio is below 1.5:1 or above 8:1. |
| over target | The estimate fits disk but crosses the target headroom line. | Reduce retention, add storage, or justify why less headroom is acceptable. |
| Planning reserve below 15% | The model leaves little room for merges, rollover, and filesystem safety. | Increase reserve unless platform-specific evidence proves separate headroom exists. |
When the result feels high, separate rate, size, retention, and copy count before changing everything at once. A high Raw daily ingest points to noisy sources or large events. A high Searchable copies points to replica policy. A high Planning reserve points to headroom choices rather than log volume itself.
Worked Examples:
Default fleet estimate
250 log sources at 0.8 EPS each produce 200.00 EPS. With 700 bytes/event, 180 days, 2:1 compression, 35% index overhead, one stored copy, and 20% reserve, Raw daily ingest is about 11.27 GiB and Required usable storage is about 1.60 TiB. With no disk budget set, the status should be capacity planned.
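The arithmetic behind this example can be checked with a short sketch. The function and key names are hypothetical, and the model is the chain described under Formula Core, with binary units (1 GiB = 2^30 bytes, 1 TiB = 2^40 bytes) assumed because they reproduce the figures quoted here.

```python
SECONDS_PER_DAY = 86_400
GIB = 2**30
TIB = 2**40


def estimate_storage(eps, event_size_bytes, retention_days, compression_ratio,
                     index_overhead_pct, stored_copies, reserve_pct):
    """Apply the sizing factors in order and return each intermediate value."""
    raw_daily = eps * event_size_bytes * SECONDS_PER_DAY
    raw_retained = raw_daily * int(retention_days)       # whole retained days
    compressed = raw_retained / compression_ratio
    index_bytes = compressed * index_overhead_pct / 100  # share of compressed data
    searchable_copies = (compressed + index_bytes) * int(stored_copies)
    required = searchable_copies * (1 + reserve_pct / 100)
    return {
        "raw_daily_bytes": raw_daily,
        "raw_retained_bytes": raw_retained,
        "compressed_bytes": compressed,
        "index_bytes": index_bytes,
        "searchable_copies_bytes": searchable_copies,
        "required_bytes": required,
    }


fleet = estimate_storage(eps=250 * 0.8, event_size_bytes=700, retention_days=180,
                         compression_ratio=2, index_overhead_pct=35,
                         stored_copies=1, reserve_pct=20)
print(f"Raw daily ingest: {fleet['raw_daily_bytes'] / GIB:.2f} GiB")        # 11.27 GiB
print(f"Required usable storage: {fleet['required_bytes'] / TIB:.2f} TiB")  # 1.60 TiB
```

The same function reproduces the other two numeric examples when given their inputs, which makes it a convenient cross-check before changing one factor at a time.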
Small branch collector
40 sources at 0.35 EPS each produce 14.00 EPS. At 500 bytes/event, 30 days, 3:1 compression, 25% overhead, one copy, and 20% reserve, Required usable storage is about 8.45 GiB. Against a 2 TiB usable disk budget with an 85% target, Disk budget use stays below 1%.
Busy firewall index over target
A measured 1,200 EPS firewall stream at 850 bytes/event keeps 90 searchable days, uses 2:1 compression, adds 35% index overhead, stores 2 copies, and keeps 25% reserve. The estimate reaches about 12.17 TiB. With a 13 TiB usable disk budget and an 85% target, Disk budget use is about 93.6% and target use is about 110.2%, so the status should show over target.
Invalid compression input
If Compression ratio is entered as 0.8, the summary changes to Syslog storage inputs need review and the validation list reports Compression ratio must be at least 1:1. The recovery path is to enter 1 for no compression or a measured ratio such as 2 for a 2:1 planning assumption, then re-check Required usable storage.
FAQ:
What should I use for average event size?
Use a representative sample from the same source mix and parsing path whenever possible. Include the raw syslog message bytes before compression, normalization, and searchable index storage. If only a rough estimate is available, keep the number conservative and revisit it after the first real ingest sample.
Should I use source count or aggregate EPS?
Use Measured aggregate EPS when a collector, parser, or SIEM input already reports the sustained rate for the store being sized. Use Devices and EPS per device when planning starts from inventory and average source behavior.
Why does required storage exceed compressed data?
The estimate adds Index overhead, multiplies by Stored copies, and then adds Planning reserve. A compressed primary copy is only one part of the searchable tier, especially when replicas, metadata, parsed fields, or acceleration files are retained.
What does a disk budget target mean?
Usable disk budget is the capacity available to this syslog workload. Budget target is the share of that capacity you are willing to consume. A 13 TiB disk budget at 85% creates an 11.05 TiB target line.
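The target line and the two usage readings are simple ratios. A minimal sketch, with hypothetical function and key names, using 12.173 TiB as the unrounded estimate from the busy firewall example:

```python
def budget_readings(required_tib, disk_budget_tib, budget_target_pct):
    """Target line plus usage against the full budget and against the target."""
    target_line_tib = disk_budget_tib * budget_target_pct / 100
    return {
        "target_line_tib": target_line_tib,
        "disk_budget_use_pct": required_tib / disk_budget_tib * 100,
        "target_use_pct": required_tib / target_line_tib * 100,
    }


readings = budget_readings(required_tib=12.173, disk_budget_tib=13, budget_target_pct=85)
for name, value in readings.items():
    print(f"{name}: {value:.2f}")
```

Target use above 100% with disk budget use below 100% is the over target case: the estimate fits the volume but crosses the headroom line.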
Why do I get a validation warning?
The validation banner appears when a required numeric input is missing or outside the accepted range. Common causes are zero EPS, event size below one byte, retention below one day, compression below 1:1, stored copies below one, or a budget target outside 1% to 100%.
Does the calculator inspect my syslog data?
No. It uses numeric inputs and an optional label typed into the page. It does not read log files, connect to collectors, query a SIEM, or upload syslog messages for analysis.
Glossary:
- Events per second: The sustained rate of syslog events entering the modeled searchable store.
- Average event size: The average raw byte size of one syslog message before compression and index storage.
- Searchable retention: The number of days kept in the fast tier used for search, dashboards, and investigations.
- Compression ratio: The raw-to-stored reduction applied before index overhead and stored copies are added.
- Index overhead: Extra storage for searchable terms, metadata, normalized fields, parsed values, or acceleration files.
- Stored copies: The total number of searchable copies, including the primary copy and any replicas or clustered copies.
- Planning reserve: Usable storage added for filesystem headroom, merges, rollover, bursts, and estimate uncertainty.
- Budget target: The portion of an optional usable disk budget allowed for this syslog workload.
- Budget target
- The portion of an optional usable disk budget allowed for this syslog workload.
References:
- The Syslog Protocol (RFC 5424), RFC Editor, March 2009.
- The BSD Syslog Protocol (RFC 3164), RFC Editor, August 2001.
- Guide to Computer Security Log Management (NIST SP 800-92), National Institute of Standards and Technology, September 2006.
- Estimate your storage requirements, Splunk Enterprise documentation, January 2026.
- Calculating storage requirements, Amazon OpenSearch Service Developer Guide.