RAID Planning Suite
Compare online RAID layouts for capacity, IOPS, rebuild pressure, and growth runway with parity, spare, and fill targets for safer storage sizing.
Introduction
RAID planning turns a shelf of drives into a usable storage design. The raw drive count is only the starting point. Parity, mirrors, hot spares, idle members, filesystem overhead, reserved space, and operating fill targets all reduce the capacity that can safely be promised to users.
The same layout choice also changes performance and recovery exposure. RAID 5 may look efficient on capacity, RAID 10 may serve random writes better, and RAID 6 or RAID 60 may be easier to defend when large drives make rebuilds slow. A good plan keeps those tradeoffs visible before hardware is ordered or a production array is reconfigured.
Capacity runway matters because data rarely stays still. A design that covers today's usable target can become tight after snapshots, retention growth, backup churn, or a new workload lands on the same pool. Planning against a fill target gives operations teams room for rebuilds, scrubs, migrations, and normal growth instead of treating the last byte as available.
RAID is still only one part of storage protection. It helps an array survive certain drive failures, but it does not protect against deletion, corruption, malware, controller faults, enclosure loss, site loss, or a bad change. RAID planning should narrow the layout choice and expose risk, then the final decision should be checked against platform support, backup policy, and restore testing.
Technical Details
RAID geometry starts with active members. Installed drives become active drives after dedicated hot spares are removed. The selected layout then decides how many active drives hold data, how many hold parity or mirror copies, and whether any drives are left idle because the geometry needs complete pairs or complete parity groups.
Decimal and binary units are not interchangeable. A drive sold as 18 TB is measured with powers of 1000, while many operating systems report TiB with powers of 1024. The planning model converts both forms to bytes before applying reserves, overhead, and target fill, so a TB input and a TiB input do not silently mean the same amount of storage.
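The conversion can be sketched in a few lines of Python; the drive size and variable names are illustrative, not part of the tool:

```python
# Decimal (TB, powers of 1000) and binary (TiB, powers of 1024) units,
# both normalized to bytes before any further planning math.
TB = 1000 ** 4   # decimal terabyte, as drives are sold
TiB = 1024 ** 4  # binary tebibyte, as many operating systems report

drive_tb = 18
drive_bytes = drive_tb * TB
print(drive_bytes)        # 18000000000000 bytes
print(drive_bytes / TiB)  # roughly 16.37 TiB as reported by the OS
```

The gap between 18 TB and roughly 16.4 TiB is why the model converts both unit families to bytes before applying reserves and fill.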
Nested layouts add another boundary. RAID 50 stripes across RAID 5 groups, and RAID 60 stripes across RAID 6 groups. The group count controls both usable capacity and recovery exposure. A group shape that leaves extra active drives outside complete groups should be treated as idle capacity, not hidden usable space.
Active drives = installed drives - hot spares
Effective member bytes = drive bytes - per-drive reserve
Gross usable = data drives x effective member bytes
Protected usable = gross usable - filesystem overhead
Plannable capacity = protected usable x target fill
Effective rebuild speed = rebuild rate x (1 - rebuild contention)
Rebuild window = drive bytes / effective rebuild speed
Random I/O planning uses read and write service rates rather than raw drive count alone. Reads can usually be spread across engaged members. Writes depend on layout. Mirroring writes data twice, while small random parity writes need additional read and write work to update parity. Full-stripe writes reduce that penalty for parity RAID because aligned sequential writes can update complete stripes more efficiently.
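One common way to blend the two paths is a time-weighted (harmonic) mix of read and write service rates. This sketch assumes that shape; the per-drive IOPS figures are illustrative, not controller benchmarks:

```python
def mixed_iops(drive_read_iops, drive_write_iops, engaged, read_ratio,
               write_penalty):
    """Directional mixed-IOPS model.

    Reads spread across all engaged members; writes are divided by the
    layout's write penalty. The harmonic blend weights each path by the
    share of I/O time it consumes.
    """
    read_rate = drive_read_iops * engaged
    write_rate = drive_write_iops * engaged / write_penalty
    return 1 / (read_ratio / read_rate + (1 - read_ratio) / write_rate)

# 11 engaged enterprise HDDs, 70% reads, RAID 6 small-write penalty of 6
print(round(mixed_iops(180, 160, 11, 0.70, 6)))  # a few hundred mixed IOPS
```

The same drive set under a mirroring penalty of 2 yields a noticeably higher result, which is the comparison the model is meant to surface.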
Reliability planning is a directional model. Rebuild time expands as drive size grows or contention reduces effective rebuild speed. Annual failure rate, engaged drive count, RAID tolerance, and the amount read during rebuild shape the rebuild-pressure estimate. Unrecoverable read error pressure is treated separately, then folded into the rebuild signal so large HDD arrays do not look safer than their recovery workload suggests.
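The URE side of that signal can be approximated with an exponential model of the bits read during rebuild. This is a directional sketch under assumed rates, not the tool's exact formula:

```python
import math

def ure_hit_probability(surviving_drives, drive_bytes, ure_per_bit=1e-15):
    """Chance of at least one unrecoverable read error during a rebuild,
    assuming every surviving member is read in full.

    ure_per_bit is a vendor-style rate such as 1 error per 1e15 bits;
    the exponential form approximates 1 - (1 - p)^bits for tiny p.
    """
    bits_read = surviving_drives * drive_bytes * 8
    return 1 - math.exp(-bits_read * ure_per_bit)

TB = 1000 ** 4
# Wide single-parity set: 10 surviving 18 TB drives read during a RAID 5 rebuild
print(round(ure_hit_probability(10, 18 * TB), 2))  # roughly 0.76
```

A result in this range is why large single-parity HDD sets draw a rebuild-pressure warning even when annual failure assumptions look modest.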
| Layout | Minimum active drives | Data-drive rule | Fault tolerance modeled | Write penalty baseline |
|---|---|---|---|---|
| RAID 0 | 2 | N | None | 1x |
| RAID 1 | 2 | 1 | N - 1 in one mirror set | 2x |
| RAID 5 | 3 | N - 1 | 1 drive | 4x |
| RAID 6 | 4 | N - 2 | 2 drives | 6x |
| RAID 10 | 4 | N / 2 for complete mirror pairs | 1 per mirror pair, more in favorable patterns | 2x |
| RAID 50 | 6 | groups x (drives per group - 1) | 1 per parity group | 4x |
| RAID 60 | 8 | groups x (drives per group - 2) | 2 per parity group | 6x |
| Boundary | How it affects the result | Practical caution |
|---|---|---|
| Hot spares | Removed from the active set before capacity and performance are calculated. | A spare helps recovery readiness but does not add usable capacity. |
| Target fill | Caps protected usable capacity at the chosen operating fill percentage. | Planning at 100% usable capacity leaves no margin for churn or rebuild work. |
| Filesystem overhead | Subtracts a percentage from gross usable capacity before target fill. | Use zero only when you truly want raw RAID math. |
| Rebuild contention | Reduces effective rebuild speed and lengthens degraded exposure. | Busy production arrays rarely rebuild at clean benchmark speeds. |
| URE rate | Estimates unrecoverable read pressure from the amount read during rebuild. | Large parity arrays can have visible read-error pressure even when annual failure assumptions look modest. |
Everyday Use & Decision Guide
Start with the physical shelf and the layout under review. Enter installed drives, drive size, hot spares, and RAID level before changing performance assumptions. The default setup models a 12-drive, 18 TB enterprise HDD shelf with one spare, RAID 6, file-services traffic, 80% target fill, a 100 TiB target, and 25% annual growth.
Pick the drive profile that most closely matches the media. NAS HDD, enterprise HDD, SATA SSD, NVMe SSD, and custom measured drive profiles load different random IOPS, sequential speeds, rebuild rates, annual failure rates, and URE assumptions. The custom profile is useful when you have measured drive data or platform guidance that should override the presets.
Choose the workload profile before comparing RAID levels. File services, virtualization, database or OLTP, backup or archive, and analytics workloads use different read ratios, target IOPS, throughput targets, and full-stripe write assumptions. A backup target with large sequential writes can make parity RAID look more attractive than a database profile with many small random writes.
Use the advanced controls when a rough profile is not enough. The most useful overrides are read ratio, full-stripe write share, target IOPS, target throughput, rebuild rate, rebuild contention, annual failure rate, URE rate, filesystem overhead, and per-drive reserve. For RAID 50 and RAID 60, nested parity groups can also be fixed manually instead of allowing an automatic group count.
Read the summary first, then compare the tabs. A "Layout needs correction" or "Capacity target is short" summary should be resolved before the plan is shared. When the summary shows planning runway, the Candidate Matrix and RAID Tradeoff Map help explain whether another valid RAID level gives a better balance of plannable capacity, mixed IOPS, and rebuild exposure.
Treat high rebuild pressure, negative IOPS headroom, negative throughput headroom, or a future Growth Path shortfall as stop-and-verify signals. The result does not prove that a controller, filesystem, or storage appliance supports the design. It gives you a clean comparison set to take into vendor sizing, platform limits, procurement, and backup review.
Step-by-Step Guide
- Select the RAID layout and enter installed drives, drive size, drive unit, and hot spare count.
- Choose the closest drive profile, then choose the workload profile that best matches the steady-state I/O mix.
- Set target fill, current target usable capacity, and annual growth so the plan is measured against an operating target, not raw capacity.
- Open Advanced for measured drive IOPS, sequential speeds, rebuild rate, rebuild contention, AFR, URE rate, filesystem overhead, per-drive reserve, or nested group count.
- Check the Plan Ledger for the selected layout's capacity, performance, and recovery numbers.
- Compare Candidate Matrix and RAID Tradeoff Map before changing RAID level, drive count, or drive size.
- Use Recovery Brief to turn the result into review points for capacity pressure, rebuild pressure, IOPS headroom, and URE pressure.
- Check Growth Path for current, year-one, year-two, and year-three runway before treating the design as final.
Interpreting Results
Plannable capacity is the main capacity number. It is protected usable capacity after spares, layout overhead, per-drive reserve, filesystem overhead, and target fill have been applied. It is intentionally smaller than raw installed capacity.
Protected efficiency compares protected usable capacity with engaged raw capacity before target fill. It helps explain why RAID 6 can look inefficient on a small set but more acceptable as the drive count grows.
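Before reserves and filesystem overhead, the RAID 6 portion of that effect is simple arithmetic, sketched here with illustrative drive counts:

```python
def raid6_efficiency(active_drives):
    """Data share of a RAID 6 set: (N - 2) / N.

    The two-drive parity cost is fixed, so it weighs less as the set widens.
    """
    return (active_drives - 2) / active_drives

print(round(raid6_efficiency(5), 2))   # small set: 0.6
print(round(raid6_efficiency(12), 2))  # wider set: 0.83
```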
Mixed IOPS is a modeled service rate. It blends read and write paths using the selected read ratio and layout write penalty. Compare it across layouts with the same assumptions, not against a benchmark from a specific controller.
Rebuild risk pressure combines rebuild window, fault tolerance, annual failure assumptions, and URE pressure. Long rebuild windows deserve attention even when the annual data-loss estimate is small, because the array remains degraded for longer and the surviving drives do more work.
Growth Path compounds the current target for three years and searches for a drive count that covers each horizon. If year two or year three shows a shortfall, the storage decision is about future runway, not just today's capacity.
| Result surface | Best use | Common misread |
|---|---|---|
| RAID Plan Ledger | Auditing the selected layout's capacity, layout, performance, and reliability rows. | Reading raw installed capacity as usable service capacity. |
| RAID Candidate Matrix | Comparing RAID levels from the same drive set and assumptions. | Comparing rows after changing profile, target, or advanced settings. |
| RAID Capacity Allocation | Seeing how plannable capacity, fill headroom, reserves, parity or mirrors, spares, and idle drives divide the shelf. | Treating every non-usable slice as parity overhead. |
| RAID Tradeoff Map | Explaining capacity and mixed IOPS tradeoffs across valid layouts. | Reading the chart as a controller benchmark. |
| RAID Recovery Brief | Finding the review points that should slow down a storage change. | Ignoring a warning because the capacity number looks large enough. |
| RAID Growth Path | Checking whether the current design covers the current target and the next three growth horizons. | Treating growth as certain when it depends on retention and workload forecasts. |
Worked Examples
Twelve 18 TB enterprise HDDs in RAID 6
With one hot spare, 12 installed drives become 11 active drives. RAID 6 uses two drive equivalents for parity, so the data-drive share is nine drives before filesystem overhead and target fill. The plannable result is far below the raw shelf total, but the layout keeps two-drive fault tolerance and avoids the single-parity exposure of a wide RAID 5 set.
The same active set in RAID 10
RAID 10 needs complete mirror pairs. With 11 active drives, it engages 10 and reports one active drive as idle. Capacity drops because half of the engaged members mirror the other half, but random write service and rebuild behavior may be more attractive for virtualization or database traffic.
A RAID 60 group-count mistake
A 14-active-drive RAID 60 plan with three nested groups can only place four drives in each group, leaving two active drives idle. Changing the group count to two creates two seven-drive groups and engages all active drives. The idle-drive row catches the issue before it becomes an expensive bay-population plan.
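The engaged-versus-idle arithmetic behind both this example and the RAID 10 one can be sketched directly; the function is illustrative, not the tool's implementation:

```python
def nested_engagement(active, groups, min_per_group):
    """Engaged and idle drives for layouts needing complete, equal groups.

    Covers RAID 50/60 parity groups and, with groups of size 2,
    RAID 10 mirror pairs. Drives outside complete groups sit idle.
    """
    per_group = active // groups
    if per_group < min_per_group:
        raise ValueError("not enough active drives for this group count")
    engaged = per_group * groups
    return engaged, active - engaged

print(nested_engagement(14, 3, 4))  # three groups of four: (12, 2), two idle
print(nested_engagement(14, 2, 4))  # two groups of seven: (14, 0), all engaged
print(nested_engagement(11, 5, 2))  # RAID 10 as five mirror pairs: (10, 1)
```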
Limitations and Safety Notes
The planner assumes identical drives in the merged comparison surface. Mixed-drive shelves, vendor-specific equalization, controller limits, declustered parity, erasure coding, cache policy, and platform-specific spare behavior need separate confirmation.
Performance numbers are directional. Queue depth, cache mode, stripe size, filesystem layout, network path, block size distribution, and application concurrency can move real IOPS or throughput away from the model. Use the modeled values to compare choices under one assumption set, then validate the chosen design with platform guidance and workload testing.
RAID is not backup. Keep separate backup copies, test restore paths, and document failure procedures. The planner can identify a weak layout, but it cannot recover deleted files or protect against a storage platform failure outside the RAID group.
FAQ
Why is plannable capacity lower than raw capacity?
Raw capacity counts every installed drive. Plannable capacity removes hot spares, parity or mirrors, per-drive reserve, filesystem overhead, and the selected fill reserve. That smaller number is closer to what can be promised safely.
Why does RAID 5 often look efficient but risky?
RAID 5 uses one parity drive equivalent, so it preserves more capacity than RAID 6 or RAID 10. The risk comes during recovery. A large single-parity set has less margin for another drive problem or an unrecoverable read error while rebuilding.
Why can RAID 10 or nested RAID show idle drives?
RAID 10 needs complete mirror pairs. RAID 50 and RAID 60 need complete parity groups. When the active count does not fit the required shape, the extra active drives are left idle instead of being counted as usable.
What does full-stripe write share change?
Small random writes on parity RAID carry a heavier write penalty because old data and parity may need to be read before new data and parity are written. More full-stripe writes reduce the effective penalty because aligned sequential writes update complete stripes more cleanly.
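A simple way to model that blend is a weighted average of the small-random penalty and a cheaper full-stripe cost. The full-stripe cost of 1x here is an assumption for illustration:

```python
def effective_write_penalty(base_penalty, full_stripe_share,
                            full_stripe_penalty=1.0):
    """Blend small-random and full-stripe write costs.

    full_stripe_penalty=1.0 assumes an aligned full stripe costs about
    one write per member; all figures are modeling assumptions.
    """
    return (full_stripe_share * full_stripe_penalty
            + (1 - full_stripe_share) * base_penalty)

print(effective_write_penalty(6, 0.0))  # pure small random writes on RAID 6: 6.0
print(effective_write_penalty(6, 0.7))  # backup-style, mostly sequential writes
```

With 70% full-stripe writes the effective RAID 6 penalty drops to about 2.5x, which is why a backup profile makes parity RAID look more attractive than a database profile does.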
Can the result choose the final production layout?
It can narrow the decision and document the assumptions. Final approval still needs vendor support limits, controller and filesystem guidance, backup requirements, rebuild policy, and a check against the actual workload.
Glossary
- Active drives
- Installed drives that remain in the array after dedicated hot spares are removed.
- Engaged drives
- Active drives that the selected RAID geometry can actually use.
- Plannable capacity
- Protected usable capacity after the target fill reserve is applied.
- URE
- Unrecoverable read error, a drive read failure that matters during rebuild because surviving members must be read to reconstruct missing data.
- MTTDL
- Mean time to data loss, an approximate reliability model output that should be used for comparison, not as a guarantee.