
Storage Overview

There is a perplexing lack of material on overall storage performance strategy for database servers. Too much of the available content is narrowly focused on tuning techniques to improve performance over an unknown baseline. First, storage system IO performance is well understood from the science and engineering perspective. Second, there is sufficient information on SQL Server IO patterns. It follows that the overall storage performance objective is to achieve the proper balance between capability and cost. Storage performance is achieved by distributing load across several disk controllers and a (very) large number of disk drives, after which the storage capacity is likely to be many times larger than the database. Any document that only discusses sizing storage to meet the space requirement is written by someone who should not be giving advice on database performance. A key element in this storage performance strategy is keeping the amortized cost per disk reasonable. Testing and tuning is the final step, to verify that the storage system performance meets the design specification and requirements.

Storage performance is not inherently a complicated subject. The concepts are relatively simple. In fact, scaling storage performance is far easier than overcoming the difficulties encountered in scaling processor performance on NUMA systems. Storage performance is achieved by properly distributing IO over:

  1) multiple independent PCI-E ports (system memory and IO bandwidth is key)
  2) multiple RAID controllers or host bus adapters (HBAs)
  3) multiple storage IO channels (SAS or FC, complete path)
  4) most importantly, a large number of disk drives (15K or SSD?)
  5) with the short-stroke effect

with consideration for random and sequential IO patterns, and in certain cases separation of low-queue and high-queue patterns as well, though this is not always possible. It helps to know how to estimate the theoretical performance in IOPS and bandwidth for a given number of disks and IO channels, and then test to see how your configuration compares to the expected characteristics, as in the sketch that follows.
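
As a rough illustration (not any vendor tool), here is a minimal Python sketch of that estimate. The function names are mine, and the per-disk IOPS and per-port bandwidth defaults are assumptions taken from the performance targets later on this page; substitute figures for your own hardware.

# Back-of-envelope estimate of theoretical storage performance.
# The per-disk IOPS (15K drive, low queue, full-stroke) and per-port
# bandwidth (4Gb/s FC) defaults are assumptions; adjust to your hardware.

def random_iops(disks, iops_per_disk=185):
    # Random IOPS scales roughly linearly with spindle count.
    return disks * iops_per_disk

def sequential_gbps(ports, gb_per_port=0.33):
    # Sequential bandwidth is limited by the IO channels, not the disks.
    return ports * gb_per_port

disks, fc_ports = 120, 8
print(f"{disks} x 15K disks: ~{random_iops(disks):,} IOPS (low queue, full-stroke)")
print(f"{fc_ports} x 4Gb/s FC ports: ~{sequential_gbps(fc_ports):.1f} GB/s sequential")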

It is also necessary to have a basic idea of the capabilities and limitations of each component or bus in this chain. Storage performance cannot be achieved with magic/secret registry settings or other incantations.

A dozen 1TB 7200RPM disks supporting data, temp, and log files, however impressive the capacity seems to be, will have poor performance by database standards no matter what secret settings are applied.
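
For rough scale, assuming on the order of 100 random IOPS per 7200RPM drive (a commonly cited planning figure, not measured here), twelve such disks deliver roughly 1,200 IOPS, an order of magnitude below the roughly 20,000 IOPS of the reference configuration described below.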

Nor is performance achieved with a grossly overpriced SAN storage system, with relatively few big capacity disk drives, configured in complete disregard of the principles of disk system performance.

Reference Configuration

Without getting deep into concepts, I will provide a simple example of what I consider a balanced storage system configuration. The objective for this reference configuration is the ability to sustain transaction processing throughput with no more than minor degradation during a moderately heavy reporting query. The configuration is also suitable for data warehouse workloads.

For a 4-way server, that is, a system with four processor sockets, on which the current generation Intel Xeon 7400 (now 7500) series and Opteron 8400 series processors have six (or eight) cores per socket, the reference storage configuration is 4-5 controllers and 120 15K disk drives, as detailed below. (Intel finally announced the Xeon 7500/6500 series in the middle of writing this, so I will have to make adjustments later.)

Processors         Intel Xeon X7560, or Intel Xeon X7460, or Opteron 8439
Cores              4 x 8 = 32 (X7560), or 4 x 6 = 24
Memory             64 x 4GB = 256GB (X7500), or 32 x 4GB = 128GB
HBA Controllers    4-5 Dual-Port 4/8 Gbit/s FC, or 4-5 6Gb/s SAS with 2x4 ports
IO Channels        8-10
Disk Drives        120 x 15K
Disks per channel  12-15

This is only a reference configuration. Direct-attach storage, eminently suitable for data warehouse work, should have an amortized cost per disk of $500-600, for a total of $60-72K. In a SAN, the cost per disk might range from $1500-3000, for a total cost of $180-360K. A SAN vendor will probably attempt to substitute 600GB 15K disks for the low capacity models. This pushes the cost per disk to over $6K, usually resulting in a storage system with far too few disks.
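
As a sanity check on those amortized figures, a minimal sketch; the per-disk price ranges are simply the estimates quoted above, not actual vendor pricing.

# Amortized storage cost check; per-disk price ranges are the estimates
# quoted in the text above, not vendor quotes.
disks = 120
for label, (low, high) in [("direct-attach", (500, 600)), ("SAN", (1500, 3000))]:
    print(f"{label}: {disks} disks x ${low}-{high} = ${disks*low//1000}K-${disks*high//1000}K")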

At this time, a SAN is required for clustering. In the past, Windows supported clustering on SCSI, with two hosts on the same SCSI bus, but this capability was removed as customers seemed anxious to buy very expensive SAN storage. The SAS protocol also supports two hosts connected to the same SAS network, so it should be possible to enable clustering on SAS as well, but Microsoft does not currently support this.

A really high-end storage system could have over 1000 disk drives. This need not be a single storage system; it could be multiple systems. Of course, for exceptional random IO requirements, a serious effort should be made to determine whether solid-state storage can be employed to keep the spindle count manageable.

Sizing Guidance

If your storage vendor opens with a question as to your capacity requirements, don't waste any more time. Just throw the rep out and proceed to the next vendor.

Performance Targets

For calculation purposes, I am going to assume 100 of 120 disks are allocated for data and temp, and the remaining 20 for other purposes including logs. In actuality, if only 4 disks are required for logs, then 116 disks would be allocated to data and temp.

Random IOPS
  low queue, full-stroke:   185 per 15K 3.5in disk, 205 per 15K 2.5in disk, 17-22K total
  high queue, full-stroke:  300+ per disk
  low queue, short-stroke:  250+ per disk, 25K total
  high queue, short-stroke: 400+ per disk
Sequential
  4Gb/s FC:
    330-360MB/s per port, 720MB/s per dual-port HBA, 2.6-3.0GB/s+ total
  2x4 3Gb/s SAS RAID controller in x8 PCI-E Gen 1 slot:
    0.8GB/s per x4 port, 1.6GB/s per adapter, 6GB/s+ total or system limit*
  2x4 6Gb/s SAS RAID controller in x8 PCI-E Gen 2 slot:
    1.6GB/s per x4 port, 2.8GB/s per adapter, 10GB/s+ total or system limit*
Summary

Modern server systems have considerable IO performance capability. Realizing it requires distributing load over multiple storage controllers and a very large number of disk drives. If the resulting configuration has far more storage capacity than the primary database needs, this is an important performance bonus, providing the capability to handle heavy IO surges while maintaining reasonable IO latency. The IO governor in the SQL Server engine helps, but cannot hide weak storage performance. Always test to verify that the actual storage performance is in line with the expected performance level.