Home, Parent: SQL Server Cost Based Optimizer
AdventureWorks, Index Seek, Key Lookups, Table Scan, Joins, Insert, Update, Delete,
Parallelism I, Parallelism II, Parallelism III,

Table Scan

The query below has an execution plan with a table scan.

SELECT SalesOrderID, TotalDue FROM Sales.SalesOrderHeader

The Clustered Index Scan IO cost is 0.520903. This is almost exactly 0.003125 plus 0.00074071 (1/1350) per additional data page. The CPU cost is 0.0347685, which is approximately 0.000001 per row plus the base cost.

Per the definition of the SQL Server plan cost, this implies that the reference system can perform a table scan at 1350 pages per sec, or 10.54MB/s (1M = 1024KB). The most recent generation 15K disk has a sustained pure sequential transfer rate of over 150MB/sec on the out tracks. For disks in an enclosure, typically 15 disks sharing one x4 SAS port (1,100MB/sec) or one 4Gbit/s FC port (390MB/s) the sustained sequential transfer per disk is less. Of course, at the time when these formulas were set, the sequential transfer rate of a high-performance was much less, possibly in the 4-10MB/sec range.

It should also be considered that a table scan is not necessary or even frequently a pure sequential operation at the disk level. The actual blocks allocated to a table could be highly non-contiguous. So another interpretation of the 1350 pages per second might the number of 64KB extents per second (168.75x8 = 1350). This cost structure might also reflect the IO rate in the presence of a mix of 8K and 64K IO.

The bigger issue is that the SQL Server Cost Based Optimizer uses the fixed model for random and pseudo-sequential (or large block) IO performance. There is no consideration for whether data is in memory, or the actual storage system performance characteristics. The IO cost structure is based on random 8K IO at 320 IOPS and sequential at 1350 IOPS, meaning each page of a key lookup or loop join inner source index seek (with adjustment for the percentage of pages that require IO) has approximately the same IO cost of 4.2 pages in a scan operation.

Execution Plan Cross-Over

For any database engine, at some point, the execution plan for a table scan has a lower cost than an index seek followed by a key lookup. For SQL Server, the cross-over point occurs when the number of key lookups is approximately equal to the number data pages divided by 4.2 (or 4.43 based on a full key lookup of 0.003125 + 0.0001581). Some people have guessed that the cross-over point occurs when the number of rows requiring a key lookup exceeds a certain percentage of the rows in the table. In fact, a close examination of the cost structure indicates that it is actually based on the ratio of key lookup rows to data pages, not total rows.

A curious coincidence occurs when the clustered index has depth is 4. Each key lookup requires 4 logical reads. So the point at which the execution plan cross-over occurs is also approximately the point at which the number of logical reads crosses over. This is coincidence when the index depth is 4, and does not occur when the index depth is not 4.

The queries below illustrate the Index Seek + Key Lookup to Table Scan cross-over.

SELECT SalesOrderID, TotalDue FROM Sales.SalesOrderHeader
WHERE CustomerID BETWEEN 11091 AND 11151

SELECT SalesOrderID, TotalDue FROM Sales.SalesOrderHeader
WHERE CustomerID BETWEEN 11091 AND 11152

The execution plans and details are shown below.

The estimated rows for the first query parameters are 199.8 and 202.3 rows for the second query parameters. The actual rows are 199 and 200. Notice that the plan cost for the first query is actually slightly higher than for the second query (0.577118 for the Index Seek + Key and 0.555671 for the Clustered Index Scan). Notice that the plan cost for the second query is exactly the same as the unconstrained table scan for all 31,465 rows. The full Key Lookup cost is 0.572781, which works out 0.5380676 after subtracting out 0.003125 and 199.8 times the CPU cost of 0.0001581, which is 172.2 times the base IO cost of 0.003125, meaning the IO cost is assessed for 86% of rows.

Now let’s examine the actual cost versus logical reads. All tests here are done with all data in memory. The simplest way to get cost is from the DMV dm_exec_query_stats. It is should be pointed out that this measurement method exhibits significant variations from run to run. A full load test with a multi-threaded load generator driving the database server to near 100% negates the need to measure individual SQL statement level CPU and yields much more consistent results from run to run.

The examples here use the following statement to display CPU and logical reads.

SELECT SUBSTRING(t.text, statement_start_offset/2 + 1,
 (CASE WHEN statement_end_offset =-1 THEN LEN(CONVERT (nvarchar( max),text)) * 2
  ELSE statement_end_offset END - statement_start_offset)/2) AS [Stmt]
  , q.execution_count AS ExecCnt, q.total_worker_time AS WorkerTime, q.total_logical_reads AS LogicalReads
FROM sys.dm_exec_query_stats q
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS t

The query execution statistics for the query with slightly different search arguments demonstrating execution plan cross-over are shown below along with an unconstrained table scan. The plan costs for all three are about same or exactly the same. Notice that the Logical Reads is also comparable between the two plans (63656 and 70400).

The Worker Time (CPU) in microseconds is much higher for the scan than for the index seek with key lookup. Of course, there is no physical disk IO in this test run, while the plan cost is based on IO and plan cost represents estimated elapsed time, not CPU (worker) time. Also note that the table scan returning all rows is substantially more expensive than the query returning 200 rows (while using the table scan in the execution plan).

The execution stats below are for queries with the same search parameters, except that the result set is aggregated. The plan costs are slightly higher with a Compute Scalar operation for the aggregation, but the actual worker time is now lower. When a very low volume of results are sent to the client connection, and all data is in memory, and the execution plan is not a parallel plan, and there is no contention for resource, then the worker time is essentially the same as the elapsed time.

The execution stats below are the query set except that NOLOCK or TABLOCK is specified.

Below are the execution stats for another sequence of tests. This time, the test sequence leads with a dummy query to make sure that the query order is not important. A query has been added with same search parameters as the table scan plan, with a hint to force the index seek. The query forcing the index seek with BETWEEN 11091 AND 11152 is slightly less expensive than the unforced index seek plan with BETWEEN 11091 AND 11151. One should probably not interpret too much from this due to the run to run variations of this methodology.

The main conclusion is that the SQL Server plan cost model makes fixed assumption on disk IO time, and is not a good representation of actual worker time or elapsed time when all data is in memory. Presumably, if the disk system has exactly the 4.2:1 sequential to random 8K IO ratio, and sufficient buffer cache for the index root and intermediate levels (but not the leaf level), then the plan cost model would be accurate.