Indexes - a Quantitative Basis

Parent: SQL Server Cost Based Optimizer


Indexes are one of the cornerstones of databases. Practically every general course, and most performance articles, will cover indexing, not to mention the numerous articles dedicated to indexes. So why is another article on indexes warranted? Because almost all of the material on database indexes is qualitative: selectivity is good, do not have too many indexes. This is followed by: good luck in your specific database environment.

Indexes inherently involve trade-offs. Good indexes are necessary to support SQL queries that have one or more usable search arguments (SARGs). Too many indexes put too much burden on writes. Fundamentally, this calls for a quantitative model to determine the right or good set of indexes and the appropriate number of indexes.

Another issue in SQL Server is that many authors rely on the Execution Plan cost to assess the goodness of an index. There is a valid point in this: if an index is not used, then there is no point to that index. Also, the query optimizer is supposed to pick the best, or a good enough, execution plan for a query based on compile parameters and data distribution statistics. The problem is that the formulas for component operations in an execution plan are based on a seriously antiquated cost model that does not reflect the cost of operations on modern processors and memory configurations; see SQL Server Execution Plan Cost Model. Here, modern means roughly anything from the last 15 years.

The discussion here largely pertains to traditional indexes on row-stored tables, but does include Included Columns, and Filtered Indexes. The topics of Column Store, Spatial, XML, and indexes for memory-optimized tables are outside the scope.

Clustered Index vs Heap
Index Key
Included Columns
Filtered Indexes

The Primary Key

The intent of this article is to focus narrowly on indexes, and not the greater topics of the relational database, including the primary key in full. As such, the discussion here encompasses only the aspects pertinent to indexes.

Automatically Generated Keys - Identity

Some or even many introductory courses on SQL Server use examples in which every table has an identity column as the primary key, often clustered. The identity property is definitely a very useful feature for automatically generating key values, and there are situations in which we definitely do not want to have to generate a primary key value ourselves.

Manually Generated Key

That said, there are situations in which we might want to generate our own primary key. One is a table in which new rows are rarely inserted. The relational database purist view is that the primary key is an abstract entity meant only to uniquely identify a particular row, having no real-world meaning. The fact is that real-world organizations like to communicate in a code specific to their field. It may take time for an outsider to learn, but the organization works that way.

Second, consider the example of an OrderLine (LineItem) table, which has a row for each line item from its parent, the Orders table. We might want the key in this table to consist of two columns: one being the Order key value, the second being a sequential number starting with one. Every order has one or more line items, so line item 1 exists for every order.

The third example is a join table. Each row in table A may match to several rows in table B, and vice versa. For this we have join table AB, which has two key columns, one being the key value of table A and the second being the key value of table B. Table AB does not have more than one row for any one A and B combination. In this case, the key of table AB is the two-column combination of the A and B keys. No separate identity column is necessary.
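The uniqueness rule for the join table can be sketched in code. This is a hypothetical illustration, not SQL Server behavior: a Python dict keyed on the (A, B) tuple plays the role of the two-column primary key, and a duplicate pair is rejected exactly as a primary key constraint would reject it.

```python
# Hypothetical sketch of table AB: the composite primary key is the
# (a_key, b_key) tuple itself. No surrogate identity column is needed,
# because the pair already uniquely identifies the row.
ab_rows = {}

def insert_ab(a_key, b_key, attrs=None):
    key = (a_key, b_key)
    if key in ab_rows:                    # duplicate (A, B) combination
        raise ValueError(f"PK violation: {key} already exists")
    ab_rows[key] = attrs or {}

insert_ab(1, 10)
insert_ab(1, 20)   # same A, different B: allowed
insert_ab(2, 10)   # same B, different A: allowed
```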


Some years ago, certain developers became enamored of GUIDs and even went so far as to espouse having every table employ a GUID as the primary key column. There are definitely legitimate uses of a GUID as the primary key. There are situations in which the key value and row could be generated in different locations, and then, at a later time, the different sources are merged into a single table. When there is not a very important reason to use a GUID, the GUID is most probably not the best choice for the key.

Beyond all shadow of doubt, small dimension tables supporting giant fact tables should not have a GUID key.

Clustered Index Key

The original table organization options in SQL Server were heap and clustered index. Newer options include Columnstore and Memory-Optimized tables, which are not in the scope here.

The use of a Clustered Index for nearly all tables is pervasive in SQL Server. There are some "Best Practices" scripts which check for tables without a clustered index, implying that the absence of one is a mistake. Early versions of Azure SQL Database required tables to have a clustered index, until someone figured out that this was a stupid requirement.

Recall that prior to SQL Server version 2005, we did not have the Included columns feature for non-clustered indexes. The advantage of a clustered index is that queries involving a seek to the clustered index do not need a key lookup for additional columns.

In the query plan below, the first query is to a table with a clustered index in which the search argument is the key. The second query is to a heap table, having a nonclustered index on the search argument.


The execution plan for the first query has only the Index Seek table access operator. The second has an Index Seek followed by a RID Lookup, which is the lookup operation for a Heap table, even though I use the term Key Lookup (in part because I am so used to working with Clustered Indexes that I use its terminology reflexively).

The cost of each of these three operations, at Estimated Number of Executions = 1 and Estimated Number of Rows = 1, is identical: an Estimated I/O Cost of 0.003125 (1/320) and an Estimated CPU Cost of 0.0001581.


  (Figure: NC Index Seek and RID Lookup operator cost details)



  The initial page access is assumed to have random IO access with cost 0.003125,
  corresponding to a storage system capable of 320 IOPS (1/320 = 0.003125).

  (Figure: Index Seek cost details, clustered and nonclustered)

  The cost of accessing successive pages is modeled as sequential IO with cost 0.00074074 per page,
  corresponding to 1350 pages per sec (1/1350 = 0.00074074), or 10,800KB/s (10.5MB/s).
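These two IO constants and the single-row seek cost quoted earlier tie together in a short calculation. This is a sketch of the plan cost arithmetic using only the numbers stated above:

```python
# The execution plan's two IO cost constants, derived from the assumed
# storage rates: 320 IOPS random, 1350 pages/sec sequential.
RANDOM_IO_COST = 1 / 320      # 0.003125 per initial (random) page access
SEQ_IO_COST    = 1 / 1350     # 0.00074074... per successive page
CPU_FIRST_ROW  = 0.0001581    # CPU cost of a single-row seek or lookup

PAGE_KB = 8
seq_throughput_kb = 1350 * PAGE_KB   # 10,800 KB/s, about 10.5 MB/s

# Cost of a single-row index seek: one random IO plus the first-row CPU.
single_row_seek_cost = RANDOM_IO_COST + CPU_FIRST_ROW   # 0.0032831
```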


The Key Lookup for a single row has the same cost structure as for a single row index seek.

  (Figure: Key Lookup and RID Lookup cost details)

In a query plan having multiple executes of the Key Lookup component, the Estimated I/O Cost is shown as 0.003125, representing a single execution.


The Subtree cost is a multiple of the IO cost based on an estimate of how likely a leaf-level page was previously accessed in this specific operation and whether it has since been flushed out, plus the single-execute CPU cost of 0.0001581 times the full Estimated Number of Executions.


In the above, the Subtree Cost is 12.0854. We can subtract 0.0001581 x 4258.68 for the CPU contribution, leaving 11.412. This is 3651.87 times 0.003125, meaning the model says IO costs are incurred for 85.75% of executes.
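This decomposition can be reproduced directly. The sketch below uses only the constants already given; the subtree cost and execute count are the values from this particular plan:

```python
# Strip the CPU contribution from the Key Lookup subtree cost, then ask
# how many full random IOs the remainder represents.
CPU_PER_EXEC = 0.0001581
IO_PER_EXEC  = 0.003125

subtree_cost = 12.0854     # from the plan
executes     = 4258.68     # Estimated Number of Executions

io_portion  = subtree_cost - CPU_PER_EXEC * executes   # ~11.412
implied_ios = io_portion / IO_PER_EXEC                 # ~3651.9
io_fraction = implied_ios / executes                   # ~0.8575, i.e. 85.75%
```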


The scan operation is much simpler. The above table has 13500 pages, the first page contributes 0.003125, and each of the remaining 13499 pages contributes 1/1350 for a total IO of 10.002384.
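As a quick check of the scan arithmetic, using the same IO constants:

```python
# Scan IO cost for the 13,500-page table: one random IO for the first page,
# sequential IO for each of the remaining pages.
pages = 13500
scan_io = 1 / 320 + (pages - 1) / 1350   # ~10.002384
```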





The CPU cost for each additional row is 0.0000011. Note that this value is smaller than the successive-page cost by a factor of 673. The typical table might have 50-200 rows per page, which means the full cost of a scan is largely in the IO portion. A very narrow index, 4-byte key plus 4-byte cluster key, could have 579 rows per page, but this is an extreme case.

The net result of the SQL Server cost model is that the execution plan at DOP 1 shifts from an Index Seek + Key Lookup to a Scan when the number of rows requiring a key lookup is about 31.6% of the number of pages. Alternatively, the model has a Key Lookup as approximately equal in cost to scanning 3.16 pages. The shift point comes sooner at higher DOP.
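Under the model's constants, the crossover point can be computed as follows (a sketch at DOP 1; the 3.16 figure is the ratio stated above):

```python
# The plan cost model prices one Key Lookup execute at roughly the cost of
# 3.16 sequentially scanned pages, so seek + lookup loses to a scan once
# the estimated lookup count reaches ~31.6% of the table's page count.
LOOKUP_EQUIV_PAGES = 3.16

def crossover_rows(pages):
    """Estimated lookups at which the optimizer switches to a scan (DOP 1)."""
    return pages / LOOKUP_EQUIV_PAGES

rows_at_crossover = crossover_rows(13500)   # ~4272 lookups for the example table
```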

The assumption of disk IO was reasonably valid in the distant past, when system memory was tens of MB. It was not really valid at tens of GB. For a long time now, most properly tuned SQL environments have operated with active data largely in memory. The cost structure of execution plan operations for data in memory is very different. Even when there is IO, now that data is on flash storage, the random versus sequential IO distinction is no longer particularly important.

Actual Cost Structure for Data in Buffer Cache

This is potentially an overly complicated topic. However, consider that the objective for the cost model is to determine the best data access method and join order. We do not need to have a complete model for every aspect of query execution. In particular, we need not be concerned with the cost of logic that must be processed regardless of the execution plan unless the number of executions necessary is impacted.

Variabilities in query cost include the lock level, ranging from row, page, and table locks to NOLOCK. Consider the point at which SQL Server escalates from row to table locks. This was possibly the reason for the anomaly in versions 7 and 2000, in which a very high cost was assigned to the first page of a scan. The higher isolation levels beyond Read Committed also carry high cost, for which scans should be avoided as much as possible.

The contents of the processor cache are a factor. Operations that work on a set of data that fits within the L2 cache will have very low cost for each subsequent operation. Note that for a substantial period, from 2010 to 2016, Intel held the L2 cache fixed at 256K. The Intel Xeon Scalable processors, however, have a 1M L2. Also, the Meltdown and Spectre vulnerabilities were patched with fixes that appear to flush the L1 (and L2?) caches at certain points.

Regardless of the complexities, my preference is to adopt a simple model under conservative performance assumptions. The apparent pattern is that the initial part of a multi-row or multi-execute operation has a much higher cost than the subsequent rows or executions within an execution plan operation. For the cost model, we are more interested in the cost of incremental rows and executes, because the first steps must be completed in any execution plan. It is the cost of subsequent elements that determines the optimal plan.

As a very rough model, the preliminary proposal for data already residing in the buffer cache is a page access cost of approximately 0.8 µs and a row access cost of 0.05 µs. The cost of a lookup to a heap (RID Lookup) appears to be about 2.5 µs per execute with row locks, and lower if a coarser lock is in effect.

The cost of key lookups to a clustered index appears to be 4 µs per execute at lower counts, falling to 2 µs per execute at higher counts. This should be similar to a loop join inner source execute.

Key Lookup - Scan Cross-over

Now consider the implications of the actual cost structure versus the existing IO-based cost model. As stated earlier, the SQL Server query optimizer treats a Key Lookup execute as about 3X the cost of scanning one page.

Suppose our table has 80 rows per page, or an average of about 100 bytes per row. The cost of each page scanned is 0.8 µs for the page plus 80 x 0.05 µs for the rows, 4.8 µs combined. (This translates to a data consumption rate of about 1.6GB/s, which might seem high. In a query aggregating several columns, the data consumption rate might be several hundred MB/s due to the cost of the logic.)
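A quick check of this arithmetic, under the proposed in-memory constants (0.8 µs per page, 0.05 µs per row):

```python
# In-memory scan cost per page at 80 rows per page.
PAGE_COST_US = 0.8
ROW_COST_US  = 0.05
rows_per_page = 80

scan_cost_per_page_us = PAGE_COST_US + rows_per_page * ROW_COST_US  # 4.8 us

# Implied data consumption rate: one 8KB page every 4.8 us, which works out
# to ~1.7 GB/s, in line with the ~1.6 GB/s figure cited above.
page_bytes = 8192
rate_gb_s = page_bytes / (scan_cost_per_page_us * 1e-6) / 1e9
```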

Assume the high-row-count cost is in effect for the key lookup, or approximately 2 µs per row, which is lower, not higher, than the cost of scanning one page.

The query optimizer gives up on the index seek + key lookup, switching to a scan, too soon by a factor of about 6. In other words, SQL Server significantly underestimates the cost of a scan, more so for narrow indexes than fat tables, and overestimates the cost of key lookups.
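The factor follows from the numbers above. A sketch: using the 2.5 µs heap lookup figure from the earlier section gives almost exactly 6, while the 2 µs clustered figure gives closer to 7.6.

```python
# Model: gives up on seek + lookup at ~0.316 lookups per page of the table.
# Measured: a lookup costs ~2-2.5 us versus ~4.8 us per page scanned, so the
# true break-even is around 2 lookups per page of the table.
model_lookups_per_page = 0.316    # 1 / 3.16, from the plan cost model
scan_us = 4.8                     # in-memory cost per page scanned

for lookup_us in (2.0, 2.5):
    actual_lookups_per_page = scan_us / lookup_us
    factor = actual_lookups_per_page / model_lookups_per_page
    print(f"lookup {lookup_us} us -> model switches {factor:.1f}x too early")
```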


There are serious discrepancies between the SQL Server execution plan cost model and the true cost of operations for data in memory. Not having a reasonably accurate cost model places serious limitations on what any automated tuning intelligence can accomplish. Even more serious are the limitations of the existing model of parallel execution plan operations. The result is that the execution plan provides very little insight into the expected scaling of parallel execution.