Parent, Previous, Next, Ó 2002 Joe Chang. All rights reserved.

SQL Server Execution Plan Costs, Part II

Joins

There are three major types of join operations in SQL Server: nested loops (loop), hash and merge. The nested loops join will appear often in transaction processing applications where queries typically involve few rows from each table. Hash and merge joins will occur frequently in decision support applications where joins frequently affect a large number of rows. There are actually three types of hash joins. The hash join investigated here is the in-memory hash join, where the entire hash table fits in memory. The other two hash join types apply for very large data structures that do not fit into available memory. The query optimizer decides which join type and join order is more efficient. If a specific join type (loop, hash or merge) is specified, then the join order is also fixed.

Nested Loops Joins

The query below shows a simple join between two tables. The execution plan for a join could involve Index Seek operations, bookmark lookups, or table scans, depending on the search arguments specified and the indexes available.

SELECT M2C_01.ID, M2D_01.Value
FROM M2C_01 INNER JOIN M2D_01 ON M2D_01.ID = M2C_01.ID2
WHERE M2C_01.GroupID = 1

For the above query, the M2C_01 table has a covered index on the columns GroupID, ID2 and ID (which is automatically included if ID is part or all of a clustered index) and the M2D_01 table has a clustered index on the ID column. The loop join execution plan, shown in Figure 2-14, has the lowest plan cost at low row counts.

_x0000_s1027

Figure 2-14. Loop join execution plan.

The upper table (M2C_01) in the above execution plan is the outer source and the lower table (M2D_01) is the inner source. The cost details for the three components of the above nested loops join are shown in Figures 2-15a, 15b, and 15c. The loop join plan first retrieves rows from the outer source. For each row from the outer source, all matching rows from the inner source are found. Even though no explicit search argument is specified for the inner source table, the join condition itself can be employed as a search argument in a nested loops join. In this example, ten rows are retrieved from the outer source, where the details show estimated row count = 10 and estimated number of executes = 1.

_x0000_s1035

Figure 2-15a. Outer source Index Seek cost details.

_x0000_i1028

Figure 2-15b. Inner source Index Seek cost details.

_x0000_s1036

Figure 2-15c. Loop join cost details.

Each row from the outer source matches to exactly one row in the inner source. The detail for the inner source Index Seek operation indicates an estimated row count of 1 and the estimated number of executes of 9.919 (actual is 10). The Nested Loops/Inner Join detail shows an estimated row count of 9 but the SET option SHOWPLAN_ALL reveals this is 9.92. The displayed cost detail rounds down, instead of rounding to the nearest value. Note that the argument in the loop join detail is OUTER REFERENCES. The significance of this is discussed later.

When the outer source is an Index Seek operation (no bookmark lookups), the cost for this component is just that of the Index Seek operation discussed earlier. The loop join component operation with argument condition OUTER REFERENCES has zero I/O cost and the CPU cost formula follows:

Loop join CPU Cost = 0.00000418 per row

The cost detail for the inner source table shows the I/O and CPU cost is that for a single row Index Seek (0.0063285 and 0.0000796), but the cost of the entire operation (executed 9.919 times) is 0.00768413. The cost formula versus number of executes is not simply the sum of the I/O and CPU costs multiplied by the number of executes.

In fact, at least three different costs versus row count behaviors have been observed for the loop join inner source, as can be demonstrated by the three queries below involving a two table join operation. (There is a fourth case involving a spool operation that will be covered in a later release.) A loop join is explicitly specified which forces the join order, with the first table as the outer source and the second table as the inner source. For all cases, the outer source table has a covered index beginning with the GroupID column followed by the join column.

Case 1

SELECT M2C_01.ID, N1C_01.Value
FROM M2C_01 INNER LOOP JOIN N1C_01 ON N1C_01.ID = M2C_01.ID6
WHERE M2C_01.GroupID = @Group1

Case 2

SELECT M2C_01.ID, N1C_01.Value
FROM M2C_01 INNER LOOP JOIN M2D_01 ON M2D_01.ID = M2C_01.ID2
WHERE M2C_01.GroupID = @Group1

Case 3

SELECT M2C_01.ID, N1C_01.Value
FROM M2C_01 INNER LOOP JOIN M2D_01 ON M2D_01.ID = M2C_01.ID2
WHERE M2C_01.GroupID = @Group1 AND M2D_01.GroupID = @Group2

There is no logical difference between Case 1 and Case 2. The only difference is in the size of the inner source tables. The Case 1 inner source table, N1C_01, is small, and in this example occupies in a single 8KB page. The Case 2 inner source table, M2D_01, is not small at 50,000 rows and 506 pages. The outer source table, M2C_01, has 50,000 rows, but this has no influence on the cost structure. Both the Case 1 and Case 2 joins use the clustered index for the inner source Index Seek operation.

Case 3 has the same tables as Case 2. The difference between them is that Case 3 specifies an explicit search argument on the inner source table. This allows the inner source Index Seek to use a covered index beginning with the specified search argument, GroupID, followed by the column specified in the join condition.

Consider now the nature of the loop join operation. For each row from the outer source, matching rows in the inner source are found. There is no guarantee that successive rows from the outer source will match to successive rows from the inner source. For example, the first row from the outer source might have a join column value of 1 and the next row might be 2 or 50,000. Each row from the outer source requires a complete Index Seek operation to locate matching entries from the inner source table. The only certain synergy is that the root level of the index employed for the inner source table has already been found.

The Figure 2-16a shows the execution plan normalized cost for the Case 1, 2 and 3 loop join queries in the above example. Figure 2-16b shows the normalized differential cost per row for the three loop joins discussed above.

_x0000_i1029

Figure 2-16a. Loop join cost versus row count.

_x0000_i1030

Figure 2-16b. Loop join differential cost per row.

The “normalized cost” is the cost divided by the single row Index Seek cost (0.0064081 for ≤1GB). The normalized cost for >1GB cost has essentially the same cost structure as for ≤1GB, except that Index Seek base cost is roughly one-half that of the ≤1GB case. All three cases have the same outer source Index Seek and loop join component costs, so the difference is due entirely to the inner source Index Seek cost. The differential cost per row at 300 rows is calculated by taking the difference in cost between the 300 row join and the 100 row join divided by the difference in rows (200).

In Case 1, the inner source table fits on a single page. Hence the execution plan for the inner source simply involves repeatedly reading that one leaf level page for each row from the outer source. The Case 1 inner source cost formula is very closely approximated by:

Case 1 Inner Source Cost = 0.0063285 + 0.0000796 per execute

In Case 2, the inner source table is not small. The cost formula for Case 2 follows the formula below for low number of executes (~130). Between 130 and 140 rows, the Case 2 loop join exhibits a sharp jump in cost. Somewhere above 1000 rows, the Case 2 cost per incremental execute eventually drops to the same incremental cost as Case 1 (0.0000796).

Case 2 I.S. Cost = 0.0063285 + 0.000140-0.000154 per execute

The Case 3 inner source table is not small either, but a covered index isolates the required rows to a limited range. The Case 3 inner source cost formula is similar to Case 2 at low row counts, but does not exhibit the sharp jump between 130 to 140 rows. The Case 3 incremental cost per execute rather gradually decreases to the same incremental cost of Case 1 and 2 at high row counts. All three cases have approximately the same incremental cost per execute at high row counts.

The Case 3 cost structure observed above may actually occur only in limited circumstances. It could be the SARG on the inner source table in conjunction with the very high row density (500 rows/page) covered index has the effect of making the inner source equivalent to a small table. On examining the Case 3 loop join (SARG on each source) on tables with lower density (100 rows/page) clustered indexes on the inner source, a different cost structure is observed. Figure 2-17 compares the Case 2 and Case 3 loop join cost structures where the Case 3 inner source table has a lower row density clustered index on GroupID column.

_x0000_i1031

Figure 2-17. Case 2 and 3 loop joins with lower row density indexes.

Up to 130 rows, the incremental cost per inner source row is ~0.00015 as observed earlier. The same sharp jump in the inner source cost is observed to occur between 130 and 140 rows as for the Case 2 loop join. Between 140 and 240 rows, the incremental cost is very low. Beyond 260 rows, the Case 3 incremental cost is reasonably linear at 0.0054 per row. From 1000 rows and higher, this Case 3 loop join has higher cost than the Case 2 loop join. This is opposite of what was seen in the earlier Case 3 example.

One-to-many loop joins and other loop join examples

The above plan cost charts are for one-to-one joins, where each row from the outer source joins to exactly one row from the inner source. When each row from the outer source joins to more than one row from the inner source, then the incremental CPU cost (which is a “per execute” cost) on the inner source is:

I.S. CPU Cost 0.0000011 per additional row

Presumably, if each row from the outer source matches to a very large number of rows in the inner source, the 0.00074074 per additional leaf level page cost applies as well, but this condition has not been tested. The loop join component cost (0.00000418 per row) applies to all rows from the join.

Loop joins can involve bookmark lookup operations for either or both the outer and inner sources. The first execution plan in Figure 2-18 has a bookmark lookup operation for the outer source occurring before the join operation. The second plan shows a bookmark lookup for the inner source occurring after the join operation.

_x0000_i1032

_x0000_i1033

Figure 2-18. Loop joins with bookmark lookup.

The bookmark lookup can occur before or after the join operation. The bookmark lookup cost appears to follow the rules established earlier, but this has not been extensively investigated. Occasional cost anomalies in the bookmark lookup cost have been observed. The outer source can also be a table scan operation. It is unlikely for a loop join to have a table scan on the inner source, since this would be done once for each entry from the outer source. A spool component can occur between the inner source and the loop join components when the inner source index does not have all the search columns in a required order.

The loop join cost structure is summarized as follows. The outer source cost component follows the formulas derived earlier for Index Seek (and bookmark lookups when applicable) or Table Scan operations. The inner source cost structure is more complicated. The incremental “per row” cost is relatively low at low row counts, but can be high or low at higher row counts. The loop join component operation is fairly simple, having only a linear “per row” CPU cost.

When search arguments are specified for only one table, then having that table as the outer source can eliminate the high table scan cost overhead. An index on the columns specified on the join condition for the inner source table can be used in a loop join. If search arguments are specified for the inner source table, then an index on the SARG followed by the join column can be positive in some cases, and negative in other cases. Another other factor that can influence join order is the number of rows involved from each table. The Loop Join “per row” cost structure favors the table providing fewer rows as the outer source.

Hash Joins

Both hash and merge joins process the outer and inner sources separately. The join condition is not used as a search argument for the inner source table. When no search argument is explicitly specified for either table or no suitable index exists, a scan on that table is required. It is possible for hash and merge joins to employ bookmark lookup operations, but this is unlikely unless the join type is forced.

The query below explicitly states a hash join. Search arguments are specified for each table in the join operation. Both tables have clustered or covered indexes suitable for the join specified.

SELECT m.ID, n.Value
FROM M2C m INNER HASH JOIN M2D n ON n.ID = m.ID
WHERE m.GroupID = @Group1 AND n.GroupID = @Group2

The hash join execution plan shown in Figure 2-19 is composed of an Index Seek operation for each source, and the hash join operation, for a total of three components. The cost detail for the hash join operation is shown in Figure 2-20.

_x0000_i1034

Figure 2-19. Hash join execution plan.

_x0000_i1035

Figure 2-20. Hash join cost details.

The hash and merge join cost structure for both outer and inner sources follows that of the stand alone Index Seek or table scan operations previously described. Both outer and inner source operations are processed in a single execute for all rows involved, unlike the loop join inner source, where the inner source is executed once per row from the outer source. The cost structure for the hash join component operation closely fits the following formula:

CPU Cost = ~0.017770 + ~0.00001881 per row, 1-to-1 join
         + 0.00000523 to 0.000000531 per additional row in IS

The first line applies for a 1-to-1 join. The second line applies when each row from the outer source joins to more than one row from the inner source table. This “per row” cost structure also favors having the table providing fewer rows as the outer source. The total cost of the above hash join is the sum of: the outer source Index Seek, the inner source Index Seek and the hash join.

Merge Joins

The details of the merge join are discussed in SQL Server documentation and other sources. There are two types of merge joins, the one-to-many (includes one-to-one) merge join and the many-to-many join. Both types of merge joins require the rows from each table be in sorted order. The one-to-many join is the simpler and more efficient operation. The key additional requirement for a one-to-many merge join is that the rows from the outer source must be unique on the join column. Each row from the outer source can join to any number of rows from the inner source, but each row from the inner source can join to no more than one row from the outer source. A many-to-many merge join does not require this condition but is a more complicated operation.

The query below explicitly specifies a merge join, which also forces the join order. The outer source (M2C) has a covered index on the columns GroupID, ID. The join column on the outer source is the ID field, which is the primary key, so the one-to-many merge join condition is met. An explicit search argument is specified on both tables, allowing Index Seek operations on both outer and inner sources, instead of table scans. Both tables have clustered or covered indexes, so bookmark lookup operations are not required.

SELECT m.ID, n.Value
FROM M2C m INNER MERGE JOIN M2D n ON n.ID = m.ID
WHERE m.GroupID = @Group1 AND n.GroupID = @Group2

The merge join execution plan is shown in Figure 2-21. Figure 2-22 shows the merge join details. Note that the argument item at the bottom is MERGE. The cost structure of this merge join is composed of three component operations: two Index Seek operations (one each for the outer and inner sources) and the merge join component. The cost structure of the Index Seek operations is already known.

_x0000_i1037

Figure 2-21. Merge join execution plan.

_x0000_i1037

Figure 2-22. Merge join cost details.

The cost structure for the merge join component is approximately as follows:

CPU Cost = ~0.0056046 + ~0.00000446 per row, one-to-one

There is a small amount of variation in the cost formula from case to case. For a one-to-many join, the cost of each additional row in the inner source is:

CPU Cost 0.000002370 per additional inner source row

As with all types of joins, the cost structure favors the table providing fewer rows as the outer source, and the table providing more rows as the inner source.

Many-to-Many Merge Joins

In the query below, the join column for the outer source is the ID2 column, which not the primary key, nor is there a unique index on this column. Even though the join condition on the other table is unique, the explicit merge join forces the specified join order, with the first table as the outer source. Hence this query requires a many-to-many merge join. The many-to-many merge join execution plan is shown in Figure 2-23 and the join detail in Figure 2-24.

SELECT m.ID, n.Value
FROM M2C m INNER MERGE JOIN M2D n ON n.ID = m.ID2
WHERE m.GroupID = @Group1 AND n.GroupID = @Group2

_x0000_i1038

Figure 2-23. Execution plan for a many-to-many merge join.

_x0000_i1039

Figure 2-24. Many-to-many merge join cost details.

The I/O cost here is not zero, and the argument is MANY-TO-MANY MERGE. On examining the cost over a range of rows, the many-to-many merge join component operation is observed to follow the cost formula:

I/O Cost = 0.000310471 per row
CPU Cost = 0.0056000 + 0.00004908 per row, 1-1

The STATISTICS IO output will show the presence of many-to-many merge with a worktable preceding the two tables involved as part of the IO summary.

Table 'Worktable'. Scan count 749, logical reads 1250 …

Merge Joins with Sort Operation

A merge join operation can be accomplished with a sort operation when a sorted index is not available. Figure 2-25a shows the execution plan where an index on inner source covers the required rows, but is not in the sorted order specified by the join condition. Hence a sort operation is performed prior to the merge join. The sort operation cost detail is shown in Figure 2-25b.

_x0000_i1040

Figure 2-25a. Execution plan for a merge join with sort operation.

_x0000_i1041

Figure 2-25b. Sort operation cost details.

The sort CPU cost versus rows is shown in Figure 2-26. The pattern of the sort CPU cost is slightly more difficult to deduce. Notice that when a cost of 0.0000785 is subtracted from the sort cost, the log-log scale graph is nearly linear up to 5,000 rows, but with a slope slightly more than 1. Some where above 5,000 rows, the sort CPU cost behavior changes, but very high row count sort costs have not been explored in detail.

_x0000_i1042

Figure 2-26a. Sort operation CPU cost versus rows.

The sort operation cost approximately follows the formula below up to 5,000 rows. The sort operation has a higher per row cost structure beyond 5,000 rows.

Sort I/O Cost = 0.011261261
Sort CPU Cost ~ 0.0000785 + 0.000005 * (rows ^ 1.16)
Sort CPU Cost ~ 0.000100079 + 0.00000305849 * (rows-1)^1.26)

The Sort CPU cost formula is nonlinear. That is, the sort cost per row increases with the number of rows to be sorted.

_x0000_i1043

Figure 2-26b. Sort operation CPU cost versus rows.

Loop, Hash and Merge Joins Compared

Having examined the loop, hash and merge join execution plan cost formulas individually; the three join types will now be compared side by side where the requirements for all three join types are met.

The Figure 2-27 below shows the base costs for the loop, hash and merge joins (≤1GB). The base costs excludes the per row cost, including the first row. The outer source and inner source base costs represents the Index Seek base costs excluding the “per row” cost of 0.0000011. The join cost also excludes the first row and are approximates as there is small amount of variation from case to case. The >1GB base costs differ only in the Index Seek contribution (0.003283025).

_x0000_i1044

Figure 2-27. Loop, hash and merge join base costs for ≤1GB Index Seek.

The costs per row and per page for the components of the one-to-one loop, hash and merge join with explicit SARG on both tables are shown in the Table 2-1.

Per row costs 1:1LoopHashMerge
Inner Source  0.00007960-
~0.00015400/p
  0.00000110/r
+0.00074074/p
  0.00000110/r
+0.00074074/p
Join  0.00000418/r ~0.00001881/r ~0.00000446/r

duplicate, Courier font

Per row costs 1-1 Loop Hash Merge
Outer Source
 0.00000110/r
+0.00074074/p
 0.00000110/r
+0.00074074/p
 0.00000110/r
+0.00074074/p
Inner Source  0.00007960-
~0.00015400/r
 0.00000110/r
+0.00074074/p
 0.00000110/r
+0.00074074/p
Join  0.00000418/r ~0.00001881/r ~0.00000446/r
Total  0.000084880-
~0.000155280
+0.00074074/p

  0.00002101 +
2*0.00074074/p

  0.00000666 +
2*0.00074074/p

Table 2-1. Component cost formulas for additional rows by join type.

The cost anomalies in the loop join is not considered. The “per row” (/r) costs applies to all rows, the “per page” (/p) costs apply only if additional leaf level pages are required. The formulas for the total per row and per page costs need to take into account the “per page” conditions. Table 2-2 below shows the computed total cost per row at 10 and 100 rows per page, when the additional leaf level page costs apply.

Per row costs 1-1 Loop Hash Merge
10 rows / page 0.0002294 0.0001692 0.0001548
100 rows / page 0.0001627 0.0000358 0.0000215

Table 2-2. Total costs formulas for additional rows by join type and row density.

The loop join has the lowest base cost. This is due to the loop join component operation having base cost of zero; while the hash and merge join components have base costs of ~0.017770 and ~0.0056046 respectively.

The outer source costs are the same for all three joins, being that of an Index Seek operation. The difference in the “per row” cost structure for three join types comes mostly from the inner source, and partly from the join component itself. The inner source cost structure for the loop join reflects repeated Index Seek operations. The hash and merge join inner source cost structure is simply that of an Index Seek operation. The per row cost on the join component itself has the loop join lowest, followed closely by the merge join and the hash is about four times higher than the loop.

This cost structure results in the loop join having the lowest plan cost for low row counts. The merge join has the lowest total “per row” cost followed by the hash join, and the loop join has the highest per row cost. The merge join is always has a lower plan cost than the hash join. At some intermediate row count, the merge join becomes less expensive than the loop join. At a higher row count, the hash join also becomes less expensive than the loop join. The merge and hash joins advantage in the “per row” costs is greatest for high density indexes, (rows per page). When the density in rows per page is lower, the “per page” cost in the inner source negates some of the merge and hash joins advantage.

Table 2-3 below shows the cost per row for additional rows in the inner source.

Additional IS cost/
per row
LoopHashMerge
Inner Source  0.00000110/r
+0.00074074/p
  0.00000110/r
+0.00074074/p
  0.00000110/r
+0.00074074/p
Join  0.00000418/r  0.00000523/r  0.00000237/r

duplicate

Additional IS costs per row Loop Hash Merge
Inner Source  0.00000110/r
+0.00074074/p
 0.00000110/r
+0.00074074/p
 0.00000110/r
+0.00074074/p
Join  0.00000418/r  0.00000523/r  0.00000237/r

Table 2-3 Inner source cost for additional rows of one-to-many joins.

Note that the per page component for the loop join should apply only if each row in the outer source joins to sufficiently many rows in the inner source so as to require additional leaf level pages, while the per page component in the hash and merge joins apply when the sum total of all rows in the inner source requires additional leaf level pages. If there is a large disparity in the number of rows from each table, then the advantages of the hash and merge join is lost because the loop join will have the lower per additional inner source row cost.

Figures 2-28a and 2-28b shows the normalized loop, hash and merge join plan costs for ≤1GB. Figure 2-28c shows the same plan costs for the >1GB.

_x0000_i1045

Figure 2-28a.Plan cost (≤1GB) for loop, hash and merge joins on log scale.

_x0000_i1046

Figure 2-28b.  Normalized cost (≤1GB) loop, hash and merge joins, linear scale.

_x0000_i1047

Figure 2-28c.Normalized cost (>1GB) loop, hash and merge joins, linear scale.

Explicit search arguments are specified for both the outer and inner source tables, suitable indexed exists on each table, and each row in the outer source joins to exactly one row in the inner source. The execution plans cost are scaled relative to the ≤1GB or >1GB single row Index Seek cost as appropriate.

The loop join component operation itself has no startup cost, and the only startup costs are for the Index Seek operations on the inner and outer source tables, hence for low row counts, the total cost is just over twice the Index Seek operation. The hash join operation has a startup cost just less than three times the ≤1GB Index Seek for a total startup cost of just less than five times an Index Seek operation. The merge join has a startup cost just less than a single ≤1GB Index Seek, for a total startup cost of slightly less than three Index Seek operations. When suitable clustered or covered indexes are available on both tables, the merge join becomes favorable over loop join at 40 rows, hash join over loop at ~160 rows. In this example, the execution plan uses compact, high density covered indexes, holding over 400 rows per page. The cross-over points are higher for lower density indexes.

Figure 2-28d compares the loop hash and merge join costs for both ≤1GB and >1GB. Note the cost scale has not been normalized.

_x0000_i1048

Figure 2-28d.Plan cost, ≤1GB and >1GB loop, hash and merge joins.

There is a slight difference in the >1GB cross-over points between a loop join and the hash and merge joins compared with ≤1GB systems. On a >1GB server, the Index Seek base I/O cost is approximately one-half that of a ≤1GB system. The hash and merge join base costs do not change between ≤1GB and >1GB systems, so the startup costs are lower on the absolute scale, but higher relative to the >1GB Index Seek cost.

The above comparisons involving loop joins excluded the cost anomalies observed at higher row counts. Figure 2-28e shows the Case 2 loop join compared the hash and merge joins. The loop join cost jump between 130-140 rows effectively causes a switch to either a hash or merge join if suitable indexes are available.

_x0000_i1049

Figure 2-28e.Normalized cost ≤1GB join with 1 SARG to small table.

Figure 2-29 below shows the (>1GB) merge join with sort and many-to-many merge join compared with the standard loop, hash and merge joins.

_x0000_i1050

Figure 2-29.Merge join with sort and many-to-many merge joins compared with loop, hash and merge (all >1GB).

The merge join with sort is slightly less expensive than the hash join at low row counts, but becomes more expensive at higher row counts. The many-to-many merge join cost is between the regular merge and the hash joins at low row counts, but becomes more expensive than both for higher row counts.

Hash and Merge Joins with 1 SARG

Figure 2-30a shows the plan cost in the event that a search argument is specified for only one of the two tables so that the hash and merge joins requires a table scan on the other table. The loop join can use an index on the join condition, avoiding the table scan. The inner source table in this case is a small table, so the Case 1 loop join cost structure applies. Since the table scan base cost is approximately 6 times than an Index Seek base cost for ≤1GB, and 12 for >1GB, and much higher than the incremental row costs, the merge to loop and hash to loop cross-over points occur at substantially higher row counts (500-700).

_x0000_i1051

Figure 2-30a.Normalized cost ≤1GB join with 1 SARG to small table.

Join Summary

The loop, hash and merge join characteristics follows the description given in the “Advanced Query Tuning Concepts” section of “Optimizing Database Performance” in the SQL Server Books Online. However, it is useful to see the actual numbers rather than rely on qualitative descriptions such as small, large and similar.