
TPC-H at SF300 (Correction 2011-03-03)

From Opteron dual-core to 12-core, and Xeon 7560 8-core

The HP TPC-H benchmark reports at SF300 provide an excellent history of the Opteron processor, along with results for both the latest 4-way Opteron and Xeon systems. The first result is for a 4-way dual-core Opteron, followed by the 8-way quad-core (2.5GHz Barcelona), the faster 2.7GHz Shanghai, the 8-way six-core (2.8GHz Istanbul), and most recently the 4-way 12-core Magny-Cours. A comparison of the TPC-H 300GB results for the 8-way ProLiant DL785 G6 and the 4-way DL585 G7 is interesting: the 4-way DL585G7 has 18% better performance on the Power metric.

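| System | Processor | Total Cores | Memory (GB) | SQL | Power | Throughput | Composite QphH |
|---|---|---|---|---|---|---|---|
| DL585 | Opt 8220 | 8 | 128 | 5rtm | 25,206.4 | 13,283.8 | 18,298.5 |
| DL785 | Opt 8360 | 32 | 256 | 8rtm | 67,287.4 | 41,526.4 | 52,860.2 |
| DL785 | Opt 8384 | 32 | 256 | 8rtm | 75,161.2 | 44,271.9 | 57,684.7 |
| DL785 G6 | Opt 8439 | 48 | 256 | 8sp1 | 109,067.1 | 76,869.0 | 91,558.2 |
| DL585 G7 | Opt 6176 | 48 | 512 | 8R2 | 129,198.3 | 89,547.7 | 107,561.2 |
| DL580 G7 | Xeon 7560 | 32 | 640 | 8R2 | 152,453.1 | 96,585.4 | 121,345.6 |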
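As a quick sanity check on the figures above: the TPC-H composite metric (QphH) is defined as the geometric mean of the Power and Throughput metrics. The sketch below recomputes the composite from the two components and also derives the Power gain of the DL585 G7 over the DL785 G6. The dictionary layout and function name are my own; the numbers are copied from the table. Any row where the computed and published values differ by more than rounding is worth re-checking against the published report.

```python
from math import sqrt

# (Power, Throughput, published Composite QphH), copied from the table above.
results = {
    "DL585 Opt 8220":     (25206.4, 13283.8, 18298.5),
    "DL785 Opt 8360":     (67287.4, 41526.4, 52860.2),
    "DL785 Opt 8384":     (75161.2, 44271.9, 57684.7),
    "DL785 G6 Opt 8439":  (109067.1, 76869.0, 91558.2),
    "DL585 G7 Opt 6176":  (129198.3, 89547.7, 107561.2),
    "DL580 G7 Xeon 7560": (152453.1, 96585.4, 121345.6),
}

def composite(power, throughput):
    """QphH composite: geometric mean of Power and Throughput."""
    return sqrt(power * throughput)

for name, (power, throughput, published) in results.items():
    computed = composite(power, throughput)
    diff = computed - published
    print(f"{name:20s} computed {computed:10.1f}  published {published:10.1f}  diff {diff:+.1f}")

# Relative Power gain of the 4-way DL585 G7 over the 8-way DL785 G6.
power_gain = results["DL585 G7 Opt 6176"][0] / results["DL785 G6 Opt 8439"][0] - 1
print(f"DL585 G7 vs DL785 G6 Power gain: {power_gain:.1%}")
```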

The significant differences between the two systems are listed below. Both systems have the same total number of cores: the 8-way with 6-core processors and the 4-way with 12-core processors. The DL785G6 cores run at 2.8GHz versus 2.3GHz for the DL585G7, about a 20% difference. The DL585G7 has twice the memory, 512GB versus 256GB.
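To put the clock difference in perspective, the Power results can be normalized by total core-GHz. This is only a crude measure of my own choosing (it ignores memory size, interconnect, and software version differences), but it shows how much more the DL585G7 delivers per unit of raw clock. The values are taken from the text and the results table; the names in the sketch are illustrative.

```python
# Core count, clock, and TPC-H SF300 Power result for the two 48-core systems.
systems = {
    "DL785 G6": {"cores": 48, "ghz": 2.8, "power": 109067.1},
    "DL585 G7": {"cores": 48, "ghz": 2.3, "power": 129198.3},
}

def power_per_core_ghz(system):
    """Power metric divided by aggregate clock (cores x GHz). A rough normalization only."""
    return system["power"] / (system["cores"] * system["ghz"])

old = power_per_core_ghz(systems["DL785 G6"])
new = power_per_core_ghz(systems["DL585 G7"])
clock_ratio = systems["DL785 G6"]["ghz"] / systems["DL585 G7"]["ghz"]

print(f"Clock advantage of DL785 G6: {clock_ratio - 1:.1%}")
print(f"Power per core-GHz, DL785 G6: {old:.1f}")
print(f"Power per core-GHz, DL585 G7: {new:.1f}")
print(f"DL585 G7 advantage per core-GHz: {new / old - 1:.1%}")
```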

For TPC-H at SF300 with SQL Server 2008 page compression, 256GB is not quite sufficient to hold all of the database tables and indexes. With 512GB, there is more than enough memory for data, indexes, and probably most hash join intermediate results, so tempdb activity should be minimal to moderate.

Without compression, the TPC-H SF300 LineItem table is actually 240-255GB(?) when the new 3-byte date data type is used in place of the original 8-byte datetime in 3 columns. With the other tables and indexes, the total size might be 420GB?
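The effect of the date data type on LineItem can be estimated with simple arithmetic. The sketch below assumes roughly 6 million LineItem rows per unit of scale factor (the approximate TPC-H scaling rule, not an exact row count) and counts only the raw column bytes, ignoring row overhead and compression.

```python
SCALE_FACTOR = 300
ROWS_PER_SF = 6_000_000   # assumption: approximate LineItem rows per unit of scale factor
DATE_COLUMNS = 3          # the 3 date columns mentioned above
DATETIME_BYTES = 8        # original datetime storage size
DATE_BYTES = 3            # SQL Server 2008 date storage size

rows = SCALE_FACTOR * ROWS_PER_SF
saved_bytes = rows * DATE_COLUMNS * (DATETIME_BYTES - DATE_BYTES)

print(f"Estimated LineItem rows: {rows:,}")
print(f"Bytes saved per row: {DATE_COLUMNS * (DATETIME_BYTES - DATE_BYTES)}")
print(f"Estimated total saving: {saved_bytes / 1e9:.1f} GB ({saved_bytes / 2**30:.1f} GiB)")
```

Under these assumptions the switch saves 15 bytes per row, so the uncompressed table would be a few tens of GB larger with datetime columns.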

| System | DL785 | DL785 G6 | DL585 G7 | DL580 G7 |
|---|---|---|---|---|
| Processor | Opteron 8384 | Opteron 8439 | Opteron 6176 | Xeon 7560 |
| Sockets x Cores | 8 x 4 = 32 | 8 x 6 = 48 | 4 x 12 = 48 | 4 x 8 = 32 |
| Frequency | 2.7GHz | 2.8GHz | 2.3GHz | 2.26GHz? |
| Memory | 256GB | 256GB | 512GB | 640GB |
| Storage | 204 HDD | 194 HDD | 4 SSD | SSD? 6 HDD |
| Windows Server | 2008 RTM | 2008 EE SP1 | 2008 R2 EE | 2008 R2 EE |
| SQL Server | 2008 RTM | 2008 EE SP1 | 2008 R2 EE | 2008 R2 EE |

The use of SSD storage in the DL585G7 is not expected to affect performance, and was probably chosen for lower cost. The 194 15K HDDs and 12 storage enclosures in the DL785 cost $110K, while the 4 320GB Fusion-io drives in the DL585 cost $55K. If the DL585 had 256GB of memory or less, then the SSD storage would deliver moderately better performance than HDD storage. Another significant difference is the set of improvements in Windows Server 2008 R2, several of which have a major impact on scaling to a high number of processor cores.

When the HP DL580 G7 TPC-H SF300 report came out, I was confused and thought that the storage configuration, with data and tempdb on 6 10K HDDs, might be an error. (Thanks to EW for pointing out that this is not an error.) Apparently, with 640GB of memory, the data, indexes, and intermediate query results can all be kept in memory. So the 6 HDDs do not seem to handicap performance for either data or tempdb activity.

The chart below shows the TPC-H power query run times for the DL585G7 relative to the DL785G6.

[Figure: TPC-H Power query run times, DL585G7 relative to DL785G6]

Overall, the DL585G7 with 4 Opteron 6176 processors scores about 18% higher on the Power metric than the DL785G6 with 8 Opteron 8439 processors. For the individual queries, several are moderately faster, 3 are much faster, 5 are about the same, and 3 are actually significantly slower. The DL785G6 has the higher core frequency, which by itself should make every query run faster on that system. Differences in system architecture are harder to account for, as the two systems may differ in how the individual dies are connected. The greater memory on the DL585 is expected to make certain queries run faster. The scaling improvements in R2 (OS and SQL) might contribute significant gains in some queries, but may also have negative effects in others.
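It may help to recall how per-query differences roll up into the Power metric. TPC-H Power is 3600 x SF divided by the geometric mean of the 22 query times and 2 refresh function times, so the Power ratio between two systems at the same scale factor is the inverse of the geometric mean of the per-item run-time ratios. The sketch below illustrates this. The ratios in it are invented placeholders that loosely mirror the pattern described above; they are not measurements from either report.

```python
from math import exp, log

def power_ratio(runtime_ratios):
    """Power(new) / Power(old), given per-item run-time ratios new/old.

    Works because Power is inversely proportional to the geometric mean
    of the individual run times at a fixed scale factor.
    """
    mean_log = sum(log(r) for r in runtime_ratios) / len(runtime_ratios)
    return 1.0 / exp(mean_log)

# Hypothetical run-time ratios (new/old); NOT taken from the benchmark reports.
placeholder_ratios = (
    [0.85] * 11    # moderately faster
    + [0.50] * 3   # much faster
    + [1.00] * 5   # about the same
    + [1.30] * 3   # significantly slower
    + [1.00] * 2   # refresh functions, assumed unchanged
)

print(f"Items: {len(placeholder_ratios)}")
print(f"Power ratio for placeholder data: {power_ratio(placeholder_ratios):.3f}")
```

Because the mean is geometric, a few queries that run much faster can outweigh a similar number that run noticeably slower.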

It would be very helpful to have access to the actual execution plans, along with execution statistics, to determine whether the differences can be attributed to plan differences or to differences in disk IO.

SF300

[Figure: SF300 big queries]

[Figure: SF300 middle queries]

[Figure: SF300 small queries]