QDPMA.com - SQL Server Consultant - Joe Chang

About Joe, Contact: jchang6 at yahoo.com or jchang61 at outlook.com

SRAM as Main Memory (2017-Dec)
Proponents of SRAM as main memory almost never get past the huge price disparity between DRAM, heavily optimized for cost, and SRAM, used as on-die cache optimized for performance in L1 and L2, for density in L3 and sometimes for power. It may also be difficult to imagine database transaction processing as a candidate for SRAM main memory given the huge memory configurations commonly deployed. But consider the following.

1. A system could have both SRAM and DRAM as main memory, with the SRAM memory being far larger than the on-die cache of modern processors, yet smaller than DRAM. Thinks several to ten plus GBs versus several tens of MBs.
2. The large majority of memory in a database system is used to reduce IO, for which DRAM performance parameters and cost are eminently suitable. A minority of memory implemented with SRAM could potentially achieve a substantial reduction in average memory latency.
3. It so happens that database transaction performance is largely governed by round-trip memory latency, given the nature of B-tree index navigation and the disparity between the CPU clock-cycle and DRAM latency.
4. Attempting to scale transaction processing performance on a multi-socket system has moderate to limited effectiveness as few (zero?) real world databases have been architected to the characteristics of a system with non-uniform memory access (NUMA) and can even be counter-productive.
Scaling from 1 to 2 sockets might be 1.6X or less, and even lower from 2S to 4S, with the cumulative 1S to 4S gain of around 2X.
5. Increasing the degree of Hyper-Threading (HT), generic term Simultaneous Multi-Threading (SMT), has the potential for outstanding throughput performance at very high concurrency so long as there is almost zero contention between threads (guess how likely this is), but does nothing for single thread performance.
6. There is an established and prevalent pattern of willingness to throw hardware at the transaction processing system in the hope that this will solve problems of poor architecture and coding. This includes the employment of multiple $5-10K processors. Yet doing so often aggravates existing and creates new problems from NUMA and high concurrency operation.
7. A single-socket system has better single-thread performance than a multi-socket system as all memory accesses are to the local node resulting in lower average latency. Even though a single-socket system is now capable of handling most workloads, it might be perceived as insufficient for the usual bigger and more expensive must be better preconceived notions.
However, the single-socket system with SRAM plus DRAM memory further improves single thread performance and could match the throughput of the conventional DRAM two-socket and perhaps even that of a 4-socket system. The key is to make memory latency as low as possible, even if extraordinary measures are necessary.
The single socket system would not have the negative aspects of NUMA. It would need fewer cores and hence operate at lower concurrency level, avoiding contention issues. All of this makes the single-socket SRAM system a better technical solution in having fewer problematic aspects.
8. Consider that the flagship Intel Xeon SP 8180 Platinum 28-core processor carries a price of $10,000. The SRAM budget would be $10K at a 50% gain, and $30K at 100% gain for equality in throughput. The better single-thread performance and fewer negative aspects of the single socket SRAM memory system also has value, which could be substantial.

All of the above creates substantial space for SRAM to have high value impact as main memory. An SRAM cost structure 100 times higher than DRAM would not be prohibitive. And knowing database world, the real question will be: how much can I get, and I want more that.

 

I made this for a talk a few years ago. I liked is so much that it is now in all my slides.

Natural

Of course, that does not mean it is complete, so please send feedback via Joe on SQLblogs

Articles

In previous articles, I have discussed processor and system architecture, server systems, and performance benchmarks, often mixing more than one topic. Going forward, I will consolidate each topic into separate collections:

Query Optimizer and Performance  formerly "The SQL Server Cost-Based Optimizer"

  Scaling Project Overview 2016-12

  Insert Performance Limitations with Sequentially Increasing Index 2016-10

  kCura Relativity 2013-04, updated 2013-09
  Path to In-Memory Databases - Hekaton

Server Systems  2017-11

  Rethink Server Sizing 2017-12

  Memory Latency and NUMA 2016-12

Storage  

  Transaction IO Performance on Violin (2015-02),
  SAN IO Performance Problems (2015-02),

System Architecture   2017-01

  Rethinking System Architecture 2017-01

  Amdahl Revisited Update 2015-05

Processor Architectures   Updated 2016-12 new die images scaled

Performance Benchmarks  

TPC-H Studies
 SQL Server scaling and other characteristics are investigated using the TPC-H data generator and test queries. The test conditions are not identical with publication requirements.

  Parallel Execution in SQL Server 2016 (2017-01).

  Hyper-Threading performance on TPC-H queries (2013-03).

Performance Tools

ExecStats is my own SQL Server performance analysis and data collection tool. Please try it out and send feedback. SQL Exec Stats has been combined with SQL System as of 2012 Q4,

I am abandoning the specific build links with a generic that should point to the latest. The general link should normally be for the latest build, but it will be the last .NET 3.5 for the time being.

Download       Exec Stats (latest, .NET 4.00, 2017-10-24)

Exec Stats 2017-10-24 ?.

Exec Stats 2017-09-11 change perf ctr data, change in sql text, may have some inconsistencies.

Exec Stats 2017-08-31 change main info order

Exec Stats 2017-07-02 Azure fixes, sys.databases database_id values are not consistent between master and user database, must use value in the user database.

Exec Stats 2017-05-12 Collecting % Processor Performance and Processor Frequency, PF shared with the % Processor, and PP shared with IOPS. Left click on respective graph until back color changes.

Exec Stats 2017-05-08 preliminary support for Query Store

Now using Visual Studio 2017. For unknown reason, VS2017 does not like config files generated by previous versions. So copy your setting from previous versions to the current config file.

Notes on olders versions:
For the last several months, the principle mode of dm_exec_query_stats grouping was a hybrid of query hash and batch via sql_handle. The issue was that the same SQL/query hash would be assigned to the first batch encountered. The new version reintroduces batch mode, which is also a hybrid, first using the sql_handle + offset. However, any query hash for the the end is -1 is grouped by query hash. This should work for most cases, but please advise if not.
Exec Stats 2016-10-08 latest Gembox 3.9, VS15 preview 5.
Exec Stats 2016-03-03 now using dm_os_performance_counters instead of Windows API calls for SQL Server performance counters. Windows Server 2012 did not return values, and I could not find any discussion of this matter.
Exec Stats 2015-12-16 SQL Server 2008R2 RTM fix, exception handling
Exec Stats 2015-11-04 New batch mode, query hash option
2015-03-11 partition key not lead annotated

Documentation to follow. Feedback is desired.

My plan is to use JSON but I have not done so at this time.

SQLExecStats documentation, in the process of being updated, ok, the documentation is seriously out of date. Every time I update the documentation, I make further changes to ExecStats, obsoleting the documentation changes just made - oh well.

Sometime I will try allow reading execution plans from SQL Trace files as well. I am contemplating implementing a very basic graphical plan. I would rather not do this as SQL Sentry has a well reviewed Plan Explorer, but it would be useful.

Standalone Articles

 Climate, not my field, but I had to look into it.

 About 64-bit

 Gigabit Ethernet

 SIMD Extensions for the Database Storage Engine

SQL Performance Tools (Obsolete)

Older versions have been removed. Send me an email if older versions are desired for some reason.

Trace Analysis

  Trace Analysis Build 2015-10-06

SQL Clone
As you can see, it has been many years since I worked on SQL Clone. SQL Server 2014 SP2 now has this functionality, see
DBCC CLONEDATABASE

  Build 2010-03-04
This is an interim update for scripting Roles. A more thorough update is due, when I get a chance.

  Build 2009-01-18

SQL Scripts

Below are some helpful scripts I have built up over the years. The first is an extension of sp_spaceused to provide table level summary.
spaceused with table level details (2013-09)
The next is spaceused at the index level
Index Summary with key and include columns, size, and dmv index usage stats.

An alternate to the index level space usage query is a system stored procedure, extending sp_helpindex.
sp_helpindex3 is my version of extended index information. It would be desired if the table and index sp_spaceused extensions could be made into system views, but apparently SQL Server does not allow modifications to system resource database anymore?

Older versions Table Spaceused Summary (2008?) reports size, indexes, and other info.
Spaceused Summary (2012-04) and Index Summary(old) with key and include columns, size, and dmv index usage stats.

Trace is my standard Profiler script. Be careful in collecting execution plan information, as this could have high overhead.

Events and Presentations

2017

Data Architecture Summit

Is Performance Still Important

slidedeck

2015


2015 Jun 20 SQL Saturday Bratislava Comprehensive Indexing 2015
2015 May 30 SQL Saturday New York Comprehensive Indexing 2015
OK, same slidedeck for both.

2015 Mar 04 Copenhagen, SQL Rally Nordic,  
Automating Execution Plan Analysis 2015 (Updated 2015-03-06)

2015 Jan 22, Oslo SQL Server User Group Norway at Teknologihuset
Statistics That Need Special Attention

2014

Joe presented at SQL Saturday Oslo, 30 Aug 2014
Modern Performance 2014 Oslo

Redmond, 31 May 2014   Automating Execution Plan Analysis for Comprehensive (Whole DB?) Performance.

NOVA, 19 May 2014 Modern Performance 2014

SQL Saturday Atlanta, 3 May 2014

SQL Saturday Budapest, 1 Mar 2014
I was at SQL Saturday Boston on 29 Mar, but not presenting.

2013

2013 Oct 6 at Silicon Valley Code Camp

Modern Performance 2013 (updated 2013-10-06, original 10-03)
Note: Modern Performance-SVCC, the SQL Performance 2013-Riyadh and the 2012 Automating Performance-Copenhagen slidedecks are all on the same topic.

2013 May at SQL Saturday Riyadh, Saudi Arabia

Storage 2013

SQL Performance 2013 (Special Topics)

2012 Oct 2 & 3 at SQL Rally Nordic, Copenhagen

Automating Performance Data Collection, Analysis etc.

Presented at SQL PASS Deutschland

I presented at Munich on 10 Jan 2013. Will put up slides soon?

SQL PASS Deutschland

SQL Server Query Optimizer Cost Formulas 2010-Dec?

2011 SQL Saturday Trinidad Presentations

Performance for Small/Medium Business

SQLBits 2010

I presented a session at SQLBits on 1 Oct 2010 in York, England. (if this where New York was named after?) Anyways, I tried to squeeze 4-5 hours of material into 1 hour. So I will try to write up everything and collect it here.

  1) NUMA System Architecture
  2) Storage Performance for Data Warehouse
  3) Parallel Execution Plans Part I, Part II, Part III
  4) TPC-H Benchmarks, SF100, SF300, SF1000, SF3000
  5) TPC-H Studies at SF10

Slidedecks: SQLBits Presentations

SQL Server 2008 Presentations

SQL Server Query Optimizer Cost Formulas

Storage Configuration, NOVA 2009

SQL Server 2005 Presentations

SQL Server 2005 Performance Enhancements for Large Queries

SQL Server 2000 Presentations from past conferences

Execution Plan Cost Formulas

Quantitative Performance Analysis, CMG2004

Loop Merge and Hash Joins

Insert Update Delete

Large Data Operations

Parallel Execution Plans

Storage Performance, Connections Spring 2005 ?

 

Joe on SQLblogs

Joe on SQL-Server-Performance

Solid Quality Journal

 

Citations

Efficiently Clone Databases by Kevin Kline

Optimizing Power for SQL Server ... by Jimmy May

Logical I/Os – Not a useful performance metric? by Joe Sack

An Introduction to Cost Estimation Benjamin Nevarez

I posted this HealthCare.Gov Fiasco back in 2013.

I noticed this link at GSU HealthCare.Gov Fiasco for PMBA 8125 course, 2014, taught by Duane Truex?.

 

 

In case anyone is interested, the old frameset page is here wframeset.html. I never figured out how to code it.