Joe Chang, SQL Server Consultant
jchang@qdpma.com, or jchang6@yahoo.com
also known as the Cost Based Optimizer (CBO). The query optimizer formulas (along with data distribution statistics) allows understanding exactly why the optimizer chooses a particular execution plan, and why an index is used or not used.
for several processor and system architectures. What does a specific operation in the SQL Server execution plan really cost per row and per page? To what degree does the plan cost represent true cost? The plan cost model is really meant to determine the best table access strategy. The cost of logic does not factor into plan cost if it does not affect table access, but can significantly increase the cost of a query.
How will you know if something seems right or is seriously wrong?
Can the SQL Server query optimizer pick a poor plan? Why? Is it safe to override the default plan? Are you afraid to do so?
A number of anomalies and errors identified in SQL Server 7 and 2000 were sent back to MS for correction in 2005 (involving key operations like small table scan cost, Bookmark Lookups, Nested Loops Join). Even the SQL Server 2005 and 2008 query optimizers make certain assumptions that are frequently not valid.
Working with a copy of the production database was not feasible because sheer size (in the past, today with 500GB 2.5in hard drives, plus 17in notebooks with two internal drives, ...) not to mention data security issues. Simply having the DDL is not sufficient for query tuning, because the SQL Server query optimizer is a cost based optimzer, meaning it uses data distribution statistics to estimate row count.
By creating an empty database using the DDL to create identicial objects, but with no data, then cloning the data distribution statistics, it becomes possible to generate the same execution plans as on the production server (in most cases). This required direct manipulation of the system tables in SQL Server 2000. For SQL Server 2005 and later, Microsoft provided an API to clone data distribution statistics (because this technique was deemed important).
Most people approach performance tuning with a laundry list of techniques i.e. recipes, and try each one until something good happens. With deep experience and skill, they will gradually develop a feel for what might work, trying the most likely first.
This is called performance tuning as an art. It can be called religion if you believe on faith in the absense of supporting evidence, and sometimes with strong evidence to the contrary. I know its what we do for a living, but SQL Server is not a religion.
Science is about knowing for certain why. It is about having a model that can explain observed behavior and even better: make accurate predictions before observations are made. A scientific approach to performance tuning reduces the need to make guesses, avoiding dead-end wild goose chases (and possibly avoidable consulting billable hours). Many places implement policy based on FUD. What is the value of knowledge over FUD?
When a production server is having serious problems, when a Wall Street analyst ask your CEO at a public breifing in front of other analysts: why your field offices say your computer system goes down during end-of-month payment processing, you will not have a good day (actual event). If your 2-year development project is within weeks of cancellation for failure to meet performance requirements (also actual), do you want art or science?
You want your problem fixed fast. Would you like to tell your boss:
Can this be done with art?
People have upgraded to Big-Iron systems hoping it will solve their performance problems. There are reasonable performance benchmarks showing NUMA systems can scale SQL Server performance.
What is not widely known is that certain operations do not scale on SMP or NUMA, and that certain parallel execution plans have severe negative scaling on NUMA systems. Many of these events can be indentified and avoided, if one knew precisely what look for and what to do.
Fortunately, if your CIO or CEO approved the purchase of really expensive hardware, they will most probably also authorize funds for a really expensive consultant (Not to be construed as a solicitation for services) to the problems with the really expensive hardware.
A small problem is your problem.
A big problem is their problem too.
I think this is a matter of who get fired or punished for the mistake. For a small mistake, the bosses can fire a low level person to show that he/she is on top of the situation. When it is a really big mistake, firing a low or middle person is not sufficient. It is the big boss whose job may be on the line. Many things
Note: proposing to bring in an expensive consultant demonstrates
your seriousness in resolving the problem.
The bosses will be happy, you will be happy,
the consultant is happy, everybody is happy.
Don't rock the happy boat.
This is also called successfully deploying a multi-million project. It looks really impressive on your resume.
What should a storage system be able to do?
If perfectly configured, it will deliver:
SQL Server Disk IO patterns and settings that determine serialized or asynchronous IO. The SQL Server engine may generate serialized singleton IO patterns or batch. This means a query requiring relatively few disk IOs may actually take longer than the query that generates more disk IO. When does this happen? The most severe problem occurs with updates to a large number of rows and an index exists.
Articles on SQL-Server-Performance
Server System Architecture, 2007 (Preliminary)
Processor Performance 2006 for SQL Server
System and Storage Configuration for SQL Server
High Call Volume SQL Server Applications on NUMA Systems
Database Backup Performance with Quest LiteSpeed for SQL Server; 3 Terabytes in under 1 Hour
SQL Server XML Statistics and Execution Plans
SQL Server Backup Performance with Imceda LiteSpeed
Large Data Operations in SQL Server
Transferring SQL Server Statistics from One Database to Another
SQL Server Parallel Execution Plans
Processor Performance, Update 2004
A First Look at Execution Plan Costs in Yukon Beta 1
SQL Server Quantitative Performance Analysis
Gigabit Ethernet Direct Connect Networking
Server System Architecture, 2002