Query Writing Guidelines for SQL Server DBAs and Developers

Working on SQL Server Performance and Tuning for last 8 years has given a great experience to understand the common practices at customers’ end. Recently, I was looking at the list of companies in India I have worked with ,  during my tenure at Microsoft is around 48. I wanted to share the common trends and my observation with you through this post. It may be a great help to sort out really common unknown issues on the job as a SQL DBAs and developers.

I have observed a disconnect between DBA and Developer team most of the times. The common dialogues which I have heard from both the DBAs and Developers are:

DBA:
1. Developers don’t know how to write a query
2. Developers create indexes without knowing their impact
3. Developers never see execution plan of the queries before releasing those on the production server

Developers:
1. DBAs don’t do nothing 🙂
2. No visibility to the queries performing bad
3. Even if we have the list of queries, we don’t know how to tune them
4. What to look in the execution plans
5. No time for query tuning because of the strict deadlines

If we see all this from the top, there’s noone at fault. For DBAs, they are agitated because entire blame is put on them and for developers, they have to listen to the music from the DBA team. It’s all about educating the developers, on how to write the optimal query, the repercussions of the bad code and how it can bring the entire system down. They need to try their best to avoid writing bad code and there has to be efficient review process in place.

For DBA’s , they need to help the developers for the SQL query writing/tuning education and the challenges they face if the queries are bad. While all that is done, the DBA also need to tune the bad code so that, the application can perform optimally. Moreover , if possible have a monthly session on what queries were tuned and the benefit it yielded for the application. The bad queries should be shared with the developers team on regular intervals so that, those can be tuned/written by the developers. All the process has to function in tandem and wherever, I have observed a disconnect between the teams – the duration of SQL Server downtime is higher.

One big improvement point which I have observed for both the teams can be, the deep conceptual knowledge of how indexing works. That’s the reason for having multiple sessions and blog posts on indexing itself.

https://dbcouncil.net/2015/06/03/which-one-is-better-composite-index-or-index-with-included-columns/
https://dbcouncil.net/2015/06/04/which-index-is-better-composite-index-or-index-with-included-columns-part-2/
https://dbcouncil.net/2015/06/09/index-session-scripts-for-sql-community-delhi-ncr/
https://dbcouncil.net/2014/06/27/sql-server-is-my-table-indexing-optimal-2/

All in all, if developers start writing optimal queries – there was lesser chances of performance degradation due to application code and lesser the DBAs need to work for query tuning. There should be monthly sessions between DBAs and Developers on the application queries tuning and writing education. it will not only help the teams to work in tandem but will also help to resolve the teams the common challenges they face during the query writing/tuning process.

We had organized a session on , Optimal Query Writing which was delivered by my colleague Sourabh Agarwal. Please go though this post and let any of us know if you have any questions. It covers some best practices for query writing which can be really helpful for developers.

Check the summary of entire sessions on Query writing and tuning here:
https://dbcouncil.net/2015/12/24/delhi-ncr-community-december-event-session-1-leverage-query-store-for-query-troubleshooting/
https://dbcouncil.net/2015/12/28/query-tuning-approach-for-sql-server-and-sql-azure-db-part2/

HTH!

Disclaimer:
The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights

Advertisements

Query Tuning Approach – For SQL Server and SQL Azure DB – Part1

It’s been really long since I have written a post. I have been working on lots of query tuning my in recent past and I thought of sharing my approach.I have been using this approach for quite some time now and have got really good success rate to tune the query.

Just understand one rule, 80% of the times you will encounter queries which can be tuned by intermediate skills. It’ll be just 20% or even lesser when you will need some extreme level of query tuning skills. Honestly speaking, I rarely get into such situations where I see the query is complex to be tuned – really rare. I have been using really basic approach to combat bad queries and it’s really serving well for my customers.

I assume , you know how to identify which queries should be picked up for tuning e.g. if you are facing high CPU issues , you need to pick CPU intensive queries or if you are facing memory pressure or slow I/O , you need you pick high logical/physical read queries. Now, you know the resource contention on your system – let’s see what to do next:

1. Query to get top resource intensive queries: I have picked these queries from the PSSDIAG tool mentioned here

print ‘– top 10 CPU by query_hash –‘

select getdate() as runtime, *
from
(
SELECT TOP 10 query_hash, COUNT (distinct query_plan_hash) as ‘distinct query_plan_hash count’,
sum(execution_count) as ‘execution_count’,
sum(total_worker_time) as ‘total_worker_time’,
SUM(total_elapsed_time) as ‘total_elapsed_time’,
SUM (total_logical_reads) as ‘total_logical_reads’,
max(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,
CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text]))
ELSE qs.statement_end_offset/2 – qs.statement_start_offset/2 + 1
END), CHAR(13), ‘ ‘), CHAR(10), ‘ ‘))  AS sample_statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
group by query_hash
ORDER BY sum(total_worker_time) DESC
) t

print ‘– top 10 logical reads by query_hash –‘

select getdate() as runtime, *
from
(
SELECT TOP 10 query_hash,
COUNT (distinct query_plan_hash) as ‘distinct query_plan_hash count’,
sum(execution_count) as ‘execution_count’,
sum(total_worker_time) as ‘total_worker_time’,
SUM(total_elapsed_time) as ‘total_elapsed_time’,
SUM (total_logical_reads) as ‘total_logical_reads’,
max(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,
CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text]))
ELSE qs.statement_end_offset/2 – qs.statement_start_offset/2 + 1
END), CHAR(13), ‘ ‘), CHAR(10), ‘ ‘))  AS sample_statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
group by query_hash
ORDER BY sum(total_logical_reads) DESC
) t

print ‘– top 10 elapsed time by query_hash –‘

select getdate() as runtime, *
from
(
SELECT TOP 10 query_hash,
sum(execution_count) as ‘execution_count’,
COUNT (distinct query_plan_hash) as ‘distinct query_plan_hash count’,
sum(total_worker_time) as ‘total_worker_time’,
SUM(total_elapsed_time) as ‘total_elapsed_time’,
SUM (total_logical_reads) as ‘total_logical_reads’,
max(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,
CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text]))
ELSE qs.statement_end_offset/2 – qs.statement_start_offset/2 + 1
END), CHAR(13), ‘ ‘), CHAR(10), ‘ ‘))  AS sample_statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
group by query_hash
ORDER BY sum(total_elapsed_time) DESC
) t

print ‘– top 10 CPU by query_plan_hash and query_hash –‘

SELECT TOP 10 query_plan_hash, query_hash,
COUNT (distinct query_plan_hash) as ‘distinct query_plan_hash count’,
sum(execution_count) as ‘execution_count’,
sum(total_worker_time) as ‘total_worker_time’,
SUM(total_elapsed_time) as ‘total_elapsed_time’,
SUM (total_logical_reads) as ‘total_logical_reads’,
max(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,
CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text]))
ELSE qs.statement_end_offset/2 – qs.statement_start_offset/2 + 1
END), CHAR(13), ‘ ‘), CHAR(10), ‘ ‘))  AS sample_statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
group by query_plan_hash, query_hash
ORDER BY sum(total_worker_time) DESC;

print ‘– top 10 logical reads by query_plan_hash and query_hash –‘

SELECT TOP 10 query_plan_hash, query_hash, sum(execution_count) as ‘execution_count’,
sum(total_worker_time) as ‘total_worker_time’,
SUM(total_elapsed_time) as ‘total_elapsed_time’,
SUM (total_logical_reads) as ‘total_logical_reads’,
max(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,
CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text]))
ELSE qs.statement_end_offset/2 – qs.statement_start_offset/2 + 1
END), CHAR(13), ‘ ‘), CHAR(10), ‘ ‘))  AS sample_statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
group by query_plan_hash, query_hash
ORDER BY sum(total_logical_reads) DESC;

print ‘– top 10 elapsed time  by query_plan_hash and query_hash –‘

SELECT TOP 10 query_plan_hash, query_hash, sum(execution_count) as ‘execution_count’,
sum(total_worker_time) as ‘total_worker_time’,
SUM(total_elapsed_time) as ‘total_elapsed_time’,
SUM (total_logical_reads) as ‘total_logical_reads’,
max(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,
CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text]))
ELSE qs.statement_end_offset/2 – qs.statement_start_offset/2 + 1
END), CHAR(13), ‘ ‘), CHAR(10), ‘ ‘))  AS sample_statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
group by query_plan_hash, query_hash
ORDER BY sum(total_elapsed_time) DESC;

2. Check the execution plan for the query – Just remember , wherever you get the top resource intensive queries – be it MDW / Profiler / DMVs – you will get an option to get SQL Handle / Plan Handle/ SQL Hash / Plan Hash. If you have any of these , you can get an execution plan from the cache.

Query to get the execution plan is:

This will give you the query execution and plan

select b.*,a.* from sys.dm_exec_query_stats a cross apply
sys.dm_exec_query_plan  (a.plan_handle) b
where a.query_hash= <Query_hash>

if you want to know , execution statistics specific to the plan e.g. how many times the plan was reused – in the above output , you will get:

image

There can be multiple plans as well but you need to pick up the plans with More reads/CPU/time based on execution count.

3. Get the query parameters for the runtime planJust remember , the plans you get from these DMVs are the compile time plans and to get the actual plans – you will need to run the query on your SQL Instance. I have observed, DBAs ask the developer team to share the parameters of the query so that, they can run the query  on the test environment. In fact, if you have the query plan – you can get the parameter from there itself. let’s how to get that:

Right click on the graphical execution plan and click on show execution plan as XML – once the xml plan is visible , search for ParameterList as show below:

image

4. Execute the query :  Now, it’s the time to execute the query and analyze the actual statistics. leverage the below commands to delve deeper into the query execution:

Set statistics IO ON –>   Get the IO statistics of the query

Table ‘residual_demo’. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Set statistics Time ON –> Get the time statistics of the query i.e. how long it took for the query for compilation and running

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms,  elapsed time = 0 ms.

Set Statistics Profile ON –> Get the execution plan of the query in the text format
image

or
get the graphical execution by clicking on the actual execution plan from the Management Studio

Depending on the issue you are troubleshooting, the operator to be analyzed will vary. e.g. If you are tuning query for CPU intensive workload then probably the first thing to look at will be sort and for high logical reads , it will be an index/table scans

In this post, we have got the execution plan and now we have to start the tuning process. In the next post, I will talk about the operators you can pick for the tuning.

HTH!