In this blog we will discuss about how we can analyze the PSSDIAG logs without using SQLnexus tool. In this post I will talk in reference to my blog on troubleshooting high CPU issues. Please make sure you open it while reading this blog because I have followed the same steps to analyze the PSSDIAG.
After installing and configuring the PSSDIAG as mentioned in my blog post, here are the next steps we have to follow.
After configuring the PSSDIAG , you will get a cab named pssd.cab in the chosen location:
Please make sure the cab is saved in some folder because we will have to extract files and it will become easier to manage .
Here is how it will look when the files are extracted:
Before running the PSSDIAG data collection , we have to make sure that the issue is happening or going to happen and we will have to capture the logs for at least 5-10 minutes “whilst” the issue is happening. It will help us to get enough data for analysis.
To start the PSSDIAG we have to double click the PSSDIAG.cmd and it will look like :
Once you see the below text in the command prompt , that means now the PSSDIAG collection process has started.
Below is the scenario of high CPU for which I had collected the PSSDIAG logs.
Now lets look at the collected logs in the folder named “output” which will be created in the same location from where pssdiag.cmd was run.
The output folder will look like:
There will be lots of files in the output and I will discuss about the major files which will help us to find out the cause of high CPU. I will follow my blog on troubleshooting high CPU issues step by step to troubleshoot.
Step 1: – When you see high CPU utilization on the server check in task manager, which process is consuming high CPU. If it’s SQL server then open Perfmon and add the counter for Processor: % Privileged time, % Processor time and %user time
If you see the %privileged time to be high like hovering around 40% and above then it could be some driver issue. To troubleshoot these kinds of issues you could try:
Using Xperf: – http://blogs.msdn.com/b/ntdebugging/archive/2010/03/22/using-xperf-to-root-cause-cpu-consumption.aspx — Video; http://msdn.microsoft.com/en-us/performance/cc709422
If it’s user time which is high then find the process which is consuming high CPU.
In this case which apparently is , now we have to check the process consuming high CPU:
In this case SQL is consuming around 60% of the CPU and rest is consumed by the remaining processes. As SQL is major contributor to the high CPU at the system and is consistently consuming the same amount.
Another way to find out how much CPU SQL server is consuming – Open the file(from the output folder) , name ending with SQL_2005_Perf_Stats_Startup.OUT.
Look for: Recent SQL Processor Utilization (Health Records) – you will get the output as follows:
runtime system_idle_cpu sql_cpu_utilization
—————————— ————— ——————-
2012-03-23T16:26:33.137 90 1
2012-03-23T16:26:33.137 64 20
2012-03-23T16:26:33.137 58 25
2012-03-23T16:26:33.137 56 25
2012-03-23T16:26:33.137 55 25
2012-03-23T16:26:33.137 57 26
2012-03-23T16:26:33.137 57 26
2012-03-23T16:26:33.137 44 35
2012-03-23T16:26:33.137 14 53
2012-03-23T16:26:33.137 4 60
2012-03-23T16:26:33.137 3 59
2012-03-23T16:26:33.137 3 59
2012-03-23T16:26:33.137 2 58
2012-03-23T16:26:33.137 3 58
2012-03-23T16:26:33.137 8 54
2012-03-23T16:26:33.137 10 53
2012-03-23T16:26:33.137 2 56
Step 2. Lets move to the next step of finding the architecture which in our case is x64.
Snippet from Msinfo2 file:
Item Value
OS Name Microsoft Windows Server 2008 R2 Enterprise
Version 6.1.7600 Build 7600
System Name ANMOL-PC
System Type x64-based PC
Processor Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz, 2933 Mhz, 4 Core(s), 8 Logical Processor(s)
Step 3: – Check if there are any trace flags enabled:
To get this we can check in the SQL error log at the startup of SQL server. the file name is SQL_Base_Errorlog_Shutdown.TXT. In this case we don’t have any trace flags(snippet from error logs):
2012-03-23 11:50:55.060 Server Microsoft SQL Server 2008 (RTM) – 10.0.1600.22 (X64)
Jul 9 2008 14:17:44
Copyright (c) 1988-2008 Microsoft Corporation
Enterprise Edition (64-bit) on Windows NT 6.1 <X64> (Build 7600: )
2012-03-23 11:50:55.060 Server (c) 2005 Microsoft Corporation.
Step 4: – Check the below mentioned SQL parameters
To check the SP_configure output we can check the file : – name ending with sp_sqldiag_Shutdown.OUT
In this case:
max degree of parallelism 0 64 0 0
max server memory (MB) 16 2147483647 2048 2048
min server memory (MB) 0 2147483647 0 0
priority boost 0 1 0 0
In this case we are safe because we have:
1. Set max server memory
2. Not set Priority boost
3. As we have 8 processors, we don’t need to set any value for this unless required.
Step 5: – Run DBCC memorystatus and check for the USERSTORE_TOKENPERM memory clerk.
To check the DBCC memorystatus, please check the file: filename ending with DBCC_MEMORYSTATUS_Startup.OUT
In our case, it is normal:
USERSTORE_TOKENPERM (node 0) KB
—————————————- ———–
VM Reserved 0
VM Committed 0
Locked Pages Allocated 0
SM Reserved 0
SM Committed 0
SinglePage Allocator 584 <=very less so no problem here.
MultiPage Allocator 288
CACHESTORE_SQLCP (node 0) KB
—————————————- ———–
VM Reserved 0
VM Committed 0
Locked Pages Allocated 0
SM Reserved 0
SM Committed 0
SinglePage Allocator 10128 <= Not very high
MultiPage Allocator 2152
MEMORYCLERK_SQLQERESERVATIONS (node 0) KB
—————————————- ———–
VM Reserved 0
VM Committed 0
Locked Pages Allocated 0
SM Reserved 0
SM Committed 0
SinglePage Allocator 374256 <= High based on the allocated memory
MultiPage Allocator 0
Step 6: – If the memory consumed by CACHESTORE_SQLCP is high then we check compilations and re-compilations which in our case is not required.
But I will still mention about the file in the PSSDIAG output which you will need to check for procedure cache pollution: File name will end with :- SQL_2005_Mem_Stats_Startup.OUT and you will have to search for the string: – — proccache_pollution
The output will look like :
runtime plan_count total_size_in_bytes cacheobjtype objtype usecounts
———– ———– ——————– —————— ———————-
2011-08-22 48768 2898108416 Compiled Plan Adhoc 4
2011-08-22 5762 281034752 Compiled Plan Adhoc 1
2011-08-22 5755 1100980224 Compiled Plan Adhoc 1
2011-08-22 4125 103612416 Compiled Plan Adhoc 1
2011-08-22 1380 396263424 Compiled Plan Adhoc 1
2011-08-22 1379 200916992 Compiled Plan Adhoc 1
2011-08-22 1379 291463168 Compiled Plan Adhoc 1
2011-08-22 1378 149831680 Compiled Plan Adhoc 1
2011-08-22 1166 107454464 Compiled Plan Adhoc 4
2011-08-22 1056 158662656 Compiled Plan Adhoc 1
2011-08-22 1056 72105984 Compiled Plan Adhoc 1
2011-08-22 1054 48947200 Compiled Plan Adhoc 1
2011-08-22 157 11329536 Compiled Plan Prepared 1
2011-08-22 109 4751360 Compiled Plan Adhoc 1
But this is just dummy data to show you how it looks like. There will be more columns e.g. short_qry_text which will show you the query for which the parameterization is required.
In the above output, we have to look at the plan_count , size and the use counts which will simply mean that the query for which a new plan was created per execution. For more explanation, please check step 6 in troubleshooting high CPU issues.
Step 7:- This relates to our issue i.e. lots of hashing and sorting is happening :
If we see, the output exactly matches with :
MEMORYCLERK_SQLQERESERVATIONS (node 0) KB
—————————————- ———–
VM Reserved 0
VM Committed 0
Locked Pages Allocated 0
SM Reserved 0
SM Committed 0
SinglePage Allocator 374256
MultiPage Allocator 0
Now, the time to check worktables/sec and workfiles/sec
Worktables/sec are high(~15) based on the number of batches/sec (~ 2 ) – which simply lots of sorting spilling to tempdb :
Workfiles/sec are are very low (It shows that there was no hashing spilled to the tempdb) :
In essence:
1. No compilation and re-compilation which means no parameterization issue.
2. No configuration issues at SQL server end.
3. we need to find out a query which might have cause the issue.
Please note : – The values are low because I have run small batched to reproduce high CPU.
Now we will move in the direction of hunting the query which is causing the trouble.We will make use of SQL server 2008’s feature query plan hash to drill down the issue. In the file – name ending with Perf_Stats_Snapshot_Shutdown.OUT. Look for top 10 CPU by query_hash
Here is the output you will get:
runtime total_worker_time sample_statement_text
———- ——————– ———————————-
2012-03-23 1683404272 insert into test (select * from test123 order by col1)
2012-03-23 4432252 SELECT * FROM sys.sysperfinfo
2012-03-23 2461141 SELECT CONVERT (varchar(30), @ru
2012-03-23 1595088 select db_id() as dbid, case
2012-03-23 1495085 SELECT CONVERT (varchar(30), @runt
2012-03-23 1062060 SELECT CONVERT (varchar(30),
2012-03-23 1026058 SELECT CONVERT (varchar(30), @runt
2012-03-23 905051 SELECT CONVERT (varchar(30), @runt
2012-03-23 686038 SELECT sess.session_id, req.r
2012-03-23 684039 select TOP 10000 * from sys.dm_exe
In the above output, if we see the query which is causing the high CPU is: insert into test (select * from test123 order by col1)
We can tune this query to avoid excessive sorting and which will further help to avoid high CPU issue.
I will keep adding more content in this blog as I find more interesting scenario of high CPU.