I was working on a scenario in which a stored procedure was running faster on the test environment but was very slow on the production server. First thing which normally do is check the sql server is the output of select * from sys.dm_exec_tasks to find out what that SPID is doing.
Here was the output of that:
SPID Status waittype Waittime command
85 suspended WRITELOG 15 INSERT
Which is normally what we ignore or suspect disks. But we need enough data to prove that there is a disk problem. one thing which we do is – we check Perfmon counters to find out the performance of the disks . it’s good to start with.
Lets take an example of what’s happening in my scenario. I analyzed the profiler traces and found:
My database log files were place in the I drive.
PhysicalDisk(8 I:)Avg. Disk Queue Length 0.418 0.940 0.002
PhysicalDisk(8 I:)Avg. Disk sec/Transfer 0.003 0.024 0.002
Avg sec/transfer is 0.024 ms which is quite ok and disk queue length is alright too. then what’s the issue ?
On Good server :
PhysicalDisk(0 C:)Avg. Disk Queue Length 0.002 0.025 0.000
PhysicalDisk(0 C:)Avg. Disk sec/Transfer 0.000 0.002 0.000
It shows there is some difference but still not much convincing.
Here are some more counters which will prove exactly and will assure us that there is something better which can be done on disks.
On Bad server:
SQLServer:Databases(test)Log Bytes Flushed/sec 73391.539 138944.066 0.000
SQLServer:Databases(test)Log File(s) Size (KB) 511992.000 511992.000 511992.000
SQLServer:Databases(test)Log Flush Waits/sec 141.611 269.243 0.000
SQLServer:Databases(test)Log Flushes/sec 141.610 269.243 0.000
On Good server :
SQLServer:Databases(Test)Log Bytes Flushed/sec 491250.951 717276.594 0.000
SQLServer:Databases(Test)Log File(s) Size (KB) 511992.000 511992.000 511992.000
SQLServer:Databases(Test)Log Flush Wait Time 9.814 188.000 0.000
SQLServer:Databases(Test)Log Flushes/sec 947.677 1384.064 0.000
If we see log file size is same which is 511992 but when we compare the log flush waits , Log flushes and log bytes flushed there is a huge difference. That’s what gives the clear indication that there is some I/O issue.
Still there is a catch, sometimes too much of activity on the disk also can cause this issue. But best will be to try first to place the files on the disks where there is very less load. There is one very good blog on how to tackle writelog issue : – http://blogs.msdn.com/b/sqlsakthi/archive/2011/04/17/what-is-writelog-waittype-and-how-to-troubleshoot-and-fix-this-wait-in-sql-server.aspx