Complete explanation of SQL server memory dumps and it’s types.

Have you ever thought :
whats memory dump and why is that generated when some exception or crash occurs in SQL?
whats difference between mini dump/filtered dump/full dump?
whats difference between Exception dump/Hang Dump/Crash dump?

If Yes/No :), that’s what we are going in this blog. This blog will help you clear the fundamentals of why memory dumps are important to understand the issue closely and resolve it quickly.

The agenda for this blog will be as follows:
1. What’s Mini dump,Filtered Dump, Full Dump
2. What’s Exception Dump, Hang Dump, Crash Dump
3. Preference and utilities of dumps based on the scenrios

Before we start with the agenda, lets discuss little bit about the memory dumps. We all know how profiler trace works. aren’t we? To check the reason where the application execution is failing, we check the profiler trace. Because , it gives the information about which statement it failed on and the error due to which it failed etc.. We have to think in the same line with the programming.

Lets say our SQL crashes and there is no facility of dump then think what will happen. We will never know the cause of the failure so, we will not know how to resolve the issue. With the faclility of dumps, atleast we can come to know while executing which function SQL failed on and what were the variables passed etc. Atleast we will have some clue where it failed and why it failed. it helps to fix the issue quickly. When we open the memory dump in windbg tool, here is how it looks like:

Child-SP          RetAddr           Call Site
00000000`0c187368 000007fe`fe0d10dc ntdll!ZwWaitForSingleObject+0xa
00000000`0c187370 00000000`02e5fb89 KERNELBASE!WaitForSingleObjectEx+0x79
00000000`0c187410 00000000`02e5f68c sqlservr!CDmpDump::DumpInternal+0x4d9
00000000`0c187510 00000000`02e5f5c1 sqlservr!CDmpDump::DumpFilter+0xbc
00000000`0c187690 00000000`751fb5c6 sqlservr!`CDmpDump::Dump’::`1′::filt$0+0x21
00000000`0c1876c0 00000000`77a39d0d msvcr80!__C_specific_handler+0x96
00000000`0c187730 00000000`77a291af ntdll!RtlpExecuteHandlerForException+0xd
00000000`0c187760 00000000`77a297a8 ntdll!RtlDispatchException+0x45a
00000000`0c187e40 000007fe`fe0dcacd ntdll!RtlRaiseException+0x22f
00000000`0c188480 00000000`02e5f58c KERNELBASE!RaiseException+0x39
00000000`0c188550 00000000`0239d1b8 sqlservr!CDmpDump::Dump+0x7c
00000000`0c1885a0 00000000`0239dd0c sqlservr!SQLDumperLibraryInvoke+0x1a0
00000000`0c1885d0 00000000`023903f7 sqlservr!CImageHelper::DoMiniDump+0x3d4
00000000`0c1887d0 00000000`02002f84 sqlservr!stackTrace+0x82b
00000000`0c189d20 00000000`01b27da9 sqlservr!utassert_fail+0x844
00000000`0c18a980 00000000`7528bf40 sqlservr!`CQueryExecContext::~CQueryExecContext’::`1′::catch$3+0xa5
00000000`0c18a9d0 00000000`7522964e msvcr80!_CallSettingFrame+0x20
00000000`0c18a9f0 00000000`77a60c21 msvcr80!__CxxCallCatchBlock+0xfe
00000000`0c18aaa0 00000000`00369e0a ntdll!RcFrameConsolidation+0x3
00000000`0c18ec10 00000000`0036a2b4 sqlservr!CQueryExecContext::~CQueryExecContext+0x21d
00000000`0c18ede0 00000000`00324f1c sqlservr!CQueryScan::ShutdownQueryExecContext+0x5d
00000000`0c18ee10 00000000`0036a5cf sqlservr!CQueryScan::ShutdownOnException+0x78
00000000`0c18ee40 00000000`006dcd39 sqlservr!CExecuteStatement::Destroy+0x5f
00000000`0c18ee70 00000000`01742d87 sqlservr!CMetaStmtSet::DeleteXstmtsInPartition+0x59
00000000`0c18eeb0 00000000`01744f00 sqlservr!CMetaStmtSet::StepOnStatements+0xe3
00000000`0c18ef20 00000000`01733a68 sqlservr!CCompPlan::StepOnStatements+0x30
00000000`0c18ef70 00000000`027c6b32 sqlservr!CCacheStoreCPCallback::SteppedOnEntry+0xb4
00000000`0c18efc0 00000000`027c65c1 sqlservr!CacheClockHand::MoveInternal+0x4cc
00000000`0c18f270 00000000`027c8e65 sqlservr!CacheClockHand::Move+0x109
00000000`0c18f300 00000000`027c5b13 sqlservr!TClockAlgorithm<CacheClockHand>::ProcessTick+0xed
00000000`0c18f340 00000000`027d20c8 sqlservr!SOS_CacheStore::Notify+0xa3
00000000`0c18f370 00000000`027d1b22 sqlservr!ResourceMonitor::NotifyMemoryConsumers+0x2f4
00000000`0c18f4d0 00000000`023841ef sqlservr!ResourceMonitor::ResourceMonitorTask+0x21a
00000000`0c18f610 00000000`002cb450 sqlservr!SetupResourceMonitorTaskContext+0x133
00000000`0c18f950 00000000`002cb116 sqlservr!SOS_Task::Param::Execute+0x12a
00000000`0c18fa60 00000000`002caf5b sqlservr!SOS_Scheduler::RunTask+0x96
00000000`0c18fac0 00000000`004044fa sqlservr!SOS_Scheduler::ProcessTasks+0x128
00000000`0c18fb30 00000000`004047dd sqlservr!SchedulerManager::WorkerEntryPoint+0x2d2
00000000`0c18fc10 00000000`0084c0cd sqlservr!SystemThread::RunWorker+0xcc
00000000`0c18fc50 00000000`004053d2 sqlservr!SystemThreadDispatcher::ProcessWorker+0x2db
00000000`0c18fd00 00000000`751f37d7 sqlservr!SchedulerManager::ThreadEntryPoint+0x173
00000000`0c18fda0 00000000`751f3894 msvcr80!_callthreadstartex+0x17
00000000`0c18fdd0 00000000`776f652d msvcr80!_threadstartex+0x84
00000000`0c18fe00 00000000`77a3c521 kernel32!BaseThreadInitThunk+0xd
00000000`0c18fe30 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

If you see above, we are getting some idea of which function was executed before SQL failed or generated exception.

Now, Lets get started with the agenda of the blog.

1. What’s Mini dump,Filtered Dump, Full Dump

Mini Dump:- This is a kind of dump in which we get very basic information about SQL server thread stack and very limited variables. It doesn’t include any indepth information or any data/index pages. The size of these dumps is very small and  don’t even take much of resources while creation. These are the dumps enabled by default in SQL server.

Filtered Dump : – This a kind of dump in which we get more in depth information. It contains all the details of structures,functions including the details of all the stolen buffers.  The size of these dumps is moderate and in 64 bit systems the size of these dumps depend on max server memory(It can reach in multiple GBs).  To get the SQL server filtered dump we need to enable the trace flag 2551 using command dbcc traceon(2551,-1)

Full Dump : –   This is a kind of dump in which we get the dump of entire SQL server process memory. It contains both the hashed and stolen pages(pages except index or data pages). The size of the dump file is equal to the total memory consumed by SQL server. On 64 bit systems, it reaches in multiple GBs in size. To get the full dumps we need to enable the trace flag using command dbcc traceon(2544,-1)

Please note :- when we enable the filtered/full dumps,it takes some time to generate these dumps. It makes the SQL unresponsive while generating the dump and thats why we have to be careful while working with them especially on cluster. Because if the dump takes long time to get created then  it can cause failover of SQL cluster. To avoid the failover, we’ll need to tweak the timeout values of isalive and checkalive services on the cluster.

2.  What’s Exception Dump, Hang Dump, Crash Dump 

Exception Dumps : – The dumps which are generated after the exception and ofcourse based on the configuration in SQL server i.e. exception on which the dump has to be genrated. Internally, SQL calls SQLdumper utility to take the dump. e.g. 17883,17884 non-yielding scheduler exceptions. We can also configure SQL server to take a dump on a specific exception(will explain in the article further). They can also be called crash dumps sometimes.

Hang dumps : – These kinds of dumps are always taken manually. In some scenarios, e.g. SQL agent job takes ages to complete a job – we can take a hang dump. Even in high CPU scenarios also , these kinds of dumps are helpful.

Crash dumps : – These kinds of dumps are generated in the process crash scenarios. In SQL server, whenever access violation happens we get a mini dump in SQL server error log and also it crashes the SQL server(most of the times).  In some scenarios , we generate the utassert and don’t crash the SQL server but still we will get the mini dump. We can manully configure a crash dump e.g. for SQL agent or any other processes using ADplus utility.

All these types of dumps can be mini/filtered/full dumps – again based on requirement we decide on this.

3.  Preference of dumps based on the scenrios

Exception dumps : – Most of the times, these dumps are generated automatically but they are based on the configuration in SQL server code. If we want to take these dumps manually – we can do that but there are some limitations. In the scenarios e.g lets while running some query you are getting some error lets error 12345 (some dummy error) –  you could use dbcc dumptrigger(‘set’,12345).

This will generate mini dump everytime you see the error 12345. You may be thinking that if I set the dumptrigger for this error and if this error occurs hundreds of times then it will fill up the disk space. Yes, you are correct but we have a workaround for this. There is a trace flag 8026 which we enable using the same command dbcc traceon(8026,-1) . It will clear the dump trigger after the generation of first exception dump.

If we need filtered dump (only once) on the exception 12345 then we can enable trace flag 2551,8026 :

dbcc traceon (8026,2551,-1)

2551 –> To configure the filtered dump
8026 –>To clear the dumptrigger after generating the dump once.
and then enable
dbcc dumptrigger(‘set’,12345)

To check if the dumptrigger is enabled, you will need to run the command:
dbcc traceon(3604,-1) –>It will throw the output of below query onto the query analyzer(is required to check if any dump triggers are available)
dbcc dumptrigger(‘display’,12345) –> the output will be :

Dump Triggers
DebugBreak = 0

CExDumpTriggers @0x00483EE0
Exception = 12345

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

After the dump is generated, please clear the dumptrigger dbcc dumptrigger(‘clear’,12345), confirm using command  dbcc dumptrigger(‘display’,12345) and disable the traceflags dbcc traceoff (8026,2551,-1)

Otherwise , the next dumps which will be generated will be filtered dumps.

Hang Dumps : – These dumps have to be configured manually , always.  We can configure these dumps in the scenarios where we see SQL server is hung while doing certain operations e.g. on high CPU scenarios (we can also use procdump utlity in these scenarios) or lets say SQLagent is running some job and is taking forever to run that or while taking backup entire SQL is hung.

The utilities which we can use for these scenarios are :- Adplus,procdump,DBCC stackdump,WER,SQLdumper,windbg etc.

* To take a hang dump using ADplus , here is the command which you need to run:

1. Download Adplus from : – http://support.microsoft.com/kb/286350

2. Run the command : – adplus –hang –p <pid>  -ma -o <output folder>  -quiet

For detailed information on the ADPLUS tool, please check the article : – http://support.microsoft.com/kb/286350

There is another scenario of similar to hang but at a very granular level of SQL server : – lets say you are running some query and it’s taking very long time to execute and when we check the wait in activity monitor for that query ; it comes out to be none. In those scenarios, we can take a dump and understand what that SPID is doing.

* To take the hang/crash dump using Procdump : –  http://technet.microsoft.com/en-us/sysinternals/dd996900. This utility is helpful when we want to take a dump on threshold of a specific counter value. It’s mostly used in high CPU scenarios.

* Another extreme and very famous scenario is, we killed the execution of the query in the middle and the status of the SPID is running but command is killed/rollback. There are two scenarios in this case 1. either wait or restart SQL(little risky) 2. find the cause of the issue. And, If we want to follow point no.2 then we will have to take a dump and find out the status of the SPID.

There  is one SQL server command DBCC stackdump which could be used to take SPID/SPIDs dump. Lets we want to take a dump of the SPID 56, the command which we will use will be:

DBCC stackdump(-1,<SPID>) –>For a specifc SPID so, it’ll be DBCC stackdump(-1,56)
DBCC stackdump() –>For all the SPIDs

And if you want to have filtered/full dump , you will have to enable the trace flags accordingly.

* On Windows 2008(using WER), the dumps can be taken from task manager by right clicking on the process and create dump :

Here also you have an option of setting filtered/mini/full dumps but you will have to make changes registry for that. Please check the article : – http://msdn.microsoft.com/en-us/library/windows/desktop/bb787181(v=vs.85).aspx for that.

* SQLdumper can also be used to take a manual dump of SQL process.  Please check this article :- http://support.microsoft.com/kb/827690 for more details on this. The command will look like : –

C:Program FilesMicrosoft SQL Server100Sharedsqldumper nnnn 0 0x8100 0 c:dumpfiles

where nnnn is the process id of the sql server process and C:dumpfiles is a an existing folder to hold the dumps files created. The above call will create a filtered dump of the sqlservr process.

All the above methodologies except adplus, can be used to take the dump even if SQL is not hung. Even, in normal running scenarios we can run the utilities to take the dump.

Crash Dump: –  As we already know that in certain scenarios like AV , SQL generate the dump itself using sqldumper. But for remaining ones we have to configure it ourselves. Lets say in scenarios where SQLagent is crashing or SQL is crashing while doing something or even SQL setup is failing and may more, we can leverage these dumps to find out the cause of the issue.

The utilities which we can use for these scenarios are :- Adplus or DebugDiag etc. .

To take a crash dump of SQL process or any process , here are the commands which could be used:

1. Download Adplus from : – http://support.microsoft.com/kb/286350
2. Run the command : – adplus –crash  –p <pid>  -CTCF  -o <output folder>  -quiet

For detailed information on the ADPLUS tool, please check the article : – http://support.microsoft.com/kb/286350

I will cover how to take manual crash/hang dump using windbg by connecting to SQL process invasively- in my next blog.

Advertisements

9 thoughts on “Complete explanation of SQL server memory dumps and it’s types.

  1. Nice compilation! Couple of things I wanted to comment on:

    i. Quote”atleast we can come to know while executing which function SQL failed on and what were the variables passed etc.”
    > Firstly, technically speaking SQL never fails 🙂 and from public symbols we can not see the variable, at the time of writing this comment 🙂

    ii. Syntax of DBCC STACKDUMP seems incorrect. For ‘ALL’ spids, either of DBCC STACKDUMP() or DBCC STACKDUMP (‘all’) or DBCC STACKDUMP (all) works (For that matter pass any wrong parameter and it would dumps all the spids ignoring the wrong parameter passed)

    The syntax for dumping a single spid is actually DBCC STACKDUMP (-1,) e.g. if we want to dump spid 53 then issue DBCC STACKDUMP (-1,53)

    Like

    • Thanks for response Prashant! I appreaciate your time to read my blog.

      Please see my comments inline

      1) Firstly, technically speaking SQL never fails and from public symbols we can not see the variable, at the time of writing this comment

      I appreciate your effort and understanding on this. Actually, from public symbols you will not be able to see the variables and you will only have the access to the function names. It may/may not help to understand/guess where the issue lies. But for SQL support team at Microsoft, they are of great help because of the access to private symbols and source code. Because, local and global symbols and all other required information can be accessed with the private symbols. It helps to understand the issue lot better and easier.

      2) Syntax of DBCC STACKDUMP seems incorrect. For ‘ALL’ spids, either of DBCC STACKDUMP() or DBCC STACKDUMP (‘all’) or DBCC STACKDUMP (all) works (For that matter pass any wrong parameter and it would dumps all the spids ignoring the wrong parameter passed)

      To dump all the threads in the SQL server , we can use dbcc stackdump() command. This is tried and tested.

      and for taking the dump of a specific SPID, you are correct that it should be dbcc stackdump(-1) or dbcc stackdump(-1,) – even corrected in my blog and thanks for correcting me.

      Like

  2. Due to some reason, the syntaxt didnt post accurately, I meant

    “The syntax for dumping a single spid is actually DBCC STACKDUMP (-1,spid no.) “

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s