Today, I will talk a little bit about memory troubleshooting. Even though it's a new world of 64-bit systems, there are still many systems running on 32-bit hardware, sometimes due to operational criticality and sometimes due to investment constraints. So, I think this blog will add some value to the SQL community.
Your takeaways from this blog will be:
1. An introduction to SQL Server memory on 32-bit systems
2. Checking whether the memory pressure is from MemToLeave (MTL) or the buffer pool (Bpool)
3. Troubleshooting MTL (non-Bpool) issues
4. Troubleshooting Bpool issues
Please note that this blog applies only to 32-bit SQL Server environments.
I am sure you might have read lots of blogs about memory, and for people who haven't – don't worry, I am going to share the links. First of all, it's very important to understand what a 32-bit system is and what a 64-bit system is. I have seen that many people are still confused by this concept.
For people who are not aware, please check out this video: http://blogs.msdn.com/b/sqlserverfaq/archive/2011/11/16/demystifying-sql-server-memory-concepts-video.aspx
The above video will give you some understanding of how SQL Server works on a 32-bit system. If you are looking for more information, please feel free to post a comment – I will be more than happy to answer your questions.
Here are a few clarifications which I want to make before I move into troubleshooting memory issues.
1. When does SQL Server use memory from the Bpool and when from MTL?
The answer is simple: when SQL Server needs a memory allocation of up to 8 KB, it uses the Bpool; if the required allocation is larger than 8 KB, it uses memory from MTL.
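This split can be seen per consumer on a running instance. A minimal sketch, assuming SQL Server 2005/2008, where sys.dm_os_memory_clerks still exposes separate single-page and multi-page columns (they were merged in later versions):

```sql
-- 8 KB page allocations (Bpool) vs allocations larger than 8 KB (MTL), per clerk
SELECT [type],
       SUM(single_pages_kb) AS bpool_kb,  -- single-page allocator: comes from the Bpool
       SUM(multi_pages_kb)  AS mtl_kb     -- multi-page allocator: comes from MTL
FROM sys.dm_os_memory_clerks
GROUP BY [type]
ORDER BY mtl_kb DESC;
```

Sorting by the multi-page column quickly shows which clerks are eating the small MTL region.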
2. Now, you may be thinking – when I set max server memory for SQL Server, does it cover both the MTL and Bpool regions?
Well, the answer is no. It applies only to the Bpool, i.e. allocations of 8 KB pages. Allocations of more than 8 KB take memory outside of max server memory. If you have seen the above-mentioned video, it showed the picture below:
Out of the 2 GB of address space, 1.6 GB is given to the Bpool and 0.4 GB is left for MTL and worker threads. Out of that 0.4 GB, about 128 MB (255 worker threads * 512 KB stack size) is reserved for worker thread stacks. For more information, please check the above-mentioned video or the blog by Jonathan Kehayias: http://sqlblog.com/blogs/jonathan_kehayias/archive/2009/07/07/understanding-the-vas-reservation-aka-memtoleave-in-sql-server.aspx.
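The arithmetic behind that picture can be sketched as follows (the 256 MB default for -g and the 255 * 512 KB thread stack reservation are taken from the figures above; treat this as an illustration, not an official formula):

```python
# MemToLeave sizing on 32-bit SQL Server, using the default values above
MB = 1024 * 1024
user_vas = 2 * 1024 * MB            # 2 GB user-mode virtual address space
mem_to_leave = 256 * MB             # default -g reservation: 256 MB
worker_threads = 255                # default max worker threads on 32-bit
stack_size = 512 * 1024             # 512 KB stack per worker thread

thread_stacks = worker_threads * stack_size           # thread stack reservation
bpool_vas = user_vas - mem_to_leave - thread_stacks   # what is left for the Bpool

print(f"thread stacks: {thread_stacks / MB:.1f} MB")  # 127.5 MB (~128 MB)
print(f"Bpool VAS:     {bpool_vas / MB:.1f} MB")      # 1664.5 MB (~1.6 GB)
```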
As we have very limited VAS for the MTL region, we land in trouble when we use CLR, extended stored procedures, linked servers, backups, etc., as they take memory from the MTL region.
Now, the question arises – how can you tell it's memory pressure?
SQL Server will throw an error message in the error log file, along with DBCC MEMORYSTATUS output, in almost every case of memory pressure, i.e. whenever it is not able to allocate memory.
The main task is to identify whether it's MTL pressure or Bpool pressure.
Before we get into that, we need to be aware of what external memory pressure and internal memory pressure are. To understand that, there is a very nice article by Slava Oks: http://blogs.msdn.com/b/slavao/archive/2005/02/01/364523.aspx. It explains very nicely what kinds of memory pressure exist and how they are handled by the OS/SQL Server.
Now we are clear on the basics needed to understand the context of this blog. Moving to the next phase – the memory issues which we face on 32-bit systems:
1. MTL pressure
2. Bpool pressure
The errors which you will see when there is MTL pressure:
The error below is a generic one, which we can ignore for the moment. We will see it whenever there is memory pressure – be it Bpool or MTL.
2010-07-07 08:09:00.71 spid67 Error: 701, Severity: 17, State: 123.
2010-07-07 08:09:00.71 spid67 There is insufficient system memory in resource pool ‘default’ to run this query.
We always have to look for the first error, e.g.:
Failed Virtual Allocate Bytes: FAIL_VIRTUAL_RESERVE 1048576 <== look here
2012-06-26 08:16:04.31 spid14s Error: 701, Severity: 17, State: 123.
2012-06-26 08:16:04.31 spid14s There is insufficient system memory to run this query.
2012-06-26 08:16:04.31 spid12s Error: 802, Severity: 17, State: 11.
2012-06-26 08:16:04.31 spid12s There is insufficient memory available in the buffer pool.
We started scanning the DBCC MEMORYSTATUS output and found:
OBJECTSTORE_SERVICE_BROKER (Total)
VM Reserved = 0 KB
VM Committed = 0 KB
AWE Allocated = 0 KB
SM Reserved = 0 KB
SM Committed = 0 KB
SinglePage Allocator = 1523880 KB
MultiPage Allocator = 0 KB
:
2012-06-26 08:16:04.31 spid92 Error: 701, Severity: 17, State: 123.
2012-06-26 08:16:04.31 spid92 There is insufficient system memory in resource pool ‘internal’ to run this query.
In the above snippet, we can see that the main issue is FAIL_VIRTUAL_RESERVE 1048576; the rest are logs which may help to troubleshoot the actual issue.
Now, we need to find out why it's an MTL issue. If we look at the message closely, it says FAIL_VIRTUAL_RESERVE 1048576, and converting 1048576 bytes to KB gives 1024 KB (1 MB). If we flash back to the concepts – if the allocation is more than 8 KB, it uses the MTL region. Hence, it's MTL.
Another message which you may see is:
2010-06-18 12:20:21.81 spid112 Failed allocate pages: FAIL_PAGE_ALLOCATION 2272
2010-06-18 12:20:22.19 spid112
Memory Manager
VM Reserved = 144584 KB
VM Committed = 352080 KB
AWE Allocated = 26640384 KB
Reserved Memory = 1024 KB
Reserved Memory In Use = 0 KB
2010-06-18 12:20:22.19 spid112
Memory node Id = 0
VM Reserved = 3584 KB
VM Committed = 66364 KB
AWE Allocated = 2973644 KB
SinglePage Allocator = 468336 KB
MultiPage Allocator = 3352 KB
When we see FAIL_PAGE_ALLOCATION, it can be from the Bpool or from MTL. So, based on the number of pages requested, we work out the size: in this case it's 2272 * 8 KB, which comes out to approximately 18 MB. And as the size is more than 8 KB, it's MTL.
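The unit conversions used in both messages are easy to sketch – FAIL_VIRTUAL_RESERVE reports bytes, while FAIL_PAGE_ALLOCATION reports the number of 8 KB pages requested:

```python
# FAIL_VIRTUAL_RESERVE reports bytes; FAIL_PAGE_ALLOCATION reports 8 KB pages
KB = 1024
PAGE_KB = 8

reserve_kb = 1048576 // KB   # FAIL_VIRTUAL_RESERVE 1048576 -> 1024 KB (1 MB) > 8 KB -> MTL
alloc_kb = 2272 * PAGE_KB    # FAIL_PAGE_ALLOCATION 2272 -> 18176 KB (~17.75 MB) > 8 KB -> MTL

print(reserve_kb, alloc_kb)  # 1024 18176
```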
These are the two major error messages which you see for MTL pressure. Now, the question is: how do we fix it?
Here are a few things which can be done at a very basic level before calling Microsoft:
1. As we know, by default 256 MB of address space is used for the MTL region, and sometimes it's genuine usage by SQL Server, which can be due to linked servers, backups, CLR, etc. In these scenarios, it's alright to add the –g384 switch (or –g512 if max server memory for SQL Server is below 16 GB – based on experience and not any official recommendation) to the SQL Server startup parameters. This switch increases the MTL region by 128 MB, i.e. from 256 MB to 384 MB, giving the MTL components a bigger playground.
But the point to note is that the process address space is still 2 GB, so the extra 128 MB of address space is taken from the Bpool's 1.6 GB.
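As a sketch of how the switch is applied (the default instance service name MSSQLSERVER is an assumption; for a permanent change, add the switch to the startup parameters via SQL Server Configuration Manager instead of a one-off start):

```shell
REM One-off restart of the default instance with MemToLeave enlarged to 384 MB.
REM NOTE: /g384 passed this way lasts only for this start; make it permanent in
REM SQL Server Configuration Manager -> SQL Server service -> startup parameters.
net stop MSSQLSERVER
net start MSSQLSERVER /g384
```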
Normally, on 32-bit systems:
Memory Manager
VM Reserved = 1650436 KB <—you will see it to be 1.6 GB
VM Committed = 1649556 KB
AWE Allocated = 0 KB
Reserved Memory = 1024 KB
Reserved Memory In Use = 0 KB
But when we enable the –g384 switch, it becomes:
MEMORYCLERK_SQLBUFFERPOOL (Total) KB
——————————————————- ———–
VM Reserved 1522336 <—After deducting 128 MB
VM Committed 1522336
AWE Allocated 0
SM Reserved 0
SM Commited 0
SinglePage Allocator 0
MultiPage Allocator 368
But if you still see the same MTL pressure error, then you may want to monitor the MTL address space usage by following this article: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/02/16/how-to-find-who-is-using-eating-up-the-virtual-address-space-on-your-sql-server.aspx
You will need to add the queries to a job, run them regularly, and check at what time the max free size (KB) drops, and so on.
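A sketch of the kind of query that article schedules in a job (it relies on the undocumented sys.dm_os_virtual_address_dump DMV; the column names and the MEM_FREE constant are assumptions based on the 2005/2008 builds the article targets):

```sql
-- Sample total free VAS and the largest contiguous free block; log this
-- from a job and watch for the time at which max_free_block_kb drops.
SELECT GETDATE()                         AS sample_time,
       SUM(region_size_in_bytes) / 1024  AS total_free_kb,
       MAX(region_size_in_bytes) / 1024  AS max_free_block_kb
FROM sys.dm_os_virtual_address_dump
WHERE region_state = 0x10000;  -- MEM_FREE
```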
But if we suspect that there is a leak, then you could do the following:
1. Try to find the third-party DLLs loaded in the SQL Server process by running this query:
select * from sys.dm_os_loaded_modules
where company <> 'Microsoft Corporation'
Try to move those DLLs out of the SQL Server address space and see if that helps to resolve the issue.
2. To find the leak itself, you will need help from Microsoft.
60-70% of the cases get resolved by setting the –g384/512 startup parameter, and the remaining ones need some work.
Now, it’s the time for finding Bpool pressure:
There is one very common error which we see for the Bpool memory pressure: FAIL_PAGE_ALLOCATION 1 . It’s for one page i.e. 8 KB. This is sometimes fairly simple and sometimes it can be tricky.
When we see this sort of issue, we have to hunt for a memoryclerk /Cachestore / userstore which is consuming highest amount of memory. There is no thumb rule for any clerk that if uses some x amount of memory then it’s consuming over expected amount of memory.
In specific to 32-bit systems only
The playground for Bpool consumer is 1.6 GB of address space and for data and index pages – we have the solution of enabling AWE. But for the stolen pages, we still have limited space of 1.6 GB minus some address space for AWE pages. Most of the times the issue happens for the stolen pages.
FYI : – Data and index pages a.k.a hashed pages and any other allocation from Bpool except hashed pages are known as stolen pages e.g. allocation for locks, execution plans etc.
Our target will be to find out which memoryclerk /Cachestore / userstore is consuming highest amount of memory. We can check DBCC memorystatus output which gets written just after the error of FAIL_PAGE_ALLOCATION in the SQL error log. To understand what is DBCC memorystatus : – http://support.microsoft.com/kb/907877
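When the server is still responsive, the same hunt can be done live from the clerk DMV. A minimal sketch, assuming the pre-2012 single_pages_kb column (single-page allocations are the stolen-page consumers we care about here):

```sql
-- Top consumers of single-page (stolen) Bpool memory
SELECT TOP 10 [type], [name],
       SUM(single_pages_kb) AS stolen_kb
FROM sys.dm_os_memory_clerks
GROUP BY [type], [name]
ORDER BY stolen_kb DESC;
```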
Again, we look into the DBCC MEMORYSTATUS output. In one of the scenarios:
2010-02-17 07:28:43.06 Server Failed allocate pages: FAIL_PAGE_ALLOCATION 1
2010-02-17 07:28:43.06 spid14s Error: 701, Severity: 17, State: 123.
2010-02-17 07:28:43.06 spid14s There is insufficient system memory to run this query.
2010-02-17 07:28:43.06 spid12s Error: 802, Severity: 17, State: 11.
2010-02-17 07:28:43.06 spid12s There is insufficient memory available in the buffer pool.
We started scanning the DBCC MEMORYSTATUS output and found:
OBJECTSTORE_SERVICE_BROKER (Total)
VM Reserved = 0 KB
VM Committed = 0 KB
AWE Allocated = 0 KB
SM Reserved = 0 KB
SM Committed = 0 KB
SinglePage Allocator = 1528184 KB <– Most of the buffer pool memory is being consumed by Service Broker.
MultiPage Allocator = 0 KB
To find out how it was troubleshot, please check: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/03/25/sql-server-encounters-memory-pressure-and-generate-701-error-due-to-incorrect-configuration-of-service-broker.aspx
Another issue which I worked on was:
OBJECTSTORE_LBSS (node 0) KB
—————————————- ———–
VM Reserved 0
VM Committed 0
AWE Allocated 0
SM Reserved 0
SM Committed 0
SinglePage Allocator 733608 <- 732 MB approx.
MultiPage Allocator 0
OBJECTSTORE_LBSS is used to store temporary LOBs – variables, parameters, and intermediate results of expressions. It is also used in parser code for storing metadata expression objects and for loading the stats blob during compilations.
In this case, a stored procedure which was using a TVP and working on LOB data was found to be the culprit.
The above examples were just to give you an idea of how we troubleshoot Bpool issues. Similarly, any memory clerk / cache store / user store may be the culprit. We have to understand their purpose from the article http://support.microsoft.com/kb/907877 and try to find the cause of the issue.
One very famous scenario is that even after setting 148 GB as SQL Server max server memory, the FAIL_PAGE_ALLOCATION 1 error can occur. The issue in these scenarios can be due to uneven distribution of memory across the NUMA nodes. To find out the cause of the issue, please check the blog: http://blogs.msdn.com/b/psssql/archive/2009/05/15/how-it-works-dbcc-memorystatus-locked-pages-allocated-and-singlepageallocator-values.aspx
Memory Manager KB
—————————————- ———-
VM Reserved 1079677032
VM Committed 5194524
Locked Pages Allocated 70262784 <– about 74 GB overall (VM Committed + Locked Pages Allocated)
Reserved Memory 1024
Reserved Memory In Use 0
2011-09-12 15:14:51.690 spid222
Memory node Id = 0 KB
—————————————- ———-
VM Reserved 47296
VM Committed 47176
Locked Pages Allocated 0
MultiPage Allocator 11400
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 1 KB
—————————————- ———-
VM Reserved 1079348712
VM Committed 4868904
Locked Pages Allocated 70262784 <– it seems this node has all the memory
MultiPage Allocator 1467984
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 2 KB
—————————————- ———-
VM Reserved 44800
VM Committed 44688
Locked Pages Allocated 0
MultiPage Allocator 8920
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 3 KB
—————————————- ———-
VM Reserved 46976
VM Committed 46584
Locked Pages Allocated 0
MultiPage Allocator 13520
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 4 KB
—————————————- ———-
VM Reserved 65984
VM Committed 64336
Locked Pages Allocated 0
MultiPage Allocator 28064
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 5 KB
—————————————- ———-
VM Reserved 40448
VM Committed 40336
Locked Pages Allocated 0
MultiPage Allocator 7352
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 6 KB
—————————————- ———-
VM Reserved 36288
VM Committed 36176
Locked Pages Allocated 0
MultiPage Allocator 3288
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 7 KB
—————————————- ———-
VM Reserved 42880
VM Committed 42760
Locked Pages Allocated 0
MultiPage Allocator 9368
SinglePage Allocator 8646672
2011-09-12 15:14:51.690 spid222
Memory node Id = 64 KB <– node 64 is usually the DAC node; we can ignore it in this case
—————————————- ———-
VM Reserved 2560
VM Committed 2504
Locked Pages Allocated 0
MultiPage Allocator 2416
SinglePage Allocator 8646672
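On a live server, a per-node view similar to the listing above can be pulled from the memory-node DMV. A sketch, assuming SQL Server 2008-era column names in sys.dm_os_memory_nodes:

```sql
-- Per-NUMA-node memory distribution (node 64 is usually the DAC node)
SELECT memory_node_id,
       virtual_address_space_reserved_kb,
       virtual_address_space_committed_kb,
       locked_page_allocations_kb
FROM sys.dm_os_memory_nodes
ORDER BY memory_node_id;
```

A skew like the one above, where one node holds nearly all the locked pages, stands out immediately in this output.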
HTH!