SQL Server: How can you say it’s memory pressure?

Today, I will talk a little bit about memory troubleshooting. Even though it’s a new world of 64-bit systems, there are still so many systems on 32-bit hardware, sometimes due to operational criticality and sometimes due to investment constraints. So, I think this blog will add some value to the SQL community.

Your takeaways from this blog will be:

1. Introduction to SQL Server memory on 32-bit systems
2. Checking whether the memory pressure is from MemToLeave (MTL) or the buffer pool (Bpool)
3. Troubleshooting MTL (non-Bpool) issues
4. Troubleshooting Bpool issues

Please note that this blog applies only to 32-bit SQL Server environments.

I am sure you might have read lots of blogs about memory; for people who haven’t, don’t worry, I am going to share the links. First of all, it’s very important to understand what a 32-bit system is and what a 64-bit system is. I have seen many people who are still confused by this concept.

If you are not familiar with it, please check out this video: http://blogs.msdn.com/b/sqlserverfaq/archive/2011/11/16/demystifying-sql-server-memory-concepts-video.aspx

The above video will give you a basic understanding of SQL Server on 32-bit systems. If you are looking for more information on this, please feel free to post a comment; I will be more than happy to answer your questions.


Here are a few clarifications I want to make before moving into troubleshooting memory issues.

1. When does SQL Server use memory from the Bpool and when from MTL?

The answer is simple: if a memory allocation is 8 KB (one page), it comes from the Bpool; if a single allocation is larger than 8 KB, it comes from MTL.
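You can actually see this split per memory clerk. A minimal sketch, assuming SQL Server 2005/2008 where sys.dm_os_memory_clerks still exposes separate page-allocator columns (they were merged into a single pages_kb column in SQL Server 2012):

-- single_pages_kb = 8 KB page allocations served from the Bpool
-- multi_pages_kb  = allocations larger than 8 KB served from MTL
select type, single_pages_kb, multi_pages_kb
from sys.dm_os_memory_clerks
order by single_pages_kb desc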

2. Now, you may be thinking: when I set max server memory for SQL Server, does it cover both the MTL and Bpool regions?

Well, the answer is no. It covers only the Bpool, i.e. 8 KB allocations. If SQL Server needs an allocation larger than 8 KB, it takes that memory from the OS outside this cap. If you have seen the above-mentioned video, it shows the picture below:

[Image: the 2 GB user-mode address space divided between the Bpool (1.6 GB) and MTL plus worker thread stacks (0.4 GB)]

Out of the 2 GB of address space, 1.6 GB is given to the Bpool and 0.4 GB to MTL and worker threads. Out of that 0.4 GB, about 128 MB (255 worker threads * 512 KB stack size) goes to the worker threads. For more information, please check the above-mentioned video or the blog by Jonathan Kehayias: http://sqlblog.com/blogs/jonathan_kehayias/archive/2009/07/07/understanding-the-vas-reservation-aka-memtoleave-in-sql-server.aspx
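By the way, if you want to verify what max server memory (the Bpool cap from point 2 above) is set to on your instance, sp_configure will show it (the ‘show advanced options’ setting needs to be on):

exec sp_configure 'show advanced options', 1
reconfigure
exec sp_configure 'max server memory (MB)'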

As we have very limited VAS for the MTL region, we land in trouble when we use CLR, extended stored procedures, linked servers, backups, etc., as they take memory from the MTL region.

Now the question arises: how can you say it’s memory pressure?

In almost every case of memory pressure, i.e. whenever SQL Server is not able to allocate memory, it will throw an error message in the error log file along with a DBCC MEMORYSTATUS dump.
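As a side note, you don’t have to wait for the error: you can take the same snapshot on demand and search the error log for these messages. A small sketch (xp_readerrorlog is undocumented; the parameters below, log number, log type and search string, are the commonly used ones):

-- same memory report that SQL Server dumps on allocation failures
dbcc memorystatus

-- search the current error log (0) of type SQL (1) for out-of-memory messages
exec master..xp_readerrorlog 0, 1, N'insufficient system memory'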

The main task is to identify whether it’s MTL pressure or Bpool pressure.

Before we get into that, we need to be aware of what external memory pressure and internal memory pressure are. There is a very nice article by Slava Oks that explains the kinds of memory pressure and how the OS and SQL Server handle them: http://blogs.msdn.com/b/slavao/archive/2005/02/01/364523.aspx

Now that we are clear on the basics, let’s move on to the two kinds of memory issues we face on 32-bit systems:

1. MTL pressure
2. Bpool pressure

The errors you will see when there is MTL pressure:

The error below is a generic one, which we can set aside for a moment; we will see it whenever there is memory pressure, be it Bpool or MTL.

2010-07-07 08:09:00.71 spid67      Error: 701, Severity: 17, State: 123.
2010-07-07 08:09:00.71 spid67      There is insufficient system memory in resource pool 'default' to run this query.

We always have to look for the first error, e.g.:

Allocate Bytes: FAIL_VIRTUAL_RESERVE 1048576 <== look here

2012-06-26 08:16:04.31 spid14s     Error: 701, Severity: 17, State: 123.
2012-06-26 08:16:04.31 spid14s     There is insufficient system memory to run this query.
2012-06-26 08:16:04.31 spid12s     Error: 802, Severity: 17, State: 11.
2012-06-26 08:16:04.31 spid12s     There is insufficient memory available in the buffer pool.

We started scanning the DBCC MEMORYSTATUS output and found:

OBJECTSTORE_SERVICE_BROKER (Total)
VM Reserved = 0 KB
VM Committed = 0 KB
AWE Allocated = 0 KB
SM Reserved = 0 KB
SM Committed = 0 KB
SinglePage Allocator = 1523880 KB
MultiPage Allocator = 0 KB
:
:
2012-06-26 08:16:04.31 spid92      Error: 701, Severity: 17, State: 123.
2012-06-26 08:16:04.31 spid92      There is insufficient system memory in resource pool 'internal' to run this query.

In the above snippet, the key clue is FAIL_VIRTUAL_RESERVE 1048576; the rest are supporting log entries which may help to troubleshoot the actual issue.

Now we need to establish why this is an MTL issue. Looking at the message closely, it says FAIL_VIRTUAL_RESERVE 1048576, and converting 1048576 bytes to KB gives 1024 KB (1 MB). Flashing back to the concepts: any single allocation larger than 8 KB comes from the MTL region. Hence, it’s MTL.

Another message which you may see is:

2010-06-18 12:20:21.81 spid112     Failed allocate pages: FAIL_PAGE_ALLOCATION 2272

2010-06-18 12:20:22.19 spid112
Memory Manager
VM Reserved = 144584 KB
VM Committed = 352080 KB
AWE Allocated = 26640384 KB
Reserved Memory = 1024 KB
Reserved Memory In Use = 0 KB

2010-06-18 12:20:22.19 spid112
Memory node Id = 0
VM Reserved = 3584 KB
VM Committed = 66364 KB
AWE Allocated = 2973644 KB
SinglePage Allocator = 468336 KB
MultiPage Allocator = 3352 KB

When we see FAIL_PAGE_ALLOCATION, the request can be from the Bpool or from MTL, so we work out the size from the number of pages requested: in this case it’s 2272 pages * 8 KB = 18,176 KB, roughly 18 MB. As this single allocation is larger than 8 KB, it came from MTL.

These are the two major error messages you will see for MTL pressure. Now, the question is: how do we fix it?

Here are a few things which can be done at a very basic level before calling Microsoft:

1. As we know, by default 256 MB of address space is reserved for the MTL region, and sometimes its usage is genuine SQL Server consumption due to linked servers, backups, CLR, etc. In these scenarios it’s alright to add the -g384 switch (or -g512 if max server memory for SQL Server is below 16 GB; based on experience and not any official recommendation) to the SQL Server startup parameters. This switch increases the MTL region by 128 MB, i.e. from 256 MB to 384 MB, which gives these components a bigger playground.

But note that the process address space is still 2 GB, so that extra 128 MB of address space is taken from the Bpool’s 1.6 GB.

Normally on 32-bit systems you will see:

Memory Manager
VM Reserved = 1650436 KB   <-- roughly the 1.6 GB mentioned above
VM Committed = 1649556 KB
AWE Allocated = 0 KB
Reserved Memory = 1024 KB
Reserved Memory In Use = 0 KB

But when we enable the -g384 switch, it becomes:

MEMORYCLERK_SQLBUFFERPOOL (Total)                KB
----------------------------------------  ---------
VM Reserved                                  1522336   <-- after deducting 128 MB
VM Committed                                 1522336
AWE Allocated                                      0
SM Reserved                                        0
SM Committed                                       0
SinglePage Allocator                               0
MultiPage Allocator                              368
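For reference, the -g switch is appended to the startup parameters (SQL Server Configuration Manager, service properties); the line below is purely illustrative and the file paths are hypothetical:

-dC:\SQLData\master.mdf;-eC:\SQLLogs\ERRORLOG;-lC:\SQLData\mastlog.ldf;-g384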

But if you still see the same MTL pressure errors, then you may want to monitor the MTL address space usage by following this article: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/02/16/how-to-find-who-is-using-eating-up-the-virtual-address-space-on-your-sql-server.aspx

You will need to put the queries into a job, run it on a schedule, and note at what time the max free size (KB) drops, and so on.
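For convenience, this is the gist of the VAS summary query used in that article and in various Microsoft support posts; it relies on sys.dm_os_virtual_address_dump, an undocumented DMV available from SQL Server 2005 onwards:

with VASummary(Size, Reserved, Free) as
(
    select Size = VaDump.Size,
           Reserved = sum(case (convert(int, VaDump.Base) ^ 0) when 0 then 0 else 1 end),
           Free = sum(case (convert(int, VaDump.Base) ^ 0) when 0 then 1 else 0 end)
    from
    (
        -- reserved regions, grouped by allocation base
        select convert(varbinary, sum(region_size_in_bytes)) as Size,
               region_allocation_base_address as Base
        from sys.dm_os_virtual_address_dump
        where region_allocation_base_address <> 0x0
        group by region_allocation_base_address
        union
        -- free regions
        select convert(varbinary, region_size_in_bytes), region_allocation_base_address
        from sys.dm_os_virtual_address_dump
        where region_allocation_base_address = 0x0
    ) as VaDump
    group by Size
)
select sum(convert(bigint, Size) * Free) / 1024 as [Total avail, KB],
       cast(max(Size) as bigint) / 1024 as [Max free size, KB]
from VASummary
where Free <> 0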

But if we suspect that there is a leak, then you could do the following:

1. Find the third-party DLLs loaded in the SQL Server process by running this query:

select * from sys.dm_os_loaded_modules
where company <> 'Microsoft Corporation'

Try to move those DLLs out of the SQL Server address space and see if that resolves the issue.

2. To track down the leak itself, you will need help from Microsoft.

In my experience, 60-70% of the cases get resolved by setting the -g384/-g512 startup switch, and the remaining ones need some more work.

Now it’s time to look at Bpool pressure:

There is one very common error we see for Bpool memory pressure: FAIL_PAGE_ALLOCATION 1. It’s for one page, i.e. 8 KB. This is sometimes fairly simple to troubleshoot and sometimes tricky.

When we see this sort of issue, we have to hunt for the memory clerk / cache store / user store which is consuming the highest amount of memory. There is no rule of thumb saying that if a clerk uses some x amount of memory then it’s consuming more than expected; it depends on the workload.
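If the server is still up, you don’t have to wait for the error-log dump; a ranked version of the earlier clerk query (again assuming the pre-2012 column names) gives the same list:

-- top 10 consumers across memory clerks, cache stores and user stores
select top 10 type, name,
       sum(single_pages_kb) as single_pages_kb,
       sum(multi_pages_kb)  as multi_pages_kb
from sys.dm_os_memory_clerks
group by type, name
order by sum(single_pages_kb) + sum(multi_pages_kb) desc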

Specific to 32-bit systems only:

The playground for Bpool consumers is the 1.6 GB of address space. For data and index pages we have the option of enabling AWE, but the stolen pages are still confined to the 1.6 GB minus some address space used to map AWE pages. Most of the time the issue is with the stolen pages.

FYI: data and index pages are also known as hashed pages, and any other allocation from the Bpool, e.g. for locks, execution plans, etc., is known as stolen pages.
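To watch how the Bpool splits between hashed and stolen pages at runtime, one option is the Buffer Manager performance counters. A sketch, with counter names as they appear on SQL Server 2005/2008 (stolen memory moved to the Memory Manager object in SQL Server 2012):

-- each counter value is a page count; * 8 / 1024 converts pages to MB
select counter_name,
       cntr_value            as pages,
       cntr_value * 8 / 1024 as approx_mb
from sys.dm_os_performance_counters
where object_name like '%Buffer Manager%'
  and counter_name in ('Database pages', 'Stolen pages', 'Total pages')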

Our target is to find out which memory clerk / cache store / user store is consuming the highest amount of memory. We can check the DBCC MEMORYSTATUS output, which gets written just after the FAIL_PAGE_ALLOCATION error in the SQL Server error log. To understand DBCC MEMORYSTATUS, see: http://support.microsoft.com/kb/907877

In one such case:

2010-02-17 07:28:43.06 Server      Failed allocate pages: FAIL_PAGE_ALLOCATION 1
2010-02-17 07:28:43.06 spid14s     Error: 701, Severity: 17, State: 123.
2010-02-17 07:28:43.06 spid14s     There is insufficient system memory to run this query.
2010-02-17 07:28:43.06 spid12s     Error: 802, Severity: 17, State: 11.
2010-02-17 07:28:43.06 spid12s     There is insufficient memory available in the buffer pool.

We started scanning the DBCC MEMORYSTATUS output and found:

OBJECTSTORE_SERVICE_BROKER (Total)
VM Reserved = 0 KB
VM Committed = 0 KB
AWE Allocated = 0 KB
SM Reserved = 0 KB
SM Committed = 0 KB
SinglePage Allocator = 1528184 KB   <-- most of the buffer pool memory is being consumed by Service Broker
MultiPage Allocator = 0 KB

To see how that one was troubleshot, please check: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/03/25/sql-server-encounters-memory-pressure-and-generate-701-error-due-to-incorrect-configuration-of-service-broker.aspx

Another issue which I worked on was:

OBJECTSTORE_LBSS (node 0)                        KB
----------------------------------------  ---------
VM Reserved                                        0
VM Committed                                       0
AWE Allocated                                      0
SM Reserved                                        0
SM Committed                                       0
SinglePage Allocator                          733608   <-- ~716 MB
MultiPage Allocator                                0

OBJECTSTORE_LBSS is used to store temporary LOBs: variables, parameters, and intermediate results of expressions. It is also used in parser code for storing metadata expression objects and for loading the stats blob during compilation.

In this case, the culprit turned out to be a stored procedure which used a TVP and worked on LOB data.

The above examples were just to give you an idea of how we troubleshoot Bpool issues. Any memory clerk / cache store / user store can similarly be the culprit; we have to understand its purpose from the article http://support.microsoft.com/kb/907877 and work out the cause of the issue from there.

One very famous scenario: even with max server memory set as high as 148 GB, the FAIL_PAGE_ALLOCATION 1 error can still occur. In these cases the issue can be due to uneven distribution of memory across the NUMA nodes. To find the cause, please check this blog: http://blogs.msdn.com/b/psssql/archive/2009/05/15/how-it-works-dbcc-memorystatus-locked-pages-allocated-and-singlepageallocator-values.aspx

Memory Manager                                   KB
---------------------------------------- ----------
VM Reserved                              1079677032
VM Committed                                5194524
Locked Pages Allocated                     70262784   <-- about 72 GB overall (VM Committed + Locked Pages Allocated)
Reserved Memory                                1024
Reserved Memory In Use                            0

2011-09-12 15:14:51.690 spid222
Memory node Id = 0                               KB
---------------------------------------- ----------
VM Reserved                                   47296
VM Committed                                  47176
Locked Pages Allocated                            0
MultiPage Allocator                           11400
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 1                               KB
---------------------------------------- ----------
VM Reserved                              1079348712
VM Committed                                4868904
Locked Pages Allocated                     70262784   <-- this node seems to have all the memory
MultiPage Allocator                         1467984
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 2                               KB
---------------------------------------- ----------
VM Reserved                                   44800
VM Committed                                  44688
Locked Pages Allocated                            0
MultiPage Allocator                            8920
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 3                               KB
---------------------------------------- ----------
VM Reserved                                   46976
VM Committed                                  46584
Locked Pages Allocated                            0
MultiPage Allocator                           13520
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 4                               KB
---------------------------------------- ----------
VM Reserved                                   65984
VM Committed                                  64336
Locked Pages Allocated                            0
MultiPage Allocator                           28064
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 5                               KB
---------------------------------------- ----------
VM Reserved                                   40448
VM Committed                                  40336
Locked Pages Allocated                            0
MultiPage Allocator                            7352
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 6                               KB
---------------------------------------- ----------
VM Reserved                                   36288
VM Committed                                  36176
Locked Pages Allocated                            0
MultiPage Allocator                            3288
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 7                               KB
---------------------------------------- ----------
VM Reserved                                   42880
VM Committed                                  42760
Locked Pages Allocated                            0
MultiPage Allocator                            9368
SinglePage Allocator                        8646672

2011-09-12 15:14:51.690 spid222
Memory node Id = 64                              KB   <-- node 64 is usually the DAC node; we can ignore it here
---------------------------------------- ----------
VM Reserved                                    2560
VM Committed                                   2504
Locked Pages Allocated                            0
MultiPage Allocator                            2416
SinglePage Allocator                        8646672
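If you would rather check the per-node distribution with a live query than the error-log dump, a sketch along these lines works (pre-2012 column names again; also note that SinglePage Allocator is reported as the server-wide total on every node, which the PSSSQL post above explains):

-- locked/AWE pages per NUMA node; a heavy skew points to uneven distribution
select memory_node_id,
       sum(awe_allocated_kb) as awe_allocated_kb,
       sum(multi_pages_kb)   as multi_pages_kb,
       sum(single_pages_kb)  as single_pages_kb
from sys.dm_os_memory_clerks
group by memory_node_id
order by memory_node_id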

HTH!
