Troubleshooting VSS Writer issues

In this blog I will discuss the various issues you can come across with the SQL VSS Writer while working with third-party backup tools such as IBM Tivoli and Quest LiteSpeed.

I have worked on many such issues and will try to cover most of the scenarios I have troubleshot. To be honest, most of the time just doing the basics resolves the issue. Only in a few extreme scenarios do we need to take VSS writer traces and troubleshoot from there.

Let's get started with a basic scenario.

1. We run the vssadmin list writers command and SqlServerWriter is missing from the output, and

     there is no error message in the Application event log, and

     the SQL Server VSS Writer service is running.

Here are the steps to follow to resolve such issues.

First of all, check whether any database name contains a space. Run this query, which wraps each name in # characters so that a leading or trailing space becomes visible:

SELECT '#' + name + '#' FROM sys.databases;

If you notice a space in any database name, e.g. #test #, remove the space from that database name.
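The same check can be scripted once the names are available outside SQL Server. This is a minimal Python sketch, assuming the database names have already been fetched into a list; the function names are hypothetical, and it flags only leading or trailing whitespace, which is what the # delimiters are meant to expose.

```python
def names_with_edge_spaces(names):
    """Return the database names that start or end with whitespace."""
    return [name for name in names if name != name.strip()]


def delimit(name):
    """Wrap a name in '#' so stray spaces are visible, as the query above does."""
    return f"#{name}#"


# Illustrative names only; in practice these would come from sys.databases.
sample = ["master", "test ", "Sales DB"]
for name in names_with_edge_spaces(sample):
    print(delimit(name))
```

With the illustrative list above, only the name with the trailing space is reported.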

Follow this article to rename the database: http://msdn.microsoft.com/en-us/library/ms345378.aspx

Once you have removed the space from the database name, run vssadmin list writers again and check whether SqlServerWriter is now listed.

If SqlServerWriter is still missing:

Check whether the logon account of the SQL Server VSS Writer service (usually NT AUTHORITY\SYSTEM, or whichever account you configured as the service logon) has been added to SQL Server with sysadmin privileges. If it has not, make it a sysadmin, then run vssadmin list writers again.

Most of the time, the above two steps resolve the issue.
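When the check has to be repeated after each fix, it helps to read the vssadmin output programmatically. The sketch below is a hedged Python example that parses text in the layout vssadmin list writers normally prints and reports whether SqlServerWriter is missing or what state it is in. The function names and the sample text are assumptions made for illustration, not captured output.

```python
import re


def parse_writers(output):
    """Map each writer name to its reported state and last error."""
    writers = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name:\s*'?([^']+?)'?$", line)
        if match:
            current = writers.setdefault(match.group(1), {})
            continue
        if current is None:
            continue  # banner lines before the first writer
        if line.startswith("State:"):
            current["state"] = line[len("State:"):].strip()
        elif line.startswith("Last error:"):
            current["last_error"] = line[len("Last error:"):].strip()
    return writers


def sql_writer_status(output):
    """Return 'missing' or the state string reported for SqlServerWriter."""
    for name, info in parse_writers(output).items():
        if name.lower() == "sqlserverwriter":
            return info.get("state", "unknown")
    return "missing"


# Illustrative sample in the usual vssadmin layout.
SAMPLE = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   Writer Id: {a65faa63-5ea8-4ebc-9dbd-a0c4db26912a}
   State: [8] Failed
   Last error: Inconsistent shadow copy
"""

print(sql_writer_status(SAMPLE))
```

On a real server the text would come from running vssadmin list writers in an elevated prompt, for example through subprocess.run with capture_output=True and text=True.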

2. We run vssadmin list writers and SqlServerWriter is reported as failed:

The first thing to do is open the Windows Application event log and check for errors. This article describes how to handle the connectivity issues: http://support.microsoft.com/kb/919023

Message 1
Event Type: Error
Event Source: SQLWRITER
Event Category: None
Event ID: 24583
Date: 4/30/2006
Time: 11:38:44 AM
User: N/A
Computer: ComputerName
Description:
Sqllib error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80040e4d. SQLSTATE: 28000, Native Error: 18456
Error state: 1, Severity: 14
Source: Microsoft SQL Native Client
Error message: Login failed for user 'NT AUTHORITY\SYSTEM'.
DBPROP_INIT_DATASOURCE: ComputerName
DBPROP_INIT_CATALOG: master
DBPROP_AUTH_INTEGRATED: SSPI

Message 2

Event Type: Error
Event Source: SQLWRITER
Event Category: None
Event ID: 24583
Date: 4/30/2006
Time: 11:38:44 AM
User: N/A
Computer: ComputerName
Description:
Sqllib error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80040e4d. SQLSTATE: 28000, Native Error: 18456
Error state: 1, Severity: 14
Source: Microsoft SQL Native Client
Error message: Login failed for user 'NT AUTHORITY\SYSTEM'.
DBPROP_INIT_DATASOURCE: ComputerName
DBPROP_INIT_CATALOG: master
DBPROP_AUTH_INTEGRATED: SSPI

Message 3

Event Type: Error
Event Source: VSS
Event Category: None
Event ID: 6013
Date: 4/30/2006
Time: 11:38:44 AM
User: N/A
Computer: ComputerName
Description:
Sqllib error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80040e4d. SQLSTATE: 42000, Native Error: 18456
Error state: 1, Severity: 14
Source: Microsoft OLE DB Provider for SQL Server
Error message: Login failed for user 'NT AUTHORITY\SYSTEM'.
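When there are many such events to go through, the key fields can be pulled out of the description text. The following Python sketch is only an illustration: the function names are hypothetical, and it relies on nothing more than the wording visible in the messages above. Native error 18456 is SQL Server's login-failed error, which is what points to the service account's permissions.

```python
import re


def parse_sqllib_error(description):
    """Extract the HRESULT, SQLSTATE/native error pairs and failing login."""
    hr = re.search(r"hr = (0x[0-9a-fA-F]+)", description)
    pairs = re.findall(r"SQLSTATE: (\w+), Native Error: (\d+)", description)
    login = re.search(r"Login failed for user ['‘]([^'’]+)['’]", description)
    return {
        "hr": hr.group(1) if hr else None,
        "errors": [(state, int(native)) for state, native in pairs],
        "login": login.group(1) if login else None,
    }


def is_login_failure(parsed):
    """True when any native error is 18456 (login failed)."""
    return any(native == 18456 for _, native in parsed["errors"])


# Description text copied from the SQLWRITER event shown above.
SAMPLE = r"""Sqllib error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80040e4d. SQLSTATE: 28000, Native Error: 18456
Error state: 1, Severity: 14
Source: Microsoft SQL Native Client
Error message: Login failed for user 'NT AUTHORITY\SYSTEM'.
"""

info = parse_sqllib_error(SAMPLE)
print(info["login"], is_login_failure(info))
```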

In essence, here are the points you need to make sure of:

1. The TCP/IP, Shared Memory and Named Pipes protocols should be enabled (for the 32-bit and 64-bit SQL Native Client configuration and for the SQL Server network configuration).

[Screenshot: VSS_writers]

2. The logon account of the SQL Server VSS Writer service (NT AUTHORITY\SYSTEM, most of the time) should have sysadmin privileges in SQL Server.

3. There should not be any alias in SQL Server Configuration Manager. If you see an alias like the one highlighted on the right-hand side of this screenshot:

[Screenshot: VSS_writers_alias]

remove it and run vssadmin list writers again. If SqlServerWriter is now successful, you have found the culprit.

Another exceptional issue I worked on: when we ran vssadmin list writers, SqlServerWriter failed. When we checked the Application log, the error was:

Error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80004005.
SQLSTATE: HYT00, Native Error: 0
Source: Microsoft SQL Native Client
Error message: Login timeout expired
SQLSTATE: 08001, Native Error: 2
Source: Microsoft SQL Native Client
Error message: An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections.
SQLSTATE: 08001, Native Error: 2
Error state: 1, Severity: 16
Source: Microsoft SQL Native Client
Error message: Named Pipes Provider: Could not open a connection to SQL Server [2].

DBPROP_INIT_DATASOURCE: Server Name -> Default instance name.
DBPROP_INIT_CATALOG: master
DBPROP_AUTH_INTEGRATED: SSPI

To see how the issue was resolved, check: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/28/backup-software-fails-to-take-system-state-backup-if-sql-server-vss-writer-service-is-running.aspx

==> We can also face the issue when there are more than 127 databases. The error in that case may look something like this:

Log Name:      Application
Source:        SQLWRITER
Date:          19/9/2012 1:29:16 μμ
Event ID:      24583
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SQL.domain.local
Description:
Sqllib error: OLEDB Error encountered calling ICommandText::Execute. hr = 0x80040e14. SQLSTATE: 42000, Native Error: 3013
Error state: 1, Severity: 16
Source: Microsoft SQL Server Native Client 10.0
Error message: BACKUP DATABASE is terminating abnormally.
SQLSTATE: 42000, Native Error: 3202
Error state: 1, Severity: 16
Source: Microsoft SQL Server Native Client 10.0
Error message: Write on "{E3FE4354-2B95-4C2B-85A7-639F4E3F7B0E}29" failed: 995(failed to retrieve text for this error. Reason: 15105)

Here is a very good article that explains the issues you may face with a large number of databases: http://blogs.msdn.com/b/psssql/archive/2009/11/13/how-it-works-how-many-databases-can-be-backed-up-simultaneously.aspx

Here is the solution for this issue: http://support.microsoft.com/kb/943471

3. Many times I have heard from customers that the issue goes away when they stop the SQL Server VSS Writer service.

After probing further into which files they were backing up, I found that they were backing up only OS files, not SQL Server database files. Well, this is strange, isn't it?

I tried to find out the reason, and here is what I found (http://support.microsoft.com/kb/919023):

You might be wondering why VSS framework components would need to connect to SQL Server when the components are only performing a backup of the volume. During the initial phases of snapshot creation, the configured default writer makes a connection to the instances of SQL Server on the particular server. One of the first phases of a snapshot creation process is “Backup Initialization.” During this phase, the backup application (requestor) performs the following actions to make sure that all the components in the snapshot creation process are ready:

  • The backup application binds to the IVssBackupComponents interface.
  • The backup application initializes the IVssBackupComponents interface.
  • The backup application calls the IVssGatherWriterMetadata API to perform metadata enumeration.

The VSS framework then instructs all writers to gather metadata. This includes a default writer that is included with SQL Server. It could be either MSDEWriter or SqlServerWriter based on the server’s current settings. This default writer for SQL Server connects to all instances of SQL Server that are started on the local system, obtains the required information about the databases on the instance of SQL Server, and then creates the metadata document. The metadata document is then returned to the backup application.

The failure in this scenario occurs because the logon account of the SQL Server VSS Writer service (NT AUTHORITY\SYSTEM, most of the time) doesn't have sysadmin privileges in SQL Server.

If you fix the permission, you won't need to stop the SQL Server VSS Writer service.
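As a rough illustration of that permission fix, the Python sketch below builds the T-SQL to check and grant sysadmin membership for the service account. The helper names are hypothetical and the script only builds strings, so running the statements against an instance is left to whichever client you use. IS_SRVROLEMEMBER and ALTER SERVER ROLE are T-SQL features; the ALTER SERVER ROLE ... ADD MEMBER form needs SQL Server 2012 or later, and on older versions sp_addsrvrolemember plays the same role.

```python
def quote_identifier(name):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value):
    """Single-quote a T-SQL string literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def sysadmin_statements(login):
    """Return (check, grant) statements for sysadmin membership of a login."""
    check = f"SELECT IS_SRVROLEMEMBER('sysadmin', {quote_literal(login)});"
    grant = f"ALTER SERVER ROLE [sysadmin] ADD MEMBER {quote_identifier(login)};"
    return check, grant


# The account name is the usual default mentioned above; adjust as needed.
check_sql, grant_sql = sysadmin_statements("NT AUTHORITY\\SYSTEM")
print(check_sql)
print(grant_sql)
```

The check statement returns 1 when the login is already a member, so the grant is only needed when it returns 0.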

4. A Windows bug: I worked on a scenario where ntbackup failed whenever the SQL Server VSS Writer service was running. After the backup failed, the vssadmin list writers command returned this error:
Writer name: SqlServerWriter
Writer Id: {a65faa63-5ea8-4ebc-9dbd-a0c4db26912a}
Writer Instance Id: {d4343f50-672e-4a11-b2f9-333f9ebff3}
State: [8] Failed
Last error: Inconsistent shadow copy

The command returned successful results before the backup failure. The issue turned out to be caused by the Windows bug described in this article: http://support.microsoft.com/kb/2457458

If the issue persists after following the steps above, it falls into the category of the so-called "extreme scenario".

The first thing I would do is run a very good tool named Backup Simulator, written by my colleague Amit Banerjee. This article describes how it works: http://troubleshootingsql.com/2011/06/18/sql-server-backup-simulator-v2-available-now/

The Backup Simulator tool helps narrow down whether the issue is with SQLVDI or with the third-party tool in use. It validates the SQLVDI infrastructure and takes a small backup. If that succeeds, the issue is probably with the third-party tool.

Here are a few errors you will see if there is an issue with SQLVDI:

Error message 1
2007-06-18 11:21:00.83 spid820 BackupVirtualDeviceFile::ClearError: failure on backup device 'VDI_ DeviceID '. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

Error message 2

2007-06-18 11:21:00.83 spid820 Error: 18210, Severity: 16, State: 1.
2007-06-18 11:21:00.83 spid820 BackupMedium::ReportIoError: write failure on backup device 'VDI_ DeviceID '. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.)

Error message 3

2007-06-18 11:21:00.87 spid820 Error: 18210, Severity: 16, State: 1.
2007-06-18 11:21:00.87 spid820 BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device 'VDI_ DeviceID '. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.)
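To spot these signatures quickly in a long SQL Server error log, something like the following Python sketch can help. It is an assumption-laden illustration: the function name is hypothetical, the pattern is derived only from the three messages above, and each log entry is expected on a single line. Operating system error 995 is the Windows code for an aborted I/O operation, as the message text itself says.

```python
import re

# Matches "<BackupRoutine>: <failure text> on backup device ... Operating system error <n>"
VDI_IO_ERROR = re.compile(
    r"(Backup\w+::\w+): (.+?) on backup device .*?Operating system error (\d+)"
)


def vdi_io_errors(log_lines):
    """Return (routine, failure, os_error) for each backup-device I/O error line."""
    hits = []
    for line in log_lines:
        match = VDI_IO_ERROR.search(line)
        if match:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits


# Lines adapted from the messages above, with each entry joined onto one line.
SAMPLE_LOG = [
    "2007-06-18 11:21:00.83 spid820 BackupVirtualDeviceFile::ClearError: failure on backup device 'VDI_ DeviceID '. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).",
    "2007-06-18 11:21:00.83 spid820 Error: 18210, Severity: 16, State: 1.",
    "2007-06-18 11:21:00.83 spid820 BackupMedium::ReportIoError: write failure on backup device 'VDI_ DeviceID '. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.)",
]

for routine, failure, code in vdi_io_errors(SAMPLE_LOG):
    print(routine, failure, code)
```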

For further information on this, check this blog: http://blogs.msdn.com/b/sqlserverfaq/archive/2009/04/28/is-sqlvdi-dll-functioning-properly.aspx

References:
How SQL VDI works: http://blogs.msdn.com/b/sqlserverfaq/archive/2009/04/28/informational-shedding-light-on-vss-vdi-backups-in-sql-server.aspx

Backup Simulator: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/10/27/sql-server-backup-simulator.aspx

VSS connectivity issues: http://support.microsoft.com/kb/919023

Please feel free to leave comments in case you have any questions.

HTH!