Apr 16

Troubleshooting Volume Shadow Copy (VSS) quiesce related issues


Details

• When quiescing a guest operating system using the Microsoft Volume Shadow Copy Service (VSS) prior to a snapshot, the operation fails.

• You see one of these errors:

Cannot create a quiesced snapshot because the snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine

An error occurred while quiescing the virtual machine. The error code was: 4 The error was: Quiesce aborted

• In the Event Viewer Application logs of the Windows Virtual Machine, you see errors similar to this:

◦ Event ID 11
◦ Event ID 12292
◦ Event ID 12032
◦ Event ID 12298

• The Event ID errors contain descriptions similar to this:

Volume Shadow Copy Service information: The COM Server with CLSID {} and name cannot be started. [].

Solution

Background Information

VMware products may require file systems within a guest operating system to be quiesced prior to a snapshot operation for the purposes of backup and data integrity.

VMware products which use quiesced snapshots include, but are not limited to, VMware Consolidated Backup and VMware Data Recovery.

As of ESX 3.5 Update 2, quiescing can be done by the Microsoft Volume Shadow Copy Service (VSS), which is available in Windows Server 2003 and later.

Operating systems which do not have VSS make use of the SYNC driver for quiescing operations. When VSS is invoked, all VSS providers must be running. If there is an issue with any third-party providers or the VSS service itself, the snapshot operation may fail.

 

Before investigating a VSS quiescing issue, ensure that you are able to create a manual non-quiesced snapshot using the vSphere Snapshot Manager. For more information, see Troubleshooting issues when creating or committing snapshots in ESX/ESXi (1038963).

 

Troubleshooting Steps

Note: These steps may vary depending on the operating system.
  1. Ensure that you meet all of the prerequisites of VSS when quiescing guest operating systems:
    • Verify that you are running ESX 3.5.0 Update 2 or higher.
    • Verify that the latest version of VMware Tools is installed on the virtual machine. VSS components must be explicitly specified during the VMware Tools upgrade process. VSS is not installed in non-interactive mode. For more information, see Verifying a VMware Tools build version (1003947).
    • Ensure that you are using Windows Server 2003 or higher. Previous versions of Windows, such as Windows XP and Windows 2000, do not include VSS and rely on the SYNC driver.
  2. Ensure that all the appropriate services are running and the startup types are listed correctly:
    Note: With the exception of the VMware Snapshot Provider service, all services listed below are Microsoft services bundled with the operating system. If you are having difficulty starting these services, consult Microsoft support before proceeding with the troubleshooting steps.

    • During the installation of VMware Tools (this is required to install the VMware Snapshot Provider Service):
      • Ensure that the COM+ System Application service is listed as Started and that the startup type is listed as Manual.
      • Ensure that the COM+ Event System service is listed as Started and that the startup type is listed as Automatic.
    • While Idle:
      • Ensure that the COM+ System Application is listed as Started and that the startup type is listed as Manual.
      • Ensure that the COM+ Event System service is listed as Started and that the startup type is listed as Automatic.
      • Ensure that the Volume Shadow Copy service is not running and the startup type is listed as Manual.
      • The Microsoft Software Shadow Copy Provider service may or may not be started. Ensure that the startup type is listed as Manual.
      • Ensure that the VMware Snapshot Provider is not running and that the startup type is listed as Manual.
    • During the backup:
      • Ensure that the COM+ System Application is listed as Started and that the startup type is listed as Manual.
      • Ensure that the COM+ Event System service is listed as Started and that the startup type is listed as Automatic.
      • The Volume Shadow Copy service may or may not be started. Ensure that the startup type is listed as Manual.
      • Ensure that the Microsoft Software Shadow Copy Provider service startup type is listed as Manual.
      • Ensure that the VMware Snapshot Provider is listed as Started and that the startup type is listed as Manual.
      • Ensure that the Virtual Disk service is listed as Started and that the startup type is listed as Manual.
  3. Make sure that you are using the Microsoft Software Shadow Copy provider:
    1. Click Start > Run.
    2. Type cmd and press Enter to open a command prompt.
      Note: You may need to run this as administrator.
    3. Check the VSS Providers with this command:
      C:\Users\Workstation> vssadmin list providers
      The output appears similar to this:
      Provider name: 'Microsoft Software Shadow Copy provider 1.0'
      Provider type: System
      Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
      Version: 1.0.0.7

      Check the permissions on VSS and on any third-party VSS providers, and ensure that the account is valid. For more information, see the Microsoft Knowledge Base article 259733.
      A third-party provider may interfere with the quiescing operation. Try uninstalling any third-party VSS providers.
      Note: The vssadmin utility and VSS are bundled with the Microsoft operating system. If the vssadmin utility reports errors, this may indicate a pre-existing issue with VSS. Consult Microsoft support before proceeding with the next troubleshooting steps.
  4. Ensure that all of the VSS writers are stable and that they are not reporting an error. Run this command:
    C:\Users\Workstation> vssadmin list writers
    The output for each writer appears similar to this:
    Writer name: 'VSS Metadata Store Writer'
    Writer Id: {75dfb225-e2e4-4d39-9ac9-ffaff65ddf06}
    Writer Instance Id: {088e7a7d-09a8-4cc6-a609-ad90e75ddc93}
    State: [1] Stable
    Last error: No error

    Ensure that the State is Stable and that the last line of the output is Last error: No error.
  5. If the issue persists, perform these troubleshooting steps:
    1. Run NTBackup and see if there are VSS errors when NTBackup attempts to run. For more information, see the Windows NT Backup – Restore Utility page.
    2. Try reinstalling VMware Tools to re-register VSS.
    3. If you are having difficulty backing up information from specific applications such as Microsoft Exchange, Microsoft SQL, and Active Directory, ensure that all necessary VSS writers are also installed with these components.
    4. Engage Microsoft support to ensure that there are no known issues with VSS. Visit http://support.microsoft.com/ to see if there are any patches and updates for VSS.
    5. If errors such as Error: 0x8000FFFF or Event ID 12302 persist, see the Microsoft Knowledge Base article 940184.
    6. Ensure that the time is accurate in the guest operating system and validate that it is being synchronized via NTP or VMware Tools. Please see Timekeeping in VMware Virtual Machines for more information.
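
The provider and writer checks in steps 3 and 4 can be automated by parsing the text that vssadmin prints. The following is a minimal sketch, not a VMware or Microsoft tool. It assumes the output keeps the "Key: value" layout shown above, and the helper names (parse_blocks, unhealthy_writers, other_providers) as well as the "Example ..." entries in the sample data are hypothetical. Whether a provider it lists really is a third-party provider is left for you to judge.

```python
MS_PROVIDER = "Microsoft Software Shadow Copy provider"

def parse_blocks(text, first_key):
    """Group 'Key: value' lines into dicts; a line whose key is first_key starts a new dict."""
    blocks = []
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # banner or blank line without a colon
        key = key.strip()
        value = value.strip().strip("'\u2018\u2019")  # drop straight or curly quotes
        if key == first_key:
            blocks.append({})
        if blocks:
            blocks[-1][key] = value
    return blocks

def unhealthy_writers(text):
    """Names of writers whose State is not Stable or whose Last error is not 'No error'."""
    bad = []
    for writer in parse_blocks(text, "Writer name"):
        stable = writer.get("State", "").endswith("Stable")
        clean = writer.get("Last error", "") == "No error"
        if not (stable and clean):
            bad.append(writer["Writer name"])
    return bad

def other_providers(text):
    """Names of providers other than the Microsoft Software Shadow Copy provider."""
    return [p["Provider name"] for p in parse_blocks(text, "Provider name")
            if not p["Provider name"].startswith(MS_PROVIDER)]

# Sample data: the first entry of each sample is taken from the output above,
# the "Example ..." entries are made up for illustration.
SAMPLE_WRITERS = """
Writer name: 'VSS Metadata Store Writer'
   Writer Id: {75dfb225-e2e4-4d39-9ac9-ffaff65ddf06}
   Writer Instance Id: {088e7a7d-09a8-4cc6-a609-ad90e75ddc93}
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   State: [8] Failed
   Last error: Timed out
"""

SAMPLE_PROVIDERS = """
Provider name: 'Microsoft Software Shadow Copy provider 1.0'
   Provider type: System
   Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
   Version: 1.0.0.7

Provider name: 'Example Backup Provider'
   Provider type: Software
"""

if __name__ == "__main__":
    print("Writers to investigate:", unhealthy_writers(SAMPLE_WRITERS))
    print("Providers to review:", other_providers(SAMPLE_PROVIDERS))
```

To use this against a live guest, capture the output of vssadmin list writers and vssadmin list providers from an administrator command prompt and pass the text to the two functions.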

Notes:

  • In Windows 2000, XP, and 2003, you can reinstall VMware Tools in custom mode and choose not to install VSS. This results in the SYNC driver being used.
  • The Distributed Transaction Coordinator service must be running while installing VMware Tools. Otherwise, VSS fails to quiesce Windows 2008 R2.
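
The service expectations from step 2 can also be kept as a small lookup table and compared against what the Services console shows. This is only a sketch: the table is transcribed from the "While Idle" and "During the backup" lists above, "not running" is written as "Stopped", None stands for "may or may not be started", and the function name and the shape of the observed data are assumptions made for the example.

```python
# Expected (state, startup type) per service, transcribed from step 2.
# A state of None means the service may or may not be started.
EXPECTED = {
    "idle": {
        "COM+ System Application": ("Started", "Manual"),
        "COM+ Event System": ("Started", "Automatic"),
        "Volume Shadow Copy": ("Stopped", "Manual"),
        "Microsoft Software Shadow Copy Provider": (None, "Manual"),
        "VMware Snapshot Provider": ("Stopped", "Manual"),
    },
    "backup": {
        "COM+ System Application": ("Started", "Manual"),
        "COM+ Event System": ("Started", "Automatic"),
        "Volume Shadow Copy": (None, "Manual"),
        "Microsoft Software Shadow Copy Provider": (None, "Manual"),
        "VMware Snapshot Provider": ("Started", "Manual"),
        "Virtual Disk": ("Started", "Manual"),
    },
}

def mismatches(phase, observed):
    """Compare observed {service: (state, startup)} with the table for the given phase."""
    problems = []
    for name, (state, startup) in EXPECTED[phase].items():
        if name not in observed:
            problems.append(f"{name}: not reported")
            continue
        seen_state, seen_startup = observed[name]
        if state is not None and seen_state != state:
            problems.append(f"{name}: state is {seen_state}, expected {state}")
        if seen_startup != startup:
            problems.append(f"{name}: startup type is {seen_startup}, expected {startup}")
    return problems

if __name__ == "__main__":
    # Hypothetical observation of an idle guest, typed in by hand.
    seen = {
        "COM+ System Application": ("Started", "Manual"),
        "COM+ Event System": ("Started", "Automatic"),
        "Volume Shadow Copy": ("Started", "Manual"),
        "Microsoft Software Shadow Copy Provider": ("Stopped", "Manual"),
        "VMware Snapshot Provider": ("Stopped", "Manual"),
    }
    for line in mismatches("idle", seen):
        print(line)
```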

written by Bosse

Apr 11

INCORRECT MESSAGE ANR3246W ISSUED IN THE ACTIVITY LOG


Starting with Tivoli Storage Manager Server v7.1, the following
new message has been introduced:

ANR3246W Process skipped files on
volume because the files have been deleted.

It has been identified that the message should be an informational
message instead of a warning message.

Tivoli Storage Manager Versions Affected: Tivoli Storage
Manager Server 7.1 on all Platforms.

written by Bosse

Apr 11

Considerations about DIRMC processing


Question
Is a specific storage pool required to keep directory information on disk?

Answer
The DIRMC option had a big impact on restore performance in earlier versions of Tivoli Storage Manager (pre V3), where directories were restored first, followed by the files. It was therefore an advantage to keep directory information permanently on disk, and the best-practice design was to create small storage pools specifically for directories. The DIRMC option in the client options file was used to bind directories to a management class pointing to this storage pool.

Restore processing has changed since then: during a restore, directories are first created with default attributes, and the correct attributes and ACL information are applied once the data is read from the media. The original reason to keep directories on disk therefore no longer applies.

Nevertheless, the DIRMC option is still useful. If you do not specify this option to associate a management class with directories, the client, during backup, uses the management class in the active policy set of your policy domain with the longest retention period, which could point to a storage pool on tape. This might result in unwanted mount requests during backup/restore.

Having DIRMC point to a disk storage pool that is separate from the one used for the backup objects might cause the symptom described in technote swg21247892 (Unexpected high number of EndTxn processing during back up; see the Related information section below).

Related information

Unexpected high number of EndTxn processing during back up

written by Bosse

Apr 11

Unexpected high number of EndTxn processing during back up


Problem(Abstract)
This technote describes a scenario in which Tivoli Storage Manager client backup performance degrades because an unexpectedly high number of transactions is processed.

Resolving the problem
In this scenario, the following settings have been defined:

Server -> TxnGroupMax 256

Client -> TXNBYTELIMIT 25600

The following statistics have been reported by a client instrumentation trace:

Server Version 5, Release 3, Level 3.0
Data compression forced off by the server
Server date/time: 29-06-2006 12:00:05 Last access: 29-06-2006 09:15:07

Current command: Incremental
Total number of objects inspected: 74.819
Total number of objects backed up: 74.814
Total number of objects updated: 0
Total number of journal objects: 79.687
Total number of objects rebound: 0
Total number of objects deleted: 0
Total number of objects expired: 4.265
Total number of objects failed: 0
Total number of bytes transferred: 13,96 GB
LanFree data bytes: 0 B
Server-Free data bytes: 0 B
Data transfer time: 5.295,51 sec
Network data transfer rate: 2.766,04 KB/sec
Aggregate data transfer rate: 699,58 KB/sec
Total number of bytes pre-compress: 14.996.313.174
Total number of bytes post-compress: 14.996.313.174
Objects compressed by: 0%
Elapsed processing time: 05:48:57
Average file size: 195,74 KB

-----------------------------------------------------------

Detailed Instrumentation statistics for
Thread: 6328 Elapsed time 20937,580 sec

Section         Actual (sec)  Average(msec)  Frequency used
-----------------------------------------------------------
Process Dirs           0,000            0,0               0
Solve Tree             0,000            0,0               0
Compute                1,657            0,0          270984
BeginTxn Verb          0,141            0,0           11167
Transaction           48,856            4,4           11167
File I/O             299,004            1,0          307988
Compression            0,000            0,0               0
Encryption             0,000            0,0               0
CRC                    0,000            0,0               0
Delta                  0,000            0,0               0
Data Verb           2726,280           10,1          270984
Confirm Verb           0,000            0,0               1
EndTxn Verb        17755,910         1590,0           11167
Sleep                  0,000            0,0               0
Thread Wait           85,075         2430,7              35
Other                 20,657            0,0               0

-----------------------------------------------------------

Detailed Instrumentation statistics for
Thread: 6508 Elapsed time 20908,581 sec

Section         Actual (sec)  Average(msec)  Frequency used
-----------------------------------------------------------
Process Dirs           0,000            0,0               0
Solve Tree             0,000            0,0               0
Compute                1,633            0,0          250649
BeginTxn Verb          0,076            0,0           11370
Transaction           49,845            4,4           11370
File I/O             304,188            1,1          288521
Compression            0,000            0,0               0
Encryption             0,000            0,0               0
CRC                    0,000            0,0               0
Delta                  0,000            0,0               0
Data Verb           2563,864           10,2          250649
Confirm Verb           0,110          110,0               1
EndTxn Verb        17901,403         1574,4           11370
Sleep                  0,000            0,0               0
Thread Wait           72,014         1636,7              44
Other                 15,448            0,0               0

-----------------------------------------------------------

From the client instrumentation detail statistics above:

1) Total number of files backed up: 74814
2) Number of EndTxn verbs:
thread 6328: 11167
thread 6508: 11370
Total EndTxn verbs: 22537, or an average of 3.3 files per transaction.

With TXNBYTELIMIT=25600 KB, TxnGroupMax=256, and an average file size of 195,74 KB, these statistics show that the Tivoli Storage Manager client is requesting far too many database commits (EndTxn verbs) from the Tivoli Storage Manager server.

The average number of files per transaction should be closer to 256 than to the 3.3 calculated above.
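
The arithmetic behind these figures can be reproduced in a few lines. This is only a sketch built on the numbers reported in the trace above. The variable names are made up for the example, and the ceiling calculation assumes that a transaction is closed as soon as either TxnGroupMax objects or TXNBYTELIMIT kilobytes have been sent, which is a simplification.

```python
# Figures copied from the instrumentation trace above.
files_backed_up = 74814
end_txns = {6328: 11167, 6508: 11370}                 # EndTxn verbs per thread
end_txn_seconds = {6328: 17755.910, 6508: 17901.403}  # time spent in EndTxn
elapsed_seconds = {6328: 20937.580, 6508: 20908.581}  # thread elapsed time

txn_group_max = 256        # server option TxnGroupMax
txn_byte_limit_kb = 25600  # client option TXNBYTELIMIT, in KB
avg_file_kb = 195.74       # average file size from the trace

total_end_txns = sum(end_txns.values())
observed_files_per_txn = files_backed_up / total_end_txns

# Assumption: a transaction ends when either limit is reached first.
files_by_byte_limit = txn_byte_limit_kb / avg_file_kb
ceiling_files_per_txn = min(txn_group_max, files_by_byte_limit)

# Share of each thread's elapsed time spent waiting on EndTxn.
end_txn_share = {t: end_txn_seconds[t] / elapsed_seconds[t] for t in end_txns}

if __name__ == "__main__":
    print(f"total EndTxn verbs:        {total_end_txns}")
    print(f"observed files per txn:    {observed_files_per_txn:.1f}")
    print(f"byte-limit files per txn:  {files_by_byte_limit:.1f}")
    print(f"estimated ceiling per txn: {ceiling_files_per_txn:.1f}")
    for thread, share in end_txn_share.items():
        print(f"thread {thread}: {share:.1%} of elapsed time in EndTxn")
```

Under that assumption the byte limit, rather than TxnGroupMax, would cap a transaction at roughly 130 files of average size, which is still far above the observed 3.3. The last loop shows that both threads spent most of their elapsed time in EndTxn processing.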

Further investigation showed that, in this setup, different management classes with different copy destinations were configured for files and directories.

In this case, the directories on the file servers changed just as often as the files, which forced the Tivoli Storage Manager client to request an EndTxn every time the backup encountered a file or directory destined for a different management class (MC).

The change implemented in this case was to use the same management class for files and directories. Use the example above as a starting point to check whether you are facing this condition when you run into performance problems during backup.

For more information on client instrumentation traces see 15.1.2 Tivoli Storage Manager client performance tracing in the IBM Tivoli Storage Manager Implementation Guide:
http://www.redbooks.ibm.com/redbooks/pdfs/sg245416.pdf

written by Bosse

Apr 09

Tivoli Storage Manager server upgrade from V6 to V7 fails with ANRI1043E


Problem(Abstract)

A Tivoli Storage Manager server is upgraded from V6 to V7. The upgrade fails with the following error: ANRI1043E: An error occurred while dropping the DB2 instances.

Symptom

The following errors are logged:

=====> IBM Installation Manager> Error
ERROR: Error during "install" phase:
Details: ANRI1043E: An error occurred while dropping the DB2 instances. To review errors and correct any issues, review the log files in /var/ibm/InstallationManager/logs/native.

 

Cause

An orphaned DB2 instance causes the db2idrop command to fail during the upgrade.

Environment

Tivoli Storage Manager Server on Unix/Linux

 

Diagnosing the problem

1. Run the db2ilist command to verify the DB2 instances that are configured on the system. For example:
# /opt/tivoli/tsm/db2/instance/db2ilist
tsmi
tsmiold
In this case, it shows two instances, tsmi and tsmiold. The tsmiold instance is no longer in use.
2. Run the db2greg command to verify the DB2 registry. For example:
/opt/tivoli/tsm/db2/bin/db2greg -dump show
V,DB2GPRF,DB2SYSTEM,xvotsmsrv01,/opt/tivoli/tsm/db2,
I,DB2,9.7.0.6,tsmi,/home/tsmi/sqllib,,1,0,/opt/tivoli/tsm/db2,,
V,DB2GPRF,DB2INSTDEF,tsmi,/opt/tivoli/tsm/db2,
I,DB2,9.7.0.4,tsmiold,/home/tsmiv/sqllib,,1,0,/opt/tivoli/tsm/db2,,
V,DB2GPRF,DB2FCMCOMM,TCPIP4,/opt/tivoli/tsm/db2,
S,DB2,9.7.0.6,/opt/tivoli/tsm/db2,,,6,0,,1359560348,0
Again, in this case, the registry shows references to the tsmi and tsmiold instances.
3. Run the db2idrop command to remove the old instance (tsmiold). The command fails with the following error:
DBI1081E The file or directory /home/tsmiold/sqllib/bin is missing.
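
The diagnosis above amounts to finding instance records in the DB2 registry whose sqllib directory no longer exists. A minimal sketch of that check follows. It assumes the field layout seen in the sample dump (record type first, instance name fourth, sqllib path fifth), which is inferred from this example rather than taken from DB2 documentation. The function names are hypothetical.

```python
import os.path
import posixpath

def instance_records(dump_text):
    """Yield (instance name, sqllib path) for each 'I,DB2,...' record in a db2greg dump."""
    for line in dump_text.splitlines():
        fields = line.strip().split(",")
        # Assumed layout: I,DB2,<version>,<instance>,<sqllib path>,...
        if len(fields) >= 5 and fields[0] == "I" and fields[1] == "DB2":
            yield fields[3], fields[4]

def orphan_candidates(dump_text, path_exists=os.path.isdir):
    """Instances whose <sqllib>/bin directory is missing, as in the DBI1081E error."""
    return [name for name, sqllib in instance_records(dump_text)
            if not path_exists(posixpath.join(sqllib, "bin"))]

# Records copied from the example dump above.
SAMPLE_DUMP = """\
V,DB2GPRF,DB2SYSTEM,xvotsmsrv01,/opt/tivoli/tsm/db2,
I,DB2,9.7.0.6,tsmi,/home/tsmi/sqllib,,1,0,/opt/tivoli/tsm/db2,,
V,DB2GPRF,DB2INSTDEF,tsmi,/opt/tivoli/tsm/db2,
I,DB2,9.7.0.4,tsmiold,/home/tsmiv/sqllib,,1,0,/opt/tivoli/tsm/db2,,
V,DB2GPRF,DB2FCMCOMM,TCPIP4,/opt/tivoli/tsm/db2,
S,DB2,9.7.0.6,/opt/tivoli/tsm/db2,,,6,0,,1359560348,0
"""

if __name__ == "__main__":
    # On a real server, every instance listed here deserves a closer look
    # before the upgrade is retried.
    print("Instances without sqllib/bin:", orphan_candidates(SAMPLE_DUMP))
```

The path_exists parameter is there so the check can be tried on a machine other than the server, by passing a function that describes which directories exist.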

Resolving the problem

Remove the orphaned DB2 registry reference to the old instance (tsmiold) with the following command:

/opt/tivoli/tsm/db2/bin/db2greg -delinstrec instancename=tsmiold

Retry the upgrade once the orphaned instance is removed.

 

written by Bosse