Sep 13

IC65186: MIGRATION INITIATED REPAIR STGVOL MIGHT FAIL WITH ANR4759E

If, during Tivoli Storage Manager storage pool migration, a volume
is identified for processing by a REPAIR STGVOL process, the
repair might fail with ANR4759E.

The activity log will report a sequence like:
ANR1100I Migration started for volume <volume>, storage pool
<poolid> (process number <number>). (SESSION: <migration
sessionid>, PROCESS: <processid>)
ANR1102I Removable volume <volume> is required for migration.
(SESSION: <migration sessionid>, PROCESS: <processid>)
ANR0984I Process 747 for REPAIR STGVOL started in the BACKGROUND
at <time>. (SESSION: <repair sessionid>, PROCESS: <processid>)
ANR4752I REPAIR STGVOL process 747 started for 1 volumes.
(SESSION: <repair sessionid>, PROCESS: <processid>)
ANR1101I Migration ended for volume <volume>. (SESSION:
<migration sessionid>, PROCESS: <processid>)
..
ANR4759E REPAIR STGVOL failed to process <volume>. (SESSION:
<repair sessionid>, PROCESS: <processid>)

The symptom is caused by migration leasing a volume for
processing, and the lease not being freed before the REPAIR
command is invoked.
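
The sequence in the activity log excerpt above can be spotted mechanically in an exported log. The following is a minimal sketch, assuming the log has been saved as plain text with each message on one logical line (wrapped lines already joined) and that volume names contain no spaces. The function name find_repair_lease_conflicts is hypothetical and not part of the product.

```python
import re

# Message patterns taken from the activity log excerpt in this APAR.
MIGRATION_START = re.compile(r"ANR1100I Migration started for volume (\S+?),")
MIGRATION_END = re.compile(r"ANR1101I Migration ended for volume (\S+?)\.(?:\s|$)")
REPAIR_FAILED = re.compile(r"ANR4759E REPAIR STGVOL failed to process (\S+?)\.(?:\s|$)")


def find_repair_lease_conflicts(lines):
    """Return volumes whose REPAIR STGVOL failure followed a migration of the same volume."""
    migrated = set()  # volumes named in ANR1100I or ANR1101I so far
    suspects = []
    for line in lines:
        for pattern in (MIGRATION_START, MIGRATION_END):
            m = pattern.search(line)
            if m:
                migrated.add(m.group(1))
        m = REPAIR_FAILED.search(line)
        if m and m.group(1) in migrated:
            suspects.append(m.group(1))
    return suspects
```

A hit only suggests the lease conflict described here; the REPAIRVOL trace or SHOW SSL output is still needed to confirm it.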

Tivoli Storage Manager Versions Affected: all V6 servers
Customer/L2 Diagnostics (If Applicable)
A trace with the REPAIRVOL trace flag shows the following
messages:
<time> [22469][ssalloc.c][2574][DoRepairStgVols]:DoRepairStgVols:
Failure leasing volume.
<time> [22469][ssalloc.c][2933][ReportVol]:ReportVol: Volume
<volume> report state 2.
<time> [22469][output.c][6653][PutConsoleMsg]:ANR4759E REPAIR
STGVOL failed to process <volume>.~
<time> [22469][ssalloc.c][2692][DoRepairStgVols]:DoRepairStgVols:
Exiting with rc 155.

RC 155 translates to SSRC_VOLUME_ALREADY_LEASED

SHOW SSL will report:
Volume <volume>(<volumeid>) -> Mode=Access, PoolId=15,
Strategy=30, SessId=<migration sessionid>, Tag=4

Initial Impact: Low
Additional Keywords: REPAIR STGVOL TSM ZZ61

Reclamation can also show this same condition.

Platform/Version affected:

Tivoli Storage Manager Server V55 V61

Local fix

Manually run
REPAIR STGVOLS VOLName=<volumename>
while no lease is active for the volume (check with SHOW SSL).
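
The lease check can be scripted against captured SHOW SSL output. The following is a minimal sketch, assuming the output has the "Volume <volume>(<volumeid>) -> Mode=..., SessId=..." shape quoted earlier in this APAR and that the volume id and session id are numeric. SHOW SSL is a diagnostic command, so the real layout may differ between server levels. The names leased_volumes and safe_to_repair are hypothetical.

```python
import re

# Entry shape as quoted in the APAR text: Volume NAME(ID) -> Mode=..., ..., SessId=N, ...
# The arrow may appear as "->" or, after web reformatting, as an en dash plus ">".
LEASE_ENTRY = re.compile(
    r"Volume\s+(?P<name>\S+?)\((?P<volid>\d+)\)\s*(?:->|\u2013>)\s*"
    r"Mode=(?P<mode>\w+).*?SessId=(?P<sess>\d+)",
    re.DOTALL,  # entries are wrapped across lines in the captured output
)


def leased_volumes(show_ssl_output):
    """Map each leased volume name to (mode, session id) found in SHOW SSL text."""
    return {
        m.group("name"): (m.group("mode"), int(m.group("sess")))
        for m in LEASE_ENTRY.finditer(show_ssl_output)
    }


def safe_to_repair(volume, show_ssl_output):
    """True when the given volume has no lease entry in the captured output."""
    return volume not in leased_volumes(show_ssl_output)
```

Because a lease can be taken between the check and the repair, a failed repair should simply be retried once the migration or reclamation session has ended.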

written by Bosse

Sep 13

IC37275: CLEANUP UTILITY TO REMOVE ORPHANED ENTRIES INTRODUCED BY IC36975 NEEDED

Error description

A cleanup utility is needed to remove orphaned TSM server DB
entries introduced by the defect addressed in APAR IC36975.

The TSM server has been enhanced to provide a utility that
searches for and removes unreferenced database entries that
may have been caused by using simultaneous write prior to the
fix for IC36975.

The correction utility can be invoked by issuing
"REPAIR STGVOL".  This command will start a background
process to examine the database and determine all possible
volumes that may have been affected by this error.  For
each candidate volume, the process will evaluate the
database references for that volume and if any are
determined to be unreferenced, these entries will be
deleted.

This correction utility may require a long time to process
since it may need to evaluate a large number of database
entries.  Because of the time it may take to process, the
process may be cancelled using the "CANCEL PROCESS" command.
If the process is cancelled, the next time "REPAIR STGVOL" is
run, it will begin processing with the volume that it was
examining at the time it was cancelled.

It is also possible to process a specific volume with this
utility.  To process a specific volume, issue
"REPAIR STGVOL VOLNAME=volume_name" where volume_name is the
name of the volume to be processed.

A number of new messages have been added for this correction
utility.

For "QUERY PROCESS", the following will be reported while
the process is running:

"Processing volume VOLUME_NAME, examined
NUMBER_OF_VOLUMES_DONE of TOTAL_NUMBER_OF_VOLUMES,
NUMBER_OF_VOLUMES_REPAIRED required database repairs,
NUMBER_OF_VOLUMES_FAILED volumes that failed to process."
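
For monitoring, the status text above can be turned into numbers. The following is a minimal sketch, assuming the status string has been captured from QUERY PROCESS as text and that the volume name was not split in the middle by line wrapping. The function name parse_repair_status is hypothetical.

```python
import re

# Status layout as documented above for a running REPAIR STGVOL process.
STATUS = re.compile(
    r"Processing volume (?P<volume>\S+), examined (?P<done>\d+) of (?P<total>\d+), "
    r"(?P<repaired>\d+) required database repairs, "
    r"(?P<failed>\d+) volumes that failed to process\."
)


def parse_repair_status(text):
    """Parse the REPAIR STGVOL status text from QUERY PROCESS into a dict, or None."""
    # The captured output may be wrapped, so collapse all whitespace first.
    flat = " ".join(text.split())
    m = STATUS.search(flat)
    if m is None:
        return None
    result = {k: int(v) for k, v in m.groupdict().items() if k != "volume"}
    result["volume"] = m.group("volume")
    return result
```

A status such as "Cancel pending." does not match the pattern and returns None.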

For "QUERY PROCESS", the following will be reported if the
process is cancelled:

"Cancel pending."

When the process starts, the following message will
be issued:

"ANR4752I REPAIR STGVOL process PROCESS_NUMBER started for
NUMBER_OF_VOLUMES volumes."

Explanation: The REPAIR STGVOL process has been started as
the reported process number.  This will evaluate and if
needed repair the indicated number of volumes.

System Action: None.

User Response:  This process may be monitored using the
QUERY PROCESS command.  If this process needs to be
cancelled, issue the CANCEL PROCESS command.

The other messages that may be issued are:

"ANR4753E REPAIR STGVOL process PROCESS_NUMBER ended,
processed VOLUMES_PROCESSED of TOTAL_VOLUMES total
volumes with REPAIRED_VOLUMES repaired and
FAILED_VOLUMES failures."

Explanation:  The REPAIR STGVOL process has ended.  It
processed VOLUMES_PROCESSED out of TOTAL_VOLUMES total
volumes to be processed.  The REPAIRED_VOLUMES indicates
the number of volumes that needed database repairs and
were repaired.  The FAILED_VOLUMES are volumes that either
failed during evaluation or during database repair if it
was needed.

System Action: None.

User Response:  If FAILED_VOLUMES is greater than zero, review
the activity log for more information about the failure.

"ANR4754I REPAIR STGVOL process PROCESS_NUMBER ended,
processed VOLUMES_PROCESSED of TOTAL_VOLUMES total volumes
with REPAIRED_VOLUMES repaired."

Explanation:  The REPAIR STGVOL process has ended.  It
processed VOLUMES_PROCESSED out of TOTAL_VOLUMES total
volumes to be processed.  The REPAIRED_VOLUMES
indicates the number of volumes that needed database
repairs and were repaired.

System Action:  None.

User Response:  None.

"ANR4755W REPAIR STGVOL process PROCESS_NUMBER ended,
processed VOLUMES_PROCESSED of TOTAL_VOLUMES total volumes
with REPAIRED_VOLUMES repaired."

Explanation:  The REPAIR STGVOL process has ended.  It
processed VOLUMES_PROCESSED out of TOTAL_VOLUMES total
volumes to be processed.  The REPAIRED_VOLUMES indicates
the number of volumes that needed database repairs and
were repaired.

System Action:  None.

User Response:  If VOLUMES_PROCESSED is less than
TOTAL_VOLUMES, the process may have been cancelled
before it could process all the needed volumes.
Review the activity log to determine if this is the case
and re-issue the command to complete this processing.
Alternatively, if volumes were reclaimed or deleted while
this process was running, there may have been fewer actual
volumes to process than was calculated when the process
started.  If this is the case, no further action is necessary.
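
The three end messages (ANR4753E, ANR4754I, ANR4755W) share one layout, so the follow-up decision described in their User Response sections can be sketched in a few lines. This is only an illustration under the assumption that the message text matches the templates quoted above. The function name summarize_end_message and the wording of the returned reasons are hypothetical.

```python
import re

# Common layout of ANR4753E, ANR4754I and ANR4755W as quoted in this APAR.
END_MESSAGE = re.compile(
    r"(?P<msgid>ANR475[345][EIW]) REPAIR STGVOL process (?P<process>\d+) ended, "
    r"processed (?P<processed>\d+) of (?P<total>\d+) total volumes "
    r"with (?P<repaired>\d+) repaired(?: and (?P<failed>\d+) failures)?\."
)


def summarize_end_message(text):
    """Return (msgid, needs_follow_up, reason) for a REPAIR STGVOL end message."""
    m = END_MESSAGE.search(" ".join(text.split()))  # undo line wrapping first
    if m is None:
        raise ValueError("not a REPAIR STGVOL end message")
    processed, total = int(m.group("processed")), int(m.group("total"))
    failed = int(m.group("failed") or 0)
    if failed:
        return m.group("msgid"), True, "review the activity log for the failed volumes"
    if processed < total:
        return (m.group("msgid"), True,
                "check whether the process was cancelled or volumes were reclaimed or deleted")
    return m.group("msgid"), False, "all volumes processed"
```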

"ANR4756I REPAIR STGVOL selected volume VOLUME_NAME."

Explanation:  The REPAIR STGVOL selected VOLUME_NAME volume
to process.  This volume will be evaluated for database
reference errors and if any are encountered, they will
be corrected.

System Action: None.

User Response: None.

"ANR4757I REPAIR STGVOL finished evaluating
volume VOLUME_NAME, no repair was needed."

Explanation: The REPAIR STGVOL finished evaluating
VOLUME_NAME volume.  No repair actions were taken for the
database because there were no errors detected.

System Action: None.

User Response: None.

"ANR4758W REPAIR STGVOL repaired volume VOLUME_NAME,
database reference errors were found and corrected."

Explanation:  The REPAIR STGVOL evaluated VOLUME_NAME
volume and determined that repair actions were needed.
The necessary repairs were done successfully.

System Action: None.

User Response: None.

"ANR4759E REPAIR STGVOL failed to process VOLUME_NAME."

Explanation:  The REPAIR STGVOL failed for VOLUME_NAME
volume.  This process either failed while evaluating this
volume or it had determined that repair actions were needed
and it failed performing those actions.

System Action: None.

User Response: Review the activity log for an indication of
why processing for this volume failed.  Try issuing
REPAIR STGVOL VOLNAME=VOLUME_NAME to process this volume
again.  If this volume was deleted from the server, this
message can be ignored.

written by Bosse

Sep 13

IC65498: ANR9999D ERRORS ARE LOGGED IN SERVER ACTIVITY LOG DURING A REPAIR STGVOL PROCESS

Error description

During a Tivoli Storage Manager server REPAIR STGVOL process, many
ANR9999D errors and a Context Report are logged in the server
activity log.

L2/Customer diagnostic:

The following messages are logged in the activity log for a
REPAIR STGVOL process:

ANR4752I REPAIR STGVOL process 12 started for 1 volumes.
ANR1101I Migration ended for volume
/home/tsminst1/filepool/0000022C.BFS.
ANR1100I Migration started for volume
/home/tsminst1/filepool/00000236.BFS, storage pool FILEPOOLB
(process number 10).
ANR1102I Removable volume /home/tsminst1/filepool/0000022E.BFS
is required for migration.
ANR1102I Removable volume /home/tsminst1/filepool/00000236.BFS
is required for migration.
ANR1176I Moving data for collocation set 1 of 3 on volume
/home/tsminst1/filepool/00000236.BFS.
ANR9999D_1440764637 FindOrAddVolAssignment(asaudit.c:2011)
Thread<560>:AsAuditEmptyVol not implemented.
ANR9999D Thread<560> issued message 9999 from:
ANR9999D Thread<560>  0x0000000000bf06d2 OutDiagToCons+0x0x142
ANR9999D Thread<560>  0x0000000000bf3254 outDiagfExt+0x0x194
ANR9999D Thread<560>  0x0000000000bb8b9a
AsCheckVolTables+0x0x58a
ANR9999D Thread<560>  0x0000000000bb186d AsRepairStgVol+0x0x2ad
ANR9999D Thread<560>  0x0000000000b61d5b
DoRepairStgVolThread+0x0x183b
ANR9999D Thread<560>  0x0000000000c6061b StartThread+0x0xcb
ANR9999D Thread<560>  0x000000361ee064a7 *UNKNOWN*
ANR9999D Thread<560>  0x000000361e2d3c2d *UNKNOWN*
(560) Context report
(560) Thread AfMigrVolumeThread (514) is a parent thread related
to: 560
(514) Generating TM Context Report: (struct=tmTxnDesc)
(slots=256)
(514) slot -> 12:
(514) Tsn=0:76812, Resurrected=False, InFlight=True,
Distributed=False
(514)  Participants=1, summaryVote=ReadOnly
    Participant DB: voteReceived=False, ackReceived=False
(514) slot -> 19:
(514) Tsn=0:76819, Resurrected=False, InFlight=True,
Distributed=False
(514)  Participants=3, summaryVote=ReadOnly
    Participant DB: voteReceived=False, ackReceived=False
    Participant BF: voteReceived=False, ackReceived=False
    Participant SS: voteReceived=False, ackReceived=False
(560)  ssSession 54(smSession 0) -> BufConfig=None, TransBufSize=0,
SplitBuf=False, WrCount=0, WrBufIsEmpty=False, WrBufIsFull=False,
SourceRc=0, SinkRc=0, Aux_1Created=False, Aux_1Begin=False,
Aux_1Idle=False, Aux_1Terminate=False, Aux_1Exit=False,
Aux_2Created=False, Aux_2Begin=False, Aux_2Idle=False,
Aux_2Terminate=False, Aux_2Exit=False
(560)     Leased Volumes:(560)
/home/tsminst1/filepool/0000022C.BFS(556)(Access)
(560)     Excluded VolIds:(560)   (none)
(560) Generating ssOpenSeg Context Report:
(560)  No storage service segments found.
(560) Generating BF Copy Control Context Report:
(560)  No global copy control blocks.
(560)   procNum=12, status=Processing volume
/home/tsminst1/filepool/0000022C.BFS, examined 0 of 1, 0 required
database repairs, 0 volumes that failed to process.
, cancelInProgress=False
(560)   descr=REPAIR STGVOL, name=REPAIR STGVOL, cancelled=False
(560) End Context report
ANR4757I REPAIR STGVOL finished evaluating volume
/home/tsminst1/filepool/0000022C.BFS, no repair was needed.
ANR0512I Process 10 opened input volume
/home/tsminst1/filepool/00000236.BFS.
ANR4754I REPAIR STGVOL process 12 ended, processed 1 of 1 total
volumes with 0 repaired.
ANR0987I Process 12 for REPAIR STGVOL running in the BACKGROUND
processed 1 items with a completion state of SUCCESS at 10:08:34
PM.
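
When reviewing a log like the one above, it can help to strip the ANR9999D diagnostics and the Context Report and keep only the REPAIR STGVOL messages (ANR4752 to ANR4759). The following is a minimal sketch, assuming the log is available as wrapped plain text in the layout shown here, where messages start with an ANR identifier and Context Report lines start with a thread number in parentheses. The function name repair_messages is hypothetical.

```python
import re

MSG_START = re.compile(r"^(ANR\d{4}[A-Z])")    # start of a server message
CONTEXT_LINE = re.compile(r"^\(\d+\)")         # "(560) ..." context report line
REPAIR_IDS = re.compile(r"^ANR475[2-9][EIW]$")  # REPAIR STGVOL message range


def repair_messages(lines):
    """Yield REPAIR STGVOL messages (ANR4752 to ANR4759) as single joined strings."""
    current_id, parts = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        start = MSG_START.match(line)
        if start or CONTEXT_LINE.match(line):
            # A new message or context report line closes the previous message.
            if current_id and REPAIR_IDS.match(current_id):
                yield " ".join(parts)
            current_id = start.group(1) if start else None
            parts = [line]
        else:
            parts.append(line)  # continuation of a wrapped message
    if current_id and REPAIR_IDS.match(current_id):
        yield " ".join(parts)
```

Applied to the excerpt above, this would leave the ANR4752I, ANR4757I and ANR4754I messages, which show that the repair itself completed despite the ANR9999D output.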

Platform/Version affected
Tivoli Storage Manager server 6.1.2 on all platforms and versions

written by Bosse