BUG REPORT: In NetBackup 7.1, ghost jobs may occur that cannot be deleted and that show a state of "waiting for retry" and an exit status of 50.
Article: TECH163191 | Created: 2011-06-24 | Updated: 2011-10-31 | Article URL: http://www.symantec.com/docs/TECH163191
On NetBackup 7.1 Windows media servers running multiplexed (MPX) backup jobs, try log messages for child MPX jobs are sent using the job ID of the first MPX job in the MPX group. This can result in ghost updates if the first MPX job in a group completes early and later MPX jobs continue to join its MPX group.
This behavior can be identified by reviewing the <install_path>\VERITAS\NetBackup\db\jobs\trylogs\<jobid>.t file on the master server. If the issue is occurring, the file contains repeated entries stating "is the host to backup data from", similar to the following:
LOG 1306842023 4 bpbrm 1031876 CLIENT_A is the host to backup data from
LOG 1306842169 4 bpbrm 1032796 CLIENT_A is the host to backup data from
LOG 1306842238 4 bpbrm 1031408 CLIENT_A is the host to backup data from
LOG 1306842291 4 bpbrm 1035848 CLIENT_A is the host to backup data from
LOG 1306842364 4 bpbrm 1035568 CLIENT_A is the host to backup data from
LOG 1306842746 4 bpbrm 1028044 CLIENT_A is the host to backup data from
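The check described above can be scripted. The following is a minimal sketch, not a Symantec tool: the field layout (LOG, epoch seconds, severity, process name, process ID, message) is an assumption inferred only from the sample entries above, and the function name find_repeated_host_messages is hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the sample entries only:
# LOG <epoch seconds> <severity> <process> <pid> <message>
LOG_LINE = re.compile(r"^LOG (\d+) (\d+) (\S+) (\d+) (.*)$")
MARKER = "is the host to backup data from"


def find_repeated_host_messages(lines):
    """Return {client: [(epoch, pid), ...]} for every client whose marker
    message was written by more than one distinct bpbrm process ID."""
    seen = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # not a LOG entry in the assumed layout
        epoch, _severity, process, pid, message = match.groups()
        if process == "bpbrm" and message.endswith(MARKER):
            client = message[: -len(MARKER)].strip()
            seen[client].append((int(epoch), int(pid)))
    # Keep only clients reported by several different bpbrm processes.
    return {
        client: hits
        for client, hits in seen.items()
        if len({pid for _, pid in hits}) > 1
    }
```

Passing the lines of a <jobid>.t file to this function returns an entry for a client only when several bpbrm processes have written the message into the same try log, which is the pattern shown in the sample.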
7.1 Windows Media servers running MPX jobs.
This behavior can be confirmed by reviewing the nbjm, bpjobd and bpbrm log files.
The bpbrm log file shows the first backup job in the MPX group starting. Note that the contents of these messages have been edited to make them easier to read. The example uses backupid CLIENT_A_1306796447, jobid 164429, and job groupid 164344:
The first backup job starts under pid 931908.931912:
19:00:47.711 [931908.931912] <2> logparams: -backup -S MASTER_SERVER -c CLIENT_A -b CLIENT_A_1306796447 -jobid 164429 -jobgrpid 164344
All subsequent jobs reference the first job's backupid (CLIENT_A_1306796447), jobid (164429) and job groupid (164344) in their backup start log messages.
The next backup starts; its backup id is CLIENT_C_1306796454, jobid is 164440, and job groupid is 164339:
19:01:03.436 [932444.932840] <2> logparams: -backup -S MASTER_SERVER -c CLIENT_A -b CLIENT_NAME_A_1306796447 -jobid 164429 -jobgrpid 164344 -cl POLICY_NAME -b CLIENT_C_1306796454 -jobid 164440 -jobgrpid 164339
The next job starts; its backup id is CLIENT_D_1306796490, jobid is 164449, and job groupid is 164386:
19:01:37.008 [934132.934136] <2> logparams: -backup -S MASTER_SERVER -c CLIENT_NAME_A -b CLIENT_A_1306796447 -jobid 164429 -jobgrpid 164344 -cl POLICY_NAME -b CLIENT_D_1306796490 -jobid 164449 -jobgrpid 164386
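To spot this pattern across a large bpbrm log, the logparams lines can be parsed for their "-b ... -jobid ... -jobgrpid ..." groups. The sketch below assumes that the first group on a line is the inherited reference to the first MPX job and that the last group belongs to the job actually starting, as in the edited examples above. The helper names are hypothetical.

```python
import re

# One "-b <backupid> -jobid <n> -jobgrpid <n>" group from a logparams line.
# The lookbehind keeps "-b" from matching inside another token.
TRIPLE = re.compile(r"(?<!\S)-b (\S+) -jobid (\d+) -jobgrpid (\d+)")


def parse_logparams(line):
    """Return a list of (backupid, jobid, jobgrpid) tuples found in the line."""
    return [(b, int(j), int(g)) for b, j, g in TRIPLE.findall(line)]


def group_by_leading_jobid(lines):
    """Map each leading (first-listed) jobid to the jobids of the later jobs
    whose logparams lines start with a reference to it."""
    shared = {}
    for line in lines:
        triples = parse_logparams(line)
        if len(triples) < 2:
            continue  # a job that references only itself
        lead_jobid = triples[0][1]
        own_jobid = triples[-1][1]
        shared.setdefault(lead_jobid, []).append(own_jobid)
    return shared
```

For the three logparams lines shown above, the result maps jobid 164429 to the later jobs 164440 and 164449.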
Periodically, the bpjobd process initiates forgotten job cleanup. All active job IDs within bpjobd are compared with the active jobs in nbjm. If a job ID is not known to nbjm as an active job, bpjobd cleans up the job.
This process triggers cleanup of jobid 164429:
[JobManager_i::doForgottenJobCleanup ] job has been forgotten, perform cleanup, jobid=164429
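Conceptually, the comparison described above is a set difference. The following sketch only illustrates that idea; it is not the bpjobd or nbjm implementation, and the function name and inputs are hypothetical.

```python
def forgotten_jobs(bpjobd_active, nbjm_active):
    """Return the job IDs that bpjobd still treats as active but that
    nbjm no longer knows as active, in ascending order."""
    return sorted(set(bpjobd_active) - set(nbjm_active))
```

With jobid 164429 still listed in bpjobd after its backup has completed, and nbjm tracking only 164440 and 164449, the function returns [164429], the job selected for cleanup in the message above.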
After the job is removed by the forgotten job cleanup process, its try log file no longer exists in the <install_path>\VERITAS\NetBackup\db\jobs\trylogs directory.
Because all the backup jobs associated with the MPX group are started with a reference to jobid 164429, subsequent updates sent by bpbrm cause bpjobd to enter "create_new_job" for this 'old' jobid:
07:40:23.130 [536392.535680] <2> create_new_job: Allocating new Active job (164429)
07:40:23.130 [536392.535680] <2> create_new_job: Adding provider(0) for job(164429) socket (980)
07:40:23.130 [536392.535680] <2> process_active_job: Begin to append (JOBTRYFILE) job (164429)
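The sequence above (cleanup, then a late update that re-creates the job) can be modelled with a toy job table. This is an illustrative model under stated assumptions only: ToyJobTable and its methods are hypothetical and do not reflect how bpjobd is actually written.

```python
class ToyJobTable:
    """Toy stand-in for an active-job table keyed by jobid."""

    def __init__(self):
        self.active = {}  # jobid -> list of try log lines

    def forget(self, jobid):
        """Model forgotten job cleanup: drop the job and its try log."""
        self.active.pop(jobid, None)

    def process_update(self, jobid, line):
        """Append a try log line; allocate the job first if it is unknown.
        Returns True when a new (ghost) entry had to be created."""
        created = jobid not in self.active
        if created:
            self.active[jobid] = []
        self.active[jobid].append(line)
        return created
```

In this model, calling forget(164429) and then process_update(164429, ...) returns True: the update addressed to the old jobid brings the entry back, which mirrors the "Allocating new Active job (164429)" message.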
The result is that updates are written to the try log file of a job that finished long ago:
07:40:23.130 [536392.535680] <2> process_active_job: LOG 1306842023 4 bpbrm 1031876 CLIENT_A is the host to backup data from
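How long after the original job started did this update arrive? The sketch below works it out, assuming that the numeric suffix of the backup id and the second field of the LOG entry are both Unix epoch seconds. That reading is an inference, although it agrees with the wall-clock timestamps in the excerpts (19:00:47 for the job start, 07:40:23 for the update). The function names are hypothetical.

```python
def epoch_from_backup_id(backup_id):
    """Take the numeric suffix after the last underscore as epoch seconds
    (assumption based on the backup ids shown in this article)."""
    return int(backup_id.rsplit("_", 1)[1])


def elapsed_hms(backup_id, log_epoch):
    """Return (hours, minutes, seconds) between backup start and a LOG entry."""
    elapsed = log_epoch - epoch_from_backup_id(backup_id)
    hours, remainder = divmod(elapsed, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


print(elapsed_hms("CLIENT_A_1306796447", 1306842023))  # (12, 39, 36)
```

Under these assumptions, the update was appended 12 hours, 39 minutes and 36 seconds after the first job in the MPX group started.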
A binary is available for 7.1 media servers under ET 2399149.
Symantec Corporation has acknowledged that the above mentioned issue (Etrack 2399149) is present in the current version(s) of the product(s) mentioned at the end of this article. Symantec Corporation is committed to product quality and satisfied customers.
This issue is tentatively scheduled to be addressed in the following release:
- NetBackup 220.127.116.11
As fixes are released, please visit the following link for download and readme information: http://www.symantec.com/enterprise/support/overview.jsp?pid=15143
Please note that Symantec Corporation reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests or introduces new risks to overall code stability. Symantec's plans are subject to change and any action taken by you based on the above information or your reliance upon the above information is made at your own risk.
ghost jobs in the activity monitor - appears to be related to bpbrm updates to bpjobd - ET2380076 binaries are installed.