
BPJOBD process temporary workaround, for automation

Created: 26 Apr 2013 • Updated: 23 May 2013 | 18 comments
This issue has been solved. See solution.

Hi Guys,

  I need some urgent help. The bpjobd process on my master server gets killed every now and then. I opened a case with Symantec, which didn't help: they recommended upgrading to 7.5, which has not been approved yet. I do have a more powerful server that can replace the current master, so a hardware upgrade is coming, but until then I need a temporary fix. I want bpjobd to restart on its own through a script in crontab that checks every 2 minutes whether bpjobd is running and, if it is not, starts it with (bpjobd &).

When I tested this script, it didn't work. When I ran it manually, the command prompt did not come back, which I think is the issue; when I pressed Ctrl+C twice, the script continued and started bpjobd. So I tried adding a trap command to issue a Ctrl+C from within the script, which didn't work, maybe because the script cannot get past the bpjobd & step. Please help.

Here is the script.

x=`ps -ef | grep -i bpjobd | grep -v "grep -i bpjobd" | wc -l`;
if [ $x -eq 0 ]
then
mail -s "Activity Monitor is hung, bpjobd started, please verify" email id's

/usr/openv/netbackup/bin/bpjobd &
trap "Ctrl + C" 2
trap "Ctrl + C" 2
fi
 

I am running NetBackup version 7.1.0.4.

Thanks

Sid


LucSkywalker1957's picture

I suspect you have more going on here, related to either your operating environment or your hardware. Check your drive space: if the volume where /usr/openv is located is more than 98% full, NetBackup will shut down to protect itself. Search your /var/adm/messages (case-insensitively) for "sense|fail|fatal|error" and see if you get anything from the system. NetBackup doesn't just randomly decide to stop working; there has to be a reason. Check /usr/openv/netbackup/logs/ for indications as to why it's behaving like this.
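
The two checks suggested above could be scripted roughly as follows. This is only a sketch: the helper names usage_pct and scan_log are made up for illustration, and the paths in the comments are the ones mentioned in the post (on Linux the system log is usually /var/log/messages instead).

```shell
#!/bin/sh
# usage_pct DIR: print the percentage of space used on the filesystem holding DIR.
# Relies on the POSIX "df -P" layout, where column 5 of line 2 is the capacity.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# scan_log FILE: print lines containing any of the suggested keywords, ignoring case.
scan_log() {
    grep -Ei 'sense|fail|fatal|error' "$1"
}

# Example calls (hypothetical, adjust paths to your system):
#   usage_pct /usr/openv
#   scan_log /var/adm/messages
```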

I can't believe Symantec would recommend that you upgrade to the next version while you're having this issue. They usually require you to resolve any conflicts first, to ensure the smoothest possible outcome.

Let me know how this works out for you.

Regards!

Sid1987's picture

Hi LucSkywalker1957,

 Trust me, I wasn't very happy to see that suggestion either, but it was the first one they made. I checked the disk space and the messages file and didn't find anything pointing to this issue. I know there must be a reason this is happening.

However, one thing that stands out is the memory usage.

Linux 2.6.18-274.el5 (shnbupr01)        04/26/2013

04:38:47 PM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
04:38:49 PM    524140  32305908     98.40    645452  28494984   2096584       560      0.03         0
04:38:51 PM    521388  32308660     98.41    645452  28494992   2096584       560      0.03         0
04:38:53 PM    524008  32306040     98.40    645460  28495084   2096584       560      0.03         0
04:38:55 PM    523636  32306412     98.41    645464  28495092   2096584       560      0.03         0
04:38:57 PM    523760  32306288     98.40    645476  28495096   2096584       560      0.03         0
Average:       523386  32306662     98.41    645461  28495050   2096584       560      0.03         0

Not sure if this is normal for a heavily loaded master server.
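
A note on reading these figures: in this sar -r layout, kbmemused and %memused include buffers and page cache, so a value near 98% does not by itself show memory pressure (swap usage in the sample is also close to zero). A rough sketch of the adjusted figure, using a hypothetical helper real_used_pct and the first sample row:

```shell
#!/bin/sh
# real_used_pct FREE USED BUFFERS CACHED (all in kB, as printed by sar -r):
# percentage of RAM in use once buffers and page cache are subtracted.
real_used_pct() {
    awk -v free="$1" -v used="$2" -v buf="$3" -v cache="$4" \
        'BEGIN { printf "%.2f\n", (used - buf - cache) * 100 / (free + used) }'
}

# Values from the 04:38:49 PM row above.
real_used_pct 524140 32305908 645452 28494984
```

For that row, (32305908 - 645452 - 28494984) kB is 3165472 kB out of 32830048 kB total, or roughly 9.6%.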

Can you help me with that workaround?

Thanks

Sid

mph999's picture

Sometimes an upgrade is recommended for various reasons.

I do not know the details of this issue (apart from above) but unless there are exceptional circumstances (eg, totally unsupported system, overloaded system which we can't magically fix or similar) I do not believe you should be left without a solution.

1.  Reopen call.

2.  Explain that an upgrade is not acceptable

3.  If no luck, raise issue with duty manager

Martin

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
Sid1987's picture

Thanks for the suggestion, Martin; I will definitely reopen the case. For the time being, however, I need some scripting help so that my team at least doesn't have to start bpjobd manually every now and then until the issue is fixed.

So could someone help me with the script?

 

Thanks

Sid

mph999's picture

Sure ... run this from cron every xx mins

 

#!/usr/bin/ksh
bpps -x |grep bpjobd >/dev/null 2>&1
if [[ $(echo $?) -ne 0 ]] then
nohup /usr/openv/netbackup/bin/bpjobd &
fi
 
 
For some reason bpjobd doesn't go into the background on its own, hence the nohup / &.
I'll hazard a guess that bpjobd is normally started by some other process, which could explain it.
 
Anyhow, give it a go and let me know if there are still issues.
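
For the cron side, a crontab line for the 2-minute interval mentioned in the original question might look like the output of this sketch. The helper cron_entry and the script location /usr/local/bin/check_bpjobd.sh are hypothetical; use wherever the script above was actually saved.

```shell
#!/bin/sh
# cron_entry SCRIPT MINUTES: print a crontab line that runs SCRIPT every MINUTES
# minutes and discards its output so that cron does not mail it.
cron_entry() {
    printf '*/%s * * * * %s >/dev/null 2>&1\n' "$2" "$1"
}

# Hypothetical script path; paste the printed line into "crontab -e".
cron_entry /usr/local/bin/check_bpjobd.sh 2
```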
Martin

 

 
Sid1987's picture

Hi Martin,

Thanks for the script. Could you explain a few things? When I applied my own logic and ran the command with nohup, it didn't work.

nohup: appending output to `nohup.out' (This is what it prints when I run your script, and the command prompt stays there. Is this normal, or will it affect the run through crontab?)

My script finds out if bpjobd is running or not and then it uses

nohup /usr/openv/netbackup/bin/bpjobd &

 

But it doesn't work. Can you explain, please?

 

Thanks

Sid

 

mph999's picture

It would appear to hang until you press Return.

I don't see this affecting it running from cron. Have you actually tried it?

It won't do any harm; worst case is, it doesn't work.

(Apologies, I haven't had a chance to test it from cron.)

If there are issues, the first thing I would try is to remove the 'nohup' and run it from cron.

If I can find a Linux box I will test it, but the above is the first thing I would do.
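
On the "appending output to nohup.out" message: nohup only falls back to nohup.out when its output is still attached to a terminal. A possible way to start a daemon with nothing tied to the terminal is sketched below; start_detached is a made-up helper name, and whether bpjobd behaves well with all three streams redirected is an assumption that has not been tested here.

```shell
#!/bin/sh
# start_detached CMD [ARGS...]: launch CMD in the background with stdin taken
# from /dev/null and all output discarded, so nothing waits on the terminal
# and nohup has no reason to create nohup.out.
start_detached() {
    nohup "$@" </dev/null >/dev/null 2>&1 &
}

# Example call (path as used in this thread):
#   start_detached /usr/openv/netbackup/bin/bpjobd
```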

 

Martin

 

 
Sid1987's picture

Hi Martin,

 Thanks for the reply.

I tried the script without nohup before I started this thread. It doesn't work from crontab; I tested it, and we still had to start bpjobd manually. What would you suggest then?

Thanks

Sid

mph999's picture

I'll try it on a server myself, see what's wrong with it.

 

 

 
Will Restore's picture

Add the full path to bpps in the posted script.  I don't see a problem with the bpjobd line.

 

Will Restore -- where there is a Will there is a way

mph999's picture

Having trouble with my Linux server (typical). I agree with WR; I just realised that cron doesn't have the usual PATH variables set, so everything needs to be given its full path ...

I changed the shell to bash, probably better as that is the default for Linux (I think).

 

#!/usr/bin/ksh
/usr/openv/netbackup/bin/bpps -x |grep bpjobd >/dev/null 2>&1
if [[ $(echo $?) -ne 0 ]] then
nohup /usr/openv/netbackup/bin/bpjobd &
fi

 

 
Sid1987's picture

Hi Martin,

Thanks for such a prompt response. I have added the full path to the script. However, I fail to understand how the script runs when I start it manually, even though it doesn't have the complete path.

Thanks

Sid

mph999's picture

Firstly, it looks like it needs to be #!/bin/ksh; it doesn't work under bash.

(Apologies, I always use 'Korn' shell, so there must be some slight change needed for bash.)

In Unix/Linux, when you run a shell (the command line) there is a PATH variable set.

E.g. if my .profile contains:

PATH=/usr/local/bin:/sbin:/bin:/usr/openv/netbackup/bin:/usr/openv/netbackup/bin/admincmd

Any command I run, located in any of those directories will work, no matter which directory I am in.

If I ran vmoprcmd from, for example, the ./ directory, the shell would report that there is no such command.  This is because vmoprcmd is in /usr/openv/volmgr/bin, which is not in the PATH, so the system does not know where to look.

cron has its own PATH, which is limited to something like bin and sbin.  So any command outside those directories needs its full path specified.
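
As an alternative to spelling out every full path, a cron script can set PATH itself at the top. A minimal sketch, assuming the NetBackup directories named in this thread; resolve_cmd is a hypothetical helper that only reports whether the shell can find a command:

```shell
#!/bin/sh
# Set PATH explicitly so the script behaves the same under cron as in a login shell.
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/openv/netbackup/bin:/usr/openv/netbackup/bin/admincmd
export PATH

# resolve_cmd NAME: print where NAME would be run from; fail if PATH cannot find it.
resolve_cmd() {
    command -v "$1" 2>/dev/null
}

if resolve_cmd bpps >/dev/null; then
    echo "bpps found"
else
    echo "bpps not found in PATH=$PATH" >&2
fi
```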

Martin

 

 
mph999's picture

OK, seems to work :

 

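#!/usr/bin/ksh
# Look for bpjobd in the NetBackup process list; full paths because cron's PATH is minimal.
/usr/openv/netbackup/bin/bpps -x | grep bpjobd >/dev/null 2>&1
# grep exits non-zero when nothing matched, i.e. bpjobd is not running.
if [[ $? -ne 0 ]]; then
    nohup /usr/openv/netbackup/bin/bpjobd &
fi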
 
Please give that a go and report back.
 
Martin

 

 
SOLUTION
Yasuhisa Ishikawa's picture

If ksh is not installed, try this instead.

#!/bin/sh
pgrep -x bpjobd >/dev/null 2>&1
if [ $? -ne 0 ]; then
nohup /usr/openv/netbackup/bin/bpjobd &
fi

BTW, bpjobd does not require Ctrl+C. I suspect the mail command in your script causes this issue: no mail body is supplied, so mail waits for input from stdin.
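
One way to avoid the wait is to feed the body to mail on stdin, for example through a pipe. The sketch below assumes a hypothetical helper send_alert and placeholder values for the mailer and the recipient, since the original script did not include real addresses:

```shell
#!/bin/sh
# MAILER and RECIPIENT are placeholders; override them for your environment.
MAILER=${MAILER:-mail}
RECIPIENT=${RECIPIENT:-admin@example.com}

# send_alert SUBJECT BODY: pipe BODY into the mailer so it never waits for
# input from the terminal.
send_alert() {
    printf '%s\n' "$2" | "$MAILER" -s "$1" "$RECIPIENT"
}

# Example call:
#   send_alert "Activity Monitor is hung, bpjobd started, please verify" "bpjobd was restarted by the watchdog script."
```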

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

mph999's picture

I could only try the script from the command line when I wrote it, and that didn't return the prompt, so I stuck in the nohup.

Now that I have been able to test it from cron, it works as shown, so I left it unchanged.

M

 

 
Sid1987's picture

Hi Martin,

 I will test it by killing bpjobd manually and see if it works.

However, when you grep for bpjobd in ps, do you see your script listed as a defunct process?

[bpjobd.sh] <defunct>.
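
A side note on matching by name: a plain grep for bpjobd matches every ps line that contains that string, including a script called bpjobd.sh, whereas pgrep -x compares against the whole process name. A minimal sketch, assuming pgrep is installed; is_running is a made-up helper name:

```shell
#!/bin/sh
# is_running NAME: succeed only if a process whose name is exactly NAME exists.
# With -x, "bpjobd" does not match a process named "bpjobd.sh".
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Example:
#   is_running bpjobd || echo "bpjobd is not running"
```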

Thanks

Sid

Sid1987's picture

Thanks Martin,

 It seems to be working fine now.

Thanks

Sid