EMM failed all of a sudden
Updated: 21 May 2010 | 13 comments
Hello,
Master: NBU 6.0 MP6 running Solaris 9
The problem is that the EMM daemon failed and restarted a few times over the last weekend, while NBDB stayed up and running.
There are a number of status code 252 jobs in the Activity Monitor, and a message in the job status says:
NBEMM returned an extended error status: invalid error number (3000001)
I could not find this message on the Internet or in the Troubleshooting Guide.
I think it would be nice to have at least the *.h files with the error codes in the NBU distribution.
Comments
For starters, EMM logs to the unified logs in /usr/openv/logs (on UNIX), and these can only be viewed with the vxlogview command.
See for example: http://seer.entsupport.symantec.com/docs/280029.htm
which describes commands such as vxlogview -i 111 -t 24:00:00.
You may also need additional verbosity in these logs (the default, I think, is 1 or 0).
To see the current setting:
/usr/openv/netbackup/bin/vxlogcfg -p 51216 -o Default -l
To set to 6 (of course you will need to wait till the error happens again to see additional detail):
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o ALL -s DebugLevel=6 -s DiagnosticLevel=6
To set it back to 0:
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o ALL -s DebugLevel=0 -s DiagnosticLevel=0
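The set/reset pair above could be wrapped in a small helper so the level is always changed the same way. This is only a sketch: the function name set_vxul_level and the VXLOGCFG variable are made up for illustration, and the 0 to 6 range is simply the one used in this thread.

```shell
#!/bin/sh
# Path to vxlogcfg as used in this thread; override VXLOGCFG for a dry run.
VXLOGCFG="${VXLOGCFG:-/usr/openv/netbackup/bin/vxlogcfg}"

# set_vxul_level LEVEL  (hypothetical helper, not part of NetBackup)
# Applies the same DebugLevel and DiagnosticLevel to all originators of
# product 51216, using the same options as the commands quoted above.
set_vxul_level() {
    level="$1"
    case "$level" in
        [0-6]) ;;                                    # accept 0..6 only
        *) echo "level must be 0..6" >&2; return 1 ;;
    esac
    "$VXLOGCFG" -a -p 51216 -o ALL \
        -s "DebugLevel=$level" -s "DiagnosticLevel=$level"
}

# Example: raise to 6 before the weekend, drop back to 0 afterwards.
#   set_vxul_level 6
#   set_vxul_level 0
```

Rejecting anything outside 0..6 up front avoids handing a mistyped level to vxlogcfg.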
It's been my experience that these logs are fairly hard to interpret and that Symantec (with their log searching tools) can often get to a root cause faster. I seem to recall loads of engineering binaries at 6.0 MP6.
Dhammica De Silva
"It's better to keep my mouth shut and be thought a fool rather than open it and remove all doubt." (Margaret Thatcher 1975)
EMM daemon will stop if free disk space falls below 1%.
Hello,
I have checked the EMM log and it says that EMM first shut down in a normal manner and was then brought online. I know that EMM was started by someone from the duty admins team, but why EMM was shut down in the first place is a total puzzle to me.
Concerning free space, I checked the df output and every filesystem has plenty of free blocks.
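The df check can be scripted so it is easy to repeat, or to run from cron while waiting for the next failure. A minimal sketch, assuming six-column df -k output (Filesystem, kbytes, used, avail, capacity, mount point) and mount points without spaces. The function name low_space is hypothetical, and the 1% default is only the figure quoted earlier in this thread. On Solaris you would probably need nawk or /usr/xpg4/bin/awk for the -v option.

```shell
#!/bin/sh
# low_space [THRESHOLD_PERCENT]  (hypothetical helper)
# Reads df -k style output on stdin and prints "mountpoint free%" for
# every filesystem whose free space is below the threshold.
low_space() {
    threshold="${1:-1}"
    awk -v t="$threshold" '
        NR > 1 && $2 > 0 {                  # skip the header and empty filesystems
            free_pct = ($4 / $2) * 100      # available blocks / total blocks
            if (free_pct < t)
                printf "%s %.2f\n", $6, free_pct
        }'
}

# Typical use on the master server:
#   df -k | low_space 1
```

No output means no filesystem is under the threshold at the moment the check runs. It says nothing about what the free space looked like at the time EMM went down.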
Maybe increase the unified logging to verbosity 5 or 6 (I think they are the same anyway) and leave it to run over the weekend.
NOTE though that the logs at this verbosity require a HUGE amount of space (especially if you have a busy NetBackup server). If you have free space on the order of hundreds of GB on your /usr/openv/ partition (these logs are written to /usr/openv/logs) then fine, otherwise it could cause you even more headaches.
Dhammica De Silva
stop netbackup
stop PBX
clean schedules
check IPC jobs and clean them up if they are hung
start PBX
start netbackup
and let us know what comes up in the logs
regards
Omar A Villa
Netbackup Expert
These are my personal views and not those of the company I work for
I suppose that by "clean schedules" you mean running rm /usr/openv/netbackup/db/jobs/pempersist.
But what do you mean by cleaning IPC jobs?
Hi,
Try this command.
It will start the EMM database separately.
Command : installpath\programfiles\veritas\netbackup\bin\nbdb_admin -start
I have examined the NBDB log and there were no signs of the DB going down. The problem is in the EMM daemon itself.
Anton, try Dami's suggestion of analyzing the VxUL logs again.
Here's a quick reference card to help with the parameters:
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/287647.pdf
EMM has not experienced any faults since then, so I suppose I can close this thread.
Thank you very much for your answers.
I realize that this thread was closed, but I have the same issue with EMM going down with an invalid error number. Please let me know if you have any updates.
Thanks.
James
Here are the steps that you need to do.
cat /usr/openv/db/log/server.log
and verify if you see any entries like this
I. 09/22 12:14:07. Database "NBDB" (NBDB.db) stopped at Mon Sep 22 2008 12:14
I. 09/22 12:14:07. Database "utility_db" (utility_db) stopped at Mon Sep 22 2008 12:14
The lines above indicate that the database was stopped gracefully. If you do not see them, examine the last 100 or 200 lines to find out why and how the database stopped.
Every checkpoint that is taken is also recorded in this log.
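The server.log check can be scripted roughly as below. It is a sketch that assumes the stop entries look like the two sample lines in this post; the function name check_db_stops is made up, and the 200-line tail is just the upper figure suggested above. On Solaris, tail -n may need /usr/xpg4/bin/tail.

```shell
#!/bin/sh
# check_db_stops LOGFILE  (hypothetical helper)
# Counts graceful-stop entries of the form
#   Database "NAME" (FILE) stopped at ...
# and falls back to showing the end of the log when none are found.
check_db_stops() {
    log="$1"
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
    pattern='Database "[^"]*" ([^)]*) stopped at '
    # grep -c exits non-zero on zero matches, so ignore its exit status
    count=$(grep -c "$pattern" "$log" || true)
    if [ "$count" -gt 0 ]; then
        echo "graceful stops: $count"
        grep "$pattern" "$log"
    else
        echo "no graceful stop entries; last 200 lines follow:"
        tail -n 200 "$log"
    fi
}

# Typical use:
#   check_db_stops /usr/openv/db/log/server.log
```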
----- Next....
Run this command
/usr/openv/netbackup/bin/vxlogview -o nbemm -E -L -n 1 > nbemm_error.log
The above command dumps only the error messages logged by nbemm in the last day.
Examine that output; it should give you an idea of what failed.
Also check the disk space on the filesystem where EMM lives. If EMM runs out of disk space it logs a critical message rather than an error message, which you can verify as follows:
/usr/openv/netbackup/bin/vxlogview -o nbemm -C -L -n 1 > nbemm_critc.log
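If you end up running both dumps repeatedly, the two vxlogview commands in this post can be combined into one helper. Treat it as a sketch: dump_nbemm_logs and the VXLOGVIEW variable are names invented here, and the options are copied unchanged from the commands above.

```shell
#!/bin/sh
# Path to vxlogview as used in this thread; override VXLOGVIEW for a dry run.
VXLOGVIEW="${VXLOGVIEW:-/usr/openv/netbackup/bin/vxlogview}"

# dump_nbemm_logs [DAYS] [OUTDIR]  (hypothetical helper)
# Writes nbemm error messages and critical messages for the last DAYS
# days into two separate files under OUTDIR.
dump_nbemm_logs() {
    days="${1:-1}"
    outdir="${2:-.}"
    "$VXLOGVIEW" -o nbemm -E -L -n "$days" > "$outdir/nbemm_error.log"   # errors
    "$VXLOGVIEW" -o nbemm -C -L -n "$days" > "$outdir/nbemm_critc.log"   # criticals
    echo "wrote $outdir/nbemm_error.log and $outdir/nbemm_critc.log"
}

# Typical use:
#   dump_nbemm_logs 1 /tmp
```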
Let me know the outcome, I will try and get you some more info.
Cheers!
Manoj
------------------
Time isn't running out, but life is...
What if I want to view the log file on a different server? I copied the unified log off the master to another NBU machine. Is it the -K (hostname) switch?
Thanks.