Database file corruption occurs without SCSI errors

Article:TECH190520  |  Created: 2012-06-06  |  Updated: 2012-06-06  |  Article URL http://www.symantec.com/docs/TECH190520
Article Type
Technical Solution


Issue



While an application is running against an Oracle instance, it begins having problems accessing VxFS file systems, and data corruption errors are reported for file systems in the same disk group.

 


Error



The Oracle alert log records many block access errors, as follows:

ORA-01578: ORACLE data block corrupted (file # 75, block # 253314)
ORA-01110: data file 75: '/dir/sub-dir/dir-sub2/instance'
Incident details in: /fs/apps/db/sub-sir/rdbms/instance/sub-dir1/incident/incdir_496534/instance_59471_i496534.trc
Wed Jun 06 08:54:31 2011
Corrupt Block Found
        TSN = 5, TSNAME = INDX00
        RFN = 75, BLK = 253314, RDBA = 314826114
        OBJN = -1, OBJD = 2114206, OBJECT = INDX00, SUBOBJECT =
        SEGMENT OWNER = , SEGMENT TYPE = Temporary Segment

Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xA1EDC90] [PC:0xA1EDC90, joxdmp_()] [flags: 0x0, count: 1]
Errors in file /fs/apps/db/sub-sir/rdbms/instance/sub-dir1/trace/instance_ora_59471.trc  (incident=496535):
ORA-07445: exception encountered: core dump [joxdmp_()] [SIGSEGV] [ADDR:0xA1EDC90] [PC:0xA1EDC90] [Address not mapped to object] []
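
As a cross-check on the alert log excerpt above, the RDBA value can be decoded and compared with the file and block numbers in the ORA-01578 message. The following is a minimal sketch, assuming the usual smallfile-tablespace layout in which the upper 10 bits of the RDBA hold the relative file number and the lower 22 bits hold the block number. The function names are illustrative and are not part of any Oracle tool.

```python
import re
from typing import Optional, Tuple

# Assumption: smallfile tablespace, so the RDBA is a 32-bit value with the
# relative file number in the upper 10 bits and the block in the lower 22.
BLOCK_BITS = 22


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split an RDBA into (relative file number, block number)."""
    return rdba >> BLOCK_BITS, rdba & ((1 << BLOCK_BITS) - 1)


def parse_ora_01578(line: str) -> Optional[Tuple[int, int]]:
    """Extract (file #, block #) from an ORA-01578 line, or None."""
    m = re.search(r"ORA-01578: .*\(file # (\d+), block # (\d+)\)", line)
    return (int(m.group(1)), int(m.group(2))) if m else None


alert_line = "ORA-01578: ORACLE data block corrupted (file # 75, block # 253314)"
print(parse_ora_01578(alert_line))  # (75, 253314)
print(decode_rdba(314826114))       # (75, 253314)
```

Both results agree with the RFN = 75 and BLK = 253314 fields in the excerpt, so the incident record and the ORA-01578 message refer to the same block. Note that ORA-01578 reports the absolute file number while the RDBA carries the relative one; they happen to be equal here.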

 

The system messages show VxFS warnings with V-2-3: vx_mapbad.

Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 23 marked bad
Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 2 mesg 096: V-2-96: vx_setfsflags - /dev/vx/dsk/dg/vol1 file system fullfsck flag set - vx_mapbad
Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 3 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 24 marked bad
Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 4 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 25 marked bad
Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 5 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 26 marked bad
Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 6 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 29 marked bad
Jun  6 08:54:49 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 7 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 38 marked bad
Jun  6 08:54:54 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 8 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 36 marked bad
Jun  6 10:18:00 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 11 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol2 file system free extent bitmap in au 24 marked bad
Jun  6 10:18:00 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 12 mesg 096: V-2-96: vx_setfsflags - /dev/vx/dsk/dg/vol1 file system fullfsck flag set - vx_mapbad
Jun  6 12:18:06 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 13 mesg 003: V-2-3: vx_mapbad - vx_extfind - /dev/vx/dsk/dg/vol3 file system free extent bitmap in au 29 marked bad
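
When many such warnings are logged, it can help to group the affected allocation units by volume. The following is a minimal sketch that does this for V-2-3 messages in the format shown above. The regular expression and the function name are assumptions made for illustration; the sample lines are copied from the messages above.

```python
import re
from collections import defaultdict

# Matches the V-2-3 vx_mapbad warning format shown in the system messages.
MAPBAD = re.compile(
    r"V-2-3: vx_mapbad - (?P<func>\S+) - (?P<dev>\S+) "
    r"file system free extent bitmap in au (?P<au>\d+) marked bad"
)


def bad_aus_by_volume(lines):
    """Return {volume device: sorted list of au numbers marked bad}."""
    found = defaultdict(set)
    for line in lines:
        m = MAPBAD.search(line)
        if m:
            found[m.group("dev")].add(int(m.group("au")))
    return {dev: sorted(aus) for dev, aus in found.items()}


sample = [
    "Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol1 file system free extent bitmap in au 23 marked bad",
    "Jun  6 08:54:46 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 2 mesg 096: V-2-96: vx_setfsflags - /dev/vx/dsk/dg/vol1 file system fullfsck flag set - vx_mapbad",
    "Jun  6 10:18:00 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 11 mesg 003: V-2-3: vx_mapbad - vx_extmapupd_3 - /dev/vx/dsk/dg/vol2 file system free extent bitmap in au 24 marked bad",
    "Jun  6 12:18:06 host1 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 13 mesg 003: V-2-3: vx_mapbad - vx_extfind - /dev/vx/dsk/dg/vol3 file system free extent bitmap in au 29 marked bad",
]

for dev, aus in bad_aus_by_volume(sample).items():
    print(dev, aus)
```

In this case the grouping makes it visible that several volumes in the same disk group are affected within a few hours, which points away from a fault local to one file system.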

 


Environment



Solaris 10 x86 and SF 5.0MP3 with VxFS 5.0MP3RP2HF3

 


Cause



Without any hardware or storage change, the VxFS file systems start returning access errors to the running application and Oracle instance.

A check of the file systems with fsck reports many inodes in fileset 999 that have invalid aflags or mode, or are partially allocated, and that fail validation. It also reports an incorrect emap for a number of allocation units (au).

fileset 999 primary-ilist inode 13771 has invalid aflags (0x0018)
fileset 999 primary-ilist inode 13771 has invalid orgtype (140)
fileset 999 primary-ilist inode 13771 has invalid eopflags (0x00004203)
fileset 999 primary-ilist inode 13771 has invalid eopdata (36)
fileset 999 primary-ilist inode 13771 has invalid number of blocks (1011072444557428232)
fileset 999 primary-ilist inode 13771 has invalid block map
fileset 999 primary-ilist inode 13771 has invalid iattrino (1011072444557428232)
fileset 999 primary-ilist inode 13771 has non zero attribute area
fileset 999 primary-ilist inode 13771 failed validation clear? (ynq)y
fileset 999 primary-ilist inode 13772 has invalid mode (0x1c04580e)
fileset 999 primary-ilist inode 13772 has invalid size (36)
fileset 999 primary-ilist inode 13772 has invalid orgtype (15)
fileset 999 primary-ilist inode 13772 has invalid eopdata (3397321535776)
fileset 999 primary-ilist inode 13772 has invalid number of blocks (3462777755104972354)
fileset 999 primary-ilist inode 13772 has invalid block map
fileset 999 primary-ilist inode 13772 has invalid iattrino (3462777755121749570)
fileset 999 primary-ilist inode 13772 has non zero attribute area
fileset 999 primary-ilist inode 13772 failed validation clear? (ynq)y
fileset 999 primary-ilist inode 13773 partially allocated
fileset 999 primary-ilist inode 13773 failed validation clear? (ynq)y

... ...

au 21 emap incorrect - fix? (ynq)y
au 21 summary incorrect - fix? (ynq)y
au 23 emap incorrect - fix? (ynq)y
au 24 emap incorrect - fix? (ynq)y
au 25 emap incorrect - fix? (ynq)y
au 26 emap incorrect - fix? (ynq)y
au 29 emap incorrect - fix? (ynq)y
au 36 emap incorrect - fix? (ynq)y
au 38 emap incorrect - fix? (ynq)y
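
For a long fsck transcript, the inodes that failed validation and the allocation units with an incorrect emap can be tallied with a short script. The following is a minimal sketch, assuming the message wording shown in the output above. The function name and patterns are illustrative and are not part of fsck; the sample lines are copied from the output above.

```python
import re

# Patterns based on the fsck message wording shown in the output above.
INODE_FAILED = re.compile(
    r"fileset (\d+) primary-ilist inode (\d+) failed validation"
)
EMAP_BAD = re.compile(r"au (\d+) emap incorrect")


def summarize_fsck(lines):
    """Return ([(fileset, inode), ...], [au, ...]) found in fsck output."""
    failed_inodes, bad_aus = [], []
    for line in lines:
        m = INODE_FAILED.search(line)
        if m:
            failed_inodes.append((int(m.group(1)), int(m.group(2))))
            continue
        m = EMAP_BAD.search(line)
        if m:
            bad_aus.append(int(m.group(1)))
    return failed_inodes, bad_aus


sample = [
    "fileset 999 primary-ilist inode 13771 has invalid aflags (0x0018)",
    "fileset 999 primary-ilist inode 13771 failed validation clear? (ynq)y",
    "fileset 999 primary-ilist inode 13773 partially allocated",
    "fileset 999 primary-ilist inode 13773 failed validation clear? (ynq)y",
    "au 21 emap incorrect - fix? (ynq)y",
    "au 21 summary incorrect - fix? (ynq)y",
    "au 23 emap incorrect - fix? (ynq)y",
]

inodes, aus = summarize_fsck(sample)
print(inodes)  # [(999, 13771), (999, 13773)]
print(aus)     # [21, 23]
```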

 

None of the file systems can be mounted after the full fsck above. All volumes in the disk group are enabled/active.

 


Solution



The system needs a reconfiguration reboot to start up with all LUNs. Right after the system restart, one of the dm disks is not found during the disk group import. When the dm is matched to its OS device, the format utility shows that this disk has a different disk type and device attributes. Selecting the disk type in the format utility shows the type that was changed last. A system administrator had changed the disk type by accident.

The OS prtvtoc output for the disk displays a wrong partition table, without the VM tag (15). VM marks the disk with error status and the disk is not usable.
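
The partition tag check can be sketched in a few lines. The following is a minimal example, assuming the usual prtvtoc text layout in which comment lines start with `*` and each partition line begins with the partition number followed by its tag. Both sample tables are hypothetical and are not taken from the affected system; only the tag value 15 comes from this article.

```python
VXVM_TAG = 15  # VM tag value cited in this article


def partition_tags(prtvtoc_text):
    """Return {partition number: tag} parsed from prtvtoc-style text."""
    tags = {}
    for raw in prtvtoc_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and prtvtoc comment lines
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            tags[int(fields[0])] = int(fields[1])
    return tags


def has_vxvm_tag(prtvtoc_text):
    """True if any partition carries the VM tag (15)."""
    return VXVM_TAG in partition_tags(prtvtoc_text).values()


# Hypothetical sample: a table that still carries the VM tag.
expected_table = """\
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       2      5    01          0  17682084  17682083
       3     15    01          0      2048      2047
       4     14    01       2048  17680036  17682083
"""

# Hypothetical sample: a table after the disk type was changed.
changed_table = """\
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0   2097152   2097151
       1      3    01    2097152   4194304   6291455
       2      5    01          0  17682084  17682083
"""

print(has_vxvm_tag(expected_table))  # True
print(has_vxvm_tag(changed_table))   # False
```

A result of False for a disk that belongs to a VM disk group is a hint that its label no longer matches what VM wrote, which is the condition described above.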

The disk type needs to be corrected. After the disk is reinitialized with the original privoffset and privlen, it can be added back to the original disk group. All volumes with associated subdisks can then be enabled/active.

Run fsck on the related volumes; the file systems should then be ready to be mounted online.

 





