Veritas Storage Foundation (tm) and High Availability Solutions 5.0 and 5.0 Maintenance Pack 3 (MP3) on Solaris x64 - Late Breaking News

Article:TECH50452  |  Created: 2010-01-26  |  Updated: 2010-08-09  |  Article URL http://www.symantec.com/docs/TECH50452
Article Type
Technical Solution

Issue



Veritas Storage Foundation (tm) and High Availability Solutions 5.0 and 5.0 Maintenance Pack 3 (MP3) on Solaris x64 - Late Breaking News


Solution





To locate the most current product patch releases, including Maintenance Packs, Rolling Patches, and Hot Fixes, visit https://vos.symantec.com/patch/matrix 


Documentation

5.0 MP3 Documentation List (Solaris SPARC and x64 combined):
http://entsupport.symantec.com/docs/307509

Product documentation, man pages and error messages for the 5.0 and 5.0 MP3 releases are available at
vos.symantec.com/documents

All product documentation links for this release are cross-referenced in the Related Documents section of the Getting Started Guide:
http://support.veritas.com/docs/289315


Downloads

5.0 Maintenance Pack 3 for Solaris is available at https://fileconnect.symantec.com 

5.0 Maintenance Pack 3 Rolling Patch 1 for Solaris x64 is available at http://entsupport.symantec.com/docs/316672 

More patches are available on Patch Central https://vias.symantec.com/labs/patch 


Tools

VOS (Veritas Operations Services) Portal: https://vos.symantec.com 

VOS Portal Contains:

    - Risk Assessments
    - Installation Assessment Services (Installation and Upgrade preparation)
    - VOS Searchability (Error Code Lookup, Patches, Documentation, Systems, Reports)
    - Detailed Reports (Product and License Usage)
    - Notification Signup (VOS Notification Widget)


Addition to all Release Notes

Support of ZFS Root Devices

The 5.0 version of Storage Foundation is supported on servers with ZFS root devices; however, care must be taken to ensure that devices managed by ZFS are excluded from being managed by Storage Foundation. ZFS root disk encapsulation is not supported by this version of Storage Foundation. Running ZFS on top of DMP or Veritas Volume Manager is not supported in this release. The 5.1 version of Storage Foundation includes functionality to detect and protect ZFS devices and is recommended when ZFS will be used on devices other than those used for root functionality.
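As a quick check (a sketch only; 'rpool' is a typical ZFS root pool name and may differ on your system), list the disks that back the ZFS root pool and confirm that they are not under Volume Manager control:

# zpool status rpool
# vxdisk list

If a disk used by ZFS is initialized for or managed by VxVM, exclude it from VxVM's view, for example through the vxdiskadm menu option for suppressing devices from VxVM's view.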


The Storage Foundation Simple Admin 1.0 for Solaris is available at
http://entsupport.symantec.com/docs/303625 


For all Veritas Cluster Server Release Notes

VCS requires that all nodes in the cluster use the same processor architecture and
run the same operating system.

All nodes in the cluster must run the same VCS version. Each node in the cluster
may run a different version of the operating system, as long as the operating
system is supported by the VCS version in the cluster.


Cluster Volume Manager (CVM) fail back behavior for non-Active/Active arrays

This section describes the failback behavior for non-Active/Active arrays in a CVM cluster. This behavior applies to A/P, A/PF, A/PG, A/A-A, and ALUA arrays.

When all of the Primary paths fail or are disabled in a non-Active/Active array in a CVM cluster, the cluster-wide failover is triggered. All hosts in the cluster start using the Secondary path to the array. When the Primary path is enabled, the hosts fail back to the Primary path.

However, suppose that one of the hosts in the cluster is shut down or disabled while the Primary path is disabled. If the Primary path is then enabled, it does not trigger failback. The remaining hosts in the cluster continue to use the Secondary path. When the disabled host is rebooted and rejoins the cluster, all of the hosts in the cluster will continue using the Secondary path. This is expected behavior.

If the disabled host is rebooted and rejoins the cluster before the Primary path is enabled, enabling the path does trigger the failback. In this case, all of the hosts in the cluster will fail back to the Primary path. [e1441769]
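To see whether the hosts are currently using the Primary or the Secondary paths, the DMP path states can be inspected on each node (a sketch; the DMP node name c2t0d0s2 is illustrative):

# vxdmpadm listenclosure all
# vxdmpadm getsubpaths dmpnodename=c2t0d0s2

The PATH-TYPE column of the getsubpaths output indicates which paths are Primary and which are Secondary for the selected LUN.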


VEA service takes a long time to start

VEA service takes a long time to start if the configuration contains a large number of LUNs (1403191)

In configurations with a large number of LUNs that need to be discovered, the VEA service may take a long time to start. The long start-up time may cause the boot time to be longer than is allowed.

Workaround:
The solution is to start the VEA service in the background, so that the boot continues while the LUNs are discovered.

To start the VEA service in the background:

1. Edit the VEA start-up script.

For Storage Foundation 5.x, edit the following shell script:
/opt/VRTSobc/pal33/bin/vxpalctrl

2. In the start_agent() function, add the following line:
      exit 0

   Before the lines:
      max=10
      count=0
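After the edit, the relevant part of the start_agent() function might look like the following sketch. The surrounding lines are illustrative only; the added exit 0 and the existing max=10 and count=0 lines are the ones referenced in the procedure above.

start_agent()
{
    # ... existing code that launches the VEA agent ...

    exit 0      # added: return immediately so that the boot continues;
                # LUN discovery proceeds in the background

    max=10      # existing wait loop (no longer reached during boot)
    count=0
    # ...
}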


Version 5.0 Maintenance Pack 3 (MP3) Rolling Patch 1 (RP1):


Veritas Storage Foundation and High Availability Solutions 5.0 Maintenance Pack 3 Rolling Patch 1 is now available

Veritas Storage Foundation (tm) and High Availability Solutions 5.0 Maintenance Pack 3 Rolling Patch 1 Read This First - Solaris http://entsupport.symantec.com/docs/316673 

Veritas Storage Foundation (tm) and High Availability Solutions 5.0 Maintenance Pack 3 Rolling Patch 1 - Solaris x64 http://entsupport.symantec.com/docs/316672 


For all products that use Veritas Volume Manager, the vxdiskadm utility fails to replace a failed or removed non-root disk (1434779)

The vxdiskadm utility fails to replace a failed or removed disk using the options:

4 Remove a disk for replacement
5 Replace a failed or removed disk

This issue is specific to the replacement of a non-root disk.

An error message similar to the following example is displayed:

VxVM ERROR V-5-2-281
Replacement of disk rootdg02 in group rootdg with device c1t1d0
VxVM vxdg ERROR V-5-1-559 Disk rootdg02: Name is already used

Replace a different disk? [y,n,q,?] (default: n)

Workaround:

Replace the disk using the following command:
# vxdg -g $repldgname -k adddisk $repldmname=$repldaname

For example:
# vxdg -g rootdg -k adddisk rootdg02=c1t1d0

This issue is seen in the following releases:
5.0 MP3 RP1

The issue will be fixed in the following releases:
5.0 MP3 RP2


Version 5.0 Maintenance Pack 3 (MP3):

The release of Veritas Storage Foundation (tm) and High Availability Solutions
5.0 Maintenance Pack 3 (MP3) for the Solaris x64 Platform is now combined
with the Solaris SPARC platform.

5.0 Maintenance Pack 3 for Solaris is available at https://fileconnect.symantec.com 

The Veritas Volume Manager 5.0 Maintenance Pack 3 SmartMove Hot Fix 1 for Solaris (x64 Platform) is available at http://entsupport.symantec.com/docs/311480 

The Veritas Cluster Server 5.0 Maintenance Pack 3 Hotfix 1 for Solaris (x64 Platform) is available at:
http://entsupport.symantec.com/docs/311940  
This hotfix addresses the following issues:

VRTSvcs
1404384: HAD crashes when switching over a global group when PreSwitch is TRUE.
1397692: VCS engine clients hang in connect() if the target system is down.
1414219: HostMonitor objects:
HostMonitor objects are for internal use only. While these objects appear in the UI for the 5.0 MP3 release, they are not displayed in later releases.
 

Starting with 5.0 MP3, Symantec has added one-way link detection to LLT (Etrack Incident# 1031514)

Previously (that is, up to and including 5.0 MP1), LLT used broadcast heartbeats by default. Beginning with 5.0 MP3, LLT uses unicast heartbeats instead.

LLT considers a link to be in trouble for a peer node when the link has not been up for that node for 2 seconds (peertrouble 200). On lo-pri links, LLT sends heartbeats once every second; on hi-pri links, twice every second. With this change, the troubled condition may be hit more often on lo-pri links, and the corresponding LLT messages are therefore printed more often in the system log. While the message is only informational, its frequent appearance in the system log may alarm the customer.

Therefore, it is recommended that the LLT peertrouble tunable be increased to 400, so that LLT ignores link inactivity of up to 4 seconds before printing the message to the system log.

As noted above, the trouble message is only informational, and changing the trouble time from 2 seconds to 4 seconds is harmless.

If the following messages are seen frequently for an LLT lo-pri link, change the LLT tunable named peertrouble to 400. Its default value is 200.

LLT INFO V-14-1-10205 link 2 (eri0) node 1 in trouble
LLT INFO V-14-1-10024 link 2 (eri0) node 1 active

The peertrouble tunable can be changed with the following command on all the nodes in the cluster:

      # lltconfig -T peertrouble:400

It is recommended that the following line be added to /etc/llttab on all the nodes in the cluster to ensure that this value persists across server reboots. On each node, the change takes effect only after LLT is restarted.

      set-timer      peertrouble:400

Example:

# lltconfig -T query

Current LLT timer values (.01 sec units):
 . . .
 peertrouble = 200    <--------- before
 peerinact   = 1600
 . . .

# lltconfig -T peertrouble:400    <--------- set the new value

Use the following command to ensure that the values are indeed changed:

# lltconfig -T query

Current LLT timer values (.01 sec units):
 . . .
 peertrouble = 400    <--------- after
 peerinact   = 1600
 . . .


Upgrade to 5.0MP3 fails if Storage Foundation Manager is installed
(Etrack incident #1423124)

Upgrading the Storage Foundation products to 5.0MP3 may fail if Storage Foundation Manager is installed.
The product installer exits with a message indicating that a patch is missing from the media.

Workaround:
Start the product installer with the following option:
./installmp -mpok


ODM support for Storage Foundation 5.0 MP3

The Veritas extension for ODM is now supported for Storage Foundation Standard 5.0MP3 and Storage Foundation Enterprise 5.0MP3.  

In order to use this functionality, you must install a hotfix for the Veritas licensing package at
http://support.veritas.com/docs/316720 

You may also need to manually install the support packages. For details, see "Installing ODM" in Using ODM with Storage Foundation or Storage Foundation Cluster File System - Solaris at
http://support.veritas.com/docs/316757 


Addition to the Veritas Storage Foundation Release Notes 5.0 MP3


Veritas File System has increased the default value for the tunable max_seqio_extent_size


Veritas File System (VxFS) 5.0 MP3 has an increased default value for the max_seqio_extent_size tunable for better performance in modern file systems.

The max_seqio_extent_size tunable value is the maximum size of an individual extent.  Prior to the VxFS 5.0 MP3 release, the default value for this tunable was 2048 blocks. Database tests showed that this default value was outdated and resulted in slower than expected throughput on modern, larger file systems.  To improve performance and reduce fragmentation, the default value of max_seqio_extent_size was changed to 1 gigabyte in VxFS 5.0 MP3. VxFS allocates extents in a way that uses only the necessary portion of the 1 gigabyte extent size, avoiding over-allocation.

The minimum value allowed for the max_seqio_extent_size tunable is 2048 blocks, which was the default value prior to the VxFS 5.0 MP3 release.

Known Issue:
Processes that relied on extents being allocated in smaller chunks may now consume unneeded extent space that would otherwise be available to other processes. This can lead to file systems becoming full.

Workaround:
Change the max_seqio_extent_size tunable back to the pre-5.0MP3 value of 2048.
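A sketch of changing the tunable with vxtunefs (the mount point and volume path are illustrative; check the vxtunefs(1M) manual page for your release):

# vxtunefs -o max_seqio_extent_size=2048 /oradata

To make the setting persistent across reboots, an equivalent entry can be added to /etc/vx/tunefstab, for example:

/dev/vx/dsk/oradg/oravol    max_seqio_extent_size=2048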



Disk Group Failure Policy: requestleave

As of 5.0 MP3, Veritas Volume Manager (VxVM) supports 'requestleave' as a valid disk group failure policy. This new disk group failure policy is not currently documented in the 5.0MP3 Veritas Volume Manager Administrator's Guide. The Administrator's Guide will be updated with this information in the next major release.  

When the disk group failure policy is set to 'requestleave', the master node gracefully leaves the cluster if it loses access to all log/config copies of the disk group. If the master node loses access to the log/config copies of a shared disk group, Cluster Volume Manager (CVM) signals the CVMCluster agent of Veritas Cluster Server. Veritas Cluster Server (VCS) then attempts to take the CVM group offline on the master node. When the CVM group is taken offline, the dependent service groups are also taken offline. If the dependent applications managed by VCS cannot be taken offline for some reason, the master node may not be able to leave the cluster gracefully.

Use the 'requestleave' disk group failure policy together with the 'local' disk detach policy. Use this combination when the availability of the configuration change records is more important than the availability of nodes; that is, when you prefer to let a node leave the cluster rather than risk having the disk group disabled cluster-wide because of a loss of access to all copies of the disk group configuration.

Set the requestleave disk group failure policy as follows:
# vxdg -g mydg set dgfailpolicy=requestleave
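To pair it with the 'local' disk detach policy as recommended above (a sketch; 'mydg' is illustrative, and the current policies can be confirmed in the vxdg list output for the disk group):

# vxdg -g mydg set diskdetpolicy=local
# vxdg list mydg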

Refer to the 5.0MP3 Veritas Volume Manager Administrator's Guide for more information about the disk group failure policy and the disk detach policy.




Fire Drill with VVR not supported in SF Oracle RAC environments

SF Oracle RAC now supports Veritas Cluster Server (VCS) Fire Drill. Fire Drill enables organizations to validate the ability of business-critical applications to resume operations at hot standby data centers following critical outages and disasters. Fire Drill automates the creation of point-in-time snapshots and the testing of the applications that use the replicated data in the event of a site-to-site application failover, often referred to as a High Availability Disaster Recovery (HA/DR) failover.
 
Note: All operations are managed within the VCS HA/DR framework through hardware replication technologies that use VCS agents. Replication using VVR is not supported for Fire Drill in an SF Oracle RAC environment. This note corrects a related documentation error in the Veritas Storage Foundation for Oracle RAC Release Notes, which implies support for VVR.
 
The Fire Drill setup wizard allows automated configuration of a Fire Drill, and the resulting Fire Drill configuration is fully customizable. The Fire Drill wizard is invoked from the disaster recovery site by executing the script shipped with the hardware replication agents.


The Local detach policy support documented in the Storage Foundation Release Notes is not correct

The 5.0 MP3 Storage Foundation Release Notes included a section titled "Local detach policy now supported with Veritas Cluster Server clusters and with Dynamic Multipathing Active/Passive arrays." This section is not correct and should be ignored. The restrictions on using the local detach policy still apply for the 5.0 MP3 release.


If the umask is 0077, the installation or upgrade can fail

Check the umask setting:
# umask
0077

Change umask to 0022:
# umask 0022
# umask
0022


Veritas Volume Manager 5.0 MP3 fixed issues section

The following additional incidents are fixed in 5.0MP3 but were not listed in the Veritas Volume Manager
5.0 MP3 fixed issues section or in the Volume Manager patch README.122058-11:

(e1095411) Need to be able to disable boot disk's last path for DR  
(e1095411) Sun Bugid 6277129  UNABLE TO SUPPRESS BOOT DISK FROM VXVM 4.0 control
(e803949) Opteron_x86 - removable device not being ignored  
(e803949) Sun Bugid 6534693  "vxdisk list" lists FDD



Version 5.0:


Recommendations on use of Space-Optimized (SO) snapshots in Storage Foundation for Oracle RAC 5.0

If you use Volume Manager mirroring, Space-Optimized (SO) snapshots are recommended for Oracle data volumes.

Keep the Fast Mirror Resync regionsize equal to the database block size to reduce the copy-on-write
(COW) overhead.

Reducing the regionsize increases the number of cache object allocations, which leads to performance overhead.

Do not create Oracle redo log volumes on a space-optimized snapshot.

Use "third-mirror break-off" snapshots for cloning the Oracle redo log volumes.



Documentation Errata: 5.0 Veritas Cluster Server Agent Developers Guide

The following content replaces the description of the LogFileSize attribute in the Veritas Cluster Server Agent Developers Guide.

LogFileSize

Sets the size of an agent log file. Value must be specified in bytes. Minimum is 65536 bytes (64KB). Maximum is 134217728 bytes (128MB). Default is 33554432 bytes (32MB).

For example,

hatype -modify FileOnOff LogFileSize 2097152

Values specified below the minimum acceptable value are changed to 65536 bytes, and values specified above the maximum acceptable value are changed to 134217728 bytes. Therefore, out-of-range values displayed by the command:

hatype -display restype -attribute LogFileSize

are those entered with the -modify option, not the actual values. The LogFileSize attribute value cannot be overridden.


Addition to the Veritas Storage Foundation Release Notes

Known Issues section additions:

Etrack incident #1089081: For A/P arrays in an MPxIO-enabled environment,
vxconfigd may respond slowly during MPxIO path failover and failback.

Etrack incident #1097902: For SUN 6540/STK FLX200/300 series arrays, there is
a known issue with Persistent Group Reservation (PGR) keys when fencing is
configured using DMP devices. To use these arrays with I/O fencing, fencing
must be configured to use raw devices.

Etrack incident #995952: When testing an array for SCSI-3 compliance with the
vxfentsthdw script, the test may fail if the -d option is used. This is a known
issue; retry the script using raw devices without the -d option.

Etrack incident #966143: There is a known issue with the HDS 9500 and AMS/WMS
series storage arrays. File systems may become disabled if a SCSI INQ is
issued and fails.

The workaround is to set the following parameters:

DMP tunable setting:
1. Set dmp_scsi_timeout to 120
2. Set dmp_retry_timeout to 60
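These DMP tunables can typically be set with vxdmpadm (a sketch; first confirm with vxdmpadm gettune that the tunables are available and settable this way on your release):

# vxdmpadm settune dmp_scsi_timeout=120
# vxdmpadm settune dmp_retry_timeout=60
# vxdmpadm gettune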

Throttle setting:
1. Add "set sd:sd_max_throttle=8" entry in the /etc/system file.
This may vary depending on the number of luns.

Please check with the array vendor for recommendations.


Supplemental Materials

Value: 966143
Description: Known issue with HDS 9500 and AMS/WMS series storage arrays with file systems and SCSI INQ failing

Value: 995952
Description: When testing an array for SCSI-3 compliance using vxfentsthdw, the -d option may fail

Value: 1097902
Description: Known issue with SUN 6540/STK FLX200/300 series arrays with Persistent Group Reservation (PGR) keys and fencing

Value: 1089081
Description: For A/P arrays in an MPxIO-enabled environment, vxconfigd may respond slowly during MPxIO path failover and failback


Legacy ID



286955

