APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Information in this document applies to any platform.

This issue impacts environments that do not have multicast enabled for the private network in the following situations:
- New installations of Oracle Grid Infrastructure 11.2.0.2 where multicast is not enabled on 230.0.1.0
- Upgrades to Oracle Grid Infrastructure 11.2.0.2 from a pre-11.2.0.2 release where multicast is not enabled on 230.0.1.0 or 224.0.0.251
- Installation of GI PSU 11.2.0.3.5, 11.2.0.3.6, 11.2.0.3.7 where multicast is not enabled on 230.0.1.0 or 224.0.0.251
- Installation or upgrade to 12.1.0.1.0 where multicast is not enabled on 230.0.1.0 or 224.0.0.251

SYMPTOMS
If multicast based communication is not enabled as required either on the nodes of the cluster or on the network switches used for the private interconnect, the root.sh, which is called as part of a fresh installation of Oracle Grid Infrastructure 11.2.0.2, or the rootupgrade.sh (called as part of an upgrade to Oracle Grid Infrastructure 11.2.0.2) will only succeed on the first node of the cluster, but will fail on subsequent nodes with the symptoms shown below:

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node <node1>, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Failed to start Oracle Clusterware stack
Failed to start Cluster Synchorinisation Service in clustered mode at /u01/app/crs/11.2.0.2/crs/install/crsconfig_lib.pm line 1016.
/u01/app/crs/11.2.0.2/perl/bin/perl -I/u01/app/crs/11.2.0.2/perl/lib -I/u01/app/crs/11.2.0.2/crs/install /u01/app/crs/11.2.0.2/crs/install/rootcrs.pl execution failed
Note: The symptoms will be the same whether an upgrade or a fresh installation of Oracle Grid Infrastructure 11.2.0.2 is performed; so will be the required diagnostics. This issue is also documented in the Oracle Database Readme 11g Release 2 Section 2.39 - "Open Bugs" under BUG: 9974223.
Note: This issue also impacts the 11.2.0.3.5, 11.2.0.3.6 and 11.2.0.3.7 GI PSUs, as well as 12.1.0.1 installations, if multicast is not enabled on the 230.0.1.0 or 224.0.0.251 multicast address (at least one of the two must be enabled and functional). With 11.2.0.3, GI was enhanced to use either broadcast or multicast to bootstrap. However, the 11.2.0.3.5 GI PSU introduced a new issue that effectively disables the broadcast functionality (Bug 16547309).
2010-09-16 23:13:14.862: [GIPCHGEN][1107937600] gipchaNodeCreate: adding new node 0x2aaab408d4a0 { host '<node1>', haName 'CSS_ttoprf10cluster', srcLuid 54d7bb0e-ef4a0c7e, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [0 : 0], createTime 9563084, flags 0x0 }
2. Shortly after the above log entry we will see an attempt to establish communication to <node1> from <node2> via multicast address 230.0.1.0, port 42424 on the private interconnect (here: 192.168.x.x):
2010-09-16 23:13:14.862: [GIPCHTHR][1106360640] gipchaWorkerUpdateInterface: created remote interface for node '<node1>', haName 'CSS_mycluster', inf 'mcast://230.0.1.0:42424/192.168.x.x'
2010-09-16 23:13:15.839: [ CSSD][1087465792]clssnmvDHBValidateNCopy: node 1, <node1>, has a disk HB, but no network HB, DHB has rcfg 180134562, wrtcnt, 8627, LATS 9564064, lastSeqNo 8624, uniqueness 1284701023, timestamp 1284703995/10564774
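The two log signatures shown above (the remote interface created on a multicast address, and a node that has a disk heartbeat but no network heartbeat) can be searched for mechanically. The following is a minimal sketch in Python, assuming the CSS daemon log text has already been read into a string. The function name `find_multicast_symptoms`, the constant `SAMPLE_LOG` and the regular expressions are illustrative assumptions modeled on the excerpts in this note; they are not part of any Oracle tool, and real log lines may differ in detail.

```python
import re

# Patterns modeled on the log excerpts in this note (assumed formats).
MCAST_INF = re.compile(r"inf 'mcast://(?P<addr>[\d.]+):(?P<port>\d+)/[^']+'")
NO_NET_HB = re.compile(
    r"node (?P<num>\d+), (?P<name>[^,]+), has a disk HB, but no network HB"
)

# Two lines copied from the excerpts above, used only as sample input.
SAMPLE_LOG = "\n".join([
    "2010-09-16 23:13:14.862: [GIPCHTHR][1106360640] gipchaWorkerUpdateInterface: "
    "created remote interface for node '<node1>', haName 'CSS_mycluster', "
    "inf 'mcast://230.0.1.0:42424/192.168.x.x'",
    "2010-09-16 23:13:15.839: [ CSSD][1087465792]clssnmvDHBValidateNCopy: "
    "node 1, <node1>, has a disk HB, but no network HB, DHB has rcfg 180134562",
])


def find_multicast_symptoms(log_text):
    """Return (multicast endpoints attempted, nodes with disk HB but no network HB)."""
    endpoints = set()
    nodes = set()
    for line in log_text.splitlines():
        match = MCAST_INF.search(line)
        if match:
            endpoints.add((match.group("addr"), int(match.group("port"))))
        match = NO_NET_HB.search(line)
        if match:
            nodes.add((int(match.group("num")), match.group("name")))
    return sorted(endpoints), sorted(nodes)


if __name__ == "__main__":
    attempted, unreachable = find_multicast_symptoms(SAMPLE_LOG)
    print("multicast endpoints attempted:", attempted)
    print("nodes with disk HB but no network HB:", unreachable)
```

Finding both signatures for the same peer node is consistent with the symptoms described in this note, but it does not prove that multicast is the root cause; other network problems can produce a missing network heartbeat as well.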
CHANGES
Note: 11.2.0.4 Grid Infrastructure is not impacted by this issue.
CAUSE
Assuming that Cluster Verify (cluvfy) has succeeded regarding the network checks on all nodes of the cluster or the symptoms described above are observed as part of an upgrade to Oracle Grid Infrastructure 11.2.0.2 (which means that the current release does not encounter such communication issues), this issue is probably caused by multicast not being enabled on the network used as the private interconnect.

Background information for 11.2.0.2

Background information for 11.2.0.3.5, 11.2.0.3.6, 11.2.0.3.7 GI PSUs and 12.1.0.1

Note: The Oracle CSS daemon may fail to establish network communication with peer nodes for other reasons than multicast not working as required on the private interconnect, which is discussed in this note. Therefore, refer to Note: 1054902.1 for general network communication troubleshooting, if you determine that multicasting is not the root cause for such issues on your system.
Configuring various network switches for multicast

Cisco Nexus Switches - general
no ip igmp snooping
(Consult the Cisco Nexus manual for the exact syntax of this command.)

Cisco Catalyst

Cisco Nexus 7000 Switch
no hardware ip verify fragment
(Consult the Cisco Nexus manual for the exact syntax of this command.)
These settings apply only to the Cisco Nexus 7000 switch. This command may be available on other models of the Nexus switch, but not all of the Nexus models support it. Check the Cisco manual for the model of concern.
SOLUTION

For 11.2.0.2 Installations
It has been found that using the 230.0.1.0 address (port 42424) for multicasting can be problematic with some network configurations. Therefore, Oracle has released Patch: 9974223 on top of Oracle Grid Infrastructure 11.2.0.2. This patch enables multicasting on the 224.0.0.251 multicast address (port 42424) in addition to the 230.0.1.0 (port 42424) address used by default. Multicast must generally be enabled on one of these two addresses to allow Oracle Grid Infrastructure to successfully start on all nodes configured to join a particular cluster.

For 11.2.0.3.5, 11.2.0.3.6 and 11.2.0.3.7 GI PSU and 12.1.0.1 Installations
11.2.0.3.5 - 11.2.0.3.7 GI PSU and 12.1.0.1 installations will ONLY be impacted if those installations were relying on the broadcast functionality that was initially introduced in 11.2.0.3 base. For these installations the simplest solution is to enable multicast on either the 230.0.1.0 or 224.0.0.251 address (only one of the two addresses needs to be functional). For those installations that are unable to enable multicast, Patch 16547309 is available for some platforms to re-enable broadcast functionality. If you require this patch (because you cannot enable multicast) and it is not available for your version/platform, please raise an SR with support to request the fix. The fix for Bug 16547309 will be included in the 11.2.0.3.8 and 12.1.0.1.2 GI PSUs.

For 12.2.0.1 Installations
"Multicasting is required on the private interconnect. For this reason, at a minimum, you must enable multicasting for the cluster for the following..."
Problem validation
Note: The issue described in this note does not apply
If you have already run the root.sh (rootupgrade.sh) on any node of the cluster:
Note: The READMEs for Patch: 9974223 and Patch 16547309 contain instructions for those installations in which root.sh (or rootupgrade.sh) has failed due to this issue. A complete reinstall of Grid Infrastructure should not be required.
Multicast can be validated with the mcasttest.pl tool attached to this note, which checks whether multicast on the 230.0.1.0 or 224.0.0.251 address can be used on the private interconnect interface(s). CLUVFY in 11.2.0.3 and above contains similar checks but reports a pass if the broadcast check passes. Because of the issue that breaks broadcast in the 11.2.0.3.5 - 11.2.0.3.7 GI PSUs and 12.1.0.1, such a pass is a false positive on those releases. For this reason, it is recommended to use the mcasttest.pl program to validate multicast functionality before patching, upgrading or installing the mentioned releases.
Note: Multicast based communication only needs to be successful on either the 230.0.1.0 address or the 224.0.0.251 address. A successful multicast communication on both addresses is not required.
# gunzip mcasttest.tgz
# tar xvf mcasttest.tar
# perl mcasttest.pl -n <node1>,<node2>,<node_n...> -i <interface1>,<interface2>,<interface_n...>
The example below tests multicast for a two node cluster (<node1> and <node2>), using the interfaces eth1 and eth2:
# perl mcasttest.pl -n <node1>,<node2> -i eth1,eth2
########### Setup for node <node1> ##########
Checking node access '<node1>'
Checking node login '<node1>'
Checking/Creating Directory /tmp/mcasttest for binary on node '<node1>'
Distributing mcast2 binary to node '<node1>'
########### Setup for node <node2> ##########
Checking node access '<node2>'
Checking node login '<node2>'
Checking/Creating Directory /tmp/mcasttest for binary on node '<node2>'
Distributing mcast2 binary to node '<node2>'
########### testing Multicast on all nodes ##########
Test for Multicast address 230.0.1.0
Nov 8 09:05:33 | Multicast Failed for eth1 using address 230.0.1.0:42000
Nov 8 09:05:34 | Multicast Failed for eth2 using address 230.0.1.0:42001
Test for Multicast address 224.0.0.251
Nov 8 09:05:35 | Multicast Succeeded for eth1 using address 224.0.0.251:42002
Nov 8 09:05:36 | Multicast Succeeded for eth2 using address 224.0.0.251:42003

As shown in the example above, the test has failed for the 230.0.1.0 address, but succeeded for the 224.0.0.251 multicast address. In this case, Patch: 9974223 must be applied to enable Oracle Grid Infrastructure to use the 224.0.0.251 multicast address.

Interpreting the outcome of the mcasttest-tool correctly:
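One way to read the sample output above can be sketched in Python. The sketch below parses the "Multicast Succeeded/Failed" result lines and maps them to the 11.2.0.2 guidance given in this note. The function names `working_addresses` and `recommend_for_11_2_0_2`, the constant `SAMPLE_OUTPUT` and the returned strings are hypothetical and not part of mcasttest.pl. It also assumes that an address only counts as usable when the test succeeded on every listed interface, which this note does not state explicitly.

```python
import re

# Matches result lines such as:
#   Nov 8 09:05:33 | Multicast Failed for eth1 using address 230.0.1.0:42000
RESULT_LINE = re.compile(
    r"Multicast (Succeeded|Failed) for (\S+) using address (\d+\.\d+\.\d+\.\d+):\d+"
)

# Result lines copied from the example output in this note.
SAMPLE_OUTPUT = """\
Test for Multicast address 230.0.1.0
Nov 8 09:05:33 | Multicast Failed for eth1 using address 230.0.1.0:42000
Nov 8 09:05:34 | Multicast Failed for eth2 using address 230.0.1.0:42001
Test for Multicast address 224.0.0.251
Nov 8 09:05:35 | Multicast Succeeded for eth1 using address 224.0.0.251:42002
Nov 8 09:05:36 | Multicast Succeeded for eth2 using address 224.0.0.251:42003
"""


def working_addresses(output, interfaces):
    """Return the addresses whose test succeeded on every listed interface."""
    results = {}
    for outcome, iface, addr in RESULT_LINE.findall(output):
        results.setdefault(addr, {})[iface] = (outcome == "Succeeded")
    return sorted(
        addr for addr, per_iface in results.items()
        if all(per_iface.get(iface, False) for iface in interfaces)
    )


def recommend_for_11_2_0_2(output, interfaces):
    """Map the test result to the 11.2.0.2 guidance in this note (assumed reading)."""
    good = working_addresses(output, interfaces)
    if "230.0.1.0" in good:
        # 230.0.1.0 is the address 11.2.0.2 uses by default.
        return "default address works"
    if "224.0.0.251" in good:
        # Only the address enabled by Patch 9974223 works.
        return "apply Patch 9974223"
    return "enable multicast on the private interconnect"


if __name__ == "__main__":
    print(working_addresses(SAMPLE_OUTPUT, ["eth1", "eth2"]))
    print(recommend_for_11_2_0_2(SAMPLE_OUTPUT, ["eth1", "eth2"]))
```

For the sample output, only 224.0.0.251 succeeds on both interfaces, which matches the conclusion above that Patch: 9974223 is needed in that case.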
When and how to patch for 11.2.0.2
Note: Because OPatch is used with the "-local" flag here, you need to perform this operation on every node.
When and how to patch for 11.2.0.3.5, 11.2.0.3.6, 11.2.0.3.7
As previously mentioned, the simplest solution is to enable multicast on either the 230.0.1.0 or 224.0.0.251 address to address this issue. If this is not possible within your environment, the recommended solution is to install the fix for Bug 16547309 prior to executing "rootcrs.pl -patch".

When and how to patch for 12.1.0.1
Again, the simplest solution is to enable multicast on either the 230.0.1.0 or 224.0.0.251 address to address this issue. If this is not possible within your environment, the recommended solution is to install the fix for Bug 16547309 or the latest GI PSU containing the fix (12.1.0.1.2) prior to running root.sh or rootupgrade.sh. In order to apply Patch 16547309 prior to the execution of root.sh or rootupgrade.sh, perform the following steps:
Disclaimer and summary: Due to differences in network hardware and overall network topologies that may be used for the interconnect network, it is not possible for Oracle to provide a single solution for enabling multicast within your network environment. It is therefore important that you work with your System and/or Network Administrators to enable multicast functionality for the 230.0.1.0 or the 224.0.0.251 multicast address (using Patch: 9974223). The mcasttest.pl program attached to this note should be used to ensure that multicast communication is functioning properly to support Oracle Grid Infrastructure.