KGXGN polling error (15)

In a two-node RAC cluster, only one instance would start at a time. If the instance on node1 was running, the instance on node2 would not start. If the instance on node2 was running, the instance on node1 would not start. The alert log showed the following:

Error: KGXGN polling error (15)
Errors in file /u01/app/oracle/diag/rdbms/bsp/bsp1/trace/bsp1_lmon_9151.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON (ospid: 9151): terminating the instance due to error 29702

Unfortunately, the LMON trace file contains only the same error messages, so there is nothing more to go on there.

This error occurs because the cluster interconnect is misconfigured. If you query the OCR for the cluster interconnect, you can see that the NIC device is eth4.1338:

[oracle@myhost bin]$ oifcfg getif -global
eth2 192.168.33.0 global public
eth4.1338 10.0.0.0 global cluster_interconnect

On one node, the device eth4.1338 is correct. On the second node, however, the interconnect is on eth5.1338. The OCR is shared between the nodes and expects the device to be eth4.1338, so both servers need the cluster interconnect on the same network device. We changed the servers' network configuration so that both nodes use the eth5.1338 device. Once the servers were configured identically, we redefined the interconnect in the OCR:
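The mismatch described above can be checked for directly. The following is a minimal sketch, not part of the original troubleshooting session: the function names ocr_interconnect_ifaces and check_local_ifaces are hypothetical, and the check assumes a Linux host where network devices (including VLAN devices such as eth4.1338) appear under /sys/class/net.

```shell
#!/bin/sh
# Print the interface names that the OCR lists as cluster_interconnect.
# Reads `oifcfg getif -global` output on stdin; the interface type is
# assumed to be the last whitespace-separated field on each line.
ocr_interconnect_ifaces() {
    awk '$NF ~ /cluster_interconnect/ { print $1 }'
}

# For each interface name on stdin, report whether it exists on this host.
# The optional first argument is the directory to look in; it defaults to
# /sys/class/net (an assumption about the host, not something oifcfg reports).
check_local_ifaces() {
    netdir=${1:-/sys/class/net}
    while read -r ifname; do
        if [ -e "$netdir/$ifname" ]; then
            echo "$ifname present"
        else
            echo "$ifname MISSING"
        fi
    done
}
```

Running `oifcfg getif -global | ocr_interconnect_ifaces | check_local_ifaces` on each node should flag any node where the interface registered in the OCR does not exist locally.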

[oracle@myhost bin]$ ./oifcfg setif -global eth5.1338/10.0.0.0:cluster_interconnect

Looking at the config, we can see that both eth4.1338 and eth5.1338 are now in the OCR:

[oracle@myhost bin]$ ./oifcfg getif -global
eth2 192.168.33.0 global public
eth4.1338 10.0.0.0 global cluster_interconnect
eth5.1338 10.0.0.0 global cluster_interconnect

So we remove the stale eth4.1338 entry:

[oracle@myhost bin]$ ./oifcfg delif -global eth4.1338/10.0.0.0

The OCR is now reconfigured. We restarted CRS, and the instances came up on both nodes!
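The exact restart commands were not recorded above. As a rough sketch, assuming crsctl is on the PATH and the commands are run as root on each node, the restart could look like the following. The restart_crs function and the CRSCTL override variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Restart the clusterware stack on the local node. Run as root.
# CRSCTL is a hypothetical override for the path to crsctl; by default
# crsctl is looked up on PATH.
restart_crs() {
    crsctl_bin=${CRSCTL:-crsctl}
    # Stop the stack first; give up if the stop fails.
    "$crsctl_bin" stop crs || return 1
    # Start the stack again so it picks up the new interconnect definition.
    "$crsctl_bin" start crs || return 1
}
```

After calling restart_crs on both nodes, `crsctl check crs` can be used to see whether the stack has come back online. The start command may return before all services are up, so the check may need to be repeated.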

This was one of those errors where the error messages really did not point to the root cause of the problem. Instead, I had to poke around the areas I felt were the most likely culprits until I rather blindly stumbled on the configuration differences.