Kdump to NFS in UEK (Solution)

I’ve previously written about a problem I encountered when kdump was configured to write to an NFS location with UEK (in Exadata software version 11.2.3.2.1). I’m pleased to report that the root cause of the problem has been identified and there is a very simple workaround.

There were some frustrating times working this particular SR, the most notable being a response that was effectively, “It works for me (and so I’ll just put the status to Customer Working).”

After a bit more to-ing and fro-ing it emerged that the environment where Oracle had demonstrated kdump could write to NFS had the NFS server on the same subnet as the host where kdump was being tested. After a quick test of my own, using a second compute node in the Exadata system as the NFS server, I confirmed that kdump was able to write to an NFS location on the same subnet in my environment as well.
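To work out which of the two cases applies to your own setup, you can check whether the NFS server falls inside the subnet of the interface kdump will use. The snippet below is a minimal sketch using Python 3's standard `ipaddress` module. The addresses are hypothetical documentation-range values, not the ones from my environment, and the function name is made up for illustration.

```python
import ipaddress


def on_same_subnet(host_cidr: str, server_ip: str) -> bool:
    """Return True if server_ip lies inside the subnet described by host_cidr.

    host_cidr is the host's address with prefix length, e.g. "192.0.2.10/24".
    """
    network = ipaddress.ip_interface(host_cidr).network
    return ipaddress.ip_address(server_ip) in network


if __name__ == "__main__":
    # Hypothetical values: compute node interface and two candidate NFS servers.
    print(on_same_subnet("192.0.2.10/24", "192.0.2.20"))
    print(on_same_subnet("192.0.2.10/24", "198.51.100.5"))
```

If the check returns False, the dump kernel needs a working route to reach the NFS server, which is the situation where the problem showed up.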

Soon after reporting the above test in the SR I was pointed to MOS note 1533611.1, which unfortunately is not publicly available (yet) and so I cannot read it… The crux of the issue is that the network interface configuration files have BOOTPROTO=none and kdump is not handling this appropriately, which results in an incomplete network configuration for bond1 when switching to the dump kernel during a crash.

The fix: Change BOOTPROTO=none to BOOTPROTO=static
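As an illustration of the change, here is a small Python sketch that rewrites the BOOTPROTO line in the text of an interface configuration file. The function name and the sample file contents are assumptions for the example. On a RHEL-style system the real file would normally be /etc/sysconfig/network-scripts/ifcfg-bond1, and you would want a backup before touching it.

```python
import re

# Matches a line of the form BOOTPROTO=none, with optional matching quotes.
_BOOTPROTO_NONE = re.compile(r'^BOOTPROTO=(["\']?)none\1[ \t]*$', re.MULTILINE)


def set_bootproto_static(ifcfg_text: str) -> str:
    """Return the ifcfg text with BOOTPROTO=none changed to BOOTPROTO=static.

    Any other BOOTPROTO value (dhcp, bootp, static) is left untouched.
    """
    return _BOOTPROTO_NONE.sub(r'BOOTPROTO=\g<1>static\g<1>', ifcfg_text)


if __name__ == "__main__":
    # Hypothetical file contents, not copied from a real Exadata node.
    sample = "DEVICE=bond1\nBOOTPROTO=none\nONBOOT=yes\n"
    print(set_bootproto_static(sample), end="")
```

The dump initrd will probably need to be rebuilt before the change takes effect, since the network configuration is captured when mkdumprd runs. That step is an assumption on my part and is worth confirming against the MOS note.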

The part of all this that I find interesting is that the documentation for RHEL 5 (https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/s1-networkscripts-interfaces.html), RHEL 6, and even RHEL 7 all list the following options for BOOTPROTO:

  • none
  • bootp
  • dhcp

“static” does not appear to be a formally valid value. In an attempt to find more information about the behaviour I looked for more details and only got as far as Red Hat BZ#805803 and BZ#802928, neither of which I can access directly, but I can see a summary here and here respectively.

In conclusion, it appears that the issue is actually a kdump bug, or more specifically a mkdumprd bug. Thankfully the workaround is simple; it just took a long time to get to it.

Kdump to NFS Broken in UEK

There were problems that affected UEK and NFS when 11.2.3.2.1 was initially released (as covered by Andy Colvin). As mentioned in the comments of Andy’s post (http://blog.oracle-ninja.com/2013/03/exadata-11-2-3-2-1-nfs-issues-ksplice-support-for-exadata/#comment-1067), Oracle released an updated ISO with fixes for this problem (patch 16432033).

There were also problems with kdump not functioning after 11.2.3.2.1 as listed in “Issue 1.17” of Oracle MOS “Exadata 11.2.3.2.1 release and patch (14522699) for Exadata 11.1.3.3, 11.2.1.2.x, 11.2.2.2.x, 11.2.2.3.x, 11.2.2.4.x, 11.2.3.1.x, 11.2.3.2.x (Doc ID 1485475.1)”.

After updating to 11.2.3.2.1 using the updated ISO (with NFS fixes) and installing the later version of kexec-tools as per the fix for “Issue 1.17” I was not able to get kdump to write to NFS during a crash.

I had previously tested kdump to NFS when running Exadata software version 11.2.2.4.2, so I knew that worked. 11.2.2.4.2 uses the RHEL kernel (“non-UEK” as Oracle would have you say) so I decided to test 11.2.3.2.1 after switching to the RHEL kernel, which also involves reverting the kexec-tools to the earlier version. That confirmed the issue was only evident when using UEK. Time for an SR…

Bug 17730336 – UNABLE TO KDUMP OVER NFS WITH KEXEC-TOOLS-2.0.3-3.0.4.EL5.X86_64.RPM INSTALLED

The bug listed above has been raised, but no fix has been supplied yet.

Not being able to direct kdump to NFS is probably not going to be your biggest worry, but definitely something to be aware of if you’re running UEK.

Note that I have only tested with 2.6.32-400.21.1.el5uek. Oracle have confirmed they have reproduced the issue with 2.6.39-400.126.1.el5uek. Other UEK kernels may or may not be affected.