Clobbering grub.conf is Bad

I’m sharing this in the hope of saving someone from an unwelcome surprise.

Background

I recently upgraded an Exadata system from 11.2.3.2.1 to 11.2.3.3.1. Apart from what turns out to be a known bug[1] that resulted in the patching of the InfiniBand switches “failing”, it all seemed to go without a snag. That was until I decided to do some node failure testing…

Having forced a node eviction I got on with something else while the evicted compute node (database server to non-Exadata folk) booted. After what seemed like a reasonable amount of time, and at an appropriate break in my other work, I attempted to connect to the previously evicted host. No joy. I connected to the ILOM console to see what was going on only to find something like this:

[Image: grub prompt on the ILOM console]

My first thought was, “Is there any conceivable way that causing a node eviction through generation of very heavy swap activity could be responsible for this?”

Investigation

Attempting to boot the host from the grub prompt using the kernel installed as part of 11.2.3.3.1 worked without any issues. Once the host was up I looked at /boot/grub/grub.conf. It was empty (zero bytes). I checked all compute nodes and found the same. Obviously this must have happened after the last reboot of the hosts otherwise they would have failed to boot beyond the grub prompt.

I raised an SR as this seemed like a big deal to me. Not that it wasn’t recoverable, but because it is a gremlin lying in wait to bite when least welcome. It’s easy to imagine a situation where having upgraded to 11.2.3.3.1 the environment is put back to use and at some point later a node is evicted or rebooted for another reason and it doesn’t boot back into the OS. That would suck. I expect my hosts to be in a state that allows them to boot cleanly; if that’s not the case then I want to know about it and be prepared.

The initial response in the SR was that I should run the upgrade again without the “Relax and Recover”[2] entry in grub.conf, stating, “… as there is some suspicion this might be related.”

Having restored the pre-upgrade backup on one of the compute nodes, I ran the upgrade again and started to investigate in detail. Rather than give a blow-by-blow account of that investigation, I’ll cut straight to the final conclusion.

Culprit

misceachboot (part of the Exadata “validations” framework and as the name suggests it runs every time an Exadata compute node boots) clobbers /boot/grub/grub.conf shortly after host startup if an entry for “Oracle Linux Server (2.6.18-308.24.1.0.1.el5)” is found in grub.conf. This is consistently repeatable with the simple test of copying the backup of grub.conf created by dbnodeupdate.sh over the empty grub.conf and rebooting.

As I’ve stated in the SR with Oracle, this appears to be something that would affect all Exadata systems that are upgraded from 11.2.3.2.1 to 11.2.3.3.1. Oracle Support have created bug 19428028, which is not visible at the time I write this.

If someone else had run into this problem then I’d like her/him to share it publicly so that I didn’t get caught out by it. Hence this blog post.

I would be very interested to hear from other Exadata users that have upgraded to 11.2.3.3.1, particularly if it was from 11.2.3.2.1, and whether or not they have seen the same problem.

Update (20th August 2014)

Having found more time to investigate I believe I’ve found the exact cause of the issue…

In the comments of this post I previously stated:

… the “Oracle Linux Server (2.6.18-308.24.1.0.1.el5)” entry is definitely the trigger.

Well, that’s not true!

Also, having taken the time to identify exactly what part of the image_functions code truncates grub.conf, it is now possible to be confident that the other 11.2.3.2.1 systems I have access to will not be affected.

So anyway, here are the important points:

Point 1

Within image_functions a function named “image_functions_remove_from_grub” is defined that includes the following:

perl -00 -ne '/vmlinuz-$ENV{EXA_REMOVE_KERNEL_FROM_GRUB} / or print $_' -i $grub_conf

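The “-00” part is particularly relevant. This invokes “paragraph mode”, in which Perl reads the file one paragraph at a time and treats one or more blank lines (that is, two or more consecutive newline characters) as the separator between paragraphs. A line that contains only spaces does not count as a blank line for this purpose.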

The Perl command has the effect of removing any paragraph from $grub_conf (defined as /boot/grub/grub.conf) that contains the string “vmlinuz-$ENV{EXA_REMOVE_KERNEL_FROM_GRUB} ”, where EXA_REMOVE_KERNEL_FROM_GRUB is an environment variable that Perl reads via %ENV.

Point 2

For a reason I have yet to identify the grub.conf files on the compute nodes of the particular Exadata system that had been upgraded to 11.2.3.2.1 had spaces appended to the end of lines. The number of spaces appended was not consistent across nodes and I’ll continue to try to identify what was responsible. Anyway, this resulted in each break between entries in grub.conf not being simply a newline character, but rather a line with a number of spaces before the newline character.
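A quick way to check whether a grub.conf is in this state is to look for lines that end in whitespace. The following is a minimal sketch run against a fabricated sample file (the file contents and kernel names are made up for illustration); on a real host the same grep patterns could be pointed at /boot/grub/grub.conf.

```shell
#!/bin/sh
# Sketch: count whitespace-only lines and lines with trailing whitespace.
# The sample file is fabricated: the first separator line is genuinely empty,
# the second one contains three spaces.
sample=$(mktemp)
printf 'default=0\n\ntitle A\n\tkernel /vmlinuz-a ro\n   \ntitle B\n\tkernel /vmlinuz-b ro\n' > "$sample"

# Lines made up solely of spaces/tabs (at least one): they look blank but are not empty.
padded=$(grep -c '^[[:space:]][[:space:]]*$' "$sample")
# Lines with any trailing whitespace at all.
trailing=$(grep -c '[[:space:]]$' "$sample")

echo "whitespace-only lines: $padded"
echo "lines with trailing whitespace: $trailing"
# Show the offending line numbers.
grep -n '[[:space:]]$' "$sample"
```

A non-zero count for whitespace-only lines is the condition that matters here, because those are the lines that stop being paragraph separators.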

End Result

I probably don’t need to explain, but in case it isn’t obvious: the combination of points 1 and 2 above means that the entire contents of grub.conf are seen as a single paragraph by the Perl command. As that paragraph contains the kernel referenced by the environment variable EXA_REMOVE_KERNEL_FROM_GRUB, it is removed from grub.conf, resulting in an empty file.

Other Points

The incorrect assertion that it was the 2.6.18-308.24.1.0.1.el5 kernel entry that triggered the problem is the result of misceachboot repeatedly attempting to remove kernel 2.6.18-308.24.1.0.1.el5. If I’ve followed the logic in misceachboot correctly then the attempt to remove kernel 2.6.18-308.24.1.0.1.el5 happens on each boot because the rpm for that kernel is still “installed” even though the kernel files have been removed from /boot (and the entry removed from grub.conf). This is done because of a dependency between fuse-2.7.4-8.0.5.el5.x86_64 and kernel 2.6.18-308.24.1.0.1.el5. Whereas the attempt to remove kernel 2.6.32-400.21.1.el5uek is only performed once (assuming it is successful) after the first reboot post upgrade to 11.2.3.2.1 during which the rpm is completely removed along with the kernel files (and the entry removed from grub.conf).

Footnotes

1 – The bug is known and documented in MOS ID 1614149.1; however, there is no mention of it in MOS ID 1667414.1, which, being entitled “Exadata 11.2.3.3.1 release and patch (17636228)” and having a “Known Issues” section, seems like a reasonable place to expect it to be referenced. I have suggested to Oracle Support that an update to 1667414.1 referencing 1614149.1 would be a good idea. That was on 4th August 2014 and so far there’s been no update.
2 – Relax and Recover (rear) is a very useful bare-metal recovery tool for Linux.

Internet Access with VirtualBox & Host-only Networks (on OS X Mavericks)

Introduction

When creating VMs on my laptop I like to configure the minimum number of network interfaces. I also tend to end up with environments where I want multiple VMs to be able to see each other, see the internet and see my physical host. It seems many people using VirtualBox use the approach of having a “Host-only Adapter” interface and a “NAT” interface. The only reason I have for not liking this is that it is possible for a “Host-only Adapter” to be able to access the wider world via the physical host and therefore I see the NAT interface as surplus to requirements.

In the past when running Snow Leopard I’d worked out that enabling “Internet Sharing” on the Mac allowed my VMs with “Host-only Adapter” to be routed out through whatever network my Mac was connected to (assuming the network, resolv.conf, etc. were appropriately configured on the VM). I’ve suggested this approach to others in the past without questioning what OS X version they were running. I had the odd report of it not working and people resorting to adding a NAT interface, but didn’t have the opportunity to investigate.

Anyway, I got a new computer recently and it came with OS X Mavericks. On creating my first VM in VirtualBox I opted for my preferred approach of using “Host-only Adapter”. I then spent a while working out how to get my Mac to NAT the VMs. Here is the solution I came up with.

Commands

1. Enable IP forwarding:

$ sudo sysctl -w net.inet.ip.forwarding=1

If you want this to be persistent across reboots then you can add it to /etc/sysctl.conf.

2. Edit the PF configuration file (/etc/pf.conf), adding the following line below “nat-anchor”:

nat on en0 from vboxnet0:network -> (en0)

The above assumes that your Mac is connected to the internet over Airport/WiFi (en0) and that you want to allow the first VirtualBox “Host-only Network” (vboxnet0) to be NAT’d.
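If you would rather script the edit than make it by hand, the following is a minimal sketch that inserts the rule directly below the first nat-anchor line. It works on a scratch copy; the sample pf.conf content is a stand-in assumption, so on a real Mac you would start from a copy of your own /etc/pf.conf.

```shell
#!/bin/sh
# Sketch: insert the NAT rule below the nat-anchor line in a COPY of pf.conf.
# The sample content below is a stand-in; use a copy of your own /etc/pf.conf.
tmp=$(mktemp -d)
cat > "$tmp/pf.conf" <<'EOF'
scrub-anchor "com.apple/*"
nat-anchor "com.apple/*"
rdr-anchor "com.apple/*"
anchor "com.apple/*"
EOF

nat_rule='nat on en0 from vboxnet0:network -> (en0)'

# Print every line; after the first nat-anchor line, also print the rule.
awk -v rule="$nat_rule" '
    { print }
    /^nat-anchor/ && !seen { print rule; seen = 1 }
' "$tmp/pf.conf" > "$tmp/pf.conf.new"

cat "$tmp/pf.conf.new"
```

Before copying the result into place, it can be parsed without being loaded by running "sudo pfctl -n -f" against the new file.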

It is also possible to use natd & ipfw, as covered here, but they are deprecated in Mavericks, so you should probably adopt pfctl now.

3. Once pf.conf has been modified the file needs to be loaded:

$ sudo pfctl -f /etc/pf.conf

4. … And you need to ensure that PF is enabled:

$ sudo pfctl -e

The VMs will now be able to access the internet via their “Host-only Adapter” through the physical host.

Result

This gives me a situation where my Mac can see all my VMs, my VMs can see each other and importantly my VMs can get out to the internet – All with a single interface on each VM. KISS.

[Update] Persistent Through Reboot

As mentioned above the sysctl change can be made persistent through reboots by writing it to /etc/sysctl.conf, which you will probably need to create unless you’ve already been tinkering.
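A minimal sketch of adding the setting only if it is not already present follows. SYSCTL_CONF is a hypothetical variable introduced for this example; it defaults to a scratch file so the snippet can be tried without root, and would need to point at /etc/sysctl.conf (run as root) to have any real effect.

```shell
#!/bin/sh
# Sketch: append the forwarding setting to a sysctl.conf-style file exactly once.
# SYSCTL_CONF is a made-up variable for this example and defaults to a scratch file.
SYSCTL_CONF=${SYSCTL_CONF:-$(mktemp)}
setting='net.inet.ip.forwarding=1'

ensure_setting() {
    # -x: match the whole line, -F: treat the setting as a fixed string.
    grep -qxF "$setting" "$SYSCTL_CONF" 2>/dev/null || echo "$setting" >> "$SYSCTL_CONF"
}

ensure_setting
ensure_setting   # the second call finds the line and does nothing
cat "$SYSCTL_CONF"
```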

The changes to /etc/pf.conf will remain in place through restarts, but there are two changes required to ensure that PF is brought up automatically after startup. The first change is covered in section 2 of this Mavericks Server knowledge base article from Apple. The part that is relevant is:

$ sudo defaults write /System/Library/LaunchDaemons/com.apple.pfctl ProgramArguments '(pfctl, -f, /etc/pf.conf, -e)'

The chmod & plutil commands that follow the above in the article were not required in my case as the permissions on the file were already appropriate and the plist file was already in XML format. That said, there is no harm in running them.

The other change you will need to make is to modify the syntax used in /etc/pf.conf. At the point PF is started the “vboxnet0” interface will not exist and therefore PF will not be able to determine the required network information for it, which results in PF not being enabled. In order to avoid this problem it is necessary to switch to the following syntax (assuming you haven’t changed vboxnet0 from the default configuration):

nat on en0 from 192.168.56.0/24 -> (en0)

If you have changed from the default configuration or use multiple “Host-only Networks” in your VirtualBox environment then I’d imagine you can work out how to match the above to your environment.

Sudo Keystroke Optimisation

If, like me and a couple of others I’ve spoken to recently, you were not previously aware of “sudo -i”[1] then you might be interested to know that you can save yourself two keystrokes by switching from:

sudo su -

To:

sudo -i

From the man page:

-i, --login

Run the shell specified by the target user’s password database entry as a login shell. This means that login-specific resource files such as .profile or .login will be read by the shell. If a command is specified, it is passed to the shell for execution via the shell’s ‑c option. If no command is specified, an interactive shell is executed. sudo attempts to change to that user’s home directory before running the shell. The command is run with an environment similar to the one a user would receive at log in. The Command Environment section in the sudoers(5) manual documents how the ‑i option affects the environment in which a command is run when the sudoers policy is in use.

Simple, but useful if you’re someone that enjoys the quest to reduce keystrokes.

Footnotes
[1] – This post was originally written about “-s”, but as pointed out by Paul in the comments this is not equivalent to “sudo su -”.

UKOUG Tech13

I will be presenting on two topics at the UKOUG Tech13 conference in Manchester.

Goodbye KVM… Hello KVM – Monday (2nd December) @ 16:50 in Exchange 7 (45 mins) – If you use virtualisation in your “home lab”, but have never considered KVM then this session is aimed at you.

Pitfalls, Pain and Pleasure with RAC Connectivity – Wednesday (4th December) @ 08:30 in Exchange 10 (45 mins) – If you plan to implement Fast Connection Failover (FCF) for your connection pools to 11gR2 databases then there are some valuable “lessons learnt” in this presentation. If you’ve already implemented FCF to 11gR2 databases there are probably still a few useful points covered.

Hope to see you there.

ReaR MAC Address Mix-Up

Relax and Recover (ReaR) is a great tool for facilitating Linux bare-metal recovery. It works really well; however, there is a bug in the 1.14 release (it seems the same issue is present in 1.15, but I haven’t tested that yet) that affects restores for many bonded Ethernet interface configurations.

Problem

After the recovery of an Exadata compute node using ReaR I saw the following message during the boot process:

Bringing up interface bond1:  Device eth2 has different MAC address than expected, ignoring.

Examination of /etc/sysconfig/network-scripts/ifcfg-eth2 and dmesg showed that sure enough the MAC address in /etc/sysconfig/network-scripts/ifcfg-eth2 did not match that of the device… Hmmm.

A quick look in the restore log file revealed:

2013-09-02 20:20:47 Including finalize/GNU/Linux/30_create_mac_mapping.sh
2013-09-02 20:20:48 Including finalize/GNU/Linux/41_migrate_udev_rules.sh
2013-09-02 20:20:48 Including finalize/GNU/Linux/42_migrate_network_configuration_files.sh
2013-09-02 20:20:48 SED_SCRIPT: ';s/<original mac address removed>/<new mac address removed>/g;s/<ORIGINAL MAC ADDRESS REMOVED>/<NEW MAC ADDRESS REMOVED>/g'

Just to be clear the parts in < and > have been removed by me and represent place-holders for the MAC addresses. First in lower case and then upper case.

Cause

The cause of the issue is in 30_create_mac_mapping.sh, which is responsible for creating a MAC address mapping file (if needed). The MAC address mapping file ($CONFIG_DIR/mappings/mac) is created to handle situations where the restore is to a different host, or at least one where the MAC addresses for the network cards are different to those on the system that was backed up.

The 30_create_mac_mapping.sh script is really very simple and compares the MAC address in the restored ifcfg-<interface> files (if there are any) with the MAC addresses in /sys/class/net/<interface>/address. If there is a difference then it writes the old and current MAC addresses to the MAC address mapping file ($CONFIG_DIR/mappings/mac[1]) for later use by 42_migrate_network_configuration_files.sh. All good, right? Well, a problem can come about when dealing with bonded interfaces. Specifically, when in active-backup mode with fail_over_mac set to none (0 has the same meaning). In this configuration /sys/class/net/<interface>/address reports the same value for all slaves of a bond.

So, when booting into recovery mode ReaR attempts to recreate the network configuration as it was at the time the ReaR boot ISO was created (via script 60-network-devices.sh). If you had bonded interfaces at the point you created the ReaR boot ISO, then you’ll have them in recovery mode too. This means that when 30_create_mac_mapping.sh runs it will write MAC addresses to $CONFIG_DIR/mappings/mac, and shortly afterwards 42_migrate_network_configuration_files.sh will run and update the ifcfg-<interface> files, setting the MAC address for all interfaces in a given bond to the same value, which is not correct.
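To see the real per-slave addresses on a bonded host, the bonding driver’s status file under /proc/net/bonding reports a “Permanent HW addr” for each slave, independent of what /sys/class/net/<interface>/address shows. The following is a minimal sketch that extracts those pairs; the sample text and MAC addresses are fabricated, and the function name permanent_macs is made up for this example.

```shell
#!/bin/sh
# Sketch: list each bond slave with its permanent MAC address.
# Sample text and MACs are fabricated; on a real host the input would be
# /proc/net/bonding/bond1 (or whichever bond is in use).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1

Slave Interface: eth1
MII Status: up
Permanent HW addr: 00:11:22:33:44:01

Slave Interface: eth2
MII Status: up
Permanent HW addr: 00:11:22:33:44:02
EOF

permanent_macs() {
    # Remember the most recent slave name, then print it with its permanent MAC.
    awk '/^Slave Interface:/   { iface = $3 }
         /^Permanent HW addr:/ { print iface, $4 }' "$1"
}

result=$(permanent_macs "$sample")
echo "$result"
```

Comparing this output with the HWADDR values in the ifcfg-<interface> files is one way to confirm whether a restore has written the same address to every slave.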

Solution

After initially thinking that I was going to need to come up with a patch for ReaR that would handle bonded interfaces appropriately, and thinking that this would be more complicated than it might sound, I had another look at 30_create_mac_mapping.sh and realised that if I create an empty $CONFIG_DIR/mappings/mac file (valid in my case as the MAC addresses have not changed when doing a straight restore to the same host) then ReaR will not create a new file, or add records to the existing file, and there will be no attempt to update the MAC addresses in the ifcfg-<interface> scripts.

The above worked and so a fix for the MAC address mix-up with bonded interfaces, when restoring to a host with the same MAC addresses, is to run the following command[1] before running “rear recover”:

# touch /etc/rear/mappings/mac

Footnotes
[1] – $CONFIG_DIR is a variable used in Relax and Recover, but note that your $CONFIG_DIR might not be /etc/rear

Relax and Recover

Relax and Recover (ReaR) is great. It does exactly what it needs to do with the minimum of fuss.

What?

I’ll start by explaining what ReaR does and does not do. The “does not” part is important as many people I’ve spoken with about ReaR have been confused about exactly what it does.

The main purpose of ReaR is to create a bootable image, based on what is currently installed on a Linux host, that can be used to partition disks and retrieve a backup of the system. There are options for where to create the bootable image and what to do with it after it has been created.

The bootable image can be a USB device, an ISO file or a number of other options.

If you create a bootable image on a USB device then you may also wish to create a backup of your system on the same device, which ReaR will support.

When creating a bootable image as an ISO file you have a multitude of options for what to do with the file in order to get it off the box so that it can be used for recovery. The two options I have used are rsync and TSM.

The misconception I mentioned earlier is the belief that ReaR will back up your system. It can do that, but it is not a given and depends on your configuration.

If ReaR isn’t backing up the system, what will?

ReaR will work with tar, rsync and a number of 3rd party commercial backup solutions. Again, I have used rsync and TSM.

Why?

You might be asking yourself why this is important/useful.

Imagine your beloved machine has suffered a death by file system corruption. You have a backup of the system, but what next? You need a way of getting the backup data back on disk. I know a number of places that do not restore full systems, but rather rebuild them if something goes horribly wrong with the operating system. I can see the value in the approach, however, it relies on you knowing what the state of the system was. For example:

  1. What configuration changes have been made since the installation?
  2. What additional software has been installed?
  3. What scripts have been put in place?

The list goes on.

It is totally possible to manage all that, but you have to be proactive. It’s a problem that won’t solve itself.

If you don’t have details of all the customisations then wouldn’t it be really nice to be able to run a command to pull back the contents of all your local file systems as they were at the point of the last backup, allowing you to simply reboot and be back in business? That’s what ReaR will do for you 🙂

Example Procedure

The following is an example of the procedure to protect a system with ReaR and TSM during some operating system patching activities (assumes TSM is already installed):

  1. Install ReaR (rpms are available here).
  2. Configure ReaR to use TSM and to create an ISO file by updating /etc/rear/local.conf with a line of OUTPUT=ISO and another with BACKUP=TSM.
  3. Run “rear -v mkrescue” to create the bootable ISO and send it to TSM (mkbackup would have the same effect in this case as TSM will be handling the file system backups independently – I feel mkrescue makes it clearer what you’re doing).
  4. Perform an incremental backup of your file systems with TSM using “dsmc inc …”.
  5. Do your patching activity.
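The configuration described in step 2 amounts to the following two lines. This is a minimal sketch of /etc/rear/local.conf containing only the two settings named above; a real configuration may well need more, depending on the environment.

```shell
# Sketch of /etc/rear/local.conf for the setup described in step 2.
# ReaR configuration files use shell variable-assignment syntax.
OUTPUT=ISO     # build the rescue image as a bootable ISO file
BACKUP=TSM     # file system backup and restore are handled by TSM
```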

If all goes well then you don’t need to boot from the ReaR ISO and restore your operating system. But, let’s say it didn’t go well. Your system will no longer boot and there’s no immediately obvious way forward. You decide to restore. The procedure is:

  1. Restore the ReaR ISO to a location that will allow you to present it to the server. This is most likely to be your desktop so you can present the ISO file as a virtual CD-ROM over the ILOM interface.
  2. Present the ISO to the host to be recovered.
  3. Boot the host from the ISO – It is highly likely that you’ll need to change the boot order or get a pop-up menu to select the ISO as the boot media.
  4. Select “Recover <hostname>” at the grub prompt.
  5. Log in as root (password not required).
  6. Run “rear -v recover” and answer the interactive prompts.

Issues

Since starting to use ReaR I have encountered two problems:

  1. When recovering a host that used an ext4 file system for /boot I found myself facing a message of “Error 16: Inconsistent filesystem structure.” from grub. After a bit of digging around and trying to understand what the issue was I ended up modifying the /var/lib/rear/layout/disklayout.conf ReaR file to change the file system type for /boot from ext4 to ext2. I initially tried ext3, but as the system did not use ext3 for any of the file systems the module was not available.
  2. The version of ReaR that I was using had a bug (tracked on GitHub) that affected systems that do not have a separate /boot partition. There is a patch for the bug available, but if like me you’re happy to have a manual workaround, you need to perform the following actions after the restore completes:
# chroot /mnt/local
# PATH=/bin:/sbin:/usr/bin
# grub-install <disk path>
# exit
# reboot
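For the first issue, the change to disklayout.conf can be sketched as follows. This assumes the file describes file systems on lines of the form “fs <device> <mountpoint> <fstype> …”; the sample lines, devices and UUIDs are made up, and the edit is applied to a scratch copy rather than to /var/lib/rear/layout/disklayout.conf itself.

```shell
#!/bin/sh
# Sketch: switch /boot from ext4 to ext2 in a COPY of disklayout.conf.
# Assumes "fs <device> <mountpoint> <fstype> ..." lines; sample data is fabricated.
layout=$(mktemp)
cat > "$layout" <<'EOF'
fs /dev/sda1 /boot ext4 uuid=11111111-2222-3333-4444-555555555555 label=
fs /dev/mapper/vg-root / ext4 uuid=66666666-7777-8888-9999-000000000000 label=
EOF

# Only touch the fs line whose mountpoint field is exactly /boot.
awk '$1 == "fs" && $3 == "/boot" && $4 == "ext4" { $4 = "ext2" } { print }' \
    "$layout" > "$layout.new"

# Show what changed (diff exits non-zero when the files differ).
diff "$layout" "$layout.new" || true
```

Matching on the mountpoint field keeps the root file system line untouched. Whether awk is available inside the rescue environment is not something I have checked, so editing the file by hand there may be the simpler route.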

Finally, it’s worth mentioning that ReaR is written in shell and is open source.