October 15, 2012

» Adding Networks to Exadata: Fun with policy routing

I’ve noticed that Exadata servers are now configured to use Linux policy routing. Peeking at My Oracle Support, note 1306154.1 goes into a bit more detail about this configuration. It’s apparently delivered by default with factory build images and later. The note goes on to explain that this configuration was implemented because of asymmetric routing problems associated with the management network:

Database servers are deployed with 3 logical network interfaces configured: management network (typically eth0), client access network (typically bond1 or bondeth0), and private network (typically bond0 or bondib0). The default route for the system uses the client access network and the gateway for that network. All outbound traffic that is not destined for an IP address on the management or private networks is sent out via the client access network. This poses a problem for some connections to the management network in some customer environments.

It goes onto mention a bug where this was reported:


The bug is not public, but the title does show the type of error messages that would appear if a packet with a non-local source address comes out.

This configuration is implemented using Red Hat/Oracle Linux-style /etc/sysconfig/network-scripts files, with matched rule- and route- files for each interface.

A sample configuration, where the management network is in the 10.10.10/24 subnet, is:

[root@exa1db01 network-scripts]# cat rule-eth0
from table 220
to table 220
[root@exa1db01 network-scripts]# cat route-eth0
dev eth0 table 220
default via dev eth0 table 220

This configuration tells traffic originating from the IP (which is the management interface IP on this particular machine), as well as traffic destined to that address, to be directed away from the regular system routing table to a special routing table, 220. route-eth0 configures table 220 with two routes: one for the local network, and a default route through a router on that network.

This contrasts with the default gateway of the machine itself:

[root@exa1db01 network-scripts]# grep GATEWAY /etc/sysconfig/network

The difference between this type of policy routing and regular routing is that traffic with the _source_ address of the management interface will automatically go through its own default gateway, regardless of the destination. (The bible for Linux routing configuration is the Linux Advanced Routing and Traffic Control HOWTO, for those looking for more details.)
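Under the hood, the rule-/route- files are just a sysconfig wrapper around iproute2 commands. A rough hand-typed equivalent would look like the following sketch; the 192.0.2.10 management IP and 192.0.2.1 gateway are placeholder documentation addresses, not the real deployment’s:

```shell
# Hypothetical iproute2 equivalents of the rule-eth0/route-eth0 files
# (192.0.2.x addresses are placeholders from the documentation range).
ip rule add from 192.0.2.10 table 220          # traffic sourced from the mgmt IP
ip rule add to 192.0.2.10 table 220            # traffic destined to the mgmt IP
ip route add 192.0.2.0/24 dev eth0 table 220   # local-network route in table 220
ip route add default via 192.0.2.1 dev eth0 table 220
```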

I ran into an issue with this configuration when adding a second external network on the bondeth1 interface. I set up the additional interface configuration for the new network:

[root@exa1db01 network-scripts]# cat ifcfg-bondeth1
BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000 num_grat_arp=100"

I also added rule and route entries:

[root@exa1db01 network-scripts]# cat rule-bondeth1
from table 211
to table 211
[root@exa1db01 network-scripts]# cat route-bondeth1
dev bondeth1 table 211
via dev bondeth1 table 211
default via dev bondeth1 table 211

This was a dedicated Data Guard network to a remote server.

The problem with this configuration was that it didn’t work. Using tcpdump, I could see incoming requests arrive on the bondeth1 interface, but the replies went out the system default route on bondeth0, never reaching their destination. After some digging, I found the cause of the problem: in order to determine the packet source IP, the kernel was looking up the destination in the default routing table (table 255). The route for the Data Guard network was in non-default table 211, so the packet followed the default route instead, got a source address on the client-access network, and never matched any of the routing rules for the Data Guard network.

The solution ended up being rather simple: taking the “table 211” qualifier off the Data Guard network route, effectively putting it in the default routing table:

[root@exa1db01 network-scripts]# cat route-bondeth1
dev bondeth1 table 211
default via dev bondeth1 table 211
via dev bondeth1

And then we ran into a second issue: the main interface IP could now be reached, but not the virtual IP (VIP). This is because the rule configuration, taken from the samples, doesn’t list the VIP address at all. To avoid this issue, and to handle VIP addresses migrating from other cluster nodes, we set up a netmask in the rule file, making all addresses in the Data Guard network use this particular routing rule:

[root@exa1db01 network-scripts]# cat rule-bondeth1
from table 211
to table 211

So to sum up, when setting up interfaces on a policy-routed Exadata system, remember to:

  • Set up the interface itself and any bonds using ifcfg- files
  • Create a rule- file for the interface, covering every address the interface could possibly have (I added the entire IP subnet), with “from” and “to” lines pointing at a unique routing table number
  • Create a route- file for the interface, listing a local network route and a default route with the default router of the subnet, all using the table number defined on the previous step
  • Add to the route- file any static routes required on this interface, but don’t add a table qualifier
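The steps above can be sketched with placeholder values; the 192.0.2.0/24 subnet, 192.0.2.1 gateway, and file contents here are hypothetical illustrations, not the real deployment’s:

```shell
# Hypothetical rule-/route- files for bondeth1 (placeholder addresses from
# the 192.0.2.0/24 documentation range; table number 211 as in the post).
cat > rule-bondeth1 <<'EOF'
from 192.0.2.0/24 table 211
to 192.0.2.0/24 table 211
EOF

# Local network and default route in table 211; any static routes would be
# appended here with no "table" qualifier.
cat > route-bondeth1 <<'EOF'
192.0.2.0/24 dev bondeth1 table 211
default via 192.0.2.1 dev bondeth1 table 211
EOF
```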

The final configuration:

[root@exa1db01 network-scripts]# cat ifcfg-eth8
[root@exa1db01 network-scripts]# cat ifcfg-eth12
[root@exa1db01 network-scripts]# cat ifcfg-bondeth1
BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000 num_grat_arp=100"
[root@exa1db01 network-scripts]# cat rule-bondeth1
from table 211
to table 211
[root@exa1db01 network-scripts]# cat route-bondeth1 dev bondeth1 table 211
default via dev bondeth1 table 211 via dev bondeth1

September 20, 2012

» Troubleshooting ORA-27090 async I/O errors with systemtap

Last week I ran into an issue on a high-volume Oracle database, whereby sessions were periodically failing with ORA-27090 errors. Job queue processes were also seeing this error, and showing messages like this in the database alert log:

Tue Sep 11 20:56:15 2012
Errors in file /orahome/oracle/base/diag/rdbms/dbm/dbm1/trace/dbm1_j001_126103.trc:
ORA-27090: Unable to reserve kernel resources for asynchronous disk I/O
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 3
Additional information: 128
ORA-27090: Unable to reserve kernel resources for asynchronous disk I/O
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 3
Additional information: 128

The tracefile just showed the same ORA-27090 messages, so nothing particularly useful there. oerr is of no help:

$ oerr ora 27090
27090, 00000, "Unable to reserve kernel resources for asynchronous disk I/O"
// *Cause: The system call to reserve kernel resources for asynchronous I/O
// has failed.
// *Action: Check errno

There’s a known bug, 7306820 “ORA-7445 [krhahw] / ORA-27090 during file header read. Instance may crash”, but that bug is fixed in the version this database is running.

And on top of that, it’s an Exadata system, so I/O to storage servers goes over the InfiniBand network rather than using async I/O (AIO) calls.

A web search turned up an entry from DBHK’s blog, pointing to a value of aio-max-nr being set too low. However, aio-max-nr is actually set to the recommended value, matching the Exadata defaults as well:

# cat /proc/sys/fs/aio-max-nr

The Linux kernel documentation has a brief but meaty description of this parameter:

aio-nr & aio-max-nr:

aio-nr is the running total of the number of events specified on the io_setup system call for all currently active aio contexts. If aio-nr reaches aio-max-nr then io_setup will fail with EAGAIN. Note that raising aio-max-nr does not result in the pre-allocation or re-sizing of any kernel data structures.

Having a peek at aio-nr:

# cat /proc/sys/fs/aio-nr

We’re within 2 of the absolute limit, so it looks highly likely that this limit is indeed the problem. However, the question is: who is using these AIO events? This DB is a huge session hog (8000+ concurrent sessions), but even so, 3m is a pretty high limit. And at this point we can’t even be sure that it’s database processes using up the AIO events.

The only AIO-related information in /proc (or /sys for that matter) is the two files in /proc/sys/fs. To go into more detail requires some more tools.
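Checking the headroom from those two files is a quick shell exercise (a sketch; Linux-only paths):

```shell
# Compare current AIO context usage against the limit, using the only two
# AIO-related files /proc exposes.
in_use=$(cat /proc/sys/fs/aio-nr)
limit=$(cat /proc/sys/fs/aio-max-nr)
printf 'aio-nr=%s aio-max-nr=%s (%s%% used)\n' \
  "$in_use" "$limit" "$(( in_use * 100 / limit ))"
```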

Solaris admins will no doubt be familiar with DTrace, a kernel tracing framework that can expose all kinds of information in the OS kernel, among other things. Oracle has ported DTrace to Linux, but it requires the latest-and-greatest UEK2 kernel, which is not yet supported on Exadata.

I came across another tool that also allows kernel inspection, and _is_ available in Oracle Linux 5: systemtap. systemtap hooks into the kernel call stack, allowing function calls to be traced, arguments captured, and, if you’re really brave, even modified.

With dependencies, I ended up needing to add four packages. As this machine doesn’t (yet) have a working yum repository, I had to obtain the packages separately.


The avahi package is a tool for plug-and-play networking that I don’t exactly want running on a server, but the systemtap binary is linked to it for remote compilation capability. Avahi configures itself to start automatically on the next boot, so I disabled that:

# chkconfig avahi-daemon off
# chkconfig avahi-dnsconfd off

The systemtap packages complained about missing kernel package dependencies, since this system is running a UEK kernel, which names the kernel package kernel-uek instead. I ended up installing them with the --nodeps option to skip dependency checking.

I couldn’t find any pre-made scripts to monitor AIO, but a 2008 presentation from Oracle Linux engineering does have a bullet point on it:

• Tracking resources tuned via aio_nr and aio_max_nr

So, based on some of the many example scripts, I set out to build a script to monitor AIO calls. Here is the end result:

stap -ve '
global allocated, allocatedctx, freed

probe syscall.io_setup {
  allocatedctx[pid()] += maxevents; allocated[pid()]++;
  printf("%d AIO events requested by PID %d (%s)\n",
  	maxevents, pid(), cmdline_str());
}

probe syscall.io_destroy {freed[pid()]++}

probe kprocess.exit {
  if (allocated[pid()]) {
     printf("PID %d exited\n", pid());
     delete allocated[pid()];
     delete allocatedctx[pid()];
     delete freed[pid()];
  }
}

probe end {
  foreach (pid in allocated) {
    printf("PID %d allocated=%d allocatedevents=%d freed=%d\n",
      pid, allocated[pid], allocatedctx[pid], freed[pid]);
  }
}'

Sample output (using systemtap’s -v verbose option to see compilation details):

Pass 1: parsed user script and 76 library script(s) using 147908virt/22876res/2992shr kb, in 130usr/10sys/146real ms.
Pass 2: analyzed script: 4 probe(s), 10 function(s), 3 embed(s), 4 global(s) using 283072virt/49864res/4052shr kb, in 450usr/140sys/586real ms.
Pass 3: using cached /root/.systemtap/cache/11/stap_111c870f2747cede20e6a0e2f0a1b1ae_6256.c
Pass 4: using cached /root/.systemtap/cache/11/stap_111c870f2747cede20e6a0e2f0a1b1ae_6256.ko
Pass 5: starting run.
128 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
(and when control-C is pressed):

PID 99043 allocated=6 allocatedevents=12672 freed=3
PID 37074 allocated=12 allocatedevents=25344 freed=6
PID 99039 allocated=18 allocatedevents=38016 freed=9
PID 69142 allocated=24 allocatedevents=50688 freed=12
PID 32885 allocated=36 allocatedevents=76032 freed=18
PID 69099 allocated=6 allocatedevents=12672 freed=3
Pass 5: run completed in 0usr/50sys/9139real ms.

It’s quite obvious here that the AIO allocations are all happening from oracle database processes.

From the summary output we can see that each process seems to run io_setup twice as often as io_destroy; kernel gurus may have an answer for this, but I suspect it has more to do with the data gathering than with a massive leak in AIO events.

But the more interesting result is the frequent allocation of 4096 AIO events at a time. On a database with 8000 connections, that would be over 10 times the current limit.
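A back-of-envelope check of that claim, using the 3m (3145728) limit mentioned earlier and shell arithmetic:

```shell
# 8000 sessions each doing a 4096-event io_setup, vs. an aio-max-nr of 3145728.
echo $(( 8000 * 4096 ))            # total events requested: 32768000
echo $(( 8000 * 4096 / 3145728 )) # roughly 10x the limit (integer division)
```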

The only major downside of increasing this limit seems to be the risk of exhausting kernel memory. From a 2009 post to the linux-kernel mailing list:

Each time io_setup is called, a ring buffer is allocated that can hold nr_events I/O completions. That ring buffer is then mapped into the process’ address space, and the pages are pinned in memory. So, the reason for this upper limit (I believe) is to keep a malicious user from pinning all of kernel memory.

And in consultation with Oracle support, we set aio-max-nr to 50 million, enough to accommodate three databases with 16k connections each, all allocating 4096 AIO events. Or in other words, way more than we ever expect to use.

# sysctl -w fs.aio-max-nr=50000000
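Note that sysctl -w only changes the running kernel; to keep the setting across reboots, the usual sysctl.conf approach applies (a sketch; run as root):

```shell
# Persist the new limit in /etc/sysctl.conf and reload (requires root).
echo "fs.aio-max-nr = 50000000" >> /etc/sysctl.conf
sysctl -p
```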

And since this change, the ORA-27090 errors have gone away.

March 27, 2012

» Using Oracle VM with Amazon EC2

If you’re planning on running Oracle VM with Amazon EC2, there are some important limitations you should know about.

As part of my work getting the Oracle Linux Unbreakable Enterprise Kernel 2 working (yeah that’s a mouthful) I tried using the Oracle-supplied Oracle Linux 6 AMI images that are listed as community AMIs by Amazon:

[marc@shakybox2 ~]$ ec2-describe-images -a --filter "name=*Oracle Linux 6*"
IMAGE   ami-87a076ee    936851542197/Oracle Linux 6.1 x86_64 - OVM      936851542197    available       public          x86_64  machine aki-4e7d9527    ebs     paravirtual     ovm
BLOCKDEVICEMAPPING      /dev/sda                snap-ea95e18e   20

The “OVM” part here is interesting. Amazon EC2 is typically set up on their own version of Xen, but this image uses a different, though similarly Xen-based, hypervisor: Oracle VM. This appears to be a result of the Oracle-Amazon partnership so touted by Larry Ellison in his OpenWorld 2010 keynote.

The AMI itself launched as expected (once I had set the instance type to large; it seems we need the 7.5G of RAM the large instance provides!). But after the initial install, I was completely unable to use any kernel other than the 2.6.32-100.34.1.el6uek.x86_64 kernel originally supplied.

Every time, the instance would be listed as running but was unreachable, with the console log showing a variation of:

"main" "root=/dev/sda1" "ro" "4"
vbd 2049 is hd0
******************* BLKFRONT for device/vbd/2049 **********

backend at /local/domain/0/backend/vbd/313/2049
Failed to read /local/domain/0/backend/vbd/313/2049/feature-barrier.
Failed to read /local/domain/0/backend/vbd/313/2049/feature-flush-cache.
41943040 sectors of 512 bytes

Since it’s not possible to specify a --kernel alternate-kernel option with Oracle VM hosts, I can’t explicitly attach a different kernel. And the ec2-register command used to create AMI images has no option to create an image with the Oracle VM hypervisor at all. This Amazon FAQ document gives a clue:

Q: Can customers import Oracle Virtual Machine templates into Amazon EC2?

Oracle will distribute templates of Oracle Applications that have been produced by Oracle and approved by AWS. Customers will be able to customize these environments and save them as derivative images which also run on the Oracle Virtual Machine.

So it’s possible to customize pre-made templates, but the ability to create new templates is conspicuously not mentioned. And indeed, it’s possible to create a derivative image using ec2-create-image that still has an OVM hypervisor, but there’s no mechanism to use a custom kernel with this command.

So the take away here seems to be:

  • It’s possible to create Oracle VM-based Amazon EC2 instances, but only from Oracle-created AMIs
  • There’s no mechanism to use kernels other than those bundled with the image, so there’s no prospect of applying security updates etc unless Oracle releases a new AMI
  • Fortunately, I was able to get the Oracle Linux UEK2 kernel working with Amazon’s ordinary Xen hypervisor, though there’s no pre-built AMI for it. I made a separate blog post about this.

For you readers: have any of you managed to get a successful Oracle VM-based Amazon EC2 instance working? How do you deal with the issue of kernel updates?

» Testing out Oracle’s Unbreakable Enterprise Kernel 2

As announced a few days ago, Oracle’s core database product is now supported on Oracle Linux 6. Coming a full 13 months after Oracle Linux 6’s launch, and 16 months after Red Hat Enterprise Linux 6, it’s a much-anticipated announcement.

Update 28-Mar-12: the official certification information has come out on My Oracle Support. So far it’s only certified with Oracle Linux 6 Unbreakable Enterprise Kernel version 1, and only under Linux x86-64. It also means that a certified combination should be possible using the Oracle-supplied OEL 6 EC2 AMI, though it’s frozen at the original 6.2 release kernel. Unfortunately this Oracle release is not currently available on OTN but rather requires an active support license to download from My Oracle Support.

Oracle’s UEK2 (and Red Hat Enterprise Linux 6) add some very interesting features like:

  • btrfs, a big break in filesystem organization from traditional ext2/3, with the promise of lightweight copy-on-write snapshot support
  • transparent hugepages, dynamically allocating hugepages as needed and performing background memory defragmentation to attempt to free up contiguous space
  • transmit packet steering, allowing multiple CPUs to load-balance outgoing network traffic

Although the press release states it’s available “immediately”, I couldn’t find any notes on Oracle’s My Oracle Support portal relating to this support; in fact, it still lists Oracle Enterprise Linux 6 as being uncertified as of this writing. So I’m not sure how it will pass the pre-installation operating system checks.

No matter, I’m going to test this out, and an obvious way to do this would be using Amazon EC2, providing high-capacity instances on demand.

After some blind alleys getting the Oracle Linux UEK2 kernel working with Amazon EC2 and Oracle VM, I found that I could make it work without Oracle VM, using Amazon’s default Xen hypervisor instead. Here are the steps I used:

- Sign up for an Amazon EC2 account and set up the EC2 API tools on your client machine. There are lots of tutorials on how to do this online

- Create a new EC2 instance using a “builder” AMI; I chose a 64-bit CentOS 6 image “CentOS 6 PVGrub EBS AMI”, AMI ID ami-1f66b276

[marc@shakybox2 tmp]$ ec2-run-instances -k marc-aws -n 1 -t m1.medium -z us-east-1d ami-1f66b276
RESERVATION     r-d18f28b2      462281317311    default
INSTANCE        i-22d8f846      ami-1f66b276                    pending marc-aws        0             m1.medium        2012-03-24T21:25:11+0000        us-east-1d      aki-427d952b                    monitoring-disabled                                    ebs                                     paravirtual   xen              sg-5fc61437     default

- Assign a name to the instance

[marc@shakybox2 tmp]$ ec2-create-tags i-22d8f846  --tag 'Name=Instance Builder'
TAG     instance        i-22d8f846      Name    Instance Builder

- Authorize incoming SSH

[marc@shakybox2 ~]$ ec2-authorize default -p 22 -s $(ip addr list dev eth0 | awk '/inet / {print $2}')

- Create a new 20G EBS volume; this will be the “golden image” root disk. Attach it to the builder instance

[marc@shakybox2 tmp]$ ec2-create-volume -s 20 -z us-east-1d
VOLUME  vol-d7340cbb    20              us-east-1d      creating        2012-03-24T21:31:39+0000

[marc@shakybox2 tmp]$ ec2-attach-volume -i i-22d8f846 -d /dev/sdd vol-d7340cbb
ATTACHMENT      vol-d7340cbb    i-22d8f846      /dev/sdd        attaching       2012-03-24T21:33:26+0000

- Get the IP address to connect to (substituting the correct image ID and hostname):

[marc@shakybox2 tmp]$ ec2-describe-instances  i-22d8f846
RESERVATION     r-d18f28b2      462281317311    default
INSTANCE        i-22d8f846      ami-1f66b276 ip-10-116-237-78.ec2.internal  running marc-aws        0               m1.medium       2012-03-24T21:25:11+0000      us-east-1d       aki-427d952b                    monitoring-disabled ebs                                      paravirtual     xen             sg-5fc61437     default
BLOCKDEVICE     /dev/sda1       vol-39310955    2012-03-24T21:25:28.000Z        true
TAG     instance        i-22d8f846      Name    Instance Builder

[marc@shakybox2 tmp]$ ssh -i marc-aws.pem

- Finding the volume inside our AMI, which just got hotplugged:

[root@ip-10-116-237-78 ~]# dmesg | tail -2
blkfront: xvdh: barriers disabled
 xvdh: unknown partition table

- Creating a filesystem and mounting. Note I’m not creating a partition table, but rather a raw filesystem. This will make things much easier if the volume ever needs to be resized.

[root@ip-10-116-237-78 ~]# mke2fs -j -L / /dev/xvdh
mke2fs 1.41.12 (17-May-2010)
Filesystem label=/
OS type: Linux
This filesystem will be automatically checked every 35 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@ip-10-116-237-78 ~]# mkdir /mnt/ec2-fs
[root@ip-10-116-237-78 ~]# mount /dev/xvdh /mnt/ec2-fs
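As an aside, skipping the partition table pays off later: growing the filesystem after enlarging the EBS volume becomes a two-command sketch (hypothetical follow-up, using the same /dev/xvdh device; run against an unmounted filesystem):

```shell
# With the filesystem directly on the device (no partition table to edit),
# resizing after an EBS volume grow is just a check plus a resize.
e2fsck -f /dev/xvdh
resize2fs /dev/xvdh
```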

- Creating the base directories, as per Jonathan Hui’s excellent blog post

mkdir /mnt/ec2-fs/dev
/sbin/MAKEDEV -d /mnt/ec2-fs/dev/ -x console
/sbin/MAKEDEV -d /mnt/ec2-fs/dev/ -x null
/sbin/MAKEDEV -d /mnt/ec2-fs/dev/ -x zero
mkdir /mnt/ec2-fs/etc
mkdir /mnt/ec2-fs/proc

- Creating /etc/fstab

cat > /mnt/ec2-fs/etc/fstab <<EOF
LABEL=/ /       ext3    defaults 1 1
none    /proc   proc    defaults 0 0
none    /sys    sysfs   defaults 0 0
EOF
- Mounting /proc

mount -t proc none /mnt/ec2-fs/proc

- Grabbing the OEL 6 yum config file

[root@ip-10-116-237-78 ~]# cd /root
[root@ip-10-116-237-78 ~]# wget
--2012-03-24 22:42:54--
Connecting to||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1461 (1.4K)
Saving to: 'public-yum-ol6.repo'

100%[======================================>] 1,461       --.-K/s   in 0s

2012-03-24 22:42:55 (106 MB/s) - 'public-yum-ol6.repo' saved

cat <<-EOF >> public-yum-ol6.repo

- Installing the base OS

[root@ip-10-116-237-78 ~]# yum -c /root/public-yum-ol6.repo --installroot=/mnt/ec2-fs -y groupinstall Core
ol6_latest                                               | 1.1 kB     00:00

- Installing the latest UEK2 kernel

yum -c /root/public-yum-ol6.repo --enablerepo=ol6_UEK_latest --installroot=/mnt/ec2-fs -y install kernel-uek

(not small: 200m+ for the kernel alone)

- Setting up base networking scripts

cat > /mnt/ec2-fs/etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
echo "NETWORKING=yes" > /mnt/ec2-fs/etc/sysconfig/network
echo "nameserver" > /mnt/ec2-fs/etc/resolv.conf
echo "UseDNS  no" >> /mnt/ec2-fs/etc/ssh/sshd_config
echo "PermitRootLogin without-password" >> /mnt/ec2-fs/etc/ssh/sshd_config
echo "hwcap 0 nosegneg" > /mnt/ec2-fs/etc/

- Script a download of the SSH public key on startup in case it’s missing (though with EBS-backed storage this shouldn’t be necessary)

cat > /mnt/ec2-fs/usr/local/sbin/ <<EOF
if [ ! -d /root/.ssh ] ; then
  mkdir -p /root/.ssh
  chmod 700 /root/.ssh
fi
# Fetch public key using HTTP
/usr/bin/curl -f > /tmp/my-key
if [ \$? -eq 0 ] ; then
  cat /tmp/my-key >> /root/.ssh/authorized_keys
  chmod 600 /root/.ssh/authorized_keys
  rm -f /tmp/my-key
fi
EOF
chmod u+x /mnt/ec2-fs/usr/local/sbin/
echo "/usr/local/sbin/" >> /mnt/ec2-fs/etc/rc.d/rc.local

- Clean up temporary files from the installs (mostly the RPM packages)

yum -c /root/public-yum-ol6.repo --installroot=/mnt/ec2-fs clean all

- Set up GRUB boot files (the kernel version is globbed here, so the entries match whatever kernel was installed above)

chroot /mnt/ec2-fs
cd /boot/grub
cat > grub.conf <<EOF

title Oracle Linux UEK
        root (hd0)
        kernel $(echo /boot/vmlinuz*uek.x86_64) ro root=LABEL=/ ro console=hvc0 crashkernel=auto LANG=en_US.UTF8 selinux=0
        initrd $(echo /boot/initramfs*uek.x86_64.img)
title Oracle Linux Compatibility Kernel
        root (hd0)
        kernel $(echo /boot/vmlinuz*el6.x86_64) ro root=LABEL=/ ro console=hvc0 crashkernel=auto LANG=en_US.UTF8 selinux=0
        initrd $(echo /boot/initramfs*el6.x86_64.img)
EOF
ln -s grub.conf menu.lst

- Set up swap. (There’s no need to put this on expensive EBS storage when ephemeral storage will do.)

cat > /mnt/ec2-fs/usr/local/sbin/ <<EOF
VOL=\$1
SIZE=\$2
dd if=/dev/zero of=\$VOL bs=\${SIZE}k count=1 && mkswap \$VOL \$SIZE && swapon \$VOL
EOF
chmod +x /mnt/ec2-fs/usr/local/sbin/
echo "/usr/local/sbin/ /dev/xvdb 2097152" >> /mnt/ec2-fs/etc/rc.d/rc.local

- Unmount our root disk

umount /mnt/ec2-fs/proc
umount /mnt/ec2-fs

- Log out of the builder instance; our work there is done


- Create a snapshot of the root volume. Use the volume ID originally used to create the volume

[marc@shakybox2 tmp]$ ec2-create-snapshot -d "UEK2 AMI creation point" vol-d7340cbb
SNAPSHOT        snap-b94519c3   vol-d7340cbb    pending 2012-03-25T02:05:43+0000                462281317311   20      UEK2 AMI creation point

- Check when it’s completed

[marc@shakybox2 tmp]$ ec2-describe-snapshots snap-b94519c3
SNAPSHOT        snap-b94519c3   vol-d7340cbb    completed       2012-03-25T02:05:43+0000        100%  462281317311     20      UEK2 AMI creation point

- Register the snapshot, effectively creating an AMI image. This is a long command and cannot be changed once created, so it deserves some explanation

-n: user-provided unique name
-a: architecture, which must match the 64-bit kernel
-d: description, a text description
--root-device-name: this maps to the “root” parameter given to the PVGRUB bootloader
-b: block mapping. There are two here: one pointing to the root volume snapshot we just created, and one on non-permanent storage we’ll use for swap
--kernel: this kernel is actually a stub kernel running PVGRUB, a bootloader that loads the UEK2 kernel from the root drive. The kernel ID used here is a generic Amazon PVGRUB kernel for 64-bit unpartitioned images in the US-East region.

[marc@shakybox2 tmp]$ ec2-register -n UEK2-AMI -a x86_64 -d "AMI using the Oracle Linux UEK2 kernel" --root-device-name /dev/sda -b /dev/sda=snap-b94519c3 -b /dev/sdb=ephemeral0 --kernel aki-427d952b
IMAGE   ami-c39f41aa

- Now for the moment of truth: launch a VM based on the newly-created AMI

[marc@shakybox2 tmp]$ ec2-run-instances -k marc-aws -n 1 -t m1.medium  ami-c39f41aa
RESERVATION     r-19b0167a      462281317311    default
INSTANCE        i-5688ab32      ami-c39f41aa                    pending marc-aws        0             m1.medium        2012-03-25T00:08:10+0000        us-east-1d      aki-427d952b                    monitoring-disabled                                    ebs                                     paravirtual   xen              sg-5fc61437     default
[marc@shakybox2 tmp]$ ec2-describe-instances i-5688ab32
RESERVATION     r-19b0167a      462281317311    default
INSTANCE        i-5688ab32      ami-c39f41aa       ip-10-62-98-125.ec2.internal   running marc-aws        0               m1.medium       2012-03-25T02:08:10+0000       us-east-1d      aki-427d952b                    monitoring-disabled                   ebs                                     paravirtual     xen             sg-5fc61437    default
BLOCKDEVICE     /dev/sda        vol-d59aa2b9    2012-03-25T00:08:28.000Z        true
[marc@shakybox2 tmp]$ ec2-create-tags --tag="Name=UEK2 Test Instance" i-5688ab32
TAG     instance        i-5688ab32      Name    UEK2 Test Instance

ssh’ing into the machine, we can confirm it’s running the UEK:

[root@ip-10-62-98-125 ~]# uname -a
Linux ip-10-62-98-125 2.6.39-100.5.1.el6uek.x86_64 #1 SMP Tue Mar 6 20:26:00 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

Adding swap

Oracle’s pre-installation steps require swap space. Since Amazon EBS charges for storage by the GB, it makes little sense to pay for persistent storage for swap; the alternative is to use transient storage. Since at boot time we can’t be guaranteed of its state, it’s safest to zero it out and create swap at that point. We set aside some space on /dev/sdb (which maps to /dev/xvdb, since the Oracle UEK kernel doesn’t do the drive mapping that the Amazon kernel does).

We’ll create a startup script to run in rc.local, the last point in the startup. It will take a while to run, but since sshd and other system services will already be running, it shouldn’t slow down the startup of any other processes.

cat > /usr/local/sbin/ <<EOF
VOL=\$1
SIZE=\$2
dd if=/dev/zero of=\$VOL bs=\${SIZE}k count=1 && mkswap \$VOL \$SIZE && swapon \$VOL
EOF
chmod +x /usr/local/sbin/
echo "/usr/local/sbin/ /dev/xvdb 2097152" >> /etc/rc.d/rc.local


There aren’t a whole lot of troubleshooting tools in EC2, especially compared to something like Oracle databases. There is one invaluable tool to debug AMI builds though: the console output. It usually takes several minutes to appear, but it can help determine what went wrong when an instance is inaccessible.

[marc@shakybox2 tmp]$ ec2-get-console-output i-76634012

It was particularly frustrating to get GRUB to find the root devices; when it can’t find them it just displays a grubdom> prompt with no error message. The official documentation recommends a --root-device of /dev/sda1 (rather than /dev/sda) and the hd00 kernel aki-4e7d9527, but I couldn’t get these to work. It might be because there is no partition table on the root disk, but without access to the interactive grub interface or more diagnostic output I can’t know for sure.


Amazon docs for user-specified kernels
Jonathan Hui’s blog post about creating CentOS images from scratch
Amazon docs on how to launch images from snapshots
Wim Coekaerts’ blog post on using

Coming up: the actual Oracle database software install

November 15, 2011

» Linux Patching and Oracle: Detect RPM conflicts before they happen

A common scenario in the life of a DBA on a linux server looks something like this:

From: Sue-the-sysadmin
To: the-dba
Date: 2011-10-28 17:54:34

Dear DBA,

We will be patching the linux systems this Friday night, October 28th at 21:00. The list of patches is attached. Let us know if there are any conflicts with Oracle. BTW, we need to know by 14:00 tomorrow.

Sue

Yikes! The SysAdmin wants to apply these patches this week, and needs to know if there are any conflicts this afternoon.

So you open up the list of patches. When you see the lengthy list of patches you suddenly realize the long lunch you had planned with your sweetheart is going to be rescheduled.

I recently received several lists of patches that were to be applied to a number of different linux servers, all running Oracle, which led to this blog entry.

Matching Oracle shared lib dependencies to RPM files

Being faced with the task of determining which, if any, components of Oracle might be affected by the list of RPMs to be upgraded across several servers is daunting. One quick method is to look for glibc in the list of RPMs. The list I received contained the following:

  • glibc-2.5-58.el5_6.4.i686
  • glibc-2.5-58.el5_6.4.x86_64
  • glibc-common-2.5-58.el5_6.4.x86_64
  • glibc-devel-2.5-58.el5_6.4.i386
  • glibc-devel-2.5-58.el5_6.4.x86_64
  • glibc-headers-2.5-58.el5_6.4.x86_64

Since nearly everything is affected by an update to glibc, it would be a good idea to test these new RPMs on a test server before deploying them in production.

What if however you need a more exacting list of which Oracle components are affected by which packages?

This information is a little harder to come by.  While Oracle does supply a list of RPMs that must be installed on a Linux system, that information does not help you in determining which Oracle files are dependent on the RPMs in question.

If the list of RPMs to be updated is small, it may be that few, if any, of the Oracle files on a server are affected.

Here’s how that can be determined.  This may not be complete, but I think it’s a pretty good start.

Find the dependency files

Stepping through this manually, we can check an oracle executable file for the list of other files it is dependent on with ldd, which will display shared library dependencies.

[oracle@oravm01 patch_check]$ ldd $ORACLE_HOME/bin/oracle
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54b700000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54b802000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54b958000)
        => /lib64/ (0x0000003ed8c00000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54bb2c000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54bef4000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c00d000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c130000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c66e000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c771000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c90b000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54caa5000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54cbb1000)
        => /usr/lib64/ (0x00002ab54cd33000)
        => /lib64/ (0x0000003ed8000000)
        => /lib64/ (0x0000003ed7c00000)
        => /lib64/ (0x0000003ed8400000)
        => /lib64/ (0x0000003edb400000)
        => /lib64/ (0x0000003ed7800000)
        /lib64/ (0x0000003ed7400000)

Here we can see that the oracle binary is dependent on several library files.  Many of them are owned by oracle, but some of them are files that are part of the Linux system.

Find the RPM

The next step is to find which RPM the files are part of.  We can use the rpm command for this.

The -q option is for ‘query’, while the ‘-f’ option causes rpm to report which package a file is associated with.

Choosing one of the linux owned files as an example:

[oracle@oravm01 patch_check]$ rpm -qf /lib64/
glibc-2.5-49

What happens if rpm -qf is run against one of the oracle owned dependency shared library files?

[oracle@oravm01 patch_check]$ rpm -qf /u01/app/oracle/product/11.2.0/vmdb01/lib/
file /u01/app/oracle/product/11.2.0/vmdb01/lib/ is not owned by any package

As you might expect, rpm has no knowledge of oracle owned files, and reports that fact.

Does the patch affect Oracle?

The final step is to compare the current package with the patch list.  In this case there is a match – glibc-2.5-49 is to be replaced with glibc-2.5-58.el5_6.4.

Now all that you need to do is repeat this process for all oracle binary files.  This obviously does not work well as a manual process.
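The manual ldd-then-rpm loop can be sketched in shell; the comparison step by package name might look like this (the pkg_name helper and the sample labels are illustrative, not part of the actual scripts):

```shell
#!/bin/sh
# Hypothetical sketch of the comparison step: given the package that owns a
# library (rpm -qf output) and the patch list, flag a conflict when the patch
# list contains a newer build of the same package name.
# "installed" and the patch list contents below are illustrative values,
# not taken from a live system.

pkg_name() {
    # strip "-version-release[.arch]" from an RPM label:
    # delete everything from the first "-" that is followed by a digit
    printf '%s\n' "$1" | sed 's/-[0-9].*$//'
}

installed="glibc-2.5-49"        # e.g. output of: rpm -qf /lib64/...
patch_list="glibc-2.5-58.el5_6.4.x86_64
glibc-common-2.5-58.el5_6.4.x86_64
gdbm-1.8.0-26.2.1.el5_6.1.x86_64"

name=$(pkg_name "$installed")
match=$(printf '%s\n' "$patch_list" | grep "^${name}-[0-9]")
[ -n "$match" ] && echo "CONFLICT: $installed -> $match"
```

The `[0-9]` anchor after the name keeps glibc from matching glibc-common, which is a different package.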

Fortunately this is not too difficult a task to automate.   Those readers already familiar with me will not be surprised to learn that this automation involves the use of Perl.

The Patch Check Script

Or rather, scripts.  The Perl script is complemented by a driver shell script.  The reason for using a driver shell script is that there may be more than one Oracle home directory on the server that you need to check.

One of the few limitations of using Perl with Oracle involves the LD_LIBRARY_PATH variable.   This variable cannot be set from within a Perl script, as the value of LD_LIBRARY_PATH is read before your script runs, so subsequent attempts to change it are futile.  There are some hacks that will work on some platforms to allow LD_LIBRARY_PATH to be changed at runtime, but I would rather not go that route.  The limitations of LD_LIBRARY_PATH are pretty well known, so it is best just to work around that particular issue.
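The workaround is trivial to demonstrate (the path below is made up): export the variable in the parent shell before the perl process starts, and every child process inherits it.

```shell
#!/bin/sh
# Export LD_LIBRARY_PATH in the shell *before* starting the script;
# any child process then inherits it.  The path here is illustrative only.
export LD_LIBRARY_PATH=/u01/app/oracle/product/11.2.0/vmdb01/lib

# a child process sees the value set by its parent:
val=$(sh -c 'echo "$LD_LIBRARY_PATH"')
echo "$val"
```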

The RPM patch list

The list of RPMs should be in a file, one RPM per line. The file names should not contain the .rpm extension. If the extension is included, you will need to remove it; either that, or change the script to work with the .rpm extension.

Why does the list of RPMs not include the .rpm extension?  Simple, that’s how I received the list.
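If your list does arrive with the extension, stripping it is a one-liner (a sketch; the label is an example):

```shell
#!/bin/sh
# strip a trailing .rpm extension from an RPM label
stripped=$(printf '%s\n' "glibc-2.5-58.el5_6.4.x86_64.rpm" | sed 's/\.rpm$//')
echo "$stripped"
```

Run over a whole file it would be `sed 's/\.rpm$//' raw_list.txt > patch_list.txt`.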

Here is a sample of files from a list of RPMs to be updated:

[oracle@oravm01 patch_check]$ head -5 patch_list.txt

Which Files are searched?

Not all files in an Oracle home are processed.  The find command is used with the -perm  /u+x option to find all files that have the executable bit set for the file owner.

These files are then processed to determine RPM dependencies.
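The selection step can be illustrated like this (the directory tree is a stand-in for a real Oracle home): GNU find’s `-perm /u+x` selects files whose owner-execute bit is set.

```shell
#!/bin/sh
# Illustration of the file-selection step: GNU find with -perm /u+x
# selects files whose owner-execute bit is set.  The tree built here
# is a throwaway stand-in for a real Oracle home.
OH=$(mktemp -d)
mkdir -p "$OH/bin" "$OH/lib"
touch "$OH/bin/oracle" "$OH/lib/readme.txt"
chmod u+x "$OH/bin/oracle"      # owner-executable: will be selected
chmod u-x "$OH/lib/readme.txt"  # not executable: skipped

found=$(find "$OH" -type f -perm /u+x)
echo "$found"
rm -rf "$OH"
```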

The Perl script

Let’s invoke the Perl script to see what it does when run against a single Oracle home, with the standard settings for LD_LIBRARY_PATH and the other oracle environment variables.

Here’s the current environment:

[oracle@oravm01 patch_check]$ set | grep -E 'ORA|LD_LIB'

Now run the script. Notice that STDERR has been redirected to a file – more on that later

[oracle@oravm01 patch_check]$ ./ -verbosity 0 -oracle_home $ORACLE_HOME -linux_patch_list patch_list.txt 2> pc.err

#### Possible Conflicts Found #####
--- gdbm-1.8.0-26.2.1.el5_6.1.x86_64
old: 1.8.0 26.2
new: 1.8.0 26.2.1.el5_6.1
--- glibc-2.5-58.el5_6.4.x86_64
old: 2.5 49
new: 2.5 58.el5_6.4
--- libgcc-4.1.2-50.el5.x86_64
old: 4.1.2 48
new: 4.1.2 50.el5
--- libstdc++-4.1.2-50.el5.x86_64
old: 4.1.2 48
new: 4.1.2 50.el5

The script has found 4 RPM packages that are scheduled to be updated and that Oracle is dependent on.
This new information may cause the patch schedule to be pushed back until it is determined that these updates will have no adverse effects on Oracle.

Now let’s consider the file pc.err, to which all the error messages were redirected.

We can see there are 336 lines in the error file.

[oracle@oravm01 patch_check]$ wc -l pc.err
336 pc.err

The majority of the errors are from rpm -qf – as seen earlier, the RPM database knows nothing of files owned by Oracle.   As such, these are errors we are not really concerned with.  There may, however, be errors hiding in the file that we do care about.

[oracle@oravm01 patch_check]$ head -7 pc.err
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found

Following is another error  – this one is not an rpm error, so it may be something to be concerned about.

##### ldd error ############
error encountered with /usr/bin/ldd /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/
run the command manually to see the error

Running the command manually results in a segmentation fault.

[oracle@oravm01 patch_check]$ ldd /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/
/usr/bin/ldd: line 116:  5216 Segmentation fault      LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW= LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$@"

A quick google search on ‘ldd segmentation fault’ suggests this is a linux bug. We will not be investigating this further, as we aren’t too concerned with that at this time.

With the large number of errors that we do not really care about being reported by the rpm -qf command, how can we reduce the number of errors?

The shell script

The script serves two purposes.  It reads the file oracle_homes.txt and runs the Perl script once for each ORACLE_HOME.  The script also sets the LD_LIBRARY_PATH variable to eliminate many of the false positives seen in pc.err previously.  Following is the entire script.  You may want to adjust the values used for LD_LIBRARY_PATH to suit your environment.


#!/bin/sh
# set the environment to call (patch check) once per Oracle home

OHOMES_FILE=oracle_homes.txt
PATCH_LIST_FILE=patch_list.txt

[ -r "$OHOMES_FILE" ] || {
	echo The file $OHOMES_FILE is missing
	exit 1
}

while read ohome
do
	sid=$(basename $ohome)
	LOGFILE=patch_check_${sid}.log
	ERRFILE=patch_check_${sid}.err

	echo $ohome

	# you may have to experiment a bit to get the needed paths in LD_LIBRARY_PATH
	# in 11g there will probably still be a few files that need to be checked manually
	export LD_LIBRARY_PATH=$ohome/lib:$ohome/jdk/jre/lib/amd64:$ohome/jdk/jre/lib/amd64/server

	./ -verbosity 0 -oracle_home $ohome -linux_patch_list $PATCH_LIST_FILE >$LOGFILE 2>$ERRFILE
done < "$OHOMES_FILE"
Let’s run the script and then check the error logs.

[oracle@oravm01 patch_check]$ ./

Checking on the output files shows that the error logs are much smaller than what was seen previously.

[oracle@oravm01 patch_check]$ ls -ltar *.log *.err
-rw-r--r-- 1 oracle oinstall   455 Nov 14 13:18 patch_check_grid.err
-rw-r--r-- 1 oracle oinstall 34384 Nov 14 13:18 patch_check_grid.log
-rw-r--r-- 1 oracle oinstall   296 Nov 14 13:18 patch_check_vmdb01.err
-rw-r--r-- 1 oracle oinstall 34384 Nov 14 13:18 patch_check_vmdb01.log

[oracle@oravm01 patch_check]$ wc -l patch*.err
9 patch_check_grid.err
6 patch_check_vmdb01.err
15 total

By including some more library directories in LD_LIBRARY_PATH, rpm -qf was able to resolve many more dependencies, greatly reducing the false positives that were included in the error files.   Here are the total contents of the error files:

[oracle@oravm01 patch_check]$ cat *.err
Use of uninitialized value in hash element at ./ line 393.
Use of uninitialized value in hash element at ./ line 393.
working on /u01/app/oracle/product/11.2.0/grid/jdk/jre/lib/amd64/ dependencies => not found
working on /u01/app/oracle/product/11.2.0/grid/jdk/jre/lib/amd64/ dependencies => not found
rpmq: no arguments given for query
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found

The ‘uninitialized value’ errors are from Perl – this suggested a bug in the script. Looking at the script I can see that while there is indeed a bug, it is not causing any incorrect reporting.

[oracle@oravm01 patch_check]$ grep -n . | grep -3 393
389:                            ($dummy1,$dummy2,$file) = split(/\s+/,$line);
390:                    }
392:                    #print "      ###### FILE: $file\n";
393:                    push @{$self->{olibDeps}->{$file}{dependents}}, $exe;
394:            }
395:    }
397:    # all library dependencies now in %olibDepend

Line 393 should be executed only if there is a value defined for $file.

Changing it as follows fixed the bug:

push @{$self->{olibDeps}->{$file}{dependents}}, $exe if $file;

The small number of lines now found in the .err files makes it simple to determine if these errors are an issue to be concerned with.

Download the scripts

The scripts can be downloaded here – Patch Check Distribution

The zip file contains a tarball patch_check_dist.tgz.  Use tar or your favorite unarchiving utility to extract the scripts.

June 13, 2011

» Upgrade to MySQL 5.1.56 on Bacula server using 5.0.x and MyISAM tables

Hello there, it’s me again, with another blog about a DBA situation that a typical Linux Administrator may find themselves in.

In this blog, I’m going to review a recent MySQL upgrade I have done on one of the systems I am involved in administering. This is a real-world example of an upgrade project, and hopefully when we’re done there may even be an overall performance boost.

There are several reasons to perform upgrades (of any kind), for me an important one is to keep current for security and bug fixes, but general performance improvements and new features are always welcome as well.

This system is running Bacula, an open source enterprise backup system. In this particular case Bacula is configured to store data in a MySQL database. The data stored include status reports, backup content lists, lists of all the files on all the systems, schedules and other related information. While everything has been operating “generally fine” for several years, this also means several components of the system are quite out of date.

The system is using CentOS 5.3, and the versions of MySQL and Bacula that shipped with it (5.0.45 and 2.2.8 respectively), which are both quite old by today’s standards. Officially MySQL 5.0.x is now completely out of support by Oracle/, so upgrading is quite important.

The OS definitely could/should be updated to CentOS 5.6 (current at the time of this writing), but that actually wouldn’t make a significant dent in the version of MySQL (or likely Bacula) itself. The OS upgrade and the Bacula upgrade discussion I’ll leave to the appropriate System Administration team; I’ll just be reviewing what has been done with MySQL itself.

As the main application (Bacula) will not be upgraded – this means that any new features in MySQL 5.1.56 won’t really be used unless it’s something I can configure in the my.cnf file.

Initially my plan was to use the remi yum archives ( which I have seen used in several situations. The MySQL RPMs in these yum repositories generally replace the RPMs that are shipped in RHEL/CentOS, but are current GA versions. For non-DBAs, I find these archives a really good way of moving to new releases without a lot of effort outside the regular system upgrading activities.

Unfortunately the server in question doesn’t have direct external Internet access. It only has access to some internal repositories, and access to/from other internal systems. I figured that if I’m going to have to copy RPM files manually anyway, I will use the official Oracle/ packages, which are generally the preferred method (speaking as a DBA).

The overall plan was/is the following:

1) Schedule outage with all monitoring tools, and notify the appropriate people of a pending outage to the backup system
2) Confirm current backup is in place for the existing database and system files
3) Copy packages downloaded from to a location accessible to the system in question (in my case an NFS mounted home directory)
4) Stop mysql and bacula-director

/etc/init.d/mysql stop
/etc/init.d/bacula-dir stop

5) remove mysql and mysql-server – ignoring the complaint about libmysqlclient dependencies

rpm -e --nodeps mysql mysql-server

6) install the packages downloaded from

rpm -Uvh MySQL-server-community-5.1.56-1.rhel5 MySQL-shared-compat-5.1.56-1.rhel5 MySQL-client-community-5.1.56-1.rhel5 MySQL-devel-community-5.1.56-1.rhel5

7) upgrade the mysql database access controls (using mysql_upgrade)

/usr/sbin/mysqld --user=mysql --skip-grant &
killall mysqld

8) restart mysql to confirm it starts up fine with no config changes.

/etc/init.d/mysqld start
tail -f /var/log/mysqld.log

9) update my.cnf to enable some new features (innodb plugin)

vi /etc/my.cnf

and add the following lines to the [mysqld] section


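For reference, on MySQL 5.1 the InnoDB plugin is typically enabled with lines like these in the [mysqld] section (a sketch based on the MySQL 5.1 documentation, not necessarily the exact settings used here):

```ini
# disable the older built-in InnoDB and load the InnoDB plugin instead
ignore-builtin-innodb
```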
10) restart mysql and confirm it has now loaded the innodb plugin, and is still operating fine.

/etc/init.d/mysql restart
tail /var/log/mysqld.log -n 20

you should see something like this:

InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3
110505 16:38:33 InnoDB: Initializing buffer pool, size = 768.0M
110505 16:38:33 InnoDB: Completed initialization of buffer pool
110505 16:38:33 InnoDB: highest supported file format is Barracuda.
110505 16:38:33 InnoDB Plugin 1.0.15 started; log sequence number 18094312675
110505 16:38:33 [Note] Event Scheduler: Loaded 0 events
110505 16:38:33 [Note] /usr/sbin/mysqld: ready for connections.
Version: ‘5.1.56-community-log’ socket: ‘/var/lib/mysql/mysql.sock’ port: 3306 MySQL Community Server (GPL)

11) Finally, restart the Bacula director

/etc/init.d/bacula-dir start

Other than a stray config value present in the old my.cnf file that was now being interpreted differently (basedir) this part of the upgrade went smoothly.

When we reached this point I confirmed with the SysAdmin responsible for backups that Bacula was still able to perform its usual duties, and that the old version of Bacula seemed to be fine with the new version of MySQL.

We paused at this stage for about a week, so the system could do a full run through of the weekly backups, and so we could see how well the system performs and if there is any real difference (so far).

Very little speed difference was expected at this point, because the only new features enabled apply to the InnoDB storage engine, which was not actually in use yet: all the Bacula tables were still using the MyISAM storage engine, the default engine in older versions of MySQL such as 5.0. The next major stage of the upgrade, then, was to convert all the Bacula tables from MyISAM to InnoDB, for the reasons I’ll mention below.

I found several references on the web to Bacula supporting the InnoDB storage engine perfectly fine, so I’m not concerned about differences between the storage engines affecting actual functionality.

There are plenty of differences between InnoDB and MyISAM, but InnoDB has one big advantage over MyISAM that should affect Bacula – the locking method. MyISAM uses table locking, but InnoDB will do row level locking.

This means if you have parallel inserts and updates all going into the same table (like if you’re doing backups from multiple systems at the same time), MyISAM performing global locks could cause some problems – and indeed it has.

In this specific situation, we have seen locked threads on this database during the weekly backups (Sundays at 4:00am), which suggests we could see a bit of a performance improvement, and far fewer locked threads.

Ok, so now I needed to change all the databases to InnoDB format. By far the easiest way to do this is to run a series of “ALTER TABLE {table} engine=innodb;” statements. As this system has definite times of the day when there is no activity, this is a safe option.

The main drawback of using the ALTER TABLE method to do the conversion is the disk space required. During the conversion, the old table and the new table will both exist at the same time.

The entire database is about 7GB in size and there is currently about 7GB of free space on the file system being used by MySQL. Normally this would be fine if there was a number of medium sized tables as only one table is duplicated at a time. Unfortunately this is not the case with Bacula – it has many small tables, and one really large table called ‘File’. This means converting ‘File’ may be an issue.

At this point I’m attempting to do the remainder of the upgrade without the help of the SysAdmin which means I no longer have access to ‘sudo’ or ‘root’ access. This means several of the typical options for distributing disk space (like soft linking db files to other file systems, etc..) are not really available to me.

I did eventually come up with a solution that worked due to the way Bacula operates and clears old data – I used ‘OPTIMIZE TABLE’. This actually cleared up a fairly significant amount of the disk space being used leaving me with 9GB free which is more than enough to do the conversion.

mysql> optimize table File;

Once the disk space issue was covered, I needed to generate the list of alter statements I want to execute for changing the engine. The following command was used to generate the list:

bash# mysql -uavail -p bacula -s -e 'select "alter table ",table_name," engine=innodb;" from information_schema.tables where tables.table_schema="bacula"'

The command generated this output which was then pasted into the mysql console:

alter table BaseFiles engine=innodb;
alter table CDImages engine=innodb;
alter table Client engine=innodb;
alter table Counters engine=innodb;
alter table Device engine=innodb;
alter table File engine=innodb;
alter table FileSet engine=innodb;
alter table Filename engine=innodb;
alter table Job engine=innodb;
alter table JobMedia engine=innodb;
alter table Location engine=innodb;
alter table LocationLog engine=innodb;
alter table Log engine=innodb;
alter table Media engine=innodb;
alter table MediaType engine=innodb;
alter table Path engine=innodb;
alter table Pool engine=innodb;
alter table Status engine=innodb;
alter table Storage engine=innodb;
alter table UnsavedFiles engine=innodb;
alter table Version engine=innodb;

I didn’t accurately measure the total time it took, but I would guess it took about an hour to convert everything. Most tables took less than a minute, except the ‘Filename’ table which took about 5 minutes, and the ‘File’ table which took all the rest.

We’ve now completed the major milestone for this upgrade – Bacula is now using the InnoDB storage plugin for all of its tables with the newer version of MySQL.

As suspected while working on this project, there was not a huge performance difference for the overall time it takes to perform the weekly backups, although there still is an improvement. The backups with a large number of files (+300,000) now complete faster than they previously did – approximately 20 minutes faster for a 7hr backup. This was likely purely due to the improvements in locking.

May 20, 2011

» Using KateSQL to connect to an Oracle database in Kate

Among the features announced with the release of version 4.6 of the KDE Software Compilation is KateSQL, a SQL Query plugin for the Kate text editor providing the basic functionality of a SQL client. It leverages the Qt SQL module, allowing you to make a connection to most types of databases. Out of the box it includes support for MySQL and SQLite, and with this post I’ll show you how to get it to connect to Oracle databases as well.

Since I am an Ubuntu user (Note: you’ll need to be on 11.04 and either be running Kubuntu or have the ‘kate’ package installed) these instructions will be specific to that operating system, though they will likely also apply to other Debian-based distributions with minimal modification. However, if you are using an RPM-based distribution such as openSUSE or Fedora, you’ll need to adapt the instructions somewhat to make sure you have the necessary development packages installed.

The first thing we’ll need to do in order to get this to work is to obtain an Oracle client. This can be downloaded from the Oracle Technology Network and requires that you have a valid Single Sign-On account. For the purposes of this article we’re going to use the Linux x86 Oracle Instant Client and the files we’re going to download are:


The first package is the client itself, providing the files required to run OCI applications. The second is the Instant Client SDK which includes header files for developing Oracle applications with Instant Client: in our case an OCI library for the Qt SQL module.

Users of RPM-based distributions can just go ahead and install these packages while others will need to convert them to .deb format first. The best way to do this that I am aware of is with the tool called Alien. Let’s install it, convert the Oracle Instant Client packages, and install them too.

cd /path/to/oracle/instant/client/files
sudo apt-get install alien
sudo alien oracle-instantclient11.2-basic-
sudo alien oracle-instantclient11.2-devel-
sudo dpkg -i oracle-instantclient11.2-basic_11. oracle-instantclient11.2-devel_11.

The next step is to install all of the dependencies necessary to build the Qt SQL OCI library.

sudo apt-get build-dep qt4-x11
sudo apt-get install qt4-qmake libqt4-dev libaio1

Then we will need the actual source code for the qt4-x11 package. This can be obtained with the following command:

apt-get source qt4-x11

To recap, at this point we’ve downloaded and installed the Oracle Instant Client and all of the dependencies required to build the qt4-x11 package, and we’ve downloaded the qt4-x11 source code. Next we’re going to build the Qt SQL OCI library itself.

sudo ln -s /path/to/qt4-x11-4.7.0/src /usr/include/src
sudo cp -r /path/to/qt4-x11-4.7.0/include/QtSql/private /usr/include/qt4/QtSql/
cd /usr/include/src/plugins/sqldrivers/oci/
qmake "INCLUDEPATH+=/usr/include/oracle/11.2/client64" "LIBS+=-L/usr/lib/oracle/11.2/client64/lib -Wl,-rpath,/usr/lib/oracle/11.2/client64/lib -lclntsh -lnnz11"
sudo make install

Now that the module has been built and installed, Kate should be able to connect to an Oracle database. Launch Kate (or close and re-open it if it was already running), click Settings -> Configure Kate, expand the Application tree if necessary, click Plugins, check ‘SQL Plugin‘ and click OK. Then click SQL -> Add connection to launch the connection wizard. In addition to the MySQL and SQLite database drivers, you should now also have drivers named QOCI and QOCI8 available. Choose the appropriate one, enter relevant connection details, and complete the connection wizard. You can now execute queries against the database and display result data.

September 30, 2010

» Linux, Windows, virtual machines and you — DIY VPN jump box

Being in the remote administration business is a strange beast and offers lots of challenges, but when you are working for multiple clients sometimes connecting to the servers can be challenging enough. Here’s a little idea that I had this morning that may save someone some grief, so I thought I would jot it down for all to see.

One of the issues I have connecting to some clients revolves around my linux desktop. Sure I can connect to many VPN devices using vpnc or other tools, but in some cases client policy prohibits such reasonable behaviour due to a) single vendor plugins; b) bad java or plugin issues; c) host checking software or; d) Antivirus requirements that do not recognize linux agents. My problem is that I do not want to administer Unix servers from a Windows system, it’s just … wrong. Like, fundamentally wrong. (on cue, someone I know is calling me a zealot)

Conventional wisdom would dictate that I need to run a Windows VM, open the VPN client, and then connect using Putty or similar. Sure this works. It works fine. Except that some VMs do not interact with the clipboard properly, and I prefer to connect from my linux console because that’s where all my tools are. When I am documenting tickets I rely heavily on being able to seamlessly get things from my console to my documents without having the third party involved. I want to be able to work on all systems the same way, because it improves efficiency. That and the fact that I would rather use less Windows than more.

The solution I came up with is simplistic, but allows me to use the Windows VPN client, and bypass Windows for everything else. Basically a DIY VPN jump box. All you need is a Windows VM, and Cygwin. It’s just crazy enough to work, and does!

Here’s what you do.


  • Start your Windows VM. (Don’t have one already? Check out VirtualBox. Installing it is out of scope.)
  • Log in with an administrator account if you are running Windows XP; Win7 and Vista users can log in as a normal user provided you have rights to run things as administrator.
  • Verify the IP of your VM. This means you will need to use bridged networking; none of this will work with NAT-type networking, so caveat emptor.

Cygwin install

  1. Grab the cygwin installer from the Cygwin site. You should also check out the license if you are into that sort of thing.
  2. Run the installer; Win7 and Vista users should right click on the program and “Run as administrator”. Documentation on the installer can be found in the network setup help section. You should be able to take the defaults for most of it. It does an annoying thing where it asks you for a location to download the packages, and it defaults to c:\Program Files\Mozilla; I changed it to my Download folder.
  3. When you get to the “Choose a download site” (aka mirror) list choose, one that is close to you. If you have no preference then any of the mirrors should work, but speed may vary depending on geographic proximity or network link speed at the mirror.
  4. The next screen is the package selection screen. It looks daunting, but here what you need to get this working.
    • In the search box type openssh. This will narrow down the package selection to one group, called Net.
    • Click the + next to the word Net to expand the group.
    • In the New column there should be a Skip with a little circley-arrowy icon next to it. Click on the circley-arrowy icon, which will replace one of the n/a columns with a selected box (in the bin column). Now sshd will be selected for install.
    • If you need any other special packages, like telnet, you can search for them here. Incidentally, to get telnet you need to search for inetutils.
  5. Now follow the defaults and wait for Cygwin to complete the install. This could take a while.

Post Cygwin install

There are a few steps you’ll need to do manually now to get the ssh daemon running.

  1. If you do not have a password for your Windows user then set one up now (and have someone swat you on the nose with a rolled up newspaper. Bad SA. Bad.). ssh needs a password to work.
  2. Start a cygwin bash session either by using the desktop icon if you chose to create one, or using the link under Start menu->Programs->Cygwin. Win7 and Vista users, right click on the console icon and select ‘Run As Administrator’ the first time.
  3. In the console, run the following command to set up the ssh host keys and whatnot. This could take a while as well, so don’t get discouraged and ^C in the middle of it like I did. This process also sets the service to start on boot, if you do not want it to start automatically you will need to disable it manually.
    ssh-host-config -y
  4. When you get back to a prompt, follow it up with this, which turns on the service immediately. Net savings: 10+ clicks!
    cygrunsrv -S sshd
  5. If you are running Win7, Vista or any sort of firewall program you will need to allow port 22, or program C:\Cygwin\usr\sbin\sshd.exe.

Reap the benefits

  1. Connect to your Windows VM desktop.
  2. Start the VPN client and connect to the VPN.
  3. ssh into your Windows VM using the IP you found in the pre-flight check.
  4. Now ssh or telnet into the system on the client end of the VPN tunnel.
  5. You could even ssh tunnel through the VM for GUI jumps or web access.

Hopefully someone will find this useful other than me. It’s so simple I really don’t know why I did not think of it before, but I think it’s probably because I only have one or two VPNs with soft clients. Some people have many more. I plan to play with it a bit to see how low I can set the resource allocation to the VM. I think I can probably cut a Win7 VM down to 256MB with the right combination of settings and still have good results because in this case I only care about network. I’ll let you know how it turns out.

June 8, 2010

» Wherever I May Roam

Roamer, wanderer Nomad, vagabond Call me what you will $ENV{LC_ALL} = "anywhere"; my $time = localtime; say {$anywhere} my $mind; local *anywhere = sub { ... }; Anywhere I roam Where I 'git ghclone environment' is $HOME # 'grep may_roam($_) => @everywhere', # with apologies to Metallica Laziness and a severe addiction to yak shaving [...]

April 23, 2010

» Blogrotate #24: The Weekly Roundup of News for System Administrators

Good afternoon and welcome to another edition of Blogrotate. Though I have been contributing to Blogrotate since its inception, this is the first time I have had the honour of posting it myself. Go me!

Operating Systems

Red Hat has announced the availability of a public beta for Red Hat Enterprise Linux 6 (RHEL 6). There are a number of changes, for which Dave Courbanou at The VAR Guy does a pretty good job of providing an overview. Of note are that Red Hat has completed its migration from Xen to KVM as the supported virtualization technology (which began with RHEL 5.4), and that ext4 is now the default filesystem.

There have been a couple of tidbits of news in the Ubuntu world. The first being a bug with memory leakage in affecting beta 2 of Ubuntu 10.04. The discussion on Slashdot became a debate on the merits of time vs scope-based release schedules. Per the bug report, a fix has since been committed, which is good because — and this leads into the second bit of news — Ubuntu has announced the availability of the release candidate for 10.04. Things are moving fast as we approach its release next Thursday.

And for something that’s not release announcement related, M. Tim Jones has an interesting article over at IBM’s developerWorks about Kernel Shared Memory in the Linux 2.6.32 kernel. Without going into a lot of detail (I’ll let him do that), it’s basically the implementation of a daemon to handle de-duplication of memory pages. This has obvious implications in a virtualization environment as there is the potential to run more virtual machines on a host without increasing the memory footprint.


The big news on this front was that McAfee pushed out a virus definition update that falsely identified svchost.exe as a threat, resulting in Windows automatically rebooting. Peter Bright from Ars Technica has some good coverage of this, and linked to McAfee's official solution. Meanwhile, Dave Courbanou over at The VAR Guy has a follow-up on the situation with some additional detail, and Barry McPherson from McAfee has posted an official response stating that a 'small percentage' of enterprise accounts were affected. And finally, Ben Grubb of ZDNet Australia reports that Coles had 10 percent of its point-of-sale terminals affected and shut down stores in WA and South Australia as a result.


Oracle has decided to charge for an ODF plugin for MS Office which allows users to import/export documents in Open Document Format. Matt Asay, COO at Canonical, provides some commentary on this, stating that "$9,000 is the new 'free' for Oracle".

Jono Bacon, Canonical’s Community Manager, wrote that Canonical has made the single sign-on component of Launchpad available as open source under the AGPL3 license. There is some coverage from The H on this as well. Launchpad itself was released under the AGPL3 license about a year ago.


On a final (interesting) note, 'Cyber Cynic' Steven J. Vaughan-Nichols writes in "HP and Likewise to release Linux-based storage line" about HP and Likewise partnering on a line of StorageWorks products that will make use of the Likewise CIFS stack to support Active Directory authentication.

Well, that’s all I have time for this week. Will Brad be back at the helm next week, or will I continue my reign? You’ll just have to wait and see…

April 21, 2010

» DBD::Oracle and Windows 64bit

I successfully compiled and installed DBD::Oracle today on the Windows Server 2008 64-bit operating system.

I used the latest version of DBD::Oracle (1.24) and the 64-bit Windows version of Oracle's
Instant Client Package – Basic, along with the Instant Client Package – SQL*Plus and, finally, the Instant Client Package – SDK.

To get it to make and compile correctly, I had to download Microsoft's Visual Studio Ultimate, which should contain all the files you need. It is rather portly at 2+ GB, so you might want to grab lunch while you are downloading it.

After all of the above downloading, DBD::Oracle installed right out of the box.

All one has to do is select 'Start Menu->All Programs->Microsoft Visual Studio 2010->Visual Studio Tools->Visual Studio x64 Win64 Command Prompt (2010)',
which will open a good old 'DOS' window.

At this point, CD to the directory where you downloaded DBD::Oracle, then set your ORACLE_HOME to the Instant Client directory:

     c:\DBD-Oracle set ORACLE_HOME=c:\IC_11

You should also set your NLS_LANG, like this:

     c:\DBD-Oracle set NLS_LANG=.WE8ISO8859P15

Once the above settings are done, do a

     c:\DBD-Oracle perl Makefile.PL

and then a

     c:\DBD-Oracle nmake install

This will produce a whole lot of warnings (you can ignore these, as they do not seem to affect DBD::Oracle at all), and near the end it should output something like this:

     Generating code
     Finished generating code
     if exist blib\arch\auto\DBD\Oracle\Oracle.dll.manifest mt -nologo -manifest blib\arch\auto\DBD\Oracle\Oracle.dll.manifest -outputresource:blib\arch\auto
     if exist blib\arch\auto\DBD\Oracle\Oracle.dll.manifest del blib\arch\auto\DBD\Oracle\Oracle.dll.manifest
     C:\Perl64\bin\perl.exe -MExtUtils::Command -e "chmod" -- 755 blib\arch\auto\DBD\Oracle\Oracle.dll
     C:\Perl64\bin\perl.exe -MExtUtils::Command -e "cp" -- blib\arch\auto\DBD\Oracle\
     C:\Perl64\bin\perl.exe -MExtUtils::Command -e "chmod" -- 644 blib\arch\auto\DBD\Oracle\
     C:\Perl64\bin\perl.exe "-Iblib\arch" "-Iblib\lib" ora_explain.PL ora_explain
Extracted ora_explain from ora_explain.PL with variable substitutions.
     C:\Perl64\bin\perl.exe -MExtUtils::Command -e "cp" -- ora_explain blib\script\ora_explain
        pl2bat.bat blib\script\ora_explain

At this point you are all done.

Well, almost.

It is important that you test your code before you install, but you will have to set a few things up first to get it to test fully and correctly.

You will need a TNSNAMES.ORA file, in the Instant Client directory, that points to a valid DB.
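For reference, a minimal TNSNAMES.ORA entry looks something like the following sketch (the host, port, and service name here are placeholders; swap in the values for your own database):

```
XE =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = XE))
  )
```

Drop the file into c:\IC_11 (the Instant Client directory), since that is where TNS_ADMIN will point.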

Next, you will need to set ORACLE_USER_ID to a valid user:

     c:\DBD-Oracle set ORACLE_USER_ID=system/system@XE

You will also have to set TNS_ADMIN to point to the Instant Client directory:

     c:\DBD-Oracle set TNS_ADMIN=c:\IC_11

Most importantly, you will have to add the Instant Client directory to your path, like this:

     c:\DBD-Oracle path = c:\IC_11;%path%

If you do not do this step, you will run into the dreaded

Can’t load ‘C:/Perl/lib/auto/DBD/Oracle/Oracle.dll’ for module DBD::Oracle: load_file:%1 is not a valid Win32 application at C:/Perl/lib/ line 202.

error later on, after the compile, when you try to use DBD::Oracle.

What is actually going on is that Perl cannot find oci.dll (or one of the other DLLs it needs to run). The complaint about

C:/Perl/lib/auto/DBD/Oracle/Oracle.dll’ and the DynaLoader error

is just a false trail, as Perl is very limited in which Windows errors it can report on. For more complete info on this sort of error, check out this page:

Oracle Troubleshooter HOWTO

by Alexander Foken. It is rather dated, but the facts of why Perl did not find a DLL are still valid.

Now you can do this:

     c:\DBD-Oracle nmake test

and all the tests should run and report their results.

Finally, simply do a

     c:\DBD-Oracle nmake install

and you are all set.

That is about it.

At this point, you might want to add the Instant Client directory permanently to your path so that you will not run into the DynaLoader error again.

You do not need to keep Visual Studio around to use DBD::Oracle, so you can uninstall it as well.

March 17, 2009

» How to Have a Good Presentation

In about 15 minutes, Giuseppe Maxia will begin a webinar in which the main focus is a presentation on “How to have a good presentation”. Talk about meta!

Giuseppe posted how to join the free webinar.

The slides can be found at

June 25, 2008

» Liveblogging: Automated System Management

Usenix 2008 - Automated System Management, by Æleen Frisch of Exponential Consulting (and author of numerous books)

What is automation?

generic [perl|shell] scripts with cron, at
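The script-plus-cron baseline is just a crontab entry; a sketch, with a made-up script path for illustration:

```
# Hypothetical entry: run a nightly cleanup script at 02:00,
# appending its output to a log file.
0 2 * * * /usr/local/bin/ >> /var/log/cleanup.log 2>&1
```

Every admin ends up writing entries like this independently, which is exactly the overlap of effort the talk describes.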

Problem: overlap of effort

So folks developed automation systems. General automation tools are around:

cfengine, puppet, bcfg2

These are general — files, directories, etc. You don't need to use chmod, chown, and the underlying commands directly.

However, they don’t really survive reboots well. For that, we tend to use tools more towards jumpstart, kickstart.

Monitoring is done with Nagios; related tools are RRD-based tools such as Cacti, Cricket, and Munin, "or any of 8,000 others." Automating tools like iostat.

Nessus is a security testing tool.

Homegrown tools, general tools, performance-related tools, and also automated backups — Bacula, Amanda, Legato.

What do you want automated?

“Coffee machines”.

A lot of unsolved problems are human interaction.

Other problems solved — using remote power management.

Inventory management is another issue. HP OpenView is one, but Frisch says folks are not happy with it. You can pay for high-end monitoring systems.

A question came up about an inventory of users on systems. LDAP or NIS or Active Directory is the traditional solution where there are no local accounts. There’s authentication and then authorization, and the automated tools usually have authentication information but not authorization information. (You can handle it, but making groups on these tools is usually painful.) Authorization is usually handled either locally or as “if you’re authenticated you’re authorized”.

We talked about how to power down 500 machines when the air conditioning goes out, or when the power is going down. Combinations of temperature probes, “wake-on-lan”, remote power on and off were discussed.

What do people use to automate installs and configuration on Windows? For installation, the Windows native tools are great. It was noted that efs works better on Windows.

Anyone using Splunk with Windows? One answer — it works OK, there are some daemon tools to convert Windows Event Log to syslog.

Splunk came up as a topic of discussion, how it’s a great log management software and solves a problem we’ve had for decades — how to deal with logs. Frisch says, “Splunk is the most promising thing out there.”

Record keeping of time was brought up, as well as time management. This is basically what we do at Pythian, so I explained how we do things. Other folks brought up ticketing systems as well: Jira, RT (Request Tracker), and OTRS (Open Ticket Request System).

Also, for change management, some folks use ClearCase (not open source), others use RANCID, and others use Trac or Bugzilla plus a change-management system like Subversion. Jira was recommended as a product that does both (with an add-on).

Use DHCP to help automate IP assigning. rsync is your friend too.

(it occurs to me that a dishwasher is an interesting problem; why do we have a dishwasher instead of just having a sink/dishwasher hybrid? Similarly, a hamper that does laundry for you when it’s full.)

January 8, 2008

» How to advocate for good backups! Or, how NOT to advocate for good backups!

I try to do a decent job of advocating for caring about good backups and business continuity strategies in my 7 Deadly Habits article.

But this one beats them all:

Grave Warning

Just too funny and great not to share. Found via this reddit article, where there is a lively discussion underway.

December 21, 2007

» Where is Storage QoS?

In the era of consolidation, storage has not been left out. Different systems are made to share the same storage boxes, fiber-channel switches and networks. Inside a typical storage box, we have front-end and back-end controllers, cache, physical spindles shared amongst different applications, databases, backup destinations, and so on.

The impact of backups on normal database activity, or batch processing in one database impacting transactional processing in another — these are two real-life examples of the consequences of storage consolidation, known to almost every DBA. Of course, it's easy to suggest separating databases onto different physical disks, but what about SAN box controllers and shared cache? And don't forget the cost factor and the ubiquitous consolidation that forces storage administrators to pack as much data as possible into a single SAN or NAS storage device.

Some of our customers use hosting services — they outsource hardware hosting just like they outsource DBA work to Pythian. In such scenarios, hosting service providers usually have storage hardware shared amongst different customers to provide higher utilization and on-demand storage capacity at a lower cost.


December 19, 2007

» Pythian Goodies: The Answer to Free Memory, Swap, Oracle, and Everything

I gave this talk at the UKOUG, and I have received a few requests to post the slides online. Instead of just posting the PowerPoint I took some time to give the presentation again (internally here at Pythian) and this time we recorded the session and we’re posting it here in a variety of formats. This is a bit of a departure from the typical Pythian Goodies, in that it is scripted, and there is a lot of content here in the whitepaper, but there hasn’t been a Goodie in a while so why not!

I’d love to hear from you, so please feel free to ask any follow-up questions to this post in the comments.


Do I have enough memory? Why is my free memory so low? Am I swapping to disk? Can I increase my SGA (db cache) size? Can I add another instance to this server? Are my system resources used optimally? These are all questions that often haunt DBAs. This presentation is The Answer. It covers in detail the different types of memory, how to monitor memory, and how to use it optimally with Oracle. Multiple examples in the presentation demonstrate how certain actions on the database side cause different memory areas to be allocated and used on the OS side. Key underlying differences in operating systems’ approaches to managing memory will be highlighted, with special attention given to Linux, Solaris, and Windows. Using Linux as an example throughout, this presentation explains how to effectively use tools such as “top”, “vmstat” and “/proc/meminfo” to look into a system’s allocation and use of memory.
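As a small taste of what the session covers, the raw numbers that top, free, and vmstat summarize come straight from /proc/meminfo, which you can query directly (Linux-specific; the exact set of fields varies slightly by kernel version):

```shell
#!/bin/sh
# Pull the headline memory counters straight from the kernel.
# These are the same figures that top, free, and vmstat report.
grep -E '^(MemTotal|MemFree|Buffers|Cached|SwapTotal|SwapFree):' /proc/meminfo
```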

Below you should see a flash video with me giving the session.

Download this presentation!
iPod video (right-click and Save As . . .)
MP3 audio only

And below you will find the complete contents of the whitepaper. This is intended to be a good overall reference resource for how memory works in Oracle, using Linux as an example.

