October 15, 2012

» Adding Networks to Exadata: Fun with policy routing

I’ve noticed that Exadata servers are now configured to use Linux policy routing. Peeking at My Oracle Support, note 1306154.1 goes into a bit more detail about this configuration. It’s apparently delivered by default with factory build images and later. The note goes on to explain that this configuration was implemented because of asymmetric routing problems associated with the management network:

Database servers are deployed with 3 logical network interfaces configured: management network (typically eth0), client access network (typically bond1 or bondeth0), and private network (typically bond0 or bondib0). The default route for the system uses the client access network and the gateway for that network. All outbound traffic that is not destined for an IP address on the management or private networks is sent out via the client access network. This poses a problem for some connections to the management network in some customer environments.

It goes on to mention a bug where this was reported:


The bug is not public, but the title does show the type of error messages that would appear if a packet with a non-local source address is sent out.

This configuration is implemented using Red Hat/Oracle Linux-style /etc/sysconfig/network-scripts files, with matched rule- and route- files for each interface.

A sample configuration, where the management network is in the 10.10.10/24 subnet, is:

[root@exa1db01 network-scripts]# cat rule-eth0
from table 220
to table 220
[root@exa1db01 network-scripts]# cat route-eth0
dev eth0 table 220
default via dev eth0 table 220

This configuration directs traffic originating from the IP (which is the management interface IP on this particular machine), as well as traffic destined to this address, away from the regular system routing table to a special routing table, 220. route-eth0 populates table 220 with two routes: one for the local network, and a default route through a router on the network.

This contrasts with the default gateway of the machine itself:

[root@exa1db01 network-scripts]# grep GATEWAY /etc/sysconfig/network

The difference between this type of policy routing and regular routing is that traffic with the _source_ address of the management interface will automatically go through its own default gateway, regardless of the destination. (For those looking for more details, the bible of Linux routing configuration is the Linux Advanced Routing and Traffic Control HOWTO.)
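As an illustration, here is what a complete rule-/route- pair could look like, with hypothetical addresses filled in on the 10.10.10/24 management network from the sample above (these values are mine, not from the actual system):

```shell
# /etc/sysconfig/network-scripts/rule-eth0  (hypothetical addresses)
from table 220
to table 220

# /etc/sysconfig/network-scripts/route-eth0 dev eth0 table 220
default via dev eth0 table 220
```

At ifup time, the network scripts feed these files to `ip rule add` and `ip route add`, producing the table 220 rules and routes described above.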

I ran into an issue with this configuration when adding a second external network on the bondeth1 interface. I set up the additional interface configuration for the new network:

[root@exa1db01 network-scripts]# cat ifcfg-bondeth1
BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000 num_grat_arp=100"

I also added rule and route entries:

[root@exa1db01 network-scripts]# cat rule-bondeth1
from table 211
to table 211
[root@exa1db01 network-scripts]# cat route-bondeth1
dev bondeth1 table 211
via dev bondeth1 table 211
default via dev bondeth1 table 211

This was a dedicated Data Guard network to a remote server, IP

The problem with this configuration was: it didn’t work. Using tcpdump, I could see incoming requests arrive on the bondeth1 interface, but the replies went out the system default route on bondeth0, never reaching their destination. After some digging, I found the cause of the problem: in order to determine the packet source IP, the kernel was looking up the destination in the default routing table (table 255). And the route for the network was in the non-default table 211. So the reply followed the default route instead, got a source address in the client-access network, and never matched any of the routing rules for the Data Guard network.
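A handy way to watch this source-address selection in action is `ip route get`, which shows the route, device, and source address the kernel would pick for a given destination. The peer address below is hypothetical:

```shell
# Ask the kernel which route it would use to reach a (hypothetical)
# Data Guard peer at
ip route get
# A reply along the lines of
#   " via <gw> dev bondeth0 src <client-access IP>"
# shows replies leaving via the default route with the wrong source address.
```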

The solution ended up being rather simple: taking the “table 211” qualifier off the Data Guard network route, effectively putting it in the default routing table:

[root@exa1db01 network-scripts]# cat route-bondeth1
dev bondeth1 table 211
default via dev bondeth1 table 211
via dev bondeth1

And then we ran into a second issue: the main interface IP could now be reached, but not the virtual IP (VIP). This is because the rule configuration, taken from the samples, doesn’t list the VIP address at all. To avoid this issue, and to handle VIP addresses migrating from other cluster nodes, we used a netmask in the rule file, making all addresses in the Data Guard network use this routing rule:

[root@exa1db01 network-scripts]# cat rule-bondeth1
from table 211
to table 211

So to sum up, when setting up interfaces on a policy-routed Exadata system, remember to:

  • Set up the interface itself and any bonds using ifcfg- files
  • Create a rule- file for the interface, encompassing every possible address the interface could have. I added the entire IP subnet. Add “from” and “to” lines with a unique routing table number
  • Create a route- file for the interface, listing a local network route and a default route through the default router of the subnet, all using the table number defined in the previous step
  • Add to the route- file any static routes required on this interface, but don’t add a table qualifier to them

The final configuration:

[root@exa1db01 network-scripts]# cat ifcfg-eth8
[root@exa1db01 network-scripts]# cat ifcfg-eth12
[root@exa1db01 network-scripts]# cat ifcfg-bondeth1
BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000 num_grat_arp=100"
[root@exa1db01 network-scripts]# cat rule-bondeth1
from table 211
to table 211
[root@exa1db01 network-scripts]# cat route-bondeth1
dev bondeth1 table 211
default via dev bondeth1 table 211
via dev bondeth1

September 20, 2012

» Troubleshooting ORA-27090 async I/O errors with systemtap

Last week I ran into an issue on a high-volume Oracle database, whereby sessions were periodically failing with ORA-27090 errors. Job queue processes were also seeing this error, and showing messages like this in the database alert log:

Tue Sep 11 20:56:15 2012
Errors in file /orahome/oracle/base/diag/rdbms/dbm/dbm1/trace/dbm1_j001_126103.trc:
ORA-27090: Unable to reserve kernel resources for asynchronous disk I/O
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 3
Additional information: 128
ORA-27090: Unable to reserve kernel resources for asynchronous disk I/O
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 3
Additional information: 128

The tracefile just showed the same ORA-27090 messages, so nothing particularly useful there. oerr is of no help:

$ oerr ora 27090
27090, 00000, "Unable to reserve kernel resources for asynchronous disk I/O"
// *Cause: The system call to reserve kernel resources for asynchronous I/O
// has failed.
// *Action: Check errno

There’s a known bug, 7306820 “ORA-7445 [krhahw] / ORA-27090 during file header read. Instance may crash”, but this bug is fixed in, and this database is running

And on top of that, it’s an Exadata system, so I/O to storage servers goes over the InfiniBand network rather than using async I/O (AIO) calls.

A web search turned up a blog entry from DBHK’s blog, pointing to a value of aio-max-nr being set too low. However, aio-max-nr was actually set to the recommended value, matching Exadata defaults as well:

# cat /proc/sys/fs/aio-max-nr

The Linux kernel documentation has a brief but meaty description of this parameter:

aio-nr & aio-max-nr:

aio-nr is the running total of the number of events specified on the io_setup system call for all currently active aio contexts. If aio-nr reaches aio-max-nr then io_setup will fail with EAGAIN. Note that raising aio-max-nr does not result in the pre-allocation or re-sizing of any kernel data structures.

Having a peek at aio-nr:

# cat /proc/sys/fs/aio-nr

We’re within 2 of the absolute limit, so it looks highly likely that this limit is indeed the problem. But the question remains: who is using these AIO events? This DB is a huge session hog (8000+ concurrent sessions), but even so, 3m is a pretty high limit. And at this point we can’t even be sure that it’s database processes using up the AIO events.

The only AIO-related information in /proc (or /sys, for that matter) is the two files in /proc/sys/fs. Going into more detail requires some more tools.
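As a quick first check, those two /proc files can simply be compared against each other. The helper below is a small sketch; the sample numbers are illustrative, roughly matching the ~3m limit discussed here:

```shell
#!/bin/sh
# Report what percentage of the AIO completion-event limit is in use.
# On a live system, call it with the real values:
#   aio_headroom "$(cat /proc/sys/fs/aio-nr)" "$(cat /proc/sys/fs/aio-max-nr)"
aio_headroom() {
  nr=$1; max=$2
  echo $(( nr * 100 / max ))   # integer percent of the limit in use
}

# Illustrative values: 3145216 events in use against a 3145728 limit
aio_headroom 3145216 3145728   # prints 99
```

Anything persistently in the high 90s means io_setup calls are about to start failing with EAGAIN, which Oracle surfaces as ORA-27090.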

Solaris admins will no doubt be familiar with DTrace, a kernel tracing framework that can expose all kinds of information in the OS kernel, among other things. Oracle has ported DTrace to Linux, but it requires the latest-and-greatest UEK2 kernel, and it’s not yet supported on Exadata.

I came across another tool that also allows kernel inspection and _is_ available in Oracle Linux 5: systemtap. systemtap hooks into the kernel call stack, allowing function calls to be traced, arguments captured, and, if you’re really brave, even modified.

With dependencies, I ended up needing to add four packages. As this machine doesn’t (yet) have a working yum repository, I used to obtain the following:


The avahi package is a tool for plug-and-play networking that I don’t exactly want running on a server, but the systemtap binary is linked against it for remote compilation capability. Avahi configures itself to auto-start on the next boot, so I disabled that:

# chkconfig avahi-daemon off
# chkconfig avahi-dnsconfd off

The systemtap packages complained about missing kernel package dependencies, since this system is running’s UEK kernel, naming the kernel package kernel-u2k instead. I ended up installing them with the --nodeps option to skip dependency checking.

I couldn’t find any pre-made scripts to monitor AIO, but a 2008 presentation from Oracle Linux engineering does have a bullet point on it:

• Tracking resources tuned via aio_nr and aio_max_nr

So, based on some of the many example scripts, I set out to build a script to monitor AIO calls. Here is the end result:

stap -ve '
global allocated, allocatedctx, freed

probe syscall.io_setup {
  allocatedctx[pid()] += maxevents; allocated[pid()]++;
  printf("%d AIO events requested by PID %d (%s)\n",
  	maxevents, pid(), cmdline_str());
}
probe syscall.io_destroy {freed[pid()]++}

probe kprocess.exit {
  if (allocated[pid()]) {
     printf("PID %d exited\n", pid());
     delete allocated[pid()];
     delete allocatedctx[pid()];
     delete freed[pid()];
  }
}

probe end {
  foreach (pid in allocated) {
    printf("PID %d allocated=%d allocatedevents=%d freed=%d\n",
      pid, allocated[pid], allocatedctx[pid], freed[pid]);
  }
}'

Sample output (using systemtap’s -v verbose option to see compilation details):

Pass 1: parsed user script and 76 library script(s) using 147908virt/22876res/2992shr kb, in 130usr/10sys/146real ms.
Pass 2: analyzed script: 4 probe(s), 10 function(s), 3 embed(s), 4 global(s) using 283072virt/49864res/4052shr kb, in 450usr/140sys/586real ms.
Pass 3: using cached /root/.systemtap/cache/11/stap_111c870f2747cede20e6a0e2f0a1b1ae_6256.c
Pass 4: using cached /root/.systemtap/cache/11/stap_111c870f2747cede20e6a0e2f0a1b1ae_6256.ko
Pass 5: starting run.
128 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
4096 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))
128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))
(and when control-C is pressed):

PID 99043 allocated=6 allocatedevents=12672 freed=3
PID 37074 allocated=12 allocatedevents=25344 freed=6
PID 99039 allocated=18 allocatedevents=38016 freed=9
PID 69142 allocated=24 allocatedevents=50688 freed=12
PID 32885 allocated=36 allocatedevents=76032 freed=18
PID 69099 allocated=6 allocatedevents=12672 freed=3
Pass 5: run completed in 0usr/50sys/9139real ms.

It’s quite obvious here that the AIO allocations are all happening from oracle database processes.

From the summary output we can see that each process seems to run io_setup twice as often as io_destroy; kernel gurus may have an answer to this, but I suspect it has more to do with the data gathering than with a massive leak in AIO events.

But the more interesting result is the frequent allocation of 4096 AIO events at a time. On a database with 8000 connections, that would be over 10 times the current limit.

The only major downside of increasing this limit seems to be the risk of exhausting kernel memory. From a 2009 post to the linux-kernel mailing list:

Each time io_setup is called, a ring buffer is allocated that can hold nr_events I/O completions. That ring buffer is then mapped into the process’ address space, and the pages are pinned in memory. So, the reason for this upper limit (I believe) is to keep a malicious user from pinning all of kernel memory.

And in consultation with Oracle support, we set aio-max-nr to 50 million, enough to accommodate three databases with 16k connections each allocating 4096 AIO events. Or in other words, way more than we ever expect to use.
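Based on the mailing-list explanation above, we can put a rough ceiling on the pinned kernel memory this allows. Assuming each ring-buffer slot is the size of a struct io_event (32 bytes on x86-64 — my assumption, not stated in the post) and ignoring per-ring page overhead:

```shell
#!/bin/sh
# Rough upper bound on kernel memory pinned by AIO completion rings
# if every one of the 50M allowed events is allocated.
# Assumes 32 bytes per entry (sizeof(struct io_event) on x86-64).
AIO_MAX_NR=50000000
BYTES_PER_EVENT=32
total_bytes=$(( AIO_MAX_NR * BYTES_PER_EVENT ))
echo "$(( total_bytes / 1048576 )) MB"   # prints "1525 MB"
```

Around 1.5 GB in the absolute worst case, which is an acceptable ceiling on a database server with plenty of RAM.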

# sysctl -w fs.aio-max-nr=50000000
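Note that sysctl -w only changes the running kernel; to survive a reboot, the setting also needs an entry in /etc/sysctl.conf (standard sysctl behavior, though not shown in the transcript above):

```shell
# Persist the new limit across reboots
echo "fs.aio-max-nr = 50000000" >> /etc/sysctl.conf
# Re-apply everything in /etc/sysctl.conf and verify
sysctl -p | grep aio-max-nr
```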

And since this change, the ORA-27090 errors have gone away.

March 27, 2012

» Using Oracle VM with Amazon EC2

If you’re planning on running Oracle VM with Amazon EC2, there are some important limitations you should know about.

As part of my work getting the Oracle Linux Unbreakable Enterprise Kernel 2 working (yeah that’s a mouthful) I tried using the Oracle-supplied Oracle Linux 6 AMI images that are listed as community AMIs by Amazon:

[marc@shakybox2 ~]$ ec2-describe-images -a --filter "name=*Oracle Linux 6*"
IMAGE   ami-87a076ee    936851542197/Oracle Linux 6.1 x86_64 - OVM      936851542197    available       public          x86_64  machine aki-4e7d9527    ebs     paravirtual     ovm
BLOCKDEVICEMAPPING      /dev/sda                snap-ea95e18e   20

The “OVM” part here is interesting. Amazon EC2 is typically set up on their own version of Xen, but this image uses a different, though similarly Xen-based, hypervisor: Oracle VM. This appears to be a result of the Oracle-Amazon partnership so touted by Larry Ellison in his OpenWorld 2010 keynote.

The AMI itself launched as expected (once I had set the instance type to large; it seems we need the 7.5G of RAM the large instance provides!). But after the initial install, I was completely unable to use any kernel other than the 2.6.32-100.34.1.el6uek.x86_64 kernel originally supplied.

Every time the instance would be listed as running, but unreachable with the console log showing a variation of

"main" "root=/dev/sda1" "ro" "4"
vbd 2049 is hd0
******************* BLKFRONT for device/vbd/2049 **********

backend at /local/domain/0/backend/vbd/313/2049
Failed to read /local/domain/0/backend/vbd/313/2049/feature-barrier.
Failed to read /local/domain/0/backend/vbd/313/2049/feature-flush-cache.
41943040 sectors of 512 bytes

Since it’s not possible to specify an alternate kernel with the --kernel option on Oracle VM hosts, I can’t explicitly attach a different kernel. And the ec2-register command used to create AMI images does not have an option to create an image with the Oracle VM hypervisor at all. This Amazon FAQ document gives a clue:

Q: Can customers import Oracle Virtual Machine templates into Amazon EC2?

Oracle will distribute templates of Oracle Applications that have been produced by Oracle and approved by AWS. Customers will be able to customize these environments and save them as derivative images which also run on the Oracle Virtual Machine.

So it’s possible to customize pre-made templates, but the ability to create new templates is conspicuously not mentioned. And indeed, it’s possible to create a derivative image using ec2-create-image that still has an OVM hypervisor, but there’s no mechanism to use a custom kernel with this command.

So the take away here seems to be:

  • It’s possible to create Oracle VM-based Amazon EC2 instances, but only from Oracle-created AMIs
  • There’s no mechanism to use kernels other than those bundled with the image, so there’s no prospect of applying security updates etc unless Oracle releases a new AMI
  • Fortunately, I was able to get the Oracle Linux UEK2 kernel working with Amazon’s ordinary Xen hypervisor, though there’s no pre-built AMI for it. I made a separate blog post about this.

For you readers: have any of you managed to get a successful Oracle VM-based Amazon EC2 instance working? How do you deal with the issue of kernel updates?

» Testing out Oracle’s Unbreakable Enterprise Kernel 2

As announced a few days ago, Oracle’s core database product is now supported on Oracle Linux 6. Coming a full 13 months after Oracle Linux 6’s launch, and 16 months after Red Hat Enterprise Linux 6, it’s a much anticipated announcement.

Update 28-Mar-12: the official certification information has come out on My Oracle Support. So far it’s only certified for Oracle Linux 6 Unbreakable Enterprise Kernel version 1, and only for version under Linux x86-64. It also means that a certified combination should be possible using the Oracle-supplied OEL 6 EC2 AMI, though it’s frozen at the original 6.2 release kernel. Unfortunately Oracle is not currently available on OTN but rather requires an active license to download from My Oracle Support.

Oracle’s UEK2 (and Red Hat Enterprise Linux 6) add some very interesting features like:

  • btrfs, a big break in filesystem organization from traditional ext2/3, with the promise of lightweight copy-on-write snapshot support
  • transparent hugepages, dynamically allocating hugepages as needed and performing background memory defragmentation to attempt to free up contiguous space
  • transmit packet steering, allowing multiple CPUs to load-balance outgoing network traffic

Although the press release states it’s available “immediately”, I couldn’t find any notes on the My Oracle Support portal relating to this support; in fact, it still lists Oracle Enterprise Linux 6 as being uncertified as of this writing. So I’m not sure how it will pass the pre-installation operating system checks.

No matter; I’m going to test this out, and an obvious way to do so is using Amazon EC2, which provides high-capacity instances on demand.

After some blind alleys getting the Oracle Linux UEK2 kernel working with Amazon EC2 and Oracle VM, I found that I could make it work without Oracle VM, using Amazon’s default Xen hypervisor instead. Here are the steps I used:

- Sign up for an Amazon EC2 account and set up the EC2 API tools on your client machine. There are lots of tutorials on how to do this online

- Create a new EC2 instance using a “builder” AMI; I chose a 64-bit CentOS 6 image “CentOS 6 PVGrub EBS AMI”, AMI ID ami-1f66b276

[marc@shakybox2 tmp]$ ec2-run-instances -k marc-aws -n 1 -t m1.medium -z us-east-1d ami-1f66b276
RESERVATION     r-d18f28b2      462281317311    default
INSTANCE        i-22d8f846      ami-1f66b276                    pending marc-aws        0             m1.medium        2012-03-24T21:25:11+0000        us-east-1d      aki-427d952b                    monitoring-disabled                                    ebs                                     paravirtual   xen              sg-5fc61437     default

- Assign a name to the instance

[marc@shakybox2 tmp]$ ec2-create-tags i-22d8f846  --tag 'Name=Instance Builder'
TAG     instance        i-22d8f846      Name    Instance Builder

- Authorize incoming SSH

[marc@shakybox2 ~]$ ec2-authorize default -p 22 -s $(ip addr list dev eth0 | awk '/inet / {print $2}')

- Create a new 20G EBS volume; this will be the “golden image” root disk. Attach it to the builder instance

[marc@shakybox2 tmp]$ ec2-create-volume -s 20 -z us-east-1d
VOLUME  vol-d7340cbb    20              us-east-1d      creating        2012-03-24T21:31:39+0000

[marc@shakybox2 tmp]$ ec2-attach-volume -i i-22d8f846 -d /dev/sdd vol-d7340cbb
ATTACHMENT      vol-d7340cbb    i-22d8f846      /dev/sdd        attaching       2012-03-24T21:33:26+0000

- Get the IP address to connect to (substituting the correct image ID and hostname):

[marc@shakybox2 tmp]$ ec2-describe-instances  i-22d8f846
RESERVATION     r-d18f28b2      462281317311    default
INSTANCE        i-22d8f846      ami-1f66b276 ip-10-116-237-78.ec2.internal  running marc-aws        0               m1.medium       2012-03-24T21:25:11+0000      us-east-1d       aki-427d952b                    monitoring-disabled ebs                                      paravirtual     xen             sg-5fc61437     default
BLOCKDEVICE     /dev/sda1       vol-39310955    2012-03-24T21:25:28.000Z        true
TAG     instance        i-22d8f846      Name    Instance Builder

[marc@shakybox2 tmp]$ ssh -i marc-aws.pem

- Finding the volume inside our AMI, which just got hotplugged:

[root@ip-10-116-237-78 ~]# dmesg | tail -2
blkfront: xvdh: barriers disabled
 xvdh: unknown partition table

- Creating a filesystem and mounting. Note I’m not creating a partition table, but rather a raw filesystem. This will make things much easier if the volume ever needs to be resized.

[root@ip-10-116-237-78 ~]# mke2fs -j -L / /dev/xvdh
mke2fs 1.41.12 (17-May-2010)
Filesystem label=/
OS type: Linux
This filesystem will be automatically checked every 35 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@ip-10-116-237-78 ~]# mkdir /mnt/ec2-fs
[root@ip-10-116-237-78 ~]# mount /dev/xvdh /mnt/ec2-fs

- Creating the base directories, as per Jonathan Hui’s excellent blog post

mkdir /mnt/ec2-fs/dev
/sbin/MAKEDEV -d /mnt/ec2-fs/dev/ -x console
/sbin/MAKEDEV -d /mnt/ec2-fs/dev/ -x null
/sbin/MAKEDEV -d /mnt/ec2-fs/dev/ -x zero
mkdir /mnt/ec2-fs/etc
mkdir /mnt/ec2-fs/proc

- Creating /etc/fstab

cat > /mnt/ec2-fs/etc/fstab <<EOF
LABEL=/ /       ext3    defaults 1 1
none    /proc   proc    defaults 0 0
none    /sys    sysfs   defaults 0 0
EOF

- Mounting /proc

mount -t proc none /mnt/ec2-fs/proc

- Grabbing the OEL 6 yum config file

[root@ip-10-116-237-78 ~]# cd /root
[root@ip-10-116-237-78 ~]# wget
--2012-03-24 22:42:54--
Connecting to||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1461 (1.4K) 1
Saving to: “public-yum-ol6.repo”

100%[======================================>] 1,461       --.-K/s   in 0s

2012-03-24 22:42:55 (106 MB/s) - “public-yum-ol6.repo” saved

cat <<-EOF >> public-yum-ol6.repo

- Installing the base OS

[root@ip-10-116-237-78 ~]# yum -c /root/public-yum-ol6.repo --installroot=/mnt/ec2-fs -y groupinstall Core
ol6_latest                                               | 1.1 kB     00:00

- Installing the latest UEK2 kernel

yum -c /root/public-yum-ol6.repo --enablerepo=ol6_UEK_latest --installroot=/mnt/ec2-fs -y install kernel-uek

(not small: 200m+ for the kernel alone)

- Setting up base networking scripts

cat > /mnt/ec2-fs/etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
EOF
echo "NETWORKING=yes" > /mnt/ec2-fs/etc/sysconfig/network
echo "nameserver" > /mnt/ec2-fs/etc/resolv.conf
echo "UseDNS  no" >> /mnt/ec2-fs/etc/ssh/sshd_config
echo "PermitRootLogin without-password" >> /mnt/ec2-fs/etc/ssh/sshd_config
echo "hwcap 0 nosegneg" > /mnt/ec2-fs/etc/

- Script the download of the SSH public key on startup in case it’s missing (though with EBS-backed storage this shouldn’t be necessary)

cat > /mnt/ec2-fs/usr/local/sbin/ <<EOF
#!/bin/sh
if [ ! -d /root/.ssh ] ; then
  mkdir -p /root/.ssh
  chmod 700 /root/.ssh
fi
# Fetch public key using HTTP
/usr/bin/curl -f > /tmp/my-key
if [ \$? -eq 0 ] ; then
  cat /tmp/my-key >> /root/.ssh/authorized_keys
  chmod 600 /root/.ssh/authorized_keys
  rm -f /tmp/my-key
fi
EOF
chmod u+x /mnt/ec2-fs/usr/local/sbin/
echo "/usr/local/sbin/" >> /mnt/ec2-fs/etc/rc.d/rc.local

- Clean up temporary files from the installs (mostly the RPM packages)

yum -c /root/public-yum-ol6.repo --installroot=/mnt/ec2-fs clean all

- Set up GRUB boot files (note the kernel version here should match the UEK2 kernel installed above)

chroot /mnt/ec2-fs
cd /boot/grub
cat > grub.conf <<EOF

title Oracle Linux UEK
        root (hd0)
        kernel $(echo /boot/vmlinuz*uek.x86_64) ro root=LABEL=/ ro console=hvc0 crashkernel=auto LANG=en_US.UTF8 selinux=0
        initrd $(echo /boot/initramfs*uek.x86_64.img)
title Oracle Linux Compatibility Kernel
        root (hd0)
        kernel $(echo /boot/vmlinuz*el6.x86_64) ro root=LABEL=/ ro console=hvc0 crashkernel=auto LANG=en_US.UTF8 selinux=0
        initrd $(echo /boot/initramfs*el6.x86_64.img)
EOF
ln -s grub.conf menu.lst

- Set up swap. There’s no need to put this on expensive EBS storage when ephemeral storage will do.

cat > /mnt/ec2-fs/usr/local/sbin/ <<EOF
#!/bin/sh
VOL=\$1
SIZE=\$2
dd if=/dev/zero of=\$VOL bs=\${SIZE}k count=1 && mkswap \$VOL \$SIZE && swapon \$VOL
EOF
chmod +x /mnt/ec2-fs/usr/local/sbin/
echo "/usr/local/sbin/ /dev/xvdb 2097152" >> /mnt/ec2-fs/etc/rc.d/rc.local

- Unmount our root disk

umount /mnt/ec2-fs/proc
umount /mnt/ec2-fs

- Log out of the builder instance; our work there is done


- Create a snapshot of the root volume, using the ID of the volume we created earlier

[marc@shakybox2 tmp]$ ec2-create-snapshot -d "UEK2 AMI creation point" vol-d7340cbb
SNAPSHOT        snap-b94519c3   vol-d7340cbb    pending 2012-03-25T02:05:43+0000                462281317311   20      UEK2 AMI creation point

- Check when it’s completed

[marc@shakybox2 tmp]$ ec2-describe-snapshots snap-b94519c3
SNAPSHOT        snap-b94519c3   vol-d7340cbb    completed       2012-03-25T02:05:43+0000        100%  462281317311     20      UEK2 AMI creation point

- Register the snapshot, effectively creating an AMI image. This is a long command and cannot be changed once created, so it deserves some explanation:

-n: user-provided unique name
-a: architecture, which must match the 64-bit kernel
-d: description, a free-text description
--root-device-name: this maps to the “root” parameter given to the PVGRUB bootloader
-b: block mapping. There are two here: one pointing to the root volume snapshot we just created, and one on non-permanent storage we’ll use for swap
--kernel: this kernel is actually a stub kernel running PVGRUB, a bootloader that loads the UEK2 kernel from the root drive. The kernel ID used here is a generic 64-bit Amazon PVGRUB kernel for an unpartitioned image in the us-east region.

[marc@shakybox2 tmp]$ ec2-register -n UEK2-AMI -a x86_64 -d "AMI using the Oracle Linux UEK2 kernel" --root-device-name /dev/sda -b /dev/sda=snap-b94519c3 -b /dev/sdb=ephemeral0 --kernel aki-427d952b
IMAGE   ami-c39f41aa

- Now for the moment of truth: launch a VM based on the newly-created AMI

[marc@shakybox2 tmp]$ ec2-run-instances -k marc-aws -n 1 -t m1.medium  ami-c39f41aa
RESERVATION     r-19b0167a      462281317311    default
INSTANCE        i-5688ab32      ami-c39f41aa                    pending marc-aws        0             m1.medium        2012-03-25T00:08:10+0000        us-east-1d      aki-427d952b                    monitoring-disabled                                    ebs                                     paravirtual   xen              sg-5fc61437     default
[marc@shakybox2 tmp]$ ec2-describe-instances i-5688ab32
RESERVATION     r-19b0167a      462281317311    default
INSTANCE        i-5688ab32      ami-c39f41aa       ip-10-62-98-125.ec2.internal   running marc-aws        0               m1.medium       2012-03-25T02:08:10+0000       us-east-1d      aki-427d952b                    monitoring-disabled                   ebs                                     paravirtual     xen             sg-5fc61437    default
BLOCKDEVICE     /dev/sda        vol-d59aa2b9    2012-03-25T00:08:28.000Z        true
[marc@shakybox2 tmp]$ ec2-create-tags --tag="Name=UEK2 Test Instance" i-5688ab32
TAG     instance        i-5688ab32      Name    UEK2 Test Instance

ssh’ing into the machine, we can confirm it’s running the UEK:

[root@ip-10-62-98-125 ~]# uname -a
Linux ip-10-62-98-125 2.6.39-100.5.1.el6uek.x86_64 #1 SMP Tue Mar 6 20:26:00 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

Adding swap

Oracle’s pre-installation steps require swap space. Since Amazon EBS charges for storage by the GB, it makes little sense to pay for persistent storage for swap; the alternative is to use transient ephemeral storage. Since at boot time we can’t be guaranteed of its state, it’s safest to zero it out and create swap at that point. We set aside some space on /dev/sdb (which maps to /dev/xvdb, since the Oracle UEK kernel doesn’t do the drive remapping that the Amazon kernel does).

We’ll create a startup script to run in rc.local, the last point in the startup. It will take a while to run, but since sshd and other system services will already be running, it shouldn’t slow down the startup of any other processes.

cat > /usr/local/sbin/ <<EOF
#!/bin/sh
VOL=\$1
SIZE=\$2
dd if=/dev/zero of=\$VOL bs=\${SIZE}k count=1 && mkswap \$VOL \$SIZE && swapon \$VOL
EOF
chmod +x /usr/local/sbin/
echo "/usr/local/sbin/ /dev/xvdb 2097152" >> /etc/rc.d/rc.local
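The numeric argument passed via rc.local is the swap size in kilobytes; a quick sanity check of the 2097152 figure used above:

```shell
#!/bin/sh
# The rc.local entry passes the swap size in KB; confirm it is 2 GiB.
SIZE_KB=2097152
echo "$(( SIZE_KB / 1024 / 1024 )) GiB"   # prints "2 GiB"
```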


There aren’t a whole lot of troubleshooting tools in EC2, especially compared to something like Oracle databases. There is one invaluable tool for debugging AMI builds, though: the console output. It usually takes several minutes to appear, but it can help determine what went wrong when an instance is inaccessible.

[marc@shakybox2 tmp]$ ec2-get-console-output i-76634012

It was particularly frustrating to get GRUB to find the root devices; when it can’t find them, it just displays a grubdom> prompt with no error message. The official documentation recommends a --root-device-name of /dev/sda1 (rather than /dev/sda) and the hd00 kernel aki-4e7d9527, but I couldn’t get these to work. It might be because there is no partition table on the root disk, but without access to the interactive GRUB interface or more diagnostic output I can’t know for sure.


Amazon docs for user-specified kernels
Jonathan Hui’s blog post about creating CentOS images from scratch
Amazon docs on how to launch images from snapshots
Wim Coekaerts’ blog post on using

Coming up: the actual Oracle database software install

November 15, 2011

» Linux Patching and Oracle: Detect RPM conflicts before they happen

A common scenario in the life of a DBA on a linux server looks something like this:

From: Sue the sysadmin
To: the DBA
Date: 2011-10-28 17:54:34

Dear DBA,

We will be patching the linux systems this Friday night, October 28th at 21:00. The list of patches is attached. Let us know if there are any conflicts with Oracle. BTW, we need to know by 14:00 tomorrow.

Sue

Yikes! The SysAdmin wants to apply these patches this week, and needs to know if there are any conflicts this afternoon.

So you open up the list of patches. When you see the lengthy list of patches you suddenly realize the long lunch you had planned with your sweetheart is going to be rescheduled.

I recently received several lists of patches that were to be applied to a number of different linux servers, all running Oracle, which led to this blog entry.

Matching Oracle shared lib dependencies to RPM files

Determining which, if any, components of Oracle might be affected by the list of RPMs to be upgraded across several servers is a daunting task.  One quick method to deal with this is to look for glibc in the list of RPMs.  The list I received contained the following:

  • glibc-2.5-58.el5_6.4.i686
  • glibc-2.5-58.el5_6.4.x86_64
  • glibc-common-2.5-58.el5_6.4.x86_64
  • glibc-devel-2.5-58.el5_6.4.i386
  • glibc-devel-2.5-58.el5_6.4.x86_64
  • glibc-headers-2.5-58.el5_6.4.x86_64

Since nearly everything will be affected by an update to glibc, it would be a good idea to test these new RPMs out on a test server before deploying them in production.

What if however you need a more exacting list of which Oracle components are affected by which packages?

This information is a little harder to come by.  While Oracle does supply a list of RPMs that must be installed on a Linux system, that information does not help you in determining which Oracle files are dependent on the RPMs in question.

If the list of RPMs to be updated is small, it may even be that few, if any, of the Oracle files on a server are affected.

Here’s how that can be determined. This may not be complete, but I think it’s a pretty good start.

Find the dependency files

Stepping through this manually, we can check an Oracle executable for the list of files it depends on with ldd, which displays shared library dependencies.

[oracle@oravm01 patch_check]$ ldd $ORACLE_HOME/bin/oracle
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54b700000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54b802000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54b958000)
        => /lib64/ (0x0000003ed8c00000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54bb2c000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54bef4000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c00d000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c130000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c66e000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c771000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54c90b000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54caa5000)
        => /u01/app/oracle/product/11.2.0/vmdb01/lib/ (0x00002ab54cbb1000)
        => /usr/lib64/ (0x00002ab54cd33000)
        => /lib64/ (0x0000003ed8000000)
        => /lib64/ (0x0000003ed7c00000)
        => /lib64/ (0x0000003ed8400000)
        => /lib64/ (0x0000003edb400000)
        => /lib64/ (0x0000003ed7800000)
        /lib64/ (0x0000003ed7400000)

Here we can see that the oracle binary depends on several library files. Many of them are owned by Oracle, but some are files that are part of the Linux system.
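
This split can be automated. The following is a small sketch (not the author's Perl script) that classifies ldd output lines into Oracle-owned versus system libraries; the `/u01/app/oracle` prefix is an assumption taken from the paths above and should be adjusted to your environment.

```python
import re


def classify_ldd_output(ldd_text, oracle_prefix="/u01/app/oracle"):
    """Return (oracle_libs, system_libs) parsed from `ldd` output text."""
    oracle_libs, system_libs = [], []
    for line in ldd_text.splitlines():
        # typical line: "libfoo.so => /path/libfoo.so (0x00002ab54b700000)"
        m = re.search(r'=>\s*(\S+)\s*\(0x[0-9a-f]+\)', line)
        if not m:
            continue  # skip lines without a resolved path, e.g. the loader
        path = m.group(1)
        (oracle_libs if path.startswith(oracle_prefix) else system_libs).append(path)
    return oracle_libs, system_libs
```

Only the system list then needs to be fed to rpm -qf; the Oracle-owned files are known in advance not to belong to any package.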

Find the RPM

The next step is to find which RPM the files are part of. We can use the rpm command for this.

The -q option is for ‘query’, while the ‘-f’ causes rpm to report which package a file is associated with.

Choosing one of the Linux-owned files as an example:

[oracle@oravm01 patch_check]$ rpm -qf /lib64/

What happens if rpm -qf is run against one of the oracle owned dependency shared library files?

[oracle@oravm01 patch_check]$ rpm -qf /u01/app/oracle/product/11.2.0/vmdb01/lib/
file /u01/app/oracle/product/11.2.0/vmdb01/lib/ is not owned by any package

As you might expect, rpm has no knowledge of Oracle-owned files, and reports that fact.

Does the patch affect Oracle?

The final step is to compare the current package with the patch list. In this case there is a match: glibc-2.5-49 is to be replaced with glibc-2.5-58.
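
That comparison can also be sketched in a few lines. Note this is an illustration, not the author's script: it matches only on package name, splitting `name-version-release.arch` on the last two hyphens, whereas a production check would also compare versions using rpm's epoch:version-release ordering rules.

```python
def parse_nvr(pkg):
    """Split 'name-version-release[.arch]' on the last two hyphens."""
    name, version, release = pkg.rsplit('-', 2)
    return name, version, release


def find_upgrades(installed, patch_list):
    """Return (installed_pkg, patch_pkg) pairs whose package names match."""
    by_name = {}
    for p in patch_list:
        by_name.setdefault(parse_nvr(p)[0], []).append(p)
    hits = []
    for pkg in installed:
        for p in by_name.get(parse_nvr(pkg)[0], []):
            hits.append((pkg, p))
    return hits
```

Fed the installed glibc-2.5-49 package and the patch list from earlier, this flags both glibc architectures as candidate conflicts.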

Now all you need to do is repeat this process for all Oracle binaries. This obviously does not work well as a manual process.

Fortunately this is not too difficult a task to automate.   Those readers already familiar with me will not be surprised to learn that this automation involves the use of Perl.

The Patch Check Script

Or rather, scripts. The Perl script is complemented by a driver shell script. The reason for using a driver shell script is that there may be more than one Oracle home directory on the server that you need to check.

One of the few limitations of using Perl with Oracle involves the LD_LIBRARY_PATH variable. This variable cannot be set from within a Perl script, as the Perl interpreter loads the value of LD_LIBRARY_PATH before running your script, so subsequent attempts to change it are futile. There are some hacks that will work on some platforms to allow LD_LIBRARY_PATH to be changed at runtime, but I would rather not go that route. The limitations of LD_LIBRARY_PATH are pretty well known, so it is best just to work around that particular issue.

The RPM patch list

The list of RPMs should be in a file, one RPM per line. The file names should not contain the .rpm extension. If the extension is included, you will need to remove it. Either that, or you will need to change the script to work with the .rpm extension.

Why does the list of RPMs not include the .rpm extension?  Simple, that’s how I received the list.
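
Stripping the extension is trivial to do up front. A minimal sketch (a hypothetical helper, not part of the distributed scripts) that normalizes each line of the patch list:

```python
def normalize_patch_list(lines):
    """Trim whitespace, drop blanks, and strip a trailing '.rpm' extension."""
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith('.rpm'):
            line = line[:-len('.rpm')]
        out.append(line)
    return out
```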

Here is a sample of files from a list of RPMs to be updated:

[oracle@oravm01 patch_check]$ head -5 patch_list.txt

Which Files are searched?

Not all files in an Oracle home are processed. The find command is used with the -perm /u+x option to find all files that have the executable bit set for the file owner.

These files are then processed to determine RPM dependencies.
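
For illustration, the same selection can be expressed in Python (a sketch equivalent to `find $ORACLE_HOME -type f -perm /u+x`, not code from the distribution):

```python
import os
import stat


def find_owner_executables(root):
    """Return regular files under root with the owner-execute bit set."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_IXUSR:
                hits.append(path)
    return sorted(hits)
```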

The Perl script

Let’s invoke the Perl script to see what it does when run against a single Oracle home with the standard setting for LD_LIBRARY_PATH and other Oracle environment variables.

Here’s the current environment:

[oracle@oravm01 patch_check]$ set | grep -E 'ORA|LD_LIB'

Now run the script. Notice that STDERR has been redirected to a file – more on that later

[oracle@oravm01 patch_check]$ ./ -verbosity 0 -oracle_home $ORACLE_HOME -linux_patch_list patch_list.txt 2> pc.err

#### Possible Conflicts Found #####
--- gdbm-1.8.0-26.2.1.el5_6.1.x86_64
old: 1.8.0 26.2
new: 1.8.0 26.2.1.el5_6.1
--- glibc-2.5-58.el5_6.4.x86_64
old: 2.5 49
new: 2.5 58.el5_6.4
--- libgcc-4.1.2-50.el5.x86_64
old: 4.1.2 48
new: 4.1.2 50.el5
--- libstdc++-4.1.2-50.el5.x86_64
old: 4.1.2 48
new: 4.1.2 50.el5

The script has found 4 RPM packages that are scheduled to be updated and on which Oracle depends.
This new information may cause the patch schedule to be pushed back until it is determined that these updates will have no adverse effects on Oracle.

Now let’s consider the file pc.err, to which all error messages were redirected.

We see there are 336 lines in the error file.

[oracle@oravm01 patch_check]$ wc -l pc.err
336 pc.err

The majority of the errors are from rpm -qf: as seen earlier, the RPM database knows nothing of files owned by Oracle. As such, these are errors we are not really concerned with. There may however be errors hiding in the file that we should be concerned about.

[oracle@oravm01 patch_check]$ head -7 pc.err
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found

Following is another error. This one is not an rpm error, so it may be something to be concerned about.

##### ldd error ############
error encountered with /usr/bin/ldd /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/
run the command manually to see the error

Running the command manually results in a segmentation fault.

[oracle@oravm01 patch_check]$ ldd /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/
/usr/bin/ldd: line 116:  5216 Segmentation fault      LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW= LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$@"

A quick Google search for ‘ldd segmentation fault’ suggests this is a Linux bug. We will not investigate it further, as it is not a concern at this time.

With the large number of errors that we do not really care about being reported by the rpm -qf command, how can we reduce the number of errors?

The shell script

The script serves two purposes. It reads the file oracle_homes.txt and runs the Perl script for each ORACLE_HOME. The script also sets the LD_LIBRARY_PATH variable to eliminate many of the false positives seen in pc.err previously. Following is the entire script. You may want to adjust the values used for LD_LIBRARY_PATH to suit your environment.


#!/bin/sh

# set the environment to call (patch check)

# file names as described in the surrounding text
OHOMES_FILE=oracle_homes.txt
PATCH_LIST_FILE=patch_list.txt

[ -r "$OHOMES_FILE" ] || {
    echo The file $OHOMES_FILE is missing
    exit 1
}

while read ohome
do
    sid=$(basename $ohome)
    LOGFILE=patch_check_${sid}.log
    ERRFILE=patch_check_${sid}.err

    echo $ohome

    # you may have to experiment a bit to get the needed paths in LD_LIBRARY_PATH
    # in 11g there will probably still be a few files that need to be checked manually
    export LD_LIBRARY_PATH=$ohome/lib:$ohome/jdk/jre/lib/amd64:$ohome/jdk/jre/lib/amd64/server

    ./ -verbosity 0 -oracle_home $ohome -linux_patch_list $PATCH_LIST_FILE >$LOGFILE 2>$ERRFILE
done < "$OHOMES_FILE"


Let’s run the script and then check the error logs.

[oracle@oravm01 patch_check]$ ./

Checking on the output files shows that the error logs are much smaller than what was seen previously.

[oracle@oravm01 patch_check]$ ls -ltar *.log *.err
-rw-r--r-- 1 oracle oinstall   455 Nov 14 13:18 patch_check_grid.err
-rw-r--r-- 1 oracle oinstall 34384 Nov 14 13:18 patch_check_grid.log
-rw-r--r-- 1 oracle oinstall   296 Nov 14 13:18 patch_check_vmdb01.err
-rw-r--r-- 1 oracle oinstall 34384 Nov 14 13:18 patch_check_vmdb01.log

[oracle@oravm01 patch_check]$ wc -l patch*.err
9 patch_check_grid.err
6 patch_check_vmdb01.err
15 total

By including some more library directories in LD_LIBRARY_PATH, rpm -qf was able to resolve many more dependencies, greatly reducing the false positives that were included in the error files.   Here are the total contents of the error files:

[oracle@oravm01 patch_check]$ cat *.err
Use of uninitialized value in hash element at ./ line 393.
Use of uninitialized value in hash element at ./ line 393.
working on /u01/app/oracle/product/11.2.0/grid/jdk/jre/lib/amd64/ dependencies => not found
working on /u01/app/oracle/product/11.2.0/grid/jdk/jre/lib/amd64/ dependencies => not found
rpmq: no arguments given for query
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found
working on /u01/app/oracle/product/11.2.0/vmdb01/jdk/jre/lib/amd64/ dependencies => not found

The ‘uninitialized value’ errors are from Perl. Looking at the script, I can see that while this is indeed a bug, it is not causing any incorrect reporting.

[oracle@oravm01 patch_check]$ grep -n . | grep -3 393
389:                            ($dummy1,$dummy2,$file) = split(/\s+/,$line);
390:                    }
392:                    #print "      ###### FILE: $file\n";
393:                    push @{$self->{olibDeps}->{$file}{dependents}}, $exe;
394:            }
395:    }
397:    # all library dependencies now in %olibDepend

Line 393 should be executed only if there is a value defined for $file.

Changing it as follows fixed the bug:

push @{$self->{olibDeps}->{$file}{dependents}}, $exe if $file;

The small number of lines now found in the .err files makes it simple to determine if these errors are an issue to be concerned with.

Download the scripts

The scripts can be downloaded here – Patch Check Distribution

The zip file contains a tarball patch_check_dist.tgz.  Use tar or your favorite unarchiving utility to extract the scripts.

June 13, 2011

» Upgrade to MySQL 5.1.56 on Bacula server using 5.0.x and MyISAM tables

Hello there, it’s me again, with another blog post about a DBA situation a typical Linux administrator may find themselves in.

In this blog post, I’m going to review a recent MySQL upgrade I performed on one of the systems I help administer. This is a real-world example of an upgrade project, and hopefully when we’re done, there may even be an overall performance boost.

There are several reasons to perform upgrades (of any kind), for me an important one is to keep current for security and bug fixes, but general performance improvements and new features are always welcome as well.

This system is running Bacula, an open source enterprise backup system. In this particular case Bacula is configured to store data in a MySQL database. The data stored include status reports, backup content lists, lists of all the files on all the systems, schedules and other related information. While everything has been operating “generally fine” for several years, this also means several components of the system are quite out of date.

The system is using CentOS 5.3 and the versions of MySQL and Bacula that shipped with it (5.0.45 and 2.2.8 respectively), both of which are quite old by today’s standards. Officially, MySQL 5.0.x is now completely out of support by Oracle/, so upgrading is quite important.

The OS definitely could/should be updated to CentOS 5.6 (current at the time of this writing), but that actually wouldn’t make a significant dent in the version of MySQL (or likely Bacula) itself. The OS upgrade and the Bacula upgrade discussion I’ll leave to the appropriate system administration team; I’ll just be reviewing what was done with MySQL itself.

As the main application (Bacula) will not be upgraded – this means that any new features in MySQL 5.1.56 won’t really be used unless it’s something I can configure in the my.cnf file.

Initially my plan was to use the remi yum archives ( which I have seen used in several situations. The MySQL RPMs in these yum repositories generally replace the RPMs shipped in RHEL/CentOS, but are current GA versions. For non-DBAs, I find these archives a really good way of moving to new releases without a lot of effort outside the regular system upgrade activities.

Unfortunately the server in question doesn’t have direct external Internet access. It only has access to some internal repositories, and access to/from other internal systems. I figured that if I’m going to have to copy RPM files manually anyway, I would use the official Oracle/ packages, which are generally the preferred method (speaking as a DBA).

The overall plan was/is the following:

1) Schedule outage with all monitoring tools, and notify the appropriate people of a pending outage to the backup system
2) Confirm current backup is in place for the existing database and system files
3) Copy packages downloaded from to a location accessible to the system in question (in my case an NFS mounted home directory)
4) Stop mysql and bacula-director

/etc/init.d/mysql stop
/etc/init.d/bacula-dir stop

5) remove mysql and mysql-server, ignoring the complaint about libmysqlclient dependencies

rpm -e --nodeps mysql mysql-server

6) install the packages downloaded from

rpm -Uvh MySQL-server-community-5.1.56-1.rhel5 MySQL-shared-compat-5.1.56-1.rhel5 MySQL-client-community-5.1.56-1.rhel5 MySQL-devel-community-5.1.56-1.rhel5

7) upgrade the mysql database access controls (using mysql_upgrade)

/usr/sbin/mysqld --user=mysql --skip-grant &
mysql_upgrade
killall mysqld

8) restart mysql to confirm it starts up fine with no config changes.

/etc/init.d/mysqld start
tail -f /var/log/mysqld.log

9) update my.cnf to enable some new features (InnoDB plugin)

vi /etc/my.cnf

and add the following lines to the [mysqld] section
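
The exact lines did not survive formatting in the original post; as a sketch, enabling the InnoDB plugin on MySQL 5.1 typically looks like the following (verify the plugin-load list against the MySQL 5.1 reference manual for your build):

```ini
# sketch only: the original config lines were lost; typical 5.1 InnoDB plugin setup
ignore-builtin-innodb
plugin-load=innodb=ha_innodb_plugin.so
```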


10) restart mysql and confirm it has now loaded the innodb plugin, and is still operating fine.

/etc/init.d/mysql restart
tail /var/log/mysqld.log -n 20

you should see something like this:

InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3
110505 16:38:33 InnoDB: Initializing buffer pool, size = 768.0M
110505 16:38:33 InnoDB: Completed initialization of buffer pool
110505 16:38:33 InnoDB: highest supported file format is Barracuda.
110505 16:38:33 InnoDB Plugin 1.0.15 started; log sequence number 18094312675
110505 16:38:33 [Note] Event Scheduler: Loaded 0 events
110505 16:38:33 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.56-community-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)

11) Finally, restart the Bacula director

/etc/init.d/bacula-dir start

Other than a stray config value in the old my.cnf file that was now being interpreted differently (basedir), this part of the upgrade went smoothly.

When we reached this point I confirmed with the SysAdmin responsible for backups that Bacula was still able to perform its usual duties, and that the old version of Bacula seemed fine with the new version of MySQL.

We paused at this stage for about a week, so the system could do a full run through of the weekly backups, and so we could see how well the system performs and if there is any real difference (so far).

Very little speed difference was expected at this point, because the only new features enabled were for the InnoDB storage engine, which was not actually being used yet: all the Bacula tables were still using the MyISAM storage engine, the default engine in older versions of MySQL such as 5.0. This means the next major stage of the upgrade was to convert all the Bacula tables from MyISAM to InnoDB, for the reasons I’ll mention below.

I found several references on the web to Bacula supporting the InnoDB storage engine perfectly well, so I’m not concerned about storage engine compatibility differences affecting actual functionality.

There are plenty of differences between InnoDB and MyISAM, but InnoDB has one big advantage over MyISAM that should matter for Bacula: the locking method. MyISAM uses table-level locking, while InnoDB does row-level locking.

This means that if you have parallel inserts and updates all going into the same table (as when you’re backing up multiple systems at the same time), MyISAM’s table-level locks can cause some problems, and indeed they have.

In this specific situation, we have seen locked threads on this database during the weekly backups (Sundays at 4:00am), which suggests we could see a bit of a performance improvement and far fewer locked threads.

Ok, so now I needed to change all the tables to the InnoDB format. By far the easiest way to do this is to run a series of “ALTER TABLE {table} engine=innodb;” statements. As this system has definite times of the day when there is no activity, this is a safe option.

The main drawback of using the ALTER TABLE method to do the conversion is the disk space required. During the conversion, the old table and the new table will both exist at the same time.

The entire database is about 7GB in size and there is currently about 7GB of free space on the file system being used by MySQL. Normally this would be fine if there was a number of medium sized tables as only one table is duplicated at a time. Unfortunately this is not the case with Bacula – it has many small tables, and one really large table called ‘File’. This means converting ‘File’ may be an issue.

At this point I’m attempting to do the remainder of the upgrade without the help of the SysAdmin, which means I no longer have ‘sudo’ or ‘root’ access. Several of the typical options for redistributing disk space (like soft-linking db files to other file systems, etc.) are therefore not really available to me.

I did eventually come up with a solution that worked due to the way Bacula operates and clears old data – I used ‘OPTIMIZE TABLE’. This actually cleared up a fairly significant amount of the disk space being used leaving me with 9GB free which is more than enough to do the conversion.

mysql> optimize table File;

Once the disk space issue was covered, I needed to generate the list of alter statements I want to execute for changing the engine. The following command was used to generate the list:

bash# mysql -uavail -p bacula -s -e 'select "alter table ",table_name," engine=innodb;" from information_schema.tables where tables.table_schema="bacula"'

The command generated this output which was then pasted into the mysql console:

alter table BaseFiles engine=innodb;
alter table CDImages engine=innodb;
alter table Client engine=innodb;
alter table Counters engine=innodb;
alter table Device engine=innodb;
alter table File engine=innodb;
alter table FileSet engine=innodb;
alter table Filename engine=innodb;
alter table Job engine=innodb;
alter table JobMedia engine=innodb;
alter table Location engine=innodb;
alter table LocationLog engine=innodb;
alter table Log engine=innodb;
alter table Media engine=innodb;
alter table MediaType engine=innodb;
alter table Path engine=innodb;
alter table Pool engine=innodb;
alter table Status engine=innodb;
alter table Storage engine=innodb;
alter table UnsavedFiles engine=innodb;
alter table Version engine=innodb;
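
The same statement list could equally be generated by a short script once you have the table names (say, from SHOW TABLES). This is just an illustrative helper, not something used in the original upgrade:

```python
def engine_conversion_sql(tables, engine="innodb"):
    """Build one ALTER TABLE statement per table for an engine conversion."""
    return ["alter table {0} engine={1};".format(t, engine) for t in tables]
```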

I didn’t accurately measure the total time it took, but I would guess it took about an hour to convert everything. Most tables took less than a minute, except the ‘Filename’ table which took about 5 minutes, and the ‘File’ table which took all the rest.

We’ve now completed the major milestone for this upgrade – Bacula is now using the InnoDB storage plugin for all of its tables with the newer version of MySQL.

As suspected while working on this project, there was not a huge difference in the overall time it takes to perform the weekly backups, although there is still an improvement. Backups with a large number of files (300,000+) now complete faster than they previously did: approximately 20 minutes faster for a 7hr backup. This is likely due purely to the improvements in locking.

May 20, 2011

» Using KateSQL to connect to an Oracle database in Kate

Among the features announced with the release of version 4.6 of the KDE Software Compilation is KateSQL, a SQL Query plugin for the Kate text editor providing the basic functionality of a SQL client. It leverages the Qt SQL module, allowing you to make a connection to most types of databases. Out of the box it includes support for MySQL and SQLite, and with this post I’ll show you how to get it to connect to Oracle databases as well.

Since I am an Ubuntu user (Note: you’ll need to be on 11.04 and either be running Kubuntu or have the ‘kate’ package installed) these instructions will be specific to that operating system, though they will likely also apply to other Debian-based distributions with minimal modification. However, if you are using an RPM-based distribution such as openSUSE or Fedora, you’ll need to adapt the instructions somewhat to make sure you have the necessary development packages installed.

The first thing we’ll need to do in order to get this to work is to obtain an Oracle client. This can be downloaded from the Oracle Technology Network and requires that you have a valid Single Sign-On account. For the purposes of this article we’re going to use the Linux x86 Oracle Instant Client and the files we’re going to download are:


The first package is the client itself, providing the files required to run OCI applications. The second is the Instant Client SDK which includes header files for developing Oracle applications with Instant Client: in our case an OCI library for the Qt SQL module.

Users of RPM-based distributions can just go ahead and install these packages while others will need to convert them to .deb format first. The best way to do this that I am aware of is with the tool called Alien. Let’s install it, convert the Oracle Instant Client packages, and install them too.

cd /path/to/oracle/instant/client/files
sudo apt-get install alien
sudo alien oracle-instantclient11.2-basic-
sudo alien oracle-instantclient11.2-devel-
sudo dpkg -i oracle-instantclient11.2-basic_11. oracle-instantclient11.2-devel_11.

The next step is to install all of the dependencies necessary to build the Qt SQL OCI library.

sudo apt-get build-dep qt4-x11
sudo apt-get install qt4-qmake libqt4-dev libaio1

Then we will need the actual source code for the qt4-x11 package. This can be obtained with the following command:

apt-get source qt4-x11

To recap, at this point we’ve downloaded and installed the Oracle Instant Client and all of the dependencies required to build the qt4-x11 package, and we’ve downloaded the qt4-x11 source code. Next we’re going to build the Qt SQL OCI library itself.

sudo ln -s /path/to/qt4-x11-4.7.0/src /usr/include/src
sudo cp -r /path/to/qt4-x11-4.7.0/include/QtSql/private /usr/include/qt4/QtSql/
cd /usr/include/src/plugins/sqldrivers/oci/
qmake "INCLUDEPATH+=/usr/include/oracle/11.2/client64" "LIBS+=-L/usr/lib/oracle/11.2/client64/lib -Wl,-rpath,/usr/lib/oracle/11.2/client64/lib -lclntsh -lnnz11"
sudo make install

Now that the module has been built and installed, Kate should be able to connect to an Oracle database. Launch Kate (or close and re-open it if it was already running), click Settings -> Configure Kate, expand the Application tree if necessary, click Plugins, check ‘SQL Plugin‘ and click OK. Then click SQL -> Add connection to launch the connection wizard. In addition to the MySQL and SQLite database drivers, you should now also have drivers named QOCI and QOCI8 available. Choose the appropriate one, enter relevant connection details, and complete the connection wizard. You can now execute queries against the database and display result data.

April 7, 2011

» MySQL Backup Concepts for (Linux) System Administrators – Part 1

Hello, My name is James McOrmond. I am a MySQL DBA (and part time SysAdmin) at the ‘The Pythian Group’ headquarters here in Ottawa.

In my previous employment I was the primary System Administrator for a local Linux OS software company (for over 8 years). Server backups were of course something I did as part of my job, but I’ve learned that the backups I was doing of the MySQL databases were probably not what a DBA would have set up.

In this blog post, I will cover 3 basic types of MySQL backups for stand-alone database systems. I will not be covering fancy GUI applications or really complicated processes, just the basic concepts, which is what I think System Administrators (i.e., non-DBAs) need to know to have a good, valid backup.

1) ‘mysqldump’

This is the most basic form of MySQL backup – and likely the one most system administrators have used in the past.

This program reads data from mysqld and ‘dumps’ clear text to stdout, which you can redirect to a file anywhere on the system or pipe to another program. The text output is actually a sequence of SQL “INSERT” commands that would be used to reload the data. Commonly I see the filenames containing portions of the current date, but you can do whatever you want based on your overall backup retention policies.

Various arguments can be given to mysqldump to include all databases on the system, just a subset of databases, or even specific tables in one database. If you’re looking for “quick and dirty”, the ‘--all-databases’ argument will get you a dump of all the databases on the system. If you have stored procedures or events, you’ll also want to make sure these are part of your dumps.

    mysqldump --all-databases | gzip - >/mnt/backups/mysqlbackup-`date +%Y-%m-%d_%H.%M.%S`.gz

If your databases are large, it’s probably advisable to at least export each database schema into a different file. This would allow you to have different retention schedules for each database. It also makes restoring a single database (or table) easier – if for example someone deletes a single table. This of course never happens in the ‘real world’ right? :-)

While using mysqldump is definitely the “easiest” way to perform a backup, it also has a significant issue that makes many serious DBAs shudder if these are the only backups available.

The main problem of course relates to making sure the data is consistent across all tables within the same backup. If it isn’t, one table may reference entries in another table that were modified (or no longer exist) between the start and end of the backup process.

The most convenient way to enforce this is with the ‘--lock-all-tables’ argument. “Unfortunately” this does exactly what it’s supposed to do: it puts a read-only lock on all tables (in all databases). That means no other process can modify data in the database until the backup is done. If your database load is mostly reads, or if you can put your application into a read-only or maintenance mode, this may be fine; otherwise you will probably notice several write statements in a locked state, waiting for the backup to finish.

If your backup takes hours, having a read lock on the data may be quite inconvenient, and you should investigate another backup method.

2) File System Snapshots

File System snapshots (as the name suggests), happen at the File System layer, not within the database itself. Depending on your environment, you may already be doing snapshots to backup other parts of your environment.

If you are using a SAN for your storage, snapshotting is likely a built in feature. If not – you can use LVM snapshots at the OS level, as long as you have created the necessary logical volumes of the correct areas of your system.

As with mysqldump, you need to make sure the data is consistent in your backup, or else it may not be very useful to you when you try to do a restore, at 3am on a Saturday morning, with a customer on the phone ‘asking’ you when everything will be working again :-)

There are two kinds of flows typical with this type of backup:

   perform a 'FLUSH TABLES WITH READ LOCK'
   create the snapshot
   perform an 'UNLOCK TABLES'

At this point, you’re safe to mount the snapshot to a new location, and then copy the database files somewhere else – possibly using whatever method you’re using to backup the rest of the system (tar, rsync or a tool like bacula).

The interruption to the active database is fairly minimal here – only between the “FLUSH TABLES WITH READ LOCK”, and the “UNLOCK TABLES” commands. Performing the first “FLUSH TABLES” ahead of time makes sure the interruption is minimal.

Unfortunately, while this sounds like the perfect solution, some storage engines like InnoDB may essentially ignore the “FLUSH TABLES WITH READ LOCK”, so data may not be perfectly consistent on disk when the snapshot happens. It will be closer than just doing a mysqldump, but still not perfect.

All is not lost, as mysqld will actually do a “crash recovery” based on the various log files available to clean up the database as part of the startup process.

The best and easiest way to be absolutely sure the data is consistent for the snapshot, is to actually shut down the mysqld process entirely while the snapshot is being performed.

   perform a 'FLUSH TABLES'
   shutdown the database
   create the snapshot
   restart the database

The outage in this case will be complete: no reads or writes will be possible at all. But the duration of the outage will be quite short, so it may be acceptable. It’s entirely possible it will be less than a minute from the database shutdown to the restart.

Your application should still be put into a maintenance mode if possible, but it should be scheduled during an off time, and you may decide it is short enough that end users won’t notice.

3) xtrabackup

xtrabackup is an open source tool available from Percona. Among other things, it is capable of performing backups of InnoDB databases – while the database is still running.

Percona provides yum/apt repositories for Red Hat/CentOS/Fedora, Debian and Ubuntu, as well as a generic .tar.gz for everyone else. You should be able to get it installed on your chosen distribution fairly easily. The package contains many different files, most notably the xtrabackup binary itself, as well as the innobackupex-1.5.1 wrapper script.

This innobackupex wrapper script will save you a lot of time when you’re trying to get your backups done. In its simplest mode, it takes a single argument, the backup destination directory, and then does a fairly complete backup into that directory. If your my.cnf is not in /etc/ then you should use the --defaults-file argument.

Typically you’ll want these backups run through a program like tar, so the --stream option is your friend. This also allows you to compress the backups with a program like gzip, which is of course always a good idea.

   innobackupex-1.5.1 --stream=tar /mnt/backups | gzip - > mysqlbackup.`date +%Y-%m-%d_%H.%M.%S`.tar.gz

To really have the backup at the fully consistent level we want, xtrabackup needs to run in multiple stages. The first stage simply does the backup into a directory (which it names based on the date), and the second re-runs innobackupex with the --apply-log argument so it applies the updates. If you wish, you can then create a compressed tarball of this backup directory.

These additional stages, however, don’t need to be done at backup time; they can be done at restore time, which likely happens far less often. It all depends on when you want your system spending the CPU cycles.
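The two stages above can be sketched roughly as follows. This is a minimal sketch, not a complete script: the paths are placeholders, and the echo prefixes make it a dry run, so drop them when you actually want to execute the commands.

```shell
# Two-stage xtrabackup flow (dry run; remove the 'echo' prefixes to execute).
BACKUP_ROOT=/mnt/backups          # stage 1 writes a timestamped dir here

# Stage 1: raw backup; innobackupex names the output directory after the date.
echo innobackupex --defaults-file=/etc/my.cnf "$BACKUP_ROOT"

# Stage 2 (can be deferred until restore time): replay the transaction log
# so the copied datafiles become consistent.
LATEST=$(ls -1dt "$BACKUP_ROOT"/*/ 2>/dev/null | head -1)
echo innobackupex --apply-log "$LATEST"
```

Until --apply-log has been run, the copied datafiles are not in a consistent state, which is why a restore always needs stage 2 even if you skipped it at backup time.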

Final Thoughts

This blog is by no means an exhaustive list of the tools available, but these essentially cover one example of each of the 3 basic types – logical, cold and hot.

It is entirely valid and common to use more than one backup type.  Often I have seen a logical backup along with one of the other two backup types.

Care should be taken with the security of your backups, of course, and please remember that if you specify a user name and password on the command line, they can be viewed by any user logged into the system who can run commands like “ps” or “top”.
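One common way around this, sketched below with a hypothetical file path and account name of my own choosing, is to put the credentials in an options file that only you can read and point the client tools at it with --defaults-extra-file:

```shell
# Store credentials in a private options file instead of on the command line,
# so they never show up in 'ps' or 'top' output.
CRED_FILE="${TMPDIR:-/tmp}/backup.my.cnf"   # hypothetical path
cat > "$CRED_FILE" <<'EOF'
[client]
user=backup
password=example-password
EOF
chmod 600 "$CRED_FILE"                      # readable by you alone

# Then (dry run; remove 'echo' to execute):
echo mysqldump --defaults-extra-file="$CRED_FILE" --all-databases
```

Note that the MySQL client tools require --defaults-extra-file to come before any other option on the command line.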

If you are using multiple servers in a master/slave configuration, the different tools also have command-line arguments for recording the relevant binlog positions, and those definitely should be used.
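For example (again as a dry-run sketch via echo), mysqldump can embed the coordinates in the dump itself, and innobackupex can record them when backing up a slave:

```shell
# mysqldump: --master-data=2 writes the binlog file and position into the
# dump as a comment; --single-transaction keeps the InnoDB snapshot consistent.
echo mysqldump --single-transaction --master-data=2 --all-databases

# innobackupex: --slave-info saves the master's binlog coordinates (when run
# against a slave) in a file named xtrabackup_slave_info.
echo innobackupex --slave-info /mnt/backups
```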

…But that is a topic for another time.

Some relevant links

mysqldump documentation on –
Percona Xtrabackup –

» Presence, Chat, Voice, Video Conference

I have recently joined the Pythian Group as a SA, and have started to learn about the environment and the tools we use for our daily activities. While I may not have experience with the exact technologies you are all using, I have worked with related technology, and I feel I should share it with our team in the hope that something good comes of it.

I have noticed that we use IRC for most of our in-house inter-team communication, which does not show us team members’ presence or status. This can be achieved with other technology, so that we are better informed before starting a conversation with the person we need. There are many offerings on the market for this kind of solution from key players such as:

  • Microsoft
  • Cisco

Microsoft’s offering is called Microsoft Unified Communications. This setup builds on the traditional Microsoft environment – AD, Exchange, IIS – plus the unified communications server called Lync.

Here are some details about unified communications:

Unified communications (UC) is the integration of real-time communication services such as instant messaging (chat), presence information, telephony (including IP telephony), video conferencing, call control and speech recognition with non-real-time communication services such as unified messaging (integrated voicemail, e-mail, SMS and fax). UC is not a single product, but a set of products that provides a consistent unified user interface and user experience across multiple devices and media types. UC also refers to a trend to offer business process integration, i.e. to simplify and integrate all forms of communications with a view to optimizing business processes, reducing response time, managing flows, and eliminating device and media dependencies.

UC allows an individual to send a message on one medium and receive the same communication on another medium. For example, one can receive a voicemail message and choose to access it through e-mail or a cell phone. If the sender is online according to the presence information and currently accepts calls, the response can be sent immediately through text chat or video call. Otherwise, it may be sent as a non real-time message that can be accessed through a variety of media.

Instant messaging (IM) is a form of real-time direct text-based communication between two or more people using personal computers or other devices, along with shared software clients. The user’s text is
conveyed over a network, such as the Internet. More advanced instant messaging software clients also allow enhanced modes of communication, such as live voice or video calling.

Presence information. In computer and telecommunications networks, presence information is a status indicator that conveys ability and willingness of a potential communication partner—for example a user—to communicate. A user’s client provides presence information (presence state) via a network connection to a presence service, which is stored in what constitutes his personal availability record (called a presentity) and can be made available for distribution to other users (called watchers) to convey his availability for communication. Presence information has wide application in many communication services and is one of the innovations driving the popularity of instant messaging or recent implementations of voice over IP clients.

A videoconference or video conference (also known as a video teleconference) is a set of interactive telecommunication technologies which allow two or more locations to interact via two-way video and audio transmissions simultaneously. It has also been called ‘visual collaboration’ and is a type of groupware. Videoconferencing differs from videophone calls in that it’s designed to serve a conference rather than individuals. It is an intermediate form of video telephony, first deployed commercially by AT&T during the early 1970s using their Picturephone technology.

In telephony, call control refers to the software within a telephone switch that supplies its central function. Call control decodes addressing information and routes telephone calls from one end point to another. It also creates the features that can be used to adapt standard switch operation to the needs of users. Common examples of such features are:

  • “Call Waiting”
  • “Call Forward on Busy”
  • “Do Not Disturb”

Call control software, because of its central place in the operation of the telephone network, is marked by both complexity and reliability. Call control systems typically require many thousands of person-years of development and contain millions of lines of high-level code. Yet they must, and do, meet reliability requirements that specify switch downtime of only a few minutes in forty years.

The required functionality and reliability of call control is a major challenge for Voice over IP (VoIP) systems. VoIP systems are based on Internet standards and technology which have not previously attempted to satisfy such complex and demanding requirements as those that specify call control. An alternative name often used is call processing.

Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to text. The term “voice recognition” is sometimes used to refer to recognition systems that must be trained to a particular speaker—as is the case for most desktop recognition software. Recognizing the speaker can simplify the task of translating speech. Speech recognition is a broader solution, referring to technology that can recognize speech without being targeted at a single speaker—such as a call center system that can recognize arbitrary voices.

Unified Messaging (or UM) is the integration of different electronic messaging and communications media (e-mail, SMS, Fax, voicemail, video messaging, etc.) technologies into a single interface, accessible from a variety of different devices. While traditional communications systems delivered messages into several different types of stores such as voicemail systems, e-mail servers, and stand-alone fax machines, with Unified Messaging all types of messages are stored in one system. Voicemail messages, for example, can be delivered directly into the user’s inbox and played either through a headset or the computer’s speaker. This simplifies the user’s experience (only one place to check for messages) and can offer new options for workflow such as appending notes or documents to forwarded voicemails.

Microsoft Lync Server (previously Microsoft Office Communications Server) is an enterprise real-time communications server, providing the infrastructure for enterprise instant messaging, presence, file transfer, peer-to-peer and multiparty voice and video calling, ad hoc and structured conferences (audio, video and web) and PSTN connectivity. These features are available within an organization, between organizations, and with external users on the public internet, or standard phones, on the PSTN as well as SIP trunking.


From controlling costs to managing compliance, Microsoft® Lync™ delivers value that speaks to the
needs of today’s organizations.

Do more. With less.

Control costs: Voice over IP (VoIP) enables communications among geographically dispersed company locations without long distance charges. Integrated audio, video, and Web conferencing helps reduce travel costs as well as the cost of third-party conferencing solutions.

Improve productivity: Rich presence information helps employees find each other and choose the most effective way to communicate at a given time. Instead of e-mailing documents back and forth for approval, workers can rely on real-time collaboration through enhanced conferencing with desktop, application, and virtual whiteboard sharing—or contact a collaborator from within Microsoft Office or other applications. The unified Microsoft Lync 2010 client provides access to enterprise voice, enterprise messaging, and conferencing from one simplified interface.

Support the mobile workforce: Mobile workers get access to rich Unified Communications tools from practically anywhere with an Internet connection, no VPN needed. An updated Lync Mobile client makes joining and managing conferences, searching the Global Address List, and viewing presence information easy. Rich presence in Lync Server 2010 has been updated with mobile location information, making on-the-go workers easier to find and contact. A single user experience across PC, phone, mobile phone, and browser gives workers more ways to stay in touch.

Gain operational efficiencies: By integrating Unified Communications and rich presence into business workflows, latency and delays can be reduced or eliminated. For geographically dispersed teams, group chat can enable efficient, topic-specific, multi-party discussions that persist over time.

Be more responsive to customers, partners, and employees: Enhanced delegation through Lync 2010, one-click call routing and management features in Microsoft Lync 2010 Attendant for receptionists, and rich presence information in both help ensure that opportunities are routed to the right person at the right time.

Maintain regulatory compliance: Built-in security, encryption, archiving, and call detail records help meet regulatory requirements. By using your own servers and network, you maintain control over sensitive data that would otherwise be transmitted over public telephone networks and third-party conferencing platforms.

So this is what we can have with the help of Microsoft and Cisco, but the good news is that we can achieve the same goal with Linux as well, without the licensing costs :). We can do it with a combination of Asterisk and an XMPP (Jabber) application.
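As a starting point, both pieces are a package install away on a Debian/Ubuntu box. This is a sketch only: ejabberd is just one popular XMPP server (my choice here, not named in the post), and the echo prefixes make it a dry run.

```shell
# Open-source presence/chat/voice stack (dry run; remove 'echo' to install).
echo sudo apt-get install asterisk    # VoIP, call control, voicemail
echo sudo apt-get install ejabberd    # XMPP (Jabber) server: chat and presence
```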

I hope you enjoy this post, and we can discuss further if you’re interested.

February 14, 2011

» Installing TOra with Oracle Support on Ubuntu 10.10

Good afternoon sports fans. I’ve had a couple of requests to update my world famous blog on installing TOra. Frankly I have been wanting to get this out for a while but duties other than blogging have taken precedence. That and I think my blogging ran out of entropy and needed some other IO to get going again. Well it’s now time for you all to let out that breath you have been holding since the ‘perfect 10′ was released (on 10.10.10 no less).

Once again the instructions are much the same, but there were a couple of things I had to change, mostly around the order of operations. I installed this on Kubuntu 10.10 x86_64 systems with the newest Oracle 11.2 client. The version of TOra this time around is 2.1.2, which works fine except for one really, really frustrating thing that I will mention at the end to keep you in suspense. I ran through this 3 times on 3 separate systems (VM, work and home) and in all cases I got through it mostly unharmed with only a few humps. So here we go, it’s time to…

Make some place to build the package

Normally I will create a safe haven for this type of work, away from the other clutter on my drive. This time I did it all under ~/Documents/blogs/tora-meercat, with separate subdirectories for the Oracle packages and the TOra build. It’s all up to you, but I keep things sorted because they probably will not get deleted for my normal retention of ~30 years (and counting).

Hereby referred to as /path/to/tora.

Get the packages

First thing you’ll want is the newest Oracle client packages. You can get them from the Oracle Instant Client download page. You need the following:

  • oracle-instantclient11.2-basiclite-
  • oracle-instantclient11.2-devel-
  • oracle-instantclient11.2-sqlplus-

Now we need the utilities and dependencies to build the package. I had to swap the order from last time because the TOra build deps now need packages that we used to install later. Say yes to any suggestions it has, some of them may already be installed.

sudo apt-get install libqt3-mt-dev libqt3-compat-headers libqscintilla-dev build-essential g++ gcc autoconf automake flex zlib1g-dev docbook-xsl debhelper alien libaio1 dpatch fakeroot xsltproc texi2html texinfo libqt3-mt-psql libqt3-mt-odbc config-package-dev cmake qt4-dev-tools

And now the TOra build deps.

sudo apt-get build-dep tora

Now go to your build directory and get the source.

cd /path/to/tora
apt-get source tora

Next we will install our Oracle packages. Just like last time, we do this:

sudo alien -i *.rpm

Environment Variables

Now that we’ve laid the groundwork, it’s time to set up the build environment.

export ORACLE_HOME="/usr/lib/oracle/11.2/client64"

This time around we did not set the CMAKE environment. The reason is simple. It did not work, and was apparently not needed. I am not sure why, but here we are. Instead we’re back to making a symbolic link to the oracle includes.

sudo ln -s /usr/include/oracle/11.2/client64 $ORACLE_HOME/include

Once again I want to ensure that the above environment variables are available when we run this in our X desktop. We’ve done this different ways in past blogs; today we’re going to add them to our .xsessionrc.

$ cat ~/.xsessionrc
export ORACLE_HOME=/usr/lib/oracle/11.2/client64

You’ll need to log out and in again for it to take effect, so run it from the command line until you get a chance to do that.

Increment the version (not required)

Feel free to skip to the next section.

This time I wanted to use the “increase my package version to avoid update issues” method we’ve discussed in previous editions of this blog. Check back to the older blogs to see the other methods of keeping updates from overwriting your package, but I like to mix things up a bit. If you want to do this, change the version below to whatever you like, but note that the system is VERY picky about the formatting. Watch all the spacing and line breaks; they all need to be there. There is also a double space between the email address and date, so watch for it.

cat << EOF >> /path/to/tora/tora-2.1.2/debian/changelog
> tora (2.2.2-1) unstable; urgency=low
>
>   * Package renumbered to prevent updates from clobbering it
>
>  -- Brad Hudson  Mon, 10 Jan 2011 12:24:00 -0500
> EOF

Interlude. Do you recall a few weeks back when I posted a comment that the blog was about done? This is where I was. Then 2+ weeks of back to back (to back) raging infernos. Then a bit of time to decompress. Now, where were we … :)

Compile and create the .deb file

Easy as pie. Here’s a twist: run this using the time util and see how long it takes to compile. On my work PC it took 6m57.792s, give or take a ms.

time fakeroot debian/rules binary

Install it

If all goes to plan you should have a .deb package file, install it as below. Your package name may differ if you chose not to increment the version as I had above.

sudo dpkg -i tora_2.2.2-1ubuntu1_amd64.deb

To test it, simply run tora in the same shell in which you compiled it. Your menu item will not see Oracle until you log out and in again.

One really frustrating thing

I have run through this process on at least 3 systems. It works on all 3 and functions correctly while I am using it; I can run basic SQL and browse the schema (which is about all I ever use it for). It also has a problem on all 3: no matter where I launch the program from, it will hang when I try to close it. I have been unable to locate the source of the grief, but I have ruled out connectivity for sure because I have done this from both local and remote network segments. It does not have this issue when Oracle support is not available. I never had this issue in previous versions; it may be a bug. I ran an strace on the program, and once it hangs it does not seem to be doing anything, but I can still ctrl-c (or xkill) the program. I have not been able to find any open bug reports on it, but my time has been very limited. If you find something out about this or find a cure, please post a comment. I’ll do the same if I get to solving it.

That’s it for this edition. I guess I am only two months away from having another one of these due out. Thanks for reading!

June 18, 2010

» Keeping Up

I found I never published this post; it had been sitting in my drafts for a few months — it was written on 13th February, 2010. I’m publishing it without any changes. I learn therefore I am! I’ve just written a few bits about learning a new technology, and after skimming through my Google Reader, I noticed [...]

June 8, 2010

» Wherever I May Roam

Roamer, wanderer Nomad, vagabond Call me what you will $ENV{LC_ALL} = "anywhere"; my $time = localtime; say {$anywhere} my $mind; local *anywhere = sub { ... }; Anywhere I roam Where I 'git ghclone environment' is $HOME # 'grep may_roam($_) => @everywhere', # with apologies to Metallica Laziness and a severe addiction to yak shaving [...]

May 14, 2010

» Blogrotate #27: The Weekly Roundup of News for System Administrators

Good afternoon and welcome to issue 27. The number 27 according to numerology is “the symbol of the divine light” so I’ll try to do that ideal justice. We’re off to a good start, what with me actually getting this out on schedule and such, so let’s get to it while the day is still [...]

May 13, 2010

» An SSH tool to make your life easier

A MySQL user group member saw that I use Poderosa as my ssh-on-Windows tool, and asked why I did not use PuTTY. My response was that I like having tabbed windows and hate having to keep opening another PuTTY program every time I want to open another connection. With Poderosa I can open a new [...]

May 10, 2010

» Blogrotate #26: The Weekly Roundup of News for System Administrators

Hi there and welcome to Blogrotate in which I, your humble host and blogger, bring to you interesting stories and events from the past week in the SysAdmin world. It’s been yet another busy week, which is why this is coming out on a Sunday again, so I am going to have to short list this edition but there’s still plenty of tasty nuggets to be found. Read on.

Operating Systems

It’s been discovered that Microsoft released three patches last month without including them in the release notes. Two of the patches were to fix security holes in MS Exchange servers. While this is nothing new, it completely removes a sysadmin’s ability to evaluate the impact of the patches on critical corporate systems, which is necessary before rolling out the updates. Not to mention it makes it really difficult to diagnose a change in behaviour if you have no idea there was a change made. See more gory details in Security firm reveals Microsoft’s ‘silent’ patches.

Sun/Oracle removed public firmware downloads is a strange piece by someone called techbert describing how he logged into SunSolve to download some firmware for his systems, only to find that it was no longer publicly available. This is a bad move that will likely annoy a lot of people looking for firmware for old, repurposed or used systems. It’s a good thing that my Sparc 1+ is already patched!

If you are an old fogey like me you may still be using newsgroups over nntp for many things. Even Microsoft provided support for their products using newsgroups, but as of June 2010 they will begin decommissioning the groups in favour of their online community message boards. See the official news release Microsoft Responds to the Evolution of Communities.


Ars Technica has a look at the current market share for internet browsers. Chrome has surged to a new high while IE has dropped below 60% market share. Firefox and Safari (Mac) have remained fairly static, which means that Chrome is taking the bite right out of Microsoft’s share of the pie. For more, visit Emil Protalinski and his article Chrome continues surge as IE drops below 60% market share.

From the Google Chrome Blog, a piece about performance improvements in the latest Chrome beta claiming “30% and 35% improvement on the V8 and SunSpider benchmarks over the previous beta channel release” and “as much as 213% and 305% on these two benchmarks since our very first beta”. Take a look at Pedal to the Chrome metal: Our fastest beta to date for Windows, Mac and Linux.

Since we’re all about the Chrome this week, check out this link that Bill provided, a video demonstrating the rendering speed of Chrome. See the youtube video Google Chrome Speed Tests (SFW[1]). I think the results were skewed because a) the french fry chopper introduced resistance to the potato, reducing its velocity, and b) actual sound waves move faster than paint moving against gravity no matter how much force you put into the paint. Still, they are all really neat.


Linux Journal has a nifty little piece by Bill Childers about setting up your own Ubuntu server in the Amazon cloud. Check out Put Your Servers in the Cloud with Amazon EC2 and Ubuntu.

Data Center Knowledge has an interesting report by Rich Miller called Rackspace Hits 100,000 Customers. Based on the numbers he presents, it seems that “the managed hosting unit accounted for less than 1 percent of customer growth, but 77 percent of revenue growth”, meaning basically that they’re gaining a lot of cloud customers but it’s not their money maker, it’s just gravy. Still, the gains in cloud presence are impressive.

Matt Asay at C-net writes about increasing competition between Red Hat and VMware in the virtualization and middleware markets, and suggests that Red Hat may look at growing by acquiring a NoSQL technology and VMware may enter a bid for Novell to enter the operating system space. See VMware and Red Hat: The war for the data center.


The world is hungry for storage capacity and there are limits on storage density using current technology. But wait! There is hope for the future. There is a new paper showing how a combination of thermally-assisted magnetic recording (TAR) and bit-patterned recording (BPR) can be used to “store data at densities of up to one terabit per square inch, and suggest the media could be stable up to ten terabits per square inch”. See Casey Johnston‘s article New hard drive write method packs in one terabit per inch.

And finally a couple of amusing items for you. Head on over to IT World and check out Vintage Tech Ads: The 15 Funniest Videos and also Priceless! The 25 Funniest Vintage Tech Ads.

That’ll have to do it for this week. Leave your own picks in comments and we’ll see you next week.

[1] In a surprising turn of events when I went to do my usual linking of the Internet slang definition for SFW above I found that the definitions themselves included language that would be considered NSFW. So for those who do not know SFW == Safe For Work and NSFW == Not Safe for Work.

May 8, 2010

» Liveblogging: Mentoring: It’s for everyone!

Liveblog of the Professional IT Community Conference session Mentoring: It’s for everyone

Ways to learn:
Kinetic (doing it)

Everyone learns differently, but most people learn with some combination of all these three.

However, you can also learn by training others [that's the truth; I learned a LOT by writing the book, and even things I knew, I ended up needing to research more].

Ways to train:
Questioning (Socratic Method)

What is a mentor?
noun: an experienced and trusted adviser. It’s not just someone who teaches; it’s someone who advises.
Also: an experienced person in a company, college or school who trains and counsels new employees or students.

verb: to advise or train (someone, esp. a younger colleague).

A mentorship is a safe place to ask questions.

A mentor is a trainer, but a trainer who also is a professional advisor.

Finding a mentor
- you respect/admire
- works with similar technology
- has a compatible personality
- you have a good rapport with

Being a mentor
- Teach technical skills
- Provide advanced technical/design guidance
- Model and teach professional skills
- Be interested and invested in the [student's] career

» Liveblogging: Tech Women Rule!

I am moderating and liveblogging the Professional IT Community Conference panel called Tech Women Rule! Creative Solutions for being a (or working with a) female technologist.

One point to keep in mind: The goal is not equality for equality’s sake. The goal is to have a diverse range of experience to make your company/project/whatever the best it could be.

That being said, these issues are not just around women; they are about anyone who is “different”, whether it’s race, ethnicity, gender, sexual orientation, cultural.

So what are some of the solutions?

0) Better align expectations with reality. Are you expecting more from someone who is one gender than another? If a woman makes a mistake is it worse because she has to prove herself? Is it worse because she is representative of her gender? If she does something good is the achievement elevated more because of her gender? Either is bad.

1) Respect people for who they are. Everyone deserves respect; if someone is not at your technical level, they still deserve respect.

If someone says something that is completely wrong from a technical perspective, do not assume that they have no idea what they are talking about. It could be that they are the exact case in which that technical scenario is appropriate for them. If they are correct, your attitude will be refreshing and you might learn something. If they are indeed wrong, ask them about a scenario in which their thinking falls apart, or otherwise guide them through learning why what they are saying is wrong.

2) Be nice. Don’t condescend.

3) Be helpful. “RTFM, n00b!” is not helpful, and certainly does not follow rule #2.

4) Don’t do #1-3 for women only. Don’t treat women nicely because they’re women, and be a jerk to men because they’re men. Being helpful is good for anyone, not just women.

5) Cooperate, do not compete. Whether you are co-workers, working together on a software project, or just in a conversation, the game of “one-upping” another is a lot less useful than working together.

6) When hiring or when in an interview, concentrate on skills, not knowledge. “Skills” refers to attributes such as their ability to listen, how judgmental they are about a legacy system, whether they are open to new ideas, whether they disdain anything that is not cutting edge, and even technical skills such as thinking patterns, research patterns, algorithms, etc.

If someone says “I don’t know” in an interview, ask them “how would you go about figuring it out?” If someone says “I think it’s x and y” ask “how would you confirm/test that?” If a backup failed, do they start the backup over or do they try to figure out why it failed?

Are they thorough? Do they follow through? It is a lot easier to teach knowledge than it is to teach something like “debugging skills”.

7) Specifically encourage people to speak up. Train yourself to NOTICE when folks are not speaking up, and ask them if they have any suggestions or ideas.

8) If you are running an IT conference, specifically ask qualified women you know to speak, not about “women in IT”. If you hear of an IT conference, tell specific women you know that you think they would be a great speaker. Get women to speak at local user groups to get practice in a less intimidating space.

Read HOWTO Encourage Women in Linux

You Just Don’t Understand
The Male Mind at Work
How to succeed in business without a penis: a guide for working women (this is a humor book)

Join and/or send messages to Systers, the world’s largest e-mail list of women in computing and technology fields.

May 7, 2010

» Liveblogging: Senior Skills: Python for Sysadmins

Why Python?

- Low WTF per minute factor
- Passes the 6-month test (if you write python code, going back in 6 months, you pretty much know what you were trying to do)
- Small Shift/no-Shift ratio (i.e., in Perl you use the Shift key a lot for $ % ( ) { } etc.; in Python you can tell what something is by context, not by $ or %)
- It’s hard to make a mess
- Objects if you need them, ignore them if you don’t.

Here’s a sample interpreter session. The >>> is the python prompt, and the … is the second/subsequent line prompt:

>>> x='hello, world!';
>>> x.upper()
'HELLO, WORLD!'
>>> def swapper(mystr):
... return mystr.swapcase()
  File "<stdin>", line 2
    return mystr.swapcase()
IndentationError: expected an indented block

You need to indent the second line, because whitespace is significant in Python:

>>> def swapper(mystr):
...  return mystr.swapcase()
>>> swapper(x)
'HELLO, WORLD!'
>>> x
'hello, world!'

partition is how to get substrings based on a separator:

>>> def parts(mystr, sep=','):
...  return mystr.partition(sep)
>>> parts(x, ',')
('hello', ',', ' world!')

You can replace text, too, using replace.

>>> def personalize(greeting, name='Brian'):
...  """Replaces 'world' with a given name"""
...  return greeting.replace('world', name)
>>> personalize(x, 'Brian')
'hello, Brian!'

By the way, the stuff in the triple quotes is the docstring, which serves as automatic documentation. It is available through the __doc__ attribute (names with double underscores are often called “dunders”):

>>> print personalize.__doc__
Replaces 'world' with a given name

Loop over a list of functions and do that function to some data:

>>> funclist=[swapper, personalize, parts]
>>> for func in funclist:
...  func(x)
...
'HELLO, WORLD!'
'hello, Brian!'
('hello', ',', ' world!')


>>> v=range(1,10)
>>> v
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> v[1]
2
>>> v[5]
6
>>> v[-1]
9
>>> v[-3]
7

List slicing with “:”
>>> v[:2]
[1, 2]
>>> v[4:]
[5, 6, 7, 8, 9]
>>> v[4:9]
[5, 6, 7, 8, 9]
Note that no error is returned even though there is no index 9. If you did v[9], you’d get an error:
>>> v[9]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Negative indices count from the end of the list, so v[1:-1] does not include the first and last values:

>>> v[1:-1]
[2, 3, 4, 5, 6, 7, 8]

The full slice syntax is [start:end:step]:

>>> v[::2]
[1, 3, 5, 7, 9]
>>> v[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1]
>>> v[1:-1:4]
[2, 6]
>>> v[::3]
[1, 4, 7]

Make an array of numbers with range

>>> l=range(10)
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Make a list from another list

>>> [pow(num,2) for num in l]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

append appends to the end of a list

>>> l.append( [pow(num,2) for num in l])
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]]
>>> l.pop()
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

extend takes a sequence and puts it at the end of the array.

>>> l.extend([pow(num,2) for num in l])
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

A list can be made of a transformation, an iteration and optional filter:
[ i*i for i in mylist if i % 2 == 0]
transformation is i*i
iteration is for i in mylist
optional filter is if i % 2 == 0

>>> L=range(1,6)
>>> L
[1, 2, 3, 4, 5]
>>> [ i*i for i in L if i % 2 == 0]
[4, 16]

Tuples are immutable lists, and they use () instead of []
Parentheses by themselves are just grouping, so a one-item tuple needs a trailing comma: (1,)
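A quick sketch of both points, the trailing comma and immutability (my own example, not from the talk):

```python
pair = (1, 2)            # a two-element tuple
one = (1,)               # the trailing comma makes a one-item tuple
not_a_tuple = (1)        # just the integer 1 in parentheses

err = None
try:
    pair[0] = 99         # tuples are immutable, so item assignment raises
except TypeError as e:
    err = type(e).__name__
```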

Dictionaries aka associative arrays/hashes:

>>> d = {'user':'jonesy', 'room':'1178'}
>>> d
{'user': 'jonesy', 'room': '1178'}
>>> d['user']
'jonesy'
>>> d.keys()
['user', 'room']
>>> d.values()
['jonesy', '1178']
>>> d.items()
[('user', 'jonesy'), ('room', '1178')]
>>> d.items()[0]
('user', 'jonesy')
>>> d.items()[0][1]
'jonesy'
>>> d.items()[0][1].swapcase()
'JONESY'

There is no order to dictionaries, so don’t rely on it.
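The same dict walk-through in Python 3 syntax, where keys()/values()/items() return views rather than lists and need list() before indexing (my own recap):

```python
d = {'user': 'jonesy', 'room': '1178'}

pairs = list(d.items())      # items() is a view in Python 3; list() makes it indexable
first_value = pairs[0][1]    # CPython 3.7+ happens to preserve insertion order,
                             # but per the advice above, don't rely on order
shouted = first_value.swapcase()
```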

Quotes and string formatting
- You can use single and double quotes inside each other
- Inside triple quotes, you can use single and double quotes
- Variables are not interpolated in strings; use printf-style string formatting:

>>> word='World'
>>> punc='!'
>>> print "Hello, %s%s" % (word, punc)
Hello, World!

Braces, semicolons, indents
- Use indents instead of braces
- End-of-line instead of semicolons

if x == y:
    print "x == y"
for k, v in mydict.iteritems():
    if v is not None:
        print "v has a value: %s" % v

This seems like it might be problematic because of long blocks of code, but apparently code blocks don’t get that long. You can also use folds in vim [now I need to look up what folds in vim are].

You can’t assign a value in a conditional statement’s expression — because you can’t use an = sign. This is on purpose, it avoids bugs resulting from typing if x=y instead of if x==y.

The construct has no place in production code anyway, since you give up catching any exceptions.
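That rejection can be demonstrated by compiling the bad statement without crashing the interpreter (a Python 3 sketch of the point above):

```python
# `if x = y:` fails at compile time -- the typo can never silently
# become an assignment.
try:
    compile("if x = y:\n    pass", "<demo>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
```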

Python modules for sysadmins:
- sys
- os
- urllib/urllib2
- time, datetime (and calendar)
- fileinput
- stat
- filecmp
- glob (to use wildcards)
- shutil
- gzip
- tarfile
- hashlib, md5, crypt
- logging
- curses
- smtplib and email
- cmd
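A sketch of how a few of those modules combine in a typical admin task (glob for wildcards, os for file metadata, hashlib for checksums; the function name and demo file are my own inventions):

```python
import glob
import hashlib
import os
import tempfile

def checksum_files(pattern):
    """Return {path: (size_bytes, sha256_hex)} for every file matching pattern."""
    result = {}
    for path in glob.glob(pattern):
        size = os.stat(path).st_size          # os.stat for file metadata
        with open(path, 'rb') as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        result[path] = (size, digest)
    return result

# Demo against a throwaway file so the sketch is self-contained.
tmpdir = tempfile.mkdtemp()
demo = os.path.join(tmpdir, 'demo.log')
with open(demo, 'w') as f:
    f.write('hello')
report = checksum_files(os.path.join(tmpdir, '*.log'))
```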

The Zen of Python
To get this, start ‘python’ in a unix environment, then type ‘import this’ at the Python prompt. I did this on my Windows laptop running Cygwin:

cabral@pythianbos2 ~
$ python
Python 2.5.2 (r252:60911, Dec  2 2008, 09:26:14)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

This was liveblogged, please let me know any issues, as they may be typos….

» Liveblogging: Senior Skills: Sysadmin Patterns

The Beacon Pattern:
- This is a “Get out of the business” pattern
- Identify an oft-occurring and annoying task
- Automate and document it to the point of being able to hand it off to someone far less technical

- System admins were being put in charge of scheduling rooms in the building
- They wrote a PHP web application to help them automate the task
- They refined the app, documented how to use it, and handed it off to a secretary
- They have to maintain the app, but it’s far less work.

The Community Pattern:

- Prior to launch of a new service, create user documentation for it.
- Point a few early adopters at the documentation and see if they can use the service with minimal support
- Use feedback to improve documentation, and the service
- Upon launch, create a mailing list, forum, IRC channel, or Jabber chat room and ask early adopters to help test it out.
- Upon launch, your early adopters are the community, and they’ll tell new users to use the tools you’ve provided instead of calling you.

- A beowulf cluster for an academic department
- Documented like crazy, early adopters were given early access to the cluster (demand was high)
- Created a mailing list, early adopters were added to it with their consent, functionality was tested with them.
- Email announcing launch mentioned the early adopters in a ‘thank you’ section, and linked to the mailing list.

The DRY pattern
DRY = Don’t repeat yourself
Identify duplicate code in your automation scripts
Put shared subroutines in an include file, and include that file in your scripts.

- “sysadmin library”
- /var/lib/adm/.*pl
- Elapsed time and # of lines to script a task for which the library was useful plunged dramatically
– new tasks were thought up that were not considered before but were obvious now (ie, users that want to change their username)
– migrating to new services became much easier
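The talk’s library was Perl scripts under /var/lib/adm; a Python-flavoured sketch of the same include-file idea, with hypothetical function names:

```python
# adminlib.py -- shared subroutines factored out of individual scripts.
# (Hypothetical names; the point is the DRY pattern, not this exact API.)
import logging

def get_logger(name):
    """One logging setup shared by every script instead of copy-pasted."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter('%(name)s: %(message)s'))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

def normalize_username(raw):
    """One shared rule for usernames, e.g. for the rename-user task above."""
    return raw.strip().lower()

# Each script then just does `from adminlib import get_logger, ...`:
log = get_logger('rename_user')
new_name = normalize_username('  Jonesy ')
```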

The Chameleon Pattern
- Identify commonalities among your services
- Leverage those to create “Chameleon” servers that can be re-purposed on the fly
- Abstract as much of this away from the physical hardware
- Doesn’t need to involve virtualization, though it’s awfully handy if you can do it that way.
[this one is a bit harder to do with MySQL config files]

[puppet/cfengine were mentioned...] – more than a script: a methodology

- But isn’t installing packages you don’t need bad? Depends on the package….ie, gcc is bad for enterprise

“Junior annoyances”

Terminal issues

Junior: opens one terminal, logs into machine1; thinks the issue is with machine2, but is still talking to machine1; logs out of machine1, logs into machine2.

Senior: opens 2 terminals each for machine1 and machine2 to start.

networking issue ticket arrives
logs into server
runs tcpdump

networking issue ticket arrives
logs into server
looks at logs

“Fix” vs. “Solution” ie “taking orders”
A junior will try to fix the problem as stated; a senior will try to figure out what the underlying problem is. ie, given “I need a samba directory mounted under an NFS mount”, a junior admin will try to do exactly that, while a senior admin will ask “what are you trying to do with that?” because maybe all they need is a symlink.

Signs you might be a fanboy:
- Disparaging users of latest stable release of $THING for not using the nightly (unstable) build which fixes more issues
- Creating false/invalid comparisons based on popular opinion instead of experience/facts
- Going against internal standards, breaking environmental consistency, to use $THING instead of $STANDARD (but this is also how disruptive technology works)
- Being in complete denial that most technology at some point or another stinks.
- Evaluating solutions based on “I like” instead of “we need” and “this does”.

» Liveblogging: Seeking Senior and Beyond

I am attending the Professional IT Community Conference – it is put on by the League of Professional System Administrators (LOPSA), and is a 2-day community conference. There are technical and “soft” topics — the audience is system administrators. While technical topics such as Essential IPv6 for Linux Administrators are not essential for my job, many of the “soft” topics are directly applicable and relevant to DBAs too. (I am speaking on How to Stop Hating MySQL tomorrow.)

So I am in Seeking Senior and Beyond: The Tech Skills That Get You Promoted. The first part talks about the definition of what it means to be senior, and it completely relates to DBA work:
works and plays well with others
understands “ability”
leads by example
lives to share knowledge
understands “Service”
thoughtful of the consequences of their actions
understands projects
cool under pressure

Good Qualities:

Bad Qualities:
[my own addition - no follow through, lack of attention to detail]

The Dice/Monster Factor – what do job sites see as important for a senior position?

They back up the SAGE 5-year experience requirement
Ability to code in newer languages (Ruby/Python) is more prevalent (perhaps cloud-induced?)

The cloud allows sysadmin tasks to be done by anyone… developers can do sysadmin work, and you end up seeing schizophrenic job descriptions such as

About the 5-year requirement:
- Senior after 5 years? What happens after 10 years?
- Most electricians, by comparison, haven’t even completed an *apprenticeship* in 5 years.

Senior Administrators Code
- not just 20-line shell scripts
- coding skills are part of a sysadmin skill
- ability to code competently *is* a factor that separates juniors from seniors
- hiring managers expect senior admins to be competent coders.

If you are not a coder
- pick a language, any language
- do not listen to fans, find one that fits how you think, they all work…..
- …that being said, some languages are more practical than others (ie, .NET probably is not the best language to learn if you are a Unix sysadmin).

Popular admin languages:
- Perl: classic admin scripting language. Learn at least the basics, because you will see it in any environment that has been around for more than 5 years.

- Ruby: object-oriented language for people who mostly like Perl (except for its OO implementation)

- Python: object-oriented language for people who mostly hate Perl, objects or no objects. For example, you don’t have to create a String object to send an output.

But what if you do not have time to learn how to program?

- senior admins are better at managing their time than junior admins, so perhaps start with time management
- time management means you’ll have more time to do things; it doesn’t mean all work, work, work.
- Read Time Management for System Administrators – there is Google Video of a presentation by the author, Tom Limoncelli.

Consider “The Cloud”
- starting to use developer APIs to perform sysadmin tasks, so learning programming is good.
- still growing, could supplant large portions of datacenter real estate
- a coder with sysadmin knowledge: Good
- a sysadmin with coding knowledge: Good
- a coder without sysadmin knowledge: OK
- a sysadmin with no coding interest/experience: Tough place to be in

Senior Admins Have Problems Too
Many don’t document or share knowledge
Many don’t do a good job keeping up with their craft
Cannot always be highlighted as an example of how to deal with clients
Often reinvent the wheel – also usually there is no repository
Often don’t progress beyond the “senior admin” role

….on the other hand…..
cynicism can be good…..

learn from the good traits
observe how others respond to their bad traits
think about how you might improve upon that
strive to work and play well with others, even if you don’t have a mentor for good/bad examples.

Now he’s going into talking about Patterns in System Administration….