Friday, August 26, 2011

Creating a VM for OpenStack


Here at HCC, we have a few VM-based projects going.  One is the Condor-based VM launching that Ashu referenced in his previous posting.  That project is to take an existing capability (Condor batch system hooked to the grid) and extending it; instead of launching processes, one can launch an entire VM.

One of our other employees, Josh, has been working from the other direction: taking a common "cloud platform", OpenStack, and seeing if it can be adopted to our high-throughput needs.  The OpenStack work is in its beginning phases, but bits and pieces are starting to become functional.

Last night, I tried out install for the first time.  One of the initial tasks I wanted to accomplish is to create a custom VM.  A lot of the OpenStack documentation is fairly Ubuntu specific, so I've taken their pages and adopted them for installing from a CentOS 5.6 machine.  Unfortunately, I didn't take any nice screen shots like Ashu did, but I hope this will be useful to others.

Long term, we plan to open OpenStack up to select OSG VOs for testing. While we are still in the "tear it down and rebuild once a week" mode, it's just been opened up to select HCC users.

So, without further ado, I present...

Creating a new Fedora image using HCC's OpenStack

These notes are based on the upstream openstack documents here:


It all starts with an account.

For local users, contact hcc-support to get your access credentials.  They will come in a zipfile.  Download the zipfile into your home directory and unpack it.  Among other things, there will be a novarc file.  Source this:

source novarc

This will set up environment variables in your shell pointing to your login credentials. Do not share these with other people! You will need to do this each time you open a new shell.

To create the image, you will need root access on a development machine with KVM installed.  I used a CentOS 5.6 machine and did:

yum groupinstall kvm

to get the various necessary KVM packages.  I als

First, create a new raw image file:

qemu-img create -f raw /tmp/server.img 5G

This will be the block device that is presented to your virtual machine; make it as large as necessary. Our current hardware is pretty space-limited: smaller is encouraged. Next, download the Fedora boot ISO:

curl > /tmp/bfo.iso

This is a small, 670KB ISO file that contains just enough information to bootstrap the Anaconda installer.  Next, we'll boot it as a virtual machine on your local system.

sudo /usr/libexec/qemu-kvm -m 2048 -cdrom /tmp/bfo.iso -drive file=/tmp/server.img -boot d -net nic -net user -vnc -cpu qemu64 -M rhel5.6.0 -smp 2 -daemonize

This will create a simple virtual machine (2 cores, 2GB RAM) with /tmp/server.img as a drive, and boot the machine from /tmp/bfo.iso.  It will also allow you to connect to the VM via a VNC viewer.

If you are physically on the host machine, you can use a VNC viewer for screen ":0".  If you are logged in remotely (I log in from my Mac), you'll want to port-forward:

ssh -L 5900:localhost:5900

From your laptop, connect to localhost:0 with a VNC viewer.  Note that the most common VNC viewers on the Mac (the built-in Remote Viewer and Chicken of the VNC) don't work with KVM.  I found that "JollyFastVNC" works, but costs $5 from the App Store.

Once logged in, select the version of Fedora you'd like to install, and "click next" until the installation is done.  Fedora 15 is sure nice :)

Fedora will want to reboot the machine, but the reboot will fail because KVM is set to only boot from the CD.  So, once it tries to reboot, kill KVM and start it again with the following arguments:

sudo /usr/libexec/qemu-kvm -m 2048 -drive file=/tmp/server.img -net nic -net user -vnc -cpu qemu64 -M rhel5.6.0 -smp 2 -daemonize

Again, connect via VNC, and do any post-install customization.  Start by updating and turning on SSH:

yum update
yum install openssh-server
chkconfig sshd on

You will need to tweak /etc/fstab to make it suitable for a cloud instance.  Nova-compute may resize the disk at the time of launch of instances based on the instance type chosen. This can make the UUID of the disk invalid.  Further, we will remove the LVM setup, and just have the root partition present (no swap, no /boot).

Edit /mnt/etc/fstab.  Change the following three lines:

/dev/mapper/VolGroup-lv_root /                       ext4    defaults        1 1
UUID=0abae194-64c8-4d13-a4c0-6284d9dcd7b4 /boot                   ext4    defaults        1 2
/dev/mapper/VolGroup-lv_swap swap                    swap    defaults        0 0

to just one line:

LABEL=uec-rootfs              /          ext4           defaults     0    0

Since, Fedora does not ship with an init script for OpenStack, we will do a nasty hack for pulling the correct SSH key at boot. Edit the /etc/rc.local file and add the following lines before the line "touch /var/lock/subsys/local":

depmod -a
modprobe acpiphp

# simple attempt to get the user ssh key using the meta-data service
mkdir -p /root/.ssh
echo >> /root/.ssh/authorized_keys
curl -m 10 -s | grep 'ssh-rsa' >> /root/.ssh/authorized_keys
echo "************************"
cat /root/.ssh/authorized_keys
echo "************************"

Once you are finished customizing, go ahead and power off:


Converting to an acceptable OpenStack format

The image that needs to be uploaded to OpenStack needs to be an ext4 filesystem image; we currently have a raw block device image.  We will extract this filesystem from running a few commands on the host machine.  First, we need to find out the starting sector of the partition. Run:

fdisk -ul /tmp/server.img

You should see an output like this (the error messages are harmless):

last_lba(): I don't know how to handle files with mode 81a4
You must set cylinders.
You can do this from the extra functions menu.

Disk /dev/loop0: 5368 MB, 5368709120 bytes
255 heads, 63 sectors/track, 652 cylinders, total 10485760 sectors
Units = sectors of 1 * 512 = 512 bytes

      Device Boot      Start         End      Blocks   Id  System
/dev/loop0p1   *        2048     1026047      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/loop0p2         1026048    10485759     4729856   8e  Linux LVM
Partition 2 does not end on cylinder boundary.

Note the following commands assume the units are 512 bytes.  You will need the start and end number for the "Linux LVM"; in this case, it is 1026048 and 10485759.

Copy the entire partition to a new file

dd if=/tmp/server.img of=/tmp/server.lvm.img skip=1026048 count=$((10485759-1026048)) bs=512

For "skip" and "count", use the begin and end you copy/pasted from the fdisk output. Now we have our LVM image; we'll need to activate it.  First, mount the LVM image on the loopback device and look for the volume group name:

[bbockelm@localhost ~]$ sudo /sbin/losetup /dev/loop0 /tmp/server.lvm.img
[bbockelm@localhost ~]$ sudo /sbin/pvscan
  PV /dev/sdb1    VG vg_home     lvm2 [7.20 TB / 0    free]
  PV /dev/sda2    VG vg_system   lvm2 [73.88 GB / 0    free]
  PV /dev/loop0   VG VolGroup    lvm2 [4.50 GB / 0    free]
  Total: 3 [1.28 TB] / in use: 3 [1.28 TB] / in no VG: 0 [0   ]

Note the third listing is for our loopback device (/dev/loop0) and a volume group named, simply, "VolGroup".  We'll want to activate that:

[bbockelm@localhost ~]$ sudo /sbin/vgchange -ay VolGroup
  2 logical volume(s) in volume group "VolGroup" now active

We can now see the Fedora root file system in /dev/VolGroup/lv_root.  We use dd to make a copy of this disk:

sudo dd if=/dev/VolGroup/lv_root of=/tmp/serverfinal.img

I get the following output:

[bbockelm@localhost ~]$ sudo dd if=/dev/VolGroup/lv_root of=/tmp/serverfinal2.img
3145728+0 records in
3145728+0 records out
1610612736 bytes (1.6 GB) copied, 14.5444 seconds, 111 MB/s

It's time to unmount all our devices.  Start by removing the LVM:

[bbockelm@localhost ~]$ sudo /sbin/vgchange -an VolGroup
  0 logical volume(s) in volume group "VolGroup" now active

Then, unmount our loopback device:

[bbockelm@localhost ~]$ sudo /sbin/losetup -d /dev/loop0

We will do one last tweak: change the label on our filesystem image to "uec-rootfs":

sudo /sbin/tune2fs -L uec-rootfs /tmp/serverfinal.img

*Note* that your filesystem image is ext4; if your host is RHEL5.x (this is my case!), your version of tune2fs will not be able to complete this operation.  In this case, you will need to restart your VM in KVM with the newly-extracted serverfinal.img as a second hard drive.  I did the following KVM invocation:

sudo /usr/libexec/qemu-kvm -m 2048 -drive file=/tmp/server.img -net nic -net user -vnc -cpu qemu64 -M rhel5.6.0 -smp 2 -daemonize -drive file=/tmp/serverfinal.img

The second drive shows up as /dev/sdb; go ahead and re-execute tune2fs from within the VM:

[root@localhost ~]# tune2fs -L uec-rootfs /dev/sdb

Extract Kernel and Initrd for OpenStack

Fedora creates a small boot partition separate from the LVM we extracted previously.  We'll need to mount it, and copy out the kernel and initrd.  First, mount the loopback device and map the partitions.

[bbockelm@localhost ~]$ sudo /sbin/losetup -f /tmp/server.img
[bbockelm@localhost ~]$ sudo /sbin/kpartx -a /dev/loop0

The boot partition should now be available at /dev/mapper/loop0p1.  Mount this:

[bbockelm@localhost ~]$ sudo mkdir  /tmp/server_image/
[bbockelm@localhost ~]$ sudo mount /dev/mapper/loop0p1  /tmp/server_image/

Now, copy out the kernel and initrd:

[bbockelm@localhost ~]$ cp /tmp/server_image/vmlinuz- ~
[bbockelm@localhost ~]$ cp /tmp/server_image/initramfs- ~

Unmount and unmap:

[bbockelm@localhost ~]$ sudo umount /tmp/server_image
[bbockelm@localhost ~]$ sudo /sbin/kpartx -d /dev/loop0
[bbockelm@localhost ~]$ sudo /sbin/losetup -d /dev/loop0

Upload into OpenStack

We need to bundle, then upload the kernel, initrd, and finally the image.  First, the kernel:

[bbockelm@localhost ~]$ euca-bundle-image -i ~/vmlinuz- --kernel true
Checking image
Encrypting image
Splitting image...
Part: vmlinuz-
Generating manifest /tmp/vmlinuz-
[bbockelm@localhost ~]$ euca-upload-bundle -b testbucket -m /tmp/vmlinuz-
Checking bucket: testbucket
Uploading manifest file
Uploading part: vmlinuz-
Uploaded image as testbucket/vmlinuz-
[bbockelm@localhost ~]$ euca-register testbucket/vmlinuz-
IMAGE	aki-0000000a

Write down the kernel ID; it is aki-0000000a above. Then, the initrd:

euca-bundle-image -i ~/initramfs- --ramdisk true
euca-upload-bundle -b testbucket -m /tmp/initramfs-
euca-register testbucket/initramfs-

My initrd's ID was ari-0000000b. Finally, the disk image itself

euca-bundle-image --kernel aki-0000000a --ramdisk ari-0000000b -i /tmp/serverfinal.img -r x86_64

This will save the image into /tmp and named "serverfinal.img.manifest.xml". I didn't particularly care for the name, so I changed it to "fedora-15.img.manifest.xml".  Now, upload:

euca-upload-bundle -b testbucket -m /tmp/fedora-15.img.manifest.xml
euca-register testbucket/serverfinal2.img.manifest.xml

Congratulations! You now have a brand-new Fedora-15 image ready to use. Fire up HybridFox and see if you were successful.

No comments:

Post a Comment