Sunday, March 23, 2008

rsync and sparse files

Unfortunately, Linux has no syscall to check whether a file is sparse. There are some nice ideas about extending the lseek() syscall and implementing such a feature in ZFS (SEEK_HOLE and SEEK_DATA for sparse files), but there's nothing ready for production filesystems yet.
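Without a dedicated syscall, a rough user-space hint is to compare the space a file actually occupies with its apparent size. The Python sketch below is only an illustration of that idea: the function names are made up for this example, and it assumes st_blocks is counted in 512-byte units, as on Linux. Filesystems with compression or inline data can fool it.

```python
import os

def looks_sparse(size, blocks, unit=512):
    """Return True if fewer bytes are allocated than the apparent size.

    Assumes `blocks` is expressed in `unit`-byte units (512 on Linux).
    """
    return blocks * unit < size

def is_probably_sparse(path):
    """Apply the heuristic to a real file via stat()."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

For instance, is_probably_sparse("disk.img") would typically return True for a freshly created, mostly empty disk image.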

The common approach for user-space applications is to implement a heuristic that decides whether a file can be treated as sparse (and save disk space) or not (and just write the bytes to disk).

rsync, for example, checks every 1024-byte chunk before writing it to the destination file. If a chunk starts or ends with zero bytes, those zeros are not written: they are simply skipped with lseek().
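The chunk heuristic described above can be sketched in a few lines of Python. This is not rsync's actual C code: write_sparse and its block_size parameter are names invented for this illustration, and the sketch assumes the data is written sequentially into a freshly created file.

```python
import os

def _write_all(fd, buf):
    """Call write() until the whole buffer has been written."""
    view = memoryview(buf)
    while len(view):
        view = view[os.write(fd, view):]

def write_sparse(fd, data, block_size=1024):
    """Write `data` at the current offset of `fd`, seeking over zero runs.

    Leading and trailing zeros of each chunk are skipped with lseek();
    zeros in the middle of a chunk are written like any other byte.
    Returns the number of bytes actually passed to write().
    """
    written = 0
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        body = chunk.lstrip(b"\0")
        lead = len(chunk) - len(body)          # zeros at the start of the chunk
        body = body.rstrip(b"\0")
        trail = len(chunk) - lead - len(body)  # zeros at the end of the chunk
        if lead:
            os.lseek(fd, lead, os.SEEK_CUR)
        if body:
            _write_all(fd, body)
            written += len(body)
        if trail:
            os.lseek(fd, trail, os.SEEK_CUR)
    # A hole at the very end only becomes part of the file once the size is set.
    os.ftruncate(fd, os.lseek(fd, 0, os.SEEK_CUR))
    return written
```

Reading the file back returns exactly the original bytes, because the filesystem reports the skipped ranges as zeros.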

Unfortunately, on some filesystems, typically those optimized for large sequential I/O throughput (like IBM GPFS, IBM SAN FS, or distributed filesystems in general), a large number of lseek() operations can severely hurt performance.

In these cases it can be very helpful to enlarge the block size used to handle sparse files.

For example, using a sparse write size of 32KB, I've been able to increase the transfer rate by an order of magnitude when copying the output files of scientific applications from GPFS to GPFS or from GPFS to SAN FS.
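To see why a larger block helps, here is a small Python sketch that only counts how many lseek() calls the chunk heuristic would issue for a given buffer. The function count_seeks and the synthetic input are assumptions made for illustration; they are not taken from rsync or from my real data files.

```python
def count_seeks(data, block_size):
    """Count the lseek() calls needed when leading and trailing zeros
    of every `block_size` chunk are skipped instead of written."""
    seeks = 0
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        body = chunk.lstrip(b"\0")
        if len(body) < len(chunk):             # chunk starts with zeros
            seeks += 1
        if body and len(body.rstrip(b"\0")) < len(body):  # chunk ends with zeros
            seeks += 1
    return seeks

# Synthetic 1 MiB input: 512 zero bytes followed by 512 data bytes, repeated.
data = (b"\0" * 512 + b"x" * 512) * 1024
print(count_seeks(data, 1024))       # 1024
print(count_seeks(data, 32 * 1024))  # 32
```

The price is that zeros in the middle of a larger chunk are written to disk, so fewer holes are created; the bigger block trades some disk space for far fewer seeks.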

Read this thread on the rsync mailing list.

And here is the patch that adds a --sparse-block=SIZE option to rsync, which allows tuning this parameter at run time.

Saturday, March 22, 2008

Xen vs KVM/QEMU for testing SystemImager installs

Xen is great, but for SystemImager testing I've found many advantages in using KVM/QEMU.

First of all, I don't need to run all my applications in a dom0 (or domU), which means I don't have any virtualization overhead for the various OpenOffice instances, Thunderbird, Firefox, Kopete, Last.fm, kernel builds, SystemImager builds, and all the other things I commonly do on my laptop. With KVM/QEMU only the guest OSes are affected by virtualization overhead (although KVM/QEMU guests have more I/O overhead than Xen guests). From this point of view, KVM/QEMU is better suited to desktop machines.

Second, to test a standard SystemImager install I always need to use the unmodified BOEL or UYOK kernel. With Xen, that means running fully virtualized guests (HVM mode), which fall back to the QEMU hardware emulation model for I/O operations, aka the "QEMU device manager" (qemu-dm), a patched version of the original QEMU device emulation. So, in principle, there's no big difference in performance compared to running the guest OSes with KVM/QEMU.

Third, I really like the command-line syntax of kvm/qemu. In principle the xm syntax of Xen is better, because it supports a configuration file, command-line parameters, or both, but I always prefer the qemu syntax, probably because I'm more familiar with it.

And what about VMware? Well... I haven't been able to create x86_64 guests with VMPlayer, and I haven't found anything on the web that pointed me in the right direction. I liked the NAT-only network configuration of VMPlayer, but I think I can also survive with the slightly more complex bridged network setup for QEMU/KVM or Xen.

Here is my network config to start a KVM/QEMU virtual machine on Ubuntu 7.10:

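# Run everything below as root.
# Stop NetworkManager before configuring the interfaces by hand.
/etc/dbus-1/event.d/25NetworkManager stop

# Load the TUN/TAP and KVM (Intel VT) kernel modules.
modprobe tun
modprobe kvm-intel

# Create the bridge br0 and attach the physical interface eth0 to it.
brctl addbr br0
ifconfig br0 up
ifconfig eth0 up
brctl addif br0 eth0

# Create the tap interface qtap0, owned by UID 1000, and add it to the bridge.
tunctl -b -u 1000 -t qtap0
brctl addif br0 qtap0
ifconfig qtap0 up promisc

# Let unprivileged users open the tun and kvm devices.
chmod a+rw /dev/net/tun
chmod a+rw /dev/kvm

# Boot the guest from the SystemImager CD image, with its NIC connected to qtap0.
kvm -cdrom /home/righiandr/systemimager.iso \
-hda /home/righiandr/xen/domains/Debian4/disk.img -boot d \
-m 512 -net nic,model=rtl8139 -net tap,ifname=qtap0,script=no -smp 1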

NOTE: systemimager.iso is made by si_mkautoinstallcd.

Automating Xen VM deployment with SystemImager

Now that I own an Intel VT-capable PC, I'm able to test any kind of automatic Xen VM deployment with SystemImager directly from my laptop, even HVM Xen VMs.

The next step will be to test the deployment of a Xen VM with a Windows raw-disk image, cloned by SystemImager from a VMware (vmplayer) machine. With this we're finally able to cover all kinds of migrations: virtual-to-virtual (that is, migrations between the same or even different virtualization technologies), physical-to-virtual, virtual-to-physical, and physical-to-physical (obviously the last one was the main task of SystemImager). And not only with Linux! Anyway, at the moment this still needs some manual tweaks (read the howtos above), and I would really like to spend some effort on implementing a SystemImager GUI to fully automate all these steps.

Sunday, March 16, 2008

picasa upload & download

After uploading a lot of photos to Picasa Web Albums via this small script I wrote (picasaupload), I realized that there's no "magic key" to download the pictures back to my PC. So, here is another script that works the other way: picasadownload.

As the name suggests, this script downloads images from a Google Picasa web album passed as a command-line argument.

For example, to download the pictures I took at Celsa, a wonderful castle near Siena:

Thursday, March 13, 2008

A new notebook...

After carrying my notebook around together with all my books and notes, I finally decided to buy an ultraportable model: a Dell Latitude D430.

My specific model is configured as follows:
  • Intel U7600 Ultra Low Voltage (ULV) Core 2 Duo
  • 2GB DDR2 533MHz SDRAM (1 x 1GB + 1GB integrated)
  • 12.1" WXGA Display (1280 x 800)
  • 80GB Toshiba 5200RPM 1.8'' HDD (I'll move to a SSD soon ;-))
  • MediaBase 8x DVD+/-RW Drive - W Euro
  • Intel 4965AGN wireless card
  • FreeDOS inside! Woo-hoo!!! (H.Simpson accent) no Microcrap included! :-) just reinstalled with Ubuntu 7.10.
And now I can finally build x86_64 SystemImager packages at home (or better, anywhere) and play with Xen and a processor with Intel VT extensions!