<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4397409626710913610</id><updated>2012-02-17T00:43:17.847+01:00</updated><category term='linux'/><category term='prompt'/><category term='cgroup'/><category term='IO controller'/><category term='cache'/><category term='CINECA'/><category term='rsync'/><category term='lock'/><category term='SystemImager'/><category term='page writeback'/><category term='battery'/><category term='multi-core'/><category term='website'/><category term='BCX'/><category term='bash'/><category term='samsung'/><category term='BitTorrent'/><category term='iozone'/><category term='picasa'/><category term='mutt'/><category term='android'/><category term='homepage'/><category term='framebuffer'/><category term='sparse file'/><category term='kernel'/><category term='gt-i9100'/><category term='multi-thread'/><category term='gcc'/><category term='buffer overflow'/><category term='linuxday'/><category term='ubuntu'/><category term='deadlock'/><category term='gmail'/><title type='text'>arighi's blog</title><subtitle type='html'>Andrea Righi</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>49</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-1363225522965524823</id><published>2011-08-30T11:52:00.000+02:00</published><updated>2011-08-30T11:52:56.341+02:00</updated><title type='text'>install busybox from source on Samsung GT-I9100</title><content type='html'>If you want a complete set of unix tools in your phone, here are the steps to cross-compile and install busybox from source (&lt;b&gt;there's no need to root the phone and/or install everything using the busybox installer from the market&lt;/b&gt;).&lt;br /&gt;&lt;br /&gt;Download the latest ARM gnueabi toolchain from &lt;a href="http://www.codesourcery.com/sgpp/lite/arm/portal/release1803"&gt;codesourcery&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Get the busybox source code from the git repository:&lt;br /&gt;&amp;nbsp; $ git clone git://busybox.net/busybox.git&lt;br /&gt;&lt;br /&gt;Copy my busybox config file (or use your own config if you prefer, this can be just a starting point):&lt;br /&gt;&lt;br /&gt;$ cd busybox&lt;br /&gt;$ wget -O .config &lt;a href="http://www.develer.com/~arighi/android/busybox/config"&gt;http://www.develer.com/~arighi/android/busybox/config&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;[optional]&lt;/i&gt; Change the busybox config if you want by running:&lt;br /&gt;$ make menuconfig&lt;br /&gt;&lt;br /&gt;Cross-compile busybox:&lt;br /&gt;$ make oldconfig &amp;amp;&amp;amp; make&lt;br /&gt;&lt;br /&gt;At the end of the build the busybox binary should be available as a statically linked ELF for ARM:&lt;br /&gt;$ file busybox&lt;br /&gt;busybox: ELF 32-bit LSB 
executable, ARM, version 1 (SYSV), statically linked, for GNU/Linux 2.6.16, stripped&lt;br /&gt;&lt;br /&gt;Upload the busybox binary to the device (you don't need root access to your phone):&lt;br /&gt;$ adb push busybox /data/local/tmp/&lt;br /&gt;&lt;br /&gt;Now busybox is ready to be used:&lt;br /&gt;$ adb shell /data/local/tmp/busybox CMD&lt;br /&gt;&lt;br /&gt;Example:&lt;br /&gt;$ adb shell /data/local/tmp/busybox lsusb&lt;br /&gt;Bus 001 Device 001: ID 1d6b:0002&lt;br /&gt;Bus 001 Device 002: ID 1519:0020&lt;br /&gt;&lt;br /&gt;=== NOTE: all the following steps are optional ===&lt;br /&gt;&lt;br /&gt;If you have enabled adb root shell access on your phone (i.e., by rooting the phone or by installing my&amp;nbsp;&lt;a href="http://arighi.blogspot.com/2011/08/howto-custom-kernel-on-samsung-galaxy-s.html"&gt;custom kernel&lt;/a&gt;), you can also install busybox in the /system partition and have all the commands available in the $PATH.&lt;br /&gt;&lt;br /&gt;Remount the /system partition in read-write mode on your phone:&lt;br /&gt;$ adb shell "mount -oremount,rw /dev/block/mmcblk0p9 /system"&lt;br /&gt;&lt;br /&gt;Upload the busybox binary to the /system partition:&lt;br /&gt;$ adb push busybox /system/xbin/busybox&lt;br /&gt;&lt;br /&gt;Install busybox by using busybox itself:&lt;br /&gt;$ adb shell "chmod 755 /system/xbin/busybox"&lt;br /&gt;$ adb shell "/system/xbin/busybox --install -s /system/xbin"&lt;br /&gt;&lt;br /&gt;Remount /system read-only:&lt;br /&gt;$ adb shell "mount -oremount,ro /dev/block/mmcblk0p9 /system"&lt;br /&gt;&lt;br /&gt;Now all the busybox applets are in your $PATH:&lt;br /&gt;&lt;br /&gt;$ adb shell lsusb&lt;br /&gt;Bus 001 Device 001: ID 1d6b:0002&lt;br /&gt;Bus 001 Device 002: ID 1519:0020&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-1363225522965524823?l=arighi.blogspot.com' alt='' 
/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/1363225522965524823/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=1363225522965524823' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1363225522965524823'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1363225522965524823'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2011/08/install-busybox-from-source-on-samsung.html' title='install busybox from source on Samsung GT-I9100'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-817194586764907231</id><published>2011-08-30T00:33:00.001+02:00</published><updated>2011-10-05T00:28:07.621+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='samsung'/><category scheme='http://www.blogger.com/atom/ns#' term='kernel'/><category scheme='http://www.blogger.com/atom/ns#' term='gt-i9100'/><category scheme='http://www.blogger.com/atom/ns#' term='android'/><title type='text'>HOWTO: custom kernel on Samsung Galaxy S II I9100</title><content type='html'>Recently I've replaced my Bravo HTC Desire with a new Android phone: a &lt;a href="http://www.samsung.com/global/microsite/galaxys2/html/"&gt;Samsung Galaxy S II I-9100&lt;/a&gt;. 
I couldn't stick with the stock kernel for long, so I finally found some spare time to cook a custom kernel, starting from the original Samsung kernel source code &lt;a href="https://opensource.samsung.com/reception/reception_main.do?method=reception_search&amp;amp;searchValue=I9100"&gt;GT-I9100_OpenSource_Update2&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In this post I report (for my own future reference and for anyone interested) all the&amp;nbsp;steps I took to set up the build environment, cross-compile the custom kernel and flash it onto the phone.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;DISCLAIMER:&amp;nbsp;&lt;/b&gt;&lt;b&gt;I take no responsibility for anything that may go wrong by you following these instructions. &lt;/b&gt;&lt;b&gt;Proceed at your own risk!&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;=== Requirements ===&lt;br /&gt;&lt;br /&gt;- A Samsung Galaxy S II (not necessarily rooted! you'll get a root shell when you flash the new kernel)&lt;br /&gt;- The latest Android &lt;a href="http://developer.android.com/sdk/index.html"&gt;SDK&lt;/a&gt;&lt;br /&gt;- The arm-none-eabi cross-compile toolchain (you can get it from the &lt;a href="https://sourcery.mentor.com/sgpp/lite/arm/portal/release1802"&gt;CodeSourcery website&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;=== HOWTO ===&lt;br /&gt;&lt;br /&gt;Download and install the ARM toolchain from the CodeSourcery website: be sure that arm-none-eabi-gcc is in your $PATH.&lt;br /&gt;&lt;br /&gt;Get the &lt;a href="https://github.com/arighi/gt-i9100"&gt;autobuild script&lt;/a&gt;:&lt;br /&gt;&amp;nbsp;$ git clone git://github.com/arighi/gt-i9100.git&lt;br /&gt;&lt;br /&gt;Run the script:&lt;br /&gt;&amp;nbsp;$ ./build-kernel.sh&lt;br /&gt;&lt;br /&gt;The script downloads the "&lt;i&gt;-arighi&lt;/i&gt;" &lt;a href="https://github.com/arighi/linux-gt-i9100"&gt;kernel source code&lt;/a&gt;, an &lt;a href="https://github.com/arighi/initramfs-gt-i9100"&gt;initramfs template&lt;/a&gt; and builds a new kernel ready to be flashed into the 
device.&lt;br /&gt;&lt;br /&gt;At the end of the autobuild process the file&amp;nbsp;&lt;b&gt;kernel-gt-i9100-arighi.tar&lt;/b&gt;&amp;nbsp;can be used to flash the new kernel to the phone using Odin (search on the web or in the xda-developers forum, there are tons of howtos/tutorials for this).&lt;br /&gt;&lt;br /&gt;=== Results ===&lt;br /&gt;&lt;br /&gt;The score with the Quadrant benchmark is not bad at all: I always got &amp;gt; 4000, but remember that we're cheating during the IO test because of the fsync-disable patch.&lt;br /&gt;&lt;br /&gt;Anyway, the overall result looks good enough.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-fkxDkFW-jpM/Tluroqy683I/AAAAAAAAHgM/dkjXZI2PE1k/s1600/device.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/-fkxDkFW-jpM/Tluroqy683I/AAAAAAAAHgM/dkjXZI2PE1k/s320/device.png" width="208" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;=== Additional notes ===&lt;br /&gt;&lt;br /&gt;- All the custom *.ko files are included in the initramfs to avoid potential errors/problems with the original kernel modules (so it is always possible to flash back the original kernel later: all the old kernel modules are still there, untouched).&lt;br /&gt;&lt;br /&gt;- After you've flashed the -arighi kernel for the first time, you will also have root access to your device. The initramfs template enables adb root shell&amp;nbsp;(ro.secure == 0), so an adb shell&amp;nbsp;will immediately drop you into a root shell. 
This means you can also re-flash your device from Linux directly using the &lt;a href="https://github.com/arighi/gt-i9100/blob/master/flash-kernel.sh"&gt;flash-kernel.sh&lt;/a&gt; script.&lt;br /&gt;&lt;br /&gt;- For the complete list of all the patches applied to this kernel have a look at the git log &lt;a href="https://github.com/arighi/linux-gt-i9100/commits/master"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;- IMPORTANT: the &lt;a href="https://github.com/arighi/linux-gt-i9100/commit/924375d3872db6c16894e9d2e7c3f1b408df0e45"&gt;fsync-disable patch&lt;/a&gt; (enabled in the kernel by default) can increase performance and battery life, but it is dangerous, because it might eat your data!! It makes the software no longer crash-safe, so if you start to randomly kill your apps you may lose some data.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;[UPDATE: the fsync-disable patch is no longer enabled by default in the kernel; to enable it just set&amp;nbsp;CONFIG_FILE_SYNC_DISABLE=y in the kernel .config]&lt;/b&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-817194586764907231?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/817194586764907231/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=817194586764907231' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/817194586764907231'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/817194586764907231'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2011/08/howto-custom-kernel-on-samsung-galaxy-s.html' title='HOWTO: custom kernel on Samsung Galaxy 
S II I9100'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-fkxDkFW-jpM/Tluroqy683I/AAAAAAAAHgM/dkjXZI2PE1k/s72-c/device.png' height='72' width='72'/><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-904651226057457563</id><published>2011-01-20T00:09:00.005+01:00</published><updated>2011-01-20T00:21:15.680+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='kernel'/><category scheme='http://www.blogger.com/atom/ns#' term='android'/><title type='text'>Android: automated per-uid task group</title><content type='html'>Android is a privilege-separated operating system, in which each application&amp;nbsp;runs with a distinct system identity: the Linux user ID (uid).&lt;br /&gt;&lt;br /&gt;With this patch (&lt;a href="http://www.develer.com/~arighi/android/cm-kernel/sched-automated-per-uid-task-groups.patch"&gt;sched-automated-per-uid-task-groups.patch&lt;/a&gt;) the kernel automatically creates a&amp;nbsp;distinct&amp;nbsp;task group for each uid (when a process calls set_user()) and places all the tasks that belong to a single uid into the same task group. 
In this way&amp;nbsp;each application can get a fair share of the CPU bandwidth (guaranteed by the CFS&amp;nbsp;scheduler), regardless of the number of tasks/threads spawned.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;The patch is against the CyanogenMod 6.1.1 kernel (2.6.35.10) and I tested it successfully on my HTC Desire (Bravo).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I've used the following testcase:&lt;/div&gt;&lt;div&gt;&amp;nbsp;&amp;nbsp;- run 4 cpu hogs in the background as user app_35 (com.android.email in my case):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;# su - app_35&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;$ for i in `seq 4`; do yes &amp;gt;/dev/null &amp;amp; done&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&amp;nbsp;&amp;nbsp;- run the &lt;a href="http://www.aurorasoftworks.com/products/quadrant"&gt;Quadrant benchmark&lt;/a&gt; in parallel&amp;nbsp;and measure the result with and without the patch applied&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;Without the patch (output of top):&lt;/div&gt;&lt;div&gt;&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;PID &amp;nbsp;PPID USER &amp;nbsp; &amp;nbsp; STAT &amp;nbsp; VSZ %MEM CPU %CPU COMMAND&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" 
style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6533 &amp;nbsp; 123 10070 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 202m 48.6 &amp;nbsp; 0 &lt;b&gt;20.0&lt;/b&gt; com.aurorasoftworks.quadrant.ui.st&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6506 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 20.0 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6507 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 20.0 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6508 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 20.0 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6509 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 20.0 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Benchmark result: &lt;b&gt;676&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&amp;nbsp;&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;uid 10035 (cpu hog) &amp;nbsp;: 60.0 % cpu quota&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;uid 10070 (benchmark): 20.0 % cpu quota&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With automated per-uid task group (output of top):&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&amp;nbsp;&amp;nbsp;PID &amp;nbsp;PPID USER &amp;nbsp; &amp;nbsp; STAT &amp;nbsp; VSZ %MEM CPU %CPU COMMAND&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6784 &amp;nbsp; 123 10070 &amp;nbsp; &amp;nbsp;S &amp;nbsp; &amp;nbsp; 209m 51.4 &amp;nbsp; 0 &lt;b&gt;50.0&lt;/b&gt; com.aurorasoftworks.quadrant.ui.st&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6852 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 12.5 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6853 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 12.5 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6854 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 12.5 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;6855 &amp;nbsp; &amp;nbsp; 1 10035 &amp;nbsp; &amp;nbsp;R &amp;nbsp; &amp;nbsp; 1128 &amp;nbsp;0.2 &amp;nbsp; 0 12.5 yes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Benchmark result: &lt;b&gt;816&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;uid 10035 (cpu hog) &amp;nbsp;: 50.0 % cpu quota&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;uid 10070 (benchmark): 50.0 % cpu quota&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;Total speedup: &lt;b&gt;~1.2&lt;/b&gt; (816 / 676, i.e., the benchmark score is about 20% higher in this case), because with the patch the benchmark gets ~50% of the CPU time, while without it the benchmark gets only ~20%, even though the two applications are peers that should be treated as equals from the user's perspective.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;This patch is based on the patch "&lt;a href="http://marc.info/?l=linux-kernel&amp;amp;m=128978361700898&amp;amp;w=2"&gt;sched: automated per tty task groups&lt;/a&gt;" by Mike Galbraith.&lt;/i&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-904651226057457563?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/904651226057457563/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=904651226057457563' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/904651226057457563'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/904651226057457563'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2011/01/android-automated-per-uid-task-group.html' title='Android: automated per-uid task group'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4658422063885192869</id><published>2011-01-10T23:28:00.001+01:00</published><updated>2011-01-10T23:30:35.961+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='kernel'/><category scheme='http://www.blogger.com/atom/ns#' term='android'/><title type='text'>HOWTO: install a custom kernel on HTC Desire</title><content type='html'>=== Disclaimer ===&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I take no responsibility for anything that may go wrong by you following these instructions.&amp;nbsp;Proceed at your own risk!&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I tested this howto with a Bravo HTC Desire, rooted with Unrevoked 3.22.&lt;br /&gt;&lt;br /&gt;=== Requirements ===&lt;br /&gt;&lt;br /&gt;- A rooted &lt;a href="http://www.htc.com/www/product/desire/overview.html"&gt;HTC Desire (Bravo)&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;- The Android &lt;a href="http://developer.android.com/sdk/index.html"&gt;SDK&lt;/a&gt; + &lt;a href="http://developer.android.com/sdk/ndk/index.html"&gt;NDK&lt;/a&gt; (to get the cross-compile toolchain)&lt;br /&gt;&lt;br /&gt;- The latest &lt;a href="http://developer.htc.com/"&gt;HTC Desire kernel&lt;/a&gt;&amp;nbsp;(choose "&lt;i&gt;HTC Desire - Froyo MR - 2.6.32 kernel source code&lt;/i&gt;")&lt;br /&gt;&lt;br /&gt;- &lt;a href="https://github.com/koush/AnyKernel"&gt;koush's AnyKernel&lt;/a&gt; template (to generate the update.zip at the end of the build process)&lt;br /&gt;&lt;br /&gt;=== HOWTO ===&lt;br /&gt;&lt;br /&gt;- Prepare the cross-compiler environment (replace /opt/android with the path where you have installed the Android NDK):&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ export 
PATH=/opt/android/android-ndk-r5/toolchains/arm-eabi-4.4.0/prebuilt/linux-x86/bin:$PATH&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;At this point arm-eabi-gcc, as well as other binutils and compiler binaries, should be in your $PATH.&lt;br /&gt;&lt;br /&gt;- Untar the kernel&lt;br /&gt;&lt;br /&gt;- save the kernel config (if you want to restore the original kernel config):&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ adb pull /proc/config.gz&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- &lt;i&gt;[optional]&lt;/i&gt; Apply the following patches to the kernel:&lt;br /&gt;&amp;nbsp;&amp;nbsp; -&amp;nbsp;&lt;a href="http://www.develer.com/~arighi/android/linux/0001-sync-disable-fsync-fdatasync-sync_file_range-syscall.patch"&gt;0001-sync-disable-fsync-fdatasync-sync_file_range-syscall&lt;/a&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; - &lt;a href="http://www.develer.com/~arighi/android/linux/0002-writeback-change-default-dirty-memory-settings.patch"&gt;0002-writeback-change-default-dirty-memory-settings&lt;/a&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; - &lt;a href="http://www.develer.com/~arighi/android/linux/0003-sched-replace-CFS-with-the-BFS-scheduler.patch"&gt;0003-sched-replace-CFS-with-the-BFS-scheduler&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;- &lt;i&gt;[optional]&lt;/i&gt; Take my kernel configuration:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ wget -O bravo-2.6.32-gd96f2c0/.config http://www.develer.com/~arighi/android/linux/config&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Or use the previously saved config.gz either.&lt;br /&gt;&lt;br /&gt;- Build the kernel:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ cd bravo-2.6.32-gd96f2c0/&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ make ARCH=arm CROSS_COMPILE=arm-eabi- oldconfig&lt;/span&gt;&lt;br 
/&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ make ARCH=arm CROSS_COMPILE=arm-eabi-&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now you should find the fresh new kernel, ready to be flashed on your HTC Desire, in arch/arm/boot/zImage.&lt;br /&gt;&lt;br /&gt;- Apply the following patch to the koush's AnyKernel source (to fix a syntax error when trying to flash update.zip from ClockworkMod):&lt;br /&gt;&amp;nbsp;&amp;nbsp; -&amp;nbsp;&lt;a href="http://www.develer.com/~arighi/android/anykernel/0001-updater-script-specify-the-mount-options-for-the-sys.patch"&gt;0001-updater-script-specify-the-mount-options-for-the-sys&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;- Replace the zImage in the AnyKernel template with your zImage:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ cp bravo-2.6.32-gd96f2c0/arch/arm/boot/zImage AnyKernel/kernel/zImage&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- Go back to the template directory (you will see three subdirectories: META-INF, kernel &amp;amp; system) and generate the update.zip:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ zip -r ../update.zip *&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- Connect your phone via a USB cable (be sure to turn on USB debugging on your phone: Settings -&amp;gt; Applications -&amp;gt; Development -&amp;gt; USB debugging)&lt;br /&gt;&lt;br /&gt;- Push update.zip and the wireless module to the SD card of your phone:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ adb push update.zip /sdcard/update.zip&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ adb push bravo-2.6.32-gd96f2c0/drivers/net/wireless/bcm4329_204/bcm4329.ko /sdcard/bcm4329.ko&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- Reboot your phone in ClockworkMod recovery (power-on while 
holding the Volume Down key and select RECOVERY)&lt;br /&gt;&lt;br /&gt;- In ClockworkMod select "apply sdcard:update.zip" (confirm: choose Yes)&lt;br /&gt;&lt;br /&gt;- Reboot your phone via "reboot system now"&lt;br /&gt;&lt;br /&gt;At this point your new custom kernel should boot.&lt;br /&gt;&lt;br /&gt;=== Fix the bcm4329 wireless module loading without S-OFF ===&lt;br /&gt;&lt;br /&gt;The bcm4329.ko module can't be overwritten in the /system partition unless the device is S-OFF, which is what gives read-write access to /system. However, we can force the use of our new module by bind-mounting another writable directory (e.g., /data/local) over /system/lib/modules.&lt;br /&gt;&lt;br /&gt;Prerequisites:&lt;br /&gt;- the latest &lt;a href="http://www.codesourcery.com/sgpp/lite/arm/portal/subscription?@template=lite"&gt;ARM toolchain&lt;/a&gt; downloadable from the CodeSourcery site&lt;br /&gt;&lt;br /&gt;Download and install the ARM toolchain and be sure that arm-none-linux-gnueabi-gcc is in your $PATH.&lt;br /&gt;&lt;br /&gt;- get the latest version of busybox from git (or download a recent stable version if you prefer):&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ git clone git://busybox.net/busybox.git&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Howto:&lt;br /&gt;- use my busybox config file:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ cd busybox&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ wget -O .config http://www.develer.com/~arighi/android/busybox/config&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- cross-compile busybox:&lt;br /&gt;$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- oldconfig&lt;br /&gt;$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi-&lt;br /&gt;&lt;br /&gt;- copy the busybox binary into the /data/local directory on your phone:&lt;br /&gt;&lt;span 
class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ adb push busybox /data/local/busybox&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- bind-mount /data/local over /system/lib/modules:&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ adb shell&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ su&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;# cat /sdcard/bcm4329.ko &amp;gt; /data/local/bcm4329.ko&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;# /data/local/busybox mount --bind /data/local /system/lib/modules&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;After this trick, check Settings -&amp;gt; Wireless &amp;amp; networks -&amp;gt; Wi-Fi on your phone: the wireless connection should start normally.&lt;br /&gt;&lt;br /&gt;=== Results ===&lt;br /&gt;&lt;br /&gt;Here is my score with this kernel using the Quadrant benchmark: &lt;b&gt;1370!&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_Co437B-zJP4/TSuF4h_fVQI/AAAAAAAAHY4/FWjvCrFkQdw/s1600/kernel-custom.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://2.bp.blogspot.com/_Co437B-zJP4/TSuF4h_fVQI/AAAAAAAAHY4/FWjvCrFkQdw/s320/kernel-custom.png" width="212" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4658422063885192869?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4658422063885192869/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4658422063885192869' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4658422063885192869'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4658422063885192869'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2011/01/howto-install-custom-kernel-on-htc.html' title='HOWTO: install a custom kernel on HTC Desire'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_Co437B-zJP4/TSuF4h_fVQI/AAAAAAAAHY4/FWjvCrFkQdw/s72-c/kernel-custom.png' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-587173017379925917</id><published>2009-09-11T18:53:00.000+02:00</published><updated>2009-09-11T18:53:20.661+02:00</updated><title type='text'>Linux kernel hacking: file notification system with kernel tracepoints</title><content type='html'>An example of how to use kernel tracepoints to create a simple real-time backup / file change notification system. 
The article (only in Italian, sorry) is available at &lt;a href="http://stacktrace.it/2009/09/linux-kernel-hacking-real-time-backup-con-i-kernel-tracepoints/"&gt;stacktrace.it&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-587173017379925917?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/587173017379925917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=587173017379925917' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/587173017379925917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/587173017379925917'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/09/linux-kernel-hacking-file-notification.html' title='Linux kernel hacking: file notification system with kernel tracepoints'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-9062166706622954248</id><published>2009-06-26T14:30:00.005+02:00</published><updated>2009-06-26T16:26:52.495+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mutt'/><category scheme='http://www.blogger.com/atom/ns#' term='gmail'/><title type='text'>mutt + gmail notifier</title><content type='html'>I really enjoy the power of mutt, and I have to say that I'm not too far from reaching 
the email Nirvana with it :). OK, it's not the email client for everybody, it's just for people who prefer the keyboard to the mouse and love command line interfaces.&lt;br /&gt;&lt;br /&gt;There's only one missing feature in mutt: a nice way to be notified of new emails. The problem with mutt is that I need to periodically switch to the mutt shell to check for new emails. And I don't even like the crappy notification balloons that cover the useful part of the desktop (e.g., thunderbird). A small tray icon could be a solution (and I did it this way for a while, patching mutt), but with the icon I can't immediately see the message that I received.&lt;br /&gt;&lt;br /&gt;This led me to notice a large unused area in the gnome panel at the top (recently I moved from Fluxbox to Gnome, yeah! :) now that I have an ultra-very-fast SSD I can also use a fancy desktop environment). So, why not use the top panel to show the subject of the last email I received in my mailbox? ta-da! the solution: a small python gnome applet that periodically fetches the last unread email from a generic IMAPS folder in gmail and prints the subject to the panel.&lt;br /&gt;&lt;br /&gt;Here's the code: &lt;a href="http://download.systemimager.org/%7Earighi/gmail-check/"&gt;gmail-check&lt;/a&gt;, very minimalist and designed just for my particular desktop environment, so it may not work in some cases...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-9062166706622954248?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/9062166706622954248/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=9062166706622954248' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4397409626710913610/posts/default/9062166706622954248'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/9062166706622954248'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/06/mutt-gmail-notifier.html' title='mutt + gmail notifier'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-9188645081106440364</id><published>2009-06-08T13:58:00.002+02:00</published><updated>2009-06-08T14:16:26.413+02:00</updated><title type='text'>New SSD disk</title><content type='html'>I just got one new SSD disk MTRON MOBI 3000 for my Dell Latitude D430 notebook. It's very small, only 32GB, but it definitely ROCKS!!! I can boot in about 12 seconds, without any deep tuning of the kernel and boot services, but the _responsiveness_ is the most relevant thing, apart the read/write 100MB/s throughput (that is not so important for a desktop system). 
The impressive part is the ~5500 iops (IO operations per second) obtained using a workload of 4KB random reads/writes!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-9188645081106440364?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/9188645081106440364/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=9188645081106440364' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/9188645081106440364'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/9188645081106440364'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/06/new-ssd-disk.html' title='New SSD disk'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-1762890798429268481</id><published>2009-05-14T23:39:00.003+02:00</published><updated>2009-05-14T23:50:46.732+02:00</updated><title type='text'>Linux kernel hacking: process containers</title><content type='html'>A basic overview of the Linux cgroups. 
My article is available at &lt;a href="http://stacktrace.it/2009/05/linux-kernel-hacking-contenitori-di-processi/"&gt;stacktrace.it&lt;/a&gt; (in italian).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-1762890798429268481?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/1762890798429268481/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=1762890798429268481' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1762890798429268481'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1762890798429268481'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/05/linux-kernel-hacking-process-containers.html' title='Linux kernel hacking: process containers'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4503361081314503199</id><published>2009-05-02T16:00:00.004+02:00</published><updated>2009-05-02T16:11:48.075+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='battery'/><category scheme='http://www.blogger.com/atom/ns#' term='bash'/><category scheme='http://www.blogger.com/atom/ns#' term='prompt'/><title type='text'>battery life in bash prompt</title><content type='html'>I've just reconfigured my .bashrc to execute this bash script that 
shows the percentage of battery life at the beginning of the command prompt. Geez, really nice! :) At this point I can turn off the guidance-power-manager applet and enjoy a faster boot.&lt;pre&gt;&lt;br /&gt;#!/bin/bash&lt;br /&gt;GRAY="1;30"&lt;br /&gt;CYAN="0;36"&lt;br /&gt;LIGHT_CYAN="1;36"&lt;br /&gt;LIGHT_BLUE="1;34"&lt;br /&gt;YELLOW="1;33"&lt;br /&gt;WHITE="0;1"&lt;br /&gt;NO_COLOR="0"&lt;br /&gt;LIGHT_RED="1;31"&lt;br /&gt;LIGHT_GREEN="1;32"&lt;br /&gt;BROWN="0;33"&lt;br /&gt;&lt;br /&gt;function battery_info()&lt;br /&gt;{&lt;br /&gt;    BATT_INFO=$(acpi -b | awk -F', ' '{print $2}')&lt;br /&gt;    AC_INFO=$(acpi -aB | awk -F': ' '{print $2}')&lt;br /&gt;&lt;br /&gt;    if [ "$AC_INFO" = "off-line" ]; then&lt;br /&gt;        BATT_PERC=${BATT_INFO:0:${#BATT_INFO}-1}&lt;br /&gt;&lt;br /&gt;        if [ "$BATT_PERC" -ge 75 ]; then&lt;br /&gt;            COLOR=$LIGHT_GREEN&lt;br /&gt;        elif [ "$BATT_PERC" -le 25 ]; then&lt;br /&gt;            COLOR=$LIGHT_RED&lt;br /&gt;        else&lt;br /&gt;            COLOR=$YELLOW&lt;br /&gt;        fi&lt;br /&gt;    else&lt;br /&gt;        COLOR=$NO_COLOR&lt;br /&gt;    fi&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;PROMPT_COMMAND=battery_info&lt;br /&gt;PS1="\[\033[\$(echo -n \$COLOR)m\]\$(echo -n \$BATT_INFO)\&lt;br /&gt;\[\033[${NO_COLOR}m\] \u@\h:\[\033[${WHITE}m\]\w\[\033[${NO_COLOR}m\]\$ "&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4503361081314503199?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4503361081314503199/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4503361081314503199' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4503361081314503199'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4503361081314503199'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/05/battery-life-in-bash-prompt.html' title='battery life in bash prompt'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-7808575696353889408</id><published>2009-04-28T23:24:00.003+02:00</published><updated>2009-04-28T23:37:54.799+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iozone'/><category scheme='http://www.blogger.com/atom/ns#' term='buffer overflow'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><title type='text'>iozone: buffer overflow in ubuntu jaunty</title><content type='html'>In the latest Ubuntu Jaunty iozone immediately crashes with a nice *** buffer overflow detected *** message, that means it is practically unusable. Fortunately the cause of the bug is very simple: a wrong length used to copy a string by gethostname(). 
I posted a fix &lt;a href="https://bugs.launchpad.net/ubuntu/+source/iozone3/+bug/320615"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-7808575696353889408?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/7808575696353889408/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=7808575696353889408' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7808575696353889408'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7808575696353889408'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/04/iozone-buffer-overflow-in-ubuntu-jaunty.html' title='iozone: buffer overflow in ubuntu jaunty'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4462012240276504664</id><published>2009-04-16T17:17:00.002+02:00</published><updated>2009-04-16T17:31:02.735+02:00</updated><title type='text'>cgroup: io-throttle controller (v13)</title><content type='html'>A new version of my IO controller for Linux cgroups.&lt;br /&gt;&lt;br /&gt;LWN.net coverage at &lt;a href="http://lwn.net/Articles/328484/"&gt;http://lwn.net/Articles/328484/&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4462012240276504664?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4462012240276504664/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4462012240276504664' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4462012240276504664'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4462012240276504664'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/04/cgroup-io-throttle-controller-v13.html' title='cgroup: io-throttle controller (v13)'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4621337164116452580</id><published>2009-03-10T16:22:00.020+01:00</published><updated>2009-05-14T15:59:21.420+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multi-thread'/><category scheme='http://www.blogger.com/atom/ns#' term='lock'/><category scheme='http://www.blogger.com/atom/ns#' term='multi-core'/><title type='text'>Performance improvement of parallel applications with user-space spinlocks</title><content type='html'>Mutex locks provided by the operating system are expensive to create and acquire and a high contention can dramatically reduce or even eliminate the advantage of parallelism.&lt;br /&gt;&lt;br /&gt;GCC provides some atomic operations, directly 
mapped to the atomic instructions provided by the underlying hardware. These are the same instructions used to implement the traditional locking primitives (mutexes), but they can also be used to build user-space locking primitives and save a significant amount of overhead.&lt;br /&gt;&lt;br /&gt;However, this is not always the best solution: user-space spinlocks have their own drawbacks and overheads. Spinlocks are good when another CPU holds the lock and is likely to release it very soon. In general, when there are more threads than physical CPUs, a spinning thread simply wastes CPU time until the OS decides to preempt it. Moreover, with spinlocks, the CPU spins (really? :)) until it acquires the lock. This leads to more power consumption and more heat to be dissipated, so this could be a really _bad_ solution for many embedded / ultra-portable devices (for the same reason, asynchronous interrupt-driven handlers are better than polling on embedded devices). [ Actually, there are also some memory barrier and out-of-order execution troubles, but we will not discuss them now... maybe I'll report some details in another post... 
].&lt;br /&gt;&lt;br /&gt;Anyway, in the following example we will see a typical problem where the usage of user-space spinlocks can bring evident benefits for performance (however, there is still the power consumption issue, but we don't care about it in this case).&lt;br /&gt;&lt;br /&gt;Setting the USER_SPINLOCK option in the Makefile it is possible to choose between (0) the traditional pthread_mutex primitives and (1) custom user-space spinlock implementation.&lt;br /&gt;&lt;br /&gt;The problem is the classical &lt;a href="http://en.wikipedia.org/wiki/Dining_philosophers_problem"&gt;dining philosopher problem&lt;/a&gt;.&lt;br /&gt;&lt;pre class="code"&gt;&lt;br /&gt;=== Makefile ===&lt;br /&gt;&lt;br /&gt;N_CPUS=$(shell getconf _NPROCESSORS_ONLN)&lt;br /&gt;CACHELINE_SIZE=$(shell getconf LEVEL1_DCACHE_LINESIZE)&lt;br /&gt;&lt;br /&gt;USER_SPINLOCK=1&lt;br /&gt;&lt;br /&gt;TARGET=userspace-spinlock&lt;br /&gt;&lt;br /&gt;all:&lt;br /&gt;    gcc -g -O3 -lpthread -o$(TARGET) \&lt;br /&gt;        -DN_CPUS=$(N_CPUS) \&lt;br /&gt;        -DCACHELINE_SIZE=$(CACHELINE_SIZE) \&lt;br /&gt;        -DUSER_SPINLOCK=$(USER_SPINLOCK) \&lt;br /&gt;        $(TARGET).c&lt;br /&gt;&lt;br /&gt;clean:&lt;br /&gt;    rm -f $(TARGET)&lt;br /&gt;&lt;br /&gt;=== userspace-spinlock.c ===&lt;br /&gt;&lt;br /&gt;#define _GNU_SOURCE&lt;br /&gt;&lt;br /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;#include &amp;lt;pthread.h&amp;gt;&lt;br /&gt;#include &amp;lt;sched.h&amp;gt;&lt;br /&gt;#include &amp;lt;errno.h&amp;gt;&lt;br /&gt;#include &amp;lt;time.h&amp;gt;&lt;br /&gt;#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/types.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/syscall.h&amp;gt;&lt;br /&gt;&lt;br /&gt;static pthread_t threads[N_CPUS];&lt;br /&gt;static int input[N_CPUS];&lt;br /&gt;&lt;br /&gt;#if USER_SPINLOCK&lt;br /&gt;static int shared[N_CPUS * CACHELINE_SIZE];&lt;br /&gt;&lt;br /&gt;static inline void lock(int 
*l)&lt;br /&gt;{&lt;br /&gt;    /* Spin until the previous value was 0, i.e. we took the lock */&lt;br /&gt;    while (__sync_lock_test_and_set(l, 1));&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static inline void unlock(int *l)&lt;br /&gt;{&lt;br /&gt;    /* Store 0 with release semantics (pairs with the acquire in lock()) */&lt;br /&gt;    __sync_lock_release(l);&lt;br /&gt;}&lt;br /&gt;#else /* USER_SPINLOCK */&lt;br /&gt;static pthread_mutex_t shared[N_CPUS * CACHELINE_SIZE];&lt;br /&gt;&lt;br /&gt;static inline void lock(pthread_mutex_t *l)&lt;br /&gt;{&lt;br /&gt;    pthread_mutex_lock(l);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static inline void unlock(pthread_mutex_t *l)&lt;br /&gt;{&lt;br /&gt;    pthread_mutex_unlock(l);&lt;br /&gt;}&lt;br /&gt;#endif /* USER_SPINLOCK */&lt;br /&gt;&lt;br /&gt;/*&lt;br /&gt; * From GETTID(2):&lt;br /&gt; *&lt;br /&gt; *   Glibc does not provide a wrapper for this system call; call it using&lt;br /&gt; *   syscall(2).&lt;br /&gt; *&lt;br /&gt; */&lt;br /&gt;static inline pid_t gettid(void)&lt;br /&gt;{&lt;br /&gt;    return syscall(SYS_gettid);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static void *thread(void *arg)&lt;br /&gt;{&lt;br /&gt;    int n = *(int *)arg;&lt;br /&gt;    int first, second;&lt;br /&gt;    cpu_set_t cmask;&lt;br /&gt;    pid_t pid = gettid();&lt;br /&gt;    int i = 1E7;&lt;br /&gt;&lt;br /&gt;    /* Set CPU affinity */&lt;br /&gt;    CPU_ZERO(&amp;amp;cmask);&lt;br /&gt;    CPU_SET(n % N_CPUS, &amp;amp;cmask);&lt;br /&gt;    if (sched_setaffinity(pid, sizeof(cmask), &amp;amp;cmask) &amp;lt; 0) {&lt;br /&gt;        fprintf(stderr,&lt;br /&gt;            "could not set cpu affinity to core %d.\n", n % N_CPUS);&lt;br /&gt;        exit(1);&lt;br /&gt;    }&lt;br /&gt;    /*&lt;br /&gt;     * Acquire two locks and avoid deadlock; see also the dining&lt;br /&gt;     * philosopher problem.&lt;br /&gt;     */&lt;br /&gt;    if (n % 2) {&lt;br /&gt;        first = (n + 1) % N_CPUS;&lt;br /&gt;        second = n;&lt;br /&gt;    } else {&lt;br /&gt;        first = n;&lt;br /&gt;        second = (n + 1) % N_CPUS;&lt;br /&gt;    }&lt;br /&gt;    while (i) {&lt;br /&gt;        
lock(&amp;amp;shared[first * CACHELINE_SIZE]);&lt;br /&gt;        lock(&amp;amp;shared[second * CACHELINE_SIZE]);&lt;br /&gt;        i--;&lt;br /&gt;        unlock(&amp;amp;shared[second * CACHELINE_SIZE]);&lt;br /&gt;        unlock(&amp;amp;shared[first * CACHELINE_SIZE]);&lt;br /&gt;&lt;br /&gt;    }&lt;br /&gt;    return NULL;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;int main(int argc, char **argv)&lt;br /&gt;{&lt;br /&gt;    struct sched_param sched;&lt;br /&gt;    int i;&lt;br /&gt;&lt;br /&gt;    fprintf(stdout, "running on %d cpus\n", N_CPUS);&lt;br /&gt;    sched.sched_priority = sched_get_priority_max(SCHED_RR);&lt;br /&gt;    if (sched_setscheduler(0, SCHED_RR, &amp;amp;sched) == -1) {&lt;br /&gt;        perror("error setting SCHED_RR");&lt;br /&gt;        return -1;&lt;br /&gt;    }&lt;br /&gt;    for (i = 0; i &amp;lt; N_CPUS; i++) {&lt;br /&gt;        input[i] = i;&lt;br /&gt;        if (pthread_create(&amp;amp;threads[i], NULL, thread, &amp;amp;input[i]) &amp;lt; 0) {&lt;br /&gt;            perror("pthread_create failed");&lt;br /&gt;            exit(1);&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;    for (i = 0; i &amp;lt; N_CPUS; i++)&lt;br /&gt;        pthread_join(threads[i], NULL);&lt;br /&gt;&lt;br /&gt;    return 0;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;And here some results of a run in my laptop with Intel(R) Core(TM)2 CPU U7600 @ 1.20GHz:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;=== USER_SPINLOCK=0 (use pthread_mutex) ===&lt;br /&gt;$ /usr/bin/time -v sudo ./userspace-spinlock &lt;br /&gt;running on 2 cpus&lt;br /&gt;        Command being timed: "sudo ./userspace-spinlock"&lt;br /&gt;        User time (seconds): 17.01&lt;br /&gt;        System time (seconds): 17.09&lt;br /&gt;        Percent of CPU this job got: 190%&lt;br /&gt;        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:17.91&lt;br /&gt;        Average shared text size (kbytes): 0&lt;br /&gt;        Average unshared data size (kbytes): 0&lt;br /&gt;        Average 
stack size (kbytes): 0&lt;br /&gt;        Average total size (kbytes): 0&lt;br /&gt;        Maximum resident set size (kbytes): 0&lt;br /&gt;        Average resident set size (kbytes): 0&lt;br /&gt;        Major (requiring I/O) page faults: 0&lt;br /&gt;        Minor (reclaiming a frame) page faults: 658&lt;br /&gt;        Voluntary context switches: 8426&lt;br /&gt;        Involuntary context switches: 31&lt;br /&gt;        Swaps: 0&lt;br /&gt;        File system inputs: 0&lt;br /&gt;        File system outputs: 0&lt;br /&gt;        Socket messages sent: 0&lt;br /&gt;        Socket messages received: 0&lt;br /&gt;        Signals delivered: 0&lt;br /&gt;        Page size (bytes): 4096&lt;br /&gt;        Exit status: 0&lt;br /&gt;&lt;br /&gt;=== USER_SPINLOCK=1 (use user-space spinlocks) ===&lt;br /&gt;$ /usr/bin/time -v sudo ./userspace-spinlock &lt;br /&gt;running on 2 cpus&lt;br /&gt;        Command being timed: "sudo ./userspace-spinlock"&lt;br /&gt;        User time (seconds): 13.12&lt;br /&gt;        System time (seconds): 0.04&lt;br /&gt;        Percent of CPU this job got: 191%&lt;br /&gt;        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.89&lt;br /&gt;        Average shared text size (kbytes): 0&lt;br /&gt;        Average unshared data size (kbytes): 0&lt;br /&gt;        Average stack size (kbytes): 0&lt;br /&gt;        Average total size (kbytes): 0&lt;br /&gt;        Maximum resident set size (kbytes): 0&lt;br /&gt;        Average resident set size (kbytes): 0&lt;br /&gt;        Major (requiring I/O) page faults: 0&lt;br /&gt;        Minor (reclaiming a frame) page faults: 657&lt;br /&gt;        Voluntary context switches: 3&lt;br /&gt;        Involuntary context switches: 17&lt;br /&gt;        Swaps: 0&lt;br /&gt;        File system inputs: 0&lt;br /&gt;        File system outputs: 0&lt;br /&gt;        Socket messages sent: 0&lt;br /&gt;        Socket messages received: 0&lt;br /&gt;        Signals delivered: 0&lt;br /&gt;        Page size 
(bytes): 4096&lt;br /&gt;        Exit status: 0&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The elapsed time with traditional pthread_mutex locking is 17.91 sec; with user-space spinlocks the execution needs only 6.89 sec! This means a speed-up of ~2.6 (17.91 / 6.89)!&lt;br /&gt;&lt;br /&gt;As said above, a disadvantage is the spinning of the CPUs, which leads to greater power consumption. This behaviour can also be seen by looking at the voluntary context switches: 8426 in the pthread_mutex case and only 3 with user-space spinlocks (note: if you look at the code you can see that we're running real-time threads, which is the reason for this really low number of context switches). This also means that the whole system is much more reactive if the applications use pthread_mutex primitives, but in cases where reactiveness is not our goal (or we only care about the reactiveness of a few specific applications) we know that we can achieve a good speed-up of our parallel applications using user-space spinlocks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4621337164116452580?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4621337164116452580/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4621337164116452580' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4621337164116452580'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4621337164116452580'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/03/performance-improvement-of-parallel.html' title='Performance improvement of parallel applications with user-space 
spinlocks'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-8132690076524729588</id><published>2009-01-21T09:58:00.005+01:00</published><updated>2009-01-21T10:30:37.483+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='framebuffer'/><category scheme='http://www.blogger.com/atom/ns#' term='deadlock'/><category scheme='http://www.blogger.com/atom/ns#' term='kernel'/><title type='text'>potential framebuffer deadlock</title><content type='html'>In the latest kernel (2.6.29-rc2) there's a potential deadlock condition in the frame buffer between fb_ioctl() and fb_mmap().&lt;br /&gt;&lt;br /&gt;The cause is that fb_mmap() is called with mm-&gt;mmap_sem (A) held and then acquires fb_info-&gt;lock (B), while fb_ioctl() takes fb_info-&gt;lock (B) and calls copy_from/to_user(), which might acquire mm-&gt;mmap_sem (A) if a page fault occurs.&lt;br /&gt;&lt;br /&gt;So we have a classic deadlock condition: a process holds the lock A and attempts to obtain the lock B, but B is already held by a second process that attempts to lock A.&lt;br /&gt;&lt;br /&gt;A possible fix is to prevent the deadlock condition: "push down" the mutex fb_info-&gt;lock into the fb_ioctl() implementation, so that no page fault can occur with fb_info-&gt;lock (B) held. But this also requires defining two basic primitives, i.e. lock/unlock_fb_info(), and using them appropriately inside *all* the framebuffer drivers. 
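&lt;br /&gt;&lt;br /&gt;To make the lock ordering idea more concrete, here is a minimal user-space sketch in Python (not kernel code, and not the actual patch). The names lock_a, lock_b, copy_from_user(), mmap_path() and ioctl_path() are hypothetical stand-ins for mmap_sem (A), the fb_info lock (B) and the two kernel paths. The only point is that the ioctl path does the user copy, which may need A, before taking B, so B is never held while waiting for A.&lt;br /&gt;&lt;pre&gt;
```python
import threading

# Hypothetical stand-ins for the two kernel locks discussed above.
lock_a = threading.Lock()  # plays the role of mmap_sem (A)
lock_b = threading.Lock()  # plays the role of the fb_info lock (B)
log = []


def copy_from_user(value):
    # A user copy may "page fault"; the fault handler needs A.
    with lock_a:
        return value


def mmap_path():
    # Like the mmap path: entered with A held, then takes B.
    with lock_a:
        with lock_b:
            log.append("mmap")


def ioctl_path(arg):
    # Fixed ordering: finish the user copy first, then take B.
    # B is never held while this thread waits for A.
    data = copy_from_user(arg)
    with lock_b:
        log.append("ioctl:" + str(data))


def run(rounds=1000):
    # Run both paths concurrently and return the number of completed calls.
    log.clear()

    def mmap_worker():
        for _ in range(rounds):
            mmap_path()

    def ioctl_worker():
        for i in range(rounds):
            ioctl_path(i)

    workers = [threading.Thread(target=mmap_worker),
               threading.Thread(target=ioctl_worker)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return len(log)


if __name__ == "__main__":
    print(run())
```
&lt;/pre&gt;With this ordering both threads always finish. If ioctl_path() called copy_from_user() while holding lock_b instead, the two paths could end up waiting for each other forever.&lt;br /&gt;&lt;br /&gt;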
For now I've tried to fix at least the main fb_ioctl() function (with the common ioctl ops) that is shared by all the drivers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-8132690076524729588?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/8132690076524729588/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=8132690076524729588' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8132690076524729588'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8132690076524729588'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/01/framebuffer-deadlock.html' title='potential framebuffer deadlock'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-6154786863397845203</id><published>2009-01-21T09:33:00.002+01:00</published><updated>2009-01-21T09:55:37.697+01:00</updated><title type='text'>systemimager-light: opening a new branch?</title><content type='html'>I wonder if it's worth to open a new SystemImager branch without shipping the standard BOEL kernel anymore. Providing the packages for all the supported architectures requires a *huge* amount of time and it's not possible for me to build everything... 
:( (also because I don't have all the required architectures). So, probably the best way to proceed is to just remove BOEL and make UYOK the default. UYOK makes it possible to create a boot&amp;install package (kernel+initrd.img) for SystemImager using the kernel shipped in any distribution (preferably the kernel shipped with the distribution we want to install) and an initrd_template (shipped with the SystemImager packages). And with the UYOK-only version building the packages is 10 times faster!&lt;br /&gt;&lt;br /&gt;I've just opened the new branch locally on my PC using a git repository and pushed the initial release to download.systemimager.org (to get it: git-clone git://download.systemimager.org/local/git/systemimager-light), but the server is down again! :( grrr.... it seems we need a newer and more powerful server...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-6154786863397845203?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/6154786863397845203/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=6154786863397845203' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/6154786863397845203'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/6154786863397845203'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2009/01/systemimager-light-opening-new-branch.html' title='systemimager-light: opening a new branch?'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-7520454275424840738</id><published>2008-12-15T21:50:00.013+01:00</published><updated>2008-12-16T10:24:37.510+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='gcc'/><category scheme='http://www.blogger.com/atom/ns#' term='cache'/><title type='text'>cache line bouncing</title><content type='html'>Usually we don't realize how expensive cacheline bouncing is in parallel systems. The following simple example evaluates its cost.&lt;br /&gt;&lt;br /&gt;A multi-threaded application uses shared data in some of its threads:&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;struct shared_data_struct {&lt;br /&gt; unsigned int data1;&lt;br /&gt; unsigned int data2;&lt;br /&gt;};&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;Suppose data1 is used only by thread1 and data2 is used only by thread2. A natural way to optimize it is to pack the data together in order to reduce the size of the application, thus maximizing the amount of memory that fits into the cache.&lt;br /&gt;&lt;br /&gt;Unfortunately, this inevitably leads to poor performance: if both threads write to their own memory locations, each write requires the cache line to be in exclusive state in the L1 data cache (L1D) of the writing core, so the line keeps bouncing between the cores and this generates a large cache coherency overhead.&lt;br /&gt;&lt;br /&gt;For example, in the Intel Core 2 processor the cacheline size is 64 bytes (this can be retrieved using the command `getconf LEVEL1_DCACHE_LINESIZE` from the shell), so in the example above both data1 and data2 share the same L1D cacheline, even though they appear to be independent memory locations.&lt;br /&gt;&lt;br /&gt;We can measure the cost of the cacheline bounces using oprofile and a simple example. 
In cache-parallel.c (see the code below) we have the same `struct shared_data_struct` with an optional pad, depending on the DISTINCT_CACHE_LINES symbol.&lt;br /&gt;&lt;br /&gt;Let's see what happens without the pad, commenting out the #define DISTINCT_CACHE_LINES:&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;Configure oprofile to count L1D cache misses:&lt;br /&gt;$ sudo opcontrol --setup --event=L1D_PEND_MISS:500&lt;br /&gt;&lt;br /&gt;Start oprofile:&lt;br /&gt;$ sudo opcontrol -s&lt;br /&gt;Using 2.6+ OProfile kernel interface.&lt;br /&gt;Reading module info.&lt;br /&gt;Using log file /var/lib/oprofile/samples/oprofiled.log&lt;br /&gt;Daemon started.&lt;br /&gt;Profiler running.&lt;br /&gt;&lt;br /&gt;Run cache-parallel _without_ the pad in shared_data_struct:&lt;br /&gt;$ time ./cache-parallel&lt;br /&gt;...&lt;br /&gt;real    0m29.274s&lt;br /&gt;user    0m47.540s&lt;br /&gt;sys     0m0.121s&lt;br /&gt;&lt;br /&gt;Stop oprofile:&lt;br /&gt;$ sudo opcontrol -h&lt;br /&gt;Stopping profiling.&lt;br /&gt;Killing daemon.&lt;br /&gt;&lt;br /&gt;And see the results:&lt;br /&gt;$ opannotate --source ./cache-parallel | grep data[12]++&lt;br /&gt;1752  0.7326 :                sd-&gt;data1++;&lt;br /&gt;237090 99.1361 :                sd-&gt;data2++;&lt;br /&gt;^^^^^^&lt;br /&gt;|&lt;br /&gt;A lot of misses here!&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;If we add the pad (defining DISTINCT_CACHE_LINES):&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;$ time ./cache-parallel-with-pad&lt;br /&gt;...&lt;br /&gt;real    0m11.330s&lt;br /&gt;user    0m20.686s&lt;br /&gt;sys     0m0.037s&lt;br /&gt;&lt;br /&gt;$ opannotate --source ./cache-parallel-with-pad | grep data[12]++&lt;br /&gt;49 43.7500 :                sd-&gt;data1++;&lt;br /&gt;24 21.4286 :                sd-&gt;data2++;&lt;br /&gt;^^^^^^&lt;br /&gt;|&lt;br /&gt;Cache misses are dramatically reduced now!&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;A speedup of &lt;span style="font-weight: bold;"&gt;29.274 / 11.330 = 
2.583&lt;/span&gt;; in other words, in this case cacheline bouncing made the unpadded version about &lt;span style="font-weight: bold;"&gt;2.6 times slower&lt;/span&gt; (roughly 158% more wall-clock time)!&lt;br /&gt;&lt;br /&gt;Here is the source code of the cache-parallel example:&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;/*&lt;br /&gt;* cache-parallel.c&lt;br /&gt;*&lt;br /&gt;* build with:&lt;br /&gt;* gcc -DCACHELINE_SIZE=$(getconf LEVEL1_DCACHE_LINESIZE) -pthread -Wall \&lt;br /&gt;* -g -ocache-parallel cache-parallel.c&lt;br /&gt;*/&lt;br /&gt;&lt;br /&gt;#define _GNU_SOURCE&lt;br /&gt;&lt;br /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;#include &amp;lt;sched.h&amp;gt;&lt;br /&gt;#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;#include &amp;lt;pthread.h&amp;gt;&lt;br /&gt;#include &amp;lt;errno.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/types.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/syscall.h&amp;gt;&lt;br /&gt;&lt;br /&gt;#define unlikely(expr) __builtin_expect(!!(expr), 0)&lt;br /&gt;#define likely(expr) __builtin_expect(!!(expr), 1)&lt;br /&gt;&lt;br /&gt;#define __cacheline_aligned__ __attribute__((__aligned__(CACHELINE_SIZE)))&lt;br /&gt;&lt;br /&gt;#define LOOPS_MAX 2000000000&lt;br /&gt;#define STACK_SIZE 4096&lt;br /&gt;&lt;br /&gt;/*&lt;br /&gt;* From GETTID(2):&lt;br /&gt;*&lt;br /&gt;*   Glibc does not provide a wrapper for this system call; call it using&lt;br /&gt;*   syscall(2).&lt;br /&gt;*&lt;br /&gt;*/&lt;br /&gt;static inline pid_t gettid(void)&lt;br /&gt;{&lt;br /&gt;return syscall(SYS_gettid);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;/* XXX: comment this to see the effect of the cache line bouncing */&lt;br /&gt;#define DISTINCT_CACHE_LINES&lt;br /&gt;struct shared_data_struct {&lt;br /&gt;unsigned int data1;&lt;br /&gt;#ifdef DISTINCT_CACHE_LINES&lt;br /&gt;unsigned char pad[CACHELINE_SIZE - sizeof(unsigned int)];&lt;br /&gt;#endif&lt;br /&gt;unsigned int data2;&lt;br /&gt;};&lt;br /&gt;&lt;br /&gt;static struct shared_data_struct 
shared_data __cacheline_aligned__;&lt;br /&gt;&lt;br /&gt;static void dump_schedstats(pid_t pid, pid_t tid)&lt;br /&gt;{&lt;br /&gt;char buffer[256];&lt;br /&gt;char filename[64];&lt;br /&gt;FILE *f;&lt;br /&gt;&lt;br /&gt;snprintf(filename, sizeof(filename),&lt;br /&gt;"/proc/%d/task/%d/status", pid, tid);&lt;br /&gt;f = fopen(filename, "r");&lt;br /&gt;if (unlikely(f == NULL)) {&lt;br /&gt;perror("could not read scheduler statistics");&lt;br /&gt;exit(1);&lt;br /&gt;}&lt;br /&gt;while (fgets(buffer, sizeof(buffer), f))&lt;br /&gt;fprintf(stdout, "[%d:%d] %s", pid, tid, buffer);&lt;br /&gt;fclose(f);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static void *inc_first(void *arg)&lt;br /&gt;{&lt;br /&gt;struct shared_data_struct *sd = (struct shared_data_struct *)arg;&lt;br /&gt;pid_t pid = getpid(), tid = gettid();&lt;br /&gt;cpu_set_t cmask;&lt;br /&gt;register long i;&lt;br /&gt;&lt;br /&gt;/* set affinity */&lt;br /&gt;CPU_ZERO(&amp;amp;cmask);&lt;br /&gt;CPU_SET(0, &amp;amp;cmask);&lt;br /&gt;if (unlikely(sched_setaffinity(pid, sizeof(cmask), &amp;amp;cmask) &amp;lt; 0)) {&lt;br /&gt;perror("could not set cpu affinity for the child.");&lt;br /&gt;exit(1);&lt;br /&gt;}&lt;br /&gt;/* periodically increment first member of shared struct */&lt;br /&gt;for (i = 0; i &amp;lt; LOOPS_MAX; i++)&lt;br /&gt;sd-&amp;gt;data1++;&lt;br /&gt;dump_schedstats(pid, tid);&lt;br /&gt;&lt;br /&gt;return NULL;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static void *inc_second(void *arg)&lt;br /&gt;{&lt;br /&gt;struct shared_data_struct *sd = (struct shared_data_struct *)arg;&lt;br /&gt;pid_t pid = getpid(), tid = gettid();&lt;br /&gt;cpu_set_t cmask;&lt;br /&gt;register long i;&lt;br /&gt;&lt;br /&gt;/* set affinity */&lt;br /&gt;CPU_ZERO(&amp;amp;cmask);&lt;br /&gt;CPU_SET(1, &amp;amp;cmask);&lt;br /&gt;if (unlikely(sched_setaffinity(0, sizeof(cmask), &amp;amp;cmask) &amp;lt; 0)) {&lt;br /&gt;perror("could not set cpu affinity for current process.");&lt;br /&gt;exit(1);&lt;br /&gt;}&lt;br /&gt;/* 
periodically increment second member of shared struct */&lt;br /&gt;for (i = 0; i &amp;lt; LOOPS_MAX; i++)&lt;br /&gt;sd-&amp;gt;data2++;&lt;br /&gt;dump_schedstats(pid, tid);&lt;br /&gt;&lt;br /&gt;return NULL;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;int main(int argc, char **argv)&lt;br /&gt;{&lt;br /&gt;void *child_stack;&lt;br /&gt;pthread_t child_thr;&lt;br /&gt;&lt;br /&gt;/* allocate memory for other process to execute in */&lt;br /&gt;if (unlikely((child_stack = malloc(STACK_SIZE)) == NULL)) {&lt;br /&gt;perror("cannot allocate stack for child");&lt;br /&gt;exit(1);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;/* create the child (pthread_create returns a positive error number on failure) */&lt;br /&gt;if (unlikely(pthread_create(&amp;amp;child_thr, NULL,&lt;br /&gt;&amp;amp;inc_second, &amp;amp;shared_data) != 0)) {&lt;br /&gt;perror("pthread_create failed");&lt;br /&gt;exit(1);&lt;br /&gt;}&lt;br /&gt;inc_first((void *)&amp;amp;shared_data);&lt;br /&gt;pthread_join(child_thr, NULL);&lt;br /&gt;&lt;br /&gt;return 0;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-7520454275424840738?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/7520454275424840738/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=7520454275424840738' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7520454275424840738'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7520454275424840738'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/12/cacheline-bouncing.html' title='cache line bouncing'/><author><name>Andrea 
Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-551035796941115757</id><published>2008-10-27T14:45:00.002+01:00</published><updated>2008-10-27T16:00:00.076+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linuxday'/><category scheme='http://www.blogger.com/atom/ns#' term='SystemImager'/><title type='text'>SystemImager @ LinuxDay 2008 in Ferrara</title><content type='html'>I presented a quick overview of SystemImager, how it works and typical use cases, at the LinuxDay 2008 in Ferrara. The slides are available &lt;a href="http://download.systemimager.org/%7Earighi/doc/SystemImager-LinuxDay-2008-Ferrara.pdf"&gt;here&lt;/a&gt;. There are also some nice &lt;a href="http://linuxday.ferrara.linux.it/2008/album"&gt;pictures&lt;/a&gt; of the event.&lt;br /&gt;&lt;br /&gt;After my talk Andrea Arcangeli gave an interesting talk about recent core kernel features (especially mmu_notifier) that allow &lt;a href="http://kvm.qumranet.com/kvmwiki"&gt;KVM&lt;/a&gt; to reliably swap guest-mapped pages (without an mmu_notifier, pages mapped in a secondary MMU are pinned and cannot be swapped), perform ballooning and save host memory with KSM (which maps different virtual addresses to common pages shared between different VMs).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-551035796941115757?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/551035796941115757/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=551035796941115757' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/551035796941115757'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/551035796941115757'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/10/systemimager-linuxday-2008-in-ferrara.html' title='SystemImager @ LinuxDay 2008 in Ferrara'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-6495258213706391449</id><published>2008-10-15T22:16:00.002+02:00</published><updated>2008-10-15T22:17:53.559+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='homepage'/><category scheme='http://www.blogger.com/atom/ns#' term='website'/><title type='text'>a new website</title><content type='html'>I've finished writing my new &lt;a href="http://www.dii.unisi.it/%7Erighi/"&gt;homepage&lt;/a&gt; hosted at dii.unisi.it... a plain and essential website (that is always the best choice IMHO) fully created with &lt;a href="http://www.vim.org/"&gt;vim&lt;/a&gt;. It has been a long time since I wrote a website from scratch, and vim is still my preferred HTML editor! 
:)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-6495258213706391449?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/6495258213706391449/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=6495258213706391449' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/6495258213706391449'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/6495258213706391449'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/10/new-website.html' title='a new website'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-1073058308204937717</id><published>2008-10-08T10:35:00.004+02:00</published><updated>2009-09-03T16:50:41.186+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='page writeback'/><title type='text'>fine-grained dirty_ratio and dirty_background_ratio</title><content type='html'>A process that writes something to a file generates dirty pages in the page cache. 
Dirty pages must be kept in sync with their backing store (the file defined on the block device).&lt;br /&gt;&lt;br /&gt;In the Linux kernel the frequency of dirty page writeback is controlled by two parameters: vm.dirty_ratio and vm.dirty_background_ratio. Both are expressed as a percentage of dirtyable memory, that is the free memory + reclaimable memory (active and inactive pages in the LRU list).&lt;br /&gt;&lt;br /&gt;The first parameter controls when a process will itself start writing out dirty data; the second controls when the kernel thread [pdflush] is woken up to start writing out dirty data globally on behalf of the processes (dirty_background_ratio is always less than dirty_ratio; if dirty_background_ratio &gt;= dirty_ratio the kernel automatically sets it to dirty_ratio / 2).&lt;br /&gt;&lt;br /&gt;Unfortunately, both percentages are integers and the kernel doesn't even allow setting them below 5%. This means that on large-memory machines those limits are too coarse. On a machine that has 1GB of dirtyable memory the kernel will start to write back dirty pages in chunks of 50MB (!!!) minimum (with dirty_ratio = 5).&lt;br /&gt;&lt;br /&gt;Even if this could be fine for batch or server machines, this behaviour can be unpleasant in desktop or latency-sensitive environments, where the large writeback can be perceived as a lack of responsiveness in the whole system.&lt;br /&gt;&lt;br /&gt;IMHO we really need an interface to define fine-grained limits (to write back small amounts of data, often) and the best solution that doesn't break compatibility with the old interface seems to be introducing a new interface to define &lt;a href="http://en.wikipedia.org/wiki/Per_cent_mille"&gt;pcm&lt;/a&gt; (milli-percent) values.&lt;br /&gt;&lt;br /&gt;At least this would resolve the problem for today's machines... 
until 1TB memory servers become popular...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-1073058308204937717?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/1073058308204937717/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=1073058308204937717' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1073058308204937717'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1073058308204937717'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/10/fine-grained-dirtyratio-and.html' title='fine-grained dirty_ratio and dirty_background_ratio'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-8151683169599437473</id><published>2008-10-07T10:11:00.003+02:00</published><updated>2008-10-07T10:42:18.307+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SystemImager'/><category scheme='http://www.blogger.com/atom/ns#' term='CINECA'/><category scheme='http://www.blogger.com/atom/ns#' term='BitTorrent'/><category scheme='http://www.blogger.com/atom/ns#' term='BCX'/><title type='text'>SystemImager @ CINECA</title><content type='html'>An &lt;a href="http://www.cineca.it/pubblicazioni/notiziario/systemImager.pdf"&gt;article&lt;/a&gt; I wrote 
(only in Italian, sorry) for CINECA news about SystemImager and the advantages of the BitTorrent transport, documenting the installation of the whole BCX cluster (1290 nodes).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-8151683169599437473?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/8151683169599437473/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=8151683169599437473' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8151683169599437473'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8151683169599437473'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/10/systemimager-cineca.html' title='SystemImager @ CINECA'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-7759278521375583684</id><published>2008-10-04T12:15:00.011+02:00</published><updated>2008-10-07T10:11:33.000+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cgroup'/><category scheme='http://www.blogger.com/atom/ns#' term='kernel'/><category scheme='http://www.blogger.com/atom/ns#' term='IO controller'/><title type='text'>cgroup I/O bandwidth controller results</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} 
catch(e) {}" href="http://download.systemimager.org/%7Earighi/linux/patches/io-throttle/benchmark/graph/cgroup-io-throttle-bandwidth.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px;" src="http://download.systemimager.org/%7Earighi/linux/patches/io-throttle/benchmark/graph/cgroup-io-throttle-bandwidth.png" alt="" border="0" /&gt;&lt;/a&gt;Here are some experimental results of the tests I ran on my box with the latest version of my &lt;span style="font-style: italic;"&gt;cgroup-io-throttle&lt;/span&gt; patchset against the -mm kernel (&lt;span style="font-style: italic;"&gt;2.6.27-rc5-mm1&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;The goal of this test is to demonstrate the effectiveness of applying a throttling controller to enhance the IO performance predictability in a shared system.&lt;br /&gt;&lt;br /&gt;The graph highlights the bursty behaviour of the IO rate that we have with plain CFQ IO scheduling (&lt;span style="font-style: italic;"&gt;red line&lt;/span&gt;), and the smoother, more contained behaviour obtained with &lt;span style="font-style: italic;"&gt;cgroup-io-throttle&lt;/span&gt;. 
We can also see the differences between the leaky bucket (&lt;span style="font-style: italic;"&gt;green line&lt;/span&gt;) and token bucket (&lt;span style="font-style: italic;"&gt;blue line&lt;/span&gt;) policies: the first is even smoother and gives a better guarantee of respecting the IO limit (hard limit); the second allows a small degree of irregularity (soft limit), but is more efficient, especially at high IO rates.&lt;br /&gt;&lt;br /&gt;See my &lt;a href="http://arighi.blogspot.com/2008/07/cgroup-block-device-io-bandwidth.html"&gt;previous post&lt;/a&gt; for an overview of the advantages this controller could provide.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-7759278521375583684?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/7759278521375583684/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=7759278521375583684' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7759278521375583684'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7759278521375583684'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/10/cgroup-io-bandwidth-controller.html' title='cgroup I/O bandwidth controller results'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-1566286261565398024</id><published>2008-07-13T11:46:00.002+02:00</published><updated>2008-07-13T12:04:18.400+02:00</updated><title type='text'>cgroup: block device i/o bandwidth controller</title><content type='html'>I've posted to the LKML (&lt;a href="http://lkml.org/lkml/2008/7/12/55"&gt;[1]&lt;/a&gt;, &lt;a href="http://lkml.org/lkml/2008/7/12/56"&gt;[2]&lt;/a&gt;, &lt;a href="http://lkml.org/lkml/2008/7/12/57"&gt;[3]&lt;/a&gt;, &lt;a href="http://lkml.org/lkml/2008/7/12/58"&gt;[4]&lt;/a&gt;) the latest version of my patch to provide I/O bandwidth control for Linux &lt;a href="http://lwn.net/Articles/256389/"&gt;cgroups&lt;/a&gt;. The objective of this patch is to improve I/O performance predictability of different cgroups sharing the same block devices.&lt;br /&gt;&lt;br /&gt;Some advantages of providing this feature:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;It allows I/O traffic shaping for block devices shared among different cgroups&lt;/li&gt;&lt;li&gt;I/O performance predictability makes it easier to satisfy the timing requirements of real-time applications&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Limiting rules do not depend on the particular I/O scheduler (anticipatory, deadline, CFQ, noop) and/or the type of the underlying block devices&lt;/li&gt;&lt;li&gt;The bandwidth limitations are guaranteed for both synchronous and asynchronous operations, including I/O that passes through the page cache or buffers, not only direct I/O&lt;/li&gt;&lt;li&gt;It is possible to implement a simple user-space application to dynamically adjust the I/O workload of different process containers at run-time, according to the particular users' requirements and applications' performance constraints&lt;/li&gt;&lt;li&gt;It is even possible to implement event-based performance throttling mechanisms; for 
example the same user-space application could actively throttle the I/O bandwidth to reduce power consumption when the battery of a mobile device is running low (power throttling) or when the temperature of a hardware component is too high (thermal throttling)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-1566286261565398024?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/1566286261565398024/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=1566286261565398024' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1566286261565398024'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1566286261565398024'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/07/cgroup-block-device-io-bandwidth.html' title='cgroup: block device i/o bandwidth controller'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-7342022351248323104</id><published>2008-05-28T16:51:00.007+02:00</published><updated>2008-05-29T14:24:55.784+02:00</updated><title type='text'>task i/o accounting in Linux: provide distinct tgid/tid i/o statistics</title><content type='html'>Here is a new patch sent to the LKML. 
This patch (&lt;a href="http://lwn.net/Articles/283937/"&gt;coverage at LWN.net&lt;/a&gt;) properly accounts the i/o statistics of threaded applications (tasks that spawn threads via the &lt;span style="font-style: italic;"&gt;clone()&lt;/span&gt; syscall with the &lt;span style="font-style: italic;"&gt;CLONE_THREAD&lt;/span&gt; flag).&lt;br /&gt;&lt;br /&gt;Without this fix a generic application can bypass the accounting of its actual i/o statistics by creating a bunch of threads that do the real i/o work.&lt;br /&gt;&lt;br /&gt;The patch accounts aggregate i/o statistics in /proc/PID/io, so the top i/o consumer can be found simply by looking at this file, even when PID spawns i/o worker threads. Moreover, a new file /proc/PID/task/TID/io (one for each TID) is created in procfs to account per-thread i/o statistics, which makes it possible to find which thread(s) of the top i/o consumer are eating the whole i/o bandwidth.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-7342022351248323104?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/7342022351248323104/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=7342022351248323104' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7342022351248323104'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/7342022351248323104'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/05/task-io-accounting-in-linux-provide.html' title='task i/o accounting in Linux: provide distinct tgid/tid i/o statistics'/><author><name>Andrea 
Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-3360834624409133325</id><published>2008-05-08T23:16:00.004+02:00</published><updated>2009-12-15T15:06:12.551+01:00</updated><title type='text'>Recover data from hexdump output in the console</title><content type='html'>Sooner or later it happens: you wipe some blocks of a disk with a wrong "dd" executed as root... :-) Ok, if you were lucky enough to "hexdump -v -C" those blocks to the console beforehand, you can just copy&amp;amp;paste the dumped output and send it to the standard input of this perl one-liner:&lt;br /&gt;&lt;pre&gt;&amp;nbsp;&lt;/pre&gt;&lt;pre&gt;map { map { print pack('H[2]', $_) } split(/\s+/, $1)&lt;br /&gt;if ((/^\w+\s+((\w{2}\s+){1,16})\s+|.*$/i) &amp;amp;&amp;amp; ($1)) } &lt;STDIN&gt;&lt;/pre&gt;&lt;pre&gt;&amp;nbsp;&lt;/pre&gt;The corresponding binary data can be taken from the standard output. That's it! 
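&lt;br /&gt;&lt;br /&gt;For readers who prefer Python, the same recovery can be sketched roughly as follows. This is only an illustration and not the original one-liner: the function name hexdump_to_bytes and the regular expression are my own assumptions, and the sketch expects the canonical "hexdump -v -C" layout (an offset, up to 16 two-digit hex bytes, then the ASCII column between pipes).&lt;br /&gt;

```python
import re

# One "hexdump -C" line: offset, 1..16 two-digit hex bytes, then the ASCII column.
# The pattern is an assumption about the canonical layout, not part of hexdump itself.
LINE_RE = re.compile(r"^[0-9a-fA-F]+\s+((?:[0-9a-fA-F]{2}\s+){1,16})")


def hexdump_to_bytes(text):
    """Rebuild the raw bytes from pasted 'hexdump -v -C' output."""
    out = bytearray()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # Skip the final offset-only line, blank lines and shell prompts.
            continue
        # Drop the whitespace between the hex pairs and decode them.
        out.extend(bytes.fromhex("".join(match.group(1).split())))
    return bytes(out)
```

To use it as a filter, pass the pasted dump through hexdump_to_bytes(sys.stdin.read()) and write the result to sys.stdout.buffer.&lt;br /&gt;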
:-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-3360834624409133325?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/3360834624409133325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=3360834624409133325' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3360834624409133325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3360834624409133325'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/05/recover-data-from-hexdump-output-in.html' title='Recover data from hexdump output in the console'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-8577150937707102793</id><published>2008-03-23T19:32:00.007+01:00</published><updated>2008-05-08T23:54:23.091+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='sparse file'/><category scheme='http://www.blogger.com/atom/ns#' term='rsync'/><title type='text'>rsync and sparse files</title><content type='html'>Unfortunately there's not a syscall in Linux to check if a file is &lt;a href="http://en.wikipedia.org/wiki/Sparse_file"&gt;sparse&lt;/a&gt; or not. 
There are some nice ideas to extend the lseek() syscall and implement such a feature in ZFS (&lt;a href="http://blogs.sun.com/bonwick/entry/seek_hole_and_seek_data"&gt;SEEK_HOLE and SEEK_DATA for sparse files&lt;/a&gt;), but there's nothing ready for production filesystems yet.&lt;br /&gt;&lt;br /&gt;The common approach for user space applications is to implement a heuristic to check if a file can be treated as sparse (and save disk space) or not (and just write bytes to disk).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.samba.org/rsync/"&gt;rsync&lt;/a&gt;, for example, checks every chunk of 1024 bytes before writing data to a generic destination file. If a chunk starts or ends with 0-s, these 0-s are just skipped by lseek().&lt;br /&gt;&lt;br /&gt;Unfortunately, in some filesystems, typically optimized for large sequential I/O throughput (like &lt;a href="http://www-03.ibm.com/systems/clusters/software/gpfs/index.html"&gt;IBM GPFS&lt;/a&gt;, &lt;a href="http://www-03.ibm.com/systems/storage/software/virtualization/sfs/index.html"&gt;IBM SAN FS&lt;/a&gt;, or distributed filesystems in general), a large number of lseek() operations can strongly impact performance.&lt;br /&gt;&lt;br /&gt;In these cases it can be very helpful to enlarge the block size used to handle sparse files.&lt;br /&gt;&lt;br /&gt;For example, using a sparse write size of 32KB, I've been able to increase the transfer rate by an order of magnitude when copying the output files of scientific applications from GPFS to GPFS or GPFS to SAN FS.&lt;br /&gt;&lt;br /&gt;Read &lt;a href="http://www.mail-archive.com/rsync@lists.samba.org/msg21246.html"&gt;this thread&lt;/a&gt; on the rsync mailing list.&lt;br /&gt;&lt;br /&gt;And &lt;a href="http://www.samba.org/ftp/rsync/dev/patches/sparse-block.diff"&gt;here is the patch&lt;/a&gt; to add the --sparse-block=SIZE option to rsync, which makes this parameter tunable at run-time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/4397409626710913610-8577150937707102793?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/8577150937707102793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=8577150937707102793' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8577150937707102793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8577150937707102793'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/03/rsync-and-sparse-files.html' title='rsync and sparse files'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-193356294204897340</id><published>2008-03-22T13:18:00.006+01:00</published><updated>2008-03-22T15:12:50.998+01:00</updated><title type='text'>Xen vs KVM/QEMU for testing SystemImager installs</title><content type='html'>Xen is great, but for SystemImager testing I've found many advantages using KVM/QEMU.&lt;br /&gt;&lt;br /&gt;First of all, I don't need to run all my applications in a dom0 (or domU), that means I don't have any virtualization overhead for various openoffice instances, thunderbird, firefox, kopete, lastfm, kernel builds, SystemImager builds, and all the common activities I do in my laptop. 
With KVM/QEMU only the guest OSes are affected by virtualization overhead (although KVM/QEMU has more I/O overhead than Xen guests). In fact, from this point of view KVM/QEMU is better suited for desktop machines.&lt;br /&gt;&lt;br /&gt;Second, to test a standard SystemImager install I always need to use the unmodified &lt;a href="http://wiki.systemimager.org/index.php/UYOK"&gt;BOEL or UYOK&lt;/a&gt; kernel, which means running Xen fully-virtualized guests (HVM mode), and these fall back to the QEMU hardware emulation model for I/O operations, aka "QEMU device manager" (qemu-dm), a patched version of the original qemu device emulation. So, in principle, there's not a big performance difference compared to running guest OSes with KVM/QEMU.&lt;br /&gt;&lt;br /&gt;Third, I really like the command line syntax of kvm/qemu. In principle the xm syntax of Xen is better, because it supports both a configuration file and command line parameters, but I always prefer the qemu syntax, probably because I'm more familiar with it.&lt;br /&gt;&lt;br /&gt;And what about VMWare? Well... I've not been able to create x86_64 guests with VMPlayer, and I've not even found anything on the web that pointed me in the right direction. 
I liked the NAT-only network configuration of VMPlayer, but I think I can survive also with the little more complex bridged network setup for QEMU/KVM or Xen.&lt;br /&gt;&lt;br /&gt;Here is my network config to start a KVM/QEMU virtual machine in Ubuntu 7.10:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;/etc/dbus-1/event.d/25NetworkManager stop&lt;br /&gt;&lt;br /&gt;modprobe tun&lt;br /&gt;modprobe kvm-intel&lt;br /&gt;&lt;br /&gt;brctl addbr br0&lt;br /&gt;ifconfig br0 up&lt;br /&gt;ifconfig eth0 up&lt;br /&gt;brctl addif br0 eth0&lt;br /&gt;&lt;br /&gt;tunctl -b -u 1000 -t qtap0&lt;br /&gt;brctl addif br0 qtap0&lt;br /&gt;ifconfig qtap0 up 0.0.0.0 promisc&lt;br /&gt;&lt;br /&gt;chmod a+rw /dev/net/tun&lt;br /&gt;chmod a+rw /dev/kvm&lt;br /&gt;&lt;br /&gt;kvm -cdrom /home/righiandr/systemimager.iso \&lt;br /&gt;        -hda /home/righiandr/xen/domains/Debian4/disk.img -boot d \&lt;br /&gt;        -m 512 -net nic,model=rtl8139 -net tap,ifname=qtap0,script=no -smp 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;NOTE: systemimager.iso is made by si_mkautoinstallcd.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-193356294204897340?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/193356294204897340/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=193356294204897340' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/193356294204897340'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/193356294204897340'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/03/xen-vs-kvmqemu-for-testing-systemimager.html' title='Xen vs KVM/QEMU for 
testing SystemImager installs'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5499224392000862797</id><published>2008-03-22T12:46:00.005+01:00</published><updated>2008-03-22T13:15:11.749+01:00</updated><title type='text'>Automating Xen VM deployment with SystemImager</title><content type='html'>Now that I own an Intel VT capable PC, I'm able to test any kind of &lt;a href="http://wiki.systemimager.org/index.php/Automating_Xen_VM_deployment_with_SystemImager"&gt;automatic Xen VM deployment with SystemImager&lt;/a&gt; directly from my laptop, even HVM Xen VMs.&lt;br /&gt;&lt;br /&gt;The next step will be to test the deployment of a Xen VM with a &lt;a href="http://wiki.systemimager.org/index.php/HOWTO_raw-disk_cloning_with_SystemImager"&gt;Windows raw-disk image&lt;/a&gt;, cloned by SystemImager from a VMWare (vmplayer) machine. With this we're finally able to cover all kinds of migrations: virtual-to-virtual (that is, migrations between the same or even different virtualization technologies), physical-to-virtual, virtual-to-physical, and physical-to-physical (obviously the last one was the main task of SystemImager). And not only with Linux! 
Anyway, at the moment, this still needs some manual tweaks (read the howtos above) and I would really like to spend some effort on implementing a SystemImager GUI to fully automate all these steps.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5499224392000862797?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5499224392000862797/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5499224392000862797' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5499224392000862797'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5499224392000862797'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/03/automating-xen-vm-deployment-with.html' title='Automating Xen VM deployment with SystemImager'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-3446238724407540574</id><published>2008-03-16T15:37:00.004+01:00</published><updated>2008-03-16T16:08:55.544+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='picasa'/><title type='text'>picasa upload &amp; download</title><content type='html'>After uploading a lot of photos to picasa web albums via this small script I wrote (&lt;a 
href="http://download.systemimager.org/%7Earighi/picasaupload/picasaupload.py"&gt;picasaupload&lt;/a&gt;), I realized that there's no "magic key" to download the pictures back to my pc. So, here is another script that works the other way around: &lt;a href="http://download.systemimager.org/%7Earighi/picasaupload/picasadownload"&gt;picasadownload&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;As suggested by the name, this script downloads images from a Google Picasa web album, passed as a command line argument.&lt;br /&gt;&lt;br /&gt;For example, to download the pictures I took at Celsa, a wonderful castle &lt;a href="http://picasaweb.google.com/righi.andrea/Celsa/photo#map"&gt;near Siena&lt;/a&gt;: &lt;pre&gt;picasadownload http://picasaweb.google.com/righi.andrea/Celsa&lt;/pre&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-3446238724407540574?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/3446238724407540574/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=3446238724407540574' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3446238724407540574'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3446238724407540574'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/03/picasa-upload-download.html' title='picasa upload &amp; download'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-1365994348669171387</id><published>2008-03-13T14:18:00.002+01:00</published><updated>2008-03-13T14:55:53.804+01:00</updated><title type='text'>A new notebook...</title><content type='html'>After carrying around my notebook with all my books and notes I finally considered to buy an ultraportable model: a &lt;a href="http://www.dell.com/content/products/productdetails.aspx/latit_d430?c=us&amp;amp;l=en&amp;amp;s=biz&amp;amp;cs=555"&gt;Dell Latitude D430&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;My specific model is configured as follows:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Intel U7600 Ultra Low Voltage (ULV) Core 2 Duo&lt;/li&gt;&lt;li&gt;2GB &lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;DDR2 533MHz SDRAM (1 x 1GB + 1GB integrated)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;12.1" WXGA Display (1280 x 800)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;80GB Toshiba 5200RPM 1.8'' HDD (I'll move to &lt;/span&gt;&lt;/span&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;a SSD soon ;-))&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;MediaBase 8x DVD+/-RW Drive - W Euro&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;Intel 4965AGN wireless card&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span name="intelliTxt" id="intelliTxt"&gt;&lt;span&gt;FreeDOS inside! Woo-hoo!!! (&lt;span style="font-style: italic;"&gt;H.Simpson accent&lt;/span&gt;) no Microcrap included! 
:-) just reinstalled with Ubuntu 7.10.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;And now I can finally build x86_64 &lt;a href="http://www.mail-archive.com/sisuite-users@lists.sourceforge.net/msg04351.html"&gt;SystemImager&lt;/a&gt;&lt;a href="http://www.mail-archive.com/sisuite-users@lists.sourceforge.net/msg04351.html"&gt; packages&lt;/a&gt; at home (or better everywhere) and play with Xen and a processor with Intel VT extensions!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-1365994348669171387?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/1365994348669171387/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=1365994348669171387' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1365994348669171387'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/1365994348669171387'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/03/new-notebook.html' title='A new notebook...'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4668994607269579751</id><published>2008-02-16T22:23:00.003+01:00</published><updated>2008-02-16T22:40:51.740+01:00</updated><title type='text'>SystemImager cluster map</title><content type='html'>Finally we 
officially opened the &lt;a href="http://www.systemimager.org/cluster-register/"&gt;SystemImager cluster map&lt;/a&gt;, a Google Maps integrated tool to keep track of who's actually using &lt;a href="http://www.systemimager.org"&gt;SystemImager&lt;/a&gt; around the world. It's awesome, isn't it? :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4668994607269579751?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4668994607269579751/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4668994607269579751' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4668994607269579751'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4668994607269579751'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/02/systemimager-cluster-map.html' title='SystemImager cluster map'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5568392517002681662</id><published>2008-01-17T21:59:00.000+01:00</published><updated>2008-01-17T23:11:28.839+01:00</updated><title type='text'>Linux: I/O throttling (again)</title><content type='html'>I've improved my &lt;a href="http://lwn.net/Articles/264770/"&gt;per-task I/O throttling patch&lt;/a&gt; to support &lt;a 
href="http://lwn.net/Articles/265209/"&gt;per-uid/gid I/O &lt;/a&gt;&lt;a href="http://lwn.net/Articles/265209/"&gt;throttling&lt;/a&gt;. As reported in the patch description:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Allow to limit the I/O bandwidth for specific uid(s) or gid(s) imposing&lt;br /&gt;additional delays on those processes that exceed the limits defined in a&lt;br /&gt;configfs tree.&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;A typical use of this patch could be on a shared Linux system under heavy load condition due to I/O-intensive applications. In this scenario it's possible to assign a different amount of available I/O bandwidth for each group or user (read also fair sharing, I/O shaping, etc.): for example 5MB/s to group A (students), 20MB/s to group B (professors), unlimited MB/s for user C (sysadmin), etc.&lt;br /&gt;&lt;br /&gt;But a vastely more interesting approach would be to implement a control group (&lt;a href="http://lwn.net/Articles/256389/"&gt;cgroup&lt;/a&gt;) based I/O throttling... and I've just started to work on this! 
;-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5568392517002681662?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5568392517002681662/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5568392517002681662' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5568392517002681662'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5568392517002681662'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/01/linux-io-throttling-again.html' title='Linux: I/O throttling (again)'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-160703368180080769</id><published>2008-01-13T17:08:00.000+01:00</published><updated>2008-01-13T17:43:29.352+01:00</updated><title type='text'>Linux: per-task I/O throttling</title><content type='html'>I've posted a &lt;a href="http://lkml.org/lkml/2008/1/10/482"&gt;patch&lt;/a&gt; on the LKML that allows to limit the I/O bandwidth per-task via /proc filesystem. 
Writing a value > 0 in /proc/&lt;span style="font-style: italic;"&gt;PID&lt;/span&gt;/io_throttle sets the upper bound of the I/O bandwidth (in 512-byte sectors per second) usable by the process &lt;span style="font-style: italic;"&gt;PID&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The patch itself is not really useful, since the same result can be obtained with &lt;a href="http://linux.die.net/man/1/ionice"&gt;ionice&lt;/a&gt; and a good I/O scheduler (like &lt;a href="http://en.wikipedia.org/wiki/CFQ"&gt;CFQ&lt;/a&gt;), but it's a very simple proof-of-concept showing that it's possible to implement a kind of UID/GID (or even process-container) based policy of I/O bandwidth shaping (like network bandwidth shaping).&lt;br /&gt;&lt;br /&gt;Anyway, right now I'm running my new 2.6.24-rc7-io-throttle kernel and using the following script to throttle the I/O consumption of the backup, which can now run in the background with a very small impact on my other applications. ;-)&lt;br /&gt;&lt;br /&gt;WARNING: obviously this script requires my kernel &lt;a href="http://lkml.org/lkml/2008/1/10/482"&gt;patch&lt;/a&gt; to work...&lt;br /&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;$ cat ~/bin/iothrottle&lt;br /&gt;#!/bin/sh&lt;br /&gt;&lt;br /&gt;[ $# -lt 2 ] &amp;amp;&amp;amp; echo "usage: $0 RATE CMD" &amp;amp;&amp;amp; exit 1&lt;br /&gt;&lt;br /&gt;rate=$1&lt;br /&gt;shift&lt;br /&gt;"$@" &amp;amp;&lt;br /&gt;trap "kill -9 $!" 
INT TERM&lt;br /&gt;[ -e /proc/$!/io_throttle ] &amp;amp;&amp;amp; echo $rate >/proc/$!/io_throttle&lt;br /&gt;wait $!&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-160703368180080769?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/160703368180080769/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=160703368180080769' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/160703368180080769'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/160703368180080769'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/01/ive-posted-patch-0-in-proc-pid.html' title='Linux: per-task I/O throttling'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5984297603629895722</id><published>2008-01-05T15:19:00.000+01:00</published><updated>2008-01-05T22:29:46.961+01:00</updated><title type='text'>PyGFS: implementing a distributed filesystem in python</title><content type='html'>In this post I try to explain how to implement a secure and robust &lt;a href="http://en.wikipedia.org/wiki/Distributed_file_system"&gt;distributed filesystem&lt;/a&gt; in user-space with python.&lt;br /&gt;&lt;br 
/&gt;The advantages of user-space are many: no kernel modification, no OS crashes due to buggy code, debugging is easy, etc. Moreover, from the development point of view, in user-space it's possible to exploit all the nice features provided by user-space libraries! It means that with a few lines of code we can provide a lot of interesting features.&lt;br /&gt;&lt;br /&gt;So, let's see some potential requirements for our filesystem:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;the filesystem must support a complete set of standard POSIX APIs,&lt;/li&gt;&lt;li&gt;as a distributed filesystem it must provide data accessibility to remote hosts,&lt;br /&gt;&lt;/li&gt;&lt;li&gt;it must be resilient to hardware or network failures,&lt;/li&gt;&lt;li&gt;it must be secure (it must provide authentication, authorization and encryption mechanisms to provide secure access over insecure networks).&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Even if the requirements seem to fit a long-term project, it's possible to satisfy all of them with a few lines of code. Let's see how.&lt;br /&gt;&lt;br /&gt;The user-space accessibility is provided by &lt;a href="http://fuse.sourceforge.net/"&gt;FUSE&lt;/a&gt;, which makes it possible to implement a full POSIX filesystem without any kernel changes (it provides all the required kernel APIs to register a filesystem without any kernel-space code). FUSE also provides a secure method for non-privileged users to mount their own filesystems.&lt;br /&gt;&lt;br /&gt;A distributed filesystem also needs a mechanism for communication (how to send data to the remote hosts). An interesting project that could help us with this is &lt;a href="http://pyro.sourceforge.net/"&gt;Pyro&lt;/a&gt;. Pyro lets us skip the development of a new network communication protocol, since it provides an elegant and easy-to-use object-oriented form of RPC. 
It also optionally supports x509 certificate encryption, which perfectly covers our security requirement.&lt;br /&gt;&lt;br /&gt;At this point the real filesystem implementation is quite easy: we can use a simple client-server approach like &lt;a href="http://nfs.sourceforge.net/"&gt;NFS&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The client wraps all the POSIX syscalls on the filesystem, as defined by the FUSE interface, and calls the equivalent OS routines on the remote server (using Pyro RPC); the server executes the OS procedures on the back-end filesystem and passes back to the client the same result returned by the OS syscall (executed on the server filesystem).&lt;br /&gt;&lt;br /&gt;Moreover, to provide reliability features it's possible to exploit the robust &lt;a href="http://docs.python.org/tut/node10.html"&gt;exception handling statements&lt;/a&gt; in python. In this way we can detect all the communication failures and call an appropriate event handler to re-issue the operations when the server becomes reachable again. We can also increase the reliability by using both client-side and server-side file handles; in this way each file handle on the client side can be mapped to a different file handle on the server side. If the server goes down, the mapping between the two file handles is simply re-initialized, and this lets the clients transparently continue their operations as if the server had never stopped.&lt;br /&gt;&lt;br /&gt;So, I tried to implement a real example of this filesystem and I've called it &lt;a href="http://sourceforge.net/projects/pygfs"&gt;PyGFS&lt;/a&gt; (it should be something like: python grid file system... in the future I'd like to improve it with multiple servers to mirror or unify several filesystems on different hosts, just like a real grid filesystem...). The source code is available to anyone interested in it... If you have ideas for new features, let me know... 
;-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5984297603629895722?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5984297603629895722/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5984297603629895722' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5984297603629895722'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5984297603629895722'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/01/pygfs-implementing-distributed.html' title='PyGFS: implementing a distributed filesystem in python'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5377209285541830388</id><published>2008-01-05T15:03:00.000+01:00</published><updated>2008-01-05T15:17:51.325+01:00</updated><title type='text'>SystemImager @ LinuxDay 2007 in Bologna</title><content type='html'>The slides of my talk "SystemImager and BitTorrent: a peer-to-peer approach to large scale OS deployment" presented at LinuxDay 2007 in Bologna are available &lt;a href="http://erlug.linux.it/linuxday/2007/"&gt;here&lt;/a&gt; (the site is in italian bug the slides are still in english).&lt;br /&gt;&lt;br /&gt;They're pretty the same slides on &lt;a 
href="http://download.systemimager.org/pub/docs/SystemImager-LinuxTag-2007-presentation.pdf"&gt;SystemImager website&lt;/a&gt;, with a few small changes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5377209285541830388?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5377209285541830388/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5377209285541830388' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5377209285541830388'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5377209285541830388'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2008/01/slides-of-my-talk-systemimager-and.html' title='SystemImager @ LinuxDay 2007 in Bologna'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5214600441997099783</id><published>2007-10-02T21:30:00.000+02:00</published><updated>2007-10-02T21:44:23.231+02:00</updated><title type='text'>How to sync Google Calendar with a local iCal file</title><content type='html'>I've written a simple Python script (&lt;a href="http://googlecalendarsync.googlecode.com/"&gt;googlecalendarsync&lt;/a&gt;) to synchronize Google Calendar (only the default calendar) with a local iCal file. 
The advantage of this script compared to the &lt;a href="https://addons.mozilla.org/en-US/thunderbird/addon/4631"&gt;Provider for Google Calendar&lt;/a&gt; solution is that you can work offline on the calendar and use any calendar client that supports the iCal format (so not only Lightning or Sunbird).&lt;br /&gt;&lt;br /&gt;The synchronization is bi-directional: you can create/update/delete events either from the Google Calendar interface or in your calendar client. If the same event is modified in both places, the most recently modified version is kept.&lt;br /&gt;&lt;br /&gt;After any change you can synchronize the calendars by running the script (without any arguments).&lt;br /&gt;&lt;br /&gt;I've opened a new project on Google project hosting; the tool is available here: &lt;a href="http://googlecalendarsync.googlecode.com/"&gt;http://googlecalendarsync.googlecode.com/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5214600441997099783?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5214600441997099783/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5214600441997099783' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5214600441997099783'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5214600441997099783'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/10/how.html' title='How to sync Google Calendar with a local iCal file'/><author><name>Andrea 
Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4813839660956813922</id><published>2007-05-16T20:24:00.000+02:00</published><updated>2007-05-16T21:57:12.997+02:00</updated><title type='text'>libnosync: are *sync really necessary?</title><content type='html'>I was asking myself why user applications should care about the synchronization of their buffers. I suppose it's a task dedicated to the operating system, that actually knows what is better for the system. Looking at the manpage of FSYNC(2) we can see that:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;NAME&lt;br /&gt;       fsync,  fdatasync  -  synchronize  a file's complete in-core state with&lt;br /&gt;       that on disk&lt;br /&gt;&lt;br /&gt;[snip]&lt;br /&gt;&lt;br /&gt;DESCRIPTION&lt;br /&gt;       fsync copies all in-core parts of a file to disk, and waits  until  the&lt;br /&gt;       device  reports  that all parts are on stable storage.  It also updates&lt;br /&gt;       metadata stat information. It does  not  necessarily  ensure  that  the&lt;br /&gt;       entry  in the directory containing the file has also reached disk.  For&lt;br /&gt;       that an explicit fsync on the file descriptor of the directory is  also&lt;br /&gt;       needed.&lt;br /&gt;&lt;br /&gt;[snip]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;OK, but... why? I wrote a simple glibc wrapper (see below) in order to have "fake" fsync() and fdatasync() - not for the simple sync(), so you can continue to run the famous `sync; sync; sync`, if you're paranoid enough ;-) - and I was impressed by the heavy use of them by the user applications... 
and by the speed-up you get if you disable them.&lt;br /&gt;&lt;br /&gt;In fact, if you have a journaled filesystem (hey! otherwise I think you should really consider moving to a journaled filesystem!) all the metadata flushes cause a lot of writes to the journal (for example in ext3 a single fsync() causes the write of *everything*) and that's a lot of I/O for your PC. This is also a disadvantage in terms of power consumption.&lt;br /&gt;&lt;br /&gt;So, where is the trick?! After some thought I realized that the main reason is to *be* really sure that the internal metadata of applications (like a DBMS for example), built on top of the filesystem, has been correctly written to the backing store. Everything that implements its own concept of "journal" should use the *sync() functions. Otherwise, if a crash occurs right in the middle of an "important" write, well... after the restart the metadata of your filesystem will be ok, but the metadata of the application (stored as filesystem data) could end up corrupted. So, in order to have a robust desktop it's surely better to have those syscalls enabled.&lt;br /&gt;&lt;br /&gt;OK, but is this really important for *all* your applications??? For example, I don't think it's important for amarok... Try to run a simple `strace -qfe trace=fdatasync,fsync amarok`. On my system I can see 36 *sync syscalls!!! And this is too much... BTW I've nothing against amarok, it's a great application &amp; my favourite music player :-)&lt;br /&gt;&lt;br /&gt;The *sync() lib wrapper follows. Use it (as always, without any warranty) if you want to run your non-critical applications faster. [IDEA] It would be interesting to run your apps with the wrapper and execute a `sync; sync; sync` just before the screensaver... 
:-)&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;/*&lt;br /&gt; *  libnosync&lt;br /&gt; *&lt;br /&gt; *  Copyright (C) 2007 Andrea Righi &amp;lt;a.righi@cineca.it&amp;gt;&lt;br /&gt; *&lt;br /&gt; *  This program is free software; you can redistribute it and/or modify&lt;br /&gt; *  it under the terms of the GNU General Public License as published by&lt;br /&gt; *  the Free Software Foundation; either version 2 of the License, or&lt;br /&gt; *  (at your option) any later version.&lt;br /&gt; *&lt;br /&gt; *  This program is distributed in the hope that it will be useful,&lt;br /&gt; *  but WITHOUT ANY WARRANTY; without even the implied warranty of&lt;br /&gt; *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the&lt;br /&gt; *  GNU General Public License for more details.&lt;br /&gt; *&lt;br /&gt; *  You should have received a copy of the GNU General Public License&lt;br /&gt; *  along with this program; if not, write to the Free Software&lt;br /&gt; *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA&lt;br /&gt; *&lt;br /&gt; * Compile (assuming this file is saved as libnosync.c):&lt;br /&gt; *     gcc -fPIC -Wall -O2 -g -shared -Wl,-soname,libnosync.so.0 \&lt;br /&gt; *     -o libnosync.so.0.1 libnosync.c -lc -ldl&lt;br /&gt; *&lt;br /&gt; * Use:&lt;br /&gt; *     export LD_PRELOAD=`pwd`/libnosync.so.0.1&lt;br /&gt; *&lt;br /&gt; * Remove:&lt;br /&gt; *     unset LD_PRELOAD&lt;br /&gt; */&lt;br /&gt;&lt;br /&gt;#define _GNU_SOURCE&lt;br /&gt;#define __USE_GNU&lt;br /&gt;&lt;br /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdarg.h&amp;gt;&lt;br /&gt;#include &amp;lt;string.h&amp;gt;&lt;br /&gt;#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;#include &amp;lt;dlfcn.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/types.h&amp;gt;&lt;br /&gt;&lt;br /&gt;#ifdef DEBUG&lt;br /&gt;#define DPRINTF(format, args...) 
fprintf(stderr, "debug: " format, ##args)&lt;br /&gt;#else&lt;br /&gt;#define DPRINTF(format, args...)&lt;br /&gt;#endif&lt;br /&gt;&lt;br /&gt;int fdatasync(int) __attribute__ ((weak, alias("wrap_fsync")));&lt;br /&gt;int fsync(int) __attribute__ ((weak, alias("wrap_fsync")));&lt;br /&gt;&lt;br /&gt;int wrap_fsync(int fd)&lt;br /&gt;{&lt;br /&gt;        DPRINTF("called fsync/fdatasync on fd = %d\n", fd);&lt;br /&gt;        return 0;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4813839660956813922?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4813839660956813922/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4813839660956813922' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4813839660956813922'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4813839660956813922'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/05/libnosync-are-sync-really-necessary.html' title='libnosync: are *sync really necessary?'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-2375631431749579712</id><published>2007-05-13T16:45:00.000+02:00</published><updated>2007-05-13T18:47:52.186+02:00</updated><title 
type='text'>Thunderbird + Google Calendar</title><content type='html'>I realized that I really need a calendar integrated in my email client (&lt;a href="http://www.mozilla.com/en-US/thunderbird/"&gt;Thunderbird&lt;/a&gt;), and unfortunately the *great* &lt;a href="http://www.vim.org/"&gt;vim&lt;/a&gt; + some shell scripts in cron are not enough... :-)&lt;br /&gt;&lt;br /&gt;For a couple of weeks I've been using &lt;a href="http://www.mozilla.org/projects/calendar/releases/lightning0.3.1.html"&gt;lightning&lt;/a&gt; with a cool extension called &lt;a href="https://addons.mozilla.org/en-US/thunderbird/addon/4631"&gt;Provider for Google Calendar&lt;/a&gt;, which allows bi-directional (r/w) access to Google Calendar directly from Thunderbird. I've also enabled SMS notifications (free) and honestly I have to admit that it's simply great! Now I can read my events, tasks, TODOs, etc. anywhere, using a web browser or my email client, and receive alarms and notifications on my phone.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-2375631431749579712?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/2375631431749579712/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=2375631431749579712' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/2375631431749579712'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/2375631431749579712'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/05/i-really-need-calendar-in-thunderbird.html' title='Thunderbird + Google Calendar'/><author><name>Andrea 
Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-2799785432464578220</id><published>2007-05-12T10:50:00.000+02:00</published><updated>2007-05-12T10:57:15.561+02:00</updated><title type='text'>LinuxTag 2007 in Berlin</title><content type='html'>I'll be at LinuxTag 2007 in Berlin. On Thursday (31/05/2007) I'll present a &lt;a href="http://www.linuxtag.org/2007/de/conf/events/vp-donnerstag/vortragsdetails.html?talkid=63"&gt;paper&lt;/a&gt; about the integration of the &lt;a href="http://www.bittorrent.com/"&gt;BitTorrent&lt;/a&gt; protocol in &lt;a href="http://www.systemimager.org/"&gt;SystemImager&lt;/a&gt;, to quickly deploy operating systems in large installations, like HPC clusters, big render farms or complex grid-computing environments.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-2799785432464578220?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/2799785432464578220/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=2799785432464578220' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/2799785432464578220'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/2799785432464578220'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/05/linuxtag-2007-in-berlin.html' 
title='LinuxTag 2007 in Berlin'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-119990825347438497</id><published>2007-05-11T19:17:00.000+02:00</published><updated>2007-05-26T12:00:43.192+02:00</updated><title type='text'>Linux VM: per-user overcommit policy</title><content type='html'>I wrote a simple &lt;a href="http://download.systemimager.org/~arighi/linux/patches/2.6.21/overcommit-uid"&gt;patch&lt;/a&gt; that allows defining per-UID virtual memory overcommit handling.&lt;br /&gt;&lt;br /&gt;Configuration is stored in a hash list in kernel space reachable through /proc/overcommit_uid (surely there are better ways to do it, e.g. via configfs).&lt;br /&gt;&lt;br /&gt;Since most of the time we only have readers, concurrent read/write accesses to the hash list are synchronized using the &lt;a href="http://www.rdrop.com/users/paulmck/RCU/whatisRCU.html"&gt;RCU (Read Copy Update)&lt;/a&gt; synchronization mechanism.&lt;br /&gt;&lt;br /&gt;Hash elements are defined using a triple:&lt;br /&gt;&lt;br /&gt;uid:overcommit_memory:overcommit_ratio&lt;br /&gt;&lt;br /&gt;The overcommit_* values have the same semantics as their respective sysctl variables. 
If a user is not present in the hash, the default system policy will be used (defined by /proc/sys/vm/overcommit_memory and /proc/sys/vm/overcommit_ratio).&lt;br /&gt;&lt;br /&gt;Example:&lt;br /&gt;&lt;br /&gt;- admin can allocate full memory + swap:&lt;br /&gt;&lt;br /&gt;root@host # echo 0:2:100 &gt; /proc/overcommit_uid&lt;br /&gt;&lt;br /&gt;- processes belonging to the sshd (uid=100) and ntp (uid=102) users can be quite critical, so use the same policy as the admin:&lt;br /&gt;&lt;br /&gt;root@host # echo 100:2:100 &gt; /proc/overcommit_uid&lt;br /&gt;root@host # echo 102:2:100 &gt; /proc/overcommit_uid&lt;br /&gt;&lt;br /&gt;- others can allocate up to the swap + 60% of the available RAM:&lt;br /&gt;&lt;br /&gt;root@host # echo 2 &gt; /proc/sys/vm/overcommit_memory &amp;&amp; echo 60 &gt; /proc/sys/vm/overcommit_ratio&lt;br /&gt;&lt;br /&gt;The result in the example above is that the memory is never overcommitted (due to the value 2 in overcommit_memory) and 40% of the RAM is kept as spare memory, reserved for root processes and critical services only. Normal users can use only 60% of the RAM. So, in conclusion, non-privileged users never hog the machine.&lt;br /&gt;&lt;br /&gt;You can play with the per-user overcommit parameters to implement your own VM allocation rules.&lt;br /&gt;&lt;br /&gt;This is only a very simple approach to user resource management. 
If you want a more flexible, complete and powerful approach, look at the &lt;a href="http://lkml.org/lkml/2006/9/14/370"&gt;containers&lt;/a&gt; work, a very interesting project actively developed in Linux.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-119990825347438497?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/119990825347438497/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=119990825347438497' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/119990825347438497'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/119990825347438497'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/05/linux-vm-per-user-overcommit-policy.html' title='Linux VM: per-user overcommit policy'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-8247726130608676359</id><published>2007-04-16T19:16:00.000+02:00</published><updated>2007-04-16T19:57:33.842+02:00</updated><title type='text'>ZERO_PAGE or not ZERO_PAGE...</title><content type='html'>Interesting discussion on the LKML about the opportunity to remove the ZERO_PAGE for anonymous mappings (&lt;a href="http://lkml.org/lkml/2007/4/3/432"&gt;http://lkml.org/lkml/2007/4/3/432&lt;/a&gt;).&lt;br /&gt;&lt;br 
/&gt;ZERO_PAGE is a single physical page that is always filled with &lt;span style="font-style:italic;"&gt;0&lt;/span&gt; and is used for zero-mapped memory areas.&lt;br /&gt;&lt;br /&gt;It is used for example to initialize the anonymous pages of a task (memory that is not file-backed and exists only during the life of the task). When a program obtains fresh anonymous memory from the kernel (for example through a large &lt;span style="font-style:italic;"&gt;malloc()&lt;/span&gt;), those pages must read as zero. If the program reads from that buffer before writing to it, the kernel, instead of allocating new physical pages for no good reason, maps all the accessed virtual memory to the ZERO_PAGE.&lt;br /&gt;&lt;br /&gt;Anyway, in general, an application that reads from a freshly allocated empty buffer is quite a stupid application :-) (except when you have to work with sparse matrices!) and the ZERO_PAGE handling has a cost in every COW fault.&lt;br /&gt;&lt;br /&gt;The following patch removes the handling of the ZERO_PAGE for anonymous memory mappings and simply allocates new physical pages when a program reads empty buffers. Depending on your applications you should see a small improvement in terms of performance, but higher memory consumption if you run the kind of applications mentioned above.&lt;br /&gt;&lt;br /&gt;Side note: I'm using it on my notebook and it works fine! 
:-)&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;--- linux-2.6.20.4/mm/memory.c.orig    2007-04-06 00:23:52.000000000 +0200&lt;br /&gt;+++ linux-2.6.20.4/mm/memory.c    2007-04-06 00:25:48.000000000 +0200&lt;br /&gt;@@ -1569,16 +1569,11 @@&lt;br /&gt;&lt;br /&gt;  if (unlikely(anon_vma_prepare(vma)))&lt;br /&gt;      goto oom;&lt;br /&gt;-    if (old_page == ZERO_PAGE(address)) {&lt;br /&gt;-        new_page = alloc_zeroed_user_highpage(vma, address);&lt;br /&gt;-        if (!new_page)&lt;br /&gt;-            goto oom;&lt;br /&gt;-    } else {&lt;br /&gt;-        new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);&lt;br /&gt;-        if (!new_page)&lt;br /&gt;-            goto oom;&lt;br /&gt;-        cow_user_page(new_page, old_page, address, vma);&lt;br /&gt;-    }&lt;br /&gt;+&lt;br /&gt;+    new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);&lt;br /&gt;+    if (!new_page)&lt;br /&gt;+        goto oom;&lt;br /&gt;+    cow_user_page(new_page, old_page, address, vma);&lt;br /&gt;&lt;br /&gt;  /*&lt;br /&gt;   * Re-check the pte - we dropped the lock&lt;br /&gt;@@ -2088,38 +2083,24 @@&lt;br /&gt;  spinlock_t *ptl;&lt;br /&gt;  pte_t entry;&lt;br /&gt;&lt;br /&gt;-    if (write_access) {&lt;br /&gt;-        /* Allocate our own private page. */&lt;br /&gt;-        pte_unmap(page_table);&lt;br /&gt;+    /* Allocate our own private page. 
*/&lt;br /&gt;+    pte_unmap(page_table);&lt;br /&gt;&lt;br /&gt;-        if (unlikely(anon_vma_prepare(vma)))&lt;br /&gt;-            goto oom;&lt;br /&gt;-        page = alloc_zeroed_user_highpage(vma, address);&lt;br /&gt;-        if (!page)&lt;br /&gt;-            goto oom;&lt;br /&gt;+    if (unlikely(anon_vma_prepare(vma)))&lt;br /&gt;+        goto oom;&lt;br /&gt;+    page = alloc_zeroed_user_highpage(vma, address);&lt;br /&gt;+    if (!page)&lt;br /&gt;+        goto oom;&lt;br /&gt;&lt;br /&gt;-        entry = mk_pte(page, vma-&gt;vm_page_prot);&lt;br /&gt;-        entry = maybe_mkwrite(pte_mkdirty(entry), vma);&lt;br /&gt;+    entry = mk_pte(page, vma-&gt;vm_page_prot);&lt;br /&gt;+    entry = maybe_mkwrite(pte_mkdirty(entry), vma);&lt;br /&gt;&lt;br /&gt;-        page_table = pte_offset_map_lock(mm, pmd, address, &amp;ptl);&lt;br /&gt;-        if (!pte_none(*page_table))&lt;br /&gt;-            goto release;&lt;br /&gt;-        inc_mm_counter(mm, anon_rss);&lt;br /&gt;-        lru_cache_add_active(page);&lt;br /&gt;-        page_add_new_anon_rmap(page, vma, address);&lt;br /&gt;-    } else {&lt;br /&gt;-        /* Map the ZERO_PAGE - vm_page_prot is readonly */&lt;br /&gt;-        page = ZERO_PAGE(address);&lt;br /&gt;-        page_cache_get(page);&lt;br /&gt;-        entry = mk_pte(page, vma-&gt;vm_page_prot);&lt;br /&gt;-&lt;br /&gt;-        ptl = pte_lockptr(mm, pmd);&lt;br /&gt;-        spin_lock(ptl);&lt;br /&gt;-        if (!pte_none(*page_table))&lt;br /&gt;-            goto release;&lt;br /&gt;-        inc_mm_counter(mm, file_rss);&lt;br /&gt;-        page_add_file_rmap(page);&lt;br /&gt;-    }&lt;br /&gt;+    page_table = pte_offset_map_lock(mm, pmd, address, &amp;ptl);&lt;br /&gt;+    if (unlikely(!pte_none(*page_table)))&lt;br /&gt;+        goto release;&lt;br /&gt;+    inc_mm_counter(mm, anon_rss);&lt;br /&gt;+    lru_cache_add_active(page);&lt;br /&gt;+    page_add_new_anon_rmap(page, vma, address);&lt;br /&gt;&lt;br /&gt;  set_pte_at(mm, 
address, page_table, entry);&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-8247726130608676359?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/8247726130608676359/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=8247726130608676359' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8247726130608676359'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8247726130608676359'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/04/zeropage-or-not-zeropage.html' title='ZERO_PAGE or not ZERO_PAGE...'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-3864518604138226826</id><published>2007-04-14T17:00:00.000+02:00</published><updated>2007-04-14T17:25:09.680+02:00</updated><title type='text'>A fast way to get users' disk usage on ext2/ext3 filesystems</title><content type='html'>The simplest way to get the per-user disk usage is to mount the filesystem with quota accounting (see "man 8 mount"). This is the most reliable way to be sure that the quota limits will be respected, since the accounting and the checks are performed synchronously by the filesystem itself. 
Unfortunately this adds a little overhead to the whole filesystem.&lt;br /&gt;&lt;br /&gt;Another approach is to periodically check the disk usage with a script (typically with a cron job). The script should sum the size of each file and directory in the filesystem, grouping them by owner.&lt;br /&gt;&lt;br /&gt;This program (&lt;a href="http://download.systemimager.org/%7Earighi/e2fsusage/"&gt;e2fsusage&lt;/a&gt;) uses the second approach, but instead of reading the files and directories directly, it analyzes the filesystem metadata, performing a sequential scan of all the inodes.&lt;br /&gt;&lt;br /&gt;In this way it bypasses the translation of file/dir names into their respective inodes and strongly reduces the total time needed to scan the entire filesystem. Moreover, since it evaluates the blocks actually allocated in the filesystem using &lt;span style="font-style: italic;"&gt;i_blocks&lt;/span&gt;, instead of &lt;span style="font-style: italic;"&gt;i_size &lt;/span&gt;(see the &lt;span style="font-style: italic;"&gt;struct ext2_inode&lt;/span&gt; in &lt;span style="font-style: italic;"&gt;/usr/include/ext2fs/ext2_fs.h&lt;/span&gt;), it is able to detect the true size occupied by each user (read it as: it is able to correctly handle sparse files).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-3864518604138226826?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/3864518604138226826/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=3864518604138226826' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3864518604138226826'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3864518604138226826'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/04/fast-way-to-get-users-disk-usage-on.html' title='A fast way to get users&apos; disk usage on ext2/ext3 filesystems'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4480677869887625508</id><published>2007-04-04T21:27:00.001+02:00</published><updated>2011-01-03T00:41:12.702+01:00</updated><title type='text'>How to bypass the buffer cache in Linux</title><content type='html'>Linux has two kinds of caches: the page cache and the buffer cache. The role of the page cache is to speed up access to files on disk; similarly, the buffer cache contains buffers of pages read from or being written to block devices. Both of them are memory areas managed in different ways (one more optimized for file objects and the other more block device oriented).&lt;br /&gt;&lt;br /&gt;From /proc/meminfo it is possible to monitor the memory allocated for both caches (Buffers is the buffer cache, Cached is the page cache), for example:&lt;br /&gt;&lt;pre&gt;# cat /proc/meminfo&lt;br /&gt;...&lt;br /&gt;Buffers:         15116 kB&lt;br /&gt;Cached:          67912 kB&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;To perform an I/O benchmark on block devices (like /dev/sda, /dev/sdb, etc.) we usually use a simple `dd`, which loads data from the device into memory (in read tests) or writes from memory to the device (in write tests). But in these cases the data is accessed only once! There are no further reads or writes on those buffers. 
In these cases the buffer cache is only overhead and it makes sense to bypass it.&lt;br /&gt;&lt;br /&gt;One way is to open the files with the O_DIRECT flag. This flag bypasses the caching mechanisms and uses DMA directly between the block device and the userspace source/destination buffers.&lt;br /&gt;&lt;br /&gt;Obviously the kernel has no global flag to say: "ok just disable buffer cache" and it's not even possible to disable the buffer cache for a single process.&lt;br /&gt;&lt;br /&gt;If you can (and want to) patch and recompile your application you could explicitly set the O_DIRECT flag in every open(), but it wouldn't be so handy... ;-)&lt;br /&gt;&lt;br /&gt;Another solution is to write a simple glibc wrapper that intercepts all the open() calls and sets the O_DIRECT flag.&lt;br /&gt;&lt;br /&gt;An example follows:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;libdirectio.c&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;#define _GNU_SOURCE&lt;br /&gt;#define __USE_GNU&lt;br /&gt;&lt;br /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdarg.h&amp;gt;&lt;br /&gt;#include &amp;lt;string.h&amp;gt;&lt;br /&gt;#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;#include &amp;lt;dlfcn.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/types.h&amp;gt;&lt;br /&gt;&lt;br /&gt;#define DEBUG&lt;br /&gt;&lt;br /&gt;#ifdef DEBUG&lt;br /&gt;#define DPRINTF(format, args...) fprintf(stderr, "debug: " format, ##args)&lt;br /&gt;#else&lt;br /&gt;#define DPRINTF(format, args...)&lt;br /&gt;#endif&lt;br /&gt;&lt;br /&gt;int open(const char *, int, ...) __attribute__ ((weak, alias("wrap_open")));&lt;br /&gt;int __open(const char *, int, ...) __attribute__ ((weak, alias("wrap_open")));&lt;br /&gt;int open64(const char *, int, ...) __attribute__ ((weak, alias("wrap_open64")));&lt;br /&gt;int __open64(const char *, int, ...) 
__attribute__ ((weak, alias("wrap_open64")));&lt;br /&gt;&lt;br /&gt;static int (*orig_open)(const char *, int, ...) = NULL;&lt;br /&gt;static int (*orig_open64)(const char *, int, ...) = NULL;&lt;br /&gt;&lt;br /&gt;static int __do_wrap_open(const char *name, int flags, mode_t mode,&lt;br /&gt;int (*func_open)(const char *, int, ...))&lt;br /&gt;{&lt;br /&gt;    /* add O_DIRECT to every open, except for /dev/null */&lt;br /&gt;    if (strncmp("/dev/null", name, sizeof("/dev/null"))) {&lt;br /&gt;        DPRINTF("setting flags O_DIRECT on %s\n", name);&lt;br /&gt;        flags |= O_DIRECT;&lt;br /&gt;    }&lt;br /&gt;    return func_open(name, flags, mode);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;int wrap_open(const char *name, int flags, ...)&lt;br /&gt;{&lt;br /&gt;    va_list args;&lt;br /&gt;    mode_t mode;&lt;br /&gt;&lt;br /&gt;    va_start(args, flags);&lt;br /&gt;    mode = va_arg(args, mode_t);&lt;br /&gt;    va_end(args);&lt;br /&gt;&lt;br /&gt;    DPRINTF("calling libc open(%s, 0x%x, 0x%x)\n", name, flags, mode);&lt;br /&gt;&lt;br /&gt;    return __do_wrap_open(name, flags, mode, orig_open);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;int wrap_open64(const char *name, int flags, ...)&lt;br /&gt;{&lt;br /&gt;    va_list args;&lt;br /&gt;    mode_t mode;&lt;br /&gt;&lt;br /&gt;    va_start(args, flags);&lt;br /&gt;    mode = va_arg(args, mode_t);&lt;br /&gt;    va_end(args);&lt;br /&gt;&lt;br /&gt;    DPRINTF("calling libc open64(%s, 0x%x, 0x%x)\n", name, flags, mode);&lt;br /&gt;&lt;br /&gt;    return __do_wrap_open(name, flags, mode, orig_open64);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;/* resolve the real libc symbols when the library is loaded */&lt;br /&gt;void _init(void)&lt;br /&gt;{&lt;br /&gt;    orig_open = dlsym(RTLD_NEXT, "open");&lt;br /&gt;    if (!orig_open) {&lt;br /&gt;        fprintf(stderr, "error: missing symbol open!\n");&lt;br /&gt;        exit(1);&lt;br /&gt;    }&lt;br /&gt;    orig_open64 = dlsym(RTLD_NEXT, 
"open64");&lt;br /&gt;    if (!orig_open64) {&lt;br /&gt;        fprintf(stderr, "error: missing symbol open64!\n");&lt;br /&gt;        exit(1);&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Makefile&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;VERSION=0.1&lt;br /&gt;&lt;br /&gt;TARGET=libdirectio.so.$(VERSION)&lt;br /&gt;OBJS=libdirectio.o&lt;br /&gt;CC=gcc&lt;br /&gt;CFLAGS= -fPIC -Wall -O2 -g&lt;br /&gt;SHAREDFLAGS= -nostartfiles -shared -Wl,-soname,libdirectio.so.0&lt;br /&gt;&lt;br /&gt;all: $(TARGET)&lt;br /&gt;&lt;br /&gt;%.o: %.c&lt;br /&gt;$(CC) -I. $(CFLAGS) -c $&amp;lt; -o $@&lt;br /&gt;&lt;br /&gt;$(TARGET): $(OBJS)&lt;br /&gt;$(CC) $(SHAREDFLAGS) $(OBJS) -o $(TARGET) -lc -ldl&lt;br /&gt;&lt;br /&gt;clean:&lt;br /&gt;rm -f $(OBJS) $(TARGET)&lt;br /&gt;&lt;/pre&gt;To compile the library simply run `make`. You can pre-load it using the LD_PRELOAD environment variable like this:&lt;pre&gt;# export LD_PRELOAD=$FULL_PATH_OF_YOUR_LIBRARY/libdirectio.so.0.1&lt;br /&gt;&lt;/pre&gt;Then you can run your brand-new direct I/O benchmark (typically `dd`) for block devices. To unload the library and restore standard access simply run:&lt;pre&gt;# unset LD_PRELOAD&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4480677869887625508?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4480677869887625508/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4480677869887625508' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4480677869887625508'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4480677869887625508'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/04/how-to-bypass-buffer-cache-in-linux.html' title='How to bypass the buffer cache in Linux'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-8049879979377725153</id><published>2007-03-25T21:45:00.000+02:00</published><updated>2007-03-26T00:01:57.989+02:00</updated><title type='text'>disk I/O per-process accounting</title><content type='html'>A common problem in Linux is finding the most I/O-intensive process when the system is under intense disk activity. In some cases you may want to kill the crazy process that caused this condition.&lt;br /&gt;&lt;br /&gt;Many Linux tools can report generic stats for your system: top, sar, dstat, iostat, vmstat, ... 
but unfortunately none of them can show the disk activity of each individual process.&lt;br /&gt;&lt;br /&gt;The following kernel patch lets userspace tools access per-process I/O statistics (WARNING: I tested it only with 2.6.18.3 vanilla!!!):&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;--- include/linux/sched.h.orig 2007-03-25 21:42:50.000000000 +0200&lt;br /&gt;+++ include/linux/sched.h 2007-03-25 21:42:56.000000000 +0200&lt;br /&gt;@@ -990,6 +990,12 @@&lt;br /&gt;  struct rcu_head rcu;&lt;br /&gt; &lt;br /&gt;  /*&lt;br /&gt;+  * disk I/O accounting informations&lt;br /&gt;+  */&lt;br /&gt;+ unsigned long long acct_disk_read;&lt;br /&gt;+ unsigned long long acct_disk_write;&lt;br /&gt;+&lt;br /&gt;+ /*&lt;br /&gt;   * cache last used pipe for splice&lt;br /&gt;   */&lt;br /&gt;  struct pipe_inode_info *splice_pipe;&lt;br /&gt;--- block/ll_rw_blk.c.orig 2007-03-25 18:05:51.000000000 +0200&lt;br /&gt;+++ block/ll_rw_blk.c 2007-03-25 18:12:51.000000000 +0200&lt;br /&gt;@@ -2586,6 +2586,12 @@&lt;br /&gt;   disk_round_stats(rq-&gt;rq_disk);&lt;br /&gt;   rq-&gt;rq_disk-&gt;in_flight++;&lt;br /&gt;  }&lt;br /&gt;+&lt;br /&gt;+ if (rw == READ) {&lt;br /&gt;+  current-&gt;acct_disk_read += nr_sectors;&lt;br /&gt;+ } else {&lt;br /&gt;+  current-&gt;acct_disk_write += nr_sectors;&lt;br /&gt;+ }&lt;br /&gt; }&lt;br /&gt; &lt;br /&gt; /*&lt;br /&gt;--- fs/proc/array.c.orig 2007-03-25 18:13:07.000000000 +0200&lt;br /&gt;+++ fs/proc/array.c 2007-03-25 18:15:00.000000000 +0200&lt;br /&gt;@@ -412,7 +412,7 @@&lt;br /&gt; &lt;br /&gt;  res = sprintf(buffer,"%d (%s) %c %d %d %d %d %d %lu %lu \&lt;br /&gt; %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \&lt;br /&gt;-%lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu\n",&lt;br /&gt;+%lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu %llu %llu\n",&lt;br /&gt;   task-&gt;pid,&lt;br /&gt;   tcomm,&lt;br /&gt;   state,&lt;br /&gt;@@ -457,7 +457,9 @@&lt;br /&gt;   task_cpu(task),&lt;br /&gt;   
task-&gt;rt_priority,&lt;br /&gt;   task-&gt;policy,&lt;br /&gt;-  (unsigned long long)delayacct_blkio_ticks(task));&lt;br /&gt;+  (unsigned long long)delayacct_blkio_ticks(task),&lt;br /&gt;+  task-&gt;acct_disk_read,&lt;br /&gt;+  task-&gt;acct_disk_write);&lt;br /&gt;  if(mm)&lt;br /&gt;   mmput(mm);&lt;br /&gt;  return res;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The patch adds two entries at the end of the process status array (see /usr/src/linux/fs/proc/array.c):&lt;br /&gt;&lt;ol&gt;&lt;li&gt;the I/O read activity of the process (in sectors)&lt;/li&gt;&lt;li&gt;the I/O write activity of the process (in sectors)&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;You can access them via the proc filesystem: the process array is in /proc/[pid]/stat (see `man 5 proc`).&lt;br /&gt;&lt;br /&gt;For example, the following command shows the "top 10" list of the most I/O-intensive processes on my system:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ cat /proc/[0-9]*/stat | awk '{print $2 ":" $43 + $44}' | sort -rn -t : -k 2 | head&lt;br /&gt;(pdflush):275240&lt;br /&gt;(reiserfs/0):179064&lt;br /&gt;(thunderbird-bin):74376&lt;br /&gt;(cupsd):18904&lt;br /&gt;(firefox-bin):15640&lt;br /&gt;(Xorg):13632&lt;br /&gt;(netstat):13512&lt;br /&gt;(gaim):9096&lt;br /&gt;(kswapd0):6032&lt;br /&gt;(syslog-ng):4568&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;As expected, in first place there's pdflush (the worker_thread that writes back filesystem data), followed by the reiserfs/0 worker_thread... but obviously you can't kill them! They're kernel threads... so in my case the most I/O-intensive userspace process is thunderbird! 
;-)&lt;br /&gt;&lt;br /&gt;You can also write your custom top-like userspace tools to monitor the I/O rate of each process, or a program to see if your processes are doing more reads or writes, etc...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-8049879979377725153?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/8049879979377725153/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=8049879979377725153' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8049879979377725153'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/8049879979377725153'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/03/disk-io-per-process-accounting.html' title='disk I/O per-process accounting'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5769294604354796484</id><published>2007-03-07T22:00:00.000+01:00</published><updated>2007-03-07T22:29:58.383+01:00</updated><title type='text'></title><content type='html'>A quite old, but very nice howto about basic kernel bug hunting techniques:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;kernel debugging (&lt;a href="http://bravo.ce.uniroma2.it/kernelhacking2002/en/course-notes/Kernel_Debugging_1.txt"&gt;part 1&lt;/a&gt;)
&lt;/li&gt;&lt;li&gt;kernel debugging (&lt;a href="http://bravo.ce.uniroma2.it/kernelhacking2002/en/course-notes/Kernel_Debugging_2.txt"&gt;part 2&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;kernel debugging (&lt;a href="http://bravo.ce.uniroma2.it/kernelhacking2002/en/course-notes/Kernel_Debugging_3.txt"&gt;part 3&lt;/a&gt;)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5769294604354796484?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5769294604354796484/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5769294604354796484' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5769294604354796484'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5769294604354796484'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/03/quite-old-but-very-nice-howto-about.html' title=''/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-3070076871361107470</id><published>2007-03-06T10:34:00.000+01:00</published><updated>2007-03-06T10:40:05.604+01:00</updated><title type='text'></title><content type='html'>An excellent explanation of &lt;a href="http://sourceforge.net/mailarchive/message.php?msg_id=12656745"&gt;SVN merging&lt;/a&gt;. SVN merge is a good way to merge code changes between different directories within a repository, which is often needed when the same fixes must be applied to different branches.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-3070076871361107470?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/3070076871361107470/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=3070076871361107470' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3070076871361107470'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3070076871361107470'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/03/excellent-explanation-about-svn-merging.html' title=''/><author><name>Andrea 
Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-5428454833191692131</id><published>2007-03-05T19:21:00.000+01:00</published><updated>2007-03-05T19:42:55.551+01:00</updated><title type='text'>weekend at Marmoraia</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://lh3.google.com/image/righi.andrea/ResU-QwCqBI/AAAAAAAAABk/yU39lg641EQ/s288/hpim1214.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px;" src="http://lh3.google.com/image/righi.andrea/ResU-QwCqBI/AAAAAAAAABk/yU39lg641EQ/s288/hpim1214.jpg" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Yesterday's visit to the beautiful church of Marmoraia, a Romanesque church from the 11th century in the hills of the &lt;a href="http://maps.google.com/?ie=UTF8&amp;z=15&amp;amp;ll=43.331827,11.173053&amp;spn=0.015671,0.028667&amp;amp;om=1"&gt;Montagnola Senese&lt;/a&gt;, with a wonderful walk in the nearby chestnut woods...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-5428454833191692131?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/5428454833191692131/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=5428454833191692131' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5428454833191692131'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/5428454833191692131'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/03/weekend-at-marmoraia.html' title='weekend at Marmoraia'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4896166796342556171</id><published>2007-03-03T15:33:00.000+01:00</published><updated>2007-03-05T18:50:51.464+01:00</updated><title type='text'>multi-threaded "cat-like" command for web pages...</title><content type='html'>Here is a very useful script that dumps the HTML of one or more web links, given as arguments, to standard output. If more than one link is passed, it spawns a thread for each link (serializing the dumps on stdout with a mutex). The threaded approach reduces the average time spent waiting for the connections to reply, and it strongly improves performance when we need to download a lot of pages at the same time (e.g. I usually use this script to dump and grep the Linux kernel &lt;a href="http://www.kernel.org/pub/linux/kernel/v2.6"&gt;changelogs&lt;/a&gt; directly from the web...).&lt;br /&gt;&lt;br /&gt;BTW: python is great! 
;-)&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;#!/usr/bin/env python&lt;br /&gt;&lt;br /&gt;import sys, urllib&lt;br /&gt;from threading import Thread, Lock&lt;br /&gt;&lt;br /&gt;class webThread(Thread):&lt;br /&gt;    def __init__(self, url):&lt;br /&gt;        self.url = url&lt;br /&gt;        Thread.__init__(self)&lt;br /&gt;&lt;br /&gt;    def run(self):&lt;br /&gt;        # download the whole page first, outside the lock&lt;br /&gt;        remotefile = urllib.urlopen(self.url)&lt;br /&gt;        data = remotefile.read()&lt;br /&gt;        remotefile.close()&lt;br /&gt;&lt;br /&gt;        # print one page at a time, so dumps are not interleaved&lt;br /&gt;        stdout_mutex.acquire()&lt;br /&gt;        print "=== %s ===" % self.url&lt;br /&gt;        print data&lt;br /&gt;        stdout_mutex.release()&lt;br /&gt;&lt;br /&gt;if __name__ == '__main__':&lt;br /&gt;    if len(sys.argv) &amp;lt; 2:&lt;br /&gt;        print "usage: %s url [url ...]" % sys.argv[0]&lt;br /&gt;        sys.exit(1)&lt;br /&gt;    else:&lt;br /&gt;        threads = []&lt;br /&gt;        stdout_mutex = Lock()&lt;br /&gt;        sys.stdout.flush()&lt;br /&gt;        for url in sys.argv[1:]:&lt;br /&gt;            t = webThread(url)&lt;br /&gt;            t.start()&lt;br /&gt;            threads.append(t)&lt;br /&gt;        for t in threads:&lt;br /&gt;            t.join()&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4896166796342556171?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4896166796342556171/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4896166796342556171' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4896166796342556171'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4896166796342556171'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/03/multi-threaded-cat-like-command-for-web.html' 
title='multi-threaded &quot;cat-like&quot; command for web pages...'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-3726610806540838775</id><published>2007-03-01T19:59:00.000+01:00</published><updated>2007-03-01T20:39:38.548+01:00</updated><title type='text'>interactive support for si_psh</title><content type='html'>Today I improved my favorite distributed shell, &lt;a href="http://svn.systemimager.org/filedetails.php?repname=systemimager&amp;path=%2Ftrunk%2Fsbin%2Fsi_psh&amp;amp;rev=0&amp;sc=0"&gt;si_psh&lt;/a&gt; (which is part of &lt;a href="http://wiki.systemimager.org"&gt;SystemImager&lt;/a&gt;), by adding &lt;a href="http://svn.systemimager.org/diff.php?repname=systemimager&amp;amp;path=%2Ftrunk%2Fsbin%2Fsi_psh&amp;rev=0&amp;amp;sc=0"&gt;interactive support&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I need it for the &lt;a href="http://www.cineca.it/bdp/sezioni/risorse/hardware/scheda?CODICE=bcx_5120_hpc&amp;BSDVAR_TIPOLOGIA=hpcspeciali"&gt;BCX cluster&lt;/a&gt; when I have to run multiple commands on the same subset of nodes. Without interactive support I used to edit the previous command line in the shell, but with long commands this is not practical enough...&lt;br /&gt;&lt;br /&gt;I also discovered that ssh (maybe via glibc, dunno...) knows whether its stdin is open on a terminal, and if it doesn't find a valid terminal it's not possible to catch its stderr inside a perl wrapper script! The problem is that when si_psh runs interactively, stdin is not open on the terminal: it's opened inside the perl script, to get the user commands (like a typical shell). 
So I wasn't able to get the stderr of the spawned ssh sessions...&lt;br /&gt;&lt;br /&gt;It's possible to work around this by spawning the ssh process via exec. For example, in this case @out contains both stdout and stderr:&lt;br /&gt;&lt;br /&gt; &lt;span style="font-size:100%;"&gt;&lt;span style="font-family: courier new;"&gt;my @out = `exec 2&gt;&amp;1 ssh ...`;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;While this doesn't work (@out doesn't contain the stderr):&lt;br /&gt;&lt;br /&gt; &lt;span style="font-size:100%;"&gt;&lt;span style="font-family: courier new;"&gt;my @out = `ssh ... 2&gt;&amp;1`;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-3726610806540838775?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/3726610806540838775/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=3726610806540838775' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3726610806540838775'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/3726610806540838775'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/03/interactive-support-for-sipsh.html' title='interactive support for si_psh'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-2775295135324637917</id><published>2007-02-28T20:12:00.000+01:00</published><updated>2007-03-02T00:14:07.632+01:00</updated><title type='text'>memory leak bug with IA32 emulation on x86_64</title><content type='html'>It seems that the kernels of some recent distributions (RHEL4 or SLES9, for example) are affected by a memory leak bug in the committed memory: the virtual memory allocated by userland applications and requested by &lt;span style="font-family:courier new;"&gt;*malloc()&lt;/span&gt;. This occurs only on 64-bit processors (x86_64, in my case) when you run IA32 applications. If you run a lot of IA32 applications you can see the value of &lt;span style="font-family:courier new;"&gt;Committed_AS&lt;/span&gt; in &lt;span style="font-family:courier new;"&gt;/proc/meminfo&lt;/span&gt; grow forever... It occurs only in the kernels of some distributions, not with recent vanilla kernels.&lt;br /&gt;&lt;br /&gt;But... is it a critical bug? It depends... virtual memory is not physical memory: applications can always request a memory region, but if they don't use it the physical memory is never allocated. The point is: should the kernel give virtual memory to processes even if they are requesting more than the physical memory? 
If so, the system is overcommitting memory.&lt;br /&gt;&lt;br /&gt;Linux supports 3 overcommit handling policies, selected through /proc/sys/vm/overcommit_memory (see &lt;span style="font-family:courier new;"&gt;/usr/src/linux/Documentation/vm/overcommit-accounting&lt;/span&gt;):&lt;br /&gt;&lt;ul&gt;&lt;li&gt;"guess" policy (mode 0)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;always overcommit (mode 1)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;never overcommit (mode 2)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;By default Linux uses the "guess" policy: the kernel uses a heuristic to decide whether a memory request can be committed. This heuristic does not depend on the value of the committed memory; it depends essentially on the free physical memory. With always overcommit the committed memory counter is not important either (except for accounting information). But with the never overcommit policy the value of the committed memory determines the result of memory requests (because memory can't be overcommitted), so in this case it is functionally important. For more implementation details see &lt;span style="font-family:courier new;"&gt;__vm_enough_memory()&lt;/span&gt; in &lt;span style="font-family:courier new;"&gt;/usr/src/linux/mm/mmap.c&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The memory leak bug occurs because during &lt;span style="font-family:courier new;"&gt;exec()&lt;/span&gt;, &lt;span style="font-family:courier new;"&gt;setup_arg_pages()&lt;/span&gt; calls &lt;span style="font-family:courier new;"&gt;vm_enough_memory()&lt;/span&gt; for a vma without the &lt;span style="font-family:courier new;"&gt;VM_ACCOUNT&lt;/span&gt; flag set. When the process exits, &lt;span style="font-family:courier new;"&gt;exit_mmap()&lt;/span&gt; only calls &lt;span style="font-family:courier new;"&gt;vm_unacct_memory()&lt;/span&gt; if the vma has the &lt;span style="font-family:courier new;"&gt;VM_ACCOUNT&lt;/span&gt; flag set... hey! 
so we're really leaking memory here...&lt;br /&gt;&lt;br /&gt;The fix in this case is very simple:&lt;span style=";font-family:courier new;font-size:85%;"  &gt;&lt;br /&gt;&lt;blockquote&gt;--- include/asm-x86_64/page.h.orig      2007-02-27 17:31:05.000000000 +0100&lt;br /&gt;+++ include/asm-x86_64/page.h   2007-02-27 17:24:26.000000000 +0100&lt;br /&gt;@@ -134,7 +134,7 @@&lt;br /&gt;#define __VM_DATA_DEFAULT_FLAGS        (VM_READ | VM_WRITE | VM_EXEC | \&lt;br /&gt;                            VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)&lt;br /&gt;#define __VM_STACK_FLAGS       (VM_GROWSDOWN | VM_READ | VM_WRITE | VM_EXEC | \&lt;br /&gt;-                                VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)&lt;br /&gt;+                                VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC | VM_ACCOUNT)&lt;br /&gt;&lt;br /&gt;#define VM_DATA_DEFAULT_FLAGS \&lt;br /&gt;   (test_thread_flag(TIF_IA32) ? vm_data_default_flags32 : \&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;/span&gt;Unfortunately there's another problem... in &lt;span style="font-family:courier new;"&gt;arch/x86_64/ia32/ia32_binfmt.c&lt;/span&gt;, the code calls &lt;span style="font-family:courier new;"&gt;security_vm_enough_memory()&lt;/span&gt; but tends to forget the matching &lt;span style="font-family:courier new;"&gt;vm_unacct_memory()&lt;/span&gt; when a failure occurs (this problem is rarer, but it can occur). 
For this problem the patch is the following:&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;blockquote&gt;&lt;span style="font-family:courier new;"&gt;--- arch/x86_64/ia32/ia32_binfmt.c.orig 2007-02-27 17:26:47.000000000 +0100&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+++ arch/x86_64/ia32/ia32_binfmt.c      2007-02-27 17:27:01.000000000 +0100&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;@@ -347,11 +347,6 @@&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        if (!mpnt)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;                return -ENOMEM;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-       if (security_vm_enough_memory((IA32_STACK_TOP - (PAGE_MASK &amp; (unsigned long) bprm-&gt;p))&gt;&gt;PAGE_SHIFT)) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-               kmem_cache_free(vm_area_cachep, mpnt);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-               return -ENOMEM;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-       }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        memset(mpnt, 0, sizeof(*mpnt));&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        down_write(&amp;mm-&gt;mmap_sem);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;--- fs/exec.c.orig      2007-02-27 17:27:39.000000000 +0100&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+++ fs/exec.c   2007-02-27 17:28:08.000000000 +0100&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;@@ -413,11 +413,6 @@&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        if (!mpnt)&lt;/span&gt;&lt;br /&gt;&lt;span 
style="font-family:courier new;"&gt;                return -ENOMEM;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-       if (security_vm_enough_memory(arg_size &gt;&gt; PAGE_SHIFT)) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-               kmem_cache_free(vm_area_cachep, mpnt);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-               return -ENOMEM;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-       }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        memset(mpnt, 0, sizeof(*mpnt));&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        down_write(&amp;mm-&gt;mmap_sem);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;--- mm/mmap.c.orig      2007-02-27 17:27:50.000000000 +0100&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+++ mm/mmap.c   2007-02-27 17:28:58.000000000 +0100&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;@@ -2024,6 +2024,9 @@&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        __vma = find_vma_prepare(mm,vma-&gt;vm_start,&amp;prev,&amp;rb_link,&amp;rb_parent);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        if (__vma &amp;&amp; __vma-&gt;vm_start &amp;lt; vma-&gt;vm_end)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;                return -ENOMEM;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+       if ((vma-&gt;vm_flags &amp; VM_ACCOUNT) &amp;&amp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+               security_vm_enough_memory(vma_pages(vma)))&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+                       return -ENOMEM;&lt;/span&gt;&lt;br /&gt;&lt;span 
style="font-family:courier new;"&gt;        vma_link(mm, vma, prev, rb_link, rb_parent);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;        return 0;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt; }&lt;/span&gt;&lt;/blockquote&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;With the 2 patches above applied, my problems are resolved. For critical servers I use the "never overcommit" policy, because I hate it when the OOM-killer always decides to terminate the most important application... :-) If memory is never overcommitted, the OOM-killer is effectively disabled and applications can fail more gracefully: it's better to get a &lt;span style="font-family:courier new;"&gt;NULL&lt;/span&gt; from a &lt;span style="font-family:courier new;"&gt;*malloc()&lt;/span&gt; than to get a &lt;span style="font-family:courier new;"&gt;SIGKILL&lt;/span&gt; from the kernel... :-)&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt; &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-2775295135324637917?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/2775295135324637917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=2775295135324637917' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/2775295135324637917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/2775295135324637917'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/02/memory-leak-bug-with-ia32-emulation-on.html' title='memory leak bug with IA32 emulation on 
x86_64'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4397409626710913610.post-4027041873824691616</id><published>2007-02-27T21:13:00.000+01:00</published><updated>2007-02-28T00:33:11.269+01:00</updated><title type='text'>first post on my blog...</title><content type='html'>Today I decided to open this blog, just for testing and fun. I have no experience with blogs: I have always used forums, mailing lists, etc., but never blogs, so now it's time to start. :-)&lt;br /&gt;&lt;br /&gt;Here are some useful links about me:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;my personal website: &lt;a href="http://members.lycos.co.uk/righiandr"&gt;Andrea Righi's forum&lt;br /&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="" style="display: block;" id="formatbar_CreateLink" title="Link" onmouseover="ButtonHoverOn(this);" onmouseout="ButtonHoverOff(this);" onmouseup="" onmousedown="CheckFormatting(event);FormatbarButton('richeditorframe', this, 8);ButtonMouseDown(this);"&gt;my favourite project: &lt;a href="http://wiki.systemimager.org/"&gt;SystemImager&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="" style="display: block;" id="formatbar_CreateLink" title="Link" onmouseover="ButtonHoverOn(this);" onmouseout="ButtonHoverOff(this);" onmouseup="" onmousedown="CheckFormatting(event);FormatbarButton('richeditorframe', this, 8);ButtonMouseDown(this);"&gt;my favourite kernel: &lt;a href="http://www.kernel.org/"&gt;linux&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="" style="display: block;" id="formatbar_CreateLink" title="Link" onmouseover="ButtonHoverOn(this);" onmouseout="ButtonHoverOff(this);" onmouseup="" 
onmousedown="CheckFormatting(event);FormatbarButton('richeditorframe', this, 8);ButtonMouseDown(this);"&gt;the place where I work: &lt;a href="http://www.cineca.it/en/index.htm"&gt;CINECA&lt;/a&gt;&lt;/span&gt;&lt;a href="http://www.cineca.it/"&gt;&lt;span class="" style="display: block;" id="formatbar_CreateLink" title="Link" onmouseover="ButtonHoverOn(this);" onmouseout="ButtonHoverOff(this);" onmouseup="" onmousedown="CheckFormatting(event);FormatbarButton('richeditorframe', this, 8);ButtonMouseDown(this);"&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class="" style="display: block;" id="formatbar_CreateLink" title="Link" onmouseover="ButtonHoverOn(this);" onmouseout="ButtonHoverOff(this);" onmouseup="" onmousedown="CheckFormatting(event);FormatbarButton('richeditorframe', this, 8);ButtonMouseDown(this);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4397409626710913610-4027041873824691616?l=arighi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://arighi.blogspot.com/feeds/4027041873824691616/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4397409626710913610&amp;postID=4027041873824691616' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4027041873824691616'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4397409626710913610/posts/default/4027041873824691616'/><link rel='alternate' type='text/html' href='http://arighi.blogspot.com/2007/02/first-post-on-my-blog.html' title='first post on my blog...'/><author><name>Andrea Righi</name><uri>https://profiles.google.com/101376048385071510808</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='32' src='//lh5.googleusercontent.com/-EQBqI7gO90k/AAAAAAAAAAI/AAAAAAAAHbg/_6xJF8FUWFY/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry></feed>
