Monday, April 16, 2007

ZERO_PAGE or not ZERO_PAGE...

There is an interesting discussion on the LKML about removing the ZERO_PAGE for anonymous mappings (http://lkml.org/lkml/2007/4/3/432).

ZERO_PAGE is a single physical page that is always filled with zeros, and it is used to back zero-filled memory areas.

It is used, for example, to initialize the anonymous pages of a task (memory that is not file-backed and exists only during the life of the task). When a program calls malloc() and the allocator obtains fresh memory from the kernel, those pages must read as zeros. If the program reads from that buffer before writing to it, the kernel, instead of allocating new physical pages for no good reason, maps every virtual page that is accessed to the ZERO_PAGE.
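The read-before-write case described above can be reproduced from user space. The following is a minimal sketch, assuming a Linux system with mmap() and MAP_ANONYMOUS; the helper name count_nonzero_in_fresh_mapping is hypothetical and not part of the kernel or of the patch. It only shows that fresh anonymous memory reads as zeros; whether those reads are served by the ZERO_PAGE or by newly allocated pages depends on the kernel.

```c
#define _DEFAULT_SOURCE		/* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of private anonymous memory, read
 * every byte without writing first, and return the number of non-zero
 * bytes seen. Returns -1 if the mapping cannot be created.
 */
static long count_nonzero_in_fresh_mapping(size_t len)
{
	unsigned char *buf;
	long nonzero = 0;
	size_t i;

	assert(len > 0);

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Read-only access: the first touch of each page is a read fault. */
	for (i = 0; i < len; i++)
		if (buf[i] != 0)
			nonzero++;

	munmap(buf, len);
	return nonzero;
}
```

Calling count_nonzero_in_fresh_mapping(16 * 4096) should return 0, because anonymous memory handed out by the kernel is always zero-filled.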

Anyway, in general, an application that reads from a freshly allocated, empty buffer is a rather stupid application :-) (except when you have to work with sparse matrices!), and the ZERO_PAGE handling has a cost in every COW fault.

The following patch removes the ZERO_PAGE handling for anonymous memory mappings and simply allocates new physical pages when a program reads from empty buffers. Depending on your applications, you should see a small performance improvement, but higher memory consumption if you run the kind of applications mentioned above.
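To get a rough idea of the memory side of this trade-off, one can compare how much the resident set grows when a fresh anonymous mapping is only read versus when it is written. This is a sketch under stated assumptions: it relies on the Linux-specific /proc/self/statm file, and the helpers resident_pages and rss_growth_pages are hypothetical names introduced here for illustration. How the ZERO_PAGE is accounted varies between kernel versions (the unpatched code below counts it in file_rss), so treat the numbers as indicative only.

```c
#define _DEFAULT_SOURCE		/* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Resident set size of this process in pages, taken from the second
 * field of /proc/self/statm. Returns -1 if it cannot be read.
 */
static long resident_pages(void)
{
	long size, resident;
	FILE *f = fopen("/proc/self/statm", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%ld %ld", &size, &resident) != 2)
		resident = -1;
	fclose(f);
	return resident;
}

/*
 * Touch one byte in each page of a fresh anonymous mapping of `npages`
 * pages, reading if do_write == 0 and writing otherwise, and store the
 * change in resident pages in *growth. Returns 0 on success, -1 on error.
 */
static int rss_growth_pages(size_t npages, int do_write, long *growth)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	volatile unsigned char *buf;
	unsigned char sink = 0;
	long before, after;
	size_t len, i;

	assert(pagesz > 0 && npages > 0 && growth != NULL);
	len = npages * (size_t)pagesz;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	before = resident_pages();
	for (i = 0; i < len; i += (size_t)pagesz) {
		if (do_write)
			buf[i] = 1;		/* write fault: needs a private page */
		else
			sink ^= buf[i];		/* read fault: may map the ZERO_PAGE */
	}
	after = resident_pages();
	(void)sink;

	munmap((void *)buf, len);
	if (before < 0 || after < 0)
		return -1;
	*growth = after - before;
	return 0;
}
```

Calling rss_growth_pages(4096, 0, &g) and then rss_growth_pages(4096, 1, &g) gives the two figures to compare. With the patch applied, one would expect the read case to grow by roughly as many pages as the write case, since every read fault allocates a real page.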

Side note: I'm using it on my notebook and it works fine! :-)


--- linux-2.6.20.4/mm/memory.c.orig 2007-04-06 00:23:52.000000000 +0200
+++ linux-2.6.20.4/mm/memory.c 2007-04-06 00:25:48.000000000 +0200
@@ -1569,16 +1569,11 @@

 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	if (old_page == ZERO_PAGE(address)) {
-		new_page = alloc_zeroed_user_highpage(vma, address);
-		if (!new_page)
-			goto oom;
-	} else {
-		new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
-		if (!new_page)
-			goto oom;
-		cow_user_page(new_page, old_page, address, vma);
-	}
+
+	new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+	if (!new_page)
+		goto oom;
+	cow_user_page(new_page, old_page, address, vma);

 	/*
 	 * Re-check the pte - we dropped the lock
@@ -2088,38 +2083,24 @@
 	spinlock_t *ptl;
 	pte_t entry;

-	if (write_access) {
-		/* Allocate our own private page. */
-		pte_unmap(page_table);
+	/* Allocate our own private page. */
+	pte_unmap(page_table);

-		if (unlikely(anon_vma_prepare(vma)))
-			goto oom;
-		page = alloc_zeroed_user_highpage(vma, address);
-		if (!page)
-			goto oom;
+	if (unlikely(anon_vma_prepare(vma)))
+		goto oom;
+	page = alloc_zeroed_user_highpage(vma, address);
+	if (!page)
+		goto oom;

-		entry = mk_pte(page, vma->vm_page_prot);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	entry = mk_pte(page, vma->vm_page_prot);
+	entry = maybe_mkwrite(pte_mkdirty(entry), vma);

-		page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
-		if (!pte_none(*page_table))
-			goto release;
-		inc_mm_counter(mm, anon_rss);
-		lru_cache_add_active(page);
-		page_add_new_anon_rmap(page, vma, address);
-	} else {
-		/* Map the ZERO_PAGE - vm_page_prot is readonly */
-		page = ZERO_PAGE(address);
-		page_cache_get(page);
-		entry = mk_pte(page, vma->vm_page_prot);
-
-		ptl = pte_lockptr(mm, pmd);
-		spin_lock(ptl);
-		if (!pte_none(*page_table))
-			goto release;
-		inc_mm_counter(mm, file_rss);
-		page_add_file_rmap(page);
-	}
+	page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
+	if (unlikely(!pte_none(*page_table)))
+		goto release;
+	inc_mm_counter(mm, anon_rss);
+	lru_cache_add_active(page);
+	page_add_new_anon_rmap(page, vma, address);

 	set_pte_at(mm, address, page_table, entry);
