Archive for the ‘memory corruption’ Category

kernel: mm: debug_pagealloc

January 25, 2016

This post discusses how to enable debug_pagealloc and how it detects single-bit errors and memory corruption.

reference code base
linux 4.3

how to enable debug_pagealloc

  • If CONFIG_DEBUG_PAGEALLOC is not set, then debug_pagealloc is always disabled.
  • If CONFIG_DEBUG_PAGEALLOC is set, then debug_pagealloc is still disabled by default. Adding debug_pagealloc=on to the kernel command line enables the feature. The post “android: add arguments in kernel command line” shows how to add arguments to the kernel command line.

        debug_pagealloc=
                        [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
                        parameter enables the feature at boot time. In
                        default, it is disabled. We can avoid allocating huge
                        chunk of memory for debug pagealloc if we don't enable
                        it at boot time and the system will work mostly same
                        with the kernel built without CONFIG_DEBUG_PAGEALLOC.
                        on: enable the feature
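
As a rough illustration of the rule above, the following userspace sketch scans a command line string for the token debug_pagealloc=on. It is only a model: the function name cmdline_enables_debug_pagealloc() is hypothetical, it is not the kernel's own parameter parser, and it assumes arguments are separated by single spaces and that "on" is the only value that enables the feature, as the documentation excerpt lists.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model, not kernel code: report whether a command line
 * contains the exact argument "debug_pagealloc=on".
 * The function name is hypothetical.
 */
static bool cmdline_enables_debug_pagealloc(const char *cmdline)
{
	static const char key[] = "debug_pagealloc=";
	const size_t key_len = sizeof(key) - 1;
	const char *p = cmdline;
	bool enabled = false;

	while (*p) {
		size_t len;

		/* Skip the separators between arguments. */
		while (*p == ' ')
			p++;

		/* Length of the current argument, up to the next space. */
		len = strcspn(p, " ");

		/* Only the exact value "on" enables the feature. */
		if (len == key_len + 2 &&
		    strncmp(p, key, key_len) == 0 &&
		    strncmp(p + key_len, "on", 2) == 0)
			enabled = true;

		p += len;
	}
	return enabled;
}
```

With this sketch, "console=ttyS0 debug_pagealloc=on quiet" is reported as enabled, while a command line without the argument, or with any other value, is not.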

allocate/free pages and debug_pagealloc
Freeing pages poisons them with 0xaa, and allocating pages unpoisons them. While unpoisoning a page, the kernel checks that every byte of the poisoned page is still 0xaa. If exactly one bit in a single byte differs, the kernel log shows “pagealloc: single bit error”. Any other mismatch is reported as “pagealloc: memory corruption”.
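
The check made while unpoisoning can be sketched in userspace as follows. This is a minimal model under stated assumptions, not the kernel source: the enum, the function names and the plain byte-by-byte loop are chosen for illustration, and only the 0xaa pattern and the two log categories come from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_POISON 0xaa

/* Illustrative result codes; the names are assumptions of this sketch. */
enum poison_result {
	POISON_OK,
	POISON_SINGLE_BIT_ERROR,
	POISON_MEMORY_CORRUPTION
};

/* True when the two bytes differ in exactly one bit. */
static bool single_bit_flip(unsigned char a, unsigned char b)
{
	unsigned char error = a ^ b;

	/* A nonzero value with one bit set shares no bits with value - 1. */
	return error && !(error & (error - 1));
}

/* Classify a buffer that is expected to hold only PAGE_POISON bytes. */
static enum poison_result check_poison(const unsigned char *mem, size_t bytes)
{
	size_t bad = 0;		/* number of bytes that differ from the pattern */
	size_t first = 0;	/* index of the first differing byte */
	size_t i;

	for (i = 0; i < bytes; i++) {
		if (mem[i] != PAGE_POISON) {
			if (bad == 0)
				first = i;
			bad++;
		}
	}

	if (bad == 0)
		return POISON_OK;
	if (bad == 1 && single_bit_flip(mem[first], PAGE_POISON))
		return POISON_SINGLE_BIT_ERROR;
	return POISON_MEMORY_CORRUPTION;
}
```

A buffer with one byte changed from 0xaa to 0xae is classified as a single bit error. Two flipped bits, whether in one byte or in different bytes, are classified as memory corruption.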

Allocation path
__alloc_pages_nodemask()
-> get_page_from_freelist()
   -> prep_new_page()
      -> kernel_map_pages()
         -> __kernel_map_pages()

Free path for order 0 pages
__free_pages()
-> free_hot_cold_page()
   -> free_pages_prepare()
      -> kernel_map_pages()
         -> __kernel_map_pages()

Free path for high order pages
__free_pages()
-> __free_pages_ok()
   -> free_pages_prepare()
      -> kernel_map_pages()
         -> __kernel_map_pages()

mm/debug-pagealloc.c:
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
        if (!page_poisoning_enabled)
                return;

        if (enable)
                unpoison_pages(page, numpages);
        else
                poison_pages(page, numpages);
}

include/linux/poison.h:
/********** mm/debug-pagealloc.c **********/
#define PAGE_POISON 0xaa
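
To make the role of the enable argument concrete, here is a small userspace model of the dispatch shown above. It is a sketch under assumptions, not kernel code: struct fake_page, FAKE_PAGE_SIZE, map_pages() and the poisoned flag are all invented for illustration, and the check only counts mismatching bytes instead of printing a log message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_POISON 0xaa
#define FAKE_PAGE_SIZE 4096	/* assumed page size for this model */

/* Hypothetical stand-in for a page: a poison flag plus its contents. */
struct fake_page {
	bool poisoned;
	unsigned char data[FAKE_PAGE_SIZE];
};

/* Fill the page with the poison pattern and remember that we did. */
static void poison_page(struct fake_page *page)
{
	page->poisoned = true;
	memset(page->data, PAGE_POISON, sizeof(page->data));
}

/*
 * Return how many bytes no longer hold the poison pattern.
 * A page that was never poisoned is skipped and reports 0.
 */
static size_t unpoison_page(struct fake_page *page)
{
	size_t bad = 0;
	size_t i;

	if (!page->poisoned)
		return 0;

	for (i = 0; i < sizeof(page->data); i++)
		if (page->data[i] != PAGE_POISON)
			bad++;

	page->poisoned = false;
	return bad;
}

/* enable != 0 unpoisons and checks the pages; enable == 0 poisons them. */
static size_t map_pages(struct fake_page *pages, int numpages, int enable)
{
	size_t bad = 0;
	int i;

	for (i = 0; i < numpages; i++) {
		if (enable)
			bad += unpoison_page(&pages[i]);
		else
			poison_page(&pages[i]);
	}
	return bad;
}
```

Calling map_pages() with enable set to 0, writing one byte into a page, and then calling it again with enable set to 1 reports one mismatching byte. That is the situation in which the real feature would log an error.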

conclusion
To enable debug_pagealloc, compile the kernel with CONFIG_DEBUG_PAGEALLOC=y and add debug_pagealloc=on to the kernel command line. The feature poisons pages when they are freed and unpoisons them when they are allocated. While unpoisoning, if a page's content no longer matches the poison pattern, the kernel log shows “pagealloc: single bit error” or “pagealloc: memory corruption”.


patch discussion: mm/vmscan.c: fix types of some locals

December 27, 2015

This post discusses the patch “mm/vmscan.c: fix types of some locals”.

merged at
git: kernel/git/mhocko/mm.git
branch: since-4.3

zone_page_state(), zone_unmapped_file_pages()
Both functions return a page count of type unsigned long.

what does the patch do
The patch addresses locals declared as int or long that receive the return value of a function returning unsigned long. A large count can be truncated or turn negative when stored in such a variable. The patch changes these locals, and the return type of zone_pagecache_reclaimable(), to unsigned long.
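
The effect of the type change can be illustrated with a small userspace sketch. Every name here is hypothetical: fake_nr_pages() stands in for a counter function that returns unsigned long, and sub_clamped() shows one common way to subtract unsigned counts without wrapping below zero. It is not taken from the patch.

```c
#include <limits.h>

/* Hypothetical counter: returns a page count just above INT_MAX. */
static unsigned long fake_nr_pages(void)
{
	return (unsigned long)INT_MAX + 1UL;
}

/*
 * Pre-patch pattern: an int local receives an unsigned long.
 * The value does not fit, so the result is implementation-defined
 * (GCC and Clang wrap it to a negative number).
 */
static int count_as_int(void)
{
	int nr = (int)fake_nr_pages();

	return nr;
}

/* Post-patch pattern: the local has the same type as the return value. */
static unsigned long count_as_ulong(void)
{
	unsigned long nr = fake_nr_pages();

	return nr;
}

/* Subtract delta from total, clamping so the result never wraps. */
static unsigned long sub_clamped(unsigned long total, unsigned long delta)
{
	if (delta > total)
		delta = total;
	return total - delta;
}
```

Here count_as_ulong() returns the original count unchanged, while count_as_int() cannot represent it. With unsigned arithmetic, a guard such as the one in sub_clamped() is needed because subtracting a larger value would wrap around to a very large number instead of going negative.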

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6ceede0..55721b6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -194,7 +194,7 @@ static bool sane_reclaim(struct scan_control *sc)
 
 static unsigned long zone_reclaimable_pages(struct zone *zone)
 {
-	int nr;
+	unsigned long nr;
 
 	nr = zone_page_state(zone, NR_ACTIVE_FILE) +
 	     zone_page_state(zone, NR_INACTIVE_FILE);
@@ -3693,10 +3693,10 @@ static inline unsigned long zone_unmapped_file_pages(struct zone *zone)
 }
 
 /* Work out how many page cache pages we can reclaim in this reclaim_mode */
-static long zone_pagecache_reclaimable(struct zone *zone)
+static unsigned long zone_pagecache_reclaimable(struct zone *zone)
 {
-	long nr_pagecache_reclaimable;
-	long delta = 0;
+	unsigned long nr_pagecache_reclaimable;
+	unsigned long delta = 0;
 
 	/*
 	 * If RECLAIM_UNMAP is set, then all file pages are considered

conclusion
This post discusses the patch “mm/vmscan.c: fix types of some locals”. When a function returns unsigned long, the caller should store the result in an unsigned long variable so that the value is neither truncated nor misinterpreted as negative.

