Archive for the ‘kernel panic’ Category

android: binder: kernel panic: binder_get_ref

August 23, 2015

This post is to discuss debugging kernel panic at binder_get_ref. The reference source code here is qualcomm msm kernel release 3.4.0.

[  638.949732] c1    936 Unable to handle kernel paging request at virtual address e714c805
[  638.949732] c1    936 pgd = dd99c000
[  638.949763] c1    936 [e714c805] *pgd=9f075811, *pte=00000000, *ppte=00000000
[  638.949763] c1    936 DFSR=00000007, TTBCR=00000000, TTBR0=9e39c06a, TTBR1=8060406a
[  638.949763] c1    936 MAIR0=ff0881a8, MAIR1=40e040e0, SCTLR=10c5787d
[  638.949793] c1    936 Internal error: Oops: 7 [#1] PREEMPT SMP ARM
[  638.949793] c1    936 Modules linked in: bcmdhd dm_crypt(O) moc_crypto(PO) moc_platform_mod(O)
[  638.949824] c1    936 CPU: 1    Tainted: P        W  O  (3.4.10-gc50a6b2 #1)
[  638.949854] c1    936 PC is at binder_get_ref+0x8/0x34
[  638.949854] c1    936 LR is at binder_transaction_buffer_release+0x190/0x340
[  638.949854] c1    936 pc : [<c050c304>]    lr : [<c050e48c>]    psr: a0000013
[  638.949854] c1    936 sp : dae7ddd8  ip : 766045f4  fp : 000001cc
[  638.949885] c1    936 r10: 00000000  r9 : dc3e6400  r8 : 00000068
[  638.949885] c1    936 r7 : eb000288  r6 : c0cf22f8  r5 : 0001734e  r4 : eb000220
[  638.949885] c1    936 r3 : e714c7dd  r2 : 000001cb  r1 : 000001cc  r0 : da231580
…
[  638.952601] c1    936 Process Binder_4 (pid: 936, stack limit = 0xdae7c2f8)
…
[  638.952815] c1    936 [<c050c304>] (binder_get_ref+0x8/0x34) from [<c050e48c>] (binder_transaction_buffer_release+0x190/0x340)
[  638.952845] c1    936 [<c050e48c>] (binder_transaction_buffer_release+0x190/0x340) from [<c050fb1c>] (binder_thread_write+0x620/0x1f7c)
[  638.952845] c1    936 [<c050fb1c>] (binder_thread_write+0x620/0x1f7c) from [<c0512238>] (binder_ioctl+0x1d4/0x588)
[  638.952876] c1    936 [<c0512238>] (binder_ioctl+0x1d4/0x588) from [<c014d2cc>] (do_vfs_ioctl+0x4d8/0x54c)
[  638.952876] c1    936 [<c014d2cc>] (do_vfs_ioctl+0x4d8/0x54c) from [<c014d374>] (sys_ioctl+0x34/0x54)
[  638.952906] c1    936 [<c014d374>] (sys_ioctl+0x34/0x54) from [<c000e480>] (ret_fast_syscall+0x0/0x30)
[  638.952906] c1    936 Code: e1a00003 e12fff1e e5903010 ea000006 (e5932028) 
[  638.952937] c1    936 ---[ end trace 3d62df4cc805c0c7 ]---
[  638.952968] c1    936 Kernel panic - not syncing: Binder_4 PC:binder_get_ref+0x8/0x34 LR:binder_transaction_buffer_release+0x190/0x340

analysis
In kernel binder, a ref of a process points to a node in some process. A ref is a client and a node is a service. Each process use a rb tree to maintains all refs of this process. Each process also use another rb tree to maintains all nodes of this process.

In this case, thread Binder_4(tid=936), requests a binder transaction from a binder proxy whose description is 0x1cc. Binder driver tries to find ref whose description is 0x1cc from the ref rb tree.

This panic happens in binder_get_ref(binder_proc=0xdc3e6400, desc=0x1cc). log shows, at timestamp, ref = r3 = 0xe714c7dd and desc = r1 = 0x1cc. Kernel panic happens at ref->desc while deference ref->desc = [0xe714c7dd + 0x28] = 0x[e714c805]

static struct binder_ref *binder_get_ref(struct binder_proc *proc,
                     uint32_t desc)
{
    struct rb_node *n = proc->refs_by_desc.rb_node;
    struct binder_ref *ref;

    while (n) {
        ref = rb_entry(n, struct binder_ref, rb_node_desc);

        if (desc < ref->desc)
            n = n->rb_left;
        else if (desc > ref->desc)
            n = n->rb_right;
        else
            return ref; 
    }
    return NULL;
}
c050c2fc <binder_get_ref>:
c050c2fc:   e5903010    ldr r3, [r0, #16]
c050c300:   ea000006    b   c050c320 <binder_get_ref+0x24>
c050c304:   e5932028    ldr r2, [r3, #40]   ; 0x28
c050c308:   e2430004    sub r0, r3, #4
c050c30c:   e1510002    cmp r1, r2
c050c310:   35933008    ldrcc   r3, [r3, #8]
c050c314:   3a000001    bcc c050c320 <binder_get_ref+0x24>
c050c318:   912fff1e    bxls    lr
c050c31c:   e5933004    ldr r3, [r3, #4]
c050c320:   e3530000    cmp r3, #0
c050c324:   1afffff6    bne c050c304 <binder_get_ref+0x8>
c050c328:   e1a00003    mov r0, r3
c050c32c:   e12fff1e    bx  lr

analysis: check if ref rb tree is memory corrupted
Traverse the ref rb tree to find rb node whose desc is 01xcc. If the root cause is memory corruption, then the traverse would fail in the same place as the kernel panic. However, the traverse is successful and a rb node with desc 0x1cc is found. Now the root cause seems not related to memory corruption. Let’s check the assembly code for further analysis.
binder_panic_01

analysis: check registers and assembly code
Since r2 = 0x1cb, this panic happens while traversing from the rb node with desc 0x1cb to rb node with desc 0x1cd.

Given, desc = 0x1cc, ref->desc = 0x1cb, and n = 0xda231584, if
desc > ref->desc is true, then n = n->rb_right. So n = [0xda231584+0x4] = [0xda231588] = 0xda818a84.

One possible reason to cause this issue is a bit flip in CPU caches happens and [0xda231588] becomes [0xdaa31588] = 0xe714c7dd which is equal to r3.
0xda231588 = 0b11011010001000110001010110001000
0xdaa31588 = 0b11011010101000110001010110001000

binder_panic_02

c050c304:   e5932028    ldr r2, [r3, #40]   ; 0x28 
=> r2 = [0xda2315ac] = 0x1cb
c050c308:   e2430004    sub r0, r3, #4
=> r0 = 0xda231580
=> since 0x1cb < 0x1cc, traverse the right child.
c050c31c:   e5933004    ldr r3, [r3, #4]
=> r3 = [0xda231588] = 0xda818a84 
=> However, a bit flip happens => r3 = [0xdaa31588] = e714c7dd
c050c320:   e3530000    cmp r3, #0
c050c324:   1afffff6    bne c050c304 <binder_get_ref+0x8>
c050c304:   e5932028    ldr r2, [r3, #40]   ; 0x28
=> Unable to handle kernel paging request at virtual address e714c805

conclusion
In this case, a kernel panic happens at binder_get_ref. At first, traverse the ref rb tree to check if the ref rb tree is memory corrupted. The result shows that the ref rb tree is not memory corrupted. With further analysis of registers and assembly code in binder_get_ref, one possible root cause is that a bit flip happens in CPU caches while memory content is correct. So it’s possible a hardware issue rather than a software issue.

Advertisements

arm: mm: kernel panic: BUG: Bad page map in process

August 10, 2015

This post is to discuss debugging kernel panic: BUG: Bad page map in process. The reference source code here is qualcomm msm kernel release 3.4.0. In the case discussed here, CONFIG_ARM_LPAE is disabled, and virtual address space is 4 GB.

symptom: BUG: Bad page map in process
This issue happens while user space requests to unmap an interval of virtual memory. kernel removes the virtual memory interval and walks page table to find the physical page to which the virtual memory is mapping. If the physical page exists, then kernel will clear the page table entry and decrease map count of this mapped page. In this issue, the unmppaed virtual memory maps to a physical free page. It contradicts to unmap a free page and a BUG is triggered.

[ 518.237101] c1 6508 BUG: Bad page map in process c.sense.browser pte:22880747 pmd:22892831
[ 518.237115] c1 6508 page:d3847a00 count:0 mapcount:-128 mapping: (null) index:0x0
[ 518.237122] c1 6508 page flags: 0x14(referenced|dirty)
[ 518.237138] c1 6508 addr:90130000 vm_flags:000c00fb anon_vma: (null) mapping:efa02118 index:bb
[ 518.237152] c1 6508 vma->vm_ops->fault: kgsl_gpumem_vm_fault+0x0/0x48
[ 518.237161] c1 6508 vma->vm_file->f_op->mmap: kgsl_mmap+0x0/0x2b4
[ 518.237204] c1 6508 kernel BUG at … kernel/mm/memory.c:710!
[ 518.237212] c1 6508 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[ 518.237265] c1 6508 PC is at print_bad_pte+0x154/0x178
[ 518.237276] c1 6508 LR is at __print_symbol+0x2c/0x4c
[ 518.239850] c1 6508 Kernel panic - not syncing: c.sense.browser PC:print_bad_pte+0x154/0x178 LR:__print_symbol+0x2c/0x4c
[ 518.239760] c1 6508 [<c025df78>] (print_bad_pte+0x154/0x178) from [<c025f2d8>] (unmap_single_vma+0x330/0x5a8)
[ 518.239772] c1 6508 [<c025f2d8>] (unmap_single_vma+0x330/0x5a8) from [<c025fd9c>] (unmap_vmas+0x38/0x54)
[ 518.239784] c1 6508 [<c025fd9c>] (unmap_vmas+0x38/0x54) from [<c026296c>] (unmap_region+0xa4/0x194)
[ 518.239795] c1 6508 [<c026296c>] (unmap_region+0xa4/0x194) from [<c026394c>] (do_munmap+0x220/0x288)
[ 518.239806] c1 6508 [<c026394c>] (do_munmap+0x220/0x288) from [<c02639f0>] (vm_munmap+0x3c/0x50)
[ 518.239820] c1 6508 [<c02639f0>] (vm_munmap+0x3c/0x50) from [<c01061e0>] (ret_fast_syscall+0x0/0x30)
[ 518.239829] c1 6508 Code: 0a000002 e5931028 e59f0020 ebfe8752 (e7f001f2)

user space virtual memory management
Kernel manages the user space virtual address space with a double linked list, current->mm->mmap. While user space tries to unmap a virtual memory interval, kernel removes the interval from the list current->mm->mmap. Kernel also uses a rb tree to manage the same virtual address space to provide optimal time complexity to find the required interval.

BUG_Bad_page_map_01

page table
Since CONFIG_ARM_LPAE is disabled, the page table is 2-level rather than 3-level. In arm v7, within a virtual address, [31..20] bits specify the first level page table offset, [19..12] bits specify the second level page table offset, and [12..0] bits specify the address within the physical page.

In arm, kernel puts the two consecutive second level page tables together. And it tweaks to use [31..21] bits to specify the first level page table offset, [20..12] bits to specify the second level page table offset.

BUG_Bad_page_map_02

debug
do_munmap might do the following three jobs:

  • Remove vm_area_struct from current->mm->mmap
  • Clear page table entry at current->mm->pg[pgd_offste]->pte[pte_offset]
  • Modify page structure pointed by page table entry
  • In the case, kernel unmaps an interval from 0x90130000 in virtual address space. After walking the page table, kernel concludes that it maps to a physical page whose corresponding page structure is at 0xD3847A00. After decreased by one, page->_mapcount is equal to -129. The page is a Buddy one with order 4. With page->lru, it’s confirmed the page is in a freelist and its predecessor and successor are both order 4 Buddy page, too. So far, it seems that virtual address is not supposed to map to this physical page, let’s walk the page table to see what happens.

    BUG_Bad_page_map_03

    #define PAGE_BUDDY_MAPCOUNT_VALUE (-128)
     
    static inline int PageBuddy(struct page *page)
    {
             return atomic_read(&page->_mapcount) == PAGE_BUDDY_MAPCOUNT_VALUE;
    }
     
    inline void __SetPageBuddy(struct page *page)
    {
            VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
            atomic_set(&page->mapcount, PAGE_BUDDY_MAPCOUNT_VALUE);
    }
     
    static inline void __ClearPageBuddy(struct page *page)
    {
            VM_BUG_ON_PAGE(!PageBuddy(page), page);
            atomic_set(&page->mapcount, -1);
    

    debug: find the first level page table entry
    The first level page table is task->mm->pgd. Given virtual address is 0x90130000, the first level page table entry is task->mm->pgd[0x480] = 0xE923A400. From ramdump, it appears that in print_bad_pte function the pgd variable have the same value. So it’s probably correct.

    (gdb) p/x 0x90130000 >> 21
    $1 = 0x480
    (gdb) p/x ((pgd_t *)0xE9238000) + 0x480
    $2 = 0xe923a400
    

    BUG_Bad_page_map_04

    debug: find the second level page table
    The first level page table entry points to physical address of the second level page table. The second level page table is 1KB in size and alignment. The physical address of the second level page table is 0x22892800 and 0x31 is flag.

    [0xE923A400] = 0x22892831
    

    As far as second level page table is concerned, arm 32-bit kernel puts two linux page tables and two hardware page tables within a page.

    Within in page 0x22892000:

  • The linux page table 0x22892000 is related to task->mm->pgd[0x480]
  • The linux page table 0x22892400 is related to task->mm->pgd[0x481]
  • The hw page table 0x22892800 is related to task->mm->pgd[0x480]
  • The hw page table 0x22892C00 is related to task->mm->pgd[0x481]
  • task->mm->pgd[0x480] points to the hw page table 0x22892800
  • task->mm->pgd[0x481] points to the hw page table 0x22892C00
  • BUG_Bad_page_map_05

    debug: find the second level page table entry

    (gdb) p/x (0x90130000 >> 12) & ((1<<9)-1)
    $1 = 0x130
    (gdb) p/x ((pte_t *)0x22892000) + 0x130
    $2 = 228924C0
    

    BUG_Bad_page_map_06

    debug: find the physical page
    The content of the second level page table entry was already erased by unmap. From adjacent second level page table entry, the physical address of the page might be 0x22980000. last_kmsg shows that pte is 0x22880000.

    [ 518.237101] c1 6508 BUG: Bad page map in process c.sense.browser pte:22880747 pmd:22892831
    
    mem_map[22880] = 0xD3847A00
    

    BUG_Bad_page_map_07

    conclusion
    The vma should map to the physical page at 0x22980000. But log shows that the physical page is at 0x22880000. It’s possible that a bit-flip happens in cpu cache and the incorrect pte value related to a free page trigger a BUG.


    %d bloggers like this: