Archive for the ‘arm64’ Category

kernel: arm64: mm: how user space stack grows

December 28, 2015

This post discusses how the user-space stack grows.

reference code base
linux 4.3

kernel config assumption
# CONFIG_STACK_GROWSUP is not set

how does user space stack grow
As discussed in kernel: mm: task->mm->mmap_sem, a virtual address space is represented as a set of intervals, each described by a VMA. The stack corresponds to a VMA whose vm_flags has VM_GROWSDOWN set.

If the stack grows below the range covered by its VMA, a page fault is triggered. __do_page_fault() calls expand_stack() to extend the VMA downward, then calls handle_mm_fault() to allocate a page and update the page tables.

do_page_fault()
-> __do_page_fault()
   -> expand_stack()
      -> expand_downwards()
   -> handle_mm_fault()
      -> __handle_mm_fault()

__do_page_fault
If vma->vm_start > addr, __do_page_fault() checks whether VM_GROWSDOWN is set in vma->vm_flags. If it is, it calls expand_stack(vma, addr) to extend the VMA downward so that it covers the page containing addr. If that succeeds, it calls handle_mm_fault() to allocate a page and update the page tables.

static int __do_page_fault(struct mm_struct *mm, unsigned long addr,
                           unsigned int mm_flags, unsigned long vm_flags,
                           struct task_struct *tsk)
{
        struct vm_area_struct *vma;
        int fault;

        vma = find_vma(mm, addr);
        fault = VM_FAULT_BADMAP;
        if (unlikely(!vma))
                goto out;
        if (unlikely(vma->vm_start > addr))
                goto check_stack;

        /*
         * Ok, we have a good vm_area for this memory access, so we can handle
         * it.
         */
good_area:
        /*
         * Check that the permissions on the VMA allow for the fault which
         * occurred. If we encountered a write or exec fault, we must have
         * appropriate permissions, otherwise we allow any permission.
         */
        if (!(vma->vm_flags & vm_flags)) {
                fault = VM_FAULT_BADACCESS;
                goto out;
        }

        return handle_mm_fault(mm, vma, addr & PAGE_MASK, mm_flags);

check_stack:
        if (vma->vm_flags & VM_GROWSDOWN && !expand_stack(vma, addr))
                goto good_area;
out:
        return fault;
}
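
The check_stack decision can be modelled in user space. The following is only a minimal sketch: struct toy_vma and toy_page_fault() are hypothetical names, and the stack size limit and other checks that the real expand_stack() performs are left out. It assumes 4 KB pages.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE  ((uint64_t)4096)          /* assumes 4 KB pages */
#define TOY_PAGE_MASK  (~(TOY_PAGE_SIZE - 1))
#define TOY_GROWSDOWN  ((uint64_t)0x0100)        /* same bit as VM_GROWSDOWN */

enum toy_fault { TOY_FAULT_HANDLED, TOY_FAULT_BADMAP };

/* Hypothetical, stripped-down stand-in for struct vm_area_struct. */
struct toy_vma {
	uint64_t vm_start;   /* lowest address covered (inclusive) */
	uint64_t vm_end;     /* first address past the area */
	uint64_t vm_flags;
};

/* Follows the shape of __do_page_fault(): only a stack VMA may grow down. */
static enum toy_fault toy_page_fault(struct toy_vma *vma, uint64_t addr)
{
	if (addr >= vma->vm_end)
		return TOY_FAULT_BADMAP;     /* find_vma() would not pick this VMA */
	if (addr >= vma->vm_start)
		return TOY_FAULT_HANDLED;    /* already inside the VMA: good_area */
	if (!(vma->vm_flags & TOY_GROWSDOWN))
		return TOY_FAULT_BADMAP;     /* hole below a non-stack VMA */

	/* Net effect of a successful expand_stack(): cover the faulting page. */
	vma->vm_start = addr & TOY_PAGE_MASK;
	return TOY_FAULT_HANDLED;
}
```

In this model, a fault two pages below vm_start moves vm_start down by two pages, which shows that the new start is derived from the faulting address rather than from a fixed step.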

conclusion
This post discusses how the user-space stack grows. When the stack is accessed below the lowest address of its VMA, a page fault is triggered. The fault handler extends the VMA downward to cover the faulting address, allocates a page, and updates the page tables.

mm: arm64: zone_size_init

December 20, 2015

This post discusses zone_sizes_init().

reference code base
Qualcomm msm8994 LA.BF64.1.1-06510-8x94.0 with Android 5.0.2(LRX22G), bootloader (L)ittle (K)ernel, and Linux kernel 3.10.49.

CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_NO_BOOTMEM=y

call stack

start_kernel()
-> setup_arch()
   -> arm64_memblock_init()
   -> paging_init()
      -> bootmem_init()
         -> zone_sizes_init()
            -> free_area_init_node()
               -> free_area_init_core()

paging_init()
As discussed in kernel: arm64: memblock initialisation, arm64_memblock_init() sets up the memory blocks according to the device tree. paging_init() then sets up the page tables, the zone memory maps, and the zero page.

zone_sizes_init()
zone_sizes_init() calculates the size of each zone. Memory blocks whose addresses are below 4 GB belong to the DMA zone. Memory blocks whose addresses are at or above 4 GB belong to the NORMAL zone.

After the size of each zone is determined, zone_sizes_init() calls free_area_init_node() to initialize each zone.

static void __init zone_sizes_init(unsigned long min, unsigned long max)
{
	struct memblock_region *reg;
	unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
	unsigned long max_dma = min;

	memset(zone_size, 0, sizeof(zone_size));

	/* 4GB maximum for 32-bit only capable devices */
	if (IS_ENABLED(CONFIG_ZONE_DMA)) {
		unsigned long max_dma_phys =
			(unsigned long)(dma_to_phys(NULL, DMA_BIT_MASK(32)) + 1);
		max_dma = max(min, min(max, max_dma_phys >> PAGE_SHIFT));
		zone_size[ZONE_DMA] = max_dma - min;
	}
	zone_size[ZONE_NORMAL] = max - max_dma;

	memcpy(zhole_size, zone_size, sizeof(zhole_size));

	for_each_memblock(memory, reg) {
		unsigned long start = memblock_region_memory_base_pfn(reg);
		unsigned long end = memblock_region_memory_end_pfn(reg);

		if (start >= max)
			continue;

		if (IS_ENABLED(CONFIG_ZONE_DMA) && start < max_dma) {
			unsigned long dma_end = min(end, max_dma);
			zhole_size[ZONE_DMA] -= dma_end - start;
		}

		if (end > max_dma) {
			unsigned long normal_end = min(end, max);
			unsigned long normal_start = max(start, max_dma);
			zhole_size[ZONE_NORMAL] -= normal_end - normal_start;
		}
	}

	free_area_init_node(0, zone_size, min, zhole_size);
}

an example in which there is only one zone
In this example, all memory blocks are below 4 GB, i.e., 0x100000000, so /proc/zoneinfo shows that the device has only one zone, DMA.

If the memory layout of the device were changed so that the first 1.5 GB of memory lies below 0x100000000 and the second 1.5 GB lies above it, the system would have two zones, DMA and NORMAL, of nearly equal size.

$ adb shell cat /sys/kernel/debug/memblock/memory
   0: 0x0000000000000000..0x00000000057fffff
   1: 0x000000000f900000..0x000000005fffffff
   2: 0x0000000080000000..0x00000000dfffffff
$ adb shell cat /proc/zoneinfo
Node 0, zone      DMA
  pages free     71889
        min      1923
        low      17260
        high     17741
        scanned  0
        spanned  917504
        present  745216
        managed  599867
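
The spanned and present counts above can be cross-checked against the memblock regions. The sketch below covers only the single-zone case of this example and assumes 4 KB pages; struct region, struct zone_pages, and dma_zone_pages() are hypothetical names, not kernel APIs. It follows the same arithmetic as zone_sizes_init(): spanned is the PFN range of the zone, and present is spanned minus the holes between regions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KB pages */

/* Inclusive byte range, as printed by /sys/kernel/debug/memblock/memory. */
struct region {
	uint64_t base;
	uint64_t last;
};

struct zone_pages {
	uint64_t spanned;   /* zone_size[ZONE_DMA] */
	uint64_t present;   /* zone_size[ZONE_DMA] - zhole_size[ZONE_DMA] */
};

/* Single-zone case: every region is below 4 GB, so all pages go to ZONE_DMA. */
static struct zone_pages dma_zone_pages(const struct region *r, size_t n)
{
	struct zone_pages z = { 0, 0 };
	uint64_t min_pfn = UINT64_MAX, max_pfn = 0;

	if (n == 0)
		return z;

	for (size_t i = 0; i < n; i++) {
		uint64_t start = r[i].base >> TOY_PAGE_SHIFT;
		uint64_t end = (r[i].last + 1) >> TOY_PAGE_SHIFT;   /* exclusive */

		if (start < min_pfn)
			min_pfn = start;
		if (end > max_pfn)
			max_pfn = end;
		z.present += end - start;
	}
	z.spanned = max_pfn - min_pfn;
	return z;
}
```

Fed with the three regions listed above, this gives spanned = 917504 and present = 745216, which matches the zoneinfo output.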

why DMA zone is needed
Some devices can only access 32-bit IO addresses. The drivers of these devices cannot use physical pages whose addresses are greater than or equal to 0x100000000. The DMA zone, which consists of memory blocks whose addresses are below 0x100000000, provides usable pages for these drivers.

conclusion
This post discusses how zone_sizes_init() determines the zone sizes in arm64. After it determines the size of each zone, it calls free_area_init_node() to initialize each zone.

kernel: arm64: mm: allocate kernel stack

November 21, 2015

This post discusses how the kernel stack is allocated in arm64.

reference code base
LA.BF64.1.2.1-02220-8x94.0 with Android 5.1.0_r3(LMY47I) and Linux kernel 3.10.49.

reference kernel config

# CONFIG_NUMA is not set
CONFIG_ZONE_DMA=y
# CONFIG_MEMCG is not set
# CONFIG_TRANSPARENT_HUGEPAGE is not set
CONFIG_MEMORY_ISOLATION=y
CONFIG_CMA=y

when is kernel stack created
A kernel stack is created for each thread at fork.

do_fork()
-> copy_process()
   -> dup_task_struct()
      -> alloc_thread_info_node()
         -> alloc_pages_node()

arm64 kernel stack size and thread info
In arm64, each thread needs an order-2 allocation, i.e., four contiguous 4 KB pages (16 KB), as its kernel stack. The thread_info of each thread is at the lowest address of this block. The SP of each thread initially points to the highest address of this block minus 16.

/*
 * This creates a new process as a copy of the old one,
 * but does not actually start it yet.
 *
 * It copies the registers, and all the appropriate
 * parts of the process environment (as per the clone
 * flags). The actual kick-off is left to the caller.
 */
static struct task_struct *copy_process(unsigned long clone_flags,
					unsigned long stack_start,
					unsigned long stack_size,
					int __user *child_tidptr,
					struct pid *pid,
					int trace)
{
	......
	p = dup_task_struct(current);
	......
}

static struct task_struct *dup_task_struct(struct task_struct *orig)
{
	struct task_struct *tsk;
	struct thread_info *ti;
	unsigned long *stackend;
	int node = tsk_fork_get_node(orig);
	int err;

	tsk = alloc_task_struct_node(node);
	if (!tsk)
		return NULL;

	ti = alloc_thread_info_node(tsk, node);
	if (!ti)
		goto free_tsk;
	......
}

static struct thread_info *alloc_thread_info_node(struct task_struct *tsk,
						  int node)
{
	struct page *page = alloc_pages_node(node, THREADINFO_GFP_ACCOUNTED,
					     THREAD_SIZE_ORDER);

	return page ? page_address(page) : NULL;
}

#ifndef CONFIG_ARM64_64K_PAGES
#define THREAD_SIZE_ORDER 2
#endif

#define THREAD_SIZE 16384
#define THREAD_START_SP (THREAD_SIZE - 16)

#ifdef __KERNEL__

#ifdef CONFIG_DEBUG_STACK_USAGE
# define THREADINFO_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
#else
# define THREADINFO_GFP (GFP_KERNEL | __GFP_NOTRACK)
#endif

#define THREADINFO_GFP_ACCOUNTED (THREADINFO_GFP | __GFP_KMEMCG)
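
The layout implied by THREAD_SIZE and THREAD_START_SP can be sketched with plain address arithmetic. The helpers initial_sp() and thread_info_addr() below are hypothetical names used for illustration. The masking step assumes the stack block is aligned to THREAD_SIZE, which holds for an order-2 block from the buddy allocator with 4 KB pages.

```c
#include <stdint.h>

#define TOY_THREAD_SIZE      ((uint64_t)16384)
#define TOY_THREAD_START_SP  (TOY_THREAD_SIZE - 16)

/* Initial kernel SP for a stack block whose lowest address is stack_base. */
static uint64_t initial_sp(uint64_t stack_base)
{
	return stack_base + TOY_THREAD_START_SP;
}

/*
 * thread_info sits at the lowest address of the block, so any SP inside the
 * block can be rounded down to a THREAD_SIZE boundary to find it.
 */
static uint64_t thread_info_addr(uint64_t sp)
{
	return sp & ~(TOY_THREAD_SIZE - 1);
}
```

For a block at 0xffffffc012344000, the initial SP is 0xffffffc012347ff0, and masking that SP gives back the block base, where thread_info lives.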

gfp_mask of this allocation

  • By default, the gfp_mask of this allocation is (__GFP_NOTRACK | __GFP_KMEMCG | __GFP_FS | __GFP_IO | __GFP_WAIT) = 0x3000d0.
  • According to kernel: alloc_page: how suspend resume controls gfp_mask, the gfp_mask of this allocation becomes (__GFP_NOTRACK | __GFP_KMEMCG | __GFP_WAIT) = 0x300010 while the system is suspended. Since all user-space processes are frozen before the system enters suspend, this case only applies to kthreadd.
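
The two values can be verified with a few lines of C. The TOY_ constants below are local aliases that carry the same bit values as the kernel's ___GFP_* definitions, and the function names are hypothetical. This is only arithmetic on the masks, not kernel code.

```c
#include <stdint.h>

/* Local aliases for the kernel's ___GFP_* bit values. */
#define TOY_GFP_WAIT     0x10u
#define TOY_GFP_IO       0x40u
#define TOY_GFP_FS       0x80u
#define TOY_GFP_KMEMCG   0x100000u
#define TOY_GFP_NOTRACK  0x200000u

/* THREADINFO_GFP_ACCOUNTED without CONFIG_DEBUG_STACK_USAGE. */
static uint32_t kernel_stack_gfp(void)
{
	return TOY_GFP_NOTRACK | TOY_GFP_KMEMCG |
	       TOY_GFP_FS | TOY_GFP_IO | TOY_GFP_WAIT;
}

/* While suspended, the IO and FS bits are masked out of every request. */
static uint32_t gfp_while_suspended(uint32_t gfp)
{
	return gfp & ~(TOY_GFP_IO | TOY_GFP_FS);
}
```

kernel_stack_gfp() evaluates to 0x3000d0, and gfp_while_suspended() applied to it gives 0x300010.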

    conclusion
    This post discusses the allocation of the kernel stack in arm64, including the order and gfp_mask parameters used for the allocation.

    kernel: mm: page_alloc: behaviors of page allocation while a thread forks

    November 18, 2015

    This post discusses the behavior of page allocation while a thread forks. In this case, the process enters the page allocation slowpath while allocating pages for a kernel stack with gfp_mask=0x3000d0.

    reference code base
    LA.BF64.1.2.1-02220-8x94.0 with Android 5.1.0_r3(LMY47I) and Linux kernel 3.10.49.

    reference kernel config

    # CONFIG_NUMA is not set
    CONFIG_ZONE_DMA=y
    # CONFIG_MEMCG is not set
    # CONFIG_TRANSPARENT_HUGEPAGE is not set
    CONFIG_MEMORY_ISOLATION=y
    CONFIG_CMA=y
    

    environment setup
    The memory has only one node, which has one DMA zone. The zone has 727 pageblocks, of which 106 are CMA pageblocks.

    Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate 
    Node 0, zone      DMA          143            8          468            2          106            0
    

    call stack
    The process enters do_fork(), allocates an order-2 block of pages, and enters the page allocation slowpath.

    <4>[122596.622892] c2  15688 gle.android.gms(15688:15688): alloc order:2 mode:0x3000d0, reclaim 60 in 0.030s pri 10, scan 60, lru 80228, trigger lmk 1 times
    <4>[122596.622921] c2  15688 CPU: 2 PID: 15688 Comm: gle.android.gms Tainted: G        W    3.10.49-g4c6439a #12 
    <4>[122596.622931] c2  15688 Call trace:
    <4>[122596.622954] c2  15688 [<ffffffc0002077dc>] dump_backtrace+0x0/0x134
    <4>[122596.622965] c2  15688 [<ffffffc000207920>] show_stack+0x10/0x1c
    <4>[122596.622981] c2  15688 [<ffffffc000cedc64>] dump_stack+0x1c/0x28
    <4>[122596.622995] c2  15688 [<ffffffc0002cb6d8>] try_to_free_pages+0x5f4/0x720
    <4>[122596.623009] c2  15688 [<ffffffc0002c219c>] __alloc_pages_nodemask+0x544/0x834
    <4>[122596.623022] c2  15688 [<ffffffc00021a1e4>] copy_process.part.58+0xf4/0xdfc
    <4>[122596.623031] c2  15688 [<ffffffc00021b000>] do_fork+0xe0/0x358
    <4>[122596.623041] c2  15688 [<ffffffc00021b310>] SyS_clone+0x10/0x1c
    <4>[122596.685079] c1  15688 gle.android.gms(15688:15688): alloc order:2 mode:0x3000d0, reclaim 54 in 0.030s pri 10, scan 97, lru 79879, trigger lmk 1 times
    <4>[122596.685114] c1  15688 CPU: 1 PID: 15688 Comm: gle.android.gms Tainted: G        W    3.10.49-g4c6439a #12 
    <4>[122596.685127] c1  15688 Call trace:
    <4>[122596.685152] c1  15688 [<ffffffc0002077dc>] dump_backtrace+0x0/0x134
    <4>[122596.685163] c1  15688 [<ffffffc000207920>] show_stack+0x10/0x1c
    <4>[122596.685179] c1  15688 [<ffffffc000cedc64>] dump_stack+0x1c/0x28
    <4>[122596.685193] c1  15688 [<ffffffc0002cb6d8>] try_to_free_pages+0x5f4/0x720
    <4>[122596.685207] c1  15688 [<ffffffc0002c219c>] __alloc_pages_nodemask+0x544/0x834
    <4>[122596.685220] c1  15688 [<ffffffc00021a1e4>] copy_process.part.58+0xf4/0xdfc
    <4>[122596.685230] c1  15688 [<ffffffc00021b000>] do_fork+0xe0/0x358
    <4>[122596.685241] c1  15688 [<ffffffc00021b310>] SyS_clone+0x10/0x1c
    

    why does fork allocate an order-2 page in arm64
    See kernel: arm64: mm: allocate kernel stack.

    behaviors of page allocation

  • GFP_KERNEL means this allocation may sleep and may perform IO and FS operations.
  • gfp_mask allows allocation from ZONE_NORMAL, and the first feasible zone in the zonelist is ZONE_DMA.
  • gfp_mask selects the MIGRATE_UNMOVABLE freelist.
  • The low watermark check is required.

    page order = 2
    gfp_mask = 0x3000d0 = (__GFP_NOTRACK | __GFP_KMEMCG | GFP_KERNEL)
    high_zoneidx = gfp_zone(gfp_mask) = ZONE_NORMAL = 1
    migratetype = allocflags_to_migratetype(gfp_mask) = MIGRATE_UNMOVABLE = 0
    preferred_zone = ZONE_DMA
    alloc_flags = ALLOC_WMARK_LOW | ALLOC_CPUSET
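
The migratetype value listed above follows from two bits of gfp_mask. The sketch below mirrors the mainline 3.10 mapping as far as I recall it. The vendor kernel used here also defines __GFP_CMA and may treat that bit differently, so take this as an approximation. migratetype_of() and the TOY_ names are hypothetical.

```c
#include <stdint.h>

/* Local aliases for the kernel's ___GFP_* bit values. */
#define TOY_GFP_MOVABLE      0x08u
#define TOY_GFP_RECLAIMABLE  0x80000u

enum toy_migratetype {
	TOY_MIGRATE_UNMOVABLE   = 0,
	TOY_MIGRATE_RECLAIMABLE = 1,
	TOY_MIGRATE_MOVABLE     = 2,
};

/* Bit 1 of the result comes from MOVABLE, bit 0 from RECLAIMABLE. */
static int migratetype_of(uint32_t gfp)
{
	return (((gfp & TOY_GFP_MOVABLE) != 0) << 1) |
	       ((gfp & TOY_GFP_RECLAIMABLE) != 0);
}
```

Since 0x3000d0 has neither bit set, migratetype_of(0x3000d0) is 0, i.e., MIGRATE_UNMOVABLE.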
    

    behaviors of page allocation slowpath

  • __GFP_NO_KSWAPD is not set, so kswapd is woken up.
  • get_page_from_freelist() is tried before entering rebalance.
  • ALLOC_NO_WATERMARKS is not set, so __alloc_pages_high_priority(), which returns a page on success, is skipped.
  • wait is true, so the allocation enters rebalance, which includes compaction and direct reclaim.
  • Compaction is tried and returns a page on success.
  • Direct reclaim is tried and returns a page on success.
  • If neither compaction nor direct reclaim makes progress, OOM is triggered. The allocation then returns a page if one is available after OOM.
  • should_alloc_retry() always returns true, so the allocation goes back to rebalance.

    wait = gfp_mask & __GFP_WAIT = __GFP_WAIT
    alloc_flags = gfp_to_alloc_flags(gfp_mask) = 0x00000040 = (ALLOC_WMARK_MIN | ALLOC_CPUSET)
    

    behaviors of should_alloc_retry()
    __GFP_NORETRY is not set, __GFP_NOFAIL is not set, pm_suspended_storage() is false, and the page order is 2, which is not greater than PAGE_ALLOC_COSTLY_ORDER. So should_alloc_retry() always returns true.

    static inline int
    should_alloc_retry(gfp_t gfp_mask, unsigned int order,
    				unsigned long did_some_progress,
    				unsigned long pages_reclaimed)
    {
    	/* Do not loop if specifically requested */
    	if (gfp_mask & __GFP_NORETRY)
    		return 0;
    
    	/* Always retry if specifically requested */
    	if (gfp_mask & __GFP_NOFAIL)
    		return 1;
    
    	/*
    	 * Suspend converts GFP_KERNEL to __GFP_WAIT which can prevent reclaim
    	 * making forward progress without invoking OOM. Suspend also disables
    	 * storage devices so kswapd will not help. Bail if we are suspending.
    	 */
    	if (!did_some_progress && pm_suspended_storage())
    		return 0;
    
    	/*
    	 * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
    	 * means __GFP_NOFAIL, but that may not be true in other
    	 * implementations.
    	 */
    	if (order <= PAGE_ALLOC_COSTLY_ORDER)
    		return 1;
    
    	/*
    	 * For order > PAGE_ALLOC_COSTLY_ORDER, if __GFP_REPEAT is
    	 * specified, then we retry until we no longer reclaim any pages
    	 * (above), or we've reclaimed an order of pages at least as
    	 * large as the allocation's order. In both cases, if the
    	 * allocation still fails, we stop retrying.
    	 */
    	if (gfp_mask & __GFP_REPEAT && pages_reclaimed < (1 << order))
    		return 1;
    
    	return 0;
    }
    

    conclusion
    This post discusses the behavior of page allocation while a thread forks. In arm64, each thread needs an order-2 block of pages as its kernel stack. In this case, a thread allocates an order-2 block with gfp_mask=0x3000d0. The process enters the page allocation slowpath and does direct reclaim twice. These reclaims take 0.06 seconds within fork.

    kernel: arm64: gfp_mask

    November 16, 2015

    This post discusses gfp_mask.

    reference code base
    LA.BF64.1.2.1-02220-8x94.0 with Android 5.1.0_r3(LMY47I) and Linux kernel 3.10.49.

    reference kernel config

    # CONFIG_NUMA is not set
    CONFIG_ZONE_DMA=y
    # CONFIG_MEMCG is not set
    # CONFIG_TRANSPARENT_HUGEPAGE is not set
    CONFIG_MEMORY_ISOLATION=y
    CONFIG_CMA=y
    

    what is gfp_mask
    gfp_mask stands for Get Free Pages flag mask.

    when is gfp_mask used
    alloc_pages() is the API of the buddy system. It returns a block of pages of the requested order. gfp_mask is the argument passed to alloc_pages() to tell the buddy system how this block should be allocated.

    #define alloc_pages(gfp_mask, order) \
    		alloc_pages_node(numa_node_id(), gfp_mask, order)
    
    static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
    						unsigned int order)
    {
    	/* Unknown node is current node */
    	if (nid < 0)
    		nid = numa_node_id();
    
    	return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
    }
    

    all gfp_mask modifiers

    /* Plain integer GFP bitmasks. Do not use this directly. */
    #define ___GFP_DMA		0x01u
    #define ___GFP_HIGHMEM		0x02u
    #define ___GFP_DMA32		0x04u
    #define ___GFP_MOVABLE		0x08u
    #define ___GFP_WAIT		0x10u
    #define ___GFP_HIGH		0x20u
    #define ___GFP_IO		0x40u
    #define ___GFP_FS		0x80u
    #define ___GFP_COLD		0x100u
    #define ___GFP_NOWARN		0x200u
    #define ___GFP_REPEAT		0x400u
    #define ___GFP_NOFAIL		0x800u
    #define ___GFP_NORETRY		0x1000u
    #define ___GFP_MEMALLOC		0x2000u
    #define ___GFP_COMP		0x4000u
    #define ___GFP_ZERO		0x8000u
    #define ___GFP_NOMEMALLOC	0x10000u
    #define ___GFP_HARDWALL		0x20000u
    #define ___GFP_THISNODE		0x40000u
    #define ___GFP_RECLAIMABLE	0x80000u
    #define ___GFP_KMEMCG		0x100000u
    #define ___GFP_NOTRACK		0x200000u
    #define ___GFP_NO_KSWAPD	0x400000u
    #define ___GFP_OTHER_NODE	0x800000u
    #define ___GFP_WRITE		0x1000000u
    #define ___GFP_CMA		0x2000000u
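
When reading logs such as "mode:0x3000d0", it helps to turn a mask back into flag names. The sketch below is a user-space helper built from the table above. gfp_decode() and struct gfp_name are hypothetical names and not part of the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct gfp_name {
	uint32_t bit;
	const char *name;
};

/* Bit values copied from the ___GFP_* table above. */
static const struct gfp_name gfp_names[] = {
	{ 0x01u, "DMA" }, { 0x02u, "HIGHMEM" }, { 0x04u, "DMA32" },
	{ 0x08u, "MOVABLE" }, { 0x10u, "WAIT" }, { 0x20u, "HIGH" },
	{ 0x40u, "IO" }, { 0x80u, "FS" }, { 0x100u, "COLD" },
	{ 0x200u, "NOWARN" }, { 0x400u, "REPEAT" }, { 0x800u, "NOFAIL" },
	{ 0x1000u, "NORETRY" }, { 0x2000u, "MEMALLOC" }, { 0x4000u, "COMP" },
	{ 0x8000u, "ZERO" }, { 0x10000u, "NOMEMALLOC" },
	{ 0x20000u, "HARDWALL" }, { 0x40000u, "THISNODE" },
	{ 0x80000u, "RECLAIMABLE" }, { 0x100000u, "KMEMCG" },
	{ 0x200000u, "NOTRACK" }, { 0x400000u, "NO_KSWAPD" },
	{ 0x800000u, "OTHER_NODE" }, { 0x1000000u, "WRITE" },
	{ 0x2000000u, "CMA" },
};

/* Writes the set flags into buf, lowest bit first, separated by '|'. */
static char *gfp_decode(uint32_t mask, char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(gfp_names) / sizeof(gfp_names[0]); i++) {
		int n;

		if (!(mask & gfp_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", gfp_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;   /* buffer full: stop with truncated output */
		used += (size_t)n;
	}
	return buf;
}
```

With a 64-byte buffer, gfp_decode(0x3000d0, buf, sizeof(buf)) produces "WAIT|IO|FS|KMEMCG|NOTRACK".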
    

    zone modifiers
    The caller can choose the preferred zone from which the page is allocated. If no free page is available in this zone, the buddy allocator falls back to other zones to get free pages.

    /*
     * GFP bitmasks..
     *
     * Zone modifiers (see linux/mmzone.h - low three bits)
     *
     * Do not put any conditional on these. If necessary modify the definitions
     * without the underscores and use them consistently. The definitions here may
     * be used in bit comparisons.
     */
    #define __GFP_DMA	((__force gfp_t)___GFP_DMA)
    #define __GFP_HIGHMEM	((__force gfp_t)___GFP_HIGHMEM)
    #define __GFP_DMA32	((__force gfp_t)___GFP_DMA32)
    #define __GFP_MOVABLE	((__force gfp_t)___GFP_MOVABLE)  /* Page is movable */
    #define __GFP_CMA	((__force gfp_t)___GFP_CMA)
    #define GFP_ZONEMASK	(__GFP_DMA|__GFP_HIGHMEM|__GFP_DMA32|__GFP_MOVABLE| \
    			__GFP_CMA)
    

    action modifiers
    These modifiers change the behavior of the buddy system. By default, the buddy allocator keeps retrying requests of order 3 (PAGE_ALLOC_COSTLY_ORDER) or less, so such allocations effectively do not fail. __GFP_REPEAT, __GFP_NOFAIL, and __GFP_NORETRY change this default behavior.

    /*
     * Action modifiers - doesn't change the zoning
     *
     * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
     * _might_ fail.  This depends upon the particular VM implementation.
     *
     * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
     * cannot handle allocation failures.  This modifier is deprecated and no new
     * users should be added.
     *
     * __GFP_NORETRY: The VM implementation must not retry indefinitely.
     *
     * __GFP_MOVABLE: Flag that this page will be movable by the page migration
     * mechanism or reclaimed
     */
    #define __GFP_WAIT	((__force gfp_t)___GFP_WAIT)	/* Can wait and reschedule? */
    #define __GFP_HIGH	((__force gfp_t)___GFP_HIGH)	/* Should access emergency pools? */
    #define __GFP_IO	((__force gfp_t)___GFP_IO)	/* Can start physical IO? */
    #define __GFP_FS	((__force gfp_t)___GFP_FS)	/* Can call down to low-level FS? */
    #define __GFP_COLD	((__force gfp_t)___GFP_COLD)	/* Cache-cold page required */
    #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)	/* Suppress page allocation failure warning */
    #define __GFP_REPEAT	((__force gfp_t)___GFP_REPEAT)	/* See above */
    #define __GFP_NOFAIL	((__force gfp_t)___GFP_NOFAIL)	/* See above */
    #define __GFP_NORETRY	((__force gfp_t)___GFP_NORETRY) /* See above */
    #define __GFP_MEMALLOC	((__force gfp_t)___GFP_MEMALLOC)/* Allow access to emergency reserves */
    #define __GFP_COMP	((__force gfp_t)___GFP_COMP)	/* Add compound page metadata */
    #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)	/* Return zeroed page on success */
    #define __GFP_NOMEMALLOC ((__force gfp_t)___GFP_NOMEMALLOC) /* Don't use emergency reserves.
    							 * This takes precedence over the
    							 * __GFP_MEMALLOC flag if both are
    							 * set
    							 */
    

    macro modifiers

  • The most common macro modifier is GFP_KERNEL, which implies that the caller may sleep, perform IO, and call file system functions.
  • GFP_ATOMIC is used in contexts where scheduling is not allowed.

    #define GFP_ATOMIC	(__GFP_HIGH)
    #define GFP_NOIO	(__GFP_WAIT)
    #define GFP_NOFS	(__GFP_WAIT | __GFP_IO)
    #define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)
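
What each macro permits can be read directly from its bits. The following sketch uses local TOY_ aliases with the same values as the definitions above. The predicate names are hypothetical and only illustrate how the bits are interpreted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local aliases for the kernel's ___GFP_* bit values. */
#define TOY_GFP_WAIT  0x10u
#define TOY_GFP_HIGH  0x20u
#define TOY_GFP_IO    0x40u
#define TOY_GFP_FS    0x80u

#define TOY_GFP_ATOMIC  (TOY_GFP_HIGH)
#define TOY_GFP_NOIO    (TOY_GFP_WAIT)
#define TOY_GFP_NOFS    (TOY_GFP_WAIT | TOY_GFP_IO)
#define TOY_GFP_KERNEL  (TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)

/* May the allocator sleep and reschedule on behalf of this request? */
static bool gfp_may_sleep(uint32_t gfp)
{
	return (gfp & TOY_GFP_WAIT) != 0;
}

/* May the allocator start physical IO to free memory? */
static bool gfp_may_io(uint32_t gfp)
{
	return (gfp & TOY_GFP_IO) != 0;
}

/* May the allocator call down into the file system? */
static bool gfp_may_fs(uint32_t gfp)
{
	return (gfp & TOY_GFP_FS) != 0;
}
```

Under these definitions, TOY_GFP_KERNEL is 0xd0 and passes all three predicates, while TOY_GFP_ATOMIC passes none of them.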
    

    conclusion
    This post discusses gfp_mask and explains some of its common modifiers.

    android: arm64: unwind stack to get frames and backtrace of a thread

    October 29, 2015

    This post discusses backtraces, frames, and how frames make up a backtrace in arm64. It walks through an example in which we unwind the stack to get the frames and the backtrace of a thread.

    testing environment
    The infrastructure code base here is LA.BF64.1.1-06510-8x94.0 with Android 5.0.0_r2(LRX21M) and Linux kernel 3.10.49. The device CPU is an ARMv8 Cortex-A53.

    get backtrace from tombstone
    In android: arm64: analyze the call stack after a process hit native crash, we show how to get the backtrace of a process at a native crash from its tombstone.

    ABI: 'arm64'
    pid: 20948, tid: 20948, name: coredumptest  >>> /data/coredumptest <<< 
    signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 
    .......
    backtrace:
        #00 pc 0000000000014434  /system/lib64/libc.so (strlen+16)
        #01 pc 0000000000000efc  /data/coredumptest
        #02 pc 0000000000000f84  /data/coredumptest
        #03 pc 000000000000100c  /data/coredumptest
        #04 pc 0000000000001094  /data/coredumptest
        #05 pc 0000000000000d78  /data/coredumptest (main+40)
        #06 pc 0000000000013474  /system/lib64/libc.so (__libc_init+100)
        #07 pc 0000000000000e8c  /data/coredumptest
    

    get the thread’s stack from tombstone
    The sp register points to the top of the thread's stack. The stack dump shows 8 frames, from frame #00 to frame #07. Each frame except frame #00 holds an address, annotated with its binary or symbol, that indicates where this frame was entered from.

    ABI: 'arm64'
    pid: 20948, tid: 20948, name: coredumptest  >>> /data/coredumptest <<< 
    signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 
    .......
        x28  0000000000000000  x29  0000007fcbe33220  x30  0000005595326f00
        sp   0000007fcbe33220  pc   0000007faa597434  pstate 0000000040000000
    ......
    stack:
             0000007fcbe331a0  0000000000000000  
             0000007fcbe331a8  0000000000000000  
             0000007fcbe331b0  0000000000000000  
             0000007fcbe331b8  0000000000000000  
             0000007fcbe331c0  0000000000000000  
             0000007fcbe331c8  0000000000000000  
             0000007fcbe331d0  0000000000000000  
             0000007fcbe331d8  0000000000000000  
             0000007fcbe331e0  0000000000000000  
             0000007fcbe331e8  0000000000000000  
             0000007fcbe331f0  0000000000000000  
             0000007fcbe331f8  0000007faa63e040  /system/bin/linker64
             0000007fcbe33200  0000000000000006  
             0000007fcbe33208  0000000000000000  
             0000007fcbe33210  0000007fcbe33270  [stack]
             0000007fcbe33218  0000007faa5dcc98  /system/lib64/libc.so (__cxa_atexit+60)
        #00  0000007fcbe33220  0000007fcbe33240  [stack]
             ........  ........
        #01  0000007fcbe33220  0000007fcbe33240  [stack]
             0000007fcbe33228  0000005595326f88  /data/coredumptest
             0000007fcbe33230  ffffffffffffffff  
             0000007fcbe33238  0000007fcbe33348  [stack]
        #02  0000007fcbe33240  0000007fcbe33260  [stack]
             0000007fcbe33248  0000005595327010  /data/coredumptest
             0000007fcbe33250  ffffffffffffffff  
             0000007fcbe33258  0000007faa63e198  /system/bin/linker64
        #03  0000007fcbe33260  0000007fcbe33280  [stack]
             0000007fcbe33268  0000005595327098  /data/coredumptest
             0000007fcbe33270  ffffffffffffffff  
             0000007fcbe33278  0000007faa596468  /system/lib64/libc.so (__libc_init+88)
        #04  0000007fcbe33280  0000007fcbe332a0  [stack]
             0000007fcbe33288  0000005595326d7c  /data/coredumptest (main+44)
             0000007fcbe33290  ffffffffffffffff  
             0000007fcbe33298  0000005595326d50  /data/coredumptest (main)
        #05  0000007fcbe332a0  0000007fcbe332d0  [stack]
             0000007fcbe332a8  0000007faa596478  /system/lib64/libc.so (__libc_init+104)
             0000007fcbe332b0  0000007fcbe33358  [stack]
             0000007fcbe332b8  0000000000000000  
             0000007fcbe332c0  ffffffffffffffff  
             0000007fcbe332c8  ffffffffffffffff  
        #06  0000007fcbe332d0  0000007fcbe33300  [stack]
             0000007fcbe332d8  0000005595326e90  /data/coredumptest
             0000007fcbe332e0  0000000000000000  
             0000007fcbe332e8  0000000000000000  
             0000007fcbe332f0  0000000000000000  
             0000007fcbe332f8  0000000000000000  
        #07  0000007fcbe33300  0000000000000000
             0000007fcbe33308  0000007faa63f610  /system/bin/linker64 (_start+8)
             0000007fcbe33310  0000000000000000
             0000007fcbe33318  0000007fcbe33340  [stack]
             0000007fcbe33320  0000000000000000
             0000007fcbe33328  0000005595338ce0  /data/coredumptest
             0000007fcbe33330  0000005595338cf0  /data/coredumptest
             0000007fcbe33338  0000005595338d00  /data/coredumptest
             0000007fcbe33340  0000000000000001
             0000007fcbe33348  0000007fcbe33a3a  [stack]
             0000007fcbe33350  0000000000000000
             0000007fcbe33358  0000007fcbe33a4d  [stack]
             0000007fcbe33360  0000007fcbe33a62  [stack]
             0000007fcbe33368  0000007fcbe33a8e  [stack]
             0000007fcbe33370  0000007fcbe33aa1  [stack]
             0000007fcbe33378  0000007fcbe33d67  [stack]
    

    what is frame pointer
    According to the Procedure Call Standard (PCS) for the arm64 AArch64 state, register x29 is the frame pointer and x30 is the link register. The arm64 kernel enables the frame pointer, which makes it easy to unwind a thread's stack into frames. All frames form a linked list in which each frame points to the next frame through its saved frame pointer.

    how a frame is constructed and destructed
    On entering a function, a frame may be created, as the instructions at ec0 and ec4 in the listing below do.

  • The stack pointer is decreased by the frame size (push).
  • The frame pointer and link register are stored at the top of the frame.
  • The frame pointer is set to the stack pointer.

    On leaving a function, the created frame is removed, as the instruction at ee4 does.

  • The frame pointer and link register are loaded from the top of the frame.
  • The stack pointer is increased by the frame size (pop).

    0000000000000ec0 <atexit>:
         ec0:   a9be7bfd    stp x29, x30, [sp,#-32]!
         ec4:   910003fd    mov x29, sp
         ec8:   f9000fa0    str x0, [x29,#24]
         ecc:   90000000    adrp    x0, 0 <writev@plt-0xbc0>
         ed0:   913a6000    add x0, x0, #0xe98
         ed4:   f0000081    adrp    x1, 13000 <__dso_handle>
         ed8:   91000022    add x2, x1, #0x0
         edc:   f9400fa1    ldr x1, [x29,#24]
         ee0:   97ffff74    bl  cb0 <__cxa_atexit@plt>
         ee4:   a8c27bfd    ldp x29, x30, [sp],#32
         ee8:   d65f03c0    ret   
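
The effect of the stp/mov pair and the ldp instruction can be simulated on a small array that stands in for stack memory. This is a toy model only: struct toy_cpu, prologue(), and epilogue() are hypothetical names. sp and x29 are byte offsets into the array, and the caller is assumed to keep sp 8-byte aligned and within bounds.

```c
#include <stdint.h>

#define TOY_STACK_WORDS 64

/* Toy machine state: sp and x29 are byte offsets into stack[]. */
struct toy_cpu {
	uint64_t sp;
	uint64_t x29;                     /* frame pointer */
	uint64_t x30;                     /* link register */
	uint64_t stack[TOY_STACK_WORDS];
};

/* Models: stp x29, x30, [sp,#-frame]!  followed by  mov x29, sp */
static void prologue(struct toy_cpu *c, uint64_t frame)
{
	c->sp -= frame;                       /* push: stack grows down */
	c->stack[c->sp / 8] = c->x29;         /* first doubleword: caller's fp */
	c->stack[c->sp / 8 + 1] = c->x30;     /* second doubleword: return address */
	c->x29 = c->sp;                       /* fp now points at this frame */
}

/* Models: ldp x29, x30, [sp],#frame */
static void epilogue(struct toy_cpu *c, uint64_t frame)
{
	c->x29 = c->stack[c->sp / 8];
	c->x30 = c->stack[c->sp / 8 + 1];
	c->sp += frame;                       /* pop */
}
```

After prologue() the frame's first two doublewords hold the caller's x29 and x30. Even if a nested bl overwrites x30 in between, epilogue() restores both registers and the original sp.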
    

    how all frames are linked with frame pointer
    All frames form a linked list, and the frame pointer register is the head of this list.

  • In this example, the frame pointer x29 holds 0000007fcbe33220, which is the top address of frame #01.

    ABI: 'arm64'
    pid: 20948, tid: 20948, name: coredumptest  >>> /data/coredumptest <<< 
    signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 
    .......
        x28  0000000000000000  x29  0000007fcbe33220  x30  0000005595326f00
        sp   0000007fcbe33220  pc   0000007faa597434  pstate 0000000040000000
    
  • The saved frame pointer of frame #01, stored at 0000007fcbe33220, holds 0000007fcbe33240, the top address of frame #02.
  • The saved link register of frame #01, stored at 0000007fcbe33228, holds 0000005595326f88 in /data/coredumptest.

        #01  0000007fcbe33220  0000007fcbe33240  [stack]
             0000007fcbe33228  0000005595326f88  /data/coredumptest
             0000007fcbe33230  ffffffffffffffff  
             0000007fcbe33238  0000007fcbe33348  [stack]
    
  • The saved frame pointer of frame #02, stored at 0000007fcbe33240, holds 0000007fcbe33260, the top address of frame #03.
  • The saved link register of frame #02, stored at 0000007fcbe33248, holds 0000005595327010 in /data/coredumptest.

        #02  0000007fcbe33240  0000007fcbe33260  [stack]
             0000007fcbe33248  0000005595327010  /data/coredumptest
             0000007fcbe33250  ffffffffffffffff  
             0000007fcbe33258  0000007faa63e198  /system/bin/linker64
    
  • The saved frame pointer of frame #03, stored at 0000007fcbe33260, holds 0000007fcbe33280, the top address of frame #04.
  • The saved link register of frame #03, stored at 0000007fcbe33268, holds 0000005595327098 in /data/coredumptest.

        #03  0000007fcbe33260  0000007fcbe33280  [stack]
             0000007fcbe33268  0000005595327098  /data/coredumptest
             0000007fcbe33270  ffffffffffffffff  
             0000007fcbe33278  0000007faa596468  /system/lib64/libc.so (__libc_init+88)
    
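  • The saved frame pointer of frame #04, stored at 0000007fcbe33280, holds 0000007fcbe332a0, the top address of frame #05.
  • The saved link register of frame #04, stored at 0000007fcbe33288, holds 0000005595326d7c in /data/coredumptest (main+44).

        #04  0000007fcbe33280  0000007fcbe332a0  [stack]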
             0000007fcbe33288  0000005595326d7c  /data/coredumptest (main+44)
             0000007fcbe33290  ffffffffffffffff  
             0000007fcbe33298  0000005595326d50  /data/coredumptest (main)
    
  • The saved frame pointer of frame #05, stored at 0000007fcbe332a0, holds 0000007fcbe332d0, the top address of frame #06.
  • The saved link register of frame #05, stored at 0000007fcbe332a8, holds 0000007faa596478 in /system/lib64/libc.so (__libc_init+104).

        #05  0000007fcbe332a0  0000007fcbe332d0  [stack]
             0000007fcbe332a8  0000007faa596478  /system/lib64/libc.so (__libc_init+104)
             0000007fcbe332b0  0000007fcbe33358  [stack]
             0000007fcbe332b8  0000000000000000  
             0000007fcbe332c0  ffffffffffffffff  
             0000007fcbe332c8  ffffffffffffffff  
    
  • The saved frame pointer of frame #06, stored at 0000007fcbe332d0, holds 0000007fcbe33300, the top address of frame #07.
  • The saved link register of frame #06, stored at 0000007fcbe332d8, holds 0000005595326e90 in /data/coredumptest.

        #06  0000007fcbe332d0  0000007fcbe33300  [stack]
             0000007fcbe332d8  0000005595326e90  /data/coredumptest
             0000007fcbe332e0  0000000000000000  
             0000007fcbe332e8  0000000000000000  
             0000007fcbe332f0  0000000000000000  
             0000007fcbe332f8  0000000000000000  
    
  • The saved frame pointer of frame #07, stored at 0000007fcbe33300, holds 0000000000000000, which marks the last frame.
  • The saved link register of frame #07, stored at 0000007fcbe33308, holds 0000007faa63f610 in /system/bin/linker64 (_start+8).

        #07  0000007fcbe33300  0000000000000000
             0000007fcbe33308  0000007faa63f610  /system/bin/linker64 (_start+8)
             0000007fcbe33310  0000000000000000
             0000007fcbe33318  0000007fcbe33340  [stack]
             0000007fcbe33320  0000000000000000
             0000007fcbe33328  0000005595338ce0  /data/coredumptest
             0000007fcbe33330  0000005595338cf0  /data/coredumptest
             0000007fcbe33338  0000005595338d00  /data/coredumptest
             0000007fcbe33340  0000000000000001
             0000007fcbe33348  0000007fcbe33a3a  [stack]
             0000007fcbe33350  0000000000000000
             0000007fcbe33358  0000007fcbe33a4d  [stack]
             0000007fcbe33360  0000007fcbe33a62  [stack]
             0000007fcbe33368  0000007fcbe33a8e  [stack]
             0000007fcbe33370  0000007fcbe33aa1  [stack]
             0000007fcbe33378  0000007fcbe33d67  [stack]
    

    how to get backtrace from linked frames
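    The link register of each frame is stored in the second doubleword of the frame, right after the saved frame pointer. A saved link register holds a return address, so subtracting 0x4 (the size of one instruction) gives the address of the call instruction itself.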

  • #00 pc is the current pc register, 0000007faa597434
  • #01 pc is the current link register, 0000005595326f00 - 0x4 = 0000005595326efc
  • #02 pc is the link register of frame #01, 0000005595326f88 - 0x4 = 0000005595326f84
  • #03 pc is the link register of frame #02, 0000005595327010 - 0x4 = 000000559532700c
  • #04 pc is the link register of frame #03, 0000005595327098 - 0x4 = 0000005595327094
  • #05 pc is the link register of frame #04, 0000005595326d7c - 0x4 = 0000005595326d78
  • #06 pc is the link register of frame #05, 0000007faa596478 - 0x4 = 0000007faa596474
  • #07 pc is the link register of frame #06, 0000005595326e90 - 0x4 = 0000005595326e8c
  • The backtrace unwound from the stack is the same as the backtrace in the tombstone and the one gdb prints after loading the core file in android: arm64: analyze the call stack after a process hit native crash.

    ABI: 'arm64'
    pid: 20948, tid: 20948, name: coredumptest  >>> /data/coredumptest <<< 
    signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 
    ......
        x28  0000000000000000  x29  0000007fcbe33220  x30  0000005595326f00
        sp   0000007fcbe33220  pc   0000007faa597434  pstate 0000000040000000
    ......
    backtrace:
        #00 pc 0000000000014434  /system/lib64/libc.so (strlen+16)
        #01 pc 0000000000000efc  /data/coredumptest
        #02 pc 0000000000000f84  /data/coredumptest
        #03 pc 000000000000100c  /data/coredumptest
        #04 pc 0000000000001094  /data/coredumptest
        #05 pc 0000000000000d78  /data/coredumptest (main+40)
        #06 pc 0000000000013474  /system/lib64/libc.so (__libc_init+100)
        #07 pc 0000000000000e8c  /data/coredumptest
    
    (gdb) bt
    #0  strlen () at bionic/libc/arch-arm64/generic/bionic/strlen.S:71
    #1  0x0000005595326f00 in strlen (s=0x0) at bionic/libc/include/string.h:239
    #2  test4 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:11
    #3  0x0000005595326f88 in test3 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:20
    #4  0x0000005595327010 in test2 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:29
    #5  0x0000005595327098 in test1 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:38
    #6  0x0000005595326d7c in main () at frameworks/native/services/coredumptest/CoredumpTest.cpp:56
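
The walk over linked frames described above can be sketched in C. This is only a minimal sketch, not the unwinder that debuggerd64 or gdb actually uses. The names stack_word, read_word, unwind_frames and sample_dump are made up for illustration, and sample_dump holds only the frame pointer and link register words of frames #04 to #07, copied from the stack dump above. The loop stops at the frame whose saved frame pointer is zero, which mirrors how the listing above ends at frame #07.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 8-byte word of a stack dump: its address and the value stored there. */
struct stack_word {
    uint64_t addr;
    uint64_t value;
};

/* Frame records of frames #04..#07, copied from the stack dump in this post. */
static const struct stack_word sample_dump[] = {
    { 0x0000007fcbe33280ULL, 0x0000007fcbe332a0ULL }, /* #04 saved fp */
    { 0x0000007fcbe33288ULL, 0x0000005595326d7cULL }, /* #04 saved lr */
    { 0x0000007fcbe332a0ULL, 0x0000007fcbe332d0ULL }, /* #05 saved fp */
    { 0x0000007fcbe332a8ULL, 0x0000007faa596478ULL }, /* #05 saved lr */
    { 0x0000007fcbe332d0ULL, 0x0000007fcbe33300ULL }, /* #06 saved fp */
    { 0x0000007fcbe332d8ULL, 0x0000005595326e90ULL }, /* #06 saved lr */
    { 0x0000007fcbe33300ULL, 0x0000000000000000ULL }, /* #07 saved fp (last frame) */
    { 0x0000007fcbe33308ULL, 0x0000007faa63f610ULL }, /* #07 saved lr */
};

/* Look up the word stored at addr. Returns 1 on success, 0 if it is not in the dump. */
static int read_word(const struct stack_word *dump, size_t n,
                     uint64_t addr, uint64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (dump[i].addr == addr) {
            *out = dump[i].value;
            return 1;
        }
    }
    return 0;
}

/*
 * Follow the chain of frame records starting at fp.
 * Each record is two doublewords: [fp] = caller's fp, [fp + 8] = saved lr.
 * For every record that has a caller frame, store lr - 4 (the call site)
 * into pcs. Returns the number of pcs written.
 */
static size_t unwind_frames(const struct stack_word *dump, size_t n,
                            uint64_t fp, uint64_t *pcs, size_t max_pcs)
{
    size_t count = 0;

    assert(dump != NULL && pcs != NULL);

    while (fp != 0 && count < max_pcs) {
        uint64_t next_fp, lr;

        if (!read_word(dump, n, fp, &next_fp) ||
            !read_word(dump, n, fp + 8, &lr))
            break;              /* record lies outside the dump */
        if (next_fp == 0)
            break;              /* last frame, as with frame #07 above */

        pcs[count++] = lr - 4;  /* pc of the caller's frame */

        if (next_fp <= fp)
            break;              /* chain must move toward higher addresses */
        fp = next_fp;
    }
    return count;
}
```

Starting from fp = 0000007fcbe33280 (frame #04), this sketch yields 0000005595326d78, 0000007faa596474 and 0000005595326e8c, which are the pc values of frames #05, #06 and #07 listed above.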
    

    conclusion
    In this post, we explain what a backtrace and frames are. We then show how to use registers of the current context, such as the stack pointer and frame pointer, to get the frames and backtrace of a thread.

    android: arm64: how to analyze the call stack after a process hit native crash

    October 28, 2015

    This post analyzes the call stack after a process hits a native crash in Android.

    testing environment
    The infrastructure code base here is LA.BF64.1.1-06510-8x94.0 with Android 5.0.0_r2 (LRX21M) and Linux kernel 3.10.49. The device CPU is an ARMv8 Cortex-A53.

    use gdb to get call stack from core file
    In android: coredump: analyze core file with gdb, we demonstrate how to use gdb to load a core file and get the call stack of coredumptest, which dereferences a NULL pointer and hits a native crash.

    (gdb) bt
    #0  strlen () at bionic/libc/arch-arm64/generic/bionic/strlen.S:71
    #1  0x0000005595326f00 in strlen (s=0x0) at bionic/libc/include/string.h:239
    #2  test4 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:11
    #3  0x0000005595326f88 in test3 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:20
    #4  0x0000005595327010 in test2 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:29
    #5  0x0000005595327098 in test1 () at frameworks/native/services/coredumptest/CoredumpTest.cpp:38
    #6  0x0000005595326d7c in main () at frameworks/native/services/coredumptest/CoredumpTest.cpp:56
    

    get call stack from tombstone
    In addition to the core file, we can also get call stacks from a tombstone. When a 64-bit process hits a native crash, debuggerd64 will attach to the process and dump its registers and call stacks to /data/tombstones/tombstone_0x, where 0 <= x <= 9.

    ABI: 'arm64'
    pid: 20948, tid: 20948, name: coredumptest  >>> /data/coredumptest <<< 
    signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 
        x0   0000000000000000  x1   0000000000000000  x2   0000007fcbe33358  x3   000000000000000a
        x4   0000000000000001  x5   0000000000000000  x6   000000000000000b  x7   0000000000000000
        x8   00000000000000a4  x9   0000000000000000  x10  0000007fcbe32f88  x11  0101010101010101
        x12  0000000000000001  x13  000000000000001e  x14  0000007faa6560f0  x15  0000007faa656100
        x16  0000005595338fb8  x17  0000007faa597424  x18  0000000000000000  x19  ffffffffffffffff
        x20  0000007fcbe33348  x21  0000000000000001  x22  0000005595326d50  x23  0000000000000000
        x24  0000000000000000  x25  0000000000000000  x26  0000000000000000  x27  0000000000000000
        x28  0000000000000000  x29  0000007fcbe33220  x30  0000005595326f00
        sp   0000007fcbe33220  pc   0000007faa597434  pstate 0000000040000000
        v0   2e2e2e2e2e2e2e2e2e2e2e2e2e2e2e2e  v1   746165662e6d70642e74736973726570
        v2   6f63656e6362696c0000000000657275  v3   00000000000000000000000000000000
        v4   00000000000000008020080280000000  v5   00000000400000004000000000000000
        v6   00000000000000000000000000000000  v7   80200802802008028020080280200802
        v8   00000000000000000000000000000000  v9   00000000000000000000000000000000
        v10  00000000000000000000000000000000  v11  00000000000000000000000000000000
        v12  00000000000000000000000000000000  v13  00000000000000000000000000000000
        v14  00000000000000000000000000000000  v15  00000000000000000000000000000000
        v16  40100401401004014010040140100401  v17  00000000a00a80000000aa8000404000
        v18  00000000000000008020080280000000  v19  00000000000000000000000000000000
        v20  00000000000000000000000000000000  v21  00000000000000000000000000000000
        v22  00000000000000000000000000000000  v23  00000000000000000000000000000000
        v24  00000000000000000000000000000000  v25  00000000000000000000000000000000
        v26  00000000000000000000000000000000  v27  00000000000000000000000000000000
        v28  00000000000000000000000000000000  v29  00000000000000000000000000000000
        v30  00000000000000000000000000000000  v31  00000000000000000000000000000000
        fpsr 00000000  fpcr 00000000
    
    backtrace:
        #00 pc 0000000000014434  /system/lib64/libc.so (strlen+16)
        #01 pc 0000000000000efc  /data/coredumptest
        #02 pc 0000000000000f84  /data/coredumptest
        #03 pc 000000000000100c  /data/coredumptest
        #04 pc 0000000000001094  /data/coredumptest
        #05 pc 0000000000000d78  /data/coredumptest (main+40)
        #06 pc 0000000000013474  /system/lib64/libc.so (__libc_init+100)
        #07 pc 0000000000000e8c  /data/coredumptest
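
The pc values in the tombstone backtrace are offsets from the load address of each module rather than absolute addresses, which is why they differ from the pc and x30 registers printed above. A minimal sketch of the conversion follows. relative_pc is a hypothetical helper, and the load bases used in the note below are not printed in this post: they are inferred by subtracting the tombstone offsets from the absolute register values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an absolute pc into the module-relative offset that the
 * tombstone backtrace prints. load_base is the address at which the
 * module is mapped (an inferred value in this post, see the lead-in).
 */
static uint64_t relative_pc(uint64_t abs_pc, uint64_t load_base)
{
    assert(abs_pc >= load_base);  /* pc must lie inside the module */
    return abs_pc - load_base;
}
```

With an inferred base of 0x7faa583000 for libc.so, the pc register 0000007faa597434 maps to 0x14434 (frame #00). With an inferred base of 0x5595326000 for /data/coredumptest, x30 - 0x4 = 0000005595326efc maps to 0xefc (frame #01).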
    

    use addr2line to analyze call stacks in tombstone
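    We can use addr2line to translate each pc address in the tombstone backtrace into a source-level function name and line number. The -e option selects the binary with symbols, -f prints the function name, and -a prints the address being looked up.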

    $ aarch64-linux-android-addr2line -f -e symbols/system/bin/coredumptest -a 0000000000000ef8
    0x0000000000000ef8
    _Z5test4v
    frameworks/native/services/coredumptest/CoredumpTest.cpp:10
    $ aarch64-linux-android-addr2line -f -e symbols/system/bin/coredumptest -a 0000000000000f84
    0x0000000000000f84
    _Z5test3v
    frameworks/native/services/coredumptest/CoredumpTest.cpp:20
    $ aarch64-linux-android-addr2line -f -e symbols/system/bin/coredumptest -a 000000000000100c
    0x000000000000100c
    _Z5test2v
    frameworks/native/services/coredumptest/CoredumpTest.cpp:29
    $ aarch64-linux-android-addr2line -f -e symbols/system/bin/coredumptest -a 0000000000001094
    0x0000000000001094
    _Z5test1v
    frameworks/native/services/coredumptest/CoredumpTest.cpp:38
    $ aarch64-linux-android-addr2line -f -e symbols/system/bin/coredumptest -a 0000000000000d78       
    0x0000000000000d78
    main
    frameworks/native/services/coredumptest/CoredumpTest.cpp:56
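
When a tombstone has many frames, it may help to pull the frame number, pc and module path out of each backtrace line before passing them to addr2line. The sketch below does this with sscanf. parse_frame_line and struct bt_frame are hypothetical names, and the format string assumes nothing beyond the line layout shown in the tombstone above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One parsed line of a tombstone backtrace. */
struct bt_frame {
    unsigned int index;   /* frame number, e.g. 0 for "#00" */
    uint64_t pc;          /* module-relative pc */
    char module[256];     /* path of the module, e.g. /system/lib64/libc.so */
};

/*
 * Parse a line such as
 *   "    #00 pc 0000000000014434  /system/lib64/libc.so (strlen+16)"
 * Returns 1 if all three fields were read, 0 otherwise.
 * The optional "(symbol+offset)" suffix is ignored.
 */
static int parse_frame_line(const char *line, struct bt_frame *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, " #%u pc %" SCNx64 " %255s",
                  &out->index, &out->pc, out->module) == 3;
}
```

The parsed pc can then be printed with "%016" PRIx64 and used as the address argument of addr2line, with the module path selecting which symbol file to pass to -e.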
    

    review source code to see why the native crash happens
    From the source code, we can see that the native crash is caused by test4() passing a NULL pointer to strlen(), which then dereferences it.

    #define LOG_TAG "CoredumpTest"
    
    #include <utils/Log.h>
    #include <errno.h>
    #include <string.h>
    #include <sys/resource.h>
    
    using namespace android;
    
    int test4()
    {
        int ret = strlen(NULL);  // deliberate NULL dereference: the crash happens here
    
        ALOGD("enter %s: %d", __func__,  ret);
    
        return ret;
    }
    
    int test3()
    {
        int ret = test4() + 3;
    
        ALOGD("enter %s: %d", __func__, ret);
    
        return ret;
    }
    
    int test2()
    {
        int ret = test3() + 2;
    
        ALOGD("enter %s: %d", __func__, ret);
    
        return ret;
    }
    
    int test1()
    {
        int ret = test2() + 1;
    
        ALOGD("enter %s: %d", __func__, ret);
    
        return ret;
    }
    
    int main()
    {
        struct rlimit core_limit;
        core_limit.rlim_cur = RLIM_INFINITY;
        core_limit.rlim_max = RLIM_INFINITY;
    
        if (setrlimit(RLIMIT_CORE, &core_limit) < 0) {
            ALOGD("Failed to setrlimit: %s", strerror(errno));
            return 1;
        }
    
        int n = test1();
        ALOGD("Ready to enter test");
    
        return 0;
    }
    

    conclusion
    After a process hits a native crash, we can analyze the call stack of the process from a core file or a tombstone, then review the source code to see why the crash happened.

