android: child process hits mutex deadlock in printf after fork

This post is to demonstrate an example in which the child process hits mutex deadlock in printf after fork. This case is a special case of android: child process hits mutex deadlock after fork.

example program
android: example: child process hits mutex deadlock in printf after fork.

software of testing device
LA.BF64.1.1-06510-8×94.0 with Android 5.0.0_r2(LRX21M) and Linux kernel 3.10.49.

hardware of testing device
Architecture arm-v8 cortex-53.

log of running this program

$ adb shell /data/printf-deadlock-after-fork | grep -v -E "(busy dump)"
26153:26153, printf-deadlock-after-fork starts
26153:26155, void* thread_routine(void*) is ready to call printf in busy loop
26153:26153, parent process after fork

code flow

  • process 26153’s main thread 26153:26153 starts
  • thread 26153:26153 creates thread 26153:26155
  • thread 26153:26155 runs in thread_routine and calls printf in busy loop
  • thread 26153:26153 forks child process 26158
  • child process 26158 call printf() and hits deadlock while get file lock of stdout
  • call stacks of blocked child process at timestamp
    The pid of child process is 26158. The state of child process is sleep.

    $ adb shell ps -t | grep printf
    root      26153 618   3584   828   ffffffff 9823b3ec S /data/printf-deadlock-after-fork
    root      26155 26153 3584   828   00000000 9823b8fc R printf-deadlock
    root      26158 26153 3584   260   00277680 981f2780 S /data/printf-deadlock-after-fork
    
    $ adb shell debuggerd64 -b 26158
    ----- pid 26158 at 2015-10-31 16:16:06 -----
    Cmd line: /data/printf-deadlock-after-fork
    ABI: 'arm64'
    
    "printf-deadlock" sysTid=26158
      #00 pc 000000000001377c  /system/lib64/libc.so (syscall+28)
      #01 pc 0000000000019cd4  /system/lib64/libc.so
    (pthread_mutex_lock+148)
      #02 pc 0000000000053918  /system/lib64/libc.so (vfprintf+24)
      #03 pc 000000000004f30c  /system/lib64/libc.so (printf+144)
      #04 pc 0000000000000d9c  /data/printf-deadlock-after-fork (main+572)
      #05 pc 0000000000013474  /system/lib64/libc.so (__libc_init+100)
      #06 pc 0000000000000e10  /data/printf-deadlock-after-fork
    
    ----- end 26158 -----
    

    The child process is blocked at FLOCKFILE(fp) of vfprintf. At this place does child process wait for the mutex of stdin.

    $ aarch64-linux-android-addr2line -f -e symbols/system/lib64/libc.so -a 0x0000000000053918                                                                 
    0x0000000000053918                                                                                                                                                                                        
    vfprintf                                                                                                                                                                                                  
    bionic/libc/upstream-openbsd/lib/libc/stdio/vfprintf.c:266
    
    int
    vfprintf(FILE *fp, const char *fmt0, __va_list ap)
    {
    	int ret;
    
    	FLOCKFILE(fp);
    	ret = __vfprintf(fp, fmt0, ap);
    	FUNLOCKFILE(fp);
    	return (ret);
    }
    
    #define FLOCKFILE(fp)   flockfile(fp)
    #define FUNLOCKFILE(fp) funlockfile(fp)
    
    void flockfile(FILE* fp) {
      if (fp != NULL) {
        pthread_mutex_lock(&_FLOCK(fp));
      }
    }
    
    struct __sfileext {
    	struct	__sbuf _ub; /* ungetc buffer */
    	struct wchar_io_data _wcio;	/* wide char io status */
    	pthread_mutex_t _lock; /* file lock */
    };
    
    #define _FILEEXT_INITIALIZER  {{NULL,0},{0},PTHREAD_RECURSIVE_MUTEX_INITIALIZER}
    
    #define _EXT(fp) ((struct __sfileext *)((fp)->_ext._base))
    #define _UB(fp) _EXT(fp)->_ub
    #define _FLOCK(fp)  _EXT(fp)->_lock
    

    analysis: why this deadlock happens
    In android: child process hits mutex deadlock after fork, we conclude that the forked child process should avoid getting mutex before calling exec() or _exit(). Otherwise, it might hit mutex deadlock.

    In this case, one thread of the parent process is in busy loop to printf. If the thread is locking the mutex of stdin at fork, then the child process would hit stdin file mutex deadlock in printf.

    how to fix this deadlock
    This patch fix child process hits mutex deadlock in printf after fork replaces printf() with write() to fix this problem. Unlike printf(), write() is a system call and it directly requests kernel to copy message to stdout file.

    
    diff --git a/PrintfDeadlockAfterFork.cpp b/PrintfDeadlockAfterFork.cpp
    index bdd1df2..12a6a27 100644
    --- a/PrintfDeadlockAfterFork.cpp
    +++ b/PrintfDeadlockAfterFork.cpp
    @@ -56,8 +56,10 @@ int main()
         pid = fork();
         if (pid == 0) {
             void *ptr;
    -        printf("%d:%d, child process after fork\n", getpid(), gettid());
    -        printf("%d:%d, is ready to _exit(1)\n", getpid(), gettid());
    +        char buf[4096];
    +        sprintf(buf, "%d:%d, child process after fork\n", getpid(), gettid());
    +        sprintf(buf, "%d:%d, is ready to _exit(1)\n", getpid(), gettid());
    +        write(1, buf, strlen(buf));
             _exit(1);
         }
    
    

    conclusion
    The forked child process should use write() rather than printf() before calling exec() or _exit(). Otherwise, it might hit mutex deadlock in printf().

    3 Responses to “android: child process hits mutex deadlock in printf after fork”

    1. android: child process hits mutex deadlock after fork | chengyihe's blog Says:

      […] android: child process hits mutex deadlock in printf after fork […]

    2. andorid: child process stuck in zombie state due to parent not reaping | chengyihe's blog Says:

      […] « android: child process hits mutex deadlock in printf after fork […]

    3. Joakim Says:

      I recently ran into an issue where flockfile hangs on stdout, but only on Android 5.0, because of a regression that was subsequently fixed in 5.1. Perhaps you’ve run into the same issue, which was fixed by these two commits:

      https://github.com/android/platform_bionic/commit/6a03abcfd23f31d1df06eb0059830e22621282bb#diff-9265e3b8114248ee4d2f8901e8ffdbcdR174

      https://github.com/android/platform_bionic/commit/c48c3e4bb3d1665f3e9fa2785daafa72dfe59399#diff-897727def501eab7347bd584dcb131b9R39

      I work around the issue by initializing the mutex correctly on stdout before calling flockfile.

    Leave a comment