Analysis of the Android 13 Zygote Process Startup Flow

Introduction

Zygote

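Borrowing the 5W2H method:

What? zygote is the process Android uses to fork app processes. Its job is to fork every other process that needs a Java virtual machine (ART on Android), such as an app you write. It can also be thought of as the guardian process of application processes.

Why? At the Android application layer, every app (process) runs on a virtual machine and needs an Android runtime environment. Should the VM be created and the runtime and resources be loaded from scratch every time an app starts? Clearly not. By relying on Linux's fork function, the work needed to start an app can be done ahead of time; when a process needs to start, zygote "copies" its runtime environment to the new process, which improves performance.

Who, When, Where? The Zygote process is started by the init process (pid=1). Specifically, in init's third stage (after init.rc is parsed) it is launched by running /system/bin/app_process with command-line arguments. The detailed startup flow is the focus of this article and is covered below.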

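Zygote modes

  • zygote mode: app_process is launched with the --zygote argument. When --start-system-server is also passed, zygote forks its first child process, SystemServer, which creates many Framework-layer system services such as ActivityManagerService (AMS) and WindowManagerService (WMS).

  • application mode: starts an ordinary program; the arguments passed are the class name and the arguments for that class. Note: my understanding is that when the uid is 0 or the launch flags include DEBUG_ENABLE_DEBUGGER, starting in application mode is allowed, meaning the process is started the same way zygote itself is started (not by being forked from zygote).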

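The fork function

fork is a Linux function. After fork completes there are two processes running concurrently, so it returns twice, once in each process. A negative return value means fork failed; otherwise it succeeded. A return value of 0 means the code is running in the child process, and a value greater than 0 (the child's pid) means it is running in the parent process.

Startup flow

Entry point: app_main.cpp#main()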

platform_frameworks_base/cmds/app_process/app_main.cpp

int main(int argc, char* const argv[])
{
    if (!LOG_NDEBUG) {
      ALOGV("app_process main with argv: %s", argv_String.string());
    }
    // Construct the AppRuntime object; AppRuntime is a subclass of AndroidRuntime
    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
    // Process command line arguments
    // ignore argv[0]
    argc--;
    argv++;
    // Parse runtime arguments.  Stop at first unrecognized option.
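    bool zygote = false; // true when --zygote is passed
    bool startSystemServer = false; // true when --start-system-server is passed
    bool application = false; // true when --application is passed
    String8 niceName; // value of --nice-name=; renames the process from app_process to zygote or zygote64
    String8 className; // the remaining argument after the -- options
    // (argument parsing omitted; it also sets the index i used below)
    Vector<String8> args; // arguments forwarded to the Java entry class
    if (!className.isEmpty()) {
        // application mode: pass "application" (or "tool") on to RuntimeInit.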
        // The Remainder of args get passed to startup class main(). Make
        // copies of them before we overwrite them with the process name.
        args.add(application ? String8("application") : String8("tool"));
        runtime.setClassNameAndArgs(className, argc - i, argv + i);

        if (!LOG_NDEBUG) {
          ...
          ALOGV("Class name = %s, args = %s", className.string(), restOfArgs.string());
        }
    } else {
        // This branch is zygote mode; pay special attention to it.
        maybeCreateDalvikCache(); // creates /data/dalvik-cache
        if (startSystemServer) {
            args.add(String8("start-system-server"));
        }
         ...
        // Pass all remaining arguments to the zygote main() method.
        for (; i < argc; ++i) {
            args.add(String8(argv[i]));
        }
    }
    ...
    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (!className.isEmpty()) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
    }
}

As shown, the main() entry function mainly parses the command-line arguments and builds the arguments that AndroidRuntime.start needs.

Native: AndroidRuntime.cpp

platform_frameworks_base/core/jni/AndroidRuntime.cpp

void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
    ALOGD(">>>>>> START %s uid %d <<<<<<\n",
            className != NULL ? className : "(unknown)", getuid());

    static const String8 startSystemServer("start-system-server");
    // Is this the primary zygote? If true, the system server process will be forked.
    bool primary_zygote = false;
    ...

    const char* rootDir = getenv("ANDROID_ROOT"); // defaults to /system
    const char* artRootDir = getenv("ANDROID_ART_ROOT");
    const char* i18nRootDir = getenv("ANDROID_I18N_ROOT");
    const char* tzdataRootDir = getenv("ANDROID_TZDATA_ROOT");
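    // Return if any of the directories above does not exist
    /* Start the Java virtual machine */
    JniInvocation jni_invocation;
    jni_invocation.Init(NULL);
    JNIEnv* env;
    if (startVm(&mJavaVM, &env, zygote, primary_zygote) != 0) {
        return; // the VM failed to start
    }
    // VM-created callback; empty in the base AndroidRuntime class
    onVmCreated(env);

    /*
     * Register JNI functions
     */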
    if (startReg(env) < 0) {
        ALOGE("Unable to register all android natives\n");
        return;
    }
    jclass stringClass; // java.lang.String
    jobjectArray strArray; // String[]
    jstring classNameStr; // className e.g. "com.android.internal.os.ZygoteInit"
        /*
         * The JNI code that builds the argument array is omitted; in Java it is roughly:
         * String[] strArray = new String[options.length + 1];
         * strArray[0] = className;
         * fill the rest of strArray with options;
         */

    /*
     * Call the main() method of the className class through JNI
     */
}

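As shown, start() mainly does two things:

  • startVm(): creates the virtual machine, mainly by setting the VM runtime options, for example setting the initial heap size (-Xms) to the value of the dalvik.vm.heapstartsize property

  • startReg(): registers JNI functions; internally it calls register_jni_procs to dynamically register the functions listed in the gRegJNI function-pointer array

Finally it calls the main() function of the Java class named by className, which is the gateway into the Java world.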
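
The hand-off described above can be sketched in plain Java. This is only an illustration of the two steps the omitted JNI code performs (build a String[] whose first slot is the class name, then call the static main() of that class); StartSketch, DemoInit, buildArgv and callMain are hypothetical names, and the real code does this from C++ through JNI rather than java.lang.reflect.

```java
import java.lang.reflect.Method;
import java.util.List;

class StartSketch {
    /** Hypothetical stand-in for a startup class such as ZygoteInit. */
    public static class DemoInit {
        static String[] received;

        public static void main(String[] argv) {
            received = argv;
        }
    }

    /** Mirrors the omitted JNI code: slot 0 holds the class name, the options follow. */
    static String[] buildArgv(String className, List<String> options) {
        String[] argv = new String[options.size() + 1];
        argv[0] = className;
        for (int i = 0; i < options.size(); i++) {
            argv[i + 1] = options.get(i);
        }
        return argv;
    }

    /** Looks up public static void main(String[]) on the class and invokes it. */
    static void callMain(Class<?> startClass, String[] argv) {
        try {
            Method main = startClass.getMethod("main", String[].class);
            // Cast to Object so the array is passed as one argument, not as varargs.
            main.invoke(null, (Object) argv);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Cannot run main() of " + startClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        String[] argv = buildArgv(DemoInit.class.getName(),
                List.of("start-system-server", "--abi-list=arm64-v8a"));
        callMain(DemoInit.class, argv);
        System.out.println(String.join(" ", DemoInit.received));
    }
}
```

Because argv[0] is the class name, the Java side starts reading options at index 1, which matches the loop in ZygoteInit.main() shown next.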

Java: ZygoteInit.java

public static void main(String[] argv) {
    ZygoteServer zygoteServer = null;

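    // Mark zygote start. Creating any thread from this point on throws an error.
    ZygoteHooks.startZygoteNoThreadCreation();

    // Zygote goes into its own process group: setpgid(0, 0) sets its pgid to its own pid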
    try {
        Os.setpgid(0, 0);
    } catch (ErrnoException ex) {
        throw new RuntimeException("Failed to setpgid(0,0)", ex);
    }

    Runnable caller;
    try {
        // Store now for StatsLogging later.
        final long startTime = SystemClock.elapsedRealtime();
        final boolean isRuntimeRestarted = "1".equals(
                SystemProperties.get("sys.boot_completed"));
        // TAG used for systrace
        String bootTimeTag = Process.is64Bit() ? "Zygote64Timing" : "Zygote32Timing";
        TimingsTraceLog bootTimingsTraceLog = new TimingsTraceLog(bootTimeTag,
                Trace.TRACE_TAG_DALVIK);
        // Begin the ZygoteInit trace section; see the Android systrace documentation
        bootTimingsTraceLog.traceBegin("ZygoteInit");
        // Mainly enables DDMS and installs the MIME map override
        RuntimeInit.preForkInit();

        boolean startSystemServer = false;
        String zygoteSocketName = "zygote";
        String abiList = null;
        boolean enableLazyPreload = false;
        // Parse the arguments passed in from AndroidRuntime.cpp
        for (int i = 1; i < argv.length; i++) {
            if ("start-system-server".equals(argv[i])) {
                startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(argv[i])) {
                enableLazyPreload = true;
            } else if (argv[i].startsWith(ABI_LIST_ARG)) {
                abiList = argv[i].substring(ABI_LIST_ARG.length());
            } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                // socket name
                zygoteSocketName = argv[i].substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + argv[i]);
            }
        }
        // primary zygote: /dev/socket/zygote
        // secondary zygote: /dev/socket/zygote_secondary
        final boolean isPrimaryZygote = zygoteSocketName.equals(Zygote.PRIMARY_SOCKET_NAME);
        if (!isRuntimeRestarted) {
            if (isPrimaryZygote) {
                FrameworkStatsLog.write(FrameworkStatsLog.BOOT_TIME_EVENT_ELAPSED_TIME_REPORTED,
                        BOOT_TIME_EVENT_ELAPSED_TIME__EVENT__ZYGOTE_INIT_START,
                        startTime);
            } else if (zygoteSocketName.equals(Zygote.SECONDARY_SOCKET_NAME)) {
                FrameworkStatsLog.write(FrameworkStatsLog.BOOT_TIME_EVENT_ELAPSED_TIME_REPORTED,
                        BOOT_TIME_EVENT_ELAPSED_TIME__EVENT__SECONDARY_ZYGOTE_INIT_START,
                        startTime);
            }
        }

        if (abiList == null) {
            throw new RuntimeException("No ABI list supplied.");
        }

        // When zygote first starts, enableLazyPreload is false, so resources and classes are preloaded.
        // In such cases, we will preload things prior to our first fork.
        if (!enableLazyPreload) {
            bootTimingsTraceLog.traceBegin("ZygotePreload");
            EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
                    SystemClock.uptimeMillis());
            // 1. preload
            preload(bootTimingsTraceLog);
            EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
                    SystemClock.uptimeMillis());
            bootTimingsTraceLog.traceEnd(); // ZygotePreload
        }

        // Run one GC
        bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
        gcAndFinalize(); // ZygoteHooks.gcAndFinalize();
        bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC

        bootTimingsTraceLog.traceEnd(); // ZygoteInit
        // Initialize zygote's native state
        Zygote.initNativeState(isPrimaryZygote);

        ZygoteHooks.stopZygoteNoThreadCreation();
        // 2. Create the ZygoteServer, which sets up the server side of the socket
        zygoteServer = new ZygoteServer(isPrimaryZygote);

        if (startSystemServer) {
            // 3. fork the system_server process
            Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);

            // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
            // child (system_server) process.
            if (r != null) {
                r.run(); // runs com.android.server.SystemServer.main()
                return;
            }
        }

        Log.i(TAG, "Accepting command socket connections");

        // Loop waiting for AMS to ask the zygote process to fork child processes
        caller = zygoteServer.runSelectLoop(abiList);
    } catch (Throwable ex) {
        Log.e(TAG, "System zygote died with fatal exception", ex);
        throw ex;
    } finally {
        if (zygoteServer != null) {
            zygoteServer.closeServerSocket();
        }
    }

    // The child process has left the select loop; continue by running its command.
    // For an ordinary app, run() ends up executing the ActivityThread.main() entry function
    if (caller != null) {
        caller.run();
    }
}

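To summarize, the main method mainly does the following:

  • Preload: loads shared resources so that forked child processes can reuse them, which speeds up startup.
  • Fork the system_server child process, which creates the various system services.
  • Create the ZygoteServer and loop waiting for fork requests: a socket is created for the zygote process to receive the requests AMS sends to create app processes. Starting with Android 10 (Q), the USAP (unspecialized app process) pool was introduced (think of it as a process pool, analogous to a thread pool): a batch of processes is created in advance, so launching a new app skips the fork step and starts faster. How child processes are actually forked will be analyzed in a separate article.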
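
The argument handling at the top of ZygoteInit.main() is easy to try out in isolation. The following is a minimal sketch in plain Java, not the framework code: the class ZygoteArgsSketch is hypothetical, and the literal values "--abi-list=", "--socket-name=" and "zygote" are assumptions standing in for ABI_LIST_ARG, SOCKET_NAME_ARG and Zygote.PRIMARY_SOCKET_NAME, whose values the excerpt does not show.

```java
class ZygoteArgsSketch {
    // Assumed values for constants that the excerpt references but does not define.
    static final String ABI_LIST_ARG = "--abi-list=";
    static final String SOCKET_NAME_ARG = "--socket-name=";
    static final String PRIMARY_SOCKET_NAME = "zygote";

    boolean startSystemServer = false;
    boolean enableLazyPreload = false;
    String abiList = null;
    String socketName = PRIMARY_SOCKET_NAME;

    /** Parses argv the way the loop in ZygoteInit.main() does; argv[0] is the class name. */
    static ZygoteArgsSketch parse(String[] argv) {
        ZygoteArgsSketch parsed = new ZygoteArgsSketch();
        for (int i = 1; i < argv.length; i++) {
            String arg = argv[i];
            if ("start-system-server".equals(arg)) {
                parsed.startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(arg)) {
                parsed.enableLazyPreload = true;
            } else if (arg.startsWith(ABI_LIST_ARG)) {
                parsed.abiList = arg.substring(ABI_LIST_ARG.length());
            } else if (arg.startsWith(SOCKET_NAME_ARG)) {
                parsed.socketName = arg.substring(SOCKET_NAME_ARG.length());
            } else {
                throw new IllegalArgumentException("Unknown command line argument: " + arg);
            }
        }
        if (parsed.abiList == null) {
            throw new IllegalArgumentException("No ABI list supplied.");
        }
        return parsed;
    }

    boolean isPrimaryZygote() {
        return PRIMARY_SOCKET_NAME.equals(socketName);
    }

    public static void main(String[] args) {
        ZygoteArgsSketch parsed = parse(new String[] {
                "com.android.internal.os.ZygoteInit",
                "start-system-server",
                "--abi-list=arm64-v8a",
        });
        System.out.println("primary=" + parsed.isPrimaryZygote()
                + " startSystemServer=" + parsed.startSystemServer
                + " abiList=" + parsed.abiList);
    }
}
```

Note that an unknown argument or a missing ABI list is treated as fatal, as in the original loop, so a misconfigured init.rc entry fails fast instead of starting a half-configured zygote.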

Preloading: preload()

static void preload(TimingsTraceLog bootTimingsTraceLog) {
    Log.d(TAG, "begin preload");
    bootTimingsTraceLog.traceBegin("BeginPreload");
    beginPreload(); // ZygoteHooks.onBeginPreload();
    bootTimingsTraceLog.traceEnd(); // BeginPreload
    bootTimingsTraceLog.traceBegin("PreloadClasses");
    preloadClasses(); // loads the classes listed in /system/etc/preloaded-classes
    bootTimingsTraceLog.traceEnd(); // PreloadClasses
    bootTimingsTraceLog.traceBegin("CacheNonBootClasspathClassLoaders");
    cacheNonBootClasspathClassLoaders();
    bootTimingsTraceLog.traceEnd(); // CacheNonBootClasspathClassLoaders
    bootTimingsTraceLog.traceBegin("PreloadResources");
    preloadResources(); // drawables, colors, etc. listed in com.android.internal.R.array.preload_xxx
    bootTimingsTraceLog.traceEnd(); // PreloadResources
    Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadAppProcessHALs");
    nativePreloadAppProcessHALs(); // native method, no comment in the source
    Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
    Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadGraphicsDriver");
    maybePreloadGraphicsDriver(); // native method, no comment in the source
    Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
    preloadSharedLibraries(); // System.loadLibrary "android","compiler_rt","jnigraphics"
    preloadTextResources(); // Hyphenator and fonts
    // Ask the WebViewFactory to do any initialization that must run in the zygote process,
    // for memory sharing purposes.
    WebViewFactory.prepareWebViewInZygote();
    endPreload();
    warmUpJcaProviders();
    Log.d(TAG, "end preload");

    sPreloadComplete = true;
}
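
To make the class-preloading step concrete, here is a minimal sketch in plain Java of what "load every class named in a list" amounts to. It assumes a simple file format (one class name per line, blank lines and lines starting with # ignored); PreloadSketch and its preload method are hypothetical, and the real preloadClasses() does additional work that is not shown in this article.

```java
import java.util.List;

class PreloadSketch {
    /**
     * Loads and initializes every class named in the given lines.
     * Returns how many classes were loaded successfully.
     */
    static int preload(List<String> lines) {
        int loaded = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blanks and comments (assumed file format)
            }
            try {
                // initialize=true runs static initializers now, so that forked
                // children would inherit the already-initialized class.
                Class.forName(line, true, PreloadSketch.class.getClassLoader());
                loaded++;
            } catch (ClassNotFoundException e) {
                System.err.println("Class not found for preloading: " + line);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        int count = preload(List.of(
                "# sample list",
                "java.util.ArrayList",
                "java.util.HashMap",
                "no.such.Clazz"));
        System.out.println("preloaded " + count + " classes");
    }
}
```

A missing class is logged and skipped rather than treated as fatal in this sketch, so one stale entry in the list does not abort the whole preload.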

Starting system services: forkSystemServer

/**
 * Prepare the arguments and fork the system server process.
 *
 * @return A {@code Runnable} that provides the entry point of the system_server child
 * process; {@code null} in the parent.
 */
private static Runnable forkSystemServer(String abiList, String socketName,
        ZygoteServer zygoteServer) {
    long capabilities = posixCapabilitiesAsBits(
            OsConstants.CAP_IPC_LOCK,
            OsConstants.CAP_KILL,
            OsConstants.CAP_NET_ADMIN,
            OsConstants.CAP_NET_BIND_SERVICE,
            OsConstants.CAP_NET_BROADCAST,
            OsConstants.CAP_NET_RAW,
            OsConstants.CAP_SYS_MODULE,
            OsConstants.CAP_SYS_NICE,
            OsConstants.CAP_SYS_PTRACE,
            OsConstants.CAP_SYS_TIME,
            OsConstants.CAP_SYS_TTY_CONFIG,
            OsConstants.CAP_WAKE_ALARM,
            OsConstants.CAP_BLOCK_SUSPEND
    );
    /* Containers run without some capabilities, so drop any caps that are not available. */
    StructCapUserHeader header = new StructCapUserHeader(
            OsConstants._LINUX_CAPABILITY_VERSION_3, 0);
    StructCapUserData[] data;
    try {
        data = Os.capget(header);
    } catch (ErrnoException ex) {
        throw new RuntimeException("Failed to capget()", ex);
    }
    capabilities &= ((long) data[0].effective) | (((long) data[1].effective) << 32);

    /* The arguments for starting the system server are hard-coded; note its special UID 1000 */
    String[] args = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,"
                    + "1024,1032,1065,3001,3002,3003,3005,3006,3007,3009,3010,3011,3012",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "--target-sdk-version=" + VMRuntime.SDK_VERSION_CUR_DEVELOPMENT,
            "com.android.server.SystemServer",
    };
    ZygoteArguments parsedArgs;

    int pid;

    try {
        ZygoteCommandBuffer commandBuffer = new ZygoteCommandBuffer(args);
        try {
            // Convert the argument format
            parsedArgs = ZygoteArguments.getInstance(commandBuffer);
        } catch (EOFException e) {
            throw new AssertionError("Unexpected argument error for forking system server", e);
        }
        commandBuffer.close();
        Zygote.applyDebuggerSystemProperty(parsedArgs);
        Zygote.applyInvokeWithSystemProperty(parsedArgs);
        // For the memory tagging extension (MTE) handled here, see https://source.android.google.cn/docs/security/test/memory-safety/arm-mte
        if (Zygote.nativeSupportsMemoryTagging()) {
            String mode = SystemProperties.get("arm64.memtag.process.system_server", "");
            if (mode.isEmpty()) {
              /* The system server has ASYNC MTE by default, in order to allow
               * system services to specify their own MTE level later, as you
               * can't re-enable MTE once it's disabled. */
              mode = SystemProperties.get("persist.arm64.memtag.default", "async");
            }
            if (mode.equals("async")) {
                parsedArgs.mRuntimeFlags |= Zygote.MEMORY_TAG_LEVEL_ASYNC;
            } else if (mode.equals("sync")) {
                parsedArgs.mRuntimeFlags |= Zygote.MEMORY_TAG_LEVEL_SYNC;
            } else if (!mode.equals("off")) {
                /* When we have an invalid memory tag level, keep the current level. */
                parsedArgs.mRuntimeFlags |= Zygote.nativeCurrentTaggingLevel();
                Slog.e(TAG, "Unknown memory tag level for the system server: \"" + mode + "\"");
            }
        } else if (Zygote.nativeSupportsTaggedPointers()) {
            /* Enable pointer tagging in the system server. Hardware support for this is present
             * in all ARMv8 CPUs. */
            parsedArgs.mRuntimeFlags |= Zygote.MEMORY_TAG_LEVEL_TBI;
        }

        /* Enable gwp-asan on the system server with a small probability. This is the same
         * policy as applied to native processes and system apps. */
        parsedArgs.mRuntimeFlags |= Zygote.GWP_ASAN_LEVEL_LOTTERY;

        if (shouldProfileSystemServer()) {
            parsedArgs.mRuntimeFlags |= Zygote.PROFILE_SYSTEM_SERVER;
        }

        /* Request to fork the system server process */
        /*
        * This calls the native nativeForkSystemServer; inside it, zygote::ForkCommon is where fork() finally appears
        */
        pid = Zygote.forkSystemServer(
                parsedArgs.mUid, parsedArgs.mGid,
                parsedArgs.mGids,
                parsedArgs.mRuntimeFlags,
                null,
                parsedArgs.mPermittedCapabilities,
                parsedArgs.mEffectiveCapabilities);
    } catch (IllegalArgumentException ex) {
        throw new RuntimeException(ex);
    }

    /* In the system server child process */
    if (pid == 0) {
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        // fork copies the server socket; system server has to close its copy itself
        zygoteServer.closeServerSocket();
        // Post-fork work, e.g. creating the PathClassLoader and calling ZygoteInit.zygoteInit()
        return handleSystemServerProcess(parsedArgs);
    }

    return null;
}
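
The capability handling at the top of forkSystemServer is plain bit arithmetic: each capability number becomes one bit of a 64-bit mask, and the mask is then intersected with what the current process actually holds. The sketch below reproduces that logic in standalone Java. CapabilityMaskSketch and its methods are hypothetical names, the capability numbers are the values from Linux's linux/capability.h, and the two int parameters stand in for data[0].effective and data[1].effective. One deliberate difference: the sketch masks each word with 0xFFFFFFFFL before combining, so that a set top bit in the low word cannot sign-extend into the high word.

```java
class CapabilityMaskSketch {
    // Capability numbers as defined in linux/capability.h.
    static final int CAP_KILL = 5;
    static final int CAP_NET_ADMIN = 12;
    static final int CAP_BLOCK_SUSPEND = 36;

    /** Turns a list of capability numbers into a 64-bit mask, one bit per capability. */
    static long asBits(int... capabilities) {
        long result = 0;
        for (int capability : capabilities) {
            if (capability < 0 || capability > 63) {
                throw new IllegalArgumentException("Bad capability: " + capability);
            }
            result |= (1L << capability);
        }
        return result;
    }

    /** Drops every requested capability that is absent from the two 32-bit effective words. */
    static long restrictToEffective(long wanted, int effectiveLow, int effectiveHigh) {
        long available = (effectiveLow & 0xFFFFFFFFL) | ((effectiveHigh & 0xFFFFFFFFL) << 32);
        return wanted & available;
    }

    public static void main(String[] args) {
        long wanted = asBits(CAP_KILL, CAP_NET_ADMIN, CAP_BLOCK_SUSPEND);
        // Pretend the container only grants CAP_KILL (low word) and CAP_BLOCK_SUSPEND (high word).
        long granted = restrictToEffective(wanted, 1 << CAP_KILL, 1 << (CAP_BLOCK_SUSPEND - 32));
        System.out.println("wanted=0x" + Long.toHexString(wanted)
                + " granted=0x" + Long.toHexString(granted));
    }
}
```

The resulting number is what ends up in the "--capabilities=" argument, written twice because the permitted and effective sets are requested with the same value.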

After the fork, the system_server process calls handleSystemServerProcess, which creates the PathClassLoader used to load its classes and then calls ZygoteInit.zygoteInit():

public static Runnable zygoteInit(int targetSdkVersion, long[] disabledCompatChanges,
        String[] argv, ClassLoader classLoader) {
    if (RuntimeInit.DEBUG) {
        Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
    }

    Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
    RuntimeInit.redirectLogStreams();

    RuntimeInit.commonInit();
    ZygoteInit.nativeZygoteInit();
    // Finds the main() entry method of the Java class (com.android.server.SystemServer) and wraps it in a MethodAndArgsCaller
    // The Runnable's run() method simply invokes main() through reflection
    return RuntimeInit.applicationInit(targetSdkVersion, disabledCompatChanges, argv,
            classLoader);
}

Java: RuntimeInit.java

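RuntimeInit is the startup class for application mode; it is much simpler than ZygoteInit.

  • preForkInit: enables DDMS
  • commonInit: sets up logging and the exception handlers
  • nativeFinishInit: ends up in the onStarted callback of AppRuntime in app_main.cpp, which calls main() of the class passed on the command line. Processes forked by zygote reach their entry point through ZygoteInit.zygoteInit() instead: com.android.server.SystemServer.main() for the system_server process and ActivityThread.main() for an app process.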

public static final void main(String[] argv) {
    preForkInit();
    if (argv.length == 2 && argv[1].equals("application")) {
        if (DEBUG) Slog.d(TAG, "RuntimeInit: Starting application");
        redirectLogStreams();
    } else {
        if (DEBUG) Slog.d(TAG, "RuntimeInit: Starting tool");
    }

    commonInit();

    /*
     * Now that we're running in interpreted code, call back into native code
     * to run the system.
     */
    nativeFinishInit();

    if (DEBUG) Slog.d(TAG, "Leaving RuntimeInit!");
}

commonInit()

Mainly sets up logging and the exception handlers.

protected static final void commonInit() {
    if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");

    /*
     * set handlers; these apply to all threads in the VM. Apps can replace
     * the default handler, but not the pre handler.
     */
    LoggingHandler loggingHandler = new LoggingHandler();
    RuntimeHooks.setUncaughtExceptionPreHandler(loggingHandler);
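    // Default uncaught-exception handler. Unlike plain Java, an uncaught exception on any thread of an Android app terminates the process.
    // A common interview question: how does exception handling for threads differ between Android and Java?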
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));

    /*
     * Install a time zone supplier that uses the Android persistent time zone system property.
     */
    RuntimeHooks.setTimeZoneIdSupplier(() -> SystemProperties.get("persist.sys.timezone"));

    /*
     * Reset the JDK log handlers; Android registers AndroidLogHandler on the root logger
     */
    LogManager.getLogManager().reset();
    new AndroidConfig();

    /*
     * The default UA looks like Dalvik/1.1.0 (Linux; U; Android Eclair Build/MAIN)
     */
    String userAgent = getDefaultUserAgent();
    System.setProperty("http.agent", userAgent);

    /*
     * Traffic statistics
     */
    TrafficStats.attachSocketTagger();

    initialized = true;
}
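
The difference between the two platforms comes down to what the default handler does. The sketch below uses only the standard Thread.setDefaultUncaughtExceptionHandler API to show the hook that commonInit() installs a handler into. UncaughtHandlerSketch is a hypothetical class, and its handler only records the crash; per the comment above, Android's KillApplicationHandler goes further and terminates the whole process, which this sketch intentionally does not do.

```java
import java.util.concurrent.atomic.AtomicReference;

class UncaughtHandlerSketch {
    static final AtomicReference<String> lastCrash = new AtomicReference<>();

    /** Installs a process-wide handler for exceptions that no thread catches. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            // Sketch only: record the crash. A handler like Android's would
            // report the crash and then kill the process at this point.
            lastCrash.set(thread.getName() + ": " + error.getMessage());
        });
    }

    /** Starts a worker that throws, waits for it to die, and returns what the handler saw. */
    static String crashOnWorker(String message) {
        install();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException(message);
        }, "worker");
        worker.start();
        try {
            worker.join(); // the handler runs on the dying thread before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return lastCrash.get();
    }

    public static void main(String[] args) {
        System.out.println("handler saw: " + crashOnWorker("boom"));
        System.out.println("main thread is still running");
    }
}
```

In this plain-Java run only the worker thread dies and the main thread carries on, which is exactly the behavior that differs on Android once KillApplicationHandler is the default handler.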

Summary

Finally, a flowchart of the whole process:
[Figure: sequence diagram of the Zygote startup flow]