Analysis of the App Crash Flow in Android 6.0

xiaoxiao2021-02-28  73

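The aim of walking through this flow is to work out how an app can intercept the system's app-crash dialog and present something more user-friendly.

This article is based on the Android 6.0 source code and analyzes how an Android application crash is handled.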

/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
/frameworks/base/core/java/android/app/ActivityManagerNative.java (contains the inner class AMP)
/frameworks/base/core/java/android/app/ApplicationErrorReport.java

/frameworks/base/services/core/java/com/android/server/
    - am/ActivityManagerService.java
    - am/ProcessRecord.java
    - am/ActivityRecord.java
    - am/ActivityStackSupervisor.java
    - am/ActivityStack.java
    - am/BroadcastQueue.java
    - wm/WindowManagerService.java

/libcore/libart/src/main/java/java/lang/Thread.java

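I. Overview

An app crash is short for application crash. Crashes fall into two groups: native crashes and framework crashes (the latter include app crashes). Most app developers have run into crashes, so when does a crash occur at the upper layers, and how does the system handle it? For example, apps commonly use try...catch statements; if an exception is not caught effectively, the app crashes. When an exception goes uncaught, the system catches it and enters the crash flow. If you work on Android system development or architecture, or need to track down hard system-level problems, it is well worth understanding how the system handles crashes, so that you know not only what happens but also why. If you are just starting out in app development, this article may not suit you, because the overall system flow is intricate.

As described in the Android system startup series, all upper-layer processes are forked from Zygote, and they divide into the system_server process and the various application processes. When each of these processes is created, it installs a handler for uncaught exceptions; whenever an uncaught exception is thrown, it ultimately reaches that handler.

For the system_server process (see the article "Android System Startup: SystemServer, Part 1"), the commonInit method in RuntimeInit.java installs UncaughtHandler during startup to handle uncaught exceptions. For ordinary app processes (see the article "Understanding the Android Process Creation Flow"), process creation likewise calls commonInit in RuntimeInit.java to install UncaughtHandler.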

1.1 Crash call chain

The method call relationships in the crash flow are as follows:

AMP.handleApplicationCrash
    AMS.handleApplicationCrash
        AMS.findAppProcess
        AMS.handleApplicationCrashInner
            AMS.addErrorToDropBox
            AMS.crashApplication
                AMS.makeAppCrashingLocked
                    AMS.startAppProblemLocked
                    ProcessRecord.stopFreezingAllLocked
                        ActivityRecord.stopFreezingScreenLocked
                            WMS.stopFreezingScreenLocked
                                WMS.stopFreezingDisplayLocked
                    AMS.handleAppCrashLocked
                mUiHandler.sendMessage(SHOW_ERROR_MSG)

Process.killProcess(Process.myPid());
System.exit(10);

The rest of this article walks through this process.

II. Crash handling flow

We take the commonInit() method as the starting point.

1. RuntimeInit.commonInit

public class RuntimeInit {
    ...
    private static final void commonInit() {
        // Install the default uncaught-exception handler; for UncaughtHandler see [Section 2]
        Thread.setDefaultUncaughtExceptionHandler(new UncaughtHandler());
        ...
    }
}

setDefaultUncaughtExceptionHandler() simply assigns the handler object to a static member of Thread, i.e. Thread.defaultUncaughtHandler = new UncaughtHandler(). Next, let's look at the UncaughtHandler class itself.
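Because the default handler is just a replaceable static reference, an app can install its own handler on top of the one set by the runtime, which is the usual starting point for customizing crash behavior. The following is a minimal plain-Java sketch, not framework code: the class name ChainingCrashHandler and its lastCrash field are hypothetical. On Android the saved "previous" handler would be the framework's UncaughtHandler, and delegating to it keeps the system crash flow described in this article.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper for illustration; not part of the Android framework.
final class ChainingCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler previous;
    // Last throwable seen, kept so that the app can log or report it.
    final AtomicReference<Throwable> lastCrash = new AtomicReference<>();

    ChainingCrashHandler(Thread.UncaughtExceptionHandler previous) {
        this.previous = previous;
    }

    // Saves the current default handler and installs this one in its place.
    static ChainingCrashHandler install() {
        ChainingCrashHandler handler =
                new ChainingCrashHandler(Thread.getDefaultUncaughtExceptionHandler());
        Thread.setDefaultUncaughtExceptionHandler(handler);
        return handler;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        lastCrash.set(e); // app-specific handling would go here
        if (previous != null) {
            // Delegate so that the original handler still runs.
            previous.uncaughtException(t, e);
        }
    }
}
```

If the custom handler does not delegate, the original handler never runs, so the app becomes responsible for terminating the process itself.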

2. UncaughtHandler

[–>RuntimeInit.java]

private static class UncaughtHandler implements Thread.UncaughtExceptionHandler {
    // Overrides the interface method
    public void uncaughtException(Thread t, Throwable e) {
        try {
            // Make sure crash handling is not re-entered
            if (mCrashing) return;
            mCrashing = true;

            if (mApplicationObject == null) {
                // system_server process
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                // Ordinary app process
                StringBuilder message = new StringBuilder();
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }

            // Bring up the crash dialog and wait for it to be handled [see Sections 2.1 and 3]
            ActivityManagerNative.getDefault().handleApplicationCrash(
                    mApplicationObject, new ApplicationErrorReport.CrashInfo(e));
        } catch (Throwable t2) {
            ...
        } finally {
            // Make sure the current process is killed completely [see Section 11]
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
    }
}

When the system process crashes, the log starts with *** FATAL EXCEPTION IN SYSTEM PROCESS [thread name], followed by the call stack at the time of the crash.
When an app process crashes, the log starts with FATAL EXCEPTION: [thread name], followed by Process: [process name], PID: [process id], and finally the call stack at the time of the crash.

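At this point it is clear that, to find crash information in the log, you only need to search for the keyword FATAL EXCEPTION. To narrow the search to system crashes only, several keywords work, for example *** FATAL EXCEPTION.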
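The log header format can be reproduced in a few lines of plain Java, which is handy when writing a log filter. This is only a sketch modeled on the strings in the listing above; the class CrashLogHeader and its method names are hypothetical.

```java
// Hypothetical helper that mirrors the header strings built in UncaughtHandler.
final class CrashLogHeader {
    // Header logged for an ordinary app process.
    static String forApp(String threadName, String processName, int pid) {
        StringBuilder message = new StringBuilder();
        message.append("FATAL EXCEPTION: ").append(threadName).append("\n");
        if (processName != null) {
            message.append("Process: ").append(processName).append(", ");
        }
        message.append("PID: ").append(pid);
        return message.toString();
    }

    // Header logged for the system process.
    static String forSystem(String threadName) {
        return "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + threadName;
    }

    // Matches both app and system crashes.
    static boolean isCrashLine(String line) {
        return line.contains("FATAL EXCEPTION");
    }

    // Matches system crashes only.
    static boolean isSystemCrashLine(String line) {
        return line.contains("*** FATAL EXCEPTION");
    }
}
```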

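Writing the crash information to logcat is only the beginning of the crash flow. The next step is to show the crash dialog. ActivityManagerNative.getDefault() returns an ActivityManagerProxy (AMP for short); through a binder call, AMP hands the request to the corresponding method in ActivityManagerService (AMS for short), so the next method invoked is AMS.handleApplicationCrash().

Note: if mApplicationObject is null, the process is definitely not an ordinary app process. Besides the system process, it may also be a shell process, i.e. one created via app_process plus command arguments.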

2.1 CrashInfo

[-> ApplicationErrorReport.java]

public class ApplicationErrorReport implements Parcelable {
    ...
    public static class CrashInfo {
        public CrashInfo(Throwable tr) {
            StringWriter sw = new StringWriter();
            PrintWriter pw = new FastPrintWriter(sw, false, 256);
            tr.printStackTrace(pw); // Print the stack trace
            pw.flush();
            stackTrace = sw.toString();
            exceptionMessage = tr.getMessage();

            // Walk the cause chain to find the root cause
            Throwable rootTr = tr;
            while (tr.getCause() != null) {
                tr = tr.getCause();
                if (tr.getStackTrace() != null && tr.getStackTrace().length > 0) {
                    rootTr = tr;
                }
                String msg = tr.getMessage();
                if (msg != null && msg.length() > 0) {
                    exceptionMessage = msg;
                }
            }

            exceptionClassName = rootTr.getClass().getName();
            if (rootTr.getStackTrace().length > 0) {
                StackTraceElement trace = rootTr.getStackTrace()[0];
                throwFileName = trace.getFileName();
                throwClassName = trace.getClassName();
                throwMethodName = trace.getMethodName();
                throwLineNumber = trace.getLineNumber();
            } else {
                throwFileName = "unknown";
                throwClassName = "unknown";
                throwMethodName = "unknown";
                throwLineNumber = 0;
            }
        }
        ...
    }
}

The file name, class name, method name, line number, and exception message of the crash are all packaged into the CrashInfo object.
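The cause-chain walk in CrashInfo is easy to try out on its own. The sketch below reimplements just that loop in plain Java; the class RootCause and its describe() format are hypothetical and exist only for illustration.

```java
// Hypothetical helper that mirrors the root-cause loop in CrashInfo.
final class RootCause {
    // Returns the deepest cause in the chain that has a non-empty stack trace.
    static Throwable of(Throwable tr) {
        Throwable root = tr;
        while (tr.getCause() != null) {
            tr = tr.getCause();
            if (tr.getStackTrace() != null && tr.getStackTrace().length > 0) {
                root = tr;
            }
        }
        return root;
    }

    // Summarizes the root cause as "class at declaringClass.method:line".
    static String describe(Throwable tr) {
        Throwable root = of(tr);
        StackTraceElement[] frames = root.getStackTrace();
        String where = "unknown";
        if (frames.length > 0) {
            StackTraceElement top = frames[0];
            where = top.getClassName() + "." + top.getMethodName() + ":" + top.getLineNumber();
        }
        return root.getClass().getName() + " at " + where;
    }
}
```

For an exception such as new RuntimeException("outer", new IllegalArgumentException("inner")), the reported class is the inner IllegalArgumentException, which matches how exceptionClassName is derived from rootTr in the listing above.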

3. handleApplicationCrash

[–>ActivityManagerService.java]

public void handleApplicationCrash(IBinder app, ApplicationErrorReport.CrashInfo crashInfo) {
    // Look up the process record [see Section 3.1]
    ProcessRecord r = findAppProcess(app, "Crash");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    // [see Section 4]
    handleApplicationCrashInner("crash", r, processName, crashInfo);
}

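About the process name (processName):

- when the remote IBinder object is null, the process name is system_server;
- when the remote IBinder object is not null but the ProcessRecord is null, the process name is unknown;
- when neither the remote IBinder object nor the ProcessRecord is null, the process name is the one stored in the ProcessRecord.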
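The three cases amount to a small decision function. A plain-Java sketch follows; the class CrashProcessName and its parameters are hypothetical stand-ins for the IBinder and ProcessRecord used by the real method.

```java
// Hypothetical stand-in for the processName expression in handleApplicationCrash.
final class CrashProcessName {
    // hasBinder:  whether the IBinder passed by the crashing process was non-null.
    // recordName: processName of the matching ProcessRecord, or null if none was found.
    static String resolve(boolean hasBinder, String recordName) {
        if (!hasBinder) {
            return "system_server";
        }
        return recordName == null ? "unknown" : recordName;
    }
}
```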

3.1 findAppProcess

[–>ActivityManagerService.java]

private ProcessRecord findAppProcess(IBinder app, String reason) {
    if (app == null) {
        return null;
    }

    synchronized (this) {
        final int NP = mProcessNames.getMap().size();
        for (int ip=0; ip<NP; ip++) {
            SparseArray<ProcessRecord> apps = mProcessNames.getMap().valueAt(ip);
            final int NA = apps.size();
            for (int ia=0; ia<NA; ia++) {
                ProcessRecord p = apps.valueAt(ia);
                // Return as soon as the target process is found
                if (p.thread != null && p.thread.asBinder() == app) {
                    return p;
                }
            }
        }
        // Reaching this point means the app's process could not be found
        return null;
    }
}

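Here mProcessNames = new ProcessMap<ProcessRecord>(); mProcessNames.getMap() returns mMap, where mMap = new ArrayMap<String, SparseArray<ProcessRecord>>();

Further reading: SparseArray and ArrayMap are data structures that Android designed specifically to save memory, as replacements for HashMap from the Java API. Use SparseArray when the key is an int, which avoids autoboxing; use ArrayMap for other key types. HashMap achieves O(1) average lookup and insertion at the cost of considerable memory, whereas SparseArray and ArrayMap, which rely on binary search over arrays, are somewhat slower than HashMap but use less memory.

Back to mMap: its key is the process name, and its value is itself a structure whose key is the uid and whose value is a ProcessRecord. Here are its get() and put() methods: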

// Returns the ProcessRecord stored in mMap for (name, uid)
public ProcessRecord get(String name, int uid) {};
// Adds (name, uid, value) to mMap
public ProcessRecord put(String name, int uid, ProcessRecord value) {};
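The two-level (name, uid) lookup can be sketched in plain Java. This is an illustration only: SimpleProcessMap is a hypothetical class, and java.util.HashMap stands in for ArrayMap and SparseArray, which are Android-only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplified stand-in for ProcessMap: name -> (uid -> value).
final class SimpleProcessMap<E> {
    private final Map<String, Map<Integer, E>> map = new HashMap<>();

    // Returns the value stored for (name, uid), or null if there is none.
    E get(String name, int uid) {
        Map<Integer, E> uids = map.get(name);
        return uids == null ? null : uids.get(uid);
    }

    // Stores value under (name, uid), creating the inner map on demand.
    E put(String name, int uid, E value) {
        map.computeIfAbsent(name, k -> new HashMap<>()).put(uid, value);
        return value;
    }

    // Removes (name, uid) and drops the inner map once it is empty.
    void remove(String name, int uid) {
        Map<Integer, E> uids = map.get(name);
        if (uids != null) {
            uids.remove(uid);
            if (uids.isEmpty()) {
                map.remove(name);
            }
        }
    }
}
```

Keying by name first and uid second lets the same process name exist under several uids, for example for different users.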

findAppProcess() uses app (of type IBinder) to look up the matching ProcessRecord.

With the ProcessRecord and the process name in hand, execution moves on to the crash handling method described next.

4. handleApplicationCrashInner

[–>ActivityManagerService.java]

void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
        ApplicationErrorReport.CrashInfo crashInfo) {
    // Write the crash information to the event log
    EventLog.writeEvent(EventLogTags.AM_CRASH,...);
    // Add the error information to DropBox
    addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
    // [see Section 5]
    crashApplication(r, crashInfo);
}

addErrorToDropBox writes the crash information to the /data/system/dropbox directory. For example, the dropbox file for system_server is named system_server_crash@xxx.txt (where xxx is a timestamp).

5. crashApplication

[–>ActivityManagerService.java]

private void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
    long timeMillis = System.currentTimeMillis();
    String shortMsg = crashInfo.exceptionClassName;
    String longMsg = crashInfo.exceptionMessage;
    String stackTrace = crashInfo.stackTrace;
    if (shortMsg != null && longMsg != null) {
        longMsg = shortMsg + ": " + longMsg;
    } else if (shortMsg != null) {
        longMsg = shortMsg;
    }

    AppErrorResult result = new AppErrorResult();
    synchronized (this) {
        // An ActivityController is present, e.g. monkey
        if (mController != null) {
            try {
                String name = r != null ? r.processName : null;
                int pid = r != null ? r.pid : Binder.getCallingPid();
                int uid = r != null ? r.info.uid : Binder.getCallingUid();
                // Call monkey's appCrashed
                if (!mController.appCrashed(name, pid,
                        shortMsg, longMsg, timeMillis, crashInfo.stackTrace)) {
                    if ("1".equals(SystemProperties.get(SYSTEM_DEBUGGABLE, "0"))
                            && "Native crash".equals(crashInfo.exceptionClassName)) {
                        Slog.w(TAG, "Skip killing native crashed app " + name
                                + "(" + pid + ") during testing");
                    } else {
                        Slog.w(TAG, "Force-killing crashed app " + name
                                + " at watcher's request");
                        if (r != null) {
                            r.kill("crash", true);
                        } else {
                            Process.killProcess(pid);
                            killProcessGroup(uid, pid);
                        }
                    }
                    return;
                }
            } catch (RemoteException e) {
                mController = null;
                Watchdog.getInstance().setActivityController(null);
            }
        }

        // Clear the remote caller's uid and pid, saving them in origId
        final long origId = Binder.clearCallingIdentity();
        ...

        // [see Section 6]
        if (r == null || !makeAppCrashingLocked(r, shortMsg, longMsg, stackTrace)) {
            Binder.restoreCallingIdentity(origId);
            return;
        }

        Message msg = Message.obtain();
        msg.what = SHOW_ERROR_MSG;
        HashMap data = new HashMap();
        data.put("result", result);
        data.put("app", r);
        msg.obj = data;
        // Send SHOW_ERROR_MSG to show the crash dialog and wait for the user's choice [see Section 10]
        mUiHandler.sendMessage(msg);

        // Restore the remote caller's uid and pid
        Binder.restoreCallingIdentity(origId);
    }

    // Block until the user chooses "Quit" or "Quit and report" in the crash dialog
    int res = result.get();

    Intent appErrorIntent = null;
    synchronized (this) {
        if (r != null && !r.isolated) {
            // Record the crash time of the process in mProcessCrashTimes
            mProcessCrashTimes.put(r.info.processName, r.uid,
                    SystemClock.uptimeMillis());
        }
        if (res == AppErrorDialog.FORCE_QUIT_AND_REPORT) {
            // Create an Intent with action "android.intent.action.APP_ERROR"
            // whose component is r.errorReportReceiver
            appErrorIntent = createAppErrorIntentLocked(r, timeMillis, crashInfo);
        }
    }

    if (appErrorIntent != null) {
        try {
            // Start the Activity described by appErrorIntent
            mContext.startActivityAsUser(appErrorIntent, new UserHandle(r.userId));
        } catch (ActivityNotFoundException e) {
            Slog.w(TAG, "bug report receiver dissappeared", e);
        }
    }
}

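This method does two main things:

- it calls makeAppCrashingLocked to continue the crash flow;
- it sends the SHOW_ERROR_MSG message to show the crash dialog, then waits for the user's choice.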
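The "send a message, then block on result.get()" step relies on a small hand-off object: one thread waits for a result that another thread (the UI thread showing the dialog) sets later. The real AppErrorResult is not shown in this article, so the following is only a sketch of such a hand-off using wait/notifyAll; the class name BlockingResult is hypothetical.

```java
// Hypothetical sketch of a blocking result hand-off between two threads.
final class BlockingResult {
    private boolean hasResult;
    private int result;

    // Called by the thread that owns the dialog once the user has chosen.
    synchronized void set(int res) {
        hasResult = true;
        result = res;
        notifyAll(); // wake up any thread blocked in get()
    }

    // Blocks the calling thread until set() has been called.
    synchronized int get() {
        while (!hasResult) {
            try {
                wait();
            } catch (InterruptedException ignored) {
                // Keep waiting: this sketch treats the result as mandatory.
            }
        }
        return result;
    }
}
```

The while loop around wait() guards against spurious wake-ups, and set() may safely be called before get().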

6. makeAppCrashingLocked

[–>ActivityManagerService.java]

private boolean makeAppCrashingLocked(ProcessRecord app,
        String shortMsg, String longMsg, String stackTrace) {
    app.crashing = true;
    // Package the crash information into the crashingReport object
    app.crashingReport = generateProcessError(app,
            ActivityManager.ProcessErrorStateInfo.CRASHED, null, shortMsg, longMsg, stackTrace);
    // [see Section 7]
    startAppProblemLocked(app);
    // Stop screen freezing [see Section 8]
    app.stopFreezingAllLocked();
    // [see Section 9]
    return handleAppCrashLocked(app, "force-crash", shortMsg, longMsg, stackTrace);
}

7. startAppProblemLocked

[–>ActivityManagerService.java]

void startAppProblemLocked(ProcessRecord app) {
    app.errorReportReceiver = null;

    for (int userId : mCurrentProfileIds) {
        if (app.userId == userId) {
            // Get the crashed app's error receiver for the current user [see Section 7.1]
            app.errorReportReceiver = ApplicationErrorReport.getErrorReportReceiver(
                    mContext, app.info.packageName, app.info.flags);
        }
    }
    // Skip the broadcast the app is currently receiving [see Section 7.2]
    skipCurrentReceiverLocked(app);
}

The main functions of this method are:

- getting the crashed app's error receiver for the current user;
- skipping the broadcast the app is currently receiving.

7.1 getErrorReportReceiver

[-> ApplicationErrorReport.java]

public static ComponentName getErrorReportReceiver(Context context,
        String packageName, int appFlags) {
    // Check whether "send_action_app_error" in Settings enables error reporting
    int enabled = Settings.Global.getInt(context.getContentResolver(),
            Settings.Global.SEND_ACTION_APP_ERROR, 0);
    if (enabled == 0) {
        // 1. Not enabled: return immediately
        return null;
    }

    PackageManager pm = context.getPackageManager();

    String candidate = null;
    ComponentName result = null;
    try {
        // Get the package name of the crashed app's installer
        candidate = pm.getInstallerPackageName(packageName);
    } catch (IllegalArgumentException e) {
    }

    if (candidate != null) {
        result = getErrorReportReceiver(pm, packageName, candidate); // [see below]
        if (result != null) {
            // 2. The installer of the crashed app provides a receiver: return it
            return result;
        }
    }

    if ((appFlags&ApplicationInfo.FLAG_SYSTEM) != 0) {
        // This system property is named "ro.error.receiver.system.apps"
        candidate = SystemProperties.get(SYSTEM_APPS_ERROR_RECEIVER_PROPERTY);
        result = getErrorReportReceiver(pm, packageName, candidate); // [see below]
        if (result != null) {
            // 3. The crashed app is a system app and the property names an error receiver: return it
            return result;
        }
    }

    // This default property is named "ro.error.receiver.default"
    candidate = SystemProperties.get(DEFAULT_ERROR_RECEIVER_PROPERTY);
    // 4. Return the error receiver named by the default property, if any
    return getErrorReportReceiver(pm, packageName, candidate); // [see below]
}

getErrorReportReceiver: this is an overload with the same name but different parameters:

static ComponentName getErrorReportReceiver(PackageManager pm, String errorPackage,
        String receiverPackage) {
    if (receiverPackage == null || receiverPackage.length() == 0) {
        return null;
    }

    // If the crashed package is the receiver (installer) itself, return immediately
    if (receiverPackage.equals(errorPackage)) {
        return null;
    }

    // ACTION_APP_ERROR is "android.intent.action.APP_ERROR"
    Intent intent = new Intent(Intent.ACTION_APP_ERROR);
    intent.setPackage(receiverPackage);
    ResolveInfo info = pm.resolveActivity(intent, 0);
    if (info == null || info.activityInfo == null) {
        return null;
    }
    // Create a component whose package is receiverPackage
    return new ComponentName(receiverPackage, info.activityInfo.name);
}
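The priority order used to pick the error receiver (installer first, then the system-apps property for system apps, then the default property) can be tried out without any Android classes. In the plain-Java sketch below, the class ReceiverSelection is hypothetical, strings stand in for ComponentName, and the resolver function is an assumed stand-in for PackageManager.resolveActivity that returns null when the package has no matching activity.

```java
import java.util.function.Function;

// Hypothetical plain-Java model of the receiver selection order.
final class ReceiverSelection {
    static String pick(boolean enabled, String crashedPkg, String installerPkg,
                       boolean isSystemApp, String systemAppsReceiver,
                       String defaultReceiver, Function<String, String> resolver) {
        if (!enabled) {
            return null; // 1. error reporting is disabled
        }
        String result = resolve(crashedPkg, installerPkg, resolver);
        if (result != null) {
            return result; // 2. the installer handles the report
        }
        if (isSystemApp) {
            result = resolve(crashedPkg, systemAppsReceiver, resolver);
            if (result != null) {
                return result; // 3. receiver configured for system apps
            }
        }
        return resolve(crashedPkg, defaultReceiver, resolver); // 4. default receiver
    }

    // Mirrors the overload: empty candidates and self-reports are rejected.
    private static String resolve(String crashedPkg, String receiverPkg,
                                  Function<String, String> resolver) {
        if (receiverPkg == null || receiverPkg.isEmpty()) {
            return null;
        }
        if (receiverPkg.equals(crashedPkg)) {
            return null;
        }
        return resolver.apply(receiverPkg);
    }
}
```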

7.2 skipCurrentReceiverLocked

[–>ActivityManagerService.java]

void skipCurrentReceiverLocked(ProcessRecord app) {
    for (BroadcastQueue queue : mBroadcastQueues) {
        queue.skipCurrentReceiverLocked(app); // [see Section 7.2.1]
    }
}

7.2.1 skipCurrentReceiverLocked

[-> BroadcastQueue.java]

public void skipCurrentReceiverLocked(ProcessRecord app) {
    BroadcastRecord r = null;
    // Look for a broadcast currently being delivered to the app's process
    if (mOrderedBroadcasts.size() > 0) {
        BroadcastRecord br = mOrderedBroadcasts.get(0);
        if (br.curApp == app) {
            r = br;
        }
    }
    if (r == null && mPendingBroadcast != null && mPendingBroadcast.curApp == app) {
        r = mPendingBroadcast;
    }

    if (r != null) {
        // Finish the broadcast for the app's process
        finishReceiverLocked(r, r.resultCode, r.resultData,
                r.resultExtras, r.resultAbort, false);
        // Schedule the next broadcast
        scheduleBroadcastsLocked();
    }
}

8. PR.stopFreezingAllLocked

[-> ProcessRecord.java]

public void stopFreezingAllLocked() {
    int i = activities.size();
    while (i > 0) {
        i--;
        activities.get(i).stopFreezingScreenLocked(true); // [see Section 8.1]
    }
}

Here activities is an ArrayList<ActivityRecord>; the loop stops screen freezing for every Activity in the process.

8.1. AR.stopFreezingScreenLocked

[-> ActivityRecord.java]

public void stopFreezingScreenLocked(boolean force) {
    if (force || frozenBeforeDestroy) {
        frozenBeforeDestroy = false;
        // mWindowManager is of type WMS
        // [see Section 8.1.1]
        service.mWindowManager.stopAppFreezingScreen(appToken, force);
    }
}

Here appToken is of type IApplicationToken.Stub, i.e. the token used by the WindowManager.

8.1.1 WMS.stopFreezingScreenLocked

[-> WindowManagerService.java]

@Override
public void stopFreezingScreen() {
    // Permission check
    if (!checkCallingPermission(android.Manifest.permission.FREEZE_SCREEN,
            "stopFreezingScreen()")) {
        throw new SecurityException("Requires FREEZE_SCREEN permission");
    }

    synchronized(mWindowMap) {
        if (mClientFreezingScreen) {
            mClientFreezingScreen = false;
            mLastFinishedFreezeSource = "client";
            final long origId = Binder.clearCallingIdentity();
            try {
                stopFreezingDisplayLocked(); // [see Section 8.1.1.1]
            } finally {
                Binder.restoreCallingIdentity(origId);
            }
        }
    }
}

8.1.1.1 WMS.stopFreezingDisplayLocked

[-> WindowManagerService.java]

private void stopFreezingDisplayLocked() {
    if (!mDisplayFrozen) {
        return; // The display is not frozen: return immediately
    }

    // Usually related to screen rotation
    ...

    mDisplayFrozen = false;
    // Total time elapsed since the display was frozen
    mLastDisplayFreezeDuration = (int)(SystemClock.elapsedRealtime() - mDisplayFreezeTime);

    // Remove the freeze timeout messages
    mH.removeMessages(H.APP_FREEZE_TIMEOUT);
    mH.removeMessages(H.CLIENT_FREEZE_TIMEOUT);

    boolean updateRotation = false;
    // Get the default DisplayContent
    final DisplayContent displayContent = getDefaultDisplayContentLocked();
    final int displayId = displayContent.getDisplayId();
    ScreenRotationAnimation screenRotationAnimation =
            mAnimator.getScreenRotationAnimationLocked(displayId);

    // Screen rotation animation handling
    if (CUSTOM_SCREEN_ROTATION && screenRotationAnimation != null
            && screenRotationAnimation.hasScreenshot()) {
        DisplayInfo displayInfo = displayContent.getDisplayInfo();
        boolean isDimming = displayContent.isDimming();
        if (!mPolicy.validateRotationAnimationLw(mExitAnimId, mEnterAnimId, isDimming)) {
            mExitAnimId = mEnterAnimId = 0;
        }
        // The animation lasts at most 10s
        if (screenRotationAnimation.dismiss(mFxSession, MAX_ANIMATION_DURATION,
                getTransitionAnimationScaleLocked(), displayInfo.logicalWidth,
                displayInfo.logicalHeight, mExitAnimId, mEnterAnimId)) {
            scheduleAnimationLocked();
        } else {
            screenRotationAnimation.kill();
            mAnimator.setScreenRotationAnimationLocked(displayId, null);
            updateRotation = true;
        }
    } else {
        if (screenRotationAnimation != null) {
            screenRotationAnimation.kill();
            mAnimator.setScreenRotationAnimationLocked(displayId, null);
        }
        updateRotation = true;
    }

    // Through several layers of calls this reaches InputManagerService,
    // which re-enables input event dispatching
    mInputMonitor.thawInputDispatchingLw();

    boolean configChanged;
    // The orientation is not recomputed while the display is frozen,
    // to avoid inconsistent states, so re-evaluate it now
    configChanged = updateOrientationFromAppTokensLocked(false);

    // Schedule a GC (delayed by 2 seconds) now that the freeze is over
    mH.removeMessages(H.FORCE_GC);
    mH.sendEmptyMessageDelayed(H.FORCE_GC, 2000);

    // mScreenFrozenLock is a PowerManager.WakeLock; release the screen-freeze lock
    mScreenFrozenLock.release();

    if (updateRotation) {
        // Update the current screen rotation
        configChanged |= updateRotationUncheckedLocked(false);
    }

    if (configChanged) {
        // Send mH a message that the configuration has changed
        mH.sendEmptyMessage(H.SEND_NEW_CONFIGURATION);
    }
}

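The main functions of this method are:

- handling screen-rotation logic;
- removing the freeze timeout messages;
- running the screen rotation animation;
- re-enabling input event dispatching;
- scheduling a GC once the display freeze ends;
- updating the current screen rotation;
- sending mH a message that the configuration has changed.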

9.AMS.handleAppCrashLocked

[-> ActivityManagerService.java]

private boolean handleAppCrashLocked(ProcessRecord app, String reason,
        String shortMsg, String longMsg, String stackTrace) {
    long now = SystemClock.uptimeMillis();

    Long crashTime;
    if (!app.isolated) {
        crashTime = mProcessCrashTimes.get(app.info.processName, app.uid);
    } else {
        crashTime = null;
    }
    // Two crashes of the same process less than 1 minute apart count as crashing too often
    if (crashTime != null && now < crashTime+ProcessList.MIN_CRASH_INTERVAL) {
        EventLog.writeEvent(EventLogTags.AM_PROCESS_CRASHED_TOO_MUCH,
                app.userId, app.info.processName, app.uid);
        // [see Section 9.1]
        mStackSupervisor.handleAppCrashLocked(app);
        if (!app.persistent) {
            // Do not restart a non-persistent process unless the user explicitly asks for it
            EventLog.writeEvent(EventLogTags.AM_PROC_BAD, app.userId, app.uid,
                    app.info.processName);
            if (!app.isolated) {
                // Add the current app to mBadProcesses
                mBadProcesses.put(app.info.processName, app.uid,
                        new BadProcessInfo(now, shortMsg, longMsg, stackTrace));
                mProcessCrashTimes.remove(app.info.processName, app.uid);
            }
            app.bad = true;
            app.removed = true;
            // Remove all of the process's services so that it is not restarted [see Section 9.2]
            removeProcessLocked(app, false, false, "crash");
            // Resume the topmost Activity [see Section 9.3]
            mStackSupervisor.resumeTopActivitiesLocked();
            return false;
        }
        mStackSupervisor.resumeTopActivitiesLocked();
    } else {
        // Here reason = "force-crash" [see Section 9.4]
        mStackSupervisor.finishTopRunningActivityLocked(app, reason);
    }

    // Increment the crash count of every service running in this process
    for (int i=app.services.size()-1; i>=0; i--) {
        ServiceRecord sr = app.services.valueAt(i);
        sr.crashCount++;
    }

    // If the home app crashed and it is a third-party replacement,
    // clear the preferred-activity setting for the home app
    final ArrayList<ActivityRecord> activities = app.activities;
    if (app == mHomeProcess && activities.size() > 0
            && (mHomeProcess.info.flags & ApplicationInfo.FLAG_SYSTEM) == 0) {
        for (int activityNdx = activities.size() - 1; activityNdx >= 0; --activityNdx) {
            final ActivityRecord r = activities.get(activityNdx);
            if (r.isHomeActivity()) {
                // Clear the preferred activities
                ActivityThread.getPackageManager()
                        .clearPackagePreferredActivities(r.packageName);
            }
        }
    }

    if (!app.isolated) {
        // Crash times of isolated processes cannot be recorded, because they have no fixed identity
        mProcessCrashTimes.put(app.info.processName, app.uid, now);
    }
    // If the app has a crash handler, hand the crash over to it
    if (app.crashHandler != null) mHandler.post(app.crashHandler);
    return true;
}

When the same process crashes twice less than 1 minute apart:

- for a non-persistent process, the following run in order:
    [9.1] mStackSupervisor.handleAppCrashLocked(app);
    [9.2] removeProcessLocked(app, false, false, "crash");
    [9.3] mStackSupervisor.resumeTopActivitiesLocked();
- for a persistent process, [9.1] mStackSupervisor.handleAppCrashLocked(app) still runs (it is called before the persistent check), followed by [9.3] mStackSupervisor.resumeTopActivitiesLocked(); the process is not removed.

Otherwise:

    [9.4] mStackSupervisor.finishTopRunningActivityLocked(app, reason);
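The branch structure of handleAppCrashLocked can be condensed into a small pure function, which makes the boundary cases easy to check. This is a plain-Java sketch under stated assumptions: the class CrashPolicy and its Action enum are hypothetical, and MIN_CRASH_INTERVAL_MS is set to one minute because the comment in the listing describes the interval as 1 minute.

```java
// Hypothetical condensed model of the decision made in handleAppCrashLocked.
final class CrashPolicy {
    // Assumed value: the listing's comment describes the interval as 1 minute.
    static final long MIN_CRASH_INTERVAL_MS = 60 * 1000;

    enum Action {
        FINISH_TOP_ACTIVITY,                      // 9.4 only
        FINISH_ALL_AND_RESUME_TOP,                // 9.1 then 9.3 (persistent)
        FINISH_ALL_REMOVE_PROCESS_AND_RESUME_TOP  // 9.1, 9.2, 9.3 (non-persistent)
    }

    // lastCrashUptime is null when no earlier crash is recorded
    // (always the case for isolated processes).
    static Action decide(Long lastCrashUptime, long now, boolean persistent) {
        boolean tooFrequent = lastCrashUptime != null
                && now < lastCrashUptime + MIN_CRASH_INTERVAL_MS;
        if (!tooFrequent) {
            return Action.FINISH_TOP_ACTIVITY;
        }
        return persistent
                ? Action.FINISH_ALL_AND_RESUME_TOP
                : Action.FINISH_ALL_REMOVE_PROCESS_AND_RESUME_TOP;
    }
}
```

Note the strict comparison: a second crash exactly one interval after the first is not treated as too frequent.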

9.1 ASS.handleAppCrashLocked

[-> ActivityStackSupervisor.java]

void handleAppCrashLocked(ProcessRecord app) {
    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ArrayList<ActivityStack> stacks = mActivityDisplays.valueAt(displayNdx).mStacks;
        int stackNdx = stacks.size() - 1;
        while (stackNdx >= 0) {
            // Delegate to ActivityStack [see Section 9.1.1]
            stacks.get(stackNdx).handleAppCrashLocked(app);
            stackNdx--;
        }
    }
}

9.1.1 AS.handleAppCrashLocked

[-> ActivityStack.java]

void handleAppCrashLocked(ProcessRecord app) {
    for (int taskNdx = mTaskHistory.size() - 1; taskNdx >= 0; --taskNdx) {
        final ArrayList<ActivityRecord> activities = mTaskHistory.get(taskNdx).mActivities;
        for (int activityNdx = activities.size() - 1; activityNdx >= 0; --activityNdx) {
            final ActivityRecord r = activities.get(activityNdx);
            if (r.app == app) {
                r.app = null;
                // Finish the current activity
                finishCurrentActivityLocked(r, FINISH_IMMEDIATELY, false);
            }
        }
    }
}

Here mTaskHistory is an ArrayList that holds the task history, i.e. all previously started activities, including those in the background. The method iterates over every activity, finds all ActivityRecords that belong to this ProcessRecord, and finishes each of them.

9.2 AMS.removeProcessLocked

[-> ActivityManagerService.java]

private final boolean removeProcessLocked(ProcessRecord app,
        boolean callerWillRestart, boolean allowRestart, String reason) {
    final String name = app.processName;
    final int uid = app.uid;
    // Remove the process from mProcessNames
    removeProcessNameLocked(name, uid);
    ...
    if (app.pid > 0 && app.pid != MY_PID) {
        int pid = app.pid;
        synchronized (mPidsSelfLocked) {
            mPidsSelfLocked.remove(pid); // Remove this pid
            mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
        }
        ...
        boolean willRestart = false;
        // Mark non-isolated persistent processes as restartable
        if (app.persistent && !app.isolated) {
            if (!callerWillRestart) {
                willRestart = true;
            } else {
                needRestart = true;
            }
        }
        // Kill the process [see Section 9.2.1]
        app.kill(reason, true);
        // Remove the process and clean up its activities, services, and other components [see Section 9.2.2]
        handleAppDiedLocked(app, willRestart, allowRestart);
        if (willRestart) {
            // willRestart is false here, so this branch is not taken
            removeLruProcessLocked(app);
            addAppLocked(app.info, false, null /* ABI override */);
        }
    } else {
        mRemovedProcesses.add(app);
    }
    return needRestart;
}

mProcessNames is a ProcessMap keyed by process name that records all ProcessRecords.
mPidsSelfLocked is a SparseArray keyed by pid that records all ProcessRecords. It is guarded by its own lock rather than by the global ActivityManager lock.

9.2.1 app.kill

[-> ProcessRecord.java]

void kill(String reason, boolean noisy) {
    if (!killedByAm) {
        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "kill");
        if (noisy) {
            Slog.i(TAG, "Killing " + toShortString() + " (adj " + setAdj + "): " + reason);
        }
        EventLog.writeEvent(EventLogTags.AM_KILL, userId, pid, processName, setAdj, reason);
        Process.killProcessQuiet(pid);           // Kill the process
        Process.killProcessGroup(info.uid, pid); // Kill the process group, including native processes
        if (!persistent) {
            killed = true;
            killedByAm = true;
        }
        Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
    }
}

Here reason is "crash". For the details of killing a process, see my other article, "Understanding How Process Killing Is Implemented".

9.2.2 handleAppDiedLocked

[-> ActivityManagerService.java]

private final void handleAppDiedLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart) {
    int pid = app.pid;
    // Clean up the app's service/receiver/ContentProvider information
    boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1);
    if (!kept && !restarting) {
        removeLruProcessLocked(app);
        if (pid > 0) {
            ProcessList.remove(pid);
        }
    }

    if (mProfileProc == app) {
        clearProfilerLocked();
    }

    // Clean up the app's activity information
    boolean hasVisibleActivities = mStackSupervisor.handleAppDiedLocked(app);
    app.activities.clear();
    ...
    if (!restarting && hasVisibleActivities && !mStackSupervisor.resumeTopActivitiesLocked()) {
        mStackSupervisor.ensureActivitiesVisibleLocked(null, 0);
    }
}

9.3 ASS.resumeTopActivitiesLocked

[-> ActivityStackSupervisor.java]

boolean resumeTopActivitiesLocked() {
    return resumeTopActivitiesLocked(null, null, null);
}

boolean resumeTopActivitiesLocked(ActivityStack targetStack, ActivityRecord target,
        Bundle targetOptions) {
    if (targetStack == null) {
        targetStack = mFocusedStack;
    }

    boolean result = false;
    if (isFrontStack(targetStack)) {
        // [see Section 9.3.1]
        result = targetStack.resumeTopActivityLocked(target, targetOptions);
    }

    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ArrayList<ActivityStack> stacks = mActivityDisplays.valueAt(displayNdx).mStacks;
        for (int stackNdx = stacks.size() - 1; stackNdx >= 0; --stackNdx) {
            final ActivityStack stack = stacks.get(stackNdx);
            if (stack == targetStack) {
                continue; // Already resumed above
            }
            if (isFrontStack(stack)) {
                stack.resumeTopActivityLocked(null);
            }
        }
    }
    return result;
}

Here mFocusedStack is the ActivityStack that is currently waiting to receive input events or is launching the next activity.

9.3.1 AS.resumeTopActivityLocked

[-> ActivityStack.java]

final boolean resumeTopActivityLocked(ActivityRecord prev, Bundle options) {
    ...
    result = resumeTopActivityInnerLocked(prev, options); // [see Section 9.3.2]
    return result;
}

9.3.2 AS.resumeTopActivityInnerLocked

[-> ActivityStack.java]

private boolean resumeTopActivityInnerLocked(ActivityRecord prev, Bundle options) {
    // Find the first Activity in mTaskHistory that is not finishing
    final ActivityRecord next = topRunningActivityLocked(null);

    if (mResumedActivity == next && next.state == ActivityState.RESUMED &&
            mStackSupervisor.allResumedActivitiesComplete()) {
        // The top activity is already resumed: nothing to do
        return false;
    }

    if (mService.isSleepingOrShuttingDown()
            && mLastPausedActivity == next
            && mStackSupervisor.allPausedActivitiesComplete()) {
        // The device is sleeping and the top activity is paused: nothing to do
        return false;
    }

    // An activity of the app is being launched; make sure the app is not marked as stopped
    AppGlobals.getPackageManager().setPackageStoppedState(
            next.packageName, false, next.userId);
    // Calls back into the app's onResume method
    next.app.thread.scheduleResumeActivity(next.appToken, next.app.repProcState,
            mService.isNextTransitionForward(), resumeAnimOptions);
    ...
}

The method is quite long, so only a few of the more important statements are listed here. Once it completes, the app has finished resuming the activity.

9.4 finishTopRunningActivityLocked

9.4.1 ASS.finishTopRunningActivityLocked

[-> ActivityStackSupervisor.java]

void finishTopRunningActivityLocked(ProcessRecord app, String reason) {
    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ArrayList<ActivityStack> stacks = mActivityDisplays.valueAt(displayNdx).mStacks;
        final int numStacks = stacks.size();
        for (int stackNdx = 0; stackNdx < numStacks; ++stackNdx) {
            final ActivityStack stack = stacks.get(stackNdx);
            // Here reason = "force-crash" [see Section 9.4.2]
            stack.finishTopRunningActivityLocked(app, reason);
        }
    }
}

9.4.2 AS.finishTopRunningActivityLocked

final void finishTopRunningActivityLocked(ProcessRecord app, String reason) {
    // Find the topmost activity that is not finishing
    ActivityRecord r = topRunningActivityLocked(null);
    if (r != null && r.app == app) {
        int taskNdx = mTaskHistory.indexOf(r.task);
        int activityNdx = r.task.mActivities.indexOf(r);
        // [see Section 9.4.3]
        finishActivityLocked(r, Activity.RESULT_CANCELED, null, reason, false);
        --activityNdx;
        if (activityNdx < 0) {
            do {
                --taskNdx;
                if (taskNdx < 0) {
                    break;
                }
                activityNdx = mTaskHistory.get(taskNdx).mActivities.size() - 1;
            } while (activityNdx < 0);
        }
        if (activityNdx >= 0) {
            r = mTaskHistory.get(taskNdx).mActivities.get(activityNdx);
            if (r.state == ActivityState.RESUMED
                    || r.state == ActivityState.PAUSING
                    || r.state == ActivityState.PAUSED) {
                if (!r.isHomeActivity() || mService.mHomeProcess != r.app) {
                    // [see Section 9.4.3]
                    finishActivityLocked(r, Activity.RESULT_CANCELED, null, reason, false);
                }
            }
        }
    }
}

9.4.3 AS.finishActivityLocked

final boolean finishActivityLocked(ActivityRecord r, int resultCode, Intent resultData,
        String reason, boolean oomAdj) {
    if (r.finishing) {
        return false; // Already finishing: return
    }
    // Mark the activity as finishing
    r.makeFinishingLocked();
    // Pause key event dispatching
    r.pauseKeyDispatchingLocked();

    mWindowManager.prepareAppTransition(endTask
            ? AppTransition.TRANSIT_TASK_CLOSE
            : AppTransition.TRANSIT_ACTIVITY_CLOSE, false);

    mWindowManager.setAppVisibility(r.appToken, false);
    // Calls back into the activity's onPause method
    startPausingLocked(false, false, false, false);
    ...
}

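This method eventually calls back into the activity's pause method.

At this point we return to crashApplication in Section 5: once makeAppCrashingLocked has finished, the SHOW_ERROR_MSG message is sent to show the crash dialog. Let's look at that step next.

10. UiHandler

The message is sent through mUiHandler with msg.what = SHOW_ERROR_MSG. Next we look at how UiHandler processes it in handleMessage.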

[-> ActivityManagerService.java]

final class UiHandler extends Handler {
    public void handleMessage(Message msg) {
        switch (msg.what) {
        case SHOW_ERROR_MSG: {
            HashMap<String, Object> data = (HashMap<String, Object>) msg.obj;
            synchronized (ActivityManagerService.this) {
                ProcessRecord proc = (ProcessRecord)data.get("app");
                AppErrorResult res = (AppErrorResult) data.get("result");
                boolean isBackground = (UserHandle.getAppId(proc.uid)
                        >= Process.FIRST_APPLICATION_UID
                        && proc.pid != MY_PID);
                ...

                if (mShowDialogs && !mSleeping && !mShuttingDown) {
                    // Create the crash dialog and wait for the user's choice (5-minute timeout)
                    Dialog d = new AppErrorDialog(mContext,
                            ActivityManagerService.this, res, proc);
                    d.show();
                    proc.crashDialog = d;
                } else {
                    // When no dialog can be shown (e.g. the device is asleep), default to quit
                    if (res != null) {
                        res.set(0);
                    }
                }
            }
        } break;
        ...
        }
    }
}

When a crash occurs, the system by default shows a crash dialog and blocks until the user chooses "Quit" or "Quit and report". If the user makes no choice, "Quit" is selected by default after a 5-minute timeout; "Quit" is also selected by default when the phone is asleep. This is still not the end: the finally block of uncaughtException in Section 2 also kills the process.
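The "wait for a choice, fall back to a default on timeout" behavior can be illustrated in plain Java. This sketch deliberately swaps in a different technique, a CompletableFuture with a timed get(), rather than reproducing the framework's dialog and handler mechanism; the class DialogChoice is hypothetical, and the QUIT value of 0 is taken from the res.set(0) call in the listing above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: wait for a user choice, defaulting to "quit" on timeout.
final class DialogChoice {
    static final int QUIT = 0; // same value the listing passes to res.set(0)

    static int await(CompletableFuture<Integer> choice, long timeout, TimeUnit unit) {
        try {
            return choice.get(timeout, unit);
        } catch (TimeoutException e) {
            return QUIT; // nobody answered in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return QUIT;
        } catch (ExecutionException e) {
            return QUIT; // the choice could not be produced
        }
    }
}
```

A caller would complete the future from the UI side when a button is pressed and call await with a 5-minute timeout on the waiting side.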

11. killProcess

Process.killProcess(Process.myPid()); System.exit(10);

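The finally block guarantees that this code runs and that the crashed process is killed completely. For the details of killing a process, see my other article, "Understanding How Process Killing Is Implemented". Even after the crashed process has been killed, the flow is not entirely over: the Binder death notification still has to be processed.

12. Summary

When a process throws an uncaught exception, the system handles the exception and enters the crash handling flow.

The core of this work (highlighted in red in the original diagram) is AMS.handleAppCrashLocked, whose main behavior is as follows.

When the same process crashes twice within 1 minute:

- for a non-persistent process:
    - ASS.handleAppCrashLocked finishes all of the app's activities directly;
    - AMS.removeProcessLocked kills the process and all processes in the same process group;
    - ASS.resumeTopActivitiesLocked resumes the first non-finishing activity at the top of the stack;
- for a persistent process, the process is not removed: ASS.handleAppCrashLocked still finishes the app's activities (per the code in Section 9), and then ASS.resumeTopActivitiesLocked resumes the first non-finishing activity at the top of the stack.

Otherwise, when the process has not been crashing frequently:

- ASS.finishTopRunningActivityLocked finishes the running activity at the top of the stack.

The main call chain inside AMS.handleAppCrashLocked is as follows: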

AMS.handleAppCrashLocked
    ASS.handleAppCrashLocked
        AS.handleAppCrashLocked
            AS.finishCurrentActivityLocked

    AMS.removeProcessLocked
        ProcessRecord.kill
        AMS.handleAppDiedLocked
            AMS.cleanUpApplicationRecordLocked
            ASS.handleAppDiedLocked
                AS.handleAppDiedLocked
                    AS.removeHistoryRecordsForAppLocked

    ASS.resumeTopActivitiesLocked
        AS.resumeTopActivityLocked
            AS.resumeTopActivityInnerLocked

    ASS.finishTopRunningActivityLocked
        AS.finishTopRunningActivityLocked
            AS.finishActivityLocked

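III. Binder death notification

Once the process has been killed, recall Binder's death-notification mechanism: during app process creation, the attachApplicationLocked method registers a death notification.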

[-> ActivityManagerService.java]

private final boolean attachApplicationLocked(IApplicationThread thread, int pid) {
    try {
        // Register the binder death notification
        AppDeathRecipient adr = new AppDeathRecipient(
                app, pid, thread);
        thread.asBinder().linkToDeath(adr, 0);
        app.deathRecipient = adr;
    } catch (RemoteException e) {
        app.resetPackageList(mProcessStats);
        startProcessLocked(app, "link fail", processName);
        return false;
    }
    ...
}

When the binder server side dies, the binder DeathRecipient notifies AMS so that it can carry out the corresponding cleanup. As described earlier, the crashed process is killed, and once it has been killed the binderDied() method is called back.
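The link-then-notify pattern behind linkToDeath and binderDied can be modeled in a few lines of plain Java. This is a toy model only: FakeBinder and its die() method are hypothetical, and the real mechanism involves the binder driver rather than an in-process list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-process model of the death-recipient pattern.
final class FakeBinder {
    interface DeathRecipient {
        void binderDied();
    }

    private final List<DeathRecipient> recipients = new ArrayList<>();
    private boolean alive = true;

    // Registers a recipient; fails if the "process" is already dead.
    void linkToDeath(DeathRecipient recipient) {
        if (!alive) {
            throw new IllegalStateException("binder is already dead");
        }
        recipients.add(recipient);
    }

    // Unregisters a recipient; returns whether it was registered.
    boolean unlinkToDeath(DeathRecipient recipient) {
        return recipients.remove(recipient);
    }

    // Simulates the hosting process being killed: notify each recipient once.
    void die() {
        if (!alive) {
            return;
        }
        alive = false;
        List<DeathRecipient> toNotify = new ArrayList<>(recipients);
        recipients.clear();
        for (DeathRecipient recipient : toNotify) {
            recipient.binderDied();
        }
    }
}
```

In this model, a recipient that has been unlinked before die() is called receives no callback.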

1. binderDied

[-> ActivityManagerService.java]

private final class AppDeathRecipient implements IBinder.DeathRecipient {
    public void binderDied() {
        synchronized(ActivityManagerService.this) {
            appDiedLocked(mApp, mPid, mAppThread, true); // [see Section 2]
        }
    }
}

2. appDiedLocked

final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,
        boolean fromBinderDied) {
    ...
    if (!app.killed) {
        if (!fromBinderDied) {
            Process.killProcessQuiet(pid);
        }
        killProcessGroup(app.info.uid, pid);
        app.killed = true;
    }

    // Clean up already done if the process has been re-started.
    if (app.pid == pid && app.thread != null &&
            app.thread.asBinder() == thread.asBinder()) {
        boolean doLowMem = app.instrumentationClass == null;
        boolean doOomAdj = doLowMem;
        if (!app.killedByAm) {
            mAllowLowerMemLevel = true;
        } else {
            mAllowLowerMemLevel = false;
            doLowMem = false;
        }
        // [see Section 3]
        handleAppDiedLocked(app, false, true);

        if (doOomAdj) {
            updateOomAdjLocked();
        }
        if (doLowMem) {
            doLowMemReportIfNeededLocked(app);
        }
    }
    ...
}

3 handleAppDiedLocked

[-> ActivityManagerService.java]

private final void handleAppDiedLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart) {
    int pid = app.pid;
    // Clean up the app's service, BroadcastReceiver, and ContentProvider information [see Section 4]
    boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1);
    if (!kept && !restarting) {
        removeLruProcessLocked(app);
        if (pid > 0) {
            ProcessList.remove(pid);
        }
    }

    // Clean up activity information
    boolean hasVisibleActivities = mStackSupervisor.handleAppDiedLocked(app);
    app.activities.clear();
    ...
    // Resume the first non-finishing activity at the top of the stack
    if (!restarting && hasVisibleActivities && !mStackSupervisor.resumeTopActivitiesLocked()) {
        mStackSupervisor.ensureActivitiesVisibleLocked(null, 0);
    }
}

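4. cleanUpApplicationRecordLocked

This method cleans up the service, BroadcastReceiver, ContentProvider, and process records of the application. For ease of explanation, it is split into four parts below.

4.1 Cleaning up services

The arguments here are restarting = false, allowRestart = true, index = -1.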

private final boolean cleanUpApplicationRecordLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart, int index) {
    ...
    mProcessesToGc.remove(app);
    mPendingPssProcesses.remove(app);

    // Dismiss the crash/ANR/wait dialogs if they exist
    if (app.crashDialog != null && !app.forceCrashReport) {
        app.crashDialog.dismiss();
        app.crashDialog = null;
    }
    if (app.anrDialog != null) {
        app.anrDialog.dismiss();
        app.anrDialog = null;
    }
    if (app.waitDialog != null) {
        app.waitDialog.dismiss();
        app.waitDialog = null;
    }

    app.crashing = false;
    app.notResponding = false;

    app.resetPackageList(mProcessStats);
    app.unlinkDeathRecipient(); // Unlink the app's death recipient
    app.makeInactive(mProcessStats);
    app.waitingToKill = null;
    app.forcingToForeground = null;
    // Take the app out of the foreground process state
    updateProcessForegroundLocked(app, false, false);
    app.foregroundActivities = false;
    app.hasShownUi = false;
    app.treatLikeActivity = false;
    app.hasAboveClient = false;
    app.hasClientActivities = false;

    // Clean up the service records; this is fairly involved and is covered separately
    mServices.killServicesLocked(app, allowRestart);
    boolean restart = false;
    ...
}

mProcessesToGc: the list of processes that should be garbage-collected as soon as possible.
mPendingPssProcesses: the list of processes whose memory (PSS) information still needs to be collected.

4.2 Cleaning up ContentProviders

private final boolean cleanUpApplicationRecordLocked(...) {
    ...
    for (int i = app.pubProviders.size() - 1; i >= 0; i--) {
        // A ContentProvider published by this process
        ContentProviderRecord cpr = app.pubProviders.valueAt(i);
        final boolean always = app.bad || !allowRestart;
        // If the ContentProvider's host process is killed, its client processes are killed as well
        boolean inLaunching = removeDyingProviderLocked(app, cpr, always);
        if ((inLaunching || always) && cpr.hasConnectionOrHandle()) {
            restart = true; // the process needs to be restarted
        }
        cpr.provider = null;
        cpr.proc = null;
    }
    app.pubProviders.clear();

    // Handle ContentProviders that are still launching and have clients waiting
    if (cleanupAppInLaunchingProvidersLocked(app, false)) {
        restart = true;
    }

    // Unregister the ContentProvider connections held by this process
    if (!app.conProviders.isEmpty()) {
        for (int i = app.conProviders.size() - 1; i >= 0; i--) {
            ContentProviderConnection conn = app.conProviders.get(i);
            conn.provider.connections.remove(conn);
            stopAssociationLocked(app.uid, app.processName,
                    conn.provider.uid, conn.provider.name);
        }
        app.conProviders.clear();
    }
    ...
}

4.3 Cleaning up BroadcastReceivers

private final boolean cleanUpApplicationRecordLocked(...) {
    ...
    skipCurrentReceiverLocked(app);

    // Unregister the registered broadcast receivers
    for (int i = app.receivers.size() - 1; i >= 0; i--) {
        removeReceiverLocked(app.receivers.valueAt(i));
    }
    app.receivers.clear();
    ...
}

4.4 Cleaning up the process

private final boolean cleanUpApplicationRecordLocked(...) {
    ...
    // Handle the case where the app was in the middle of a backup
    if (mBackupTarget != null && app.pid == mBackupTarget.app.pid) {
        ...
        IBackupManager bm = IBackupManager.Stub.asInterface(
                ServiceManager.getService(Context.BACKUP_SERVICE));
        bm.agentDisconnected(app.info.packageName);
    }

    for (int i = mPendingProcessChanges.size() - 1; i >= 0; i--) {
        ProcessChangeItem item = mPendingProcessChanges.get(i);
        if (item.pid == app.pid) {
            mPendingProcessChanges.remove(i);
            mAvailProcessChanges.add(item);
        }
    }
    mUiHandler.obtainMessage(DISPATCH_PROCESS_DIED, app.pid, app.info.uid, null)
            .sendToTarget();

    if (!app.persistent || app.isolated) {
        removeProcessNameLocked(app.processName, app.uid);
        if (mHeavyWeightProcess == app) {
            mHandler.sendMessage(mHandler.obtainMessage(CANCEL_HEAVY_NOTIFICATION_MSG,
                    mHeavyWeightProcess.userId, 0));
            mHeavyWeightProcess = null;
        }
    } else if (!app.removed) {
        // A persistent app needs to be restarted
        if (mPersistentStartingProcesses.indexOf(app) < 0) {
            mPersistentStartingProcesses.add(app);
            restart = true;
        }
    }

    // mProcessesOnHold: processes that something tried to start before the system was ready.
    // They are not started at that point; they are recorded and started once boot completes.
    mProcessesOnHold.remove(app);

    if (app == mHomeProcess) {
        mHomeProcess = null;
    }
    if (app == mPreviousProcess) {
        mPreviousProcess = null;
    }

    if (restart && !app.isolated) {
        // Components still need to run in this process, so restart it
        if (index < 0) {
            ProcessList.remove(app.pid);
        }
        addProcessNameLocked(app);
        startProcessLocked(app, "restart", app.processName);
        return true;
    } else if (app.pid > 0 && app.pid != MY_PID) {
        // Remove the bookkeeping for this process
        boolean removed;
        synchronized (mPidsSelfLocked) {
            mPidsSelfLocked.remove(app.pid);
            mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
        }
        app.setPid(0);
    }
    return false;
}

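The process needs to be restarted in the following cases:

mLaunchingProviders: the ContentProviders that have clients waiting on them while the hosting application is still starting. Once a ContentProvider is published, it is removed from this list. If the dead process hosts such a ContentProvider, the process must be restarted.

mPersistentStartingProcesses: the persistent applications that are currently being started. As the code above shows, a persistent, non-isolated process that has not been removed is added to this list when it dies and is then restarted.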
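The restart decision spread across parts 4.2 and 4.4 can be condensed into one small function. The sketch below is a simplification under stated assumptions: the RestartDecision class and its parameter names are hypothetical, the provider-related checks are collapsed into a single flag, and the "already in mPersistentStartingProcesses" check is ignored.

```java
/** Condensed sketch of the restart decision in cleanUpApplicationRecordLocked. */
final class RestartDecision {
    /**
     * @param providerNeedsRestart a published or launching ContentProvider still has clients
     * @param persistent the process is a persistent app
     * @param isolated the process is an isolated process
     * @param removed the process record has been marked as removed
     * @return true if the dead process should be started again
     */
    static boolean needsRestart(boolean providerNeedsRestart, boolean persistent,
                                boolean isolated, boolean removed) {
        boolean restart = providerNeedsRestart;
        // Persistent, non-isolated apps that were not removed are always restarted.
        if (persistent && !isolated && !removed) {
            restart = true;
        }
        // Isolated processes are never restarted here.
        return restart && !isolated;
    }

    public static void main(String[] args) {
        System.out.println(needsRestart(true, false, false, false));
        System.out.println(needsRestart(false, true, false, false));
        System.out.println(needsRestart(false, false, false, false));
    }
}
```

Reading the two conditions side by side makes it easier to see that an isolated process is never restarted on this path, whatever its providers look like.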

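5. Summary

After the crashed process executes the kill operation, the process dies. To follow what happens next, you need to understand how binder death notification works. The crashed process hosts a binder server, ApplicationThread. When the application process is created, attachApplicationLocked() is called to attach it to the system_server process, and system_server holds the corresponding binder client, ApplicationThreadProxy. When the process hosting the binder server ApplicationThread (that is, the crashed process) dies, the binder client receives the death notification and enters the binderDied flow. Binder internals are not covered in detail here; the blog has a dedicated series on binder.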

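IV. Conclusion

Working from the source code, this article has walked through how the system handles an application crash:

- The crashing process was given a default uncaught-exception handler when it was created. The handler processes the uncaught exception and logs the basic crash information.
- The handler calls AMP.handleApplicationCrash in the crashing process, and the call is passed to the system_server process through binder IPC.
- In system_server, the binder server side executes AMS.handleApplicationCrash.
- The ProcessRecord of the target process is looked up in mProcessNames.
- The crash information is written to the /data/system/dropbox directory.
- makeAppCrashingLocked runs:
  - it creates the error receiver of the crashed application for the current user and ignores the current application's broadcasts;
  - it stops the WMS screen-freezing messages for all activities in the process and performs some related screen operations.
- handleAppCrashLocked runs:
  - if a non-persistent process crashes twice in a row within 1 minute, all activities of the application are finished, the process and every process in the same process group are killed, and the topmost non-finishing activity is then resumed;
  - if a persistent process crashes twice in a row within 1 minute, only the topmost non-finishing activity is resumed;
  - if the process has not crashed twice in a row within 1 minute, the topmost running activity is finished.
- The SHOW_ERROR_MSG message is sent through mUiHandler to show the crash dialog.
- At this point the work in system_server is done. Back in the crashed process, the process kills itself.
- Once the crashed process has been killed, binder death notification tells system_server to run appDiedLocked().
- Finally, the records of the application's activity/service/ContentProvider/receiver components are cleaned up.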
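The first step of the flow, installing a default handler before any application code runs, relies on the standard Thread.setDefaultUncaughtExceptionHandler API and can be tried on a plain JVM. The sketch below is not the framework's UncaughtHandler: the UncaughtHandlerDemo class, the worker thread, and the "boom" message are made up for illustration, and the handler only records the crash instead of reporting it to AMS and killing the process.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Installs a default uncaught-exception handler and lets a worker thread crash. */
final class UncaughtHandlerDemo {
    /** Returns what the handler recorded for the crashing thread. */
    static String crashAndCapture() {
        Thread.UncaughtExceptionHandler previous =
                Thread.getDefaultUncaughtExceptionHandler();
        AtomicReference<String> report = new AtomicReference<>();
        // Analogous to the call made in RuntimeInit.commonInit().
        Thread.setDefaultUncaughtExceptionHandler(
                (thread, error) -> report.set(thread.getName() + ": " + error.getMessage()));
        try {
            Thread worker = new Thread(() -> {
                throw new IllegalStateException("boom");
            }, "worker");
            worker.start();
            worker.join(); // the handler runs on the dying thread before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // Restore whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return report.get();
    }

    public static void main(String[] args) {
        System.out.println(crashAndCapture());
    }
}
```

An application that wants to react to its own crashes can install a handler in the same way; whether it should then chain to the previously installed handler is a design decision, since that handler is what reports the crash to the system.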

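That is essentially the whole sequence the system runs after an application crashes. Finally, a few words on the case where the same app crashes repeatedly:

- A non-persistent process that crashes twice in a row within 60 s is marked as a bad process. If a third start of that process is then attempted from the background (determined through Intent.getFlags), process creation is refused.
- Once a non-persistent process has crashed twice, the process is killed again, and even services that are allowed to restart automatically are refused a restart after the kill.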
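The repeated-crash rules can be modelled with a small tracker keyed by process name. This is a sketch under stated assumptions, not the AMS implementation: the CrashTracker class, its Action enum, and the method names are hypothetical, the 60 s window comes from the text above, and timestamps are passed in explicitly so that the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of the "two crashes within 60 s" rule described above. */
final class CrashTracker {
    static final long MIN_CRASH_INTERVAL_MS = 60_000; // the 60 s window from the text

    enum Action { FINISH_TOP_ACTIVITY, RESUME_TOP_ACTIVITY, MARK_BAD_AND_KILL }

    private final Map<String, Long> lastCrashTime = new HashMap<>();
    private final Set<String> badProcesses = new HashSet<>();

    /** Records a crash at nowMs and returns what the system would do next. */
    Action onCrash(String processName, boolean persistent, long nowMs) {
        Long last = lastCrashTime.get(processName);
        boolean repeated = last != null && nowMs < last + MIN_CRASH_INTERVAL_MS;
        Action action;
        if (repeated) {
            if (!persistent) {
                // Non-persistent and crashing again too soon: mark it as bad.
                badProcesses.add(processName);
                lastCrashTime.remove(processName);
                return Action.MARK_BAD_AND_KILL;
            }
            action = Action.RESUME_TOP_ACTIVITY;
        } else {
            action = Action.FINISH_TOP_ACTIVITY;
        }
        lastCrashTime.put(processName, nowMs);
        return action;
    }

    /** True if the process has been marked as bad. */
    boolean isBad(String processName) {
        return badProcesses.contains(processName);
    }

    public static void main(String[] args) {
        CrashTracker tracker = new CrashTracker();
        System.out.println(tracker.onCrash("com.example.app", false, 0L));
        System.out.println(tracker.onCrash("com.example.app", false, 30_000L));
        System.out.println(tracker.isBad("com.example.app"));
    }
}
```

Passing the clock in as a parameter keeps the model easy to test; a persistent process never ends up in the bad set in this sketch, which mirrors the distinction drawn in the summary.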