Android Crash and ANR Monitoring: Implementation Insights and Best Practices
This article shares practical experiences from developing the Shenghai crash monitoring SDK for Android, covering Java exception capture, stack trace collection, processing, real‑time reporting, crash metrics, and ANR detection using FileObserver and WatchDog mechanisms.
The article introduces the Shenghai mobile quality monitoring platform, detailing its crash and ANR monitoring capabilities for Android apps.
It begins by explaining how to capture Java-level exceptions using Thread.setDefaultUncaughtExceptionHandler, preserving the previous handler to allow multiple SDKs to coexist.
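As a minimal sketch of the chaining described above, a handler can remember whichever handler was installed before it and delegate to it after recording the crash. The class name ChainedCrashHandler and its lastCrash field are hypothetical and are not part of the Shenghai SDK.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical handler; the names here are illustrative, not the Shenghai SDK's API. */
final class ChainedCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler previous;
    // Last captured throwable, kept only so the sketch is observable.
    final AtomicReference<Throwable> lastCrash = new AtomicReference<>();

    private ChainedCrashHandler(Thread.UncaughtExceptionHandler previous) {
        this.previous = previous;
    }

    /** Installs the handler and remembers whichever handler was registered before it. */
    static ChainedCrashHandler install() {
        ChainedCrashHandler handler =
                new ChainedCrashHandler(Thread.getDefaultUncaughtExceptionHandler());
        Thread.setDefaultUncaughtExceptionHandler(handler);
        return handler;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            lastCrash.set(e); // a real SDK would persist or report the crash here
        } finally {
            // Always delegate so other SDKs (and the system default) still see the crash.
            if (previous != null) {
                previous.uncaughtException(t, e);
            }
        }
    }
}
```

Calling ChainedCrashHandler.install() once at startup would be the usual entry point. The delegation sits in a finally block so that a failure while recording does not hide the crash from the handlers installed earlier.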
// Thread.java
public static void setDefaultUncaughtExceptionHandler(UncaughtExceptionHandler eh) {
    ...
}

public interface UncaughtExceptionHandler {
    void uncaughtException(Thread t, Throwable e);
}
After obtaining the Throwable, the article shows how to extract the exception type, message, stack trace, and cause.
Throwable ex
1) Exception type: ex.getClass().getName();
2) Exception message: ex.getLocalizedMessage();
3) Stack trace: ex.getStackTrace();
4) Cause: ex.getCause();
It then describes ways to get stack traces for the main thread, the current thread, and all threads.
Looper.getMainLooper().getThread().getStackTrace();
Thread.currentThread().getStackTrace();
Thread.getAllStackTraces();
The StackTraceElement class is outlined, noting that each element corresponds to one method call.
public final class StackTraceElement implements java.io.Serializable {
    private String declaringClass;
    private String methodName;
    private String fileName;
    private int lineNumber;
}
The piece explains how crash logs are formatted (e.g., Fabric style) by concatenating the class name, message, and stack trace, and repeating this for each cause in the exception chain.
Fatal Exception:xxxThrowable:xxxMessage
at xxxStackTraceElement11
...
Caused by xxxCauseThrowable:xxxCauseMessage
...
It mentions Fabric’s practical limits: stack trace length ≤1024 bytes, duplicate line removal (max 10 consecutive repeats), and exception chain depth ≤8.
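The log format and two of the limits above can be sketched as follows. This is an illustration under stated assumptions, not Fabric's or the Shenghai SDK's actual code: the class name CrashLogFormatter is hypothetical, the top-level exception is counted as part of the chain depth, "max 10 consecutive repeats" is read as keeping at most 10 identical frames in a row, and the 1024-byte length cap is left out.

```java
/** Hypothetical formatter; names and limit interpretations are assumptions. */
final class CrashLogFormatter {
    static final int MAX_CAUSE_DEPTH = 8;          // chain depth cap quoted in the text
    static final int MAX_CONSECUTIVE_REPEATS = 10; // repeat cap quoted in the text

    /** Renders the throwable and its causes in a "Fatal Exception / Caused by" layout. */
    static String format(Throwable ex) {
        StringBuilder out = new StringBuilder();
        Throwable current = ex;
        // The depth cap also guards against cyclic cause chains.
        for (int depth = 0; current != null && depth < MAX_CAUSE_DEPTH; depth++) {
            out.append(depth == 0 ? "Fatal Exception: " : "Caused by: ")
               .append(current.getClass().getName())
               .append(": ")
               .append(current.getLocalizedMessage())
               .append('\n');
            appendFrames(out, current.getStackTrace());
            current = current.getCause();
        }
        return out.toString();
    }

    /** Appends frames, dropping any identical frame beyond the allowed run length. */
    private static void appendFrames(StringBuilder out, StackTraceElement[] frames) {
        StackTraceElement last = null;
        int run = 0;
        for (StackTraceElement frame : frames) {
            run = frame.equals(last) ? run + 1 : 1;
            last = frame;
            if (run <= MAX_CONSECUTIVE_REPEATS) {
                out.append("    at ").append(frame).append('\n');
            }
        }
    }
}
```

Capping repeated frames mainly helps with deep recursion such as a StackOverflowError, where thousands of identical lines would otherwise crowd out the frames that identify the entry point.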
For real‑time upload, the SDK submits the request to an ExecutorService and waits on Future.get, so the network call runs off the main thread and avoids NetworkOnMainThreadException; retry logic is planned.
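A rough sketch of that pattern follows. The CrashUploader class, the uploadBlocking method, and the per-attempt timeout are hypothetical, and the retry loop is an assumption about how the planned retries might look. The upload itself is passed in as a Callable so the sketch does not depend on any particular HTTP client.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical uploader; names, timeout, and retry policy are assumptions. */
final class CrashUploader {
    // Daemon worker so a pending upload never keeps the process alive on its own.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "crash-upload");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the upload on the worker thread and blocks the caller for up to
     * timeoutMs per attempt. Returns true as soon as one attempt reports success.
     */
    boolean uploadBlocking(Callable<Boolean> upload, int maxAttempts, long timeoutMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Future<Boolean> future = executor.submit(upload);
            try {
                if (Boolean.TRUE.equals(future.get(timeoutMs, TimeUnit.MILLISECONDS))) {
                    return true;
                }
            } catch (ExecutionException | TimeoutException e) {
                future.cancel(true); // give up on this attempt and try the next one
            } catch (InterruptedException e) {
                future.cancel(true);
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

Bounding each Future.get with a timeout keeps a crashing thread from waiting indefinitely on a slow network; a report that still fails could be written to disk and sent on the next launch.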
Crash metrics discussed include device/user crash rate and session crash rate, with a definition of a session as a foreground interval ≥30 seconds.
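The article does not spell out the formulas, so the sketch below assumes the common definitions: crashed users divided by active users, and crashed sessions divided by total sessions. It also reads the session definition literally, counting each foreground interval of at least 30 seconds as one session. The CrashMetrics class and its method names are hypothetical.

```java
/** Hypothetical metric helpers; the formulas are assumed, not taken from the article. */
final class CrashMetrics {
    static final long MIN_SESSION_MS = 30_000L; // 30-second threshold quoted in the text

    /** Fraction of active users (or devices) that hit at least one crash. */
    static double userCrashRate(long crashedUsers, long activeUsers) {
        return activeUsers == 0 ? 0.0 : (double) crashedUsers / activeUsers;
    }

    /** Fraction of sessions that ended in a crash. */
    static double sessionCrashRate(long crashedSessions, long totalSessions) {
        return totalSessions == 0 ? 0.0 : (double) crashedSessions / totalSessions;
    }

    /** Counts foreground intervals {startMs, endMs} that last at least 30 seconds. */
    static int countSessions(long[][] foregroundIntervalsMs) {
        int sessions = 0;
        for (long[] interval : foregroundIntervalsMs) {
            if (interval[1] - interval[0] >= MIN_SESSION_MS) {
                sessions++;
            }
        }
        return sessions;
    }
}
```

For instance, 5 crashed users out of 1,000 active users gives a user crash rate of 0.5%.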
The article then turns to ANR detection, describing a combined FileObserver (pre‑Android 5.0) and WatchDog approach, and provides the core method getProcessInANRState that polls ActivityManager for processes in state NOT_RESPONDING (value 2).
import android.app.ActivityManager;
import android.content.Context;
import java.util.List;

// Polls ActivityManager every 500 ms for a process flagged NOT_RESPONDING.
// The loop body runs at most totalCounts + 2 times; returns null if no ANR is seen.
// ThreadUtils.sleep is not shown in the article; presumably it wraps Thread.sleep.
public static ActivityManager.ProcessErrorStateInfo getProcessInANRState(Context context, int totalCounts) {
    if (context == null) return null;
    ActivityManager activityManager = (ActivityManager) context.getSystemService(Context.ACTIVITY_SERVICE);
    if (activityManager == null) return null;
    int i = 0;
    do {
        // getProcessesInErrorState() returns null when no process is in an error state.
        List<ActivityManager.ProcessErrorStateInfo> errorStates = activityManager.getProcessesInErrorState();
        if (errorStates != null) {
            for (ActivityManager.ProcessErrorStateInfo info : errorStates) {
                // NOT_RESPONDING == 2; CRASHED (1) entries are skipped.
                if (info.condition == ActivityManager.ProcessErrorStateInfo.NOT_RESPONDING) {
                    return info;
                }
            }
        }
        ThreadUtils.sleep(500L);
    } while (i++ <= totalCounts);
    return null;
}
Diagnostic information for an ANR can be gathered from traces.txt, ProcessErrorStateInfo, or the current thread stacks, each with trade‑offs in accuracy and latency.
The Shenghai SDK combines ProcessErrorStateInfo and live stack traces to achieve real‑time ANR upload.
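The WatchDog half of this approach can be sketched in plain Java as below. This is an illustration rather than the SDK's implementation: StallWatchdog and its methods are hypothetical names, and the monitored thread is abstracted as an Executor. On Android, that Executor would presumably wrap a Handler bound to Looper.getMainLooper(), and the stack dump would target Looper.getMainLooper().getThread().

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical watchdog; the Executor stands in for posting to the main thread. */
final class StallWatchdog {
    private final Executor monitored;             // posts work to the monitored thread
    private final AtomicLong tick = new AtomicLong();

    StallWatchdog(Executor monitored) {
        this.monitored = monitored;
    }

    /**
     * Posts a heartbeat task, waits timeoutMs on the calling thread, and then
     * checks whether the heartbeat ran. Returns true if it did not run in time.
     */
    boolean isStalled(long timeoutMs) {
        long before = tick.get();
        monitored.execute(tick::incrementAndGet);
        try {
            Thread.sleep(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return tick.get() == before;
    }

    /** Renders a thread's current stack, e.g. for attaching to an ANR report. */
    static String dump(Thread thread) {
        StringBuilder sb = new StringBuilder(thread.getName()).append('\n');
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }
}
```

A stalled heartbeat only shows that the thread was busy for the timeout window, so it seems reasonable to confirm with a ProcessErrorStateInfo poll before reporting an ANR, and to capture the stack with dump as soon as the stall is seen so it reflects the blocking call.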
Finally, the article briefly notes obfuscation handling (mapping file retention, retrace caching) and ends with an invitation to integrate the platform.
Beike Product & Technology