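What is OOM?
OOM stands for Out-Of-Memory. It is an "unconventional" crash caused by the Jetsam mechanism in iOS. Unlike an ordinary crash, an OOM event cannot be caught by crash-monitoring schemes such as signal handling.
Why does OOM happen?
My current guess is that two situations lead to OOM:
Overall system memory usage is high, so the system kills lower-priority Apps based on priority.
The App currently in use reaches the "high water mark", that is, the system's memory limit for a single App, and the system kills it.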
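Verification approach 1:
XNU (opensource.apple.com/source/xnu/… and opensource.apple.com/source/xnu/…) provides a number of functions and macros. With root privileges we can use them to read the OOM memory threshold of every App in its current state, and even to modify a process's memory threshold by PID, which raises its OOM threshold.
The most useful pieces for our purposes are: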
    // Per-process record: pid, priority, state, memory threshold, etc.
    typedef struct memorystatus_priority_entry {
        pid_t    pid;
        int32_t  priority;
        uint64_t user_data;
        int32_t  limit;
        uint32_t state;
    } memorystatus_priority_entry_t;

    // These macros can be used to query the memory threshold and related
    // information, and also to modify the memory threshold.
    /* Commands */
    #define MEMORYSTATUS_CMD_GET_PRIORITY_LIST            1
    #define MEMORYSTATUS_CMD_SET_PRIORITY_PROPERTIES      2
    #define MEMORYSTATUS_CMD_GET_JETSAM_SNAPSHOT          3
    #define MEMORYSTATUS_CMD_GET_PRESSURE_STATUS          4
    #define MEMORYSTATUS_CMD_SET_JETSAM_HIGH_WATER_MARK   5   /* Set active memory limit = inactive memory limit, both non-fatal */
    #define MEMORYSTATUS_CMD_SET_JETSAM_TASK_LIMIT        6   /* Set active memory limit = inactive memory limit, both fatal */
    #define MEMORYSTATUS_CMD_SET_MEMLIMIT_PROPERTIES      7   /* Set memory limits plus attributes independently */
    #define MEMORYSTATUS_CMD_GET_MEMLIMIT_PROPERTIES      8   /* Get memory limits plus attributes */
    #define MEMORYSTATUS_CMD_PRIVILEGED_LISTENER_ENABLE   9   /* Set the task's status as a privileged listener w.r.t memory notifications */
    #define MEMORYSTATUS_CMD_PRIVILEGED_LISTENER_DISABLE  10  /* Reset the task's status as a privileged listener w.r.t memory notifications */

    /* Commands that act on a group of processes */
    #define MEMORYSTATUS_CMD_GRP_SET_PROPERTIES           100
We can write a program like the following:
    #include <stdlib.h>
    #include <string.h>
    #include <stdio.h>
    #include "kern_memorystatus.h"

    #define NUM_ENTRIES 1024

    // Convert kMemorystatus state flags to a textual representation.
    // Returns a pointer to a static buffer, so it is not reentrant.
    char *state_to_text(int State)
    {
        static char returned[80];
        sprintf(returned, "0x%02x ", State);
        if (State & kMemorystatusSuspended)        strcat(returned, "Suspended,");
        if (State & kMemorystatusFrozen)           strcat(returned, "Frozen,");
        if (State & kMemorystatusWasThawed)        strcat(returned, "WasThawed,");
        if (State & kMemorystatusTracked)          strcat(returned, "Tracked,");
        if (State & kMemorystatusSupportsIdleExit) strcat(returned, "IdleExit,");
        if (State & kMemorystatusDirty)            strcat(returned, "Dirty,");
        // Drop the trailing comma, if any
        if (returned[strlen(returned) - 1] == ',')
            returned[strlen(returned) - 1] = '\0';
        return returned;
    }

    int main(int argc, char **argv)
    {
        struct memorystatus_priority_entry memstatus[NUM_ENTRIES];
        size_t count = sizeof(struct memorystatus_priority_entry) * NUM_ENTRIES;

        // Call memorystatus_control; on success it returns the number of bytes written
        int rc = memorystatus_control(MEMORYSTATUS_CMD_GET_PRIORITY_LIST, // 1 - only supported command on OS X
                                      0,          // pid
                                      0,          // flags
                                      memstatus,  // buffer
                                      count);     // buffersize
        if (rc < 0) {
            perror("memorystatus_control");
            exit(EXIT_FAILURE);
        }

        // Walk the returned buffer one entry at a time
        int entry = 0;
        for (; rc > 0; rc -= sizeof(struct memorystatus_priority_entry)) {
            printf("PID: %5d\tPriority:%2d\tUser Data: %llx\tLimit:%2d\tState:%s\n",
                   memstatus[entry].pid,
                   memstatus[entry].priority,
                   memstatus[entry].user_data,
                   memstatus[entry].limit,
                   state_to_text(memstatus[entry].state));
            entry++;
        }
        return 0;
    }
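Then use the Command-line Tool template provided by MonkeyDev to install the program on a jailbroken device (the test environment at the time was an iPhone 5s on iOS 9.1), connect to the device over SSH, and run the program from the terminal. This dumps the following information: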
    PID:  9967  Priority: 3  User Data: 0      Limit: 6   State:0x38 Tracked,IdleExit,Dirty
    PID: 11151  Priority: 3  User Data: 0      Limit: 6   State:0x38 Tracked,IdleExit,Dirty
    PID: 11154  Priority: 3  User Data: 0      Limit:10   State:0x38 Tracked,IdleExit,Dirty
    PID: 11165  Priority: 3  User Data: 0      Limit: 6   State:0x38 Tracked,IdleExit,Dirty
    PID: 11499  Priority: 3  User Data: 0      Limit:18   State:0x28 Tracked,Dirty
    PID: 10039  Priority: 4  User Data: 2100   Limit:108  State:0x00
    PID:  9981  Priority: 7  User Data: 0      Limit:10   State:0x08 Tracked
    PID:  9977  Priority: 7  User Data: 0      Limit:20   State:0x08 Tracked
    PID:  9979  Priority: 7  User Data: 0      Limit:25   State:0x38 Tracked,IdleExit,Dirty
    PID: 10021  Priority: 7  User Data: 0      Limit: 6   State:0x08 Tracked
    PID: 11575  Priority:10  User Data: 10100  Limit:650  State:0x00
    PID:   103  Priority:11  User Data: 0      Limit:96   State:0x08 Tracked
    PID: 11442  Priority:11  User Data: 0      Limit:38   State:0x08 Tracked
    PID:    67  Priority:12  User Data: 0      Limit:24   State:0x28 Tracked,Dirty
    PID:    31  Priority:14  User Data: 0      Limit:650  State:0x08 Tracked
    PID:    45  Priority:14  User Data: 0      Limit: 9   State:0x08 Tracked
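In the output above, the process with Priority:10 is the App I was testing, 好好学习. The App was in the foreground and active at the time, so its priority is 10, and its OOM memory threshold is 650 (the limit field is reported in MB).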
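Picking one App's threshold out of the dumped list can be sketched as a simple lookup by PID. This is only an illustration: `priority_entry_t` is a local mirror of `memorystatus_priority_entry_t` so the snippet compiles without the private header, and `limit_for_pid` is a hypothetical helper name, not part of XNU.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Local mirror of memorystatus_priority_entry_t (assumption: same field
 * layout as the struct quoted earlier), so no private header is needed. */
typedef struct {
    pid_t    pid;
    int32_t  priority;
    uint64_t user_data;
    int32_t  limit;
    uint32_t state;
} priority_entry_t;

/* Hypothetical helper: return the limit recorded for `pid`,
 * or -1 if the pid does not appear in the list. */
int32_t limit_for_pid(const priority_entry_t *entries, size_t n, pid_t pid)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].pid == pid) {
            return entries[i].limit;
        }
    }
    return -1;
}
```

With the entries from the dump, looking up PID 11575 would return the 650 shown on the Priority:10 line.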
Verification approach 2:
When our App is killed by Jetsam, the phone keeps a system log. Under Settings > Privacy > Analytics on the phone you can find logs whose names start with JetsamEvent. These logs contain memory information about the App. Taking my iPhone 6s as an example, pageSize * rpages gives the threshold, and the log also states the reason as "reason" : "per-process-limit". (Not every JetsamEvent yields an accurate threshold; some deviate.)
    "pageSize" : 16384

    {
      "uuid" : "b8d6682c-5903-3007-b9c2-561d1e6ca9d5",
      "states" : [
        "frontmost",
        "resume"
      ],
      "killDelta" : 18859,
      "genCount" : 0,
      "age" : 1775369503,
      "purgeable" : 0,
      "fds" : 50,
      "coalition" : 691,
      "rpages" : 89600,
      "reason" : "per-process-limit",
      "pid" : 960,
      "cpuTime" : 1.6920809999999999,
      "name" : "MemoryLimitTest",
      "lifetimeMax" : 34182
    }
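Verification approach 3:
You can also find the OOM memory threshold through extensive testing. A list already exists on Stack Overflow that gives the OOM thresholds of some common devices. The values in that list deviate from the real thresholds, and I suspect two reasons. First, the moment at which it samples memory cannot coincide exactly with the moment of the OOM; it can only get as close as possible. Second, the way it measures memory differs from the way the Jetsam mechanism in XNU measures it. The correct way to measure memory is described below.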
    Results of testing with the utility Split wrote (link is in his answer):

    device: (crash amount/total amount/percentage of total)

    iPad1: 127MB/256MB/49%
    iPad2: 275MB/512MB/53%
    iPad3: 645MB/1024MB/62%
    iPad4: 585MB/1024MB/57% (iOS 8.1)
    iPad Mini 1st Generation: 297MB/512MB/58%
    iPad Mini retina: 696MB/1024MB/68% (iOS 7.1)
    iPad Air: 697MB/1024MB/68%
    iPad Air 2: 1383MB/2048MB/68% (iOS 10.2.1)
    iPad Pro 9.7": 1395MB/1971MB/71% (iOS 10.0.2 (14A456))
    iPad Pro 10.5”: 3057/4000/76% (iOS 11 beta4)
    iPad Pro 12.9” (2015): 3058/3999/76% (iOS 11.2.1)
    iPad Pro 12.9” (2017): 3057/3974/77% (iOS 11 beta4)
    iPod touch 4th gen: 130MB/256MB/51% (iOS 6.1.1)
    iPod touch 5th gen: 286MB/512MB/56% (iOS 7.0)
    iPhone4: 325MB/512MB/63%
    iPhone4s: 286MB/512MB/56%
    iPhone5: 645MB/1024MB/62%
    iPhone5s: 646MB/1024MB/63%
    iPhone6: 645MB/1024MB/62% (iOS 8.x)
    iPhone6+: 645MB/1024MB/62% (iOS 8.x)
    iPhone6s: 1396MB/2048MB/68% (iOS 9.2)
    iPhone6s+: 1392MB/2048MB/68% (iOS 10.2.1)
    iPhoneSE: 1395MB/2048MB/69% (iOS 9.3)
    iPhone7: 1395/2048MB/68% (iOS 10.2)
    iPhone7+: 2040MB/3072MB/66% (iOS 10.2.1)
    iPhone X: 1392/2785/50% (iOS 11.2.1)

    https://stackoverflow.com/questions/5887248/ios-app-maximum-memory-budget/15200855#15200855
How to measure an App's memory usage correctly
A common way to get an App's memory usage is to read resident_size, as follows:
    #import <mach/mach.h>

    - (int64_t)memoryUsage {
        int64_t memoryUsageInByte = 0;
        struct task_basic_info taskBasicInfo;
        // task_info() expects the count in natural_t units, not in bytes
        mach_msg_type_number_t size = TASK_BASIC_INFO_COUNT;
        kern_return_t kernelReturn = task_info(mach_task_self(),
                                               TASK_BASIC_INFO,
                                               (task_info_t)&taskBasicInfo,
                                               &size);
        if (kernelReturn == KERN_SUCCESS) {
            // Resident memory size of the task, in bytes
            memoryUsageInByte = (int64_t)taskBasicInfo.resident_size;
            NSLog(@"Memory in use (in bytes): %lld", memoryUsageInByte);
        } else {
            NSLog(@"Error with task_info(): %s", mach_error_string(kernelReturn));
        }
        return memoryUsageInByte;
    }
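The correct way, however, is to use phys_footprint, because that is the metric Apple itself uses, and only a measurement consistent with Apple's is meaningful here. You can check the source to verify this: opensource.apple.com/source/xnu/…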
    #import <mach/mach.h>

    - (int64_t)memoryUsage {
        int64_t memoryUsageInByte = 0;
        task_vm_info_data_t vmInfo;
        mach_msg_type_number_t count = TASK_VM_INFO_COUNT;
        kern_return_t kernelReturn = task_info(mach_task_self(),
                                               TASK_VM_INFO,
                                               (task_info_t)&vmInfo,
                                               &count);
        if (kernelReturn == KERN_SUCCESS) {
            memoryUsageInByte = (int64_t)vmInfo.phys_footprint;
            NSLog(@"Memory in use (in bytes): %lld", memoryUsageInByte);
        } else {
            NSLog(@"Error with task_info(): %s", mach_error_string(kernelReturn));
        }
        return memoryUsageInByte;
    }
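Approaches to locating OOM
Approach 1:
The earliest OOM-related approach I came across is described in a Facebook blog post, code.facebook.com/posts/11469…, which uses a process of elimination to estimate the OOM rate. The result will of course deviate somewhat from the real figure, for example because ApplicationState is inaccurate or because watchdog kills are also counted as OOM.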
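The process of elimination can be sketched as follows. This is a minimal illustration, not Facebook's implementation: the struct, its flags, and the function name are all hypothetical, and the exact set of checks is my assumption about what such an elimination would include. Each flag would have to be recorded by the App during the previous session and read back at the next launch.

```c
#include <stdbool.h>

typedef enum {
    PREV_EXIT_EXPLAINED,       /* some known cause accounts for the exit */
    PREV_EXIT_FOREGROUND_OOM,  /* unexplained exit while in the foreground */
    PREV_EXIT_BACKGROUND_OOM   /* unexplained exit while in the background */
} prev_exit_t;

/* Hypothetical record of the previous session (all fields are assumptions). */
typedef struct {
    bool app_upgraded;          /* App version changed since last launch */
    bool os_upgraded;           /* OS version changed or device rebooted */
    bool crash_reported;        /* the crash reporter captured a crash */
    bool exited_intentionally;  /* exit()/abort() was called or the user quit */
    bool was_in_background;     /* last recorded application state */
} last_session_t;

/* If no known cause explains the previous exit, attribute it to OOM. */
prev_exit_t classify_last_exit(const last_session_t *s)
{
    if (s->app_upgraded || s->os_upgraded ||
        s->crash_reported || s->exited_intentionally) {
        return PREV_EXIT_EXPLAINED;
    }
    return s->was_in_background ? PREV_EXIT_BACKGROUND_OOM
                                : PREV_EXIT_FOREGROUND_OOM;
}
```

Because anything left unexplained falls into the OOM bucket, causes the App cannot observe, such as watchdog kills, inflate the count, and a stale `was_in_background` value misattributes foreground and background cases.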
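Approach 2:
Tencent has also recently open-sourced its own OOM-locating solution, the OOMDetector component: github.com/Tencent/OOM… . It uses the malloc_logger function pointer in libmalloc to record stacks that help developers locate large allocations. It has a drawback, though: dumping stacks frequently affects App performance, so it can only be rolled out to a small share of users for data collection and diagnosis.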
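Approach 3:
Based on the recent findings, the App's high water mark, i.e. its OOM memory threshold, can be obtained offline. That leads to approach 3:
Monitor memory growth and, when usage gets close to the high water mark, dump memory information: object names, object counts, and the memory used by each object. If this proves stable, it can be enabled for all users without performance problems.
OOMDetector can capture the stacks of memory allocations, which is more effective for pinpointing the problem at the code level; it can be rolled out to a subset of users.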
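The "dump when close to the high water mark" trigger might look like the sketch below. The struct, the function name, the percentage-based trigger, and the choice to fire only once per launch are all assumptions of mine; the footprint passed in would come from phys_footprint and the limit from the offline measurements described earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical monitor state (names and fields are illustrative). */
typedef struct {
    uint64_t limit_bytes;      /* high water mark obtained offline */
    uint32_t trigger_percent;  /* e.g. 90 means "dump at 90% of the limit" */
    bool     dumped;           /* ensures the dump happens only once */
} oom_monitor_t;

/* Returns true exactly once: the first time the footprint reaches the
 * trigger level. The caller then performs the actual memory dump. */
bool oom_monitor_should_dump(oom_monitor_t *m, uint64_t footprint_bytes)
{
    if (m->dumped) {
        return false;
    }
    /* Integer arithmetic; divide first to avoid overflow on large limits. */
    uint64_t threshold = m->limit_bytes / 100 * m->trigger_percent;
    if (footprint_bytes < threshold) {
        return false;
    }
    m->dumped = true;
    return true;
}
```

Calling this from a periodic timer keeps the steady-state cost to one comparison per tick, and the one-shot latch avoids repeated dumps while memory hovers near the limit.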
Author: Joy_xx
Original link: https://juejin.im/post/5c28646f5188257abf1d947d