MTK 驱动开发(37)--如何确定阻止进入suspend的原因

xiaoxiao2021-02-28  31

1、系统无法进入suspend

系统没有进入suspend,主要的原因是因为系统有锁导致.

锁一般分为:APP透过PowerManager拿锁,以及kernel wakelock.

分析上层持锁的问题:

目前PowerManagerService的log 默认不会打开,可以通过修改:

/frameworks/base/services/core/java/com/android/server/power/PowerManagerService.java

private static final boolean DEBUG = true; private static final boolean DEBUG_SPEW = DEBUG && false; 修改为: private static final boolean DEBUG = true; private static final boolean DEBUG_SPEW = true; 打开上层的log

通过syslog:搜索关键字:total_time=来确定持锁的时间.

PowerManagerService: releaseWakeLockInternal: lock=31602562 [*job*/DownloadManager:com.android.providers.downloads], flags=0x0, total_time=600051ms

或者通过正则表达式:total_time=[\d]{4,}ms 过滤出持锁时间比较长的锁.

PowerManagerService: releaseWakeLockInternal: lock=31602562 [*job*/DownloadManager:com.android.providers.downloads], flags=0x0, total_time=600051msPowerManagerService: releaseWakeLockInternal: lock=56317918 [*job*/DownloadManager:com.android.providers.downloads], flags=0x0, total_time=283062msPowerManagerService: releaseWakeLockInternal: lock=216012597 [AudioMix], flags=0x0, total_time=120003msPowerManagerService: releaseWakeLockInternal: lock=41036921 [AudioMix], flags=0x0, total_time=167984msPowerManagerService: releaseWakeLockInternal: lock=70859243 [GsmInboundSmsHandler], flags=0x0, total_time=3206ms

PowerManagerService: releaseWakeLockInternal: lock=242046348 [AudioMix], flags=0x0, total_time=122205ms

如何分析wakelock(wakeup source)持锁问题

锁一般分为:APP透过PowerManager拿锁,以及kernel wakelock.

分析上层持锁的问题:

目前PowerManagerService的log 默认不会打开,可以通过修改:

/frameworks/base/services/core/java/com/android/server/power/PowerManagerService.java

private static final boolean DEBUG = true; private static final boolean DEBUG_SPEW = DEBUG && false; 修改为: private static final boolean DEBUG = true; private static final boolean DEBUG_SPEW = true; 打开上层的log

通过syslog:搜索关键字:total_time=来确定持锁的时间.

PowerManagerService: releaseWakeLockInternal: lock=31602562 [*job*/DownloadManager:com.android.providers.downloads], flags=0x0, total_time=600051ms

或者通过正则表达式:total_time=[\d]{4,}ms 过滤出持锁时间比较长的锁.

PowerManagerService: releaseWakeLockInternal: lock=31602562 [*job*/DownloadManager:com.android.providers.downloads], flags=0x0, total_time=600051msPowerManagerService: releaseWakeLockInternal: lock=56317918 [*job*/DownloadManager:com.android.providers.downloads], flags=0x0, total_time=283062msPowerManagerService: releaseWakeLockInternal: lock=216012597 [AudioMix], flags=0x0, total_time=120003msPowerManagerService: releaseWakeLockInternal: lock=41036921 [AudioMix], flags=0x0, total_time=167984msPowerManagerService: releaseWakeLockInternal: lock=70859243 [GsmInboundSmsHandler], flags=0x0, total_time=3206msPowerManagerService: releaseWakeLockInternal: lock=242046348 [AudioMix], flags=0x0, total_time=122205ms

 

kernel的锁默认不会打印出来,一般是待机结束后通过节点来获取:

adb shell  cat /sys/kernel/debug/wakeup_sources >  wakeup_sources.log

active_count:对应wakeup source被激活的次数.event_count:被信号唤醒的次数 wakeup_count:中止suspend的次数. expire_count:对应wakeup source超时的次数. active_since:上一次还活跃的时间点.时间单位跟kernel log前缀时间是一样(kernel单调递增时间). total_time:对应wakeup source活跃的总时长. max_time:对应的wakeup source持续活跃最长的一次时间. last_change:上一次wakeup source变化的时间(从持锁到释放or释放到持锁),时间单位跟kernel log前缀时间是一样(kernel单调递增时间). prevent_suspend_time:对应wakeup source阻止进入autosleep的总累加时间.

一般情况下:如果是复现机,前面没有捉log,也没有dump log,只有一份wakeup_sources.log可以看下prevent_suspend_time,一般时间越大越可能是阻止系统进入suspend的wakeup sources.

如果测试前后,都有捉 wakeup_sources.log 请对两份wakeup_sources.log的total time的差值.差值时间跟灭屏的时间对得上,一般就是这个锁引起的问题.

把捉出来的wakeup_sources.log复制到excel表格中,比较好对齐,一个是比较好计算.

其中dispsys_wakelock total_time的时间有697614mS 也就是总共有697s.

 

或者在待机测试结束后通过命令:

adb bugreport > bugreport.txt

搜索关键:

底层的锁:All kernel wake locks: Kernel Wake lock ttyC0 : 1h 33m 15s 668ms (3856 times) realtime Kernel Wake lock radio-interface: 1h 20m 56s 210ms (3995 times) realtime Kernel Wake lock ccci3_at : 1h 9m 43s 491ms (2932 times) realtime Kernel Wake lock ccci_fs : 1h 0m 52s 818ms (3432 times) realtime Kernel Wake lock ccci3_at2 : 41m 16s 938ms (2465 times) realtime

上层的锁:All partial wake locks: Wake lock 1001 RILJ: 5m 29s 768ms (13118 times) realtime Wake lock 1000 *alarm*: 4m 7s 823ms (2330 times) realtime Wake lock 1000 ConnectivityService: 59s 513ms (1 times) realtime Wake lock u0a111 *alarm*: 50s 334ms (751 times) realtime Wake lock u0a111 WakerLock:999603354: 28s 655ms (125 times) realtime Wake lock 1000 NetworkStats: 11s 434ms (569 times) realtime

3 如何看SPM的状态是否正确

待机被唤醒之后,确认:大部分debug_flag最后的bit位是不是f or ff, 说明系统还有模块咬住26M,系统并没有最后进入真正的suspend.

譬如下面的log都是有问题的:<4>[  250.874040] -(0)[1244:system_server][SPM] wake up byCONN2AP, timer_out = 8635, r13 = 0x20045038, debug_flag = 0x90

<4>[  600.704307] -(0)[2722:system_server][SPM] wake up by R12_EINT_EVENT_B, timer_out = 81779, r13 = 0xe040000, debug_flag = 0x113f0

如何查找待机唤醒源 

系统场景的唤醒源:EINT/CONN/CLDMA

EINT:

PMIC的唤醒.a.Powerkey唤醒后面的log会有pwrkey_int_handler

b. rtc alarm唤醒后面的log会有alarm time is up

<2>[ 1145.475797]<3>-(0)[4497:kworker/u8:10][SPM] wake up by EINT, timer_out = 1406849, r13 = 0x41208, debug_flag = 0x9f<7>[ 1145.475797]<3>-(0)[4497:kworker/u8:10]EINT 206 is pending<5>[ 1145.476310]<3> (0)[53:pmic_thread]mtk_rtc_common: alarm time is up<6>[ 1145.895970]<0> (0)[4497:kworker/u8:10]PM: suspend exit 2016-01-24 17:23:04.023386308 UTC<2>[ 1146.231012]<3>-(0)[4497:kworker/u8:10][SPM] md_settle = 99, settle = 99

 

具体类型的唤醒包,可以确认:从syslog里面搜索关键字wakeup alarm:01-25 01:23:04.026 830 898 D AlarmManager: wakeup alarm = Alarm{3e671462 type 2 when 27213196 com.android.phone}; package = com.android.phoneneedGrouping = true

一般alarm的唤醒,除了第三方APK之外,有时会遇到类似android/phone APK的唤醒.确认具体android的唤醒的原因,需要确定唤醒后,紧接着发下来的intent事件.01-24 19:50:30.031 830 898 D AlarmManager: wakeup alarm = Alarm{9ba1b41 type 2 when 7259546 android}; package = androidneedGrouping = false

01-24 19:50:30.031 830 898 V ActivityManager: Broadcast: Intent { act=android.content.syncmanager.SYNC_ALARM flg=0x114 (has extras) } ordered=true userid=0 callerApp=null搜索:android.content.syncmanager.SYNC_ALARM可以定位到:/frameworks/base/services/core/java/com/android/server/content/SyncManager.java进一步找对应owner确认.

phone apk的唤醒:数据网络的定时恢复.

c. otherskernel log有关键字:EINT.*is pending序号:206EINT 206 is pending

需要结合DCT跟cat /proc/interrupts

1.通过DCT:

2.cat /proc/interrupts :pmic-eint对应的序号是150.

289: 149 mt-eint 1 TOUCH_PANEL-eint291: 0 mt-eint 3 11240000.msdc1 cd294: 0 mt-eint 6 ALS-eint295: 0 mt-eint 7 mrdump_ext_rst-eint314: 73 mt-eint 26 irq_nfc-eint332: 246 mt-eint 44432: 0 mt-eint 144 iddig_eint438: 341 mt-eint 150 pmic-eint440: 0 mt-eint 152 spm_vcorefs_start_eint441: 0 mt-eint 153 spm_vcorefs_end_eint442: 0 mt-eint 154 spm_vcorefs_err_eint

<5>[30640.939329] -(0)[1191:system_server]EINT 150 is pending......<3>[30640.942131] (0)[69:pmic_thread]kpd: Power Key generate, pressed=1

<3>[30640.942189] (0)[69:pmic_thread]kpd: kpd: (pressed) HW keycode =116 using PMIC

CLDMA:

确认唤醒的channel ID,关键字CLDMA_MD wakeup source<2>[ 3072.796534]<2>-(0)[14284:kworker/u8:0][SPM] wake up by CLDMA_MD, timer_out = 5598687, r13 = 0x20001238, debug_flag = 0x9f//唤醒点之后第一句CLDMA_MD wakeup source会打印出channel ID. 譬如该题是10<5>[ 3072.810528]<2> (1)[23510:kworker/u9:0][ccci1/mcd]CLDMA_MD wakeup source:(1/10)<6>[ 3072.872362]<3> (1)[14284:kworker/u8:0]PM: suspend exit 2016-02-05 11:28:18.030905923 UTC<6>[ 3074.000405]<2> (0)[14284:kworker/u8:0]PM: suspend exit 2016-02-05 11:28:19.158954462 UTC<2>[ 3075.262223]<3>-(0)[14284:kworker/u8:0][SPM] md_settle = 99, settle = 99<2>[ 3075.262223]<3>-(0)[14284:kworker/u8:0][SPM] wake up by EINT, timer_out = 4159454, r13 = 0x41000, debug_flag = 0x9f<6>[ 3075.341978]<3> (1)[14284:kworker/u8:0]PM: suspend exit 2016-02-05 11:30:28.031774693 UTC<2>[ 3075.996184]<2>-(0)[14284:kworker/u8:0][SPM] md_settle = 99, settle = 99

常用的唤醒的channel:

[channel 10]:需要搜索radio log确认是什么AT command唤醒了系统.时间点:2016-02-05 11:28:18.030905923 + 8h = 2016-02-05 19:28:18.030905923

以19:28:18搜索radio log,往下搜索,找到第一条AT<02-05 19:28:18.032 20835 20856 D AT : AT< +CIREPI: 0说明唤醒的AT command是AT< +CIREPI

针对AT command的唤醒,可以下载DCC文档 :AT_DOCUMENT_Modem

 常见的AT command的唤醒:参考下面【常见AT 命令解析】

AT +ECOPS ------->PLMN信息变化.搜网的次数  ------------>  AT: CREG   CGREG   (一个是CS,一个是PS)网络PDP状态变化的次数         ---------->   AT:CGEV  (PDN activate/deactivate)VOLTE功能导致唤醒的次数      ----------->   AT: CIREPI   CIREPH CNEMS1  CIREG EIMSLTE数据连接 ------------> AT: EDRBSTATE

[channel 14]一般是跟channel10一起产生,modem小区消息变化,会记录小区的信息到nvram.

[channel20/24]数据连接的唤醒.<2>[10133.766886]<2>-0)[SPM] wake up by CLDMA_MD, timer_out = 1901490, r13 = 0x340a133c, debug_flag = 0x31f<5>[10133.774491]<2> 2)[ccci1/mcd]CLDMA_MD wakeup source:(4/24)<6>[10133.822549]<1> (0)[14053:kworker/u16:2]PM: suspend exit 2016-01-06 16:37:59.027987308 UTC<2>[10134.994190]<0>-0)[SPM] md_settle = 99, settle = 99<2>[10134.994190]<0>-0)[SPM] wake up by CLDMA_MD, timer_out = 1918768, r13 = 0x340a133c, debug_flag = 0x31f<5>[10134.996421]<0> 0)[ccci1/mcd]CLDMA_MD wakeup source:(4/24)<6>[10135.087194]<1> (0)[14053:kworker/u16:2]PM: suspend exit 2016-01-06 16:38:59.069914154 UTC<2>[10136.211339]<1>-0)[SPM] md_settle = 99, settle = 99

使用winshark打开netlog:

main log里面搜索IP 地址:112.17.251.14801-06 21:47:58.060   313 25080 D libc-netbsd: res_queryN name = push.hexin.cn succeed01-06 21:47:58.060 25056 25081 I AppStore.StorageManager: ab:e:150407-372=>in removeSpuriousFiles

01-06 21:47:58.060 21561 25079 D libc-netbsd: getaddrinfo: push.hexin.cn get result from proxy >>01-06 21:47:58.060 21561 25079 I System.out: propertyValue:true01-06 21:47:58.061 21561 25079 I System.out: [CDS]connect[push.hexin.cn/112.17.251.148:8887] tm:90请参考【如何确认具体一次唤醒,是哪个APK引起的】:

[channel32]modem SIM driver获取sim GPIO状态所造成的唤醒. 这部分情况比较少,确认review 贵司sim driver是否有相关的design.

[channel34]WIFI 跟4G 有部分频段是重叠的,modem 需要把频段的信息通知WIFI所造成的唤醒. 这部分属于正常的design.

[channel55]VOLTE 网络的唤醒.

[channel6/42]这是打开modem log造成的,分析功耗问题除非发现跟modem有关系,否则捉log时,不要打开modem log

具体的channel 对应的定义可以参考:

N版本:/kernel-4.4/drivers/misc/mediatek/eccci/ccci_core.hM版本:/kernel-3.18/drivers/misc/mediatek/include/mt-plat/mt_ccci_common.hL版本:/kernel-3.10/include/mach/mt_ccci_common.h

[channel 14与10]常见AT 命令解析,注意红色字体部分字段

 

网络切换状态AT:

命令1:AT+EDRBSTATE

命令2:AT< +COPS(运营商信息)

命令4: AT< +CGREG

命令6:AT+EGTYPE(attach状态)

命令8:AT+EI3GPPIRAT(C2K切网)

 

信号强度相关AT:

命令7:AT< +ECSQ(信号强度)

 

网络注册状态AT:

命令3:AT< +CREG(NEWORK REGISTER)

命令4: AT< +CGREG

 

网络VOLTE支持情况上报:

命令5:AT+CIREP (IMS网络支持情况)

 

 

详细解析:

命令1:AT+EDRBSTATE( 4G data link 的URC data shapping,状态变原因是data link状态是否发生变化)

+EDRBSTATE主动上报有三种情况:

  (1)每次下AT+EDRB=1,会主动上报EDRBSTATE

  (2)如果当前act的状态和之前记录的act的状态一个是7,一个非7,那么就会主动上报EDRBSTATE。需要继续check log中主动上报的CGREG并不多

  (3)当主动上报的消息MSG_ID_L4C_EVAL_CONNECTED_DRB_IND中带的参数is_drb_existent与之前得到的不同时,会主动上报EDRBSTATE。

AT< +EDRBSTATE: 0,7

第一个参数:

0:DRB(data radio state)不存在 DRB OFF 表示現在可能是4G data link but AS IDLE/ 23G / 4G no data link

1:drb存在 DRB ON 表示 4G data link but AS CONNECTED

第二参数:act 同其他 可以同CREG一起理解

AT+EDRB:

MODE:0 :TURN OFF URC+EDRBSTATE 1:TURN ON URC+EDRBSTATE

 

命令2:AT< +COPS(运营商信息)

 AT< +COPS: 0,2,"46001",2

 第一个参数mode:0 自动,1自动

 第二参数:format 对第三个参数的制式

 第三个参数:oper 运营商string,46000 移动 46001联通

 第四个参数:网络act

 0:gsm

 2:utran

 7:lte

 

命令3:AT< +CREG(NEWORK REGISTER)

  AT< +CREG: 3,1,"A80F","00CD5C14",6,0,0

  第1个参数:0:disable 网络register urc 1: enable 网络register +cgreg 2:enable 网络cgreg+local info +cgreg stat+lac+ci 3:同2+error

  第2个参数:0:not register 1:register home network 2:没有注册,但是try 3:register 拒绝 4:...

 第3个参数:lac:location area code

  第4个参数:ci:cell info

 第5个参数:网络act

  0:gsm

 2:utran

 3:gsm w/egprs

4:utran w/hsdpa (High Speed Downlink Packet Access)高速下行分组接入

 5:utran w/hsupa(high speed uplink packet access)高速上行链路分组接入

 6:utran w/hsdpa and hsupa

 7:e-utran

 

命令4: AT< +CGREG(GPRS NETWORK):AT< +CEREG(EPS/LTE NETWORK) 这个是注册gprs网络和LTE网络的命令

 AT< +CGREG: 3,1,"A80F","00CD5C14",6,"0F",0,0

 第1个参数:0:disable 网络register urc 1: enable 网络register +cgreg 2:enable 网络cgreg+local info +cgreg stat+lac+ci 3:同2+error

  第2个参数:0:not register 1:register home network 2:没有注册,但是try 3:register 拒绝 4:...

 第3个参数:lac:location area code

第4个参数:ci:cell info

 第5个参数:网络act

0:gsm

  2:utran

  3:gsm w/egprs

  4:utran w/hsdpa

  5:utran w/hsupa

 6:utran w/hsdpa and hsupa

  7:e-utran

  第6个参数:rac:route area code

  第7个参数:cause type

  第8个参数:reject cause  

命令3,命令4情况:registration status changed, modem report +CREG/+CGREG URC

 

命令5:AT+CIREP (IMS网络支持情况)

AT+ECSQ信号格相关

 

命令6:AT+EGTYPE(attach状态)

  1: Always attached

  0: attached when needed

 

命令7:AT< +ECSQ(信号强度)

AT< +ECSQ: 41,99,-279,1,1,1,1,0

主要解析第一个参数,第一个参数变化说明信号强度发生变化

命令8:AT+EI3GPPIRAT(C2K切网)

C2K切网的AT命令

+EI3GPPIRAT= <act>, <src_rat>, <dst_rat>

<act>: Integer

1              source RAT start

2              target RAT start

3              source RAT finish

4              target RAT finish 

<src_rat>: Integer

                1              1xRTT

                2              HRPD

                3              EHRPD

                4              LTE

解释下EI3GPPIRAT的含义,

1(start),4(LTE),3(CDMA)

3(finishe),4(LTE),4(LTE)

 

命令9:modem开关

▪       AT+EPOF – used to power off modem

▪       AT+EPON – used to power on modem 

▪       The first command trigger whole protocol stack: AT+CFUN - Set phone functionality

–         0: minimum functionality

–         1: full functionality

–         2: disable TX RF circuits

–         3: disable RX RF circuits

–         4: disable both TX and RX circuits

 

命令10:AT+COPS( turn on / off ECOPS) EOPS(ENhence operator selection) COPS(operator selection)

0:disable +EOPS URC

1:enable+EOPS URC

 

AT< +COPS +N +LIST plmn

NW->MS 第一个参数plmn count

第二个参数 plmn list

命令11:AT+ESMCI(switch modem cause info)

AT< +ESMCI: "46000",0,64

第一参数:plmn id

第二参数:plmn source

0:plmn search proc

1:cell reselection proc

255:invalid proc

第三参数:rat

0:RATE_NONE(DEFAULT)

1:GSM

2:UMTS

4:FDD_LTE

64:TD_LTE

68:TD_FDD_LTE

255:RAT_ANY

channel对应的表格:

CONN:

一般情况是因为打开了wifi 或者蓝牙,一般是wifi居多,APK如果建立起TCP 连接,网络包通过wifi回传给手机,都会这种类型的唤醒.

其中一次完整的唤醒有如下的log://第一句唤醒,wake up by CONN2AP,代表被connectivity wakeup,大部分情况是WIFI数据.<2>[ 172.612907]<1>-(0)[142:kworker/u8:4][SPM] wake up by CONN2AP, timer_out = 19411, r13 = 0x20047238, debug_flag = 0x9c //唤醒的时间点:<6>[ 172.679674]<1> (3)[142:kworker/u8:4]PM: suspend exit 2015-12-27 03:09:38.023614769 UTC<6>[ 172.948171]<1> (3)[142:kworker/u8:4]PM: suspend exit 2015-12-27 03:09:38.292113615 UTC<6>[ 173.572666]<1> (0)[142:kworker/u8:4]PM: suspend exit 2015-12-27 03:09:38.916604538 UTC<6>[ 173.989622]<3> (1)[142:kworker/u8:4]PM: suspend exit 2015-12-27 03:09:39.333548769 UTC<6>[ 174.399687]<3> (3)[142:kworker/u8:4]PM: suspend exit 2015-12-27 03:09:39.743622615 UTC//最后一句sleep的log. 只有出现md_settle才代表真正sleep.<2>[ 176.028364]<3>-(0)[95:kworker/u8:2][SPM] md_settle = 99, settle = 99

2015-12-27 03:09:38.023614769对应的main log时间为:2015-12-27 11:09:38.023614769

确认引起唤醒的问题包:在main log里面搜索关键字:Posix_connect Debug

Posix_connect Debug的log,只有在应用第一次建立TCP的连接的时候才会打印,后期TCP的来回握手,并不会打印出来.

譬如下面的log,可以发现对应的时间点,是豌豆荚这个应用在发包

【下面的log代表是那些APK在发包,有发包就有收包】 12-27 11:05:44.279 2895 3138 D Posix : [Posix_connect Debug]Process com. lava:pushservice  :80  12-27 11:06:02.786 4715 4778 D Posix : [Posix_connect Debug]Process com.sina.weibo :443  12-27 11:06:07.779 4715 4798 D Posix : [Posix_connect Debug]Process com.sina.weibo :443  12-27 11:06:10.080 4827 4917 D Posix : [Posix_connect Debug]Process com.sina. weibo:remote  :4828  12-27 11:06:15.465 4827 4917 D Posix : [Posix_connect Debug]Process com.sina. weibo:remote  :4828  12-27  11:09:38 .313 3635 3670 D Posix : [Posix_connect Debug] Process com.wandoujia.phoenix2:update_service  :443  12-27 11:09:42.347 4827 4917 D Posix : [Posix_connect Debug]Process com.sina. weibo:remote  :4828  12-27 11:09:49.055 4827 4917 D Posix : [Posix_connect Debug]Process com.sina. weibo:remote  :4829  12-27 11:09:49.111 4827 4942 D Posix : [Posix_connect Debug]Process com.sina. weibo:remote  :80  12-27 11:09:51.199 3495 3570 D Posix : [Posix_connect Debug]Process com.wandoujia.phoenix2 :80

转载请注明原文地址: https://www.6miu.com/read-2400160.html

最新回复(0)