Oracle 11g RAC services cannot be stopped

xiaoxiao  2021-02-28

1. Problem:

On node 2, crsctl stop crs -f was used to stop the RAC services, but the stack would not stop: all 9 of the d.bin processes were still running.

Version: Oracle 11.2.0.4 for Solaris

2. Analysis:

The alert.log file under /abcapp/oragrid/11.2.0/log/abc208 shows:

    [/abcapp/oragrid/11.2.0/bin/scriptagent.bin(10605)]CRS-5818:Aborted command 'clean' for resource 'ora.oc4j'. Details at (:CRSAGF00113:) {2:26009:18659} in /abcapp/oragrid/11.2.0/log/abc208/agent/crsd/scriptagent_oragrid/scriptagent_oragrid.log.
    2017-08-30 23:28:10.192:
    [crsd(62374)]CRS-2757:Command 'Clean' timed out waiting for response from the resource 'ora.oc4j'. Details at (:CRSPE00111:) {2:26009:18659} in /abcapp/oragrid/11.2.0/log/abc208/crsd/crsd.log.

/abcapp/oragrid/11.2.0/log/abc208/crsd/crsd.log reports the following errors:

    2017-08-30 23:48:10.228: [UiServer][47]{2:26009:18672} Container [ Name: ORDER
            MESSAGE:
            TextMessage[CRS-2680: Clean of 'ora.oc4j' on 'abc208' failed]
            MSGTYPE:
            TextMessage[1]
            OBJID:
            TextMessage[ora.oc4j]
            WAIT:
            TextMessage[0]
    ]
    2017-08-30 23:48:10.228: [   CRSPE][46]{2:26009:18672} Sequencer for [ora.oc4j 1 1] has completed with error: CRS-0216: Could not stop resource 'ora.oc4j'.
    2017-08-30 23:48:10.230: [UiServer][47]{2:26009:18673} Container [ Name: ORDER
            MESSAGE:
            TextMessage[CRS-2503: Resource 'ora.oc4j' is in UNKNOWN state and must be stopped first]
            MSGTYPE:
            TextMessage[1]
            OBJID:
            TextMessage[ora.oc4j]
            WAIT:
            TextMessage[0]
    ]

/abcapp/oragrid/11.2.0/log/abc208/agent/crsd/scriptagent_oragrid/scriptagent_oragrid.log shows:

    2017-08-30 22:37:10.040: [ora.oc4j][46]{1:63945:12686} [check] Executing action script: /abcapp/oragrid/11.2.0/bin/oc4jctl[check]
    2017-08-30 22:37:49.597: [    AGFW][9]{1:63945:12686} Agent received the message: AGENT_HB[Engine] ID 12293:21601515
    2017-08-30 22:38:10.044: [   AGENT][58]{1:63945:12686} {1:63945:12686} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.oc4j 1 1
    2017-08-30 22:38:10.044: [ora.oc4j][58]{1:63945:12686} [check] Killing action script: check
    2017-08-30 22:38:10.044: [    AGFW][58]{1:63945:12686} Command: check for resource: ora.oc4j 1 1 completed with status: TIMEDOUT
    2017-08-30 22:38:10.072: [    AGFW][46]{1:63945:12686} Received unknown resource status code: 255
    2017-08-30 22:38:49.600: [    AGFW][9]{1:63945:12686} Agent received the message: AGENT_HB[Engine] ID 12293:21601539
    2017-08-30 22:39:10.047: [ora.oc4j][46]{1:63945:12686} [check] Executing action script: /abcapp/oragrid/11.2.0/bin/oc4jctl[check]
    2017-08-30 22:39:49.603: [    AGFW][9]{1:63945:12686} Agent received the message: AGENT_HB[Engine] ID 12293:21601561
    2017-08-30 22:40:10.049: [   AGENT][58]{1:63945:12686} {1:63945:12686} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.oc4j 1 1

The logs make it clear that the ora.oc4j resource could not be stopped: its check and clean actions kept timing out, and this blocked the shutdown of the resources queued behind it. oc4j runs as a JVM process, so in theory killing the java process owned by the grid user should be enough.

    -bash-4.1$ kill -9 10789
    -bash-4.1$ ps -ef |grep 10789
     oragrid 10789     1   0   May 29 ?         847:17 /abcapp/oragrid/11.2.0/jdk/bin/sparcv9/java -server -Xcheck:jni -Xms128M -Xmx

The kill was repeated many times with no effect; the process did not respond even to SIGKILL. This shows that the problem was caused by a hung java process. A check of node 1 showed that it was not running the oc4j resource and had no corresponding java process under the grid user, so it would not hit this problem.

3. Solution:

Reboot the OS on node 2 with init 6. If nothing happens after running it, kill the crsd process; the OS can then reboot.

After the OS came back up, the CRS services and the database instance started normally, and the oc4j resource was started with crsctl start res ora.oc4j. Finally, restarting the CRS services on node 1 went smoothly.
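
The two checks used in the analysis above (which resource keeps timing out, and how many d.bin daemons are still alive) can be sketched as small shell helpers. This is only a minimal sketch: the function names timed_out_resources and count_dbin are hypothetical, not part of Oracle Grid Infrastructure, and the parsing assumes agent log lines in the same format as the scriptagent_oragrid.log excerpt shown in this post.

```shell
#!/bin/sh
# Hypothetical helpers, not Oracle tools. Both read their input on stdin.

# Summarize agent commands that ended with status TIMEDOUT.
# Assumes log lines shaped like the excerpt in this post, e.g.
#   ... Command: check for resource: ora.oc4j 1 1 completed with status: TIMEDOUT
# Prints one line per (resource, command) pair, prefixed by its count.
timed_out_resources() {
    sed -n 's/.*Command: \([a-z]*\) for resource: \([^ ]*\) .* completed with status: TIMEDOUT.*/\2 \1/p' \
        | sort | uniq -c
}

# Count lines of `ps -ef` output that belong to clusterware daemons (*d.bin).
# The [d] bracket keeps a grep process from matching its own command line.
count_dbin() {
    grep -c '[d]\.bin'
}

# Example usage (paths taken from this post):
#   timed_out_resources < /abcapp/oragrid/11.2.0/log/abc208/agent/crsd/scriptagent_oragrid/scriptagent_oragrid.log
#   ps -ef | count_dbin
```

If count_dbin still reports a non-zero value some time after crsctl stop crs -f, and timed_out_resources keeps listing the same resource, that resource is the likely blocker, as ora.oc4j was here.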
When reposting, please cite the original URL: https://www.6miu.com/read-23579.html
