环境:
OS:Oracle Linux As 5.5
DB:11.2.0.1 RAC
如下我们通过使用hangalyze获取到两个会话更新同一行数据的等待事件情况.
1.session1更新行数据
SQL> create table tb_test (id number,name varchar2(10)) tablespace users;
Table created.
SQL> insert into tb_test values(1,'hang test');
1 row created.
SQL> commit;
Commit complete.
SQL> select userenv('sid') from dual;
USERENV('SID')
--------------
69
SQL> update tb_test set name='hang' where id=1;
1 row updated.
SQL>
注意:这里不执行commit动作
2.session2同样更新session1更新的行
SQL> conn scott/oracle
Connected.
SQL> select userenv('sid') from dual;
USERENV('SID')
--------------
33
SQL> update tb_test set name='hang' where id=1;
注意:这个时候已经hang住了
3.session3使用hangalyze生成trace文件
SQL> connect / as sysdba
Connected.
SQL> oradebug -g def hanganalyze 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
SQL>
4.查看trace文件的内容
[oracle@host01-racdb1 ~]$ cat /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
Trace file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name: Linux
Node name: host01.example.com
Release: 2.6.18-194.el5
Version: #1 SMP Mon Mar 29 22:10:29 EDT 2010
Machine: x86_64
Instance name: racdb1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 5
Unix process pid: 6042, image: oracle@host01.example.com (DIAG)
*** 2017-09-17 20:25:29.980
*** SESSION ID:(5.1) 2017-09-17 20:25:29.980
*** CLIENT ID:() 2017-09-17 20:25:29.980
*** SERVICE NAME:() 2017-09-17 20:25:29.980
*** MODULE NAME:() 2017-09-17 20:25:29.980
*** ACTION NAME:() 2017-09-17 20:25:29.980
Node id: 0
List of nodes: 0,
*** 2017-09-17 20:25:29.981
Reconfiguration starts [incarn=0]
*** 2017-09-17 20:25:29.981
I'm the master node
*** 2017-09-17 20:25:29.981
Reconfiguration completes [incarn=1]
Group reconfiguration cleanup
*** 2017-09-17 20:25:34.375
DIAG attached to DLM
*** 2017-09-17 20:27:18.434
cluster reconfiguration is ongoing: 0, 1,
*** 2017-09-17 20:27:18.467
Reconfiguration starts [incarn=2]
*** 2017-09-17 20:27:18.467
I'm the master node
Group reconfiguration cleanup
*** 2017-09-17 20:27:18.954
A rcfg proposal from node 1 is received
*** 2017-09-17 20:27:18.954
Reconfiguration completes [incarn=2]
*** 2017-09-17 21:30:25.497
DIAG sends out oradebug command to nodes [0,1]
*** 2017-09-17 21:30:26.223
****************************************************************
PORADEBUG BEGIN ORIGINATING INST:1 SERIAL:0 PID:19045
PORADATA COMMAND:hanganalyze 3
session (kjzddbx) switches to a new action
PORADATA TRACEFILE /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
PORADEBUG END ORIGINATING INST:1 SERIAL:0 PID:19045
****************************************************************
*** 2017-09-17 21:30:28.293
===============================================================================
HANG ANALYSIS:
instances (db_name.oracle_sid): racdb.racdb1, racdb.racdb2
oradebug_node_dump_level: 3
analysis initiated by oradebug
===============================================================================
Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
[b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 2 Signature Hash: 0xa00e2e87
[c] Chain 3 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 3 Signature Hash: 0xa00e2e87
===============================================================================
Non-intersecting chains:
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (racdb.racdb1)
os id: 18330
process id: 50, oracle@host01.example.com (TNS V1-V3)
session id: 33
session serial #: 1207
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x3001a
p3: 'sequence'=0x7b2
time in wait: 4 min 6 sec
timeout after: never
wait id: 126
blocking: 0 sessions
current sql: update tb_test set name='hang' where id=1
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-skgpwwait()+156<-ksliwat()+1821<-kslwaitctx()+162<-kjusuc()+3520<-ksipgetctx()+1413<-ksqcmi()+20639<-ksqgtlctx()+3401<-ksqgelctx()+552<-ktcwit1()+330<-kdddgb()+7099<-kdusru()+490<-updrowFastPath()+1075<-qerupFetch()+2171<-updaul()+1179<-updThreePhaseExe()+328<-updexe()+663<-opiexe()+14377<-kpoal8()+2288<-opiodr()+1149<-ttcpip()+1251<-opitsk()+1633<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-
wait history:
* time between current wait and wait #1: 0.000372 sec
1. event: 'Disk file operations I/O'
time waited: 0.000011 sec
wait id: 125 p1: 'FileOperation'=0x2
p2: 'fileno'=0x4
p3: 'filetype'=0x2
* time between wait #1 and #2: 0.000348 sec
2. event: 'SQL*Net message from client'
time waited: 17.425030 sec
wait id: 124 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000002 sec
3. event: 'SQL*Net message to client'
time waited: 0.000001 sec
wait id: 123 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
}
and is blocked by
=> Oracle session identified by:
{
instance: 1 (racdb.racdb1)
os id: 17385
process id: 41, oracle@host01.example.com (TNS V1-V3)
session id: 69
session serial #: 1789
}
which is waiting for 'SQL*Net message from client' with wait info:
{
p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
time in wait: 4 min 48 sec
timeout after: never
wait id: 839
blocking: 1 session
current sql: <none>
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-read()+14<-ntpfprd()+115<-nsbasic_brc()+338<-nsbrecv()+69<-nioqrc()+485<-__PGOSF24_opikndf2()+978<-opitsk()+824<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+214<-main()+201<-__libc_start_main()+244<-_start()+36
wait history:
* time between current wait and wait #1: 0.000008 sec
1. event: 'SQL*Net message to client'
time waited: 0.000005 sec
wait id: 838 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.008319 sec
2. event: 'SQL*Net message from client'
time waited: 1 min 53 sec
wait id: 837 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000004 sec
3. event: 'SQL*Net message to client'
time waited: 0.000004 sec
wait id: 836 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
}
Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------
===============================================================================
Sessions in an involuntary wait or not in a wait:
-------------------------------------------------------------------------------
Chain 2:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (racdb.racdb1)
os id: 6174
process id: 37, oracle@host01.example.com (TNS V1-V3)
session id: 41
session serial #: 1
}
is waiting for 'Streams AQ: waiting for messages in the queue' with wait info:
{
p1: 'queue id'=0x3254
p2: 'process#'=0x7849d908
p3: 'wait time'=0x1
time in wait: 0.935844 sec
timeout after: 0.064156 sec
wait id: 20756
blocking: 0 sessions
current sql: <none>
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-skgpwwait()+156<-ksliwat()+1821<-kslwaitctx()+162<-kslwait()+141<-kwqidexfcy()+809<-kwqidsc81i()+1183<-kwqidrdq()+572<-kwqidsfmp()+418<-kwqidafm0()+404<-kwqididqx()+4620<-kpoaqxdq()+1520<-kpoaqxq()+705<-opiodr()+1149<-ttcpip()+1251<-opitsk()+1633<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+214<-main()+201<-__libc_start_main()+244<-_start()+36
wait history:
* time between current wait and wait #1: 0.000587 sec
1. event: 'SQL*Net message from client'
time waited: 0.000130 sec
wait id: 20755 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000011 sec
2. event: 'SQL*Net message to client'
time waited: 0.000001 sec
wait id: 20754 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000002 sec
3. event: 'SQL*Net break/reset to client'
time waited: 0.000061 sec
wait id: 20753 p1: 'driver id'=0x62657100
p2: 'break?'=0x0
}
Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 2 Signature Hash: 0xa00e2e87
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Chain 3:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (racdb.racdb1)
os id: 9078
process id: 55, oracle@host01.example.com
session id: 47
session serial #: 226
}
is waiting for 'Streams AQ: waiting for messages in the queue' with wait info:
{
p1: 'queue id'=0x3037
p2: 'process#'=0x784afd88
p3: 'wait time'=0x5
time in wait: 1.428620 sec
timeout after: 3.571380 sec
wait id: 6120
blocking: 0 sessions
current sql: <none>
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-skgpwwait()+156<-ksliwat()+1821<-kslwaitctx()+162<-kslwait()+141<-kwqidexfcy()+809<-kwqidsc81i()+1183<-kwqidrdq()+572<-kwqidsfmp()+418<-kwqidafm0()+404<-kwqididqx()+4620<-kpoaqdq()+3301<-opiodr()+1149<-ttcpip()+1251<-opitsk()+1633<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+214<-main()+201<-__libc_start_main()+244<-_start()+36
wait history:
* time between current wait and wait #1: 0.000281 sec
1. event: 'SQL*Net message from client'
time waited: 0.000550 sec
wait id: 6119 p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000016 sec
2. event: 'SQL*Net message to client'
time waited: 0.000005 sec
wait id: 6118 p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000566 sec
3. event: 'Streams AQ: waiting for messages in the queue'
time waited: 4.962538 sec
wait id: 6117 p1: 'queue id'=0x3037
p2: 'process#'=0x784afd88
p3: 'wait time'=0x5
}
Chain 3 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 3 Signature Hash: 0xa00e2e87
-------------------------------------------------------------------------------
===============================================================================
Extra information that will be dumped at higher levels:
[level 4] : 1 node dumps -- [LEAF] [LEAF_NW]
[level 5] : 3 node dumps -- [NO_WAIT] [INVOL_WT] [SINGLE_NODE] [NLEAF] [SINGLE_NODE_NW]
State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[32]/1/33/1207/0x7879b3d8/18330/NLEAF/[68]
[40]/1/41/1/0x78784058/6174/SINGLE_NODE/
[46]/1/47/226/0x787729b8/9078/SINGLE_NODE/
[68]/1/69/1789/0x78732c18/17385/LEAF/
*** 2017-09-17 21:30:28.295
===============================================================================
END OF HANG ANALYSIS
===============================================================================
*** 2017-09-17 21:30:28.296
===============================================================================
HANG ANALYSIS DUMPS:
oradebug_node_dump_level: 3
===============================================================================
State of LOCAL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[32]/1/33/1207/0x7879b3d8/18330/NLEAF/[68]
[40]/1/41/1/0x78784058/6174/SINGLE_NODE/
[46]/1/47/226/0x787729b8/9078/SINGLE_NODE/
[68]/1/69/1789/0x78732c18/17385/LEAF/
No processes qualify for dumping.
===============================================================================
HANG ANALYSIS DUMPS: END
===============================================================================
session (kjzddbx) switches to a new action
PORADATA SERIAL:0 RINST:2 RDIAG:5548
PORADATA SERIAL:0 RINST:2 REMOTE EXEC OSPID:5548
PORADATA SERIAL:0 RINST:2 TRACEFILE:/u01/app/oracle/diag/rdbms/racdb/racdb2/trace/racdb2_diag_5548.trc
PORADATA SERIAL:0 RINST:2 DONE
[oracle@host01-racdb1 ~]$
对上面的trace文件说明:
1.trace中最重要的部分
State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[32]/1/33/1207/0x7879b3d8/18330/NLEAF/[68] --这里的NLEAF表示该SID=33被阻塞了,这里adjlist=68,对应的SID=69,即为阻塞者.
[40]/1/41/1/0x78784058/6174/SINGLE_NODE/
[46]/1/47/226/0x787729b8/9078/SINGLE_NODE/
[68]/1/69/1789/0x78732c18/17385/LEAF/ --这里的LEAF表示该SID=69,即为阻塞者.
在上面的实例中的解释如下:
nodenum:定义每个session的序列号 sid: session的sid sess_srno: session的Serial# ospid: OS的进程ID state: node的状态 adjlist: 表示blocker node
IN_HANG:这表示该node处于死锁状态,通常还有其他node(blocker)也处于该状态 LEAF/LEAF_NW:该node通常是blocker.通过条目的”predecessor”列可以判断这个node是否是blocker。LEAF说明该NODE没有等待其他资源,而LEAF_NW则可能是没有等待其他资源或者是在使用CPU NLEAF:通常可以看作这些会话是被阻塞的资源.发生这种情况一般说明数据库发生性能问题而不是数据库hang IGN/IGN_DMP:这类会话通常被认为是空闲会话,除非其adjlist列里存在node。如果是非空闲会话则说明其adjlist里的node正在等待其他node释放资源。 SINGLE_NODE/SINGLE_NODE_NW:近似于空闲会话.
2.hanganalyze级别的定义
10 Dump all processes (IGN state)
5 Level 4 + Dump all processes involved in wait chains (NLEAF state)
4 Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3 Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2 Only HANGANALYZE output, no process dump at all
一般情况下,我们都定义到级别3,级别3以上的使用需要谨慎,因为级别3以上的,dump信息很多,影响I/O.
3.单节点和RAC下hanganalyze的使用:
单节点:
ALTER SESSION SET EVENTS 'immediate trace name HANGANALYZE level <level>'; ORADEBUG hanganalyze <level>
Rac:
ORADEBUG setmypid ORADEBUG setinst all ORADEBUG -g def hanganalyze <level>
4、单节点和RAC下hanganalyze的使用,举例:
单节点:
SQL> alter session set events 'immediate trace name HANGANALYZE level 3';
Session altered.
--获取trace 文件名称和路径,该脚本会放在最后贴出来
SQL> @/u01/getTracePath.sql
trace_file_name
--------------------------------------------------------------------------------
/u01/app/oracle/admin/rac/udump/dave1_ora_20962.trc
/u01/app/oracle/admin/rac/udump/dave2_ora_20962.trc
[oracle@host01-racdb1 ~]$cat getTracePath.sql
SELECT d.VALUE
|| '/'
|| LOWER (RTRIM (i.INSTANCE, CHR (0)))
|| '_ora_'
|| p.spid
|| '.trc'
AS "trace_file_name"
FROM (SELECT p.spid FROM v$mystat m, v$session s, v$process p
WHERE m.statistic# = 1 AND s.SID = m.SID AND p.addr = s.paddr) p,
(SELECT t.INSTANCE FROM v$thread t, v$parameter v
WHERE v.NAME = 'thread'
AND (v.VALUE = 0 OR t.thread# = TO_NUMBER (v.VALUE))) i,
(SELECT VALUE FROM v$parameter
WHERE NAME = 'user_dump_dest') d;
SQL> oradebug hanganalyze 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_19045.trc
Rac:
SQL> oradebug setmypid
Statement processed.
SQL> oradebug setinst all
Statement processed.
SQL> oradebug -g def hanganalyze 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
文档参考:http://blog.chinaunix.net/uid-77311-id-3245150.html
http://blog.csdn.net/tianlesoftware/article/details/6321961