生产系统rac遇到一个问题,在alert*.log: Sat Mar 17 11:43:31 2012 Global Enqueue Services Deadlock detected. More info in file /u01/app/oracle/admin/orcl/udump/orcl2_ora_18076.trc. Sat Mar 17 11:43:31 2012 Trace dumping is performing id=[cdmp_20120317114331] Sat Mar 17 11:44:29 2012 System State dumped to trace file /u01/app/oracle/admin/orcl/bdump/orcl2_diag_20766.trc
正是这个核心转储文件,占用了大量的磁盘空间。检查alertSID.log文件发现如下的内容: Sun Jun 6 18:51:35 2010 Global Enqueue Services Deadlock detected. More info in file /opt/oracle/product/admin/orcl/udump/orcl1_ora_26649.trc. Sun Jun 6 19:14:05 2010 System State dumped to trace file /opt/oracle/product/admin/orcl/bdump/orcl1_diag_4783.trc .... 日志显示是由于Oracle检查到了一个ges的死锁,于是做了系统级的dump操作,这种系统级的dump出来的内容很多,导致了dump文件特别大。 感觉象是遇到了bug,在metalink找到了现象类似的bug:6145177,bug中描述的现象与医院有些相似,但不完全一样,但打了补丁后,没再 出现这个问题。这个问题在10.2.0.4中已经解决了,目前医院使用的是10.2.0.3,Oracle已经提供了单独的Patch,为了避免大的版本升级, 就应用了单独的patch,整个过程比较顺利。打了这个patch,一直用到第二天上午,再也没有出现这个错误 ,看来patch已经起了作用。
2012年3月18号作了升级到10.2.0.4,我开始一直关注以前的情况是否解决。检查发现diag文件不再迅速增加,但是 Global Enqueue Services Deadlock detected依旧出现,不过这回我可以定位问题了。
Fri Apr 13 16:29:03 2012 Global Enqueue Services Deadlock detected. More info in file /u01/app/oracle/admin/orcl/udump/orcl2_ora_3135.trc. Fri Apr 13 16:29:03 2012 Trace dumping is performing id=[cdmp_20120413162903]
1.测试环境: SQL> select * from v$version ; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for Linux: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production
create table t (id number,name varchar2(10)); CREATE UNIQUE INDEX I_T_I ON SCOTT.T(ID);
CREATE PROCEDURE test1( l_id number,l_name varchar2,flag VARCHAR2) AS PRAGMA AUTONOMOUS_TRANSACTION; BEGIN commit ; IF flag = 'INSERT' THEN insert into t values(l_id,l_name); END IF; IF flag = 'UPDATE' THEN update t set id=l_id,name=l_name where id=l_id; END IF; IF flag = 'DELETE' THEN delete from t where id=l_id; END IF; commit; dbms_output.put_line (flag); END; /
insert into t values(1,'a'); insert into t values(2,'b'); commit;
2.测试: 打开会话1,执行如下: SQL> set SERVEROUT on SQL> update t set name='aaa' where id=1;
SQL> exec test1(1,'A','UPDATE'); BEGIN test1(1,'A','UPDATE'); END;
* ERROR at line 1: ORA-00060: deadlock detected while waiting for resource ORA-06512: at "SCOTT.TEST1", line 9 ORA-06512: at line 1
alert提示如下: Mon Apr 16 10:37:38 2012 ORA-00060: Deadlock detected. More info in file /u01/app/oracle11g/diag/rdbms/test/test/trace/test_ora_13751.trc.
--而且11G下记录更加详细,定位更加方便。 Deadlock graph: ---------Blocker(s)-------- ---------Waiter(s)--------- Resource Name process session holds waits process session holds waits TX-00090008-00001193 26 132 X 26 132 X
session 132: DID 0001-001A-00000007 session 132: DID 0001-001A-00000007
----- Information for the OTHER waiting sessions ----- ----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=2bbypw0dj45w2) ----- UPDATE T SET ID=:B1 ,NAME=:B2 WHERE ID=:B1 ----- PL/SQL Stack ----- ----- PL/SQL Call Stack ----- object line object handle number name 0x95fcdac0 9 procedure SCOTT.TEST1 0x95f2c440 1 anonymous block ===================================================
--我的测试环境是单机,如果在rac环境下,alert提示如下: Thu Apr 12 15:56:49 2012 Global Enqueue Services Deadlock detected. More info in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_8070.trc. Thu Apr 12 15:56:49 2012 Trace dumping is performing id=[cdmp_20120412155649]
3.再做一个另外的测试: SQL> rollback ; Rollback complete.
SQL> select * from t; ID NAME ---------- ---------- 1 a 2 b
SQL> insert into t values(3,'c'); 1 row created.
SQL> exec test1(1,'A','UPDATE'); UPDATE
PL/SQL procedure successfully completed.
SQL> select * from t; ID NAME ---------- ---------- 1 A 2 b 3 c
--打开另外的会话,执行: SQL> select * from t; ID NAME ---------- ---------- 1 A 2 b