Stuck recovery of database ORA-00600[3020] (Doc ID 283269.1)

发布时间：2020-08-10 10:10:43 作者：hadba
来源：ITPUB博客阅读：166

In this Document

Symptoms
Cause
Solution
References

Applies to:
Enterprise Manager for Oracle Database - Version 10.1.0.2 to 12.1.0.2.0 [Release 10.1 to 12.1]
Oracle Database - Enterprise Edition - Version 9.0.1.4 to 12.1.0.2 [Release 9.0.1 to 12.1]
Oracle Database - Personal Edition - Version 9.2.0.1 to 12.1.0.2 [Release 9.2 to 12.1]
Information in this document applies to any platform.
***Checked for relevance on 16-Apr-2014***

Symptoms
While doing a recovery the recovery process may fails with ORA-600[3020] error message

ORA-00600: internal error code, arguments: [3020], [2885689059], [1], [419819],[26750], [808], [], []
ORA-10567: Redo is inconsistent with data block (file# %s, block# %xxxx)
ORA-10564: tablespace
ORA-01110: data file %s

Cause
Recovery stops because of failed consistency checks, a problem called stuck recovery. Stuck recovery can occur when an underlying operating system or storage system loses a write issued by the database during normal operation. There is an inconsistency between the information stored in the redo and the information stored in a database block being recovered.

The database signals an internal error when applying the redo. This problem can be caused by an Oracle Database bug or may be because of I/O problem ( hardware or O/S related issue )

There is a known EMC issue related to an RDBMS ORA-600 [3020] where the root-cause is on OS/Hardware level.

Details from EMC on the nature of the fix (problem with Symmetrix microcode)
ID: emc230687
Domain: EMC1
Solution Class: 3.X Compatibility

Solution

When media recovery encounters a problem, the alert log may indicate that recovery can continue if it is allowed to corrupt the data block causing the problem. The alert log contains information about the block: its block type, block address, the tablespace it belongs to, and so forth. For blocks containing user data, the alert log may also report the data object number.

In this case, the database can proceed with recovery if it is allowed to mark the problem block as corrupt. Nevertheless, this response is not always advisable. For example, if the block is an important block in the SYSTEM tablespace, marking the block as corrupt can eventually prevent you from opening the recovered database. Another consideration is whether the recovery problem is isolated. If this problem is followed immediately by many other problems in the redo stream, then you may want to open the database with the RESETLOGS option.

For a block containing user data, you can usually query the database to find out which object or table owns this block. If the database is not open, then you should be able to open the database read-only, even if you are recovering a whole database backup. The following example cancels recovery and opens read-only:

CANCEL
ALTER DATABASE OPEN READ ONLY;

From the alert.log error message we can found the dataifle and its corresponding block number. Even for the blocks containing user data, the alert log may also report the data object number. From this we can get the object information.

SQL>SELECT SEGMENT_NAME FROM DBA_EXTENTS WHERE FILE_ID= &file_number
AND &BLOCK_NUMBER BETWEEN BLOCK_ID AND BLOCK_ID+BLOCKS-1;

-- where File_number is the datafile number in the error message and
Block_Number is the block number in the error message.

In case we get the data object number in the alert.log then we can determine the owner, object name, and object type by issuing this query:

SQL>SELECT OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_TYPE
FROM DBA_OBJECTS
WHERE DATA_OBJECT_ID = &Object_number;

-- Where object_number is object_id in the error message.

To determine whether a recovery problem is isolated, we can run a diagnostic trial recovery, which scans the redo stream for problems but does not actually make any changes to the recovered database. If a trial recovery discovers any recovery problems, then it reports them in the alert_SID.log. You can use the RECOVER ... TEST statement to invoke trial recovery. Kindly refer Note 283262.1 Trial Recovery

Note : If the problem is not isolated or its belongs to SYSTEM tablespace then its better to open the database with the RESETLOGS option.

If the block is relatively unimportant like belogs to index tablespace or if the problem is isolated, then it is better to corrupt the block. If you decide to allow recovery to proceed in spite of block corruptions, then run the RECOVER command with the ALLOW n CORRUPTION clause, where n is the number of allowable corrupt blocks.

To allow recovery to corrupt blocks:

1. Ensure that all normal recovery preconditions are met.

2. Run the RECOVER command, allowing a single corruption, repeating as necessary for each corruption to be made. for an example :

SQL>RECOVER DATABASE ALLOW 1 CORRUPTION;

Note : The ALLOW integer CORRUPTION clause lets you specify, in the event of logfile corruption, the number of corrupt blocks that can be tolerated while allowing recovery to proceed.

When you use this clause during trial recovery (that is, in conjunction with the TEST clause), integer can exceed 1. When using this clause during normal recovery, integer cannot exceed 1.

In 10gr2 and Below version the restriction was we could specify only 1 block in allow 1 corruption during recovery phase , however from 11gr1 this restriction is removed You can specify n value in Allow <n> corruption during recovery where n is the total number of blocks found to be corrupted during trail recovery using Recover database test.

Example : SQL> Recover database allow 10 corruption ;

References
NOTE:283262.1 - Trial Recovery

Stuck recovery of database ORA-00600[3020] (Doc ID 283269.1)

相关阅读