Corrupted Undo Tablespace in an Oracle RAC Database
Overview
Recently, I came across an issue in one of the Oracle RAC database environments where the undo tablespace became problematic and needed to be replaced safely. Undo tablespaces are a very critical part of Oracle Database because they manage transaction rollback, read consistency, and undo data required by running sessions.
In a production or RAC environment, any corruption or issue with an undo tablespace can lead to transaction failures, startup issues, or abnormal database behavior. Therefore, replacing a corrupted undo tablespace must be handled very carefully, especially when the database is using SPFILE and RAC-specific undo configurations.
Step 1: Create a PFILE from the Existing SPFILE
First, I created a PFILE from the existing SPFILE. This is important because we need to make temporary parameter changes without directly modifying the existing SPFILE at the initial stage.
SQL> CREATE PFILE=’/tmp/initPROD.ora’ FROM SPFILE;
Step 2: Shutdown the Database Instance
After creating the PFILE, I shut down the database instance cleanly.
SQL> SHUTDOWN IMMEDIATE;
A clean shutdown is recommended so that no active transactions remain open and the database can be restarted using the modified PFILE.
Step 3: Update the PFILE Parameters
Next, I edited the newly created PFILE:
/tmp/initPROD.ora
I added/updated the following parameters:
*.undo_management = manual
*.event=’10513 trace name context forever, level 2′
The purpose of setting undo management to manual is to allow the database to start without automatically trying to use the problematic undo tablespace.
The event parameter is used during this activity to help bypass certain undo-related issues while the database is being started for maintenance and recovery purposes.
Step 4: Start the Database in Restricted Mode Using the Modified PFILE
After updating the PFILE, I started the database in restricted mode using the modified PFILE.
SQL> STARTUP RESTRICT PFILE=’/tmp/initPROD.ora’;
Restricted mode ensures that only users with the required privileges can connect to the database while this maintenance activity is being performed.
Step 5: Verify Undo/Rollback Segment Status
Before creating and switching to a new undo tablespace, it is very important to verify the current rollback/undo segment status.
I used the following query:
SQL> SELECT tablespace_name, status, segment_name
FROM dba_rollback_segs
WHERE status != ‘OFFLINE’;
At this stage, all undo segments related to the problematic undo tablespace should ideally be offline.
Critical Check:
If any undo segment is showing status such as PARTLY AVAILABLE or NEEDS RECOVERY, do not proceed further without proper analysis or Oracle Support guidance.
If all undo segments are offline, then we can safely continue to the next step.
Step 6: Create a New Undo Tablespace
Once the rollback segment status was verified, I created a new undo tablespace to replace the corrupted one.
SQL> CREATE UNDO TABLESPACE UNDOTBS01
DATAFILE ‘+DATA/PROD/DATAFILE/undotbs01.262.789564546’
SIZE 2048M AUTOEXTEND ON NEXT 120M MAXSIZE UNLIMITED;
In this command:
UNDOTBS01 is the new undo tablespace name.
The datafile is created under ASM diskgroup +DATA.
Initial size is set to 2048M.
Autoextend is enabled with next extent size of 120M.
Maximum size is set to unlimited.
In a RAC environment, make sure the undo tablespace naming and instance mapping are aligned with the database configuration.
Step 7: Drop the Old Corrupted Undo Tablespace
After successfully creating the new undo tablespace, I dropped the old corrupted undo tablespace.
SQL> DROP TABLESPACE UNDOTBS1 INCLUDING CONTENTS AND DATAFILES;
This command removes the old undo tablespace along with its contents and datafiles.
Important:
Before dropping the old undo tablespace, make sure it is no longer actively used and all related undo segments are offline.
Step 8: Shutdown the Database
After dropping the old undo tablespace, I shut down the database again.
SQL> SHUTDOWN IMMEDIATE;
This restart is required so that the database can be brought back using the original SPFILE and updated undo configuration.
Step 9: Start the Database in NOMOUNT Mode
Next, I started the database in NOMOUNT mode.
SQL> STARTUP NOMOUNT;
Starting in NOMOUNT mode allows us to update the SPFILE before opening the database normally.
Step 10: Update the SPFILE with the New Undo Tablespace
Now I updated the database parameter to point the instance to the newly created undo tablespace.
SQL> ALTER SYSTEM SET undo_tablespace=UNDOTBS01 SCOPE=BOTH SID=’PROD1′;
Here:
undo_tablespace=UNDOTBS01 sets the new undo tablespace.
SCOPE=BOTH updates both memory and SPFILE.
SID=’PROD1′ applies the change to the specific RAC instance.
RAC Note:
In an Oracle RAC database, each instance normally has its own undo tablespace. Therefore, verify the undo configuration for all RAC instances and update the correct SID accordingly.
Step 11: Restart the Database Normally
Finally, I restarted the database normally using the updated SPFILE.
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP;
After the database startup, I verified that the instance was using the new undo tablespace and that the database was functioning normally.
Post-Activity Validation
After completing the activity, the following checks should be performed:
SQL> SHOW PARAMETER undo_tablespace;
Also verify rollback segment status again:
SQL> SELECT tablespace_name, status, segment_name
FROM dba_rollback_segs
ORDER BY tablespace_name, segment_name;
In RAC, also validate instance-level undo configuration:
SQL> SELECT inst_id, name, value
FROM gv$parameter
WHERE name = ‘undo_tablespace’
ORDER BY inst_id;
These checks help confirm that the new undo tablespace is correctly assigned and there are no abnormal undo segment statuses.
Oracle Solutions We believe in delivering tangible results for our customers in a cost-effective manner