Oracle Data Guard Broker Drill (19c): Switchover + Failover Using DGMGRL (End-to-End Runbook)
This week I ran a Data Guard Broker DR drill for a client on Oracle 19c, covering both planned role transition (Switchover) and unplanned role transition (Failover) using the DGMGRL utility. The goal was simple: prove that we can safely switch roles during planned maintenance, and also recover services quickly if the primary becomes unavailable—while validating readiness and capturing the post-event cleanup steps.
The walkthrough below is written as a single, practical runbook (switchover + failover), keeping the same technical flow/commands used in the original procedures.
Lab / Drill Context
In my drill setup:
- Primary DB initially: PRODDB
- Physical Standby initially: PRODDB_ST
- Broker config name shown as: prod
- Protection mode observed: MaxAvailability (SYNC transport)
- Note from the drill outcome:
- After Switchover, the primary becomes PRODDB_ST
What I validated before role transitions
Switchover readiness (planned activity)
A switchover is a planned role reversal (Primary ↔ Standby) with no data loss, typically for maintenance. Before doing it, I ensured:
- Both databases were healthy (no broker warnings)
- Intended states were correct (TRANSPORT-ON on primary, APPLY-ON on standby)
- Standby Redo Logs were in place and redo transport/apply were running properly
Failover readiness (emergency activity)
A failover is triggered when the primary is unavailable/unrecoverable in time. Key checks I focused on:
- Flashback Database enabled (helps reinstatement after failover)
- Standby Redo Logs present to minimize/avoid data loss (depends on protection mode & last redo shipped)
Part A — Switchover Drill Using DGMGRL
Step 1: Check primary database (broker view)
I started by checking the primary with verbose output:
DGMGRL> SHOW DATABASE VERBOSE ‘PRODDB’;
This confirms role, intended state, transport mode, connect identifiers, and overall broker health.
Step 2: Check target standby (broker view)
Next, I verified the standby that will become the new primary:
Step 3: Validate switchover readiness (broker + database)
Broker-side validation:
I also confirmed overall configuration status:
Step 4: Execute switchover
This is the exact command I ran to flip roles:
Broker handled the orchestration (connections, role transition, startup/mount/open steps) and confirmed:
- New primary: PRODDB_ST
- Old primary: PRODDB now becomes standby
Step 5: Post-switchover verification (SQL checks)
I verified roles on both sides:
Old Primary (now standby):
New Primary:
During the drill I also kept both alert logs open to monitor transitions in real time (log shipping/apply/service opens).
Part B — Failover Drill Using DGMGRL
In real incidents, failover is executed only when the primary is not accessible or cannot be restored within RTO. For the drill, I performed it in a controlled way and documented the aftermath.
Step 1: Confirm Flashback Database enabled (both sides)
Before failover, I confirmed flashback was enabled:
SQL> select flashback_on, instance_name, database_role from v$database,V$instance;
This was done on both databases in the drill output.
Step 2: Connect to DGMGRL and show configuration
I connected to the current primary PRODDB and checked the configuration:
$ dgmgrl
DGMGRL> connect sys/oracle@PRODDB
DGMGRL> SHOW CONFIGURATION;
This confirmed PRODDB as primary and PRODDB_ST as physical standby.
Step 3: Validate the target standby for failover
I validated the target standby PRODDB_ST to ensure it is ready:
DGMGRL> VALIDATE DATABASE ‘PRODDB_ST’;
In the drill, readiness included:
-
Ready for Failover: Yes (Primary Running)
Step 4: Connect to the target standby
Then I connected to the standby that will become the new primary:
Step 5: Execute failover
This is the failover command I issued:
Broker confirmed: “Failover succeeded, new primary is proddb_st”
Step 6: Post-failover configuration check (expected warnings)
After failover, I checked:
In the drill output, broker showed warnings like:
- Standby database disabled
- “the standby database needs to be reinstated” (reinstate required for the old primary)
This is normal behavior after failover—the old primary typically must be reinstated (or rebuilt) before it can rejoin as a standby.
What I documented for the client (practical notes)
- Switchover is the clean, preferred method for planned maintenance (no data loss, broker-managed transitions).
- Failover is for emergencies; data loss depends on transport/protection mode and whether redo was fully shipped/applied. Having SRLs + Flashback ON greatly improves recovery and reinstatement workflow.
- Always run VALIDATE DATABASE before the operation (it catches readiness issues early).
- After failover, plan the reinstate (or rebuild) of the old primary.
Quick Command Summary
Switchover
DGMGRL> SHOW DATABASE VERBOSE ‘PRODDB’;
DGMGRL> SHOW DATABASE VERBOSE ‘PRODDB_ST’;
DGMGRL> VALIDATE DATABASE ‘PRODDB’;
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
DGMGRL> SHOW CONFIGURATION;
DGMGRL> switchover to ‘PRODDB_ST’;
DGMGRL> SHOW CONFIGURATION;
Failover
SQL> select flashback_on,instance_name,database_role from v$database,V$instance;
DGMGRL> connect sys/oracle@PRODDB
DGMGRL> SHOW CONFIGURATION;
DGMGRL> VALIDATE DATABASE ‘PRODB_ST’;
DGMGRL> connect sys/oracle@PRODDB_ST
DGMGRL> FAILOVER TO ‘PRODDB_ST’;
DGMGRL> SHOW CONFIGURATION;
DR Drill Checklist
DR Drill Checklist (Switchover + Failover)
A) Before Drill
Confirm maintenance/drill window approved
Confirm stakeholders on bridge (DBA/App/Infra/Network)
Take pre-drill backups (or confirm last good backup)
Capture baseline: SHOW CONFIGURATION (DGMGRL)
Confirm standby apply healthy (lag acceptable)
Confirm SWITCHOVER_STATUS = TO STANDBY (primary)
Confirm Flashback ON on both DBs (recommended)
Confirm monitoring/on-call ready (OEM, scripts, alerts)
B) Switchover Execution
VALIDATE DATABASE (primary)
SWITCHOVER TO ‘<standby>’ executed
Verify roles in SQL on both nodes
Verify app/services on new primary
Confirm broker shows expected primary/standby roles
Save outputs and timestamps
C) Failover Drill (Controlled)
Confirm failover trigger criteria (primary unreachable / simulated incident)
VALIDATE DATABASE ‘<target>’
FAILOVER TO ‘<target>’ executed
Verify target is PRIMARY and open read/write
Confirm broker messages indicate reinstate needed for old primary
Capture all outputs, alert logs, and drill notes
D) After Drill
Confirm application fully stable
Confirm protection mode and transport settings
Plan reinstate/rebuild steps for old primary (if failover done)
Send DR drill summary (what changed, what verified, evidence links)
Update DR runbook with lessons learned
Oracle Solutions We believe in delivering tangible results for our customers in a cost-effective manner