Creating a Single Instance Physical Standby for a RAC Primary
When people configure physical dataguard it is common for most of DBAs that they manually copy datafiles and create standby controlfiles prior to configure any required parameters. But when the primary database become ASM database copying datafiles are not as easy as OS commands and it will be tedious to transfer file by file using oracle provided procedure DBMS_FILE_TRANSFER.COPY.
In addition that the primary database must be force logging mode to avoid potential block corruption when it has nologging option enabled on table level.
SQL> alter database force logging;
This article explains how to create physical dataguard through RMAN specially when the primary database is on RAC and physical dataguard is on filesystem.
1. Perform rman backup and transfer to dataguard machine.
| $ rman target / RMAN> backup device type disk format ‘/backup/rman/%U’ database plus archivelog; RMAN> backup device type disk format ‘/backup/rman/%U’ current controlfile for standby; |
| $ scp /backup/rman/* oracle@dgmachine:/backup/rman/ |
2. Prepare configuration files
2.1 Create service and prepare tnsnames.ora file.
We want seamless failover to the standby database in case if primary RAC is unreachable which mean that we should not change any configuration on both clients and server side to failover or switchover to the physical standby database. This require us to create services on primary and standby database. (This way, you can maintain inentical configuration files on both primary and standby database.)
| (On Primary Server) $ srvctl add service -d FXPRET -s FXPRET_OLTP -r “db01,pret,db02.pret” -P BASIC(On Standby Server) * Add service_names parameter. SQL> alter system set service_names=”FXPRET,FXPRET_STDBY”; |
| $ cat tnsnames.ora FXPRET = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = db01-vip.pret)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = db02-vip.pret)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = pret-db3)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = FXPRET_OLTP) ) )FXPRET_STDBY = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = db01-vip.pret)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = db02-vip.pret)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = pret-db3)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = FXPRET_STDBY) ) ) |
2.2 Prepare password and parameter file.
* To enable secure transmission of redo data, make sure the primary and physical standby databases use a password file, and make sure the password for SYS user is identical on both systems.
$orapwd file=orapwFXPRET password=oracle entries=30
* Add below below dataguard specific parameters on both primary and standby database in case when primary become a standby database.
| *.remote_login_passwordfile=’exclusive’ *.fal_client=’FXPRET_STDBY’ *.fal_server=’FXPRET’ *.log_archive_config=’DG_CONFIG=(FXPRET,FXPRET_STDBY)’ *.standby_archive_dest=’/oracle/archive/FXPRET’ |
Add below physical dataguard specific to the physical dataguard machine’s parameter file.
| *.db_create_file_dest=’/oradata *.db_create_online_log_dest_1=’/oradata *.db_create_online_log_dest_2=’/oradata |
* Create spfile from pfile on physical standby database.
3. Duplicate standby database from the primary database’s backup.
| SQL> startup nomount $ rman target sys/oracle@fxpret auxiliary / RMAN> duplicate target database for standby; |
4. Create standby log files on primary and standby servers.
It is recommended to create standby logfile on both primary and standby servers. It is also recommended to create standby logfile for each thread if your primary is RAC.
| SQL> alter database add standby logfile thread 1; SQL> alter database add standby logfile thread 1; SQL> alter database add standby logfile thread 1; SQL> alter database add standby logfile thread 2; SQL> alter database add standby logfile thread 2; SQL> alter database add standby logfile thread 2; |
5. Start managed recovery mode
| SQL> alter database recover managed standby database using current logfile disconnect from session; |
Script to Collect Data Guard Physical Standby Diagnostic Information
Overview
——–
This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.
Script Notes
————-
This script is intended to be run via sqlplus as the SYS or Internal user.
Script
——-
- – - – - – - – - – - – - – - – Script begins here – - – - – - – - – - – - – - – -
– NAME: DG_phy_stby_diag.sql
– ————————————————————————
– AUTHOR:
– Michael Smith – Oracle Support Services – DataServer Group
– Copyright 2002, Oracle Corporation
– ————————————————————————
– PURPOSE:
– This script is to be used to assist in collection information to help
– troubeshoot Data Guard issues.
– ————————————————————————
– DISCLAIMER:
– This script is provided for educational purposes only. It is NOT
– supported by Oracle World Wide Technical Support.
– The script has been tested and appears to work as intended.
– You should always run new scripts on a test instance initially.
– ————————————————————————
– Script output is as follows:
set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,’Mondd_hhmi’) timecol,
‘.out’ spool_extension from sys.dual;
column output new_value dbname
select value || ‘_’ output
from v$parameter where name = ‘db_name’;
spool dgdiag_phystby_&&dbname&×tamp&&suffix
set lines 200
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = ‘MON-DD-YYYY HH24:MI:SS’;
set feedback on
select to_char(sysdate) time from dual;
set echo on
–
– ARCHIVER can be (STOPPED | STARTED | FAILED) FAILED means that the archiver failed
– to archive a — log last time, but will try again within 5 minutes. LOG_SWITCH_WAIT
– The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log switching is waiting for. Note that
– if ALTER SYSTEM SWITCH LOGFILE is hung, but there is room in the current online
– redo log, then value is NULL
column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait from v$instance;
– The following select will give us the generic information about how this standby is
– setup. The database_role should be standby as that is what this script is intended
– to be ran on. If protection_level is different than protection_mode then for some
– reason the mode listed in protection_mode experienced a need to downgrade. Once the
– error condition has been corrected the protection_level should match the protection_mode
– after the next log switch.
column ROLE format a7 tru
select name,database_role,log_mode,controlfile_type,protection_mode,protection_level
from v$database;
– Force logging is not mandatory but is recommended. Supplemental logging should be enabled
– on the standby if a logical standby is in the configuration. During normal
– operations it is acceptable for SWITCHOVER_STATUS to be SESSIONS ACTIVE or NOT ALLOWED.
column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker from v$database;
– This query produces a list of all archive destinations and shows if they are enabled,
– what process is servicing that destination, if the destination is local or remote,
– and if remote what the current mount ID is. For a physical standby we should have at
– least one remote destination that points the primary set but it should be deferred.
COLUMN destination FORMAT A35 WRAP
column process format a7
column archiver format a8
column ID format 99
select dest_id “ID”,destination,status,target,
archiver,schedule,process,mountid
from v$archive_dest;
– If the protection mode of the standby is set to anything higher than max performance
– then we need to make sure the remote destination that points to the primary is set
– with the correct options else we will have issues during switchover.
select dest_id,process,transmit_mode,async_blocks,
net_timeout,delay_mins,reopen_secs,register,binding
from v$archive_dest;
– The following select will show any errors that occured the last time an attempt to
– archive to the destination was attempted. If ERROR is blank and status is VALID then
– the archive completed correctly.
column error format a55 tru
select dest_id,status,error from v$archive_dest;
– Determine if any error conditions have been reached by querying thev$dataguard_status
– view (view only available in 9.2.0 and above):
column message format a80
select message, timestamp
from v$dataguard_status
where severity in (‘Error’,'Fatal’)
order by timestamp;
– The following query is ran to get the status of the SRL’s on the standby. If the
– primary is archiving with the LGWR process and SRL’s are present (in the correct
– number and size) then we should see a group# active.
select group#,sequence#,bytes,used,archived,status from v$standby_log;
– The above SRL’s should match in number and in size with the ORL’s returned below:
select group#,thread#,sequence#,bytes,archived,status from v$log;
– Query v$managed_standby to see the status of processes involved in the
– configuration.
select process,status,client_process,sequence#,block#,active_agents,known_agents
from v$managed_standby;
– Verify that the last sequence# received and the last sequence# applied to standby
– database.
select max(al.sequence#) “Last Seq Recieved”, max(lh.sequence#) “Last Seq Applied”
from v$archived_log al, v$log_history lh;
– The V$ARCHIVE_GAP fixed view on a physical standby database only returns the next
– gap that is currently blocking redo apply from continuing. After resolving the
– identified gap and starting redo apply, query the V$ARCHIVE_GAP fixed view again
– on the physical standby database to determine the next gap sequence, if there is
– one.
select * from v$archive_gap;
– Non-default init parameters.
set numwidth 5
column name format a30 tru
column value format a50 wra
select name, value
from v$parameter
where isdefault = ‘FALSE’;
spool off
Dataguard configuration
1. Implementation
1.1 Make Primary database archivelog mode
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /db/archive/FXPROD/DC2
Oldest online log sequence 1244
Next log sequence to archive 1248
Current log sequence 1248
1.2 Create Standby Controlfile
SQL> ALTER DATABASE CREATE STANDBY CONTROLFILE AS ‘/tmp/standby.ctl’;
1.3 Copy Datafiles and Standby Ctonrolfile to standby site.
If the standby is on a different directory structure then you must rename the primary datafiles in the standby control file after copying them to the standby site. This can be done using the db_file_name_consert and log_file_name_convert parameters or by manually using the alter database statements.
1.4 Set the initialization parameters for the primary database
log_archive_dest_1=’LOCATION=LOCATION=/db/archive/FXPROD/DC2’
log_archive_dest_2=’ SERVICE=FXPROD5_STANDBY LGWR SYNC AFFIRM’
log_archive_dest_state_1=enable
log_archive_dest_state_2=enable
log_archive_format=’FXPROD_DC2_%t_%s.arc’
log_archive_start=true
remote_archive_enable=true
fal_client=’FXPROD_STANDBY’
fal_server=’FXPROD5_STANDBY’
standby_file_management=’auto’
All these parameters are not necessary for the initial setup but it is still better to set it up to avoid later configuration.
You can optionally set additional information for the remote archive. log_archive_dest_2=’SERVICE=GMLP2 ARCH SYNC NOAFFIRM delay=0 OPTIONAL max_failure=0 reopen=300 register’
1.5 Set the initialization parameters for the physical standby database
log_archive_dest_1=’LOCATION=/db/archive/FXPROD/DC2’
log_archive_dest_state_1=enable
log_archive_dest_state_2=enable
log_archive_format=’FXPROD_DC2_%t_%s.arc’
log_archive_start=true
remote_archive_enable=true
fal_client=’FXPROD5_STANDBY’
fal_server=’FXPROD_STANDBY’
standby_file_management=’auto’
1.6 Configure tnsnames.ora
Create three TNS entries one is for primary , the other is for standby and the third is for failover purpose.
|
FXPROD_STANDBY = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = prod-db2)(PORT = 1526)) # (ADDRESS = (PROTOCOL = TCP)(HOST = prod-db1)(PORT = 1526)) ) (CONNECT_DATA = (SERVICE_NAME = FXPROD) ) )
FXPROD5_STANDBY = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = prod-db5)(PORT = 1526)) ) (CONNECT_DATA = (SERVICE_NAME = FXPROD) ) ) |
1.7 Start Standby database
SQL> startup nomount
SQL> alter database mount standby database;
1.8 Create standby redo log files, if necessary
Standby redo logs are necessary for the higher protection levels such as guaranteed, Instant, and Rapid. In these protection modes LGWR from the Primary host writes transactions directly to the standby redo logs. This enables no data loss solutions and reduces the amount of data loss in the event of failure. Standby redo logs are not necessary if you are using the delayed protection mode.
If you configure standby redo on the standby then you should also configure standby redo logs on the primary database. Even though the standby redo logs are not used when the database is running in the primary role, configuring the standby redo logs on the primary database is recommended in preparation for an eventual switchover operation.
alter database add standby logfile ‘/data/FXPROD/standby_redo_FXPROD01.log’ size 100M;
alter database add standby logfile ‘/data/FXPROD/standby_redo_FXPROD02.log’ size 100M;
alter database add standby logfile ‘/data/FXPROD/standby_redo_FXPROD03.log’ size 100M;
alter database add standby logfile ‘/data/FXPROD/standby_redo_FXPROD04.log’ size 100M;
alter database add standby logfile ‘/data/FXPROD/standby_redo_FXPROD05.log’ size 100M;
à Verify if the standby redo log files are created.
SQL> select group#, status, type from v$logfile;
1.9 Verify that automatic archiving is occurring
On the primary database switch a log and verify that it has been shipped properly using the v$archive_dest view.
SQL> alter system switch logfile;
SQL> select status, error from v$archive_dest where dest_id = 2;
1.10 Put standby database in managed recovery mode
SQL> recover managed standby database disconnect from session;
à Check Recovery mode from v$archive_dest_status
SQL> select recovery_mode from v$archive_dest_status;
RECOVERY_MODE = IDLE managed recovery not active
RECOVERY_MODE = MANUAL manual media recovery active
RECOVERY_MODE = MANAGED managed recovery is active
2. Failover and Switchover
2.1 SWITCHOVER
à From the Original Primary database
Verify switchover is possible:
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
Required output: ‘TO STANDBY’
Generate switchover items in the primary DB:
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
Result:
Primary DB is closed (but not shut down !!!)
Archiving completes
Inserts end-of-redo marker in redo log and control file
Backs up control file to trace
Shutdown primary DB:
SHUTDOWN NORMAL / IMMEDIATE / TRANSACTIONAL;
Change the init.ora of primary DB for standby role.
Start the instance:
STARTUP NOMOUNT;
Mount this ex-primary DB as a standby DB:
ALTER DATABASE MOUNT STANDBY DATABASE;
…hold off on starting up the standby recovery until the other DB is ready…
à From the Original Standby database
Switch the original standby DB to primary role:
Verify switchover is possible:
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
Required output: “TO PRIMARY”
Generate switchover items in this ex-standby DB:
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL PRIMARY WITH SESSION SHUTDOWN;
Result:
Receives all remaining redo log files
Applies all remaining redo log files up to end-of-redo marker
Converts the local standby control file to the primary one
Automatically creates on-line redo logs
Closes the db, if opened for read-only (but doesn’t shutdown!!!)
Shutdown this ex-standby DB:
SHUTDOWN NORMAL / IMMEDIATE / TRANSACTIONAL;
Change the init.ora of this ex-standby DB for primary role.
Startup the instance:
STARTUP;
Now go back to the new standby DB and start recovery:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
2.2 FAILOVER
A Failover can be performed when all or most of the information until the unavailability of the Primary Database was propageted to the Standby. The usage of Standby RedoLogs ia a great advantage here. If you have no Standby RedoLogs available, you will always encounter some Data Loss (depending on the Changes since the latest LogSwitch). To perform a Failover just follow these steps:
- The Primary Database is down for any reason
- Verify a Standby RedoLog is in use for Primary current Online RedoLog. You then find in the ALERT.LOG of the Standby something like:
RFS: Successfully opened standby logfile :'C:ORACLEORADATAPRIMARYSTBY01.LOG'
- If this is the case run the following commands:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
This cancels the normal managed Recovery. To get the Standby RedoLog Information is still required. Therefore issue this command:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
- If a Standby RedoLog is not used for any reason, then run this one:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH SKIP STANDBY LOGFILE;
Please keep in mind that this one causes (Minimal) Data Loss as the latestet
information from the down Primary Database is not available anymore.
- Once this is complete (This performs a complete Recovery or incomplete Recovery until the last SCN included in the latest archived Log available at the Standby), you can now make the Standby Database a Primary:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
WARNING: This will only succeed if the correct RECOVER FINISH-statement was issued before. If you forgot the 'SKIP STANDBY LOGFILE' although you have no Standby RedoLogs, the COMMIT to Switchover will fail with the error that more Media Recovery is required here.
- If the COMMIT TO SWITCHOVER fails for any reason you have to use the ACTIVATE command which forces the Failover (and may cause Data Loss !!)
SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;
- Shutdown and restart the Databse after this command ended successfully:
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP
- Now the Standby is open as a new Primary Database
leave a comment