Backup best practices for Oracle clusterware

I recommend backing up the clusterware-related files after the initial setup and after every change. These backups can save you from OCR or OLR corruption during a GI patch. If any of the files becomes corrupted, you will be able to recover it in minutes (or even seconds). Without a backup, depending on the failure, you may lose several hours recovering your cluster to the state it was in before the failure.

Here are the steps to protect your cluster:

1. Back up the ASM spfile initially and after every change.

There are several ways to back up the ASM spfile: the ASMCMD spcopy and spbackup commands, or create pfile='<backup location>' from spfile in SQL*Plus.

To locate the Oracle ASM SPFILE, use the ASMCMD spget command:

ASMCMD> spget
+GRID/myrac/ASMPARAMETERFILE/registry.253.974466047

Copy the Oracle ASM SPFILE to the backup location:

ASMCMD> spbackup +GRID/myrac/ASMPARAMETERFILE/registry.253.974466047 /backup/spfileasm.ora

2. Backing up the ASM password file once should be enough. If you change the password of a password file user or add another user to the list, make a new backup.

Locate the password file using the ASMCMD pwget command.

ASMCMD> pwget --asm
+GRID/orapwASM

Back up the password file to another location with the pwcopy command.

ASMCMD> pwcopy +GRID/orapwASM  /backup/orapwASM 
copying +GRID/orapwASM -> /backup/orapwASM

3. Use the md_backup command to create a backup file containing the metadata for one or more disk groups.

To back up the metadata for all disk groups, do the following:

ASMCMD> md_backup /tmp/dgmetabackup

Disk group metadata to be backed up: DATA
Disk group metadata to be backed up: FRA
Disk group metadata to be backed up: GRID

If you need to back up the metadata of a specific disk group only, use the -G option.
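As a small sketch of the -G option, the helper below only prints the asmcmd command line it would run, so you can review it before executing it as the grid user. The function name md_backup_cmd and the backup file paths are my own assumptions for illustration; -G takes a comma-separated list of disk groups.

```shell
#!/bin/sh
# Hypothetical helper: print the asmcmd md_backup command for the given
# backup file and, optionally, a list of disk groups (passed via -G).
md_backup_cmd() {
    backup_file=$1
    shift
    # Join the remaining arguments (disk group names) with commas.
    groups=$(IFS=,; echo "$*")
    if [ -n "$groups" ]; then
        printf "asmcmd md_backup %s -G '%s'\n" "$backup_file" "$groups"
    else
        printf 'asmcmd md_backup %s\n' "$backup_file"
    fi
}

# Example paths are assumptions; nothing is executed, only printed.
md_backup_cmd /tmp/dgmetabackup_data DATA
md_backup_cmd /tmp/dgmetabackup_data_fra DATA FRA
```

The printed line can be pasted into a shell where the ASM environment is set.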

4. Back up the OLR on each node.

If the OLR is missing or corrupted, clusterware can’t be started on that node. So make a manual backup initially and after every change:

Do the following on each node:

# ocrconfig -local -manualbackup

Copy the generated file to the backup location:

# cp /u01/app/12.2.0/grid/cdata/rac1/backup_20180510_230359.olr /backup/

Or change the default backup location to /backup before making the actual backup:

# ocrconfig -local -backuploc /backup

# ocrconfig -local -manualbackup
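If you keep the default location and copy the file yourself, the newest OLR backup can be picked by name, because the timestamp in backup_YYYYMMDD_HHMMSS.olr sorts chronologically. The sketch below is only an illustration: the function name latest_olr_backup and the OLR_DIR and BACKUP_DIR variables are my assumptions, and the cp command is printed rather than executed.

```shell
#!/bin/sh
# Print the newest backup_YYYYMMDD_HHMMSS.olr file in a directory.
# The timestamp in the name sorts lexicographically, so a plain sort
# is enough to find the latest file.
latest_olr_backup() {
    ls "$1"/backup_*.olr 2>/dev/null | LC_ALL=C sort | tail -n 1
}

# Assumed locations, taken from the example above; adjust them per node.
OLR_DIR=${OLR_DIR:-/u01/app/12.2.0/grid/cdata/rac1}
BACKUP_DIR=${BACKUP_DIR:-/backup}

latest=$(latest_olr_backup "$OLR_DIR")
if [ -n "$latest" ]; then
    # Printed only; run the command yourself after reviewing it.
    echo "cp $latest $BACKUP_DIR/"
else
    echo "no OLR backup found in $OLR_DIR"
fi
```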

5. Mirror and back up the OCR.

You should configure the OCR in two independent disk groups, typically the work area and the recovery area. At least two OCR locations should be configured.

# ocrconfig -add +FRA

Clusterware takes automatic OCR backups every 4 hours and keeps the last three (4, 8, and 12 hours old), plus one from the last day and one from the last week.

You can also back up the OCR manually before applying a patch or upgrading the GI home:

# ocrconfig -manualbackup

Regularly copy the backups to another location as follows:

Identify the latest backup (manual or automatic):

[grid@rac1 ~]$ ocrconfig -showbackup
rac1 2018/05/10 13:06:18 +GRID:/myrac/OCRBACKUP/backup00.ocr.289.975762375 830990544
..

Copy it to the backup location:

$ ocrconfig -copy +GRID:/myrac/OCRBACKUP/backup00.ocr.289.975762375 /backup/backup00.ocr

Or change the default backup location to a disk group other than GRID:

# ocrconfig -backuploc +FRA
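For the "identify the latest backup" step, the path can also be extracted from the ocrconfig -showbackup listing with a small filter. This is a sketch that assumes every backup line has the five-column layout shown above (node, date, time, path, id); the function name latest_ocr_backup and the second sample line are made up for illustration.

```shell
#!/bin/sh
# Read `ocrconfig -showbackup` output on stdin and print the path of the
# most recent backup. Backup lines are expected to look like:
#   <node> <YYYY/MM/DD> <HH:MM:SS> <backup path> <id>
# Other lines (blank lines, messages) are ignored.
latest_ocr_backup() {
    awk '$4 ~ /\.ocr/ { print $2, $3, $4 }' | LC_ALL=C sort | tail -n 1 | awk '{ print $3 }'
}

# Sample input: the first line is from the listing above, the second one
# is an invented older entry.
printf '%s\n' \
  'rac1 2018/05/10 13:06:18 +GRID:/myrac/OCRBACKUP/backup00.ocr.289.975762375 830990544' \
  'rac1 2018/05/10 09:06:10 +GRID:/myrac/OCRBACKUP/backup01.ocr.290.975748001 830990544' \
  | latest_ocr_backup
```

On a cluster node you would feed it the real listing, for example ocrconfig -showbackup | latest_ocr_backup, and pass the result to ocrconfig -copy.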



ORA-01103: when creating a Standby Database on the same Host as the Primary Database

Typically the standby and the primary databases are located on different hosts to ensure full DR capabilities. However, there are situations where you want to have the primary and the standby database on the same host.

Problem #1: You are not able to start two databases with the same SID on the same server.

Problem #2: You cannot change db_name, because it is stored in the controlfile. If you try to duplicate the standby database from the primary using a different db_name, you will get the following error:

ORA-01103: database name 'orcldgst' in control file is not 'orcldg'

Assume db_name=orcldg and the ORACLE_SID of the primary is orcldg1. To solve problem #1 and problem #2, do the following:

db_name must be the same for both databases. But when you start the standby database in nomount mode, set ORACLE_SID to a different value:

$ export ORACLE_SID=orcldgst1
$ sqlplus / as sysdba
SQL> startup nomount pfile='/tmp/mypfile.ora'

After that you will be able to run the RMAN duplicate command to create the standby database on the same host as the primary.
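For illustration, a minimal parameter file for the nomount step could be generated like this. It is a sketch under assumptions: db_name comes from the example above, while the db_unique_name value orcldgst is my assumption (it only has to differ from the primary), and a real standby will usually need more parameters than these two.

```shell
#!/bin/sh
# Write a minimal parameter file for the standby instance.
# db_name matches the primary; db_unique_name (an assumed value) differs.
PFILE=${PFILE:-/tmp/mypfile.ora}

cat > "$PFILE" <<'EOF'
db_name=orcldg
db_unique_name=orcldgst
EOF

# Show what was written so it can be reviewed before startup nomount.
cat "$PFILE"
```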

Add filegroup fails with ORA-15067: command or option incompatible with diskgroup redundancy

Problem:

I was trying to add filegroup to the FRA diskgroup:

SQL> alter diskgroup FRA add filegroup high_filegroup database orcl set 'datafile.redundancy' = 'HIGH';

Error:

ORA-15067: command or option incompatible with diskgroup redundancy

Troubleshooting:

Checking diskgroup type:

SQL> select name,type,compatibility,database_compatibility from v$asm_diskgroup where name='FRA';

NAME  TYPE    COMPATIBILITY  DATABASE_COMPATIBILITY
----- ------- -------------- ----------------------
FRA   NORMAL  18.0.0.0.0     12.2.0.1.0

Solution:

Change diskgroup type to FLEX:

SQL> alter diskgroup FRA convert redundancy to flex;
Diskgroup altered.

Check that type was changed:

SQL> select name,type,compatibility,database_compatibility from v$asm_diskgroup where name='FRA';

NAME  TYPE    COMPATIBILITY  DATABASE_COMPATIBILITY
----- ------- -------------- ----------------------
FRA   FLEX    18.0.0.0.0     12.2.0.1.0

Adding filegroup succeeds:

SQL> alter diskgroup FRA add filegroup high_filegroup database orcl set 'datafile.redundancy' = 'HIGH';
Diskgroup altered.

SRVCTL: CRS-2678, CRS-0267, CRS-5802: Unable to start the agent process

We had the following problem at a customer site:

srvctl start database -db dbname was failing on one of the cluster nodes with the following error:

[oracle@node1 ~]$ srvctl start database -db dbname
PRCR-1079 : Failed to start resource ora.dbname.db
CRS-2674: Start of 'ora.dbname.db' on 'rac1' failed
CRS-2678: 'ora.dbname.db' on 'rac1' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5802: Unable to start the agent process

But at the same time we were able to start the database using sqlplus:

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> startup

ORACLE instance started.

Total System Global Area 1577058304 bytes
Fixed Size 8621136 bytes
Variable Size 805307312 bytes
Database Buffers 754974720 bytes
Redo Buffers 8155136 bytes
Database mounted.
Database opened.

It was strange, and it took me a lot of time to troubleshoot this issue.

I tried many things:
* removed the srvctl configuration using srvctl remove database -db orcl
* re-added it using srvctl add database -db orcl
* re-added the instances
* restarted CRS and even the servers
but with no luck.

Then I found Doc ID 1957360.1 on the Oracle support site and managed to reproduce the same problem on my lab servers.

I changed the ownership of the following files on only one node of my test cluster:

[root@rac1 ~]# ll /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc
-rw-r--r-- 1 oracle oinstall 1085 Sep 5 20:17 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc
[root@rac1 ~]# ll /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid
-rw-r--r-- 1 oracle oinstall 6 Sep 5 20:17 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid
[root@rac1 ~]# chown root:root /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid
[root@rac1 ~]# chown root:root /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc

I tried to start the instance using sqlplus and it was successful:

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> startup

ORACLE instance started
Database mounted.
Database opened.

Then I stopped the database and tried with srvctl.

After a long wait it failed:

[oracle@rac1 ~]$ srvctl start database -db orcl
PRCR-1079 : Failed to start resource ora.orcl.db
CRS-2674: Start of 'ora.orcl.db' on 'rac1' failed
CRS-2678: 'ora.orcl.db' on 'rac1' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5802: Unable to start the agent process

I also checked the customer logs and found that the files crsd_oraagent_oracle.pid and crsd_oraagent_oracleOUT.trc had not been updated for a long time; they were older than the other files.

So to solve this problem, assign the correct owner, group and access permissions to these two files, and you will be able to start the database using srvctl.

[root@rac1 ~]# chown oracle:oinstall /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid
[root@rac1 ~]# chown oracle:oinstall /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc
[root@rac1 ~]# chmod 644 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid
[root@rac1 ~]# chmod 644 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc
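To spot this kind of problem on other nodes, a quick ownership check with find may help. This is only a sketch: the function name list_misowned, the file name pattern and the default directory and owner are my assumptions based on the paths above.

```shell
#!/bin/sh
# List agent files in a directory that are NOT owned by the expected user.
# It prints nothing when the ownership is right.
#   $1 = directory, $2 = expected owner (name or numeric uid),
#   $3 = optional file name pattern (defaults to that owner's agent files)
list_misowned() {
    dir=$1
    owner=$2
    pattern=${3:-crsd_oraagent_${owner}*}
    find "$dir" -type f -name "$pattern" ! -user "$owner"
}

# Assumed defaults based on the paths and the owner shown above.
OUT_DIR=${OUT_DIR:-/u01/app/grid/crsdata/rac1/output}
EXPECTED_OWNER=${EXPECTED_OWNER:-oracle}

list_misowned "$OUT_DIR" "$EXPECTED_OWNER" 2>/dev/null \
    || echo "could not scan $OUT_DIR"
```

Any file it prints is a candidate for the chown and chmod fix above.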

You may never hit these errors, but if you do, you now know how to solve them.

Daylight saving time support in Oracle CRS

Dear readers,

I am glad to announce that my blog has been included in the Top 50 Oracle Blogs. For more information about the Top 100 Oracle Blogs And Websites for Oracle DBAs To Follow in 2018, please visit https://blog.feedspot.com/oracle_blogs. You will improve your knowledge and experience by following them.

In this post, I want to share how I solved the daylight saving time problem with Oracle CRS. With the default setup, if the timezone changes on your system, clients and applications that connect to the database remotely will still get the old timezone information and will enter wrong data (local/BEQ connections have the correct timezone).

Countries that do not observe daylight saving time are lucky and do not have to worry about it. But if your servers are not located in one of these lucky countries, you must make CRS DST aware.

During the GI installation, Oracle saves the timezone information in the $CRS_HOME/crs/install/s_crsconfig_hostname_env.txt file, so the TZ for CRS does not change even if it is changed at the OS level.

Please note that the timezone can be changed for a single database using srvctl:

srvctl setenv database -db dbname -env 'TZ=time zone'

But I do not recommend doing that, because you must repeat it every time you create a new database.
It is better to change TZ globally at the CRS level.

In short, commenting out the TZ variable in $CRS_HOME/crs/install/s_crsconfig_hostname_env.txt and restarting CRS once on each node is enough, but let’s check it.

1.  List the current timezone settings:

[root@rac1 ~]# timedatectl status|grep zone
Time zone: UTC (UTC, +0000)
[root@rac2 ~]#  timedatectl status|grep zone
Time zone: UTC (UTC, +0000)

2. Change timezone at OS level:

[root@rac1 ~]# timedatectl set-timezone Europe/Bratislava
[root@rac2 ~]# timedatectl set-timezone Europe/Bratislava

3. Check local and SCAN connections:

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> select to_char(sysdate,'HH24:MI:SS AM')  dbtime from dual;

DBTIME
-----------
18:50:05 PM     <<<<<<<<<<<<Correct , same as OS

[oracle@rac1 ~]$ sqlplus marik/123@ORCL

SQL> select to_char(sysdate,'HH24:MI:SS AM') dbtime from dual;

DBTIME
-----------
16:50:10 PM     <<<<<<<<<<<<Incorrect

4. Comment out TZ in the config file:

[root@rac1 ~]# cat /u01/app/18.3.0/grid/crs/install/s_crsconfig_rac1_env.txt|grep TZ=
#   the appropriate time zone name. For example, TZ=America/New_York
#TZ=UTC

[root@rac2 ~]# cat /u01/app/18.3.0/grid/crs/install/s_crsconfig_rac2_env.txt|grep TZ=
#   the appropriate time zone name. For example, TZ=America/New_York
#TZ=UTC
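The listing in step 4 shows the file after the change. One way to make the edit itself is with sed, keeping a backup copy of the file. This is a sketch: the function name comment_out_tz is my own, and the demonstration runs on a scratch file rather than on the real s_crsconfig_<hostname>_env.txt.

```shell
#!/bin/sh
# Comment out an active TZ= line in a CRS environment file,
# keeping a .bak copy of the original next to it.
comment_out_tz() {
    sed -i.bak 's/^TZ=/#TZ=/' "$1"
}

# Demonstration on a scratch file with the same two lines as above.
demo=$(mktemp)
printf '%s\n' \
    '#   the appropriate time zone name. For example, TZ=America/New_York' \
    'TZ=UTC' > "$demo"
comment_out_tz "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On a real node you would run it as root against the file of that node, for example comment_out_tz /u01/app/18.3.0/grid/crs/install/s_crsconfig_rac1_env.txt.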

5. Restart CRS on both nodes:

[root@rac1 ~]#  crsctl stop crs
[root@rac1 ~]#  crsctl start crs -wait
[root@rac2 ~]#  crsctl stop crs
[root@rac2 ~]#  crsctl start crs -wait

6. Change the timezone at the OS level several times and check local and SCAN connections:

[root@rac1 ~]# timedatectl set-timezone Africa/Conakry
[root@rac2 ~]# timedatectl set-timezone Africa/Conakry

Important: You need to reconnect to the database. Existing sessions keep the old settings, so they must be disconnected and reconnected.

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> Select to_char(sysdate,'HH24:MI:SS AM') dbtime from dual;

DBTIME
-----------
17:15:56 PM <<<<<<<<<<<<Correct


[oracle@rac1 ~]$ sqlplus marik/123@ORCL

SQL> Select to_char(sysdate,'HH24:MI:SS AM') dbtime from dual;

DBTIME
-----------
17:15:27 PM <<<<<<<<<<<<Correct

Change one more time:

[root@rac1 ~]# timedatectl set-timezone America/Aruba
[root@rac2 ~]# timedatectl set-timezone America/Aruba

Exit connections and reconnect:

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> Select to_char(sysdate,'HH24:MI:SS AM') dbtime from dual;

DBTIME
-----------
13:17:47 PM <<<<<<<<<<<<Correct

[oracle@rac1 ~]$ sqlplus marik/123@ORCL

SQL> Select to_char(sysdate,'HH24:MI:SS AM') dbtime from dual;

DBTIME
-----------
13:17:31 PM <<<<<<<<<<<<Correct

Downloading Oracle files on Linux via wget

There are several ways to download files from the Oracle site.

We will use one of them to download the Oracle Proactive Bundle Patch on a Linux machine.

First of all, find the desired file and copy its link address.

Run wget with the following parameters:

# wget --http-user=mariam.kupa@gmail.com --ask-password  "https://updates.oracle.com/Orion/Services/download/p27968010_121020_Linux-x86-64.zip?aru=22331652&patch_file=p27968010_121020_Linux-x86-64.zip" -O p27968010_121020_Linux-x86-64.zip
Password:

That’s it!


Linux: Rename files from uppercase to lowercase

If you have downloaded the Oracle 18c installation files, you may need to change the downloaded file names from uppercase to lowercase. 🙂

[root@rac1 ~]# cd /sw
[root@rac1 sw]# for i in LINUX.X64_180000_*; do mv "$i" "$(echo "$i" | tr '[:upper:]' '[:lower:]')"; done
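If you want the same loop as a reusable function, here is a slightly more defensive sketch. The function name lowercase_files is my own; it lowercases only the file name (not the directory part) and skips a file when the lowercase name already exists, instead of overwriting it.

```shell
#!/bin/sh
# Rename each given file to its lowercase name.
# Files that are already lowercase, or whose lowercase name is taken,
# are skipped instead of being overwritten.
lowercase_files() {
    for f in "$@"; do
        # An unmatched glob is passed through literally; ignore it.
        [ -e "$f" ] || continue
        dir=$(dirname "$f")
        base=$(basename "$f")
        lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        [ "$base" = "$lower" ] && continue
        if [ -e "$dir/$lower" ]; then
            echo "skip $f: $dir/$lower already exists" >&2
            continue
        fi
        mv "$f" "$dir/$lower"
    done
}

# Same files as in the one-liner above; does nothing if /sw has no match.
lowercase_files /sw/LINUX.X64_180000_*
```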

You may think: these are just two files, so why do I need a script? I can do it manually.. 🙂
You are right, but scripting is much more fun. Good luck!