Automatic Failback of a Service in an Oracle 19c RAC Database
High availability of database services has been a feature of Oracle Real Application Clusters (RAC) for many versions.
Basically, when a database instance fails, a service that has this instance as its preferred instance fails over to another available instance.
Unfortunately, the service did not fail back to the original instance once that instance was up again.
The administrator had to relocate the service manually. This has changed with Oracle Database 19c.
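For reference, the manual relocation mentioned above can be sketched as a small shell helper around "srvctl relocate service". The function name relocate_service and the DRY_RUN switch are my own additions for illustration, and the database, service and instance names in the usage comment are only examples.

```shell
# Relocate a service from one instance to another (administrator-managed DB).
# relocate_service and DRY_RUN are illustrative helpers, not part of srvctl.
# With DRY_RUN=1 the command is only printed, not executed.
relocate_service() {
    db=$1 service=$2 from=$3 to=$4
    # Build the srvctl call as the function's own argument list.
    set -- srvctl relocate service -db "$db" -service "$service" \
        -oldinst "$from" -newinst "$to"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (example names):
#   relocate_service RACDB FOTEST RACDB01 RACDB02
```

With "-failback YES" in place, this manual step should no longer be needed after the preferred instance comes back.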
The test environment is a three-node Oracle 19c cluster.
Both Grid Infrastructure and RDBMS are Oracle 19.3.0.0.0.
An administrator-managed database is running on all nodes:
oracle@rac01$] srvctl status database -db RACDB
Instance RACDB01 is running on node rac01
Instance RACDB02 is running on node rac02
Instance RACDB03 is running on node rac03
Let’s create a simple service for this database:
oracle@rac01$] srvctl add service -db RACDB -service FOTEST -preferred RACDB02 -available RACDB01 -failback YES
Please note the new option “-failback YES”.
It makes the service fail back to its preferred instance (in my case “RACDB02”) as soon as that instance is available again.
The default is “NO”, i.e. Oracle keeps the old behavior unless you ask for failback explicitly.
oracle@rac01$] srvctl start service -db RACDB -service FOTEST
oracle@rac01$] srvctl status service -db RACDB -service FOTEST
Service FOTEST is running on instance(s) RACDB02
oracle@rac01$] srvctl config service -db RACDB -service FOTEST
Service name: FOTEST
Server pool:
Cardinality: 1
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Global: false
Commit Outcome: false
Failover type:
Failover method:
Failover retries:
Failover delay:
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name:
Hub service:
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Failback : true
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
Preferred instances: RACDB02
Available instances: RACDB01
CSS critical: no
Service uses Java: false
The line “Failback : true” shows that failback is configured.
Unfortunately, when failback is not configured, the output contains no “Failback : false” line; the line is simply missing.
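Because the line is missing when failback is off, a script cannot simply grep for “false”. A minimal sketch of a check that treats a missing line as “false” could look like this; the function name failback_enabled is my own, and the pattern assumes the output format shown above.

```shell
# Read "srvctl config service" output on stdin and print "true" if it
# contains a "Failback : true" line, otherwise "false".
# A missing Failback line is treated as "false".
failback_enabled() {
    if grep -Eiq '^[[:space:]]*Failback[[:space:]]*:[[:space:]]*true[[:space:]]*$'; then
        echo true
    else
        echo false
    fi
}

# Usage:
#   srvctl config service -db RACDB -service FOTEST | failback_enabled
```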
Let’s reboot node rac02 (which hosts instance RACDB02) in another session and see what happens.
While the instance is down on node rac02, the service runs on node rac01 (instance RACDB01). This is the expected, well-known behavior:
oracle@rac01$] srvctl status service -db RACDB -service FOTEST
Service FOTEST is running on instance(s) RACDB01
It takes a while for node rac02 to reboot and to start the whole Grid Infrastructure stack, but after some time, without any intervention by the DBA:
oracle@rac01$] srvctl status service -db RACDB -service FOTEST
Service FOTEST is running on instance(s) RACDB02
The service is back on instance RACDB02 again.
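Instead of re-running the status command by hand, the wait can be scripted. The following is only a sketch: the function names, the default of 60 attempts and the 10-second interval are my own assumptions, and the parsing relies on the status line format shown above.

```shell
# Print the current status line of the service (names as used in this post).
service_status() {
    srvctl status service -db RACDB -service FOTEST
}

# Extract the instance list from a "srvctl status service" line on stdin.
service_instances() {
    sed -n 's/^Service .* is running on instance(s) \(.*\)$/\1/p'
}

# Poll until the service runs on the expected instance.
# Arguments: expected instance, number of attempts, seconds between attempts.
wait_for_failback() {
    expected=$1
    attempts=${2:-60}
    interval=${3:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        current=$(service_status | service_instances)
        if [ "$current" = "$expected" ]; then
            echo "Service is back on $expected"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "Service did not return to $expected" >&2
    return 1
}

# Usage:
#   wait_for_failback RACDB02
```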
To sum up, a feature that RAC administrators have wanted for years has finally been implemented in Oracle 19c, and it works fine.