Q What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application
Clusters (RAC) 11g Release 2 feature that provides a single name for clients to
access an Oracle Database running in a cluster. The benefit is that clients
using SCAN do not need to change their configuration when you add or remove
nodes in the cluster.
Q What is dynamic remastering? When will dynamic remastering happen?
Dynamic remastering is the ability to move the ownership of a resource
from one instance to another instance in RAC. Dynamic resource remastering is
used to implement resource affinity for increased performance. Resource
affinity optimizes the system in situations where update transactions are being
executed mostly on one instance. When activity shifts to another instance, the
resource affinity correspondingly moves to that instance. If activity is not
localized, resource ownership is hashed across the instances.
In 10g, dynamic remastering happens at the file and object level. The criteria
for remastering are very stringent: one instance must touch the object 50 times
more than any other instance within a particular period (say 10 minutes). The
touch ratio and the time window can be tuned with the _gc_affinity_limit and
_gc_affinity_time parameters.
Q Why are we required to maintain an odd number of voting disks?
An odd number of disks helps avoid split brain. When nodes in the cluster
cannot talk to each other, they race to lock the voting disks, and whichever
side locks more disks survives. If the number of disks is even, each side
might lock 50% of the disks (2 out of 4), and there is no way to decide which
node to evict. When the number is odd, one side always holds more disks than
the other, so it is easy for the cluster to evict the node holding fewer.
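The arithmetic behind this answer can be sketched in a few lines of shell. This is only an illustration that reads the rule as "a side must hold a strict majority of the voting disks"; the function names votedisk_majority and node_survives are hypothetical helpers, not Oracle tools.

```shell
#!/bin/sh
# Hypothetical helper: strict majority needed out of N voting disks,
# i.e. floor(N / 2) + 1 using integer division.
votedisk_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Hypothetical helper: succeeds if a side that holds $1 disks out of
# $2 total disks reaches the strict majority.
node_survives() {
    [ "$1" -ge "$(votedisk_majority "$2")" ]
}
```

With 4 disks the majority is 3, so two sides holding 2 disks each both fail the check. With 3 disks the majority is 2, and only one side can reach it.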
Q How do you check the health of your RAC database?
The 'crsctl' command can be run as root or as the oracle user to check the
clusterware health, but starting or stopping the clusterware requires root or
another privileged user.
[oracle@TEST_NODE1 ~]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
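A check like this can be scripted. The sketch below assumes the output format shown above, one "... appears healthy" line per daemon; other Clusterware releases print different messages. The function name crs_all_healthy is a hypothetical helper that reads the command output on standard input.

```shell
#!/bin/sh
# Hypothetical helper: reads "crsctl check crs" output on stdin and
# succeeds only if there is output and every non-blank line reports
# "appears healthy" (assumed output format, see the example above).
crs_all_healthy() {
    output=$(cat)
    # Empty output is treated as a failure.
    [ -n "$output" ] || return 1
    # Fail if any non-blank line lacks the "appears healthy" text.
    ! printf '%s\n' "$output" \
        | grep -v '^[[:space:]]*$' \
        | grep -qv 'appears healthy'
}
```

Usage: crsctl check crs | crs_all_healthy && echo "clusterware OK"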
Q How do you check the services on a RAC node?
We can check or start the services with the 'srvctl' command. The example
below brings a load-balanced/TAF service named RAC online.
[oracle@TEST_NODE1 ~]$ srvctl start service -d orcl -s RAC
[oracle@TEST_NODE1 ~]$ crsstat
Q If there is some issue with the virtual IP, how will you troubleshoot
it? How will you change the virtual IP?
To change the VIP (virtual IP) on a RAC node, use the command:
[oracle@testnode oracle]$ srvctl modify nodeapps -A new_address
Q How will you back up your RAC database?
Backup strategy of a RAC database:
A RAC database consists of
2)Voting disk &
3)Database files, controlfiles, redolog files & Archive log files
Q Do you have any idea of load balancing in the application? How is load
balancing done?
Q What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from
Oracle Corporation that ensures high availability of databases by providing
instance failover and media failover features.
Q What is RAC and how is it different from non RAC databases?
RAC stands for Real Application Clusters. You have n instances, each running
on its own separate node, all based on shared storage. The cluster is the key
component: a collection of servers operating as one unit. RAC is the best
solution for high performance and high availability. Non-RAC databases have a
single point of failure in case of hardware failure or server crash.
Q Give the usage of srvctl ?
srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount
Q Mention the Oracle RAC software components ?
Oracle RAC is composed of two or more database instances. They are composed
of memory structures and background processes, the same as a single-instance
database. Oracle RAC instances use two processes, GES (Global Enqueue Service)
and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances
are composed of the following background processes:
ACMS: Atomic Controlfile to Memory Service
GTX0-j: Global Transaction Process
LMON: Global Enqueue Service Monitor
LMD: Global Enqueue Service Daemon
LMS: Global Cache Service Process
LCK0: Instance Enqueue Process
RMSn: Oracle RAC Management Processes
RSMN: Remote Slave Monitor
Q What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain records
of the status of each datafile and each cached block using the Global Resource
Directory. This process is referred to as Cache Fusion and helps ensure data
integrity.
Q What are the different network components in 10g RAC?
We have public, private, and VIP components.
Private interfaces are for inter-node communication. The VIP is all about
availability of the application. When a node fails, its VIP fails over to
some other node. This is why all applications should connect through the VIPs,
which means tns entries should list the VIPs in the host list.
Q Give Details on ACMS:
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC
environment, ACMS is an agent that ensures a distributed SGA memory update,
i.e. SGA updates are globally committed on success or globally aborted in the
event of a failure.
Q What are the major RAC wait events?
In a RAC environment the buffer cache is global across all instances in the
cluster, and hence the processing differs. The most common wait events related
to this are gc cr request and gc buffer busy.
GC CR REQUEST: the time it takes to retrieve the data from the remote cache.
Reason: RAC traffic using a slow connection, or inefficient queries (poorly
tuned queries increase the number of data blocks requested by an Oracle
session; the more blocks requested, the more often a block will need to be read
from a remote instance via the interconnect).
GC BUFFER BUSY: the time the remote instance locally spends accessing the
requested data block.
Q Give details on GTX0-j
The process provides transparent support for XA global transactions in a RAC
environment. The database autotunes the number of these processes based on the
workload of XA global transactions.
Q Give details on LMON
This process is called the Global Enqueue Service Monitor. It monitors global
enqueues and resources across the cluster and performs global enqueue recovery
operations.
Q Give details on LMD
This process is called the Global Enqueue Service Daemon. It manages incoming
remote resource requests within each instance.
Q Give details on LMS
This process is called the Global Cache Service Process. It maintains the
status of datafiles and of each cached block by recording information in the
Global Resource Directory (GRD). It also controls the flow of messages to
remote instances, manages global data block access, and transmits block images
between the buffer caches of different instances. This processing is part of
the Cache Fusion feature.
Q Give details on LCK0
This process is called the Instance Enqueue Process. It manages non-Cache
Fusion resource requests such as library and row cache requests.
Q Give details on RMSn
These processes are called the Oracle RAC Management Processes. They perform
manageability tasks for Oracle RAC, including the creation of resources
related to Oracle RAC when new instances are added to the cluster.
Q How to export and import CRS resources while migrating Oracle RAC to a new
server?
The script below reads the OCR of the current RAC and generates a script of
srvctl add commands for the databases, instances, services and 11g listeners.
Save the output of the script and run it on the new RAC.
for DBNAME in $(srvctl config database)
do
  # Generate DB resource
  srvctl config database -d $DBNAME -a | awk -v dbname="$DBNAME" \
  'BEGIN { FS=":" }
  $1~/Oracle home/ || $1~/ORACLE_HOME/ {dbhome = "-o" $2}
  $1~/Spfile/ || $1~/SPFILE/ {spfile = "-p" $2}
  $1~/Disk Groups/ {dg = "-a" $2}
  END { printf "%s %s %s %s %s\n", "srvctl add database -d ", dbname,
        dbhome, spfile, dg }'
  # Generate Instance resource (running and not-running instances)
  srvctl status database -d $DBNAME | awk -v dbname="$DBNAME" \
  '$4~/running/ { printf "%s %s %s %s %s %s\n", "srvctl add instance -d ",
                  dbname, " -i ", $2, " -n ", $7 }
  $5~/running/ { printf "%s %s %s %s %s %s\n", "srvctl add instance -d ",
                 dbname, " -i ", $2, " -n ", $8 }'
  # Modify instance for 10G - ASM dependency
  if [ $(echo $ORACLE_HOME | grep "1020" | wc -l) -eq 1 ]
  then
    srvctl status database -d $DBNAME | awk -v dbname="$DBNAME" \
    '$2~/1$/ { printf "%s %s %s %s %s\n", "srvctl modify instance -d ",
               dbname, " -i ", $2, " -s +ASM1" }
    $2~/2$/ { printf "%s %s %s %s %s\n", "srvctl modify instance -d ",
              dbname, " -i ", $2, " -s +ASM2" }
    $2~/3$/ { printf "%s %s %s %s %s\n", "srvctl modify instance -d ",
              dbname, " -i ", $2, " -s +ASM3" }
    $2~/4$/ { printf "%s %s %s %s %s\n", "srvctl modify instance -d ",
              dbname, " -i ", $2, " -s +ASM4" }'
  fi
  echo "srvctl start database -d $DBNAME"
  # Generate Service resource
  snamelist=$(srvctl status service -d $DBNAME | awk '{print $2}')
  for sname in $snamelist
  do
    srvctl config service -d $DBNAME -s $sname | awk -v dbname="$DBNAME" \
      -v sname=$sname \
    'BEGIN { FS=":" }
    $1~/Preferred instances/ {pref = "-r" $2}
    $1~/PREF/ {pref = "-r" $2; sub(/AVAIL/, "", pref) }
    $1~/Available instances/ {avail = "-a" $2}
    $2~/AVAIL/ {avail = "-a" $3}
    $1~/Failover type/ {ft = "-e" $2}
    $1~/Failover method/ {fm = "-m" $2}
    $1~/Runtime Load Balancing Goal/ {g = "-B" $2}
    END { if (avail == "-a ") {avail = ""}
          printf "%s %s %s %s %s %s %s %s %s %s\n", "srvctl add service -d ",
          dbname, "-s ", sname, pref, avail, ft, fm, g, "-P BASIC" }'
    echo "srvctl start service -d $DBNAME -s $sname"
  done
done
# Listener at 11G home. A 10G listener cannot be added with srvctl.
srvctl config listener | awk \
'BEGIN { FS=":"; state = 0 }
$1~/Name/ {lname = "-l" $2; state = 1}
$1~/Home/ && state == 1 {ohome = "-o" $2; state = 2}
$1~/End points/ && state == 2 {lport = "-p " $3; state = 3}
state == 3 { if (ohome != "-o ") { printf "%s %s %s %s\n",
             "srvctl add listener ", lname, ohome, lport }
             state = 0 }'
Q Give details on RSMN
This process is called the Remote Slave Monitor. It manages background slave
process creation and communication on remote instances. These background slave
processes perform tasks on behalf of a coordinating process running in another
instance.
Q What components in RAC must reside in shared storage?
All datafiles, controlfiles, SPFILEs and redo log files must reside on
cluster-aware shared storage.
Q What is the significance of using cluster-aware shared storage
in an Oracle RAC environment?
All instances of an Oracle RAC database can access all the datafiles, control
files, SPFILEs and redo log files when these files are hosted on cluster-aware
shared storage, which is a group of shared disks.
Q Give a few examples of solutions that support cluster storage
ASM (automatic storage management), raw disk devices, network file system
(NFS), OCFS2 and OCFS (Oracle Cluster File System).
Q What is an interconnect network?
An interconnect network is a private network that connects all of the servers
in a cluster. The interconnect network uses a switch (or multiple switches)
that only the nodes in the cluster can access.
Q How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster
interconnect. On Unix and Linux systems, the UDP and RDS (Reliable Datagram
Sockets) protocols are used by Oracle Clusterware. Windows clusters use the TCP
protocol.
Q Can we use crossover cables with Oracle Clusterware?
No, crossover cables are not supported with Oracle Clusterware interconnects.
Q What is the use of the cluster interconnect?
The cluster interconnect is used by Cache Fusion for inter-instance
communication.
Q How do users connect to database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through
one or more middle tiers, with or without connection pooling. Users can use the
Oracle services feature to connect to the database.
Q What is the use of a service in Oracle RAC environment?
Applications should use the services feature to connect to the Oracle
database. Services enable us to define rules and characteristics to control how
users and applications connect to database instances.
Q What are the characteristics controlled by Oracle services?
The characteristics include a unique name, workload balancing and failover
options, and high availability characteristics.
Q What enables the load balancing of applications in RAC?
Oracle Net
Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.
Q What is a virtual IP address or VIP?
A virtual IP address or VIP is an alternate IP address that client connections
use instead of the standard public IP address. To configure a VIP address, we
need to reserve a spare IP address for each node, and the IP addresses must be
on the same subnet as the public network.
Q What is the use of VIP?
If a node
fails, then the node's VIP address fails over to another node on which the VIP
address can accept TCP connections but it cannot accept Oracle connections.
Q Give situations under which VIP address failover happens
VIP address failover happens when the node on which the VIP address runs
fails, when all interfaces for the VIP address fail, or when all interfaces for
the VIP address are disconnected from the network.
Q What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the
VIP address receive a rapid connection refused error. They don't have to wait
for TCP connection timeout messages.
Q What are the administrative tools used for Oracle RAC
Oracle RAC
cluster can be administered as a single image using OEM(Enterprise
Q How do we verify that RAC instances are running?
Issue the following query from any one node, connecting through SQL*Plus:
$connect sys/sys as sysdba
SQL> select * from V$ACTIVE_INSTANCES;
The query gives the instance number under the INST_NUMBER column and
host_name:instance_name under the INST_NAME column.
Q What is FAN?
Fast Application Notification, abbreviated as FAN, relates to the events
related to instances, services and nodes. It is a notification mechanism that
Oracle RAC uses to notify other processes about configuration and service
level information, including service status changes such as UP or DOWN
events. Applications can respond to FAN events and take immediate action.
Q Where can we apply FAN UP and DOWN events?
FAN UP and DOWN events can be applied to instances, services and nodes.
Q State the use of FAN events in case of a cluster configuration change?
During cluster configuration changes, the Oracle RAC high availability
framework publishes a FAN event immediately when a state change occurs in the
cluster. Applications can therefore receive FAN events and react immediately.
This saves applications from polling the database and detecting a problem
only after such a state change.
Q Why should we have a separate home for the ASM instance?
It is a good practice to keep the ASM home separate from the database home
(ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle
database software independently of each other. Also, we can deinstall the
Oracle database software independently of the ASM instance.
Q What is the advantage of using ASM?
ASM is the Oracle-recommended storage option for RAC databases, as ASM
maximizes performance by managing the storage configuration across the disks.
ASM does this by distributing the database files across all of the available
storage within our cluster database environment.
Q What is rolling upgrade?
It is a new ASM feature in Oracle Database 11g. ASM instances in Oracle
Database 11g (from 11.1) can be upgraded or patched using the rolling upgrade
feature. This enables us to patch or upgrade ASM nodes in a clustered
environment without affecting database availability. During a rolling upgrade
we can maintain a functional cluster while one or more of the nodes in the
cluster are running different software versions.
Q Can rolling upgrade be used to upgrade from 10g to 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).
Q State the initialization parameters that must have the same value
for every instance in an Oracle RAC database
Certain initialization parameters are critical at database creation time and
must have the same values on every instance. Their values must be specified in
the SPFILE or PFILE for every instance. The list of parameters that must be
identical on every instance is given below:
Q What is ORA-00603: ORACLE server session terminated by fatal
error or ORA-29702: error occurred in Cluster Group Service operation?
RAC node name
was listed in the loopback address...
Q Can the DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all
instances?
These parameters need to be identical on all instances only if their values
are set to zero.
Q What two parameters must be set at the time of starting up an ASM instance
in a RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.
Q Mention the components of Oracle Clusterware
Oracle Clusterware is made up of components like the voting disk and the
Oracle Cluster Registry (OCR).
Q What is a CRS resource?
Oracle Clusterware is used to manage high-availability operations in a
cluster. Anything that Oracle Clusterware manages is known as a CRS resource.
Some examples of CRS resources are a database, an instance, a service, a
listener, a VIP address, an application process, etc.
Q What is the use of OCR?
Oracle Clusterware manages CRS resources based on the configuration
information of CRS resources stored in the OCR (Oracle Cluster Registry).
Q How does Oracle Clusterware manage CRS resources?
Oracle Clusterware manages CRS resources based on the configuration
information of CRS resources stored in the OCR (Oracle Cluster Registry).
Q Name some Oracle Clusterware tools and their uses?
OIFCFG - Allocating and deallocating network interfaces
OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry
OCRDUMP - Dumps the contents of the OCR to a file for inspection
CVU - Cluster Verification Utility, used to verify the cluster configuration
and the status of its components
Q What are the modes of deleting instances from Oracle Real
Application Clusters databases?
We can delete instances using silent mode or interactive mode using DBCA
(Database Configuration Assistant).
Q How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or
silent mode. After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name
Q How do we verify that an instance has been removed from OCR
after deleting an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
Q How do we verify an existing current backup of OCR?
We can verify the current backup of OCR using the following command:
ocrconfig -showbackup
Q What are the performance views in an Oracle RAC environment?
We have V$ views that are instance specific. In addition we have GV$ views,
called global views, which have an INST_ID column of numeric data type. GV$
views obtain information from the individual V$ views.
Q What are the types of connection load-balancing?
There are two types of connection load balancing: server-side load balancing
and client-side load balancing.
Q What is the difference between server-side and client-side
connection load balancing?
Client-side load balancing happens at the client, where connection requests
are distributed across the listeners. In server-side load balancing, the
listener uses the load-balancing advisory to redirect connections to the
instance providing the best service.
Q What are the three greatest benefits that RAC provides?
The three main
benefits are availability, scalability, and the ability to use low cost
commodity hardware. RAC allows an application to scale vertically, by adding
CPU, disk and memory resources to an individual server. But RAC also provides
horizontal scalability, which is achieved by adding new nodes into the cluster.
RAC also allows an organization to bring these resources online as they are
needed. This can save a small or midsize organization a lot of money in the
early stages of a project.
In a RAC
environment, if a node in the cluster fails, the application continues to run
on the surviving nodes contained in the cluster. If your application is configured
correctly, most users won't even know that the node they were running on became
Q Tune the following RAC DATABASE (DBNAME=PROD) which is 3 node
CPU 8 CPU 15
What are you
looking for here? What tuning information do you expect?
It is a 3 node cluster with different hardware configuration running RAC.
I would put 20% of the memory for Oracle in each node. So that would mean that
the SGA is different in each of the nodes.
Also since the CPU's are different PROD2 can have more number of max number of
processes as compared to the rest of them.
But as I said this is just configuration, this is not tuning. Question is not
Q Write a sample script for RMAN for the recovery if all the
instances are down. (First explain the procedure for how you will restore.)
1. Bring all nodes down.
2. Start one node.
3. Restore all datafiles and archive logs.
4. Recover on that one node.
5. Open the database.
6. Bring the other nodes up.
7. Confirm that all nodes are operational.
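The procedure above can be sketched as a dry-run script that only prints the commands instead of running them. This is a minimal sketch under stated assumptions: the control files are intact, the backups are reachable from the first node, and complete recovery is possible (no RESETLOGS). The function names, the command file name restore_all.rman, and the database and instance names passed in are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints the node-level steps for database $1,
# using instance $2 as the single instance that performs the restore.
print_rac_restore_plan() {
    db=$1
    first_inst=$2
    printf '%s\n' \
        "srvctl stop database -d $db" \
        "srvctl start instance -d $db -i $first_inst -o mount" \
        "rman target / cmdfile=restore_all.rman" \
        "srvctl start database -d $db" \
        "srvctl status database -d $db"
}

# Hypothetical helper: prints the RMAN command file contents.
# Assumes complete recovery with intact control files.
print_rman_cmdfile() {
    printf '%s\n' \
        "run {" \
        "  restore database;" \
        "  recover database;" \
        "  alter database open;" \
        "}"
}
```

Usage: print_rman_cmdfile > restore_all.rman, then review the output of print_rac_restore_plan PROD PROD1 before running any of it by hand.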
Q Clients are performing some operation and suddenly one of the
datafiles is experiencing a problem. What do you do? The cluster is a two node
1. Bring the datafile offline.
2. Recover the datafile.
Q How can you connect to a specific node in a RAC environment?
In tnsnames.ora, ensure that you have INSTANCE_NAME specified in the entry.
Q How to move OCR and Voting disk to new storage device?
Moving OCR
You must be logged in as the root user, because root owns the OCR files. Also
an ocrmirror must be in place before trying to replace the OCR device.
Make sure there is a recent backup of the OCR file before making any changes:
ocrconfig -showbackup
If there is not a recent backup copy of the OCR file, an export can be taken
for the current OCR file. Use the following command to generate an export of
the online OCR file:
In 10.2
# ocrconfig -export export_file_name -s online
In 11g
# ocrconfig -manualbackup
The new OCR disk must be owned by root, must be in the oinstall group, and must
have permissions set to 640. Provide at least 100 MB disk space for the OCR.
On one node as root run:
# ocrconfig -replace ocr new_ocr_path
# ocrconfig -replace ocrmirror new_ocrmirror_path
Now run ocrcheck to verify that the OCR is pointing to the new file.
Moving Voting Disk
Note: crsctl votedisk commands must be run as root
Shutdown the Oracle Clusterware (crsctl stop crs as root) on all nodes before
making any modification to the voting disk. Determine the current voting disk
location using:
crsctl query css votedisk
Take a backup of all voting disk:
dd if=voting_disk_name of=backup_file_name
To move a voting disk, provide the full path including file name:
crsctl delete css votedisk old_voting_disk_path -force
crsctl add css votedisk new_voting_disk_path -force
After modifying the voting disk, start the Oracle Clusterware stack on all
nodes:
# crsctl start crs
Verify the voting disk location using
crsctl query css votedisk
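When there are several voting disks, the dd backup step can be generated for each of them. The sketch below only prints the dd commands so they can be reviewed first. The function name print_votedisk_backups, the .bak suffix, and the example paths are hypothetical; take the real paths from the crsctl query css votedisk output.

```shell
#!/bin/sh
# Hypothetical helper: prints one dd backup command per voting disk.
# $1 is the backup directory, the remaining arguments are disk paths.
print_votedisk_backups() {
    backup_dir=$1
    shift
    for disk in "$@"; do
        # The backup file is named after the device's base name.
        echo "dd if=$disk of=$backup_dir/$(basename "$disk").bak"
    done
}
```

Usage: print_votedisk_backups /backup /dev/raw/raw1 /dev/raw/raw2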
Q What is the runfixup.sh script in an Oracle Clusterware 11g
release 2 installation?
With Oracle Clusterware 11g release 2, Oracle Universal Installer (OUI)
detects when the minimum requirements for an installation are not met, and
creates shell scripts, called fixup scripts, to finish incomplete system
configuration steps. If OUI detects an incomplete task, then it generates
fixup scripts (runfixup.sh). You can run the fixup script after you click
the Fix and Check Again button.
The fixup script does the following:
■ If necessary, sets kernel parameters to values required for successful
installation, including:
– Shared memory parameters.
– Open file descriptor and UDP send/receive parameters.
■ Sets permissions on the Oracle Inventory (central inventory) directory.
■ Reconfigures primary and secondary group memberships for the installation
owner, if necessary, for the Oracle Inventory directory and the operating system
■ Sets shell limits, if necessary, to required values.
Q When exactly during the installation process are clusterware
components created?
After fulfilling the pre-installation requirements, the basic installation
steps to follow are:
1. Invoke the Oracle Universal Installer (OUI)
2. Enter the different information for some components
- name of the cluster
- public and private node names
- location for OCR and Voting Disks
- network interfaces used for RAC instances
3. After the Summary screen, OUI will start copying the libraries and
executables to the $CRS_HOME (this is the $ORACLE_HOME for Oracle Clusterware)
on the local node.
- here we will have the daemons and scripts init.* created and configured
Oracle Clusterware is formed of several daemons, each of which has a
special function inside the stack. Daemons are executed via the init.*
scripts (init.cssd, init.crsd and init.evmd).
- note that for CRS only some client libraries are recreated, but not all the
executables (as for the RDBMS).
4. Later the software is propagated to the rest of the
nodes in the cluster and the oraInventory is updated.
5. The installer will ask to execute root.sh on each
node. Until this step the software for Oracle Clusterware is inside the
Running root.sh will create several components outside the $CRS_HOME:
- OCR and VD (voting disk) will be formatted.
- control files (or SCLS_SRC files ) will be created with the correct contents
to start Oracle Clusterware.
These files are used to control some aspects of Oracle Clusterware
- enable/disable processes from the CSSD family (Eg. oprocd,
- stop the daemons (ocssd.bin, crsd.bin, etc).
- prevent Oracle Clusterware from being started when the machine
- etc.
- /etc/inittab will be updated and the init process is notified.
In order to start the Oracle Clusterware daemons, the init.*
scripts first need to be run. These scripts are executed by the daemon init. To
accomplish this some entries must be created in the file /etc/inittab.
- the different processes init.* (init.cssd, init.crsd, etc) will start the
daemons (ocssd.bin, crsd.bin, etc). When all the daemons are running then we
can say that the installation was successful
- On 10.2 and later, running root.sh on the last node in the cluster also will
create the nodeapps (VIP, GSD and ONS). On 10.1, VIPCA is executed as part of
the RAC installation.
6. After running root.sh on each node, we need to
continue with the OUI session. After pressing the 'OK' button OUI will include
the information for the public and cluster_interconnect interfaces. Also CVU
(Cluster Verification Utility) will be executed.
Q What are Oracle Clusterware processes for 10g on Unix and Linux
Cluster Synchronization Services (ocssd): Manages cluster node membership and
runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd): The crs process manages cluster resources
(which could be a database, an instance, a service, a listener, a virtual IP
(VIP) address, an application process, and so on) based on the resource's
configuration information that is stored in the OCR. This includes start,
stop, monitor and failover operations. This process runs as the root user.
Event manager daemon (evmd): A background process that publishes events that
crs creates.
Process Monitor Daemon (OPROCD): This process monitors the cluster and
provides I/O fencing. OPROCD performs its check, stops running, and if the
wake up is beyond the expected time, then OPROCD resets the processor and
reboots the node. An OPROCD failure results in Oracle Clusterware restarting
the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific
requirements and complex resources. Runs server callout scripts when FAN
events occur.
Q What are Oracle database background processes specific to RAC
•LMS—Global Cache Service Process
•LMD—Global Enqueue Service Daemon
•LMON—Global Enqueue Service Monitor
•LCK0—Instance Enqueue Process
To ensure that each Oracle RAC database instance obtains the block that it
needs to satisfy a query or transaction, Oracle RAC instances use two
processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES).
The GCS and GES maintain records of the statuses of each data file and each
cached block using a Global Resource Directory (GRD). The GRD contents are
distributed across all of the active instances.
Q What are Oracle
Clusterware Components
Voting Disk — Oracle RAC uses the voting disk to
manage cluster membership by way of a health check and arbitrates cluster
ownership among the instances in case of network failures. The voting disk must
reside on shared disk.
Oracle Cluster Registry (OCR) — Maintains cluster
configuration information as well as configuration information about any
cluster database within the cluster. The OCR must reside on shared disk that is
accessible by all of the nodes in your cluster
Q How do you troubleshoot node reboot
Please check
metalink ...
Note 265769.1 Troubleshooting CRS Reboots
Note.559365.1 Using Diagwait as a diagnostic to get more information for
diagnosing Oracle Clusterware Node evictions.
Q How do you backup the OCR
There is an automatic backup mechanism for OCR. The default location is :
To display backups :
#ocrconfig -showbackup
To restore a backup :
#ocrconfig -restore
With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export -s online, and use the -import option to restore the contents.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the
command:
# ocrconfig -manualbackup
Q How do you backup voting disk
dd if=voting_disk_name of=backup_file_name
Q How do I identify the voting disk location
#crsctl query css votedisk
Q How do I identify the OCR file location
Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depends upon platform).
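Reading the location out of the file can be scripted. The sketch below assumes that ocr.loc holds key=value lines and that the OCR path is stored under the key ocrconfig_loc; verify this against the file on your platform. The function name ocr_location is a hypothetical helper that takes the candidate file paths as arguments.

```shell
#!/bin/sh
# Hypothetical helper: prints the OCR location from the first readable
# ocr.loc file among the arguments. Assumes a line of the form
# ocrconfig_loc=<path>; returns non-zero if no file is readable.
ocr_location() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # Print only the value of the ocrconfig_loc key.
            sed -n 's/^ocrconfig_loc=//p' "$f"
            return 0
        fi
    done
    return 1
}
```

Usage: ocr_location /var/opt/oracle/ocr.loc, or pass whichever path applies to your platform.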
Q Is ssh required for normal Oracle RAC operation?
"ssh" is not required for normal Oracle RAC operation. However,
"ssh" should be enabled for Oracle RAC and patchset installation.
Q What is the purpose of the private interconnect?
Oracle Clusterware uses the private interconnect for cluster synchronization
(network heartbeat) and daemon communication between the clustered nodes. This
communication is based on the TCP protocol.
RAC uses the interconnect for Cache Fusion (UDP) and inter-process
communication (TCP). Cache Fusion is the remote memory mapping of Oracle
buffers, shared between the caches of participating nodes in the cluster.
Q Why do we have a Virtual IP (VIP) in Oracle RAC?
Without using VIPs or FAN, clients connected to a node that died will often
wait for a TCP timeout period (which can be up to 10 min) before getting an
error. As a result, you don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to
some other node, and the new node re-ARPs the network, indicating a new MAC
address for the IP. Subsequent packets sent to the VIP go to the new node,
which will send error RST packets back to the clients. This results in the
clients getting errors immediately.
Q What do you do if you see GC CR BLOCK LOST in top 5 Timed Events
in AWR Report?
This is most likely due to a fault in the interconnect network.
Check netstat -s. If you see "fragments dropped" or "packet reassemblies
failed", work with your system administrator to find the fault in the network.
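The netstat check can be turned into a small filter. This is only a sketch: it assumes Linux-style netstat -s output where each counter line starts with the count, and the function name interconnect_drop_counters is invented for this example. Other platforms print these statistics in a different layout and would need a different pattern.

```shell
# Read `netstat -s` output on stdin and print the fragment-loss counters
# that are greater than zero. Exit status is 0 only if at least one is.
# Assumption: each counter line starts with the numeric count.
interconnect_drop_counters() {
    grep -Ei 'fragments dropped|packet reassembl(y|ies|es) failed' |
        awk '$1 ~ /^[0-9]+$/ && $1 > 0 { print; found = 1 } END { exit !found }'
}
```

Usage would be along the lines of: netstat -s | interconnect_drop_counters
Running it twice a few minutes apart and comparing the counts shows whether the drops are still increasing.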
Q How many nodes are supported in a RAC Database?
Oracle RAC 10g Release 2 supports 100 nodes in a cluster using Oracle
Clusterware, and 100 instances in a RAC database.
Q Srvctl cannot start an instance and I get the following error:
PRKP-1001 CRS-0215. However, sqlplus can start it on both nodes. How do you
identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with
srvctl. You will now get a detailed error stack.
Q What is the purpose of the ONS daemon?
The Oracle Notification Service (ONS) daemon is a daemon started by the CRS
clusterware as part of the nodeapps. There is one ONS daemon started per
clustered node.
The Oracle Notification Service daemon receives a subset of published
clusterware events via the local evmd and racgimon clusterware daemons and
forwards those events to application subscribers and to the local listeners.
This is done in order to facilitate:
a. the FAN or Fast Application Notification feature, which allows applications
to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing
across different RAC nodes depending on the load on the different nodes. The
RDBMS MMON creates an advisory for distribution of work every 30 seconds and
forwards it via racgimon and ONS to listeners and applications.
Q How do users connect to the database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through
one or more middle tiers, with or without connection pooling. Users can use
the Oracle services feature to connect to the database.
Q What is the use of a service in an Oracle RAC environment?
Applications should use the services feature to connect to the Oracle
database. Services enable us to define rules and characteristics to control
how users and applications connect to database instances.
Q What are the characteristics controlled by Oracle services?
The characteristics include a unique name, workload balancing and failover
options, and high availability characteristics.
Q What is a voting disk?
A voting disk is a file that manages information about node membership.
Q What are the administrative tasks involved with voting disk?
The following administrative tasks are performed with the voting disk:
1) Backing up voting disks
2) Recovering voting disks
3) Adding voting disks
4) Deleting voting disks
5) Moving voting disks
Q How do we back up voting disks?
1) Oracle recommends that you back up your voting disk after the initial
cluster creation and after we complete any node addition or deletion
procedures.
2) First, as root user, stop Oracle Clusterware (with the crsctl stop crs
command) on all nodes. Then, determine the current voting disk by issuing the
following command:
crsctl query css votedisk
3) Then, issue the dd or ocopy command to back up a voting disk, as
appropriate.
Syntax for backing up voting disks:
On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where
voting_disk_name is the name of the active voting disk
backup_file_name is the name of the file to which we want to back up the
voting disk contents
On Windows systems, use the ocopy command:
ocopy voting_disk_name backup_file_name
Q What is the Oracle recommendation for backing up the voting disk?
Oracle recommends using the dd command to back up the voting disk with a
minimum block size of 4KB.
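A minimal sketch of that recommendation follows. The function name backup_voting_disk and the cmp verification step are additions for illustration and are not part of any Oracle procedure. It assumes Oracle Clusterware has already been stopped, as described earlier.

```shell
# Copy a voting disk to a backup file with dd using a 4 KB block size,
# then confirm the copy is byte-identical to the source.
# $1 = voting disk (source), $2 = backup file (destination).
backup_voting_disk() {
    src=$1
    dst=$2
    # bs=4k sets both the read and write block size to 4096 bytes.
    dd if="$src" of="$dst" bs=4k 2>/dev/null || return 1
    # cmp -s is silent and returns non-zero if the files differ.
    cmp -s "$src" "$dst"
}
```

Usage would be along the lines of: backup_voting_disk voting_disk_name backup_file_name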
Q How do you restore a voting disk?
To restore the backup of your voting disk, issue the dd command for Linux and
UNIX systems or the ocopy command for Windows systems.
On Linux or UNIX systems:
dd if=backup_file_name of=voting_disk_name
On Windows systems, use the ocopy command:
ocopy backup_file_name voting_disk_name
where
backup_file_name is the name of the voting disk backup file
voting_disk_name is the name of the active voting disk
Q How can we add and remove multiple voting disks?
If we have multiple voting disks, then we can remove the voting disks and add
them back into our environment using the following commands, where path is the
complete path of the location where the voting disk resides:
crsctl delete css votedisk path
crsctl add css votedisk path
Q How do we stop Oracle Clusterware? When do we stop it?
Before making any modification to the voting disk, as root user, stop Oracle
Clusterware using the crsctl stop crs command on all nodes.
Q How do we add voting disk?
To add a voting disk, issue the following command as the root user, replacing
the path variable with the fully qualified path name for the voting disk we
want to add:
crsctl add css votedisk path -force
Q How do we move voting disks?
To move a voting disk, issue the following commands as the root user,
replacing the path variable with the fully qualified path name for the voting
disk we want to move:
crsctl delete css votedisk path -force
crsctl add css votedisk path -force
Q How do we remove voting disks?
To remove a voting disk, issue the following command as the root user,
replacing the path variable with the fully qualified path name for the voting
disk we want to remove:
crsctl delete css votedisk path -force
Q What should we do after modifying voting disks?
After modifying the voting disk, restart Oracle Clusterware using the crsctl
start crs command on all nodes, and verify the voting disk location using the
following command:
crsctl query css votedisk
Q When can we use -force option?
If our cluster is down, then we can include the -force option to modify the
voting disk configuration, without interacting with active Oracle Clusterware
daemons. However, using the -force option while any cluster node is active may
corrupt our configuration.
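One way to keep -force from being used against a running cluster is to build the command through a small helper that adds the option only when the caller states the cluster is down. The sketch below is a dry-run helper: it only prints the crsctl command and never executes it. The function name votedisk_command and its "down"/"up" argument are hypothetical, and how the cluster state is established is left to the caller.

```shell
# Print (do not run) the crsctl command for a voting disk change.
# $1 = add|delete, $2 = voting disk path, $3 = cluster state ("down" or "up").
# -force is appended only when the caller reports the cluster as down.
votedisk_command() {
    action=$1
    path=$2
    state=$3
    case $action in
        add|delete) ;;
        *) echo "unknown action: $action" >&2; return 2 ;;
    esac
    if [ "$state" = "down" ]; then
        echo "crsctl $action css votedisk $path -force"
    else
        echo "crsctl $action css votedisk $path"
    fi
}
```

Usage would be along the lines of: votedisk_command add path down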
