Wednesday, 14 August 2024

Oracle DBA Interview Questions and Answers

 

ORACLE DATA GUARD INTERVIEW QUESTION - ANSWER

What is data guard?

Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions. Data Guard maintains these standby databases as copies of the production database. Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability.

What are the advantages in using Oracle Data Guard?

Oracle Data Guard offers the following benefits:

  • High Availability.
  • Data Protection.
  • Offloading backup operations to the standby database.
  • Automatic Gap detection and Resolution in standby database.
  • Automatic Role Transition using Data Guard Broker.

What are the Protection Modes in Dataguard?

Data Guard Protection Modes

This section describes the Data Guard protection modes.
In these descriptions, a synchronized standby database is meant to be one that meets the minimum requirements of the configured data protection mode and that does not have a redo gap.

Maximum Availability

This protection mode provides the highest level of data protection that is possible without compromising the availability of a primary database. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one synchronized standby database. If the primary database cannot write its redo stream to at least one synchronized standby database, it operates as if it were in maximum performance mode to preserve primary database availability until it is again able to write its redo stream to a synchronized standby database.

This mode ensures that no data loss will occur if the primary database fails, but only if a second fault does not prevent a complete set of redo data from being sent from the primary database to at least one standby database.


Maximum Performance

This protection mode provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online redo log. Redo data is also written to one or more standby databases, but this is done asynchronously with respect to transaction commitment, so primary database performance is unaffected by delays in writing redo data to the standby database(s).

This protection mode offers slightly less data protection than maximum availability mode and has minimal impact on primary database performance.
This is the default protection mode.

Maximum Protection

This protection mode ensures that zero data loss occurs if a primary database fails. To provide this level of protection, the redo data needed to recover a transaction must be written to both the online redo log and to at least one synchronized standby database before the transaction commits. To ensure that data loss cannot occur, the primary database will shut down, rather than continue processing transactions, if it cannot write its redo stream to at least one synchronized standby database.
Because this data protection mode prioritizes data protection over primary database availability, Oracle recommends that a minimum of two standby databases be used to protect a primary database that runs in maximum protection mode to prevent a single standby database failure from causing the primary database to shut down.
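
The protection mode is set on the primary with a single statement. The following is a minimal sketch, assuming a destination such as LOG_ARCHIVE_DEST_2 is already configured with SYNC AFFIRM transport (required for anything above maximum performance); depending on the release, raising the mode to maximum protection may require the primary to be mounted rather than open.

```sql
-- Run on the primary, connected AS SYSDBA.
-- Assumption: the standby destination already uses SYNC AFFIRM transport.
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

-- The other two modes use the same statement:
-- ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;
-- ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PROTECTION;

-- PROTECTION_MODE is the configured mode; PROTECTION_LEVEL is what is
-- currently in effect (it can drop if no standby is synchronized).
SELECT protection_mode, protection_level FROM v$database;
```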


What is the difference between Physical standby and Logical standby database?

The Data Guard apply process on a standby database can apply redo information directly, in which case the database is called a physical standby,
or it can apply SQL reconstructed from the redo, in which case the database is called a logical standby.

Physical Standby:

In this case the standby database is an exact, block-by-block, physical replica of the primary database.
The change vectors received by the RFS process are applied directly to the standby database using media recovery: the apply process reads data blocks, assembles redo changes from mappings, and then applies the redo changes to the data blocks.
A physical standby is the best choice for disaster recovery (DR) because of its simplicity, transparency, high performance, and good data protection.

Logical Standby:

In this case the standby database uses the SQL Apply method to “mine” the redo by converting it to logical change records, and then builds SQL transactions and applies the SQL to the standby database.
Because replaying the workload this way is more complex than the physical standby's process, it requires more memory, CPU, and I/O.
One advantage is that a logical standby database can be opened read-write while SQL Apply is active, which means you can update (create/insert/delete, etc.) local tables and schemas in the logical standby database.
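
As a rough illustration of the difference, the two standby types are started with different apply statements. This is a sketch only, run on the standby as SYSDBA; a real configuration may add further options.

```sql
-- Physical standby (mounted): start Redo Apply (media recovery) in the background.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

-- Logical standby (open): start SQL Apply.
-- IMMEDIATE applies redo from the standby redo logs as it arrives.
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

-- On either type, confirm the role and open mode.
SELECT database_role, open_mode FROM v$database;
```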


Explain the Dataguard Architecture

Data Guard architecture incorporates the following items:

• Primary Database - A production database that is used to create standby databases. The archive logs from the primary database are transferred and applied to standby databases. Each standby can only be associated with a single primary database, but a single primary database can be associated with multiple standby databases.

• Standby Database - A replica of the primary database.

• Log Transport Services - Control the automatic transfer of archive redo log files from the primary database to one or more standby destinations.

• Network Configuration - The primary database is connected to one or more standby databases using Oracle Net.

• Log Apply Services - Apply the archived redo logs to the standby database. The Managed Recovery Process (MRP) actually does the work of maintaining and applying the archived redo logs.

• Role Management Services - Control the changing of database roles from primary to standby. The services include switchover, switchback and failover.

• Data Guard Broker - Controls the creation and monitoring of Data Guard. It comes with a GUI and command line interface.

Primary Database:
A Data Guard configuration contains one production database, also referred to as the primary database, that functions in the primary role. This is the database that is accessed by most of your applications.

Standby Database:
A standby database is a transactionally consistent copy of the primary database. Using a backup copy of the primary database, you can create up to nine standby databases (up to thirty as of Oracle Database 11g Release 2) and incorporate them in a Data Guard configuration. Once created, Data Guard automatically maintains each standby database by transmitting redo data from the primary database and then applying the redo to the standby database.
The types of standby databases are as follows:

Physical standby database:
Provides a physically identical copy of the primary database, with on-disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, is the same. A physical standby database is kept synchronized with the primary database through Redo Apply, which recovers the redo data received from the primary database and applies the redo to the physical standby database.

Logical standby database:
Contains the same logical information as the production database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes the SQL statements on the standby database.

What are the Steps to create Physical Standby database?

1. Take a full hot backup of the primary database.

2. Enable force logging on the database.

3. Prepare the parameter file for the primary database.

4. Enable archiving.

5. Create a standby control file.

6. Transfer the full backup, init.ora, and standby control file to the standby node.

7. Modify the init.ora file on the standby node.

8. Restore the database.

9. Recover the standby database.

10. Put the standby database in managed recovery mode.
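
The SQL*Plus side of these steps can be sketched as follows. This is an outline under stated assumptions, not a complete procedure: the backup, file transfer, and restore (steps 1, 6, 8) are done with your backup tooling, and the control file path /tmp/boston.ctl is hypothetical.

```sql
-- On the primary, connected AS SYSDBA.

-- Step 2: make sure every change generates redo.
ALTER DATABASE FORCE LOGGING;

-- Step 4: enable archiving (only needed if the database is in NOARCHIVELOG mode).
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Step 5: create the standby control file (hypothetical path).
ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/boston.ctl';

-- On the standby, after the files are restored and init.ora is modified.

-- Steps 9-10: mount with the standby control file and start managed recovery.
STARTUP MOUNT
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```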


What are the DATAGUARD PARAMETERS in Oracle?

Set Primary Database Initialization Parameters
----------------------------------------------
On the primary database, you define initialization parameters that control redo transport services while the database is in the primary role. There are additional parameters you need to add that control the receipt of the redo data and log apply services when the primary database is transitioned to the standby role.

DB_NAME=chicago
DB_UNIQUE_NAME=chicago
LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'
CONTROL_FILES='/arch1/chicago/control1.ctl', '/arch2/chicago/control2.ctl'
LOG_ARCHIVE_DEST_1=
 'LOCATION=/arch1/chicago/
  VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
  DB_UNIQUE_NAME=chicago'
LOG_ARCHIVE_DEST_2=
 'SERVICE=boston LGWR ASYNC
  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
  DB_UNIQUE_NAME=boston'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE
LOG_ARCHIVE_FORMAT=%t_%s_%r.arc
LOG_ARCHIVE_MAX_PROCESSES=30

Primary Database: Standby Role Initialization Parameters

FAL_SERVER=boston
FAL_CLIENT=chicago
DB_FILE_NAME_CONVERT='boston','chicago'
LOG_FILE_NAME_CONVERT=  '/arch1/boston/','/arch1/chicago/','/arch2/boston/','/arch2/chicago/'
STANDBY_FILE_MANAGEMENT=AUTO

Prepare an Initialization Parameter File for the Standby Database
-----------------------------------------------------------------
Create a text initialization parameter file (PFILE) from the server parameter file (SPFILE) used by the primary database; a text initialization parameter file can be copied to the standby location and modified. For example:
CREATE PFILE='/tmp/initboston.ora' FROM SPFILE;

Modifying Initialization Parameters for a Physical Standby Database.

DB_NAME=chicago
DB_UNIQUE_NAME=boston
LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'
CONTROL_FILES='/arch1/boston/control1.ctl', '/arch2/boston/control2.ctl'
DB_FILE_NAME_CONVERT='chicago','boston'
LOG_FILE_NAME_CONVERT= '/arch1/chicago/','/arch1/boston/','/arch2/chicago/','/arch2/boston/'
LOG_ARCHIVE_FORMAT=log%t_%s_%r.arc
LOG_ARCHIVE_DEST_1=
 'LOCATION=/arch1/boston/
  VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
  DB_UNIQUE_NAME=boston'
LOG_ARCHIVE_DEST_2=
 'SERVICE=chicago LGWR ASYNC
  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
  DB_UNIQUE_NAME=chicago'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE
STANDBY_FILE_MANAGEMENT=AUTO
FAL_SERVER=chicago
FAL_CLIENT=boston
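
Once the parameters are in place, the destinations can be checked from SQL*Plus. A small sketch, assuming destinations 1 and 2 as in the listings above:

```sql
-- STATUS should be VALID for a working destination; ERROR shows the
-- last error text for a destination that is failing.
SELECT dest_id, status, destination, error
  FROM v$archive_dest
 WHERE dest_id IN (1, 2);

-- SQL*Plus command: display the DG_CONFIG list currently in effect.
SHOW PARAMETER log_archive_config
```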


What are the services required on the primary and standby database ?

The services required on the primary database are:

• Log Writer Process (LGWR) - Collects redo information and updates the online redo logs. It can also create local archived redo logs and transmit online redo to standby databases.

• Archiver Process (ARCn) - One or more archiver processes make copies of online redo logs either locally or remotely for standby databases.

• Fetch Archive Log (FAL) Server - Services requests for archived redo logs from FAL clients running on multiple standby databases. Multiple FAL servers can run on a primary database, one for each FAL request.

The services required on the standby database are:

• Fetch Archive Log (FAL) Client - Pulls archived redo log files from the primary site. Initiates transfer of archived redo logs when it detects a gap sequence.

• Remote File Server (RFS) - Receives archived and/or standby redo logs from the primary database.

• Archiver (ARCn) Processes - Archives the standby redo logs applied by the managed recovery process (MRP).

• Managed Recovery Process (MRP) - Applies archive redo log information to the standby database.
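
On a standby of the releases discussed here, these processes can be observed through V$MANAGED_STANDBY. A minimal sketch (newer releases may expose the same information through a different view):

```sql
-- Run on the standby. Typical PROCESS values include ARCH, RFS and MRP0;
-- STATUS shows what each process is doing (for example, APPLYING_LOG).
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby
 ORDER BY process;
```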


What is RTS (Redo Transport Services) in Dataguard?

It controls the automated transfer of redo data from the production database to one or more archival destinations. The redo transport services perform the following tasks:

a) Transmit redo data from the primary system to the standby systems in the configuration.

b) Manage the process of resolving any gaps in the archived redo log files due to a network failure.

c) Automatically detect missing or corrupted archived redo log files on a standby system and automatically retrieve replacement archived redo log files from the primary database or another standby database.

What is a Snapshot Standby Database?

Oracle 11g introduces the snapshot standby database, which is essentially an updatable standby database created from a physical standby database.

We can convert a physical standby database to a snapshot standby database, run tests on what is a read-write copy of the current primary (production) database, and then revert it to its earlier state as a physical standby database.

While the snapshot standby database is open in read-write mode, redo is being received from the primary database, but is not applied.

After converting it back to a physical standby database, it is resynchronized with the primary by applying the accumulated redo data which was earlier shipped from the primary database but not applied.

Using a snapshot standby, we can do real-time application testing using near real-time production data. Very often we are required to create production clones for testing. With snapshot standby databases we can meet the same requirement while saving effort, time, resources, and disk space.

A snapshot standby database is a fully updatable standby database that is created by converting a physical standby database into a snapshot standby database.

Like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, a snapshot standby database does not apply the redo data that it receives. The redo data received by a snapshot standby database is not applied until the snapshot standby is converted back into a physical standby database, after first discarding any local updates made to the snapshot standby database.
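
The conversion cycle can be sketched as below. This assumes a fast recovery area is configured on the standby (the conversion relies on a guaranteed restore point); whether the database needs a restart after each conversion varies by release, so treat the restarts shown as a conservative outline.

```sql
-- On the physical standby (mounted), connected AS SYSDBA.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
-- Some releases dismount the database at this point; if so,
-- restart it in MOUNT state before opening.
ALTER DATABASE OPEN;

-- ... run read-write tests; redo from the primary is still received ...

-- Convert back: local changes are discarded.
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;

-- Restart in MOUNT state and resume Redo Apply to catch up.
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```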

How to delay the application of logs to a physical standby? 

A standby database automatically applies redo logs when they arrive from the primary database. But in some cases, we want to create a time lag between the archiving of a redo log at the primary site, and the application of the log at the standby site.

Modify the LOG_ARCHIVE_DEST_n initialization parameter on the primary database to set a delay for the standby database.

Example, for a 60-minute delay:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=stdby_srvc DELAY=60';
The DELAY attribute is expressed in minutes.
The archived redo logs are still automatically copied from the primary site to the standby site, but the logs are not immediately applied to the standby database. The logs are applied when the specified time interval expires.
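
If the delayed redo is needed sooner (for example, before a failover), the delay can be overridden on the standby side. A short sketch, run on the standby as SYSDBA; note that a configured delay is generally ignored when real-time apply is in use.

```sql
-- Stop the delayed apply, then restart it ignoring the DELAY attribute.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT FROM SESSION;
```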

What is the usage of DB_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?

The DB_FILE_NAME_CONVERT parameter is used in an Oracle Data Guard setup, specifically on standby databases. It maps the locations of the primary database's data files to their locations on the standby database. Use it when the standby database has a different directory structure from the primary database's data file locations.

What is the usage of LOG_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?

The LOG_FILE_NAME_CONVERT parameter is used in an Oracle Data Guard setup, specifically on standby databases. It maps the locations of the primary database's redo log files to their locations on the standby database. Use it when the standby database has a different directory structure from the primary database's redo log file locations.

Your standby database was out of reach because of network issue. How will you synchronize it with primary database again?

Data Guard automatically resynchronizes the standby following network or standby outages using redo data that has been archived at the primary.
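
To watch the standby catch up after the outage, the lag figures can be queried on the standby. A minimal sketch:

```sql
-- Run on the standby. 'transport lag' is redo not yet received;
-- 'apply lag' is redo received but not yet applied.
SELECT name, value, time_computed
  FROM v$dataguard_stats
 WHERE name IN ('transport lag', 'apply lag');
```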


What is the difference between SYNC and ASYNC redo transport method?

Synchronous transport (SYNC)

Also known as a “zero data loss” redo transport method.

Below is how it works:

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

5) Remote File Server (RFS) transmits an acknowledgement back to the LNS process on the primary database

6) Log Network Server (LNS) notifies the LGWR that transmission is complete on the primary database.

7) Log Writer (LGWR) acknowledges the commit to the user.

Asynchronous transport (ASYNC)

Unlike SYNC, asynchronous transport (ASYNC) eliminates the requirement that the LGWR wait for acknowledgement from the LNS. This removes the performance impact on the primary database irrespective of the distance between primary and standby locations. If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically switches to reading and sending from the online redo logs. Once the LNS has caught up, it automatically switches back to reading and sending directly from the log buffer.

Below is how it works:

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

Steps 5, 6 and 7 described above for SYNC do not apply here, because LGWR does not wait for any acknowledgement from the standby.

The only drawback of ASYNC is the increased potential for data loss. If a failure destroys the primary database before the transport lag has been reduced to zero, any committed transactions that were part of the transport lag are lost. It is therefore highly advisable to have enough network bandwidth to handle peak redo generation rates when using the ASYNC method.
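
The transport method is chosen per destination through LOG_ARCHIVE_DEST_n attributes. The sketch below reuses the boston service name from the parameter listings earlier; the NET_TIMEOUT value is an illustrative assumption, and only one of the two statements would be used for a given destination.

```sql
-- Synchronous transport: the commit waits until the standby confirms
-- the redo has been written to its standby redo log (AFFIRM).
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston SYNC AFFIRM NET_TIMEOUT=30 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston' SCOPE=BOTH;

-- Asynchronous transport: the commit does not wait for the standby.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston ASYNC NOAFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston' SCOPE=BOTH;
```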

How Synchronous transport (SYNC) can impact the primary database performance?

SYNC guarantees protection for every transaction that the database acknowledges as committed, but LGWR must wait for confirmation that the data is protected at the standby before it can proceed with the next transaction. This can impact primary database performance, depending on factors such as:
  • the amount of redo information to be written
  • available network bandwidth
  • round-trip network latency (RTT)
  • standby I/O performance when writing to the SRL
  • distance between primary and standby databases, as network RTT increases with distance.


What is Data Guard’s Automatic Gap Resolution?

Suppose your database is using the ASYNC transport method and the instance load is at its peak. If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically switches to reading and sending from the online redo logs. Once the LNS has caught up, it automatically switches back to reading and sending directly from the log buffer.

In some cases there can be two or more log switches before the LNS has finished sending the redo from the online redo log files. If a required online redo log file is archived in the meantime, its redo is transmitted by Data Guard’s gap resolution process, “Automatic Gap Resolution”.

OR

In another case, the network or the standby database is down while the primary system is busy, so a large log file gap forms before the connection between the primary and standby is restored.
Automatic Gap Resolution handles such scenarios as follows:

1) The ARCH process on the primary database continuously pings the standby database during the outage to determine its status.
2) As soon as the standby is restored, the ARCH ping process queries the standby control file (via its RFS process) to determine the last complete log file that the standby received from the primary database.
3) Data Guard determines which log files are required to resynchronize the standby database and immediately begins transmitting them using additional ARCH processes.
4) The LNS process on the primary database also reconnects to the standby database and begins transmitting current redo. The archived log files are applied first, followed by the current redo.

The Data Guard architecture enables gaps to be resolved quickly using multiple background ARCH processes.
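
Whether a gap currently exists can be checked on the standby. A small sketch using standard dynamic views:

```sql
-- Run on the physical standby. No rows means no gap is currently detected.
SELECT thread#, low_sequence#, high_sequence#
  FROM v$archive_gap;

-- Highest log sequence already applied, per redo thread.
SELECT thread#, MAX(sequence#) AS last_applied
  FROM v$archived_log
 WHERE applied = 'YES'
 GROUP BY thread#;
```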


How does the Data Guard apply process work if the primary and standby databases involve Oracle RAC?

If Primary database is RAC but standby is Non-RAC:

Each primary Oracle RAC instance ships its own thread of redo that is merged by the Data Guard apply process at the standby and applied in SCN order to the standby database.

If both Primary and standby databases are RAC:

If the standby is also an Oracle RAC database, only one instance (the apply instance) will merge and apply changes to the standby database. If the apply instance fails for any reason, the apply process will automatically fail over to a surviving instance in the Oracle RAC standby database when using the Data Guard broker.

What is Active Data Guard Option (Oracle Database 11g Enterprise Edition)?

For a physical standby database prior to 11g, the database had to be in the mount state while media recovery was active, which meant you could not query the standby database during media recovery because there was no read-consistent view.

The Active Data Guard feature in 11g solves the read consistency problem by using a “query” SCN. The media recovery process on the standby database advances the query SCN after all the changes in a transaction have been applied. The query SCN appears to users as the CURRENT_SCN column in the V$DATABASE view on the standby database. Read-only users can only see data up to the query SCN, which guarantees the same read consistency as on the primary database.
This enables a physical standby database to be open read-only while media recovery is active, making it useful for read-only workloads.

If you need read-write access to the standby database, you can use the SQL Apply method of Data Guard instead.
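
Enabling this on a mounted physical standby can be sketched as follows. It assumes the Active Data Guard option is licensed and that standby redo logs exist (needed for the USING CURRENT LOGFILE clause).

```sql
-- On the physical standby, connected AS SYSDBA.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN READ ONLY;

-- Restart Redo Apply with real-time apply while the database stays open.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- OPEN_MODE reflects the read-only-with-apply state; CURRENT_SCN is the query SCN.
SELECT open_mode, current_scn FROM v$database;
```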

What are the important database parameters related to Data Guard corruption prevention?

On the primary database:

a) DB_ULTRA_SAFE

Values can be DATA_AND_INDEX or DATA_ONLY. Setting DB_ULTRA_SAFE at the primary will also automatically set DB_LOST_WRITE_PROTECT=TYPICAL on the primary database.
In Oracle Database 11g Release 2 (11.2), the primary database automatically attempts to repair the corrupted block in real time by fetching a good version of the same block from a physical standby database.

On the standby database:

a) DB_BLOCK_CHECKSUM=FULL
DB_BLOCK_CHECKSUM detects redo and data block corruptions on the primary database and protects the standby database from them. This parameter requires minimal CPU resources.

b) DB_LOST_WRITE_PROTECT=TYPICAL
A lost write can occur when an I/O subsystem acknowledges the completion of a write, while in fact the write did not occur in persistent storage.
This will create a stale version of the data block. When the DB_LOST_WRITE_PROTECT initialization parameter is set, the database records buffer cache block reads in the redo log, and this information is used to detect lost writes.
You set DB_LOST_WRITE_PROTECT to TYPICAL in both primary and standby databases.
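
These parameters can be set as sketched below. DB_ULTRA_SAFE is a static parameter, so it is written to the SPFILE and takes effect at the next restart; the DATA_ONLY value is just one of the two choices named above.

```sql
-- Dynamic parameters: take effect immediately and persist in the SPFILE.
ALTER SYSTEM SET DB_BLOCK_CHECKSUM=FULL SCOPE=BOTH;
ALTER SYSTEM SET DB_LOST_WRITE_PROTECT=TYPICAL SCOPE=BOTH;

-- Static parameter (primary): requires an instance restart.
ALTER SYSTEM SET DB_ULTRA_SAFE=DATA_ONLY SCOPE=SPFILE;

-- SQL*Plus command: verify the current value.
SHOW PARAMETER db_lost_write_protect
```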

What is Switchover event?

Switchover is useful for minimizing downtime during planned maintenance. It is a planned event in which Data Guard reverses the roles of the primary and a standby database.

The primary database runs unaffected while we are making the required changes on our standby database (e.g. patchset upgrades, full Oracle version upgrades, etc).

Once changes are complete, production is switched over to the standby site running at the new release.

This means that regardless of how much time is required to perform planned maintenance, the only production database downtime is the time required to execute a switchover, which can be less than 60 seconds.

The following operations happen when the switchover command is executed:
1. The primary database is notified that a switchover is about to occur.
2. All users are disconnected from the primary.
3. A special redo record is generated that signals the End Of Redo (EOR).
4. The primary database is converted into a standby database.
5. The final EOR record is applied to the standby database. This guarantees that no data has been lost, and it converts the standby to the primary role.
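
Without the broker, a switchover to a physical standby can be sketched with the SQL below. This follows the 11g-style syntax; newer releases also offer a shorter form, so check the documentation for your version.

```sql
-- On the current primary, connected AS SYSDBA.
-- SWITCHOVER_STATUS is typically TO STANDBY or SESSIONS ACTIVE when ready.
SELECT switchover_status FROM v$database;
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- On the old standby (the new primary).
-- SWITCHOVER_STATUS is typically TO PRIMARY or SESSIONS ACTIVE when ready.
SELECT switchover_status FROM v$database;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
ALTER DATABASE OPEN;

-- Back on the old primary (the new standby): start Redo Apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```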

What is Failover event?

The Failover process is similar to switchover event except that the primary database never has the chance to write an EOR record as this is an unplanned event.
Whether or not a failover results in data loss depends upon the Data Guard protection mode:

a) Maximum Protection >> No Data Loss
b) Maximum Availability >> No Data Loss (except when there was a previous failure (e.g. a network failure) that had INTERRUPTED REDO TRANSPORT and allowed the primary database to move ahead of standby)

c) Maximum Performance (ASYNC) >> may lose any committed transactions that were not transmitted to the standby database before the primary database failed.

Failover event can be of two types:
1) Manual
The administrator has complete control of primary-standby role transitions. This can lengthen the outage by the time required for the administrator to be notified and to execute the commands manually.
2) Automatic
It uses Data Guard’s Fast-Start Failover feature which automatically detects the failure, evaluates the status of the Data Guard configuration, and, if appropriate, executes the failover to a previously chosen standby database.
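
A manual failover to a physical standby can be sketched as follows, again in 11g-style syntax and without the broker. It assumes any archived logs still retrievable from the primary have already been copied and registered on the standby.

```sql
-- On the standby chosen as the failover target, connected AS SYSDBA.

-- Stop Redo Apply, then apply all redo that has been received.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;

-- Convert to the primary role and open for users.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
ALTER DATABASE OPEN;

-- If FINISH fails and some data loss is acceptable, the standby can
-- instead be activated directly:
-- ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
```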

Which tools can be used for Data Guard Management?

1) SQL*Plus – traditional method, can prove most tedious to use

2) Data Guard broker – automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. Simplifies and automates many administrative tasks. It has its own command line (DGMGRL) and syntax.

3) Enterprise Manager – requires that the Data Guard broker be enabled. It is a GUI to the Data Guard broker that replaces the DGMGRL command line and interfaces directly with the broker’s monitor processes.

What is the difference between Recovery Point Objective(RPO) and Recovery Time Objective (RTO)?

A) Recovery Point Objective(RPO)
RPO is concerned with data. It is the amount of data you are willing to lose when a failure occurs in your database system. People usually define data loss in terms of time, so possible values are 5 seconds of data loss, 2 hours of data loss, and so on.

Remember that each standby database has its own set of attributes and parameters. This means you can mix zero data loss standby databases with minimal data loss standby databases in the same Data Guard configuration.
If you decide to implement a zero data loss strategy, pay close attention to the network, because synchronous redo transport depends on it.

B) Recovery Time Objective (RTO)
RTO is defined as how fast you can get back up and running (whereas RPO is concerned with data loss).

So with your RPO strategy you might lose, say, only about 6 seconds of data, as committed to your client; with RTO you need to determine how quickly clients can reconnect to the database system after the failure has occurred.

What are Standby Redo Log (SRL) files?

The SRL files are where the Remote File Server (RFS) process at your standby database writes the incoming redo so that it is persistent on disk for recovery. SRL files are important for better redo transport performance and data protection.

SRLs are mandatory in Maximum Availability and Maximum Protection modes and optional (but recommended) in Maximum Performance mode.

If there are no Standby Redo Log (SRL) files, then at each log switch in the primary database, the RFS process on the standby database that is serving an asynchronous standby destination has to create an archive log of the right size. While the RFS is busy creating the archive log file, the LNS process at the primary database has to wait, getting further and further behind the LGWR (in the case of Maximum Performance mode). That is why it is recommended to have Standby Redo Log (SRL) files in Maximum Performance mode as well.

We generally configure them on the primary database as well, in preparation for a role transition between primary and standby.

Also, do not multiplex SRLs. Since Data Guard will immediately request a new copy of the archive log if an SRL file fails, there is no real need to have more than one copy of each.
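
Adding SRL groups can be sketched as below. The file paths, group numbers, and the 500M size are hypothetical; in practice the size should match the online redo logs, and a common guideline is one more SRL group per thread than there are online redo log groups.

```sql
-- On the standby (and on the primary, ready for a role transition).
-- One member per group, in line with the advice not to multiplex SRLs.
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 10 ('/arch1/boston/srl10.log') SIZE 500M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 11 ('/arch1/boston/srl11.log') SIZE 500M;

-- Verify: an SRL currently receiving redo shows STATUS = ACTIVE.
SELECT group#, thread#, sequence#, bytes, status
  FROM v$standby_log;
```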


Oracle DBA Interview Question and Answer - Export/Import

What is use of CONSISTENT option in exp?

When you export a table, you are guaranteed that the contents of that table will be consistent with the time that the export of that table was started. This means that if you start exporting the table at 12:00 and someone makes changes to the data in the table at 12:05 and your export of this table finishes at 12:10, then the export will not contain any of the changes made between 12:00 and 12:10. You cannot change this behavior with Oracle's export utility.

The CONSISTENT parameter controls whether or not the entire export is consistent, even between tables. If CONSISTENT=N (the default), then the export of a table will be consistent, but changes can occur between tables. If CONSISTENT=Y, then the entire dump file is consistent with the point in time that you started the export.

What is use of DIRECT=Y option in exp?

Normally, export follows the same path as a SELECT statement, i.e. data from disk is copied to the buffer cache and then written to the dump file. When we use direct path by specifying DIRECT=Y in the export command, Oracle copies data directly from disk to the PGA and from there writes it to the dump file.

What is use of COMPRESS option in exp?

If we specify COMPRESS=y during export then at the time of table creation while importing, the INITIAL extent of the table would be as large as the sum of all the extents allocated to the table in the original database.

If we specify COMPRESS=n during export then while creating table in the import, it will use the same values of INITIAL extent as in the original database.

Now let's say I have a table of 100 MB. There have been some deletions and updates, and only 50 MB of actual data remains. I export the table with COMPRESS=y and recreate it in some other database. Import will sum all the extents and assign the total as the INITIAL extent while creating the table. There is only 50 MB of data in the table, but it has already allocated 100 MB. If you have limited space, this is not a very good option.

If I export with COMPRESS=N and then import the table, its INITIAL extent will be as large as the INITIAL extent in the original database, and new extents will be allocated as required. So my table in the new database would be approximately 50 MB in size.

Which are the common IMP/EXP problems?

ORA-00001: Unique constraint ... violated - Perhaps you are importing duplicate rows. Use IGNORE=N to skip tables that already exist (imp will give an error if the object is re-created), or drop/truncate the table and re-import it if a table refresh is needed.
IMP-00015: Statement failed ... object already exists... - Use the IGNORE=Y import parameter to ignore these errors, but be careful as you might end up with duplicate rows.
ORA-01555: Snapshot too old - Ask your users to STOP working while you are exporting or use parameter CONSISTENT=NO (However this option could create possible referential problems, because the tables are not exported from one snapshot in time).
ORA-01562: Failed to extend rollback segment - Create bigger rollback segments or set parameter COMMIT=Y (with an appropriate BUFFER parameter) while importing.

What are the benefits of the Data Pump technology?

The older export/import technology was client-based. The Data Pump technology is purely server-based. All dump, log, and other files are created on the server by default. Data Pump technology offers several benefits over the traditional export and import data utilities.

The following are the main benefits of the Data Pump technology:

Improved performance: The performance benefits are significant if you are transferring huge
amounts of data.

Ability to restart jobs: You can easily restart jobs that have stalled due to lack of space or have
failed for other reasons. You may also manually stop and restart jobs.

Parallel execution capabilities: By specifying a value for the PARALLEL parameter, you can choose the number of active execution threads for a Data Pump Export or Data Pump Import job.

Ability to attach to running jobs: You can attach to a running Data Pump job and interact with
it from a different screen or location. This enables you to monitor jobs, as well as to modify
certain parameters interactively. Data Pump is an integral part of the Oracle database server,
and as such, it doesn’t need a client to run once it starts a job.

Ability to estimate space requirements: You can easily estimate the space requirements for
your export jobs by using the ESTIMATE parameter, with either the default BLOCKS method or
the STATISTICS method, before running an actual export job.

Network mode of operation: Once you create database links between two databases, you can
perform exports from a remote database straight to a dump file set. You can also perform
direct imports via the network using database links, without using any dump files. The network
mode is a means of transferring data from one database directly into another database with
the help of database links and without the need to stage it on disk.

Fine-grained data import capability: Oracle9i offered only the QUERY parameter, which enabled
you to specify that the export utility extract a specified portion of a table’s rows. With Data Pump,
you have access to a vastly improved fine-grained options arsenal, thanks to new parameters
like INCLUDE and EXCLUDE.

Remapping capabilities: During a Data Pump import, you can remap schemas and tablespaces,
as well as filenames, by using the new REMAP_* parameters. Remapping capabilities enable
you to modify objects during the process of importing data by changing old attributes to new
values. For example, the REMAP_SCHEMA parameter enables you to map all of user HR’s schema
to a new user, OE. The REMAP_SCHEMA parameter is similar to the TOUSER parameter in the old
import utility.

How to improve exp performance?

1. Set the BUFFER parameter to a high value. Default is 256KB.
2. Stop unnecessary applications to free the resources.
3. If you are running multiple sessions, make sure they write to different disks.
4. Do not export to NFS (Network File System). Exporting to local disk is faster.
5. Set the RECORDLENGTH parameter to a high value.
6. Use DIRECT=yes (direct mode export).

How to improve imp performance?

1. Place the file to be imported in separate disk from datafiles.
2. Increase the DB_CACHE_SIZE.
3. Set LOG_BUFFER to big size.
4. Stop redolog archiving, if possible.
5. Use COMMIT=n, if possible.
6. Set the BUFFER parameter to a high value. Default is 256KB.
7. Drop indexes before importing to speed up the import, or set INDEXES=N and build the indexes after the import. Indexes can easily be re-created once the data has been imported successfully.
8. Use STATISTICS=NONE
9. Disable the INSERT triggers, as they fire during import.
10. Set the parameter COMMIT_WRITE=NOWAIT (in Oracle 10g) or COMMIT_WAIT=NOWAIT (in Oracle 11g) during import.

What are the datapump export modes?

You can perform Data Pump Export jobs in several modes:

Full export mode: You use the FULL parameter when you want to export the entire database in
one export session. You need the EXP_FULL_DATABASE role (DATAPUMP_EXP_FULL_DATABASE from Oracle 11g onward) to use this mode.

Schema mode: If you want to export a single user’s data and/or objects only, you must use the
SCHEMAS parameter.

Tablespace mode: By using the TABLESPACES parameter, you can export all the tables in one or
more tablespaces. If you use the TRANSPORT_TABLESPACES parameter, you can export just the
metadata of the objects contained in one or more tablespaces. You may recall that you can
export tablespaces between databases by first exporting the metadata, copying the files of the
tablespace to the target server, and then importing the metadata into the target database.

Table mode: By using the TABLES parameter, you can export one or more tables. The TABLES
parameter is identical to the TABLES parameter in the old export utility.

What is COMPRESSION parameter in expdp?

The COMPRESSION parameter enables the user to specify which data to compress before writing the export data to a dump file. By default, all metadata is compressed before it’s written out to an export dump file. You can disable compression by specifying a value of NONE for the COMPRESSION parameter, as shown here:

$ expdp hr/hr DIRECTORY=dpump_dir1 DUMPFILE=hr_comp.dmp COMPRESSION=NONE

The COMPRESSION parameter can take any of the following four values:

ALL: Enables compression for the entire operation.

DATA_ONLY: Specifies that all data should be written to the dump file in a compressed format.

METADATA_ONLY: Specifies all metadata be written to the dump file in a compressed format.
This is the default value.

NONE: Disables compression of all types.

What are Export Filtering Parameters in expdp?

Data Pump contains several parameters related to export filtering. Some of them are substitutes for old export parameters; others offer new functionality.

CONTENT

By using the CONTENT parameter, you can filter what goes into the export dump file. The CONTENT parameter can take three values:

• ALL exports both table data and table and other object definitions (metadata).
• DATA_ONLY exports only table rows.
• METADATA_ONLY exports only metadata.

EXCLUDE and INCLUDE

The EXCLUDE and INCLUDE parameters are two mutually exclusive parameters that you can use to perform what is known as metadata filtering. Metadata filtering enables you to selectively leave out or include certain types of objects during a Data Pump Export or Import job. In the old export utility, you used the CONSTRAINTS, GRANTS, INDEXES, and TRIGGERS parameters to specify whether you wanted to export those objects. Using the EXCLUDE and INCLUDE parameters, you can now include or exclude many other kinds of objects besides the four you could filter previously. For example, if you don’t wish to export any packages during the export, you can specify this with the help of the EXCLUDE parameter.

QUERY

The QUERY parameter serves the same function as it does in the traditional export utility: it lets you selectively export table row data with the help of a SQL statement. The QUERY parameter permits you to qualify the SQL statement with a table name, so that it applies only to a particular table. Here’s an example:

QUERY=OE.ORDERS:"WHERE order_id > 100000"

In this example, only those rows in the orders table (owned by user OE) where the order_id is
greater than 100,000 are exported.
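
The filtering rules above can be checked before a job is launched. The following is a minimal sketch in Python, assuming you collect the parameters in a dictionary and render them as NAME=value lines suitable for a parameter file; the helper name build_expdp_parfile and the dump file name are hypothetical, and the sketch only validates the two rules stated in this section (the CONTENT values and the mutual exclusion of INCLUDE and EXCLUDE).

```python
def build_expdp_parfile(params):
    """Render Data Pump Export parameters as NAME=value lines.

    params: dict of parameter name -> value (names are case-insensitive).
    Raises ValueError if INCLUDE and EXCLUDE are both given, or if CONTENT
    is not one of the three documented values.
    """
    normalized = {name.upper(): value for name, value in params.items()}

    # INCLUDE and EXCLUDE are mutually exclusive within one job.
    if "INCLUDE" in normalized and "EXCLUDE" in normalized:
        raise ValueError("INCLUDE and EXCLUDE cannot be used in the same job")

    # CONTENT accepts only ALL, DATA_ONLY, or METADATA_ONLY.
    content = str(normalized.get("CONTENT", "ALL")).upper()
    if content not in {"ALL", "DATA_ONLY", "METADATA_ONLY"}:
        raise ValueError(f"unsupported CONTENT value: {content}")

    return "\n".join(f"{name}={value}" for name, value in normalized.items())


# Hypothetical job: export the OE schema without packages, filtering OE.ORDERS rows.
print(build_expdp_parfile({
    "DIRECTORY": "dpump_dir1",
    "DUMPFILE": "oe_orders.dmp",
    "SCHEMAS": "OE",
    "EXCLUDE": "PACKAGE",
    "QUERY": 'OE.ORDERS:"WHERE order_id > 100000"',
}))
```

Keeping the QUERY clause in a parameter file rather than on the command line also avoids having to escape the quotation marks for the operating system shell.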

What is Network Link Parameter and how it works?

The Data Pump Export utility provides a way to initiate a network export. Using the NETWORK_LINK parameter, you can initiate an export job from your server and have Data Pump export data from a remote database to dump files located on the instance from which you initiate the Data Pump Export job.

Here’s an example that shows you how to perform a network export:

$ expdp hr/hr DIRECTORY=dpump_dir1 NETWORK_LINK=finance DUMPFILE=network_export.dmp LOGFILE=network_export.log

In the example, the NETWORK_LINK parameter must have a valid database link as its value. This
means that you must have created the database link ahead of time. This example is exporting data from the finance database on the prod1 server.

Let’s say you have two databases, called local and remote. In order to use the NETWORK_LINK parameter and pass data directly over the network, follow these steps:

1. On the local database, create a database link to the remote database, which is named remote in this example:

SQL> CREATE DATABASE LINK remote
 CONNECT TO scott IDENTIFIED BY tiger
 USING 'remote.world';

2. If there isn’t one already, create a Data Pump directory object:

SQL> CREATE DIRECTORY remote_dir1 AS '/u01/app/oracle/dp_dir';

3. Set the new directory as your default directory, by exporting the directory value:

$ export DATA_PUMP_DIR=remote_dir1

4. Perform the network export of the database named remote, running expdp against the local database:

$ expdp system/sammyy1 SCHEMAS=SCOTT DUMPFILE=network.dmp NETWORK_LINK=remote

You’ll see that the Data Pump Export job creates the dump file network.dmp (in the directory location specified by remote_dir1) on the server hosting the database named local. However, the data within the dump file is extracted from the user scott’s schema in the remote database (named remote in our example). The NETWORK_LINK parameter carries the data over the network from the remote database into dump files on the local server. All you need is a database link from a database on the local server to the source database on the remote server.

What is use of INDEXFILE option in imp?

With INDEXFILE, imp writes the DDL of the objects in the dump file to the specified file; no data is imported.

What is use of IGNORE option in imp?

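With IGNORE=Y, imp ignores object creation errors (such as a table that already exists) and continues the import, loading rows into the existing objects. With the default IGNORE=N, such errors are reported and the object is skipped.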

What are the differences between expdp and exp (Data Pump or normal exp/imp)?

Data Pump is server centric (files will be at server).
Data Pump has APIs, from procedures we can run Data Pump jobs.
In Data Pump, we can stop and restart the jobs.
Data Pump will do parallel execution.
Tapes & pipes are not supported in Data Pump.
Data Pump consumes more undo tablespace.
Data Pump import will create the user, if user doesn’t exist.

Why expdp is faster than exp (or) why Data Pump is faster than conventional export/import?

Data Pump is block mode, exp is byte mode.
Data Pump will do parallel execution.
Data Pump uses direct path API.

How to improve expdp performance?

Using parallel option which increases worker threads. This should be set based on the number of cpus.

How to improve impdp performance?

Using parallel option which increases worker threads. This should be set based on the number of cpus.

In Data Pump, where the jobs info will be stored (or) if you restart a job in Data Pump, how it will know from where to resume?

Whenever a Data Pump export or import is running, Oracle creates a master table named after the job (JOB_NAME); it is dropped once the job completes successfully. From this table, Oracle finds out how much of the job has completed and from where to continue.
The default export job name is SYS_EXPORT_XXXX_01, where XXXX is the export mode, for example FULL, SCHEMA, or TABLE.
The default import job name is SYS_IMPORT_XXXX_01, where XXXX is the import mode, for example FULL, SCHEMA, or TABLE.
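
The naming pattern can be illustrated with a short Python sketch. It assumes, as a simplification, that the two-digit suffix is simply the first number not already used by another job of the same operation and mode; the function name and the set of existing job names are hypothetical.

```python
def default_job_name(operation, mode, existing_jobs=()):
    """Return a default-style Data Pump job name such as SYS_EXPORT_SCHEMA_01.

    operation: "EXPORT" or "IMPORT".
    mode: job mode, e.g. "FULL", "SCHEMA", or "TABLE".
    existing_jobs: names already in use (assumed input for this sketch).
    """
    operation = operation.upper()
    mode = mode.upper()
    if operation not in {"EXPORT", "IMPORT"}:
        raise ValueError("operation must be EXPORT or IMPORT")

    taken = set(existing_jobs)
    # Try suffixes 01..99 and return the first free name.
    for number in range(1, 100):
        name = f"SYS_{operation}_{mode}_{number:02d}"
        if name not in taken:
            return name
    raise RuntimeError("no free default job name")


print(default_job_name("export", "schema"))                            # SYS_EXPORT_SCHEMA_01
print(default_job_name("export", "schema", {"SYS_EXPORT_SCHEMA_01"}))  # SYS_EXPORT_SCHEMA_02
```

Because the master table carries the same name as the job, knowing the job name is what lets you attach to a stopped job and restart it.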

What is the order of importing objects in impdp?

 Tablespaces
 Users
 Roles
 Database links
 Sequences
 Directories
 Synonyms
 Types
 Tables/Partitions
 Views
 Comments
 Packages/Procedures/Functions
 Materialized views

How to import only metadata?

CONTENT=METADATA_ONLY

How to import into different user/tablespace/datafile/table?

REMAP_SCHEMA
REMAP_TABLESPACE
REMAP_DATAFILE
REMAP_TABLE
REMAP_DATA

Oracle DBA Interview Questions and Answers - Backup and Recovery

What is difference between Restoring and Recovery of database?

Restoring means copying the database files from the backup media to the location where they are needed, whereas recovery means applying redo (archived and online redo logs) to the restored files, rolling them forward, in order to bring the database to a consistent state.

What is the difference between complete and incomplete recovery?

An incomplete database recovery is a recovery that does not reach the point of failure. It stops at a chosen point in time, SCN, or archived log, typically because an archived log is missing or the online redo logs were lost. A complete recovery recovers to the point of failure, which is possible when all archived log backups are available.


How would you decide your backup strategy and timing for backup?

Backup strategy depends purely on your organization's business needs.
If no downtime is acceptable, the database must run in archivelog mode and you have to take frequent (for example, daily) backups.

If sufficient downtime is available and loss of data would not affect your business, you can run the database in noarchivelog mode and take backups infrequently, weekly or monthly.
In most organizations where no downtime is acceptable, this means frequent inconsistent (online) backups, typically daily; multiplexed online redo log files (multiple copies) in different locations; a database running in archivelog mode; and Data Guard for an extra bit of protection.

What is the benefit of running the DB in archivelog mode over no archivelog mode?

When a database is in noarchivelog mode, the contents of a redo log group are overwritten when the group is reused after a log switch, so that redo information is lost. To avoid this, the redo logs must be archived, which is achieved by running the database in archivelog mode.


If an oracle database is crashed? How would you recover that transaction which is not in backup?

If the database is in archivelog mode, we can recover the transaction by applying the archived and online redo logs on top of the restored backup; otherwise, a transaction that is not in the backup cannot be recovered.

What is the difference between HOTBACKUP and RMAN backup?

For a hot backup, we have to put the database in begin backup mode and then take the backup, whereas RMAN does not put the database or its tablespaces in backup mode. RMAN can also perform incremental (changed blocks only) backups, which are usually faster.

Can we use Same target database as Catalog database?

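No. The recovery catalog should not reside in the target database (the database to be backed up), because the catalog must be open to be used, while the target database has to be restored and recovered in the mounted state. If the target database were lost, the catalog would be lost with it.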

Incremental backup levels:
Level 0 – a backup of all used blocks that can serve as the base for subsequent incrementals
RMAN> backup incremental level 0 database;
Differential Level 1 – only the blocks that have changed since the most recent incremental backup (whether it is level 0 or level 1); this is the default for level 1
RMAN> backup incremental level 1 database;
Cumulative Level 1 – all changes since the last level 0 incremental backup
RMAN> backup incremental level 1 cumulative database;
A full backup cannot be used as the parent of a cumulative level 1 backup.
A cumulative level 1 backup must be taken on top of an incremental level 0 backup.



Why RMAN incremental backup fails even though full backup exists?

A full backup taken with the plain ‘Backup database’ command is physically identical to a level 0 backup. The only difference is that the level 0 backup is recorded as an incremental backup in the RMAN repository, so it can be used as the parent for a level 1 backup. A full backup that was not taken as level 0 cannot serve as the parent of a level 1 backup.


Can we perform RMAN level 1 backup without level 0?

If no level 0 is available, then the behavior depends upon the compatibility mode setting (Oracle version).
If the compatibility mode is lower than 10.0.0, RMAN generates a level 0 backup of the file contents at the time of the backup.
If the compatibility mode is 10.0.0 or higher, RMAN copies all blocks changed since the file was created and stores the result as a level 1 backup.

How to put Manual/User managed backup in RMAN?

In case of a recovery catalog, you can register the backup by using the CATALOG command:
RMAN> CATALOG START WITH '/oracle/backup.ctl';



How to check RMAN version in oracle?

If you want to check the RMAN catalog version, run the query below from SQL*Plus while connected as the catalog owner:
SQL> SELECT * FROM rcver;

What happens actually in case of instance Recovery?

When an Oracle instance fails, Oracle performs instance recovery when the associated database is restarted. Instance recovery occurs in 2 steps:

Cache recovery: Changes made to a database are recorded in the database buffer cache and in the redo log files. Modified buffers are written to the data files later, not at commit time. If an Oracle instance fails before these changes are written to the data files, Oracle uses the online redo log files to reapply the lost changes when the database is restarted. This process is called cache recovery (rolling forward).

Transaction recovery: When a transaction modifies data, the before image of the modified data is stored in an undo segment, which is used to restore the original values if the transaction is rolled back. At the time of an instance failure, the database may have uncommitted transactions, and changes made by them may already have been saved in the data files. To keep the database consistent, Oracle rolls back all uncommitted transactions when the database is restarted, using the undo data stored in the undo segments. This process is called transaction recovery (rolling back).
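
The two phases can be illustrated with a toy model in Python. This is only a sketch of the idea, not how Oracle stores redo or undo: the record layout (transaction, block, before image, after image), the function name, and the assumption that no two transactions changed the same block are all simplifications made for the example.

```python
def instance_recovery(datafile, redo_log, committed):
    """Toy model of instance recovery.

    datafile: dict block -> value as found on disk after the crash.
    redo_log: ordered list of (txn, block, before_image, after_image).
    committed: set of transaction ids that committed before the crash.
    Assumes no two transactions modified the same block.
    """
    blocks = dict(datafile)

    # Phase 1, cache recovery (roll forward): reapply every logged change,
    # whether or not the transaction committed.
    for _txn, block, _before, after in redo_log:
        blocks[block] = after

    # Phase 2, transaction recovery (roll back): undo the changes of
    # uncommitted transactions, newest change first, using the before images.
    for txn, block, before, _after in reversed(redo_log):
        if txn not in committed:
            blocks[block] = before

    return blocks


# On disk at crash time: T1's change to A was not yet written,
# while uncommitted T2's change to B already was.
on_disk = {"A": 1, "B": 20, "C": 3}
redo = [("T1", "A", 1, 10), ("T2", "B", 2, 20), ("T1", "C", 3, 30)]
print(instance_recovery(on_disk, redo, committed={"T1"}))
# {'A': 10, 'B': 2, 'C': 30}
```

The committed changes of T1 reappear even though they never reached the data file, and the uncommitted change of T2 is removed even though it did.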

What is RMAN?

Recovery Manager (RMAN) is a utility that can manage your entire Oracle backup and recovery activities.

What is the difference between using recovery catalog and control file?

When new incarnation happens, the old backup information in control file will be lost. It will be preserved in recovery catalog.

In recovery catalog we can store scripts.

Recovery catalog is central and can have information of many databases.

How do you know that how much RMAN task has been completed?

By querying v$rman_status or v$session_longops

From where list & report commands will get input?

Both commands query V$ views and recovery catalog views, such as V$BACKUP_FILES, RC_DATAFILE_COPY, or RC_ARCHIVED_LOG.

Command to delete archive logs older than 7days?

RMAN> delete archivelog all completed before 'sysdate-7';
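
To see which logs such a command would select, the date arithmetic can be sketched in Python. The log names and completion times below are hypothetical; the sketch only mirrors the "completed more than 7 days ago" comparison and does not touch any files.

```python
from datetime import datetime, timedelta


def logs_completed_before(logs, now, days=7):
    """Return the names of logs whose completion time is older than now - days.

    logs: list of (name, completion_time) tuples (hypothetical inventory).
    now: the reference time, playing the role of SYSDATE.
    """
    cutoff = now - timedelta(days=days)
    return [name for name, completed in logs if completed < cutoff]


now = datetime(2024, 8, 14, 12, 0)
inventory = [
    ("arch_0101.arc", datetime(2024, 8, 1, 9, 0)),    # 13 days old
    ("arch_0102.arc", datetime(2024, 8, 7, 11, 0)),   # just over 7 days old
    ("arch_0103.arc", datetime(2024, 8, 10, 9, 0)),   # 4 days old
]
print(logs_completed_before(inventory, now))
# ['arch_0101.arc', 'arch_0102.arc']
```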

How many times does oracle ask before dropping a catalog?

The default is two times: once for the actual command and once for confirmation.

How to view the current defaults for the database.

RMAN> show all;

What is the use of crosscheck command in RMAN?

Crosscheck checks whether the repository information is in sync with what actually exists at the OS level. The command only updates repository records with the status of the backups.

For example, if a user removes archived logs from disk with an operating system command, the repository still indicates that the logs are on disk, when in fact they are not.

What are the differences between crosscheck and validate commands?

Validate command is to examine a backup set and report whether it can be restored. RMAN scans all of the backup pieces in the specified backup sets and looks at the checksum to verify that the contents are intact so that backup can be successfully restored if necessary.

Crosscheck command is to verify the status of backups and copies recorded in the RMAN repository against media such as disk or tape. The crosscheck command only processes files created on the same device type as the channel running crosscheck.

Which one is good, differential (incremental) backup or cumulative (incremental) backup?

A differential backup, which backs up all blocks changed after the most recent incremental backup at level 1 or 0

RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;

A cumulative backup, which backs up all blocks changed after the most recent incremental backup at level 0

RMAN> BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;

Cumulative backups are preferable to differential backups when recovery time is more important than disk space, because during recovery each differential backup must be applied in succession. Use cumulative incremental backups instead of differential, if enough disk space is available to store cumulative incremental backups.

This is command for taking Level 0 backup.

RMAN> BACKUP INCREMENTAL LEVEL 0 DATABASE;
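
The trade-off between the two strategies can be made concrete with a small Python model. It is a sketch under simplifying assumptions: each day's changes are given as a set of block numbers, one level 1 backup is taken per day after a single level 0, and the function name is hypothetical.

```python
def plan_incrementals(daily_changes, cumulative):
    """Model level 1 backups taken after one level 0 backup.

    daily_changes: list of sets; entry i holds the blocks changed between
    backup i and backup i+1 (backup 0 is the level 0).
    cumulative: True for cumulative level 1, False for differential level 1.
    Returns (backups, to_apply): the block set stored in each level 1 backup,
    and how many level 1 backups recovery must apply on top of the level 0.
    """
    backups = []
    since_level0 = set()
    for changed in daily_changes:
        since_level0 |= changed
        # Cumulative: everything changed since level 0.
        # Differential: only what changed since the previous backup.
        backups.append(set(since_level0) if cumulative else set(changed))

    if not backups:
        to_apply = 0
    elif cumulative:
        to_apply = 1              # only the latest cumulative backup is needed
    else:
        to_apply = len(backups)   # every differential backup, in succession
    return backups, to_apply


changes = [{1, 2}, {2, 3}, {4}]
diff_backups, diff_apply = plan_incrementals(changes, cumulative=False)
cum_backups, cum_apply = plan_incrementals(changes, cumulative=True)
print([len(b) for b in diff_backups], diff_apply)  # [2, 2, 1] 3
print([len(b) for b in cum_backups], cum_apply)    # [2, 3, 4] 1
```

In this example the differential backups store 5 blocks in total but recovery applies three of them, while the cumulative backups store 9 blocks and recovery applies only the last one.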

What is the difference between backup set and backup piece?

Backup set is logical and backup piece is physical.

RMAN command to backup for creating standby database

RMAN> duplicate target database for standby;

You lose one datafile and the DB is running in ARCHIVELOG mode. You have a full database backup that is 1 week/day old and no backup of this (newly created) datafile. How do you restore/recover the file?

Re-create the datafile, then recover it by applying the archived redo generated since the file was created.

SQL> alter database create datafile '/u01/app/oracle/oradata/xyz.dbf';

RMAN> recover datafile <file_id>;

What is obsolete backup & expired backup?

A status of “expired” means that the backup piece or backup set is not found in the backup destination.

A status of “obsolete” means the backup piece is still available, but it is no longer needed. The backup piece is no longer needed since RMAN has been configured to no longer need this piece after so many days have elapsed, or so many backups have been performed.
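
The two statuses are independent of each other, which a small Python sketch can show. It assumes a simplified redundancy-based retention policy that counts only whole backups (real RMAN retention rules also consider archived logs and incremental backups); the function name and backup names are hypothetical.

```python
def classify_backups(backups, redundancy=1):
    """Flag each backup as expired and/or obsolete.

    backups: list of (name, completion_order, exists_on_media) tuples.
    redundancy: number of most recent backups the retention policy keeps.
    expired  -> a crosscheck could not find the piece on disk or tape.
    obsolete -> the backup is older than the newest `redundancy` backups.
    """
    newest_first = sorted(backups, key=lambda b: b[1], reverse=True)
    report = {}
    for rank, (name, _order, exists) in enumerate(newest_first):
        report[name] = {
            "expired": not exists,
            "obsolete": rank >= redundancy,
        }
    return report


status = classify_backups(
    [("bk_sun", 1, True), ("bk_mon", 2, False), ("bk_tue", 3, True)],
    redundancy=2,
)
for name in ("bk_sun", "bk_mon", "bk_tue"):
    print(name, status[name])
# bk_sun {'expired': False, 'obsolete': True}
# bk_mon {'expired': True, 'obsolete': False}
# bk_tue {'expired': False, 'obsolete': False}
```

Here bk_sun is still on media but no longer needed, while bk_mon is still needed by the policy but missing from media.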

What is the difference between hot backup & RMAN backup?

For hot backup, we have to put database in begin backup mode, then take backup.
RMAN won’t put database in backup mode.

How to put manual/user-managed backup in RMAN (recovery catalog)?

By using catalog command.

RMAN> CATALOG START WITH '/tmp/backup.ctl';

What are the Architectural components of RMAN?

RMAN Executables
Server process
Channels
Target database
Recovery catalog database (optional)
Media management Layer (optional)
Backups, backup sets and backup pieces

What are channels?

A channel is an RMAN server process started when there is a need to communicate with an I/O device, such as a disk or a tape. A channel is what reads and writes RMAN backup files. It is through the allocation of channels that you govern I/O characteristics:

Type of I/O device being read or written to, either a disk or an sbt_tape
Number of processes simultaneously accessing an I/O device
Maximum size of files created on I/O devices
Maximum rate at which database files are read
Maximum number of files open at a time

Why is the catalog optional?

Because RMAN manages backup and recovery operations, it requires a place to store necessary information about the database. RMAN always stores this information in the target database control file. You can also store RMAN metadata in a recovery catalog schema contained in a separate database. The recovery catalog schema must be stored in a database other than the target database.

What is a Backup set?

A logical grouping of backup files — the backup pieces — that are created when you issue an RMAN backup command. A backup set is RMAN’s name for a collection of files associated with a backup. A backup set is composed of one or more backup pieces.

What are the benefits of using RMAN?

Incremental backups that only copy data blocks that have changed since the last backup.
Tablespaces are not put in backup mode, thus there is no extra redo log generation during online backups.
Detection of corrupt blocks during backups.
Parallelization of I/O operations.
Automatic logging of all backup and recovery operations.
Built-in reporting and listing commands.

What are the various reports available with RMAN?

RMAN> list backup;

RMAN> list archivelog all;

In catalog database, if some of the blocks are corrupted due to system crash, How will you recover?

Use the RMAN BLOCKRECOVER command (RECOVER ... BLOCK from Oracle 11g onward).

How do you enable the autobackup for the controlfile using RMAN?

Issue command at RMAN prompt.

RMAN> configure controlfile autobackup on;

Also we can configure controlfile backup format.

RMAN> configure controlfile autobackup format for device type disk to

2> '$HOME/BACKUP/RMAN/%F.bkp';

How do you identify what are the all the target databases that are being backed-up with RMAN database?

There is no view in the target database that shows whether it is backed up. Connect to the target database with RMAN and run list backup, which gives the backup information with dates and times. If a recovery catalog is in use, its RC_DATABASE view lists the databases registered in it.

How do you identify the block corruption in RMAN database? How do you fix it?

The v$database_block_corruption view shows which blocks are corrupted. The corrupted blocks are repaired with:

RMAN> blockrecover datafile <fileid> block <blockid>;

First check whether any block is corrupted by using this query:

SQL> select file#, block# from v$database_block_corruption;

file# block#

2 507

The above block is corrupted.

Connect to RMAN and recover the block with this command:

RMAN> blockrecover datafile 2 block 507;

The above command recovers block 507.

Now verify by re-running the query above. To repair every block currently listed in v$database_block_corruption in one pass, use:

RMAN> blockrecover corruption list;

How do you clone the database using RMAN software? Give brief steps? When do you use crosscheck command?

Use crosscheck to check whether backup pieces, proxy copies, or disk copies still exist.

Two commands are available in RMAN to clone a database:

1) Duplicate

2) Restore.

List some of the RMAN catalog view names which contain the catalog information?

RC_DATABASE_INCARNATION

RC_BACKUP_COPY_DETAILS

RC_BACKUP_CORRUPTION

RC_BACKUP_DATAFILE_SUMMARY

How do you install the RMAN recovery catalog?

Steps to be followed:

1) Create a connect string (TNS alias) for the catalog database.

2) In the catalog database, create a new user or use an existing user and grant that user the RECOVERY_CATALOG_OWNER role.

3) Log in to RMAN with the connect string:

a) export ORACLE_SID=<target_sid>

b) rman target / catalog <catalog_user>/<password>@<connect_string>

4) RMAN> create catalog;

5) RMAN> register database;

What is the difference between physical and logical backups?

In Oracle, a logical backup is one taken using either traditional Export/Import or Data Pump, whereas a physical backup is a copy of the physical database files taken at the OS level.

What is RAID? What is RAID0? What is RAID1? What is RAID 10?

RAID: Redundant Array of Independent Disks

RAID0: Striping (no redundancy)

RAID1: Mirroring

RAID 10: Mirroring combined with striping (a stripe across mirrored pairs)

How to enable Fast Incremental Backup to backup only those data blocks that have changed?

SQL> ALTER DATABASE enable BLOCK CHANGE TRACKING;

How do you set the flash recovery area?

SQL> ALTER SYSTEM SET db_recovery_file_dest_size = 100G;

SQL> ALTER SYSTEM SET db_recovery_file_dest = '/u10/oradata/school';

What is auxiliary channel in RMAN? When do you need this?

An auxiliary channel is a link to auxiliary instance. If you do not have automatic channels configured, then before issuing the DUPLICATE command, manually allocate at least one auxiliary channel within the same RUN command.

How do you use the V$RECOVERY_FILE_DEST view to display information regarding the flash recovery area?

SQL> SELECT name, space_limit, space_used, space_reclaimable, number_of_files FROM v$recovery_file_dest;
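
The columns returned by this query are usually turned into percentages. A minimal Python sketch of that arithmetic follows; the function name and the sample figures are hypothetical, and all three inputs are assumed to be in the same unit.

```python
def fra_usage(space_limit, space_used, space_reclaimable):
    """Return (used_pct, effective_pct) for the flash recovery area.

    used_pct: share of the limit currently occupied.
    effective_pct: share still occupied after subtracting reclaimable space,
    i.e. space held by files that can be deleted when room is needed.
    """
    if space_limit <= 0:
        raise ValueError("space_limit must be positive")
    used_pct = 100.0 * space_used / space_limit
    effective_pct = 100.0 * (space_used - space_reclaimable) / space_limit
    return round(used_pct, 1), round(effective_pct, 1)


# Hypothetical figures in GB: 100 GB limit, 80 GB used, 30 GB reclaimable.
print(fra_usage(100, 80, 30))  # (80.0, 50.0)
```

The second figure is the more useful one for alerting, because reclaimable space is released automatically when the area comes under pressure.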

How can you display warning messages?

SQL> SELECT object_type, message_type, message_level, reason, suggested_action FROM dba_outstanding_alerts;

How do you backup the entire database?

RMAN> BACKUP DATABASE;

How do you backup an individual tablespaces?

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO DISK;

RMAN> BACKUP TABLESPACE system;

How do you backup datafiles and control files?

RMAN> BACKUP DATAFILE 3;

RMAN> BACKUP CURRENT CONTROLFILE;

Use a fast recovery without restoring all backups from their backup location to the location specified in the controlfile.

RMAN> SWITCH DATABASE TO COPY;

My Database has Level 1 backup, tell me what are all backed up ? with Example?


Database is UP and has taken Level 0 backup, is the backup taken is Consistent or Inconsistent?

How do you say a backup is Consistent or Inconsistent, Oracle Terminology?

Can we take backup when the Database is down?

If i have a RMAN full backup Level 0 of Sun @9PM, on Mon 9PM taken incremental Level 1 backup.What type of backup do you get and what is actually backedup?

If i have a RMAN full backup of Sun @9PM, on Mon 9PM taken incremental Level 1 backup.On Tuesday Database Crashed.What type of backup do you get and what is actually backedup?

There is no Backup available, Can we take a Level 1 backup?

A table got dropped between 9AM - 11AM how to get the Table backup using RMAN, 
DB size 500GB available mount point space for table recovery is 15GB?

Sys Admin has changed the time from 10:00 AM to 9:30 AM, table dropped, How do you recover the Table?

A DATAFILE is corrupted and there is no backup, How to recover the datafile?

All Controlfiles are corrupted, How to recover the controlfile?

Oracle DBA Interview Questions and Answers - Patching,Cloning and Upgrade

In which months oracle release CPU patches?

JAN, APR, JUL, OCT

When we applying single Patch, can you use opatch utility?

Yes, you can use OPatch in case of a single patch. The only type of patch that cannot be applied with OPatch is a patchset.

Is it possible to apply OPATCH without downtime?

To apply a patch, the database and listener running from that ORACLE_HOME must normally be down, because OPatch updates the current ORACLE_HOME. So for a single instance, patching with zero downtime is generally not possible. In RAC, however, you can apply the patch without overall downtime, because each node has its own ORACLE_HOME and its own instance (one instance running on each ORACLE_HOME), so the nodes can be patched one at a time.

When you moved oracle binary files from one ORACLE_HOME server to another server then which oracle utility will be used to make this new ORACLE_HOME usable?
 

Relink all.

You have collection of patch (nearly 100 patches) or patchset. How can you apply only one patch from it?

With napply itself (by providing the patch location and a specific patch id), you can apply only one patch from a collection of extracted patches. For more information, check opatch util NApply -help.

For Example:

opatch util napply <patch_location> -id 9 -skip_subset -skip_duplicate
This will apply only the patch id 9 from the patch location and will skip duplicate and subset of patch installed in your ORACLE_HOME.

If both CPU and PSU are available for given version which one, you will prefer to apply?

Once you apply a PSU, the recommended way is to apply only PSUs from then on. There is no need to apply a CPU on top of a PSU, as the PSU contains the CPU (applying a CPU over a PSU is treated as an attempt to roll back the PSU and requires more effort). So if you have not yet applied either type of patch, I suggest using PSU patches. For more details refer to: Oracle Products [ID 1430923.1], ID 1446582.1

PSU is superset of CPU then why someone choose to apply a CPU rather than a PSU?

CPUs are smaller and more focused than PSUs and mostly deal with security issues. Applying a CPU is theoretically the more conservative approach and can cause less trouble than a PSU, as it changes less code. Thus for anyone who is concerned only with security fixes and not functionality fixes, a CPU may be a good approach.

How to Download Patches, Patchset or Opatch from metalink?

If you are using latest support.oracle.com then after login to metalink Dashboard
- Click on "Patches & Updates" tab
- On the left sidebar click on "Latest Patchsets" under "Oracle Server/Tools".
- A new window will appear.
- Just mouseover on your product in the "Latest Oracle Server/Tools Patchsets" page.
- Corresponding oracle platform version will appear. Then simply choose the patchset version and click on that.
- You will go to the download page. From the download page you can also change your platform and patchset version.

REFERENCES:
http://docs.oracle.com/cd/E11857_01/em.111/e12255/e_oui_appendix.htm
Oracle® Universal Installer and OPatch User's Guide
11g Release 2 (11.2) for Windows and UNIX
Part Number E12255-11


What is the recent Patch applied?
  
January 2016 PSU patch

What is OPatch?

It is the utility to apply the patch.

How to Apply Opatch in Oracle?

1. You MUST read the Readme.txt file included in the patch and look for any prerequisite steps, post-installation steps, and database-related changes. Also make sure that you have the OPatch version required by this patch.
2. Make sure you have a good backup of the database.
3. Make a note of all invalid objects in the database prior to the patch.
4. Shut down all the Oracle processes running from that Oracle home, including the listener, database instance, management agent, and so on.
5. You MUST back up your Oracle home and inventory:
tar -cvf - $ORACLE_HOME $ORACLE_HOME/oraInventory | gzip > Backup_Software_Version.tar.gz
6. Unzip the patch in $ORACLE_HOME/patches.
7. cd to the patch directory and run opatch apply to apply the patch.
8. Read the output/log file to make sure there were no errors.

Patching Oracle Software with OPatch ?

opatch napply <patch_location> -skip_subset -skip_duplicate
OPatch skips duplicate patches and subset patches (patches under <patch_location> that are subsets of patches installed in the Oracle home).

What is OPatch in Oracle?

OPATCH Utility (Oracle RDBMS Patching)

1. Download the required patch from Metalink based on the OS bit version and DB version.
2. The database must be shut down before applying the patch.
3. Unzip the patch and apply it using the "opatch apply" command. When the patch is applied successfully you will see the message "OPatch succeeded." Cross-check that the patch is applied using the "opatch lsinventory" command.
4. Each patch has a unique ID; the command to roll back a patch is "opatch rollback -id <patch no.>". When the rollback succeeds you will see the message "OPatch succeeded." Cross-check that the patch is no longer listed using the "opatch lsinventory" command.
5. The patch file name format is "p<patch no.>_<db version>_<os>.zip".
6. We can check the OPatch version using the "opatch version" command.
7. Applying a single patch generally takes only a couple of minutes.
8. To get the latest OPatch version, download "patch 6880880 - latest opatch tool"; it contains the OPatch directory.
9. The contents of a downloaded patch are typically the "etc" and "files" directories and a README file.
10. The log files for the OPatch utility can be found in $ORACLE_HOME/cfgtoollogs/opatch.
11. OPatch also maintains an index of the commands executed with OPatch and the log files associated with them in the history.txt file located in the <ORACLE_HOME>/cfgtoollogs/opatch directory.
12. Starting with the 11.2.0.2 patch set, Oracle Database patch sets are full installations of the Oracle Database software. This means that you do not need to install Oracle Database 11g Release 2 (11.2.0.1) before installing Oracle Database 11g Release 2 (11.2.0.2).
13. Direct upgrade to Oracle 10g is only supported if your database is running one of the following releases: 8.0.6, 8.1.7, 9.0.1, or 9.2.0. If not, you will have to upgrade the database to one of these releases or use a different upgrade option (like export/import).
14. Direct upgrades to 11g are possible from existing databases with versions 9.2.0.4+, 10.1.0.2+ or 10.2.0.1+. Upgrades from other versions are supported only via intermediate upgrades to a supported upgrade version.
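
As an illustration of item 14, the direct-upgrade rule for 11g can be sketched as a small version check. This is only a sketch: the function names are hypothetical, and it assumes that "9.2.0.4+" means 9.2.0.4 or any later 9.2 release (and likewise for 10.1 and 10.2). Always confirm the supported path in the upgrade guide for your release.

```python
def parse_version(text):
    """Turn a dotted version string such as '10.2.0.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# (inclusive lower bound, exclusive upper bound) for each release family.
# Assumption: "x.y.z.w+" means that release or later within the same x.y family.
_DIRECT_11G_RANGES = (
    ((9, 2, 0, 4), (9, 3)),
    ((10, 1, 0, 2), (10, 2)),
    ((10, 2, 0, 1), (10, 3)),
)


def can_upgrade_directly_to_11g(version):
    """Return True if the source version falls in one of the ranges above."""
    v = parse_version(version)
    return any(low <= v < high for low, high in _DIRECT_11G_RANGES)


if __name__ == "__main__":
    for source in ("8.1.7.4", "9.2.0.3", "9.2.0.8", "10.1.0.5", "10.2.0.4"):
        print(source, can_upgrade_directly_to_11g(source))
```

A version that returns False here would need an intermediate upgrade first, as described in item 14.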

http://avdeo.com/2008/08/19/opatch-utility-oracle-rdbms-patching/

Oracle version 10.2.0.4.0 what does each number refers to?

Oracle version number refers:
10 – Major database release number
 2 – Database Maintenance release number
 0 – Application server release number
 4 – Component Specific release number
 0 – Platform specific release number
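
The breakdown above can be sketched in a few lines of Python. The class and function names are hypothetical and exist only to label the five positions; this is not an Oracle API.

```python
from typing import NamedTuple


class OracleVersion(NamedTuple):
    """The five fields of a version string, in the order listed above."""
    major: int
    maintenance: int
    app_server: int
    component: int
    platform: int


def parse_oracle_version(text):
    """Split a string such as '10.2.0.4.0' into its five numeric fields."""
    parts = text.strip().split(".")
    if len(parts) != 5:
        raise ValueError("expected five dot-separated fields, got %r" % text)
    return OracleVersion(*(int(part) for part in parts))


if __name__ == "__main__":
    version = parse_oracle_version("10.2.0.4.0")
    print("major release:", version.major)
    print("maintenance release:", version.maintenance)
    print("component-specific release:", version.component)
```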

Types of Patches?

How to rollback a patch?

What is PSU?

What is Rolling Patch?

How to check installed Patches?

How much time will it take for Patching?

Common issues faced in Patching?


Cloning
=======
What is Cloning?

How to do take RMAN Cloning? Explain Steps?

Upgrade
=======

What is rolling upgrade?

It is a new ASM feature from Database 11g. ASM instances in Oracle Database 11g (from 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability. During a rolling upgrade we can maintain a functional cluster while one or more of the nodes in the cluster are running different software versions. Rolling upgrade can be used only for Oracle Database 11g releases (from 11.1).

Steps to Upgrade in Oracle ?

Manual upgrade which involves the following steps:
1.Backup the database.
2.In UNIX/Linux environments, set the $ORACLE_HOME and $PATH variables to point to the new 11g Oracle home.
3.Analyze the existing instance using the "$ORACLE_HOME/rdbms/admin/utlu111i.sql" script.
4.Start the original database using the STARTUP UPGRADE command and proceed with the upgrade by running the "$ORACLE_HOME/rdbms/admin/catupgrd.sql" script.
5.Recompile invalid objects.
6.Restart the database.
7.Run the "$ORACLE_HOME/rdbms/admin/utlu111s.sql" script and check the result of the upgrade.
8.Troubleshoot any issues or abort the upgrade.

What happens when you give "STARTUP UPGRADE"?

$sqlplus "/as sysdba"
SQL> STARTUP UPGRADE

Note:
----
The UPGRADE keyword enables you to open a database based on an earlier Oracle Database release. It also restricts logons to AS SYSDBA sessions, disables system triggers, and performs additional operations that prepare the environment for the upgrade.

You might be required to use the PFILE option to specify the location of your initialization parameter file.
Once the database is started in upgrade mode, only queries on fixed views execute without errors until after the catupgrd.sql script is run. Before running catupgrd.sql, queries on any other view or the use of PL/SQL returns an error.

What is the difference between startup Upgrade and Migrate ?

startup migrate:
---------------
Used to upgrade a database till 9i.

Startup Upgrade
---------------
From 10g onwards, STARTUP UPGRADE is used to upgrade a database.

What happens internally when you use startup upgrade/migrate?

It automatically adjusts a few database (init) parameters to certain values (irrespective of what you have defined) so that the upgrade scripts run smoothly.
In other words, it issues a few ALTER statements to set the parameters that are required for the upgrade scripts to complete without any issues.


Common issues faced in Upgrade?

Error is related to timezone file
Started database in upgrade mode and fired catupgrd.sql :

SQL> startup upgrade
ORACLE instance started.
Total System Global Area 6413680640 bytes
Fixed Size                  2160112 bytes
Variable Size            1946159632 bytes
Database Buffers         4429185024 bytes
Redo Buffers               36175872 bytes
Database mounted.
Database opened.
SQL> @catupgrd.sql
DOC>#######################################################################
DOC>#######################################################################
DOC>
DOC>   The first time this script is run, there should be no error messages
DOC>   generated; all normal upgrade error messages are suppressed.
DOC>
DOC>   If this script is being re-run after correcting some problem, then
DOC>   expect the following error which is not automatically suppressed:
DOC>
DOC>   ORA-00001: unique constraint () violated
DOC>#
   FROM registry$database
        *
ERROR at line 2:
ORA-00942: table or view does not exist
This error is related to the timezone file, which must be version 4 for Oracle 11g. If the timezone file is not version 4, then a patch needs to be applied.
The query to check the timezone file version is:
SQL> select * from v$timezone_file;
FILENAME        VERSION
------------ ----------
timezlrg.dat          4
So I had the correct version. I remember applying the patch before the upgrade. I got lucky because a patch existed for version 10.2.0.3.
If there is no patch for your Oracle version, then a patch for a similar version can be downloaded and applied manually.
Instructions are below:
1. Download the identified patch.
2. Unzip the patch, and locate the 2 files timezone.dat and timezlrg.dat in the "files/oracore/zoneinfo" directory of the uncompressed patch (or from the relevant .jar file of a patchset). If there is also a readme.txt in this location then make a note of this as well.
3. Backup your existing files in $ORACLE_HOME/oracore/zoneinfo – THIS CAN BE VITAL, DO NOT SKIP.
Note:
Before going on with step 4, make sure the current files are not in use.
On Windows the files will simply refuse to be removed when they are in use.
On Unix replacing the files whilst they are in use can cause the files to become corrupt. Use the fuser command before replacing the files to make sure they are not in use.
4. Copy the 2 .dat files and possibly the readme.txt file that were found in step 2 into the $ORACLE_HOME/oracore/zoneinfo directory.
5. Restart the database (in case of installation on a database), or restart the client applications (in case of client install). Note that the database did not need to be down before the time zone files were applied, but it does need to be restarted afterwards.


Oracle DBA Interview Questions and Answers - RAC

What is RAC?

RAC stands for Real Application cluster.

It is a clustering solution from Oracle Corporation that ensures high availability of databases by providing instance failover, media failover features.

Oracle RAC is a cluster database with a shared cache architecture that overcomes the limitations of traditional shared-nothing and shared-disk approaches to provide a highly scalable and available database solution for all the business applications.

Oracle RAC provides the foundation for enterprise grid computing.

Why do we have to create odd number of voting disk?

As far as voting disks are concerned, a node must be able to access strictly more than half of the voting disks at any time. So if you want to tolerate the failure of n voting disks, you must have at least 2n+1 configured (n=1 means 3 voting disks). You can configure up to 32 voting disks, providing protection against 15 simultaneous disk failures.
Oracle recommends that customers use 3 or more voting disks in Oracle RAC 10g Release 2. Note: for best availability, the 3 voting files should be on physically separate disks. An odd number is recommended because 4 disks are no more highly available than 3: a node needs more than half of the disks, so it needs 2 of 3 or 3 of 4, and in both cases the cluster fails once 2 disks are lost.
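
The arithmetic behind the "strictly more than half" rule can be sketched as follows. The function names are hypothetical; the code only restates the rule above and does not query a real cluster.

```python
def min_accessible(total_disks):
    """Smallest number of voting disks a node must reach: strictly more than half."""
    if total_disks < 1:
        raise ValueError("at least one voting disk is required")
    return total_disks // 2 + 1


def tolerated_failures(total_disks):
    """How many voting disks can be lost before the majority is gone."""
    return total_disks - min_accessible(total_disks)


if __name__ == "__main__":
    for n in (1, 3, 4, 5, 32):
        print(n, "disks: need", min_accessible(n),
              "accessible, tolerate", tolerated_failures(n), "failures")
```

Running this for 3 and 4 disks gives the same tolerance (one failure), which is why the fourth disk adds nothing.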

Does the cluster actually check for the vote count before node eviction? If yes, could you explain this process briefly?

Yes. If you lose half or more of all of your voting disks, then nodes get evicted from the cluster, or nodes kick themselves out of the cluster.


How does OCSSD start first if the voting disk and OCR reside in ASM disk groups?

You might wonder how CSSD, which is required to start the clustered ASM instance, can be started if voting disks are stored in ASM?

This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance.
To solve this problem the ASM disk headers have new metadata in 11.2:
you can use kfed to read the header of an ASM disk containing a voting disk.
The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file. This does not require the ASM instance to be up.
Once the voting disks are located, CSS can access them and joins the cluster.

What is gsdctl in RAC? list gsdctl commands in Oracle RAC?

GSDCTL stands for Global Service Daemon Control, we can use gsdctl commands to start, stop, and obtain the status of the GSD service on any platform.

The options for gsdctl are:-
$ gsdctl start -- To start the GSD service
$ gsdctl stop  -- To stop the GSD service
$ gsdctl stat  -- To obtain the status of the GSD service

Log file location for gsdctl:
$ORACLE_HOME/srvm/log/gsdaemon_node_name.log

What is Oracle RAC One Node?

Oracle RAC One Node is a single instance running on one node of the cluster while the second node is in cold standby mode. If the instance fails for some reason, RAC One Node detects it and restarts the instance on the same node, or relocates the instance to the second node if there is a failure or fault on the first node. The benefit of this feature is that it provides a cold failover solution and automates instance relocation without any downtime and without manual intervention. Oracle introduced this feature with the release of 11gR2 (available with Enterprise Edition).

What is RAC and how is it different from non RAC databases?

Oracle Real Application clusters allows multiple instances to access a single database, the instances will be running on multiple nodes.
In Real Application Clusters environments, all nodes concurrently execute transactions against the same database.
Real Application Clusters coordinates each node's access to the shared data to provide consistency and integrity.

What are the advantages of RAC (Real Application Clusters)?

Reliability - if one node fails, the database won't fail
Availability - nodes can be added or replaced without having to shutdown the database
Scalability - more nodes can be added to the cluster as the workload increases

What is Cache Fusion?

Oracle RAC is composed of two or more instances. When an instance in the cluster has read a data block from a datafile and another instance needs the same block, it is quicker to ship the block image from the SGA of the instance that holds it than to read it from disk. Oracle RAC uses the interconnect for this inter-instance communication. The Global Enqueue Service (GES) monitor and the instance enqueue process manage Cache Fusion.

What command would you use to check the availability of the RAC system?

crs_stat -t -v (-t -v are optional)

How do we verify that RAC instances are running?

SQL>select * from V$ACTIVE_INSTANCES;
The query gives the instance number in the INST_NUMBER column and host:instance_name in the INST_NAME column.

How can you connect to a specific node in a RAC environment?

In tnsnames.ora, ensure that INSTANCE_NAME is specified in the connect descriptor you use.

Which is the "MASTER NODE" in RAC?

The node with the lowest node number will become master node and dynamic remastering of the resources will take place.

To find out the master node for particular resource, you can query v$ges_resource for MASTER_NODE column.

To find out which node is the master node, look in the ocssd.log file and search for "master node number".
When the first master node fails, the surviving node with the lowest node number becomes the master node.

What components in RAC must reside in shared storage?

All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared storage.

Give few examples for solutions that support cluster storage?

·ASM (automatic storage management),
·Raw disk devices,
·Network file system (NFS),
·OCFS2 and
·OCFS (Oracle Cluster File System).

What are Oracle Cluster Components?

1.Cluster Interconnect (HAIP)
2.Shared Storage (OCR/Voting Disk)
3.Clusterware software
4.Oracle Kernel Components

What are Oracle RAC Components?

VIP, Node apps etc.

What are Oracle Kernel Components?

Basically, the Oracle kernel needs to be relinked with the RAC option switched on when you convert to RAC. That is the difference, as it enables RAC background processes such as LMON, LCK, LMD, and LMS.

How to turn on RAC?

# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle

Disk architecture in RAC?

SAN (Storage Area Networks) - generally using fibre to connect to the SAN
NAS (Network Attached Storage) - generally using a network to connect to the NAS using either NFS or iSCSI

What is Oracle Clusterware?

The Clusterware software allows nodes to communicate with each other and forms the cluster that makes the nodes work as a single logical server.
The software is run by the Cluster Ready Services (CRS) using the Oracle Cluster Registry (OCR) that records and maintains the cluster and node membership information and the voting disk which acts as a tiebreaker during communication failures. Consistent heartbeat information travels across the interconnect to the voting disk when the cluster is running.

Real Application Clusters
Oracle RAC is a cluster database with a shared cache architecture that overcomes the limitations of traditional shared-nothing and shared-disk approaches to provide a highly scalable and available database solution for all your business applications. Oracle RAC provides the foundation for enterprise grid computing.

Oracle’s Real Application Clusters (RAC) option supports the transparent deployment of a single database across a cluster of servers, providing fault tolerance from hardware failures or planned outages. Oracle RAC running on clusters provides Oracle’s highest level of capability in terms of availability, scalability, and low-cost computing.

One database is opened by multiple instances, so the database remains highly available if an instance crashes.
Cluster software: Oracle Clusterware or products like Veritas Volume Manager are required to provide the cluster support and allow each node to know which nodes belong to the cluster and are available. Oracle Clusterware also determines which nodes have failed and ejects them from the cluster, so that errors on those nodes can be cleared.

Oracle Clusterware has two key components Cluster Registry OCR and Voting Disk.

The cluster registry holds all information about nodes, instances, services, and ASM storage (if used). It also contains state information, i.e. whether they are available and up.

The voting disk is used to determine if a node has failed, i.e. become separated from the majority. If a node is deemed to no longer belong to the majority, it is forcibly rebooted and will, after the reboot, add itself again to the surviving cluster nodes.

What are the Oracle Clusterware key components?

Oracle Clusterware has two key components Cluster Registry OCR and Voting Disk.

What is Voting Disk and OCR?

Voting Disk:
Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
A node must be able to access more than half of the voting disks at any time.
For example, if you have 3 voting disks configured, then a node must be able to access at least two of the voting disks at any time. If a node cannot access the minimum required number of voting disks it is evicted, or removed, from the cluster.

Oracle Cluster Registry (OCR) 
The cluster registry holds all information about nodes, instances, services, and ASM storage (if used). It also contains state information, i.e. whether they are available and up.
The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

What are the administrative tasks involved with voting disk?

Following administrative tasks are performed with the voting disk :
1) Backing up voting disks
2) Recovering Voting disks
3) Adding voting disks
4) Deleting voting disks
5) Moving voting disks

Can you add voting disk online? Do you need voting disk backup?

Yes. As per the documentation, if you have multiple voting disks you can add one online. If you have only one voting disk and it is lost, the cluster will be down; in that case you need to start CRS in exclusive mode and add the voting disk using:
crsctl add css votedisk <path>

What is the Oracle Recommendation for backing up voting disk?

Oracle recommends us to use the dd command to backup the voting disk with a minimum block size of 4KB.

How do we backup voting disks?

1) Oracle recommends that you back up your voting disk after the initial cluster creation and after we complete any node addition or deletion procedures.
2) First, as the root user, stop Oracle Clusterware (with the crsctl stop crs command) on all nodes. Then determine the current voting disk by issuing the following command:
crsctl query css votedisk
3) Then issue the dd or ocopy command to back up a voting disk, as appropriate.
The syntax for backing up voting disks is as follows.
On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where
voting_disk_name is the name of the active voting disk
backup_file_name is the name of the file to which we want to back up the voting disk contents
On Windows systems, use the ocopy command:
ocopy voting_disk_name backup_file_name

How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command : ocrconfig -showbackup

You have lost OCR disk, what is your next step?

The cluster stack will be down because cssd is unable to maintain integrity; this is true in 10g. From 11gR2 onwards, the crsd stack will be down while ohasd remains up and running. You can add the OCR back by restoring the automatic backup or importing the manual backup.

What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.

GC CR request: the time it takes to retrieve the data from the remote cache.
Reason: RAC traffic using a slow connection, or inefficient queries (poorly tuned queries increase the number of data blocks requested by an Oracle session; the more blocks requested, the more often a block will need to be read from a remote instance via the interconnect).

GC BUFFER BUSY: It is the time the remote instance locally spends accessing the requested data block.

What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR Report? 

This is most likely due to a fault in interconnect network.
Check netstat -s
If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.
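
The check can be sketched as a small filter over captured netstat -s output. The function name and the sample text are hypothetical; the exact wording of the counters varies by operating system, so the pattern below is an assumption that may need adjusting.

```python
import re

# Assumption: counter lines contain "fragments dropped" or
# "packet reassembl... failed" (spelling differs between platforms).
_PATTERN = re.compile(r"fragments dropped|packet reassembl\w* failed", re.IGNORECASE)


def suspicious_lines(netstat_output):
    """Return the stripped lines of `netstat -s` output that match the pattern."""
    return [line.strip()
            for line in netstat_output.splitlines()
            if _PATTERN.search(line)]


# Hypothetical sample, made up for illustration only.
SAMPLE = """Ip:
    1200 total packets received
    3 fragments dropped after timeout
    7 packet reassembles failed
"""

if __name__ == "__main__":
    for line in suspicious_lines(SAMPLE):
        print(line)
```

Comparing the counters from two runs a few minutes apart shows whether they are still increasing.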

How do you troubleshoot node reboot?

Please check metalink ...
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions.

Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however sqlplus can start it on both nodes? How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl. You will then get a detailed error stack.

What are Oracle Clusterware processes for 10g on Unix and Linux?

Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user

Event manager daemon (evmd) —A background process that publishes events that crs creates.

Process Monitor Daemon (OPROCD) —This process monitor the cluster and provide I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.

RACG (racgmain, racgimon) —Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

What are Oracle database background processes specific to RAC?

Oracle RAC is composed of two or more database instances. They are composed of memory structures and background processes, the same as a single-instance database. Oracle RAC instances use two processes, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances are composed of the following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the status of each datafile and each cached block using the Global Resource Directory. This process is referred to as Cache Fusion and helps ensure data integrity.

What is ACMS?

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally committed on success or globally aborted in the event of a failure.

What is SCAN listener?

A SCAN listener is an additional listener, alongside the node listeners, that listens for incoming database connection requests from clients that arrive through the SCAN IP. It has endpoints configured to the node listeners, and it routes each connection request to a particular node listener.

SCAN IP can be disabled if not required. However SCAN IP is mandatory during the RAC installation. Enabling/disabling SCAN IP is mostly used in oracle apps environment by the concurrent manager (kind of job scheduler in oracle apps).
Steps to disable the SCAN IP,
i.  Do not use SCAN IP at the client end.
ii. Stop scan listener
    srvctl stop scan_listener
iii.Stop scan
    srvctl stop scan (this will stop the scan vip's)
iv. Disable scan and disable scan listener
    srvctl disable scan
    srvctl disable scan_listener

What are the different network components in 10g RAC?

Public, private, and VIP components.
The private interface is for inter-node communication.
VIP is all about availability of the application. When a node fails, the VIP component fails over to some other node. This is why all applications should connect through VIPs, meaning TNS entries should list the VIP in the host list.

What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch/multiple switches that only the nodes in the cluster can access.

What is the use of cluster interconnect?
Cluster interconnect is used by the Cache fusion for inter instance communication.

How can we configure the cluster interconnect?

· Configure User Datagram Protocol (UDP) on Gigabit Ethernet for cluster interconnects.
· On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram Sockets) protocols.
· Windows clusters use the TCP protocol.

What is the purpose of Private Interconnect?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead of the standard public IP address. To configure VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node on which the VIP address can accept TCP connections but it cannot accept Oracle connections.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to some other node and new node re-arps the world indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

Give situations under which VIP address failover happens?

VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid connection refused error. They don't have to wait for TCP connection timeout messages.

What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services enable us to define rules and characteristics to control how users and applications connect to database instances.

What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing, failover options, and high availability.

What enables the load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.

What are the types of connection load-balancing?

Connection workload management is one of the key aspects of running RAC instances, since you want to distribute connections to specific nodes/instances or to those with less load.
There are two types of connection load-balancing:
1.Client Side load balancing (also called as connect time load balancing)
2.Server side load balancing (also called as Listener connection load balancing)

What is the difference between server-side and client-side connection load balancing?

Client-side load balancing happens at the client, which chooses among the listeners listed in its connect descriptor. In server-side load balancing, the listener uses a load-balancing advisory to redirect connections to the instance providing the best service.

Client Side load balancing:- Oracle's client-side load balancing feature enables clients to randomize their connection requests among all the available listeners.

A TNS entry that contains entries for all nodes and uses load_balance=on (on by default) will use connect-time, or client-side, load balancing.

Sample Client Side TNS Entry:-

    finance =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
        (ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
        (ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
        (LOAD_BALANCE = yes)
        (FAILOVER = ON)
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = FINANCE)
          (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5))
        )
      )
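The effect of LOAD_BALANCE on the client can be illustrated with a short Python sketch. This is not Oracle Net code; it is a simplified model that assumes the client simply shuffles the address list when load balancing is on and walks it in written order when it is off. The function name `connection_order` is hypothetical.

```python
import random

# Listener addresses copied from the sample TNS entry above.
ADDRESSES = [
    ("myrac2-vip", 2042),
    ("myrac1-vip", 2042),
    ("myrac3-vip", 2042),
]


def connection_order(addresses, load_balance=True, rng=random):
    """Return the order in which a client would try the listener addresses.

    Simplified model (an assumption, not Oracle Net internals):
    with load balancing on, the list is shuffled; with it off,
    the addresses are tried in the order they are written.
    """
    order = list(addresses)  # copy so the caller's list is not modified
    if load_balance:
        rng.shuffle(order)
    return order


if __name__ == "__main__":
    # The first element is the listener the client would contact first.
    print(connection_order(ADDRESSES))
```

Because the order is random, many clients using the same entry spread their connection requests across the three listeners, but no load information is involved.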

Server-side load balancing: This improves connection performance by balancing the number of active connections among multiple instances and dispatchers. In a single-instance environment (shared servers), the listener selects the least loaded dispatcher to handle the incoming client request. In a RAC environment, PMON (LREG from 12c onwards) registers each instance's load and dispatcher information with the listeners, and the listener uses that information to redirect the connection to the least loaded node.

In a RAC environment, the *.remote_listener parameter, which points to a TNS entry containing the addresses of all nodes, must be set so that each instance registers its load information with the listeners on the other nodes.

Sample parameter settings for an instance in the RAC cluster:

    local_listener=LISTENER_MYRAC1
    remote_listener = LISTENERS_MYRACDB

What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single system image using the following tools:
·       OEM (Enterprise Manager),
·       SQL*PLUS,
·       Server control (SRVCTL),
·       Cluster Verification Utility (CLUVFY),
·       DBCA,
·       NETCA

Name some Oracle Clusterware tools and their uses?

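· OIFCFG - Allocates and deallocates network interfaces, and identifies the interconnect being used.
· OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry.
· OCRDUMP - Dumps the contents of the Oracle Cluster Registry to a file for inspection.
· CVU - Cluster Verification Utility, which verifies that cluster components are configured correctly.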

What is the difference between CRSCTL and SRVCTL?

crsctl manages clusterware-related operations:
    Starting and stopping Oracle Clusterware
    Enabling and disabling Oracle Clusterware daemons
    Registering cluster resources

srvctl manages Oracle resource–related operations:
    Starting and stopping database instances and services
    From 11gR2 onwards, it also manages cluster resources such as network, VIP, and disks

How do we remove ASM from an Oracle RAC environment?

First stop and delete the database instance on the node, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?

Issue the following srvctl command, then check the resource status from the Clusterware home:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat

What are the modes of deleting instances from Oracle Real Application Clusters databases?

We can delete instances in silent mode or interactive mode using DBCA (Database Configuration Assistant).

What are the background processes that exist in 11gR2, and what do they do?

Process Name     Functionality
crsd     The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.
cssd     Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then the CSS processes interface with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service was formerly provided by the Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node.
diskmon     Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started.
evmd     Event Manager (EVM): A background process that publishes Oracle Clusterware events.
mdnsd     Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
gnsd     Oracle Grid Naming Service (GNS): A gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster.
ons     Oracle Notification Service (ONS): A publish-and-subscribe service for communicating Fast Application Notification (FAN) events.
oraagent     oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent     Oracle root agent (orarootagent): A specialized oraagent process that helps CRSD manage resources owned by root, such as the network and the Grid virtual IP address.
oclskd     Cluster kill daemon (oclskd): Handles instance/node eviction requests that have been escalated to CSS.
gipcd     Grid IPC daemon (gipcd): A helper daemon for the communications infrastructure.
ctssd     Cluster Time Synchronization daemon (ctssd): Manages time synchronization between nodes, rather than depending on NTP.

Under which user or owner does each process start?

Component                                      Name of the Process               Owner
Oracle High Availability Service               ohasd                             init, root
Cluster Ready Service (CRS)                    crsd                              root
Cluster Synchronization Service (CSS)          ocssd, cssdmonitor, cssdagent     grid owner
Event Manager (EVM)                            evmd, evmlogger                   grid owner
Cluster Time Synchronization Service (CTSS)    octssd                            root
Oracle Notification Service (ONS)              ons, eons                         grid owner
Oracle Agent                                   oraagent                          grid owner
Oracle Root Agent                              orarootagent                      root
Grid Naming Service (GNS)                      gnsd                              root
Grid Plug and Play (GPnP)                      gpnpd                             grid owner
Multicast domain name service (mDNS)           mdnsd                             grid owner

What is the major difference between 10g and 11g RAC?

There is not much difference between 10g and 11gR1 RAC, but there is a significant difference in 11gR2.

Up to 11gR1 (including 10g), the following were managed by Oracle CRS:
    Databases
    Instances
    Applications
    Node Monitoring
    Event Services
    High Availability

From 11gR2 onwards it is a complete HA stack, managing and providing the following resources like other cluster software such as VCS:
    Databases
    Instances
    Applications
    Cluster Management
    Node Management
    Event Services
    High Availability
    Network Management (provides DNS/GNS/MDNSD services on behalf of other traditional services), SCAN (Single Client Access Name), and HAIP
    Storage Management (with the help of ASM and the new ACFS filesystem)
    Time synchronization (rather than depending on traditional NTP)
    Removed the OS-dependent hang checker; this is now handled by Clusterware's own additional monitor process

What is hangcheck timer?

The hangcheck timer regularly checks the health of the system. If the system hangs or stops, the node is restarted automatically.
There are 2 key parameters for this module:
-> hangcheck_tick: this parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
-> hangcheck_margin: this defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node.
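How the two parameters combine can be sketched in a few lines of Python. The sketch assumes the commonly described rule that a node is reset once a hang lasts longer than the tick plus the margin; the 180-second margin is an example value chosen for illustration, not one stated above, and the function names are hypothetical.

```python
def max_tolerated_hang(hangcheck_tick, hangcheck_margin):
    """Longest hang (seconds) tolerated before a reset, assuming the
    rule: reset when hang time > hangcheck_tick + hangcheck_margin."""
    return hangcheck_tick + hangcheck_margin


def node_would_reset(hang_seconds, hangcheck_tick, hangcheck_margin):
    """True if a hang of the given length would trigger a node reset
    under the assumed rule."""
    return hang_seconds > max_tolerated_hang(hangcheck_tick, hangcheck_margin)


if __name__ == "__main__":
    tick = 30      # recommended check interval from the text above
    margin = 180   # example margin value (an assumption)
    print(max_tolerated_hang(tick, margin))
    print(node_would_reset(200, tick, margin))
    print(node_would_reset(240, tick, margin))
```

With these example values, a 200-second hang is tolerated and a 240-second hang is not.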

State the initialization parameters that must have same value for every instance in an Oracle RAC database?

Some initialization parameters are critical at database creation time and must have the same value on every instance. Their values must be specified in the SPFILE or PFILE for every instance. The parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT

What is RAC? What is the benefit of RAC over single instance database?

In Real Application Clusters environments, all nodes concurrently execute transactions against the same database. Real Application Clusters coordinates each node's access to the shared data to provide consistency and integrity.
Benefits:
Improve response time
Improve throughput
High availability
Transparency


Advantages of RAC (Real Application Clusters)

Reliability - if one node fails, the database won't fail
Availability - nodes can be added or replaced without having to shut down the database
Scalability - more nodes can be added to the cluster as the workload increases


What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead of the standard public IP address. To configure VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?

If a node fails, its VIP address fails over to another node. On that node the VIP accepts TCP connections, but it does not accept Oracle connections.
Situations in which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.
Using a virtual IP avoids the TCP/IP timeout problem because Oracle Notification Service maintains communication between the nodes and the listeners.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid connection-refused error. They do not have to wait for a TCP connection timeout.

What is voting disk?

The voting disk is a file that sits in the shared storage area and must be accessible by all nodes in the cluster. All nodes in the cluster register their heartbeat information in the voting disk to confirm that they are operational. If the heartbeat information of a node is not available in the voting disk, that node is evicted from the cluster. The CSS (Cluster Synchronization Service) daemon in the clusterware maintains the heartbeat of all nodes to the voting disk. When a node is not able to send its heartbeat to the voting disk, it reboots itself, which helps avoid the split-brain syndrome.

For high availability, Oracle recommends an odd number of voting disks, with a minimum of three.

The voting disk resides on shared storage and manages cluster membership. It is used to reassign cluster ownership between the nodes in case of failure.

The voting disk files are used by Oracle Clusterware to determine which nodes are currently members of the cluster. They are also used in concert with other cluster components such as CRS to maintain the cluster's integrity.

Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along with the OCR. Oracle Clusterware can access the OCR and the voting disks present in ASM even if the ASM instance is down. As a result CSS can continue to maintain the Oracle cluster even if the ASM instance has failed.

How many voting disks are you maintaining ?

http://www.toadworld.com/KNOWLEDGE/KnowledgeXpertforOracle/tabid/648/TopicID/RACR2ARC6/Default.aspx

With a normal-redundancy disk group, Oracle creates 3 voting disk files in ASM (1 with external redundancy, 5 with high redundancy).

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You should always configure an odd number of voting disks >= 3. This is because a node must be able to access more than half of the voting disks; losing half or more of them causes the cluster to fail.

You should plan on allocating 280MB for each voting disk file. For example, if you are using ASM and external redundancy then you will need to allocate 280MB of disk for the voting disk. If you are using ASM and normal redundancy you will need 560MB.

Why do we need to keep an odd number of voting disks?

A node must be able to access more than half of the voting disks to stay in the cluster. An even number adds no fault tolerance over the odd number below it: 3 disks survive the loss of 1, and so do 4. Oracle therefore expects an odd number of voting disks, at least 3, for redundancy.
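The majority rule behind this recommendation can be checked with a little arithmetic. The Python sketch below assumes only the rule stated above, that a node needs access to more than half of the voting disks; the function names are hypothetical.

```python
def required_votes(total_disks):
    """Smallest number of voting disks that is more than half of the total."""
    return total_disks // 2 + 1


def tolerated_failures(total_disks):
    """How many voting disks can be lost while a majority is still reachable."""
    return total_disks - required_votes(total_disks)


if __name__ == "__main__":
    for disks in (1, 2, 3, 4, 5):
        print(disks, required_votes(disks), tolerated_failures(disks))
```

The output shows that 3 and 4 disks both tolerate one failure, and only the fifth disk raises the tolerance to two, which is why an even count buys nothing.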


What are Oracle RAC software components?

Oracle RAC is composed of two or more database instances. Each instance is made up of memory structures and background processes, the same as a single-instance database. Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances also include the following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor

What is TAF?

TAF (Transparent Application Failover) is a configuration that allows session failover between different nodes of a RAC database cluster.
If a communication link failure occurs after a connection is established, the connection fails over to another active node. Any disrupted transactions are rolled back, and session properties and server-side program variables are lost. In some cases, if the statement executing at the time of the failover is a SELECT statement, that statement may be automatically re-executed on the new connection with the cursor positioned on the row on which it was positioned prior to the failover.

After an Oracle RAC node crashes, usually from a hardware failure, all new application connections are automatically rerouted to a specified backup node. The challenge in rerouting is handling work that was "in flight" at the exact moment of the crash: uncommitted transactions are rolled back and must be resubmitted by the application, while interrupted queries can resume on another server. Oracle's answer to application failover is an Oracle Net mechanism called Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.
The TAF architecture offers failover at either the statement level (TYPE = SELECT) or the session level (TYPE = SESSION).

What are the requirements for Oracle Clusterware?

1. External shared disk to store the Oracle Clusterware files (voting disk and Oracle Cluster Registry, OCR)
2. Two network cards on each Clusterware node (and three sets of IP addresses):
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for inter-node communication between RAC nodes, used by Clusterware and the RAC database)
IP address set 3 for the virtual IP (VIP) (used as the virtual IP address for client connections and for connection failover)
3. Storage option for OCR and voting disk - RAW, OCFS2 (Oracle Cluster File System), NFS, …..

How to find location of OCR file when CRS is down?

To find the location of the OCR (Oracle Cluster Registry):
When CRS is down:
Look in the “ocr.loc” file. Its location depends on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc
When CRS is UP:
Set ASM environment or CRS environment then run the below command:
ocrcheck
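When CRS is down, the lookup in ocr.loc can be scripted. The Python sketch below assumes the file holds simple key=value lines and that the OCR location is stored under a key named `ocrconfig_loc`; both the sample content and the `+DATA` value are assumptions for illustration, so check the file on your own system.

```python
# Sample content standing in for /etc/oracle/ocr.loc (an assumption;
# the real file on your system may differ).
SAMPLE_OCR_LOC = """\
ocrconfig_loc=+DATA
local_only=FALSE
"""


def parse_ocr_loc(text):
    """Parse key=value lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    settings = parse_ocr_loc(SAMPLE_OCR_LOC)
    # "ocrconfig_loc" is the assumed key for the OCR location.
    print(settings.get("ocrconfig_loc"))
```

To use it against a real node, read the file from the OS-specific path listed above and pass its text to `parse_ocr_loc`.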

In a 2-node RAC, how many NICs are you using?

2 network cards on each clusterware node
Network Card 1 (with IP address set 1) for public network
Network Card 2 (with IP address set 2) for private network (for inter node communication between rac nodes used by clusterware and rac database)

In a 2-node RAC, how many IPs are you using?

6, in 3 sets of IP addresses:
## eth1-Public:  2
## eth0-Private: 2
## VIP: 2

How to find IP information in RAC?

Check the /etc/hosts file, which looks like the example below:
# Do not remove the following line, or various programs
# that requires network functionality will fail.
127.0.0.1               localhost.localdomain   localhost
## Public Node names
192.168.10.11           node1-pub.hingu.net     node1-pub
192.168.10.22           node2-pub.hingu.net     node2-pub
## Private Network (Interconnect)
192.168.0.11            node1-prv               node1-prv
192.168.0.22            node2-prv               node2-prv
## Private Network (Network Area storage)
192.168.1.11            node1-nas               node1-nas
192.168.1.22            node2-nas               node2-nas
192.168.1.33            nas-server              nas-server
## Virtual IPs
192.168.10.111          node1-vip.hingu.net     node1-vip
192.168.10.222          node2-vip.hingu.net     node2-vip
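A hosts file like this can also be read programmatically to list the addresses by role. The Python sketch below is a minimal example that relies on the naming convention used in this sample (`-pub`, `-prv`, `-vip` suffixes), which is an assumption and not something Oracle enforces; the function names are hypothetical.

```python
# A shortened copy of the sample /etc/hosts entries shown above.
HOSTS_TEXT = """\
127.0.0.1       localhost.localdomain localhost
## Public Node names
192.168.10.11   node1-pub.hingu.net  node1-pub
192.168.10.22   node2-pub.hingu.net  node2-pub
## Private Network (Interconnect)
192.168.0.11    node1-prv            node1-prv
192.168.0.22    node2-prv            node2-prv
## Virtual IPs
192.168.10.111  node1-vip.hingu.net  node1-vip
192.168.10.222  node2-vip.hingu.net  node2-vip
"""


def parse_hosts(text):
    """Return a list of (ip, [names]) tuples, ignoring comments and blanks."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        entries.append((ip, names))
    return entries


def addresses_with_suffix(entries, suffix):
    """IPs whose host names end with the given suffix, e.g. '-vip'."""
    return [ip for ip, names in entries
            if any(name.endswith(suffix) for name in names)]


if __name__ == "__main__":
    entries = parse_hosts(HOSTS_TEXT)
    for role in ("-pub", "-prv", "-vip"):
        print(role, addresses_with_suffix(entries, role))
```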

What is the difference between the RAC IP addresses?

The public IP address is the normal IP address typically used by the DBA and SA to manage storage, system and database. "Public" here means the network that clients and administrators use to reach the node; it need not be an Internet-routable address.
The private IP address is used only for internal cluster processing such as Cache Fusion (the interconnect). It sits on a private network reserved for the cluster nodes.
The VIP is used by database applications to enable failover when a cluster node fails. Its purpose is to let client connections fail over to surviving nodes when there is a failure.


Can an application developer access the private IP?
No. The private IP address is used only for internal cluster processing such as Cache Fusion (the interconnect).
