Background Processes Specific to Oracle RAC
An Oracle RAC database has the same processes and memory structures as a single-instance Oracle database and additional process and memory structures that are specific to Oracle RAC. The global cache service and global enqueue service processes, and the global resource directory (GRD) collaborate to enable cache fusion. The Oracle RAC processes and their identifiers are as follows:
Process | Description |
---|---|
ACMS | Atomic Control File to Memory Service |
GTX[0-j] | Global Transaction Process |
LMON | Global Enqueue Service Monitor-Lock Monitor |
LMD | Global Enqueue Service Daemon |
LMS | Global Cache Service Process |
LCK0 | Instance Enqueue Process |
LMHB | Global Cache/Enqueue Service Heartbeat Monitor |
PING | Interconnect Latency Measurement Process |
RCBG | Result Cache Background Process |
RMSn | Oracle RAC Management Processes |
RSMN | Remote Slave Monitor |
Each node has its own background processes and memory structures. Beyond those of a single-instance database, additional processes manage the shared resources and maintain cache coherency across the nodes.
Cache coherency is the technique of keeping multiple copies of a buffer consistent between different Oracle instances on different nodes. Global cache management ensures that access to a master copy of a data block in one buffer cache is coordinated with the copy of the block in another buffer cache.
A typical sequence of operations is as follows:
- When instance A needs to modify a block of data, it reads the block from disk, but before reading it must inform the GCS (DLM). The GCS tracks the lock status of the data block by holding an exclusive lock on it on behalf of instance A.
- When instance B then wants to modify the same data block, it too must inform the GCS. The GCS asks instance A to release the lock, which ensures that instance B gets the latest version of the data block (including instance A's modifications), and then locks the block exclusively on instance B's behalf.
- At any one point in time, only one instance has the current copy of the block, thus keeping the integrity of the block.
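The handoff described above can be sketched as a toy model. This is only an illustration of the bookkeeping, not how Oracle implements GCS: the `ToyGCS` class, its method names and the block number 42 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGCS:
    """Toy stand-in for the GCS: one exclusive holder per block (hypothetical model)."""
    holder: dict = field(default_factory=dict)    # block id -> instance holding the lock
    contents: dict = field(default_factory=dict)  # block id -> list of applied changes
    log: list = field(default_factory=list)       # human-readable trace of events

    def acquire_exclusive(self, instance, block_id):
        """Give `instance` the exclusive lock and return a copy of the current block."""
        current = self.holder.get(block_id)
        if current is None:
            # Nobody holds the block yet, so the requester reads it from disk.
            self.log.append(f"{instance} reads block {block_id} from disk")
            self.contents.setdefault(block_id, [])
        elif current != instance:
            # Another instance holds the lock and is asked to release it.
            self.log.append(f"GCS asks {current} to release block {block_id}")
        self.holder[block_id] = instance
        return list(self.contents[block_id])

    def modify(self, instance, block_id, change):
        """Apply a change; only the current exclusive holder may do so."""
        if self.holder.get(block_id) != instance:
            raise PermissionError(f"{instance} does not hold block {block_id}")
        self.contents[block_id].append(change)


gcs = ToyGCS()
gcs.acquire_exclusive("A", 42)
gcs.modify("A", 42, "change by A")
seen_by_b = gcs.acquire_exclusive("B", 42)  # B sees A's modification
gcs.modify("B", 42, "change by B")
print(seen_by_b)
print(gcs.log)
```

Once instance B has acquired the lock, a further `gcs.modify("A", 42, ...)` raises `PermissionError`, which mirrors the rule that only one instance holds the current copy at any time.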
GCS maintains data coherency and coordination by tracking the lock status of every block that can be read or written by any node in the RAC.
GCS is an in-memory database that contains information about current locks on blocks and instances waiting to acquire locks. This is known as Parallel Cache Management (PCM).
The Global Resource Manager (GRM) helps to coordinate and communicate the lock requests from Oracle processes between instances in the RAC.
Each instance has a buffer cache in its SGA. To ensure that each RAC instance obtains the block it needs to satisfy a query or transaction, RAC uses two services, the GCS and the GES, which maintain records of the lock status of each data file and each cached block using the GRD.
So what is a resource? It is an identifiable entity: it has a name or a reference, and it can be an area in memory, a disk file or an abstract entity.
A resource can be owned or locked in various states (exclusive or shared). Any shared resource is lockable and if it is not shared no access conflict will occur.
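As a rough illustration of the shared and exclusive states, the sketch below encodes the usual compatibility rule: any number of holders can share a resource, but an exclusive holder excludes everyone else. The `Mode` enum and the `can_grant` helper are hypothetical names, and real Oracle lock modes are more fine-grained than these two.

```python
from enum import Enum


class Mode(Enum):
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


def compatible(held, requested):
    """Two locks can coexist only if both are shared."""
    return held is Mode.SHARED and requested is Mode.SHARED


def can_grant(held_modes, requested):
    """A request is grantable if it is compatible with every lock already held."""
    return all(compatible(held, requested) for held in held_modes)


print(can_grant([], Mode.EXCLUSIVE))                      # no holders, so no conflict
print(can_grant([Mode.SHARED, Mode.SHARED], Mode.SHARED))
print(can_grant([Mode.SHARED], Mode.EXCLUSIVE))
```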
A global resource is a resource that is visible to all the nodes within the cluster.
Data buffer cache blocks are the most obvious and most heavily used global resource; transaction enqueues and database data structures are other examples.
GCS handles data buffer cache blocks and GES handles all the non-data-block resources.
All caches in the SGA are either global or local: the dictionary and buffer caches are global, while the large pool and Java pool buffer caches are local.
Cache fusion reads a data block from another instance's buffer cache instead of getting the block from disk. It therefore moves current copies of data blocks between instances (which is why you need a fast private network), and GCS manages the block transfers between the instances.
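The lookup order behind cache fusion can be sketched as follows. This is a simplified model under stated assumptions: buffer caches and the disk are plain dictionaries, `read_block` is a hypothetical helper, and locking and the interconnect transfer itself are ignored.

```python
def read_block(block_id, local, caches, disk):
    """Return (contents, source), preferring any instance's cache over disk."""
    if block_id in caches[local]:
        return caches[local][block_id], "local cache"
    for name, cache in caches.items():
        if name != local and block_id in cache:
            # Another instance already holds the block: ship it across the
            # interconnect instead of reading it from disk.
            caches[local][block_id] = cache[block_id]
            return cache[block_id], f"cache fusion from {name}"
    # No instance has the block cached, so fall back to a disk read.
    caches[local][block_id] = disk[block_id]
    return disk[block_id], "disk"


caches = {"A": {7: "block 7, modified by A"}, "B": {}}
disk = {7: "block 7, on-disk version", 8: "block 8, on-disk version"}

print(read_block(7, "B", caches, disk))  # served from instance A's cache
print(read_block(8, "B", caches, disk))  # not cached anywhere, read from disk
print(read_block(7, "B", caches, disk))  # now present in B's own cache
```

In the first call instance B receives A's modified copy rather than the older on-disk version, which is the point of moving current copies between instances.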
Finally we get to the processes:

Oracle RAC Daemons and Processes

Process | Name and service | Responsibilities |
---|---|---|
LMSn | Lock Manager Server process - GCS | 1) This is the cache fusion part and the most active process. 2) The LMS process maintains records of the data file statuses and each cached block by recording information in the GRD. 3) The primary job of the LMS process is to transport blocks across the nodes. 4) For example, if a node requests a consistent read of a block, the LMS process makes a consistent-read image of the block from another node with the help of undo segments and then transports the block through the network to the node that requested it. 5) The LMS process also controls the flow of messages to remote instances, manages global data block access and transmits block images between the buffer caches of different instances. 6) It handles the consistent copies of blocks that are transferred between instances. 7) It receives requests from LMD to perform lock requests. 8) It rolls back any uncommitted transactions. 9) There can be up to ten LMS processes running, and they can be started dynamically if demand requires it. 10) They manage lock manager service requests for GCS resources and send them to a service queue to be handled by the LMSn process. 11) It also handles global deadlock detection and monitors for lock conversion timeouts. 12) As a performance gain you can increase this process's priority to make sure CPU starvation does not occur. 13) You can see the statistics of this daemon by looking at the view X$KJMSDP. |
LMON | Lock Monitor Process - GES | 1) LMON is responsible for monitoring all instances in a cluster to detect failed instances. 2) Once a failed instance is detected, it facilitates the recovery of the global locks held by that instance. 3) It is also responsible for reconfiguring locks and other resources when instances leave or are added to the cluster. 4) This dynamic reconfiguration is done in real time. |
LMD | Lock Manager Daemon - GES | 1) The LMD process acts as a broker to the LMS process by sending requests for resources to a queue that is handled by the LMS process. 2) These requests are placed by the global cache service in order to keep the block buffers consistent across all the instances. 3) The other responsibility of LMD is the detection and resolution of global deadlocks, along with monitoring of lock timeouts in the global environment. |
LCK0 | Lock Process - GES | 1) The LCK process is similar to the LMD process, but it handles requests for all global resources excluding requests for database block buffers. 2) It manages instance resource requests and cross-instance call operations for shared resources. 3) It builds a list of invalid lock elements and validates lock elements during recovery. 4) The LCK0 process manages non-cache-fusion resource requests such as library and row cache requests. |
DIAG | Diagnostic Daemon | 1) It regularly monitors the health of the instance. 2) It checks for instance hangs and deadlocks. 3) It captures diagnostic data for instance and process failures. 4) This is a lightweight process that uses the DIAG framework to monitor the health of the cluster. 5) It captures information for later diagnosis in the event of failures. 6) It will perform any necessary recovery if an operational hang is detected. |