
Sunday, 24 November 2024

Why an Odd Number of Voting Disks, or Why 3 Voting Disks


Before discussing why we need 3 voting disks, or more generally an odd number of voting disks, we should understand the role of the voting disk.


The Role of Voting Disks in Oracle RAC

  1. Voting Disk Purpose:

    • Voting disks are files that help Oracle RAC determine which nodes in the cluster are healthy and should remain operational.
    • They act like a "tie-breaker" during communication failures (e.g., network issues or node failure) to prevent split-brain scenarios.
  2. Split-Brain Problem:

    • Split-brain happens when nodes in the cluster lose communication with each other and each side believes it is the only healthy part of the cluster.
    • If both sides then write to shared storage independently, it can cause data corruption.
    • Voting disks resolve this by deciding which node (or nodes) should stay in the cluster.



Why 3 Voting Disks?


Case with 1 Voting Disk:

  • If you use only 1 voting disk, what happens if the disk becomes inaccessible (e.g., disk failure)?
    • The cluster cannot determine which nodes are healthy.
    • The entire cluster may shut down to avoid corruption.


Case with 2 Voting Disks:

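  • With 2 voting disks, a majority (more than half) means both disks:
    • If there is a communication failure and each node can reach only 1 voting disk, the votes tie at 1 to 1 and neither node holds a majority, so the tie cannot be broken safely.
    • If either disk fails, no node can reach a majority, so 2 disks tolerate no more disk failures than 1 disk does.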


Case with 3 Voting Disks:

  • When there are 3 voting disks, the cluster can tolerate the failure of 1 disk or the failure of communication between nodes:
    • A node needs to access more than half the votes (majority) to stay in the cluster.
    • If a node cannot access the majority, it is evicted.
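
The majority rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Oracle Clusterware code; the function names votes_needed and stays_in_cluster are hypothetical.

```python
def votes_needed(total_disks):
    """Smallest number of voting disks that is strictly more than half."""
    return total_disks // 2 + 1


def stays_in_cluster(accessible_disks, total_disks):
    """A node stays only if it can access a strict majority of the voting disks."""
    return accessible_disks >= votes_needed(total_disks)


print(votes_needed(3))         # 2
print(stays_in_cluster(2, 3))  # True: 2 of 3 is a majority
print(stays_in_cluster(1, 3))  # False: 1 of 3 is not
print(stays_in_cluster(1, 2))  # False: 1 of 2 is only half
```

The last line shows why 2 voting disks do not help: half of the votes is not a majority.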




Examples

Scenario 1: Normal Operation

  • Node A, Node B, and 3 voting disks (VD1, VD2, VD3) are operational.
  • Each node can access all 3 voting disks, and everything works fine.


Scenario 2: Network Failure Between Nodes

  • Network heartbeat fails between Node A and Node B.

  • Both nodes try to access the voting disks to decide which one stays.

    • Node A accesses VD1 and VD2 (2 votes).
    • Node B accesses only VD3 (1 vote).
    • Outcome: Node A wins (majority), Node B is evicted.
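
The outcome of this scenario can be reproduced with a small Python model. It follows the simplified picture used in this post, where each node is judged only by the voting disks it can reach; the real eviction logic in Oracle Clusterware is more involved. The function name resolve_partition and the data layout are assumptions made for illustration.

```python
def resolve_partition(access, all_disks):
    """Return 'stays' or 'evicted' per node, based on a strict majority of voting disks.

    access maps a node name to the set of voting disks that node can reach.
    """
    needed = len(all_disks) // 2 + 1  # strictly more than half
    result = {}
    for node, disks in access.items():
        reachable = len(disks & all_disks)  # ignore disks that are not voting disks
        result[node] = "stays" if reachable >= needed else "evicted"
    return result


voting_disks = {"VD1", "VD2", "VD3"}
access = {
    "Node A": {"VD1", "VD2"},  # 2 votes
    "Node B": {"VD3"},         # 1 vote
}
print(resolve_partition(access, voting_disks))
# {'Node A': 'stays', 'Node B': 'evicted'}
```

Because the two nodes hold disjoint sets of disks, at most one of them can reach 2 of the 3 votes, so both can never survive the same partition.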


Scenario 3: Voting Disk Failure

  • Suppose VD3 fails, leaving VD1 and VD2.
  • Nodes can still function because:
    • Node A and Node B can both access VD1 and VD2 (2 of 3 votes, which is still a majority).
    • The cluster continues running without interruption.


Scenario 4: Node Failure

  • If Node B fails completely:
    • Node A can still access VD1, VD2, and VD3 (majority).
    • The cluster continues with Node A.



Why a Majority Is Critical

Oracle RAC requires more than half of the votes (majority) to prevent split-brain:

  • In a 3-vote setup, 2 votes are needed for a majority.
  • This ensures:
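    • If 1 voting disk fails, each node can still access the other 2 disks, which is a majority.
    • If 1 node fails, the surviving node still accesses a majority and keeps the cluster running.
    • If a node cannot access more than half of the voting disks, it is evicted; if no node can, the cluster shuts down to avoid corruption.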
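
To see why an even number of voting disks adds no protection, the arithmetic can be tabulated with a short Python sketch. The function name disk_failures_tolerated is hypothetical and the code only illustrates the majority rule; it does not query a real cluster.

```python
def disk_failures_tolerated(total_disks):
    """Number of voting disks that can fail while a strict majority remains reachable."""
    needed = total_disks // 2 + 1  # strictly more than half
    return total_disks - needed


print("disks  majority  failures tolerated")
for n in range(1, 6):
    print(f"{n:>5}  {n // 2 + 1:>8}  {disk_failures_tolerated(n):>18}")
```

Under this rule, 1 and 2 disks tolerate no disk failure, 3 and 4 disks tolerate one, and 5 disks tolerate two. Going from an odd count to the next even count adds a disk without adding any tolerance.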




Key Points About Voting Disk Design

  1. Odd Number of Voting Disks:
    • Always configure an odd number of voting disks (3, 5, 7, etc.) to prevent tie scenarios.
  2. Redundancy:
    • Voting disks are typically stored on shared storage (e.g., ASM) with redundancy to handle hardware failures.






