
Sunday, 24 November 2024

Why an Odd Number of Voting Disks, or Why 3 Voting Disks?


Before discussing why we need 3 voting disks, or more generally an odd number of voting disks, we should understand the role of the voting disk.


The Role of Voting Disks in Oracle RAC

  1. Voting Disk Purpose:

    • Voting disks are files that help Oracle RAC determine which nodes in the cluster are healthy and should remain operational.
    • They act like a "tie-breaker" during communication failures (e.g., network issues or node failure) to prevent split-brain scenarios.
  2. Split-Brain Problem:

    • Split-brain happens when nodes in the cluster lose communication with each other and both believe they are healthy.
    • If both sides then keep writing to shared storage without coordinating with each other, the data can be corrupted.
    • Voting disks resolve this by deciding which node (or nodes) should stay in the cluster.



Why 3 Voting Disks?


Case with 1 Voting Disk:

  • If you use only 1 voting disk, what happens if the disk becomes inaccessible (e.g., disk failure)?
    • The cluster cannot determine which nodes are healthy.
    • The entire cluster may shut down to avoid corruption.


Case with 2 Voting Disks:

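  • With 2 voting disks, a majority means both disks:
    • A node must access more than half of the voting disks, and more than half of 2 is 2.
    • If the votes split evenly (1 disk each), neither node holds a majority, so the tie cannot be broken.
    • If either disk becomes inaccessible, every node is left with 1 of 2 votes, so 2 disks tolerate no more disk failures than 1 disk does.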


Case with 3 Voting Disks:

  • When there are 3 voting disks, the cluster can tolerate the failure of 1 disk or the failure of communication between nodes:
    • A node needs to access more than half the votes (majority) to stay in the cluster.
    • If a node cannot access the majority, it is evicted.
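
The majority rule above can be written down in a few lines. This is a minimal sketch of the arithmetic only, not Oracle's implementation; the function name `has_majority` is made up for illustration.

```python
def has_majority(accessible: int, total: int) -> bool:
    """Return True if a node sees strictly more than half of the voting disks."""
    if total <= 0 or not 0 <= accessible <= total:
        raise ValueError("expected total > 0 and 0 <= accessible <= total")
    # Strict majority: exactly half is not enough.
    return 2 * accessible > total


# Lose one disk in a 1-, 2-, and 3-disk configuration and see who survives.
for total in (1, 2, 3):
    survives = has_majority(total - 1, total)
    print(f"{total} voting disk(s), 1 lost -> node keeps majority: {survives}")
```

Only the 3-disk configuration keeps its majority after losing one disk, which matches the three cases described above.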




Examples

Scenario 1: Normal Operation

  • Node A, Node B, and 3 voting disks (VD1, VD2, VD3) are operational.
  • Each node can access all 3 voting disks, and everything works fine.


Scenario 2: Network Failure Between Nodes

  • Network heartbeat fails between Node A and Node B.

  • Both nodes try to access the voting disks to decide which one stays.

    • Node A accesses VD1 and VD2 (2 votes).
    • Node B accesses only VD3 (1 vote).
    • Outcome: Node A wins (majority), Node B is evicted.


Scenario 3: Voting Disk Failure

  • Suppose VD3 fails, leaving VD1 and VD2.
  • Nodes can still function because:
    • Node A and Node B can both access VD1 and VD2 (2 of 3 votes, still a majority).
    • The cluster continues running without interruption.


Scenario 4: Node Failure

  • If Node B fails completely:
    • Node A can still access VD1, VD2, and VD3 (majority).
    • The cluster continues with Node A.
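
The four scenarios can be replayed with a small toy model. This is a sketch of the simplified picture used in this post (each node survives if it sees a majority of the configured voting disks), not the actual Oracle Clusterware algorithm; the names `surviving_nodes` and `scenarios` are assumptions made for illustration.

```python
# The three voting disks configured in the examples above.
ALL_DISKS = frozenset({"VD1", "VD2", "VD3"})


def surviving_nodes(visible, configured=ALL_DISKS):
    """Return the nodes that see a strict majority of the configured voting disks.

    `visible` maps a node name to the set of voting disks it can access.
    """
    return sorted(
        node
        for node, disks in visible.items()
        if 2 * len(set(disks) & configured) > len(configured)
    )


scenarios = {
    "1: normal operation": {"A": {"VD1", "VD2", "VD3"}, "B": {"VD1", "VD2", "VD3"}},
    "2: network failure": {"A": {"VD1", "VD2"}, "B": {"VD3"}},
    "3: VD3 fails": {"A": {"VD1", "VD2"}, "B": {"VD1", "VD2"}},
    "4: Node B fails": {"A": {"VD1", "VD2", "VD3"}},  # Node B is down, so it is absent
}

for name, visible in scenarios.items():
    print(f"Scenario {name}: surviving nodes = {surviving_nodes(visible)}")
```

In Scenario 2 only Node A is returned, because Node B sees 1 of 3 votes; in Scenario 3 both nodes still see 2 of 3 votes and stay in the cluster.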



Why a Majority Is Critical

Oracle RAC requires more than half of the votes (majority) to prevent split-brain:

  • In a 3-vote setup, 2 votes are needed for a majority.
  • This ensures:
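    • If 1 disk fails, the remaining 2 disks still form a majority; if 1 node fails, the surviving node still sees a majority of the disks.
    • If a node cannot access more than half of the votes, it is evicted; if no node can, the whole cluster shuts down to avoid corruption.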
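
The same arithmetic shows how many disk failures each configuration can absorb. A minimal sketch, assuming only the strict-majority rule stated above; `tolerated_disk_failures` is a hypothetical helper name.

```python
def tolerated_disk_failures(total: int) -> int:
    """Largest number of lost voting disks that still leaves a strict majority."""
    if total < 1:
        raise ValueError("at least one voting disk is required")
    # After losing f disks we need total - f > total / 2, i.e. f <= (total - 1) // 2.
    return (total - 1) // 2


for total in range(1, 8):
    majority = total // 2 + 1
    print(
        f"{total} disk(s): majority = {majority}, "
        f"tolerates {tolerated_disk_failures(total)} failure(s)"
    )
```

Going from 3 to 4 disks, or from 5 to 6, raises the majority threshold without raising the number of failures tolerated, which is why the extra even-numbered disk buys nothing.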




Key Points About Voting Disk Design

  1. Odd Number of Voting Disks:
    • Always configure an odd number of voting disks (3, 5, 7, etc.). An even number can split into a tie and adds no fault tolerance: 4 disks tolerate only 1 failure, the same as 3.
  2. Redundancy:
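    • Voting disks are typically stored on shared storage (e.g., ASM) with redundancy to handle hardware failures. In ASM, the disk group's redundancy level determines the number of voting files: 1 with external, 3 with normal, and 5 with high redundancy.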






