Skip to content

Kefmat/distributed-consensus-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Replicated Key-Value State Machine (Consensus Engine)

A production-grade, zero-dependency simulation of the Raft consensus protocol implemented in strict TypeScript.

This engine provides safety-certified, linearizable state-machine replication across a virtual network topology. It includes an asynchronous clock framework, randomized election timeouts, structural fault/crash injection capabilities, and a background invariant safety auditor.

Architectural overview

The system coordinates distributed state changes via an atomic, append-only log ledger maintained by an elected cluster leader.

flowchart TD
  Client["State Mutation Query"] -->|Route Command| Leader["Elected Cluster Leader"]
  subgraph Consensus_Cluster[Consensus Cluster]
    Leader -->|1. Append Entries RPC| F1["Follower Node 02"]
    Leader -->|1. Append Entries RPC| F2["Follower Node 03"]
    F1 -->|2. Success Affirmation| Leader
    F2 -->|2. Success Affirmation| Leader
    Leader -->|3. Quorum Majority Reached| CommitCursor["Advance Commit Index"]
    CommitCursor -->|4. Deterministic Application| SM["Replicated State Machine"]
  end
  subgraph Network_Isolation_Guard[Network Isolation Guard]
    VNB["Virtual Network Bus"] ---|Inject Synthetic Partitions & Latency| Consensus_Cluster
  end
Loading

Core features & testing vectors

  • Asynchronous event loop simulation: Background clock intervals for heartbeats and elections.
  • Deterministic randomization: Election timeouts randomized (150ms–300ms) to model split votes.
  • Chaos engineering: Virtual transport layer for partitions, latency, and crash injection.
  • Automated auditing: Continuous safety invariant checks during simulation runs.
  • Linearizable key-value store: Deterministic application of committed log entries.

Protocol safety invariants

The implementation enforces the core Raft safety properties:

  1. Election safety: At most one leader elected per term.
  2. Leader append-only: Leaders only append new entries; they do not overwrite.
  3. Log matching: If two logs contain an entry with the same index and term, their histories match through that index.
  4. Leader completeness: If an entry is committed in a term, it will be present in leaders of later terms.
  5. State machine safety: Once an entry is applied at an index, no different entry will be applied at that index.

Project layout

src/
├── audit/
│   └── safety.ts
└── primitives/
    ├── log.ts
    ├── network.ts
    ├── node.ts
    ├── stateMachine.ts
    └── types.ts

Component reference

  • src/primitives/types.ts — RPC payload schemas and protocol types (RequestVote, AppendEntries, InstallSnapshot).
  • src/primitives/log.ts — Append-only ledger, term verification, and conflict truncation.
  • src/primitives/node.ts — Role state machine (FOLLOWER, CANDIDATE, LEADER) and election logic.
  • src/primitives/stateMachine.ts — Linearizable key-value operations (SET, GET, DELETE).
  • src/primitives/network.ts — Virtual network bus supporting latency and partitions.

Getting started

Prerequisites:

  • Node.js 20+ and npm.

Install and run:

npm install
npm run build
npm run start

For development tests:

npm test

For detailed design and implementation notes, see the source files under src/primitives and src/audit.

Detailed component notes

  • src/primitives/types.ts

    • Defines strict static data structures and RPC payload schemas (RequestVote, AppendEntries, InstallSnapshot).
    • Explicitly types state transitions, command payload envelopes, and network routing metadata.
  • src/primitives/log.ts

    • Coordinates the append-only ledger array.
    • Implements term matching verification and conflict log truncation cascades during unaligned transitions.
  • src/primitives/node.ts

    • Drives individual state machine roles (FOLLOWER, CANDIDATE, LEADER).
    • Manages term tracking, vote counting logic, election/rejection guard conditions, and step-down transitions.
  • src/primitives/stateMachine.ts

    • Evaluates committed linearizable dictionary operations (SET, DELETE, GET).
    • Tracks applied execution indexes independently to eliminate state drift across nodes.
  • src/primitives/network.ts

    • Acts as an abstract network routing matrix for asynchronous RPC frames.
    • Supports configurable packet propagation latencies, network partitions, and node crash flags.
  • src/audit/safety.ts

    • Evaluates cluster conditions across active nodes and asserts multi-node invariants (for example, verifying no two leaders claim the same term).

RPC interface definitions

The consensus engine uses the following strict TypeScript interfaces for message frames transported over the virtual network bus.

export interface RequestVoteArgs {
  term: number;         // Candidate's active logical term sequence
  candidateId: string;  // Network identity of the node seeking leadership votes
  lastLogIndex: number; // Index of candidate's longest active log entry
  lastLogTerm: number;  // Term of candidate's longest active log entry
}

export interface RequestVoteReply {
  term: number;         // Current local term of evaluating peer
  voteGranted: boolean; // True if candidate matches all safety constraints
}
export interface AppendEntriesArgs {
  term: number;             // Leader's current logical term
  leaderId: string;         // Network identity of the leader
  prevLogIndex: number;     // Index immediately preceding new entries
  prevLogTerm: number;      // Term of prevLogIndex entry
  entries: LogEntry[];      // Log structures to replicate (empty = heartbeat)
  leaderCommit: number;     // Leader's highest known committed index
}

export interface AppendEntriesReply {
  term: number;             // Current local term of peer
  success: boolean;         // True if follower found matching prevLogIndex entry
}

State machine command structures

Mutations are delivered to replicated state machines as transactional command objects. Example:

const setCommand: StateMachineCommand = {
  type: 'SET',
  key: 'auth_token_ttl',
  value: '3600'
};

const deleteCommand: StateMachineCommand = {
  type: 'DELETE',
  key: 'stale_session_id'
};

Simulation execution workflow

When the orchestrator runs (e.g., npm run start) the framework drives the topology through phases to validate safety and liveness.

[SIMULATION ENGINE] Initializing 5-Node Distributed Consensus Loop...

--- PHASE 1: Initial Cluster Election Safety ---
Node NODE_01 timeout tripped. Transitioning to CANDIDATE state for Term 1.
Broadcasting RequestVote RPCs across the Virtual Network Bus.
Vote collection verified: 4 of 5 peers granted election authority.
Status Verified: Node NODE_01 successfully assumed LEADER role.
Safety Auditor Pass: Clear. Single leader invariant maintained.

--- PHASE 2: State Machine Mutation Replication ---
Leader NODE_01 appending write transaction key 'cluster_config_mode' to log entry 1.
Broadcasting AppendEntries replication payload to all registered follower peers.
Quorum verification: 5 of 5 nodes acknowledged entry synchronization.
Leader advanced commit index cursor to 1.
Replicated Key-Value State Machine applied command safely at index 1: { cluster_config_mode: 'linearizable_strict' }

--- PHASE 3: Testing Network Partition Resilience ---
Injecting synthetic partition matrix: [NODE_01, NODE_02] VS [NODE_03, NODE_04, NODE_05]
Simulating concurrent election inside the isolated majority partition.
Node NODE_03 timeout tripped. Transitioning to CANDIDATE state for Term 2.
Node NODE_03 secured quorum vote majority (3 of 3 visible nodes).
Status Verified: Node NODE_03 assumed LEADER role for Term 2.
Safety Auditor Pass: Clear. Zero split-brain leadership conditions detected.

--- PHASE 4: Healing Partitions and Harmonizing States ---
Network topology barriers flushed.
Node NODE_01 detects higher term (Term 2) from leader NODE_03 and steps down to FOLLOWER.
Log harmonization triggered for Node NODE_01 and NODE_02.

Simulation execution run completed successfully with zero safety invariant regressions.

Configurable engine parameter tuning

  • minElectionTimeoutMilliseconds: 150 — lower bound for randomized election window
  • maxElectionTimeoutMilliseconds: 300 — upper bound to break split-votes
  • heartbeatIntervalMilliseconds: 50 — leader heartbeat interval
  • networkLatencyMsMilliseconds: 0 — artificial per-packet delay for testing

About

A high-performance, fault-tolerant distributed consensus engine implementing state machine replication, dynamic leader election, and linearizable log replication. Designed as a foundational primitive for low-latency cluster coordination and highly available infrastructure.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors