This comprehensive explanation has been generated from 62 GitHub source documents.
Last updated: October 7, 2025
This content is meant to be consumed by AI agents via MCP.
Note: In rare cases it may contain LLM hallucinations.
For authoritative documentation, please consult the official GLEIF vLEI trainings and the ToIP Glossary.
Byzantine Fault Tolerance (BFT) is a property of distributed computing systems that enables them to reach consensus and maintain correct operation despite Byzantine faults: failures where components may behave arbitrarily, provide inconsistent information to different observers, or act maliciously. A BFT system continues functioning correctly as long as more than two-thirds of its nodes are honest, that is, as long as fewer than one-third are faulty or malicious (n >= 3f + 1 for f faulty nodes).
Byzantine Fault Tolerance (BFT) represents a fundamental property of distributed computing systems that enables them to achieve reliable consensus despite the presence of Byzantine faults. A Byzantine fault occurs when a system component fails in a way that presents different symptoms to different observers, making it difficult for other components to determine whether the component has actually failed and what corrective action to take.
The term derives from the Byzantine Generals Problem, a classical thought experiment in distributed computing that illustrates the challenge of achieving agreement when some participants may be unreliable or malicious. In this allegory, several Byzantine generals must coordinate an attack on a city, but they can only communicate via messengers who may be intercepted or corrupted. The generals must reach consensus on whether to attack or retreat, despite knowing that some generals may be traitors attempting to prevent agreement.
A system achieves Byzantine Fault Tolerance when it maintains correct operation as long as more than two-thirds of its nodes behave honestly. This two-thirds threshold is mathematically significant: it ensures that honest nodes can always outvote malicious nodes, provided that fewer than one-third of participants are faulty or adversarial (n >= 3f + 1 for f faulty nodes). The system resists:
The Byzantine Generals Problem was formalized by Leslie Lamport, Robert Shostak, and Marshall Pease in their 1982 paper "The Byzantine Generals Problem." This foundational work established the theoretical framework for understanding consensus in the presence of arbitrary failures.
Traditional distributed systems often assumed fail-stop behavior, where components either work correctly or stop completely. Byzantine faults are more challenging because faulty components continue operating while providing incorrect or inconsistent information. This makes Byzantine faults particularly relevant for:
Implementing Byzantine fault tolerance in KERI requires careful witness pool configuration:
Minimum Witness Count: For meaningful BFT properties, deploy at least n = 3*f + 1 witnesses where f is the expected maximum faults. For example:
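As a minimal sketch of the sizing rule n = 3*f + 1, the relationship can be computed in both directions. The function names `min_witnesses` and `max_faults` are illustrative and not part of any KERI library.

```python
def min_witnesses(f: int) -> int:
    """Smallest witness pool size n that tolerates f Byzantine witnesses (n = 3f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 holds for a pool of n witnesses."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return max((n - 1) // 3, 0)


for f in (1, 2, 3):
    print(f"f={f} -> at least {min_witnesses(f)} witnesses")
# prints 4, 7 and 10 witnesses for f = 1, 2 and 3
```

Note that adding a fifth or sixth witness to a pool of four does not raise the number of tolerated faults; the next step up is at seven.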
Threshold Selection: Set the witness threshold M according to:
M = N - F (requires all honest witnesses)
M = (N + F + 1) / 2 (requires supermajority)
M = F + 1 (minimum for safety, reduces availability)
Geographic Distribution: Deploy witnesses across multiple geographic regions to prevent common-mode failures from network partitions, natural disasters, or regional infrastructure outages.
Organizational Diversity: Select witnesses from different organizations to prevent collusion and reduce single points of failure in governance.
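The three threshold choices listed under Threshold Selection can be tabulated for a given pool. This is a sketch under stated assumptions: `threshold_options` and its dictionary keys are hypothetical names, and rounding (N + F + 1) / 2 up to the next integer is an assumption made here so that M is always a whole number of witnesses.

```python
import math


def threshold_options(n: int, f: int) -> dict[str, int]:
    """Return the three candidate thresholds M for n witnesses and f expected faults."""
    if f < 0 or n < 3 * f + 1:
        raise ValueError("need f >= 0 and n >= 3f + 1")
    return {
        "all_honest": n - f,                          # M = N - F
        "supermajority": math.ceil((n + f + 1) / 2),  # M = (N + F + 1) / 2, rounded up (assumption)
        "minimum": f + 1,                             # M = F + 1
    }


print(threshold_options(10, 2))
# {'all_honest': 8, 'supermajority': 7, 'minimum': 3}
```

For the smallest valid pool (N = 3F + 1) the first two options coincide, for example both give 3 when N = 4 and F = 1.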
The Threshold of Accountable Duplicity should be set considering:
Security Requirements: Higher-stakes identifiers (legal entities, financial institutions) should use higher TOAD values approaching N - F.
Availability Requirements: Systems requiring high availability may use lower TOAD values, accepting slightly reduced security for faster confirmation.
Dynamic Adjustment: TOAD can be modified through rotation events, allowing controllers to adjust security/availability trade-offs as requirements evolve.
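To illustrate how a rotation might change both the witness pool and the TOAD together, here is a hedged sketch. `WitnessConfig` and `rotate` are hypothetical helpers, not the API of any KERI implementation, and the rule that TOAD must lie between 0 and the pool size is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WitnessConfig:
    witnesses: tuple[str, ...]  # hypothetical witness identifiers
    toad: int                   # threshold of accountable duplicity


def rotate(config: WitnessConfig, cuts=(), adds=(), new_toad=None) -> WitnessConfig:
    """Apply witness removals and additions, then set the (possibly new) TOAD."""
    remaining = [w for w in config.witnesses if w not in cuts]
    updated = tuple(remaining) + tuple(a for a in adds if a not in remaining)
    toad = config.toad if new_toad is None else new_toad
    if not 0 <= toad <= len(updated):
        raise ValueError("TOAD must be between 0 and the number of witnesses")
    return WitnessConfig(updated, toad)


old = WitnessConfig(("w1", "w2", "w3", "w4"), toad=3)
new = rotate(old, cuts=["w4"], adds=["w5", "w6", "w7"], new_toad=4)
print(new)
```

Validating the threshold against the post-rotation pool, rather than the old one, keeps a controller from declaring a TOAD that the new pool could never satisfy.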
Enhance BFT properties through watcher networks:
Independent Monitoring: Deploy watchers operated by different entities than witnesses to provide independent verification.
Promiscuous Mode: Watchers run in promiscuous mode, collecting and verifying all key events they observe without requiring controller designation.
Duplicity Detection: Watchers compare key event logs from multiple witnesses, detecting inconsistencies that indicate Byzantine behavior.
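The comparison a watcher performs can be sketched as follows, assuming each witness's log has already been reduced to a mapping from sequence number to event digest. The function name, the data layout, and the digest strings are all illustrative.

```python
from collections import defaultdict


def find_duplicity(logs: dict[str, dict[int, str]]) -> dict[int, set[str]]:
    """logs maps witness id -> {sequence number: event digest}.

    Returns the sequence numbers at which witnesses report more than one digest.
    """
    seen: dict[int, set[str]] = defaultdict(set)
    for events in logs.values():
        for sn, digest in events.items():
            seen[sn].add(digest)
    return {sn: digests for sn, digests in seen.items() if len(digests) > 1}


logs = {
    "witness-a": {0: "E_icp", 1: "E_rot1"},
    "witness-b": {0: "E_icp", 1: "E_rot1"},
    "witness-c": {0: "E_icp", 1: "E_rot1_alt"},  # conflicting event at sequence number 1
}
print(find_duplicity(logs))
```

Two different digests at the same sequence number are evidence of duplicity; the sketch only flags the conflict and leaves attribution and response to the caller.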
Practical Byzantine Fault Tolerance (pBFT), introduced by Miguel Castro and Barbara Liskov in 1999, represented a breakthrough by demonstrating that Byzantine fault tolerance could be achieved efficiently in asynchronous systems. pBFT showed that:
The pBFT algorithm operates through a three-phase protocol (pre-prepare, prepare, commit) that ensures all honest nodes agree on the ordering of operations, even when fewer than one-third of nodes are faulty (f faulty nodes out of n >= 3f + 1). This work laid the foundation for modern BFT consensus mechanisms used in blockchain systems and distributed identity infrastructure.
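The one-third bound follows from quorum arithmetic. With n = 3f + 1 nodes and quorums of 2f + 1, any two quorums share at least f + 1 nodes, so at least one shared node is honest. The short check below illustrates this; the helper names are illustrative.

```python
def quorum_size(f: int) -> int:
    """Quorum size used with n = 3f + 1 nodes."""
    return 2 * f + 1


def min_quorum_overlap(n: int, q: int) -> int:
    """Minimum number of nodes shared by any two quorums of size q among n nodes."""
    return max(2 * q - n, 0)


for f in range(1, 5):
    n = 3 * f + 1
    q = quorum_size(f)
    overlap = min_quorum_overlap(n, q)
    assert overlap == f + 1  # always at least one honest node in common
    print(f"f={f}: n={n}, quorum={q}, minimum overlap={overlap}")
```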
KERI employs Byzantine fault tolerance principles through its witness coordination mechanism and KAACE (KERI's Agreement Algorithm for Control Establishment), but with a distinctive architectural approach that differs from traditional BFT systems.
KERI's BFT implementation centers on witness pools—designated entities that verify, sign, and store key events for autonomic identifiers (AIDs). The witness architecture provides Byzantine fault tolerance through:
Threshold-Based Agreement: Controllers specify a witness threshold determining the minimum number of witness confirmations required for an event to be considered properly witnessed. This threshold must satisfy the relationship (N + F + 1)/2 <= M <= N - F, where:
N = total number of witnesses
F = maximum number of potentially faulty witnesses
M = minimum required confirmations (the ample number)
Independent Verification: Each witness independently validates key events before signing receipts, preventing a single compromised witness from corrupting the system.
Duplicity Detection: Rather than preventing Byzantine behavior, KERI's architecture makes duplicity evident. If a controller creates conflicting versions of key events, witnesses maintain immutable records that expose the inconsistency through ambient duplicity detection.
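Threshold-based agreement amounts to counting distinct receipts from designated witnesses. The sketch below assumes receipt signatures have already been verified elsewhere and that witnesses are identified by plain strings; `is_sufficiently_witnessed` is a hypothetical helper, not a KERI library function.

```python
def is_sufficiently_witnessed(receipts, witnesses, threshold: int) -> bool:
    """Return True if at least `threshold` distinct designated witnesses supplied a receipt.

    receipts:  iterable of witness ids whose receipt signatures were already verified
    witnesses: iterable of witness ids designated by the controller
    """
    valid = set(receipts) & set(witnesses)  # ignore duplicates and non-designated signers
    return len(valid) >= threshold


pool = ["w1", "w2", "w3", "w4"]
print(is_sufficiently_witnessed(["w1", "w2", "w2"], pool, 3))       # False: only two distinct witnesses
print(is_sufficiently_witnessed(["w1", "w2", "w4", "x9"], pool, 3))  # True: three designated witnesses
```

Using a set intersection means a repeated receipt from one witness, or a receipt from an outsider, never counts toward the threshold.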
KERI's Agreement Algorithm for Control Establishment (KAACE) represents what the KERI specification describes as "a simplification of PBFT-class algorithms." KAACE achieves consensus through:
Agreement Definition: Agreement on a key event occurs when:
Separation of Networks: KAACE introduces a novel architectural separation:
This separation enables KERI to achieve safety without requiring liveness, making it suitable for decentralized, eventually-consistent identity systems.
KERI's BFT approach is characterized by what KERI creator Samuel Smith describes as "what if PBFT and Stellar had a baby?" with these properties:
Present Properties:
Deliberately Absent Properties:
This trade-off reflects KERI's design philosophy: prioritizing duplicity detection and cryptographic verifiability over strict real-time consensus. By relaxing liveness and total ordering requirements, KERI achieves greater scalability and flexibility while maintaining the safety properties essential for secure identifier management.
KERI implements BFT principles through the Threshold of Accountable Duplicity (TOAD)—a controller-declared threshold M indicating the minimum number of witness confirmations deemed sufficient for accountability. This threshold:
F faulty witnesses
KERI employs the concept of supermajority: a sufficient majority labeled as immune from certain attacks or faults. The ample number represents the minimum participants required to achieve supermajority, ensuring that:
The mathematical constraints ensure that the ample number provides genuine Byzantine fault tolerance:
f >= 1 if n > 0 (at least one fault must be tolerated)
n >= 3*f + 1 (total participants must exceed three times the fault threshold)
(n + f + 1)/2 <= m <= n - f (ample number must be achievable yet sufficient)
Byzantine fault tolerance in KERI provides critical security properties:
Malicious Witness Resistance: The system tolerates compromise of fewer than one-third of its witnesses (at most f out of n >= 3*f + 1) without loss of integrity. Honest witnesses maintain correct records that expose any duplicitous behavior.
Network Partition Tolerance: The system continues operating correctly even when network partitions prevent some witnesses from communicating, as long as sufficient honest witnesses remain reachable.
Duplicity Evidence: Rather than preventing Byzantine behavior in real-time, KERI makes it cryptographically evident, enabling post-facto accountability and dispute resolution.
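These properties hold only when the ample number satisfies the constraints on f, n, and m given earlier in this section. A minimal sketch of that check follows; `ample_range` is a hypothetical helper, it assumes a non-empty pool (so f >= 1), and it rounds the lower bound up to a whole number of witnesses.

```python
import math


def ample_range(n: int, f: int) -> range:
    """Return the values of m satisfying (n + f + 1)/2 <= m <= n - f.

    Raises ValueError unless f >= 1 and n >= 3f + 1.
    """
    if f < 1:
        raise ValueError("at least one fault must be tolerated (f >= 1)")
    if n < 3 * f + 1:
        raise ValueError("need n >= 3f + 1")
    low = math.ceil((n + f + 1) / 2)  # lower bound, rounded up (assumption)
    high = n - f                      # upper bound
    return range(low, high + 1)


print(list(ample_range(4, 1)))   # [3]
print(list(ample_range(10, 2)))  # [7, 8]
```

Because n >= 3f + 1 implies (n + f + 1)/2 <= n - f, the returned range is never empty for valid inputs.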
Availability vs. Consistency: KERI's BFT approach prioritizes eventual consistency over immediate availability. Events may not be immediately confirmed across all witnesses, but once confirmed, they are cryptographically verifiable.
Witness Selection: Controllers must carefully select witness pools, balancing:
Threshold Configuration: Setting witness thresholds involves trade-offs:
M = N - F where F is the expected maximum faults
Enterprise Identity Systems: Organizations can deploy witness pools with known security properties, providing Byzantine fault tolerance for critical identity operations without blockchain overhead.
Federated Governance: Multiple organizations can jointly witness identifiers, with BFT properties ensuring no single organization can unilaterally compromise the system.
High-Stakes Credentials: Legal entity identifiers (like GLEIF's vLEI) benefit from BFT witness networks that provide cryptographic proof of credential issuance and revocation state.
The KERI documentation identifies eclipse attacks as the primary attack vector against KERI systems. Byzantine fault tolerance provides defense through:
Watcher Networks: Independent watchers monitor witness behavior, detecting inconsistencies that might indicate eclipse attacks.
Distributed Witnesses: Geographic and organizational distribution of witnesses makes it difficult for attackers to isolate or compromise sufficient witnesses to execute an eclipse attack.
Ambient Verification: Anyone can verify key event logs against multiple witnesses, making sustained eclipse attacks detectable through ambient duplicity detection.
KERI's BFT approach achieves:
Low Latency: Witness confirmation typically completes in seconds, not minutes or hours like blockchain consensus.
High Throughput: Each identifier's key event log is independent, enabling parallel processing without global consensus bottlenecks.
Minimal Overhead: Witness coordination requires only cryptographic signature verification, not computational puzzles or stake-based voting.
Scalability: The system scales horizontally as each identifier maintains its own witness pool, avoiding the global state synchronization challenges of traditional BFT systems.
Byzantine Fault Tolerance in KERI represents a pragmatic adaptation of classical BFT principles to the specific requirements of decentralized identifier systems. By focusing on duplicity detection rather than duplicity prevention, and by separating promulgation from confirmation networks, KERI achieves the safety guarantees of Byzantine fault tolerance while maintaining the scalability, portability, and permissionless operation essential for internet-scale identity infrastructure. The witness-based architecture provides cryptographically verifiable consensus without the performance penalties or infrastructure dependencies of traditional blockchain-based BFT systems, making it particularly well-suited for enterprise and regulatory use cases like the GLEIF vLEI ecosystem.
Ambient Verification: Enable public watcher networks that allow anyone to verify identifier state, maximizing duplicity detection coverage.
Parallel Verification: Witnesses can verify events in parallel, reducing confirmation latency.
Asynchronous Operation: KERI's BFT model operates asynchronously, avoiding the synchronization overhead of traditional BFT systems.
Caching: Implement witness receipt caching to reduce redundant verification operations.
Batch Processing: Group multiple events for batch verification when appropriate to improve throughput.
Witness Compromise Detection: Monitor witness behavior for signs of compromise:
Rotation Strategy: Periodically rotate witness pools to limit the impact of long-term compromises.
Audit Trails: Maintain comprehensive logs of witness receipts and watcher observations for forensic analysis.
Incident Response: Establish procedures for responding to detected Byzantine behavior, including witness replacement and identifier recovery mechanisms.