Loading vLEI.wiki
Fetching knowledge base...
Fetching knowledge base...
This comprehensive explanation has been generated from 64 GitHub source documents. All source documents are searchable here.
Last updated: October 7, 2025
This content is meant to be consumed by AI agents via MCP. Click here to get the MCP configuration.
Note: In rare cases it may contain LLM hallucinations.
For authoritative documentation, please consult the official GLEIF vLEI trainings and the ToIP Glossary.
In cryptography and identity systems, a collision occurs when two different inputs produce identical outputs (such as hash digests or identifiers), creating ambiguity about which source the result represents. Collision resistance—the computational infeasibility of finding such pairs—is a fundamental security property for cryptographic hash functions, self-addressing identifiers, and namespace systems.
A collision in cryptographic and identity systems represents a fundamental security concern where identical results point to different sources or backing data. The term encompasses two primary manifestations:
Hash Collisions: When two distinct digital inputs produce the same cryptographic digest (hash value). For example, if hash(data1) = hash(data2) but data1 ≠ data2, a collision has occurred.
Namespace Collisions: When two or more identifiers within a given namespace cannot be unambiguously resolved to their intended targets, creating identifier ambiguity.
The severity of collisions stems from their ability to undermine the fundamental assumptions of cryptographic systems: that unique inputs produce unique outputs, and that outputs can serve as reliable proxies for their inputs. In identity systems, collisions can enable impersonation, data substitution, and breakdown of trust relationships.
Collision Resistance is the cryptographic property that makes finding collisions computationally infeasible. A hash function with strong collision resistance ensures that an adversary cannot practically find two different messages that hash to the same value, even with significant computational resources. The NIST definition specifies this as one of two essential properties for approved hash functions (alongside the one-way property).
The mathematical foundation rests on the birthday paradox: for a hash function producing n-bit outputs, finding a collision requires approximately 2^(n/2) hash computations on average. This is why cryptographic strength is typically measured in bits—a 256-bit hash provides approximately 128 bits of collision resistance.
Collision resistance emerged as a critical concern with the development of cryptographic hash functions in the 1970s and 1980s. Early hash functions like MD5 (128-bit) and SHA-1 (160-bit) were eventually broken through collision attacks:
MD5 Collisions: In 2004, researchers demonstrated practical collision attacks against MD5, finding colliding inputs in hours on commodity hardware. This rendered MD5 unsuitable for security-critical applications.
Implementations MUST use hash functions with at least 128 bits of collision resistance:
Key material generation requires sufficient entropy to prevent collision probability:
When processing CESR-encoded primitives:
When verifying SAIDs:
# characters)For risk assessment:
Collision resistance remains secure against quantum computers:
SHA-1 Collisions: In 2017, Google and CWI Amsterdam demonstrated the first practical SHA-1 collision ("SHAttered" attack), requiring approximately 2^63 computations but proving the theoretical vulnerability was exploitable.
These breaks drove migration to stronger hash functions like SHA-256, SHA-3, and Blake2/Blake3, which provide sufficient collision resistance (128+ bits of security) to remain secure against current and foreseeable quantum computing threats.
In identifier systems, namespace collisions have plagued centralized systems where multiple authorities might independently assign the same identifier to different entities. The Domain Name System (DNS) addresses this through hierarchical delegation, but still suffers from potential collisions during domain transfers or administrative errors.
KERI's architecture fundamentally depends on collision resistance at multiple layers:
KERI's SAID protocol creates identifiers that are cryptographic digests of the content they identify. The security model requires:
The SAID specification explicitly requires that "the digest algorithm employed for generating SAIDs MUST have an approximate cryptographic strength of 128 bits" to ensure collision resistance remains computationally infeasible.
Autonomic Identifiers in KERI derive from public keys or digests of inception data. Collision resistance ensures:
KERI's use of qualified cryptographic primitives (via CESR encoding) includes derivation codes that specify the hash algorithm, enabling crypto-agility while maintaining collision resistance guarantees.
The KEL structure uses hash chaining where each event includes a digest of the previous event. Collision resistance is critical because:
KERI's pre-rotation mechanism commits to next keys via cryptographic digests. The security model assumes:
Daniel J. Bernstein's research on collision costs demonstrates that quantum computers provide no significant advantage for finding hash collisions compared to classical computers, making digest-based commitments quantum-resistant.
KERI's autonomic namespace design prevents namespace collisions through:
Collision resistance provides several critical security properties:
Integrity Protection: In KERI's ACDC credentials, the top-level SAID serves as a Merkle root. Signing the compact variant (containing only SAIDs) provides cryptographic commitment to the entire credential tree. Collision resistance ensures this commitment cannot be forged—an attacker cannot create different credential content that produces the same SAID.
Duplicity Evidence: KERI's security model makes duplicity (conflicting versions of a KEL) evident rather than hidden. This depends on collision resistance: if an attacker could create colliding event digests, they could mask duplicitous behavior by making conflicting events appear consistent.
Content Addressability: SAIDs enable content-addressable storage and retrieval. Collision resistance guarantees that retrieving content by its SAID returns the correct, unambiguous content.
Collision resistance comes with computational costs:
Hash Function Selection: KERI specifies Blake3-256 as a preferred hash function, balancing:
Digest Size: 256-bit digests (32 bytes) provide 128-bit collision resistance. CESR encoding represents these as 44-character Base64 strings in text domain, creating overhead in data structures but ensuring security.
Birthday Attacks: The birthday paradox means collision probability grows with the square of the number of hash computations. For 128-bit collision resistance:
Rainbow Table Attacks: Pre-computed hash tables could enable collision searches for low-entropy inputs. KERI mitigates this through:
Quantum Computing: While quantum computers threaten signature schemes (via Shor's algorithm), they provide no significant advantage for finding hash collisions. Grover's algorithm offers only quadratic speedup, meaning:
Systems implementing KERI must ensure:
Approved Hash Functions: Use only hash functions with proven collision resistance:
Proper Entropy Sources: Identifier generation requires high-quality randomness:
Derivation Code Verification: CESR-encoded primitives include derivation codes specifying the hash algorithm. Verifiers must:
Collision resistance involves inherent trade-offs:
Security vs. Efficiency: Stronger collision resistance (larger digests) increases:
KERI's choice of 256-bit hashes balances these factors, providing strong security (128-bit collision resistance) while maintaining reasonable efficiency.
Crypto-Agility vs. Simplicity: Supporting multiple hash algorithms enables:
CESR's derivation code system provides crypto-agility while maintaining interoperability through standardized code tables.