This comprehensive explanation has been generated from 42 GitHub source documents. All source documents are searchable here.
Last updated: October 7, 2025
Note: In rare cases it may contain LLM hallucinations.
For authoritative documentation, please consult the official GLEIF vLEI trainings and the ToIP Glossary.
A method of identifying and retrieving data using a cryptographic hash of the content itself as the address, rather than a location-based identifier, providing inherent integrity verification and deduplication properties.
A content-addressable hash is a cryptographic identifier derived by applying a one-way hash function to data content, where the resulting digest serves simultaneously as both the unique address for locating that data and a cryptographic commitment to its integrity. This approach fundamentally differs from traditional location-based addressing (such as URLs or file paths) by making the identifier intrinsically bound to the content through cryptographic properties.
In the KERI and ACDC ecosystems, content-addressable hashing forms the foundation for Self-Addressing Identifiers (SAIDs), enabling verifiable data structures where any modification to content produces a detectably different identifier. The hash function used must be collision-resistant and one-way, meaning it is computationally infeasible to find two different inputs producing the same hash or to reverse-engineer the original content from the hash alone.
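The contrast with location-based addressing can be sketched with a toy in-memory store. This is a minimal illustration, not KERI code: `ContentStore` and `content_address` are hypothetical names, and SHA3-256 from Python's standard library stands in for whichever algorithm an implementation selects.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return the hex SHA3-256 digest of data, used here as its address."""
    return hashlib.sha3_256(data).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by content hash (hypothetical helper)."""

    def __init__(self) -> None:
        self._blobs = {}  # maps address (str) to content (bytes)

    def put(self, data: bytes) -> str:
        address = content_address(data)
        # Identical content maps to the same key, so duplicates collapse.
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on retrieval: the address doubles as an integrity check.
        if content_address(data) != address:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
addr1 = store.put(b"hello vLEI")
addr2 = store.put(b"hello vLEI")   # same content, same address
addr3 = store.put(b"hello vLEI!")  # one byte differs, new address
```

Storing the same bytes twice yields one entry, and any change to the bytes yields a different address, which is the deduplication and tamper-evidence behavior described above.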
Content-addressable hashing serves three critical functions in KERI-based systems:
This primitive underlies KERI's approach to creating verifiable data structures and authentic data containers, where identifiers remain cryptographically bound to their content, a core requirement for verification mechanisms.
For new KERI implementations, Blake3-256 is the recommended hash algorithm due to its:
However, implementations should support multiple algorithms through the CESR derivation code system to enable:
Content-addressable hashing requires deterministic serialization to ensure the same content always produces the same hash. Critical requirements:
For SAID generation, the placeholder string (typically # characters) must be exactly the same length as the final SAID to ensure the serialization size remains constant.
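A small sketch of why both requirements matter, assuming compact JSON with preserved field order as the serialization. The `serialize` helper and the `record` fields are illustrative, not taken from a KERI library.

```python
import hashlib
import json


def serialize(obj: dict) -> bytes:
    """Compact JSON: insertion order kept, no insignificant whitespace."""
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


SAID_LENGTH = 44  # characters of a CESR-encoded 256-bit digest

record = {"d": "#" * SAID_LENGTH, "name": "example", "value": 1}

# Same logical content, different whitespace: the bytes and the hash differ.
compact = serialize(record)
pretty = json.dumps(record, indent=2).encode("utf-8")
same_content_different_hash = (
    hashlib.sha3_256(compact).digest() != hashlib.sha3_256(pretty).digest()
)

# Swapping the placeholder for a value of equal length keeps the size fixed.
filled = dict(record, d="E" + "A" * (SAID_LENGTH - 1))
size_is_constant = len(serialize(filled)) == len(compact)
```

Here the pretty-printed form hashes differently even though it carries the same data, and the serialized size is unchanged once the placeholder is replaced by a 44-character value.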
Content-addressable hashing can be performance-critical in high-throughput scenarios:
Collision Resistance: While 256-bit hashes provide strong collision resistance, implementations should:
Preimage Attacks: Content-addressable hashes used as commitments to private data should:
Content-addressable hashing is classified as a cryptographic primitive within the CESR (Composable Event Streaming Representation) framework. It represents a fundamental building block for more complex constructs like SAIDs, seals, and digests used throughout KERI event logs and ACDC credential structures.
KERI supports multiple cryptographic hash algorithms through its derivation code system, with Blake3-256 being the preferred default for new implementations due to its performance and security characteristics. The specification also supports:
The choice of algorithm is encoded in the CESR derivation code that prepends the hash output, enabling cryptographic agility—the ability to support multiple algorithms and migrate to stronger ones as needed without breaking existing systems.
Content-addressable hashes in KERI must satisfy three critical security properties:
Collision Resistance: It must be computationally infeasible to find two different inputs that produce the same hash output. This property ensures that each unique piece of content has a unique identifier. For 256-bit hash functions, the collision resistance provides approximately 128 bits of security against birthday attacks.
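The 128-bit figure follows from the birthday bound. A rough calculation, using the standard approximation for the probability of at least one collision among n random b-bit hashes (the function name is illustrative):

```python
import math


def collision_probability(num_items: int, hash_bits: int = 256) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -(num_items * (num_items - 1)) / float(2 ** (hash_bits + 1))
    return -math.expm1(exponent)


# A trillion hashed items leave the collision probability negligible.
p_trillion = collision_probability(10**12)

# Only around 2^128 items does a collision become likely (about 0.39).
p_at_bound = collision_probability(2**128)
```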
Preimage Resistance (One-Way Property): Given a hash output, it must be computationally infeasible to find any input that produces that hash. This property ensures that the hash cannot be reversed to reveal the original content, which is critical for privacy-preserving applications where hashes serve as commitments to undisclosed data.
Second Preimage Resistance: Given an input and its hash, it must be computationally infeasible to find a different input that produces the same hash. This property prevents attackers from substituting malicious content while maintaining the same identifier.
These properties collectively ensure that content-addressable hashes provide tamper-evidence—any modification to content produces a different hash, making tampering immediately detectable through identifier mismatch.
KERI standardizes on 256-bit (32-byte) hash outputs for content-addressable identifiers, providing:
When encoded in CESR text format using Base64 URL-safe encoding, a 256-bit hash with its derivation code occupies 44 characters. In binary format, the same hash with derivation code occupies 33 bytes (1-byte code + 32-byte hash).
Content-addressable hashes in KERI are encoded using CESR (Composable Event Streaming Representation), which provides dual text-binary encoding with composability properties. The encoding structure consists of:
For example, a Blake3-256 hash in CESR text format:
ELvaU6Z-i0d8JJR2nmwyYAZAoTNZH3UfSVPzhzS6b5CM
Breaking down this encoding:
E: Derivation code for Blake3-256 digest
LvaU6Z-i0d8JJR2nmwyYAZAoTNZH3UfSVPzhzS6b5CM: Base64 URL-safe encoding of the 32-byte hash
Text Domain (Base64 URL-safe):
Uses the characters A-Z, a-z, 0-9, -, _
No padding characters (=) due to CESR's alignment properties
Binary Domain:
CESR's design ensures that any set of concatenated primitives in text domain can be converted to binary domain and back without loss, enabling efficient streaming and storage while maintaining human readability when needed.
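The 44-character and 33-byte sizes, and the lossless round trip, can be reproduced with the standard library. This is a sketch of the encoding for a one-character code with a 32-byte digest only, not a general CESR codec. It uses Blake2b-256 (code F among the derivation codes this article lists) because Blake3 is not in Python's standard library, and the function names are illustrative.

```python
import base64
import hashlib

DIGEST_CODE = "F"  # Blake2b-256


def digest_to_text(raw: bytes, code: str = DIGEST_CODE) -> str:
    """Encode a 32-byte digest as a 44-character text-domain primitive."""
    if len(raw) != 32:
        raise ValueError("expected a 32-byte digest")
    # One zero lead byte aligns 32 bytes to 33 (a multiple of 3), so the
    # Base64 output needs no '=' padding and begins with the character 'A'.
    b64 = base64.urlsafe_b64encode(b"\x00" + raw).decode("ascii")
    # The derivation code overwrites that leading 'A'.
    return code + b64[1:]


def text_to_binary(text: str) -> bytes:
    """Text domain (44 characters) to binary domain (33 bytes)."""
    return base64.urlsafe_b64decode(text)


def binary_to_text(binary: bytes) -> str:
    """Binary domain back to text domain."""
    return base64.urlsafe_b64encode(binary).decode("ascii")


raw = hashlib.blake2b(b"some content", digest_size=32).digest()
text = digest_to_text(raw)
binary = text_to_binary(text)
```

Because 33 bytes map onto exactly 44 Base64 characters, no bits are left over in either direction, which is why the conversion loses nothing.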
KERI uses single-character derivation codes in text domain to indicate hash algorithms:
E: Blake3-256 (recommended default)
F: Blake2b-256
G: Blake2s-256
H: SHA3-256
I: SHA2-256
These codes are part of the CESR code table, which defines the complete set of cryptographic primitives supported by KERI. The derivation code system enables algorithm agility: systems can support multiple hash functions simultaneously and migrate to stronger algorithms as cryptographic research advances.
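One way to picture algorithm agility is a dispatch table keyed by the derivation code. The sketch below maps the codes listed above to Python standard-library hash functions. Blake3 is not in the standard library, so code E is left without an implementation here, and the table and function names are illustrative.

```python
import hashlib

# Derivation codes as listed above, mapped to hashlib-based functions.
# Blake3 needs a third-party library, so code "E" maps to None here.
DIGEST_CODES = {
    "E": ("Blake3-256", None),
    "F": ("Blake2b-256", lambda data: hashlib.blake2b(data, digest_size=32).digest()),
    "G": ("Blake2s-256", lambda data: hashlib.blake2s(data, digest_size=32).digest()),
    "H": ("SHA3-256", lambda data: hashlib.sha3_256(data).digest()),
    "I": ("SHA2-256", lambda data: hashlib.sha256(data).digest()),
}


def algorithm_for(primitive: str) -> str:
    """Return the algorithm name indicated by a primitive's leading code."""
    name, _ = DIGEST_CODES[primitive[0]]  # KeyError for an unknown code
    return name


def raw_digest(code: str, data: bytes) -> bytes:
    """Hash data with the algorithm selected by code."""
    name, hasher = DIGEST_CODES[code]
    if hasher is None:
        raise NotImplementedError(f"{name} needs a third-party library")
    return hasher(data)
```

A verifier reads the first character of a digest primitive, looks up the algorithm, and recomputes the hash with it, so adding a new algorithm means adding a table entry instead of changing the data format.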
Content-addressable hashes appear throughout KERI and ACDC data structures:
Key Event Logs (KELs):
Authentic Chained Data Containers (ACDCs):
The d field in every ACDC contains a self-addressing identifier computed as a hash of the entire structure
The s field may contain a hash reference to a JSON Schema
The a field may contain a hash of attribute blocks for compact disclosure
The e field uses hashes to link ACDCs in directed acyclic graphs
Transaction Event Logs (TELs):
Self-Addressing Identifiers (SAIDs):
The most sophisticated use of content-addressable hashing in KERI is the SAID protocol, which creates identifiers that are embedded within the content they identify. The generation process:
The SAID field is first filled with a placeholder of the same length as the final SAID (# characters)
This creates a self-referential identifier that is simultaneously part of the data and a cryptographic commitment to it.
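A minimal sketch of SAID generation under stated assumptions: compact JSON as the serialization, Blake2b-256 (code F) standing in for Blake3-256, and the encoding for a one-character code with a 32-byte digest. Real ACDC and KERI serializations also carry a version string, which is omitted here, and `saidify` and `verify_said` are illustrative names.

```python
import base64
import hashlib
import json

SAID_LENGTH = 44
CODE = "F"  # Blake2b-256, used here because Blake3 is not in the stdlib


def serialize(obj: dict) -> bytes:
    """Compact JSON with insertion order preserved."""
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def saidify(obj: dict, label: str = "d") -> dict:
    """Return a copy of obj whose `label` field holds its SAID."""
    # 1. Fill the SAID field with a placeholder of the final length.
    draft = dict(obj)
    draft[label] = "#" * SAID_LENGTH
    # 2. Serialize and hash the placeholder-bearing content.
    raw = hashlib.blake2b(serialize(draft), digest_size=32).digest()
    # 3. Encode the digest with its derivation code (44 characters).
    said = CODE + base64.urlsafe_b64encode(b"\x00" + raw).decode("ascii")[1:]
    # 4. Substitute the SAID for the placeholder.
    draft[label] = said
    return draft


def verify_said(obj: dict, label: str = "d") -> bool:
    """Recompute the SAID over the placeholder form and compare."""
    return saidify(obj, label)[label] == obj[label]


credential = saidify({"d": "", "issuer": "example-issuer", "claim": "value"})
```

Verification reuses the same steps: put the placeholder back, recompute, and compare with the embedded value, so `verify_said(credential)` holds until any field is altered.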
Compact Disclosure:
ACDCs use content-addressable hashes to enable graduated disclosure mechanisms:
This pattern allows credential holders to progressively reveal information while maintaining verifiable commitments to the complete credential structure.
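The commit-then-reveal idea behind this pattern can be sketched in a few lines. This is a simplification: it commits to the attribute block with a plain hex digest, whereas the a field described above holds a hash in SAID form, and the field values and function names are made up for illustration.

```python
import hashlib
import json


def block_digest(block: dict) -> str:
    """Hex SHA3-256 of the compact JSON form of block (simplified commitment)."""
    data = json.dumps(block, separators=(",", ":")).encode("utf-8")
    return hashlib.sha3_256(data).hexdigest()


attributes = {"name": "Example Corp", "role": "signer"}

# Compact form: the attribute block is replaced by its digest.
compact_credential = {"issuer": "example-issuer", "a": block_digest(attributes)}


def verify_disclosure(compact: dict, disclosed: dict) -> bool:
    """Check a later-disclosed block against the commitment in the compact form."""
    return compact["a"] == block_digest(disclosed)
```

The holder can share `compact_credential` first and reveal `attributes` later. The verifier accepts the revealed block only if it hashes to the value already committed.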
Merkle Tree Commitments:
While not explicitly detailed in the provided sources, content-addressable hashing enables Merkle tree structures for efficient verification of large data sets. The ACDC specification mentions support for Merkle proofs in selective disclosure scenarios.
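For readers unfamiliar with the construction, a generic binary Merkle tree with inclusion proofs might look like the following. It is a textbook sketch, not the ACDC selective-disclosure mechanism. Pairing an odd node with itself and the one-byte leaf and node prefixes are choices made for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def _next_level(level: list) -> list:
    """Hash adjacent pairs; an odd trailing node is paired with itself."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list) -> bytes:
    """Root of a binary Merkle tree over the given leaf byte strings."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list, index: int) -> list:
    """Sibling hashes from leaf to root, each tagged True if on the left."""
    level = [h(b"\x00" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    """Rebuild the path from leaf to root and compare with the known root."""
    node = h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"\x01" + pair)
    return node == root
```

A proof contains one sibling hash per tree level, so checking membership of one item among many requires only a logarithmic number of hashes.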
Verifying content-addressable hashes follows a standard protocol:
Basic Hash Verification:
SAID Verification:
Replace the embedded SAID with the placeholder (# characters of same length)
Chain Verification:
For chained structures like KELs:
Any mismatch in hash verification indicates either data corruption or malicious tampering, triggering rejection of the invalid data.
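Basic and chain verification can be sketched as follows. The events are plain dictionaries with illustrative fields (`s` for a sequence number, `p` for the digest of the prior event, `data` for a payload), digests are hex SHA3-256 instead of CESR-encoded values, and the comparison uses `hmac.compare_digest` so that it runs in constant time.

```python
import hashlib
import hmac
import json


def digest(data: bytes) -> str:
    return hashlib.sha3_256(data).hexdigest()


def verify_content(data: bytes, expected: str) -> bool:
    """Recompute the hash and compare it in constant time."""
    return hmac.compare_digest(digest(data), expected)


def serialize(event: dict) -> bytes:
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


def verify_chain(events: list) -> bool:
    """Check that each event's 'p' field is the digest of the prior event."""
    for prior, current in zip(events, events[1:]):
        if not verify_content(serialize(prior), current["p"]):
            return False
    return True


first = {"s": 0, "p": "", "data": "inception"}
second = {"s": 1, "p": digest(serialize(first)), "data": "rotation"}
third = {"s": 2, "p": digest(serialize(second)), "data": "interaction"}
```

Changing any field of `first` breaks the link stored in `second`, so `verify_chain` returns `False` for the altered log and the data is rejected.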
Content-addressable hashes are closely related to digest primitives in CESR, which represent the raw cryptographic hash output with its derivation code. The distinction:
Digests appear in multiple contexts:
SAIDs represent the most sophisticated application of content-addressable hashing, where the hash is embedded within the content it identifies. This creates a self-referential identifier that provides:
SAIDs are used extensively in ACDCs for the d (identifier), s (schema), a (attributes), and e (edges) fields.
Seals use content-addressable hashes to anchor external data to key events. A seal consists of:
Seals enable KERI to make cryptographically verifiable commitments to arbitrary data (credentials, documents, transactions) without embedding the full data in the KEL.
Content-addressable hashes compose with other CESR primitives to create complex verifiable structures:
Hash Chains: Sequential events linked by digests of prior events
Merkle Trees: Hierarchical hash structures for efficient verification
Commitment Schemes: Hashes serve as cryptographic commitments to undisclosed data
Authenticated Data Structures: Combining hashes with signatures creates non-repudiable commitments
These composition patterns enable KERI's architecture of verifiable data structures that provide end-to-end verifiability without requiring trusted intermediaries.
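As one concrete instance of the commitment pattern, the sketch below commits to a value together with a random nonce. The nonce matters when the committed value has few possible options: a bare hash of "yes" or "no" could be matched by hashing both candidates. The function names and the length-prefix layout are assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def commit(value: bytes, nonce: bytes) -> str:
    """Hash a random nonce together with the value being committed to."""
    # Length-prefix the nonce so (nonce, value) pairs cannot be confused.
    prefix = len(nonce).to_bytes(2, "big")
    return hashlib.sha3_256(prefix + nonce + value).hexdigest()


def open_commitment(commitment: str, value: bytes, nonce: bytes) -> bool:
    """Check a revealed (value, nonce) pair against the commitment."""
    return hmac.compare_digest(commit(value, nonce), commitment)


nonce = secrets.token_bytes(16)     # 128 bits of entropy
commitment = commit(b"yes", nonce)  # can be shared before the value is revealed
```

The committer later reveals both the value and the nonce, and anyone holding the commitment can confirm that they match.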
Side-Channel Attacks: Implementations should:
When implementing content-addressable hashing with CESR:
Comprehensive testing should include:
Non-deterministic serialization: Ensure JSON serialization maintains field order and produces identical output for identical input.
Incorrect placeholder handling: SAID generation requires exact placeholder length matching the final SAID length.
Algorithm mismatch: Verify the derivation code matches the actual hash algorithm used.
Encoding errors: Ensure proper Base64 URL-safe encoding without padding characters.
Premature optimization: Prioritize correctness over performance in initial implementations; optimize only after profiling identifies bottlenecks.
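Several of these pitfalls can be caught with cheap structural checks before any hashing is done. The helper below is an illustrative sketch limited to 44-character digest primitives with the single-character codes listed earlier. It does not confirm that the code matches the algorithm actually used, which still requires recomputing the hash.

```python
import re

KNOWN_DIGEST_CODES = {"E", "F", "G", "H", "I"}
B64_URLSAFE = re.compile(r"[A-Za-z0-9_-]+")


def check_digest_text(primitive: str) -> list:
    """Return a list of problems found in a 44-character digest primitive."""
    problems = []
    if len(primitive) != 44:
        problems.append(f"length is {len(primitive)}, expected 44")
    if "=" in primitive:
        problems.append("contains '=' padding")
    elif not B64_URLSAFE.fullmatch(primitive):
        problems.append("contains characters outside the URL-safe alphabet")
    if primitive[:1] not in KNOWN_DIGEST_CODES:
        problems.append(f"unknown derivation code {primitive[:1]!r}")
    return problems
```

An empty list means the value is structurally plausible; each string in a non-empty list names one of the problems above.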