Canonicalization is the process of converting data that has multiple possible representations into a single, deterministic "standard" or "canonical" form, enabling consistent cryptographic operations, equivalence comparison, and verifiable data structures across KERI/ACDC systems.
Comprehensive Explanation
Canonicalization in KERI/ACDC Systems
Process Definition
Canonicalization (also called standardization or normalization) is a fundamental data transformation process in KERI and ACDC implementations that converts data structures with potentially multiple valid representations into a single, deterministic, reproducible form. This process is critical for cryptographic integrity because cryptographic hash functions and digital signatures require byte-exact input to produce consistent, verifiable outputs.
In the KERI ecosystem, canonicalization accomplishes several essential objectives:
Enables cryptographic verifiability: By ensuring data serializes identically across different systems, canonicalization allows SAID (Self-Addressing Identifier) computation to produce consistent digests
Supports equivalence comparison: Different representations of logically identical data can be compared by canonicalizing both and checking for byte-exact equality
Prevents malleability attacks: Deterministic serialization prevents attackers from creating alternative representations of signed data that would produce different signatures
Facilitates interoperability: Systems using different serialization libraries or programming languages can exchange verifiable data structures
Canonicalization is used throughout KERI operations, including SAID computation, key event signing and receipting, and ACDC issuance, presentation, and verification.
Key participants in canonicalization processes include issuers, verifiers, witnesses, and parsers, each of which must apply identical canonicalization rules to interoperate.
Implementation Notes
Critical Implementation Requirements
Field Ordering Strategy
ACDC Canonical Order: The canonical field order for ACDCs is schema-defined, not lexicographic. The schema's properties object defines the order in which fields must appear in canonical serializations. This is explicitly stated in Document 2: "The canonical ordering is defined by the JSON schema document, not lexicographical (alphabetical) order."
Implementation Approach:
Parse the JSON Schema to extract field order from properties object
Maintain insertion order when constructing data structures
Never use alphabetical sorting for canonicalization
Test with schemas that have non-alphabetical field orders to catch ordering bugs
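As a sketch of this approach, the helper below (hypothetical, not part of any KERI library) reorders a field map to match the order of the schema's properties object, recursing into nested objects:

```python
import json

def order_by_schema(data: dict, schema: dict) -> dict:
    """Reorder a field map to follow the schema's properties order.

    Illustrative helper: assumes the schema is a JSON Schema whose
    'properties' keys appear in the required canonical order.
    """
    props = schema.get("properties", {})
    ordered = {}
    for name in props:                      # dict preserves insertion order (3.7+)
        if name in data:
            value = data[name]
            sub = props[name]
            # Recurse into nested field maps that carry their own sub-schema
            if isinstance(value, dict) and sub.get("type") == "object":
                value = order_by_schema(value, sub)
            ordered[name] = value
    # Keep any fields the schema does not mention, in their insertion order
    for name in data:
        if name not in ordered:
            ordered[name] = data[name]
    return ordered

schema = {"properties": {"v": {}, "d": {}, "i": {},
                         "a": {"type": "object",
                               "properties": {"name": {}, "dob": {}}}}}
data = {"i": "EAbc", "a": {"dob": "2001-01-01", "name": "Ann"}, "d": "", "v": "ACDC10JSON"}
canon = order_by_schema(data, schema)
print(json.dumps(canon, separators=(",", ":")))
```

Note that the function never sorts: field order comes entirely from the schema, with insertion order as the fallback for unlisted fields.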
SAID Placeholder Handling
Placeholder Requirements:
Must be exactly 44 characters for Blake3-256 with CESR encoding
Typically uses # (ASCII 35) characters: ############################################
Must be replaced byte-for-byte with computed SAID
Placeholder length varies with the digest size, not the algorithm family: 44 characters for 256-bit digests (Blake3-256, SHA-256) and 88 characters for 512-bit digests (Blake3-512, SHA-512)
Common Mistake: Using arbitrary placeholder lengths or forgetting to account for CESR derivation code in length calculation.
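The length arithmetic can be checked directly: in CESR the derivation code exactly fills the Base64 pad characters, so a code-plus-digest primitive occupies 4 * ceil(raw_bytes / 3) text characters. A minimal sketch (function names are illustrative):

```python
import math

def said_text_len(digest_bytes: int) -> int:
    # CESR text-domain length: the derivation code replaces the Base64
    # pad characters, so the total is 4 * ceil(digest_bytes / 3)
    return 4 * math.ceil(digest_bytes / 3)

def said_placeholder(digest_bytes: int = 32) -> str:
    # A run of '#' exactly as long as the final SAID will be
    return "#" * said_text_len(digest_bytes)

print(said_text_len(32))   # 44 -- Blake3-256 or SHA-256 (32-byte digest)
print(said_text_len(64))   # 88 -- Blake3-512 or SHA-512 (64-byte digest)
```

Deriving the placeholder from the digest size this way avoids both common mistakes: arbitrary lengths and forgetting the derivation code.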
Recursive Canonicalization
Nested Structure Handling: ACDCs often contain nested field maps (attributes, edges, rules sections). Canonicalization must be applied recursively:
Start with innermost nested structures
Canonicalize and compute SAIDs for leaf nodes
Embed computed SAIDs into parent structures
Continue recursively until top-level SAID is computed
Document 26 provides detailed guidance: "The SAIDification process must proceed from the innermost blocks outward. This recursive approach ensures that: 1. Leaf-level SAIDs are calculated first, 2. These SAIDs are embedded into their parent structures, 3. Parent-level SAIDs are then calculated."
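A minimal sketch of this innermost-outward recursion, using SHA-256 with the assumed CESR code 'I' so it runs on the standard library (production implementations typically use Blake3-256, code 'E', via a real CESR library such as keripy):

```python
import base64
import hashlib
import json

def compute_said(field_map, label="d"):
    """Toy SAID: SHA-256 over the compact serialization with a 44-char
    placeholder in `label`, CESR-style encoded with assumed code 'I'."""
    basis = dict(field_map, **{label: "#" * 44})
    raw = hashlib.sha256(json.dumps(basis, separators=(",", ":")).encode()).digest()
    return "I" + base64.urlsafe_b64encode(b"\x00" + raw).decode()[1:]

def saidify(node):
    """SAIDify recursively from the innermost blocks outward."""
    if not isinstance(node, dict):
        return node
    out = {k: saidify(v) for k, v in node.items()}  # 1. leaf SAIDs first
    if "d" in out:
        out["d"] = compute_said(out)                # 2-3. embed, then parent SAID
    return out

acdc = {"d": "", "a": {"d": "", "name": "Ann"}}
result = saidify(acdc)
```

Because the nested block's SAID is embedded before the parent digest is computed, the top-level SAID covers every nested SAID beneath it.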
Serialization Format Handling
All formats must produce equivalent canonical forms
Conversion between formats must preserve field ordering
The v (version) field indicates serialization format
Key Participants
Issuers: Must canonicalize ACDC structures before computing SAIDs and signing
Verifiers: Must canonicalize received data structures to recompute digests and verify signatures
Witnesses: Canonicalize key events before creating receipts
Parsers: Must apply canonical ordering rules when deserializing data structures
Process Flow
Canonicalization for SAID Generation
The most critical canonicalization workflow in KERI is the SAID generation process, which requires careful ordering to handle the self-referential nature of SAIDs:
Step 1: Data Structure Population
Populate all fields in the data structure (e.g., ACDC credential)
Insert a placeholder value (typically # characters) where the SAID will eventually reside
The placeholder must be exactly the length of the final SAID (e.g., 44 characters for Blake3-256 with CESR encoding)
Step 2: Canonical Serialization
Apply insertion-order preservation for field maps (JSON objects)
The order of fields must match the schema-defined order, not lexicographic (alphabetical) order
For ACDCs, the canonical field order is defined by the JSON Schema document
Nested structures must recursively apply canonical ordering
Array elements maintain their positional order
Step 3: Identifiable Basis Creation
The canonically serialized data with placeholder becomes the identifiable basis
This representation is the input to the cryptographic hash function
The identifiable basis must be byte-exact reproducible across all implementations
Step 4: Cryptographic Digest Computation
Apply the specified hash algorithm (e.g., Blake3-256, SHA-256) to the identifiable basis
The hash algorithm is indicated by the CESR derivation code
The raw hash bytes are then encoded using CESR encoding
Step 5: SAID Embedding
Replace the placeholder in the identifiable basis with the computed SAID
This creates the saidified data - the final canonical form
The SAID is now cryptographically bound to the content
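The five steps above can be walked through end to end. This sketch uses SHA-256 with the assumed CESR code 'I' so it runs on the standard library alone (production ACDCs typically use Blake3-256, code 'E'); the issuer AID shown is a made-up example value:

```python
import base64
import hashlib
import json

# Step 1: populate fields, with a 44-char placeholder where the SAID goes
credential = {"v": "ACDC10JSON", "d": "#" * 44,
              "i": "EIssuerAIDexample", "a": {"name": "Ann"}}

# Steps 2-3: insertion-ordered, compact serialization = the identifiable basis
basis = json.dumps(credential, separators=(",", ":")).encode()

# Step 4: digest the basis, then encode CESR-style: prepend one lead byte,
# Base64url-encode the 33 bytes to 44 chars, and replace the leading 'A'
# (the encoding of the zero lead byte) with the derivation code
raw = hashlib.sha256(basis).digest()
said = "I" + base64.urlsafe_b64encode(b"\x00" + raw).decode()[1:]

# Step 5: replace the placeholder, producing the saidified canonical form
credential["d"] = said
```

Because the placeholder and the final SAID are both 44 bytes, the saidified form is byte-for-byte the same length as the identifiable basis.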
Canonicalization for Verification
When verifying a SAID or signature, the process reverses:
Extract the embedded SAID from the received data structure
Replace the embedded SAID with a placeholder of identical length
Canonically serialize to recreate the identifiable basis
Recompute the digest and compare it byte-for-byte with the extracted SAID
Error Conditions
Verification and canonicalization can fail for several reasons:
Malformed structures: Data that doesn't conform to the expected schema
SAID mismatches: The recomputed digest differs from the embedded SAID
Recovery Strategies:
Reject invalid data: Most errors should result in rejection
Graceful degradation: Unknown optional fields can be ignored
Detailed error reporting: Provide specific information about canonicalization failures for debugging
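The reverse verification workflow can be sketched as follows; this toy version uses SHA-256 with the assumed CESR code 'I' rather than Blake3, matching a toy issuance done the same way:

```python
import base64
import hashlib
import json

def verify_said(field_map, label="d"):
    """Reverse workflow: swap the embedded SAID for a placeholder,
    re-canonicalize, recompute the digest, and compare byte-for-byte."""
    embedded = field_map.get(label, "")
    if not isinstance(embedded, str) or len(embedded) != 44:
        return False                       # malformed: wrong SAID length
    basis = dict(field_map, **{label: "#" * 44})
    raw = hashlib.sha256(json.dumps(basis, separators=(",", ":")).encode()).digest()
    return "I" + base64.urlsafe_b64encode(b"\x00" + raw).decode()[1:] == embedded

# Build a valid structure the same way issuance does, then check it
cred = {"d": "#" * 44, "name": "Ann"}
raw = hashlib.sha256(json.dumps(cred, separators=(",", ":")).encode()).digest()
cred["d"] = "I" + base64.urlsafe_b64encode(b"\x00" + raw).decode()[1:]

print(verify_said(cred))                     # True
print(verify_said(dict(cred, name="Bob")))   # False: content was tampered
```

Note that the function rejects malformed input (wrong SAID length) before any digest is computed, and a tampered field changes every byte of the recomputed SAID.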
Usage Patterns
ACDC Schema Canonicalization
One of the most critical canonicalization use cases is schema SAIDification. ACDC schemas must be canonicalized to compute their SAIDs, which are then embedded in credentials to cryptographically bind credentials to their schemas.
Typical Workflow:
Author JSON Schema with empty $id fields ("") at all levels
Recursively canonicalize from innermost blocks outward
All nested structures must maintain canonical ordering
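This schema workflow can be sketched in the same toy style (SHA-256, assumed CESR code 'I'; real schema SAIDification uses a CESR library): each block that carries a $id field gets a placeholder, is digested over its canonical serialization, and receives its SAID, innermost blocks first.

```python
import base64
import hashlib
import json

def saidify_schema(schema):
    """SAIDify a JSON Schema from the innermost blocks outward, writing
    each block's SAID into its $id field."""
    out = {}
    for key, value in schema.items():        # preserve the authored order
        out[key] = saidify_schema(value) if isinstance(value, dict) else value
    if "$id" in out:                          # this block is identifiable
        out["$id"] = "#" * 44                 # placeholder first...
        raw = hashlib.sha256(json.dumps(out, separators=(",", ":")).encode()).digest()
        out["$id"] = "I" + base64.urlsafe_b64encode(b"\x00" + raw).decode()[1:]  # ...then SAID
    return out

schema = {"$id": "", "title": "Example",
          "properties": {"a": {"$id": "", "type": "object"}}}
saidified = saidify_schema(schema)
```

The top-level $id then serves as the schema SAID that credentials embed to bind themselves to this exact schema.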
Integration Considerations
Programming Language Differences:
Python: Use dict with insertion order (Python 3.7+)
JavaScript/TypeScript: Use Object or Map with insertion order
Rust: Use IndexMap or LinkedHashMap for ordered maps
Go: Use ordered map libraries (standard map is unordered)
Serialization Library Selection:
Choose libraries that support insertion-order preservation
Avoid libraries that automatically sort keys alphabetically
Test round-trip serialization/deserialization to verify ordering
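A round-trip check of this kind is a one-liner in Python, where the built-in json module preserves field order on both serialization and parsing:

```python
import json

def round_trips(data):
    """Serialize -> deserialize -> serialize must be byte-identical, which
    holds only if the parser preserves field order (Python's json does)."""
    first = json.dumps(data, separators=(",", ":"))
    second = json.dumps(json.loads(first), separators=(",", ":"))
    return first == second

print(round_trips({"v": "1", "d": "", "i": "EAbc", "a": {"name": "Ann"}}))  # True
```

A library that reorders keys during parsing or dumping would fail this check immediately, which makes it a cheap smoke test when evaluating serialization libraries.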
Schema Validation Integration:
Validate structure before canonicalization
Use JSON Schema validators that preserve field order
Ensure validators don't reorder fields during validation
Caching Strategies:
Cache canonical forms of frequently used schemas
Invalidate cache when schemas are updated
Be cautious with caching credentials (they may contain time-sensitive data)
Error Propagation:
Canonicalization errors should propagate to calling code
Provide detailed error messages for debugging
Log canonicalization failures for security monitoring
Common Pitfalls
Lexicographic Ordering Mistake:
Wrong: Sorting fields alphabetically
Correct: Using schema-defined or insertion order
This is the most common canonicalization error in KERI implementations
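The consequence of this mistake is easy to demonstrate: sorting keys changes the serialized bytes, so every digest computed downstream differs.

```python
import hashlib
import json

event = {"v": "KERI10JSON", "t": "icp", "d": "", "i": "EAbc"}

good = json.dumps(event, separators=(",", ":"))                  # insertion order
bad = json.dumps(event, separators=(",", ":"), sort_keys=True)   # alphabetical

print(good != bad)   # True: the bytes differ even though the data is identical
print(hashlib.sha256(good.encode()).hexdigest() ==
      hashlib.sha256(bad.encode()).hexdigest())   # False
```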
Placeholder Length Mismatch:
Wrong: Using arbitrary placeholder length
Correct: Placeholder must exactly match final SAID length (44 chars for Blake3-256)
Nested Structure Ordering:
Wrong: Only ordering top-level fields
Correct: Recursively applying canonical ordering to all nested structures
Serialization Format Inconsistency:
Wrong: Mixing JSON and CBOR without proper conversion
Correct: Using consistent serialization format throughout a workflow
Whitespace Handling:
Wrong: Including formatting whitespace in canonical form
Correct: Using compact serialization without unnecessary whitespace
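The same point in code: compact and pretty-printed serializations carry identical data but different bytes, so only the compact form is suitable as a canonical form.

```python
import json

data = {"d": "", "a": {"name": "Ann"}}

compact = json.dumps(data, separators=(",", ":"))   # canonical: no whitespace
pretty = json.dumps(data, indent=2)                 # formatting whitespace added

print(compact)                                      # {"d":"","a":{"name":"Ann"}}
print(json.loads(compact) == json.loads(pretty))    # True: same data, different bytes
```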
Performance Optimization
Lazy Canonicalization:
Only canonicalize when needed (e.g., before signing or verification)
Don't canonicalize data that won't be cryptographically processed
Incremental Canonicalization:
For large structures, canonicalize sections independently
Combine canonical sections to form complete structure
Parallel Processing:
Canonicalize independent data structures in parallel
Use worker threads for large batch operations
Memory Management:
Stream large data structures rather than loading entirely into memory
Use generators/iterators for processing large credential sets
Relationship to KERI Security Model
Canonicalization is foundational to KERI's security architecture:
Self-Certification: SAIDs computed over canonical forms enable self-certifying identifiers that don't require external registries
Duplicity Detection: Canonical forms ensure that different representations of the same event can be detected, enabling duplicity detection in KELs
End-Verifiability: Canonical serialization enables end-verifiable data structures where any party can independently verify integrity
Composability: CESR's composability property depends on canonical encoding of primitives
Authentic Chaining: ACDC chains rely on canonical SAIDs to cryptographically link credentials
Without deterministic canonicalization, none of KERI's core security properties would be achievable. It is the invisible foundation that makes cryptographic verifiability practical across heterogeneous systems.
Serialization Notes
Test round-trip conversions to verify equivalence
JSON Serialization: Use compact format without whitespace:
json.dumps(data, separators=(',', ':'))  # No spaces
Programming Language Considerations
Python (3.7+):
dict maintains insertion order by default
Use json.dumps() with separators=(',', ':') for compact output
OrderedDict is no longer necessary but can be used for clarity
JavaScript/TypeScript:
Object property order is insertion order (ES2015+)
Use JSON.stringify() for serialization
Be aware of numeric key sorting behavior
Rust:
Use IndexMap or LinkedHashMap for ordered maps
Standard HashMap is unordered and unsuitable
serde_json preserves order with appropriate map types
Go:
Standard map is unordered
Use third-party ordered map libraries
Consider using structs with explicit field ordering
Performance Optimization
Canonicalization is stateless and thread-safe, so independent structures can be canonicalized in parallel using worker threads for batch operations; the caching and lazy-evaluation strategies described earlier apply here as well.
Error Handling
Validation Before Canonicalization:
Validate structure against schema before canonicalization
Check for required fields, type correctness
Fail fast on structural errors
Detailed Error Messages:
Report which field caused canonicalization failure
Include expected vs. actual types
Provide context for debugging (e.g., path to problematic field)
Security Considerations:
Log canonicalization failures for security monitoring
Reject data with SAID mismatches immediately
Don't expose internal implementation details in error messages
Testing Requirements
Test Cases:
Round-trip testing: Serialize → Deserialize → Serialize should produce identical output
Cross-implementation testing: Same data canonicalized by different implementations should match
Edge cases: Empty objects, deeply nested structures, large arrays
Negative tests: Invalid placeholders, missing fields, type mismatches
Performance tests: Large credentials, batch processing
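The negative tests above can be driven by a canonicalizer that validates before serializing. The validation rules in this sketch (a string SAID field of exactly 44 characters) are illustrative, not normative:

```python
import json

def canonicalize(data, said_label="d"):
    """Canonical serializer that fails fast on structural errors."""
    said = data.get(said_label)
    if not isinstance(said, str) or len(said) != 44:
        raise ValueError(f"field '{said_label}' must be a 44-char SAID or placeholder")
    return json.dumps(data, separators=(",", ":")).encode()

# Positive case: valid placeholder passes
ok = canonicalize({"d": "#" * 44, "i": "EAbc"})

# Negative cases: bad placeholder length, missing SAID field
for bad in ({"d": "###"}, {"i": "EAbc"}):
    try:
        canonicalize(bad)
        raise AssertionError("invalid input was not rejected")
    except ValueError:
        pass
```

Failing fast here keeps malformed data from ever reaching the digest computation, which matches the validate-before-canonicalize guidance above.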
Test Vectors: Use official KERI test vectors to verify implementation correctness. Document 14 references test implementations in the qvi-software repository.
Integration with KERI Components
KEL Integration: Key events must be canonically serialized before signing. The KEL verification process recomputes digests over canonical forms.
ACDC Integration: Credentials must maintain canonical ordering throughout their lifecycle: issuance, presentation, verification.
CESR Integration: CESR primitives are canonically encoded. The composability property depends on canonical representation.
IPEX Integration: Presentation exchanges require canonicalization for both compact and full disclosure variants.
Additional Pitfalls
Beyond the pitfalls listed earlier (lexicographic ordering, whitespace handling, placeholder length, nested ordering, format mixing), two more deserve attention:
Caching Stale Data: Invalidate cached canonical forms when underlying data changes.
Type Coercion: Ensure numeric types are preserved (don't convert integers to strings during serialization).
Debugging Techniques
Diff Canonical Forms: When SAID verification fails, diff the expected and actual canonical forms to identify discrepancies.
Hex Dump Comparison: For binary formats, hex dump both forms to identify byte-level differences.
Field Order Inspection: Print field order before and after canonicalization to verify ordering logic.
Hash Intermediate Steps: Log intermediate values (identifiable basis, hash bytes, encoded SAID) to isolate where the process diverges.
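The diff technique can be sketched with the standard library: pretty-print both structures so each field lands on its own line, then diff line by line (the helper name is illustrative):

```python
import difflib
import json

def diff_canonical(expected, actual):
    """Pretty-print both structures and diff them line by line so the
    field that breaks SAID verification can be spotted immediately."""
    a = json.dumps(expected, indent=1).splitlines()
    b = json.dumps(actual, indent=1).splitlines()
    return [line for line in difflib.unified_diff(a, b, lineterm="")
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]

print(diff_canonical({"d": "", "name": "Ann"}, {"d": "", "name": "Bob"}))
```

The returned lines show only the mismatched fields, which is usually enough to localize an ordering bug or a tampered value without a byte-level hex dump.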
Security Implications
Malleability Prevention: Canonical serialization prevents attackers from creating alternative representations of signed data that would produce different signatures.
Duplicity Detection: Canonical forms enable detection of conflicting representations of the same event in KELs.
Replay Attack Prevention: Combined with timestamps and nonces, canonical forms help prevent replay attacks.
Side-Channel Resistance: Deterministic canonicalization reduces timing side-channels in cryptographic operations.