vLEI.wiki Comprehensive knowledge base for KERI (Key Event Receipt Infrastructure) and vLEI (verifiable Legal Entity Identifier) ecosystem.
Made by Key State Capital .
© 2025 vLEI.wiki. Educational resource for KERI/vLEI ecosystem.
CESR - vLEI.wiki | KERI Knowledge Base - vLEI.wiki
Short Definition
CESR (Composable Event Streaming Representation) is a dual text-binary encoding protocol that provides a self-framing, composable representation of cryptographic primitives and structured data, enabling lossless round-trip conversion between human-readable text and compact binary formats while maintaining primitive separability.
Comprehensive Explanation CESR (Composable Event Streaming Representation)
Protocol Definition
Core Purpose and Objectives
CESR addresses a fundamental challenge in cryptographic protocol design: representing cryptographic primitives (digests, keys, signatures) and structured data in formats that are simultaneously:
Human-readable for debugging, logging, and text-based protocols
Compact for efficient network transmission and storage
Self-describing without requiring external schemas
Composable across text and binary domains
Traditional encodings like Base64 provide only value information without type or size metadata. CESR solves this by prepending derivation codes (also called framing codes) that encode type and size information, making each primitive self-framing.
The protocol's defining innovation is text-binary concatenation composability: the ability to convert groups of concatenated primitives between text (T) and binary (B) domains en masse without loss, while maintaining the separability of individual primitives. Formally:
T(cat(b[k])) = cat(T(b[k])) and B(cat(t[k])) = cat(B(t[k])) for all k
This property is essential for streaming protocols where primitives need efficient processing in both domains.
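The composability property can be checked with ordinary URL-safe Base64, provided every chunk is a multiple of 3 bytes. The following is a minimal sketch rather than a CESR implementation: `T` and `B` are named after the notation above, and the sample chunks are arbitrary placeholder bytes.

```python
import base64


def T(b: bytes) -> str:
    """Binary -> text. Requires 24-bit alignment (a multiple of 3 bytes)."""
    if len(b) % 3:
        raise ValueError("binary input is not aligned on a 24-bit boundary")
    return base64.urlsafe_b64encode(b).decode("ascii")


def B(t: str) -> bytes:
    """Text -> binary. Requires 24-bit alignment (a multiple of 4 characters)."""
    if len(t) % 4:
        raise ValueError("text input is not aligned on a 24-bit boundary")
    return base64.urlsafe_b64decode(t)


# Placeholder chunks, each a multiple of 3 bytes.
chunks = [bytes([1, 2, 3]), bytes(range(6)), b"\xff" * 9]
texts = [T(c) for c in chunks]

# T(cat(b[k])) == cat(T(b[k]))
text_of_concat = T(b"".join(chunks))
concat_of_texts = "".join(texts)

# B(cat(t[k])) == cat(B(t[k]))
binary_of_concat = B("".join(texts))
concat_of_binaries = b"".join(B(t) for t in texts)
```

Because each chunk ends on a 24-bit boundary, converting the concatenation gives the same result as concatenating the conversions, in both directions.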
Implementation Notes
Critical Implementation Requirements
24-Bit Alignment
All CESR primitives MUST align on 24-bit boundaries:
Text domain: Integer multiples of 4 Base64 characters
Binary domain: Integer multiples of 3 bytes
Pad size calculation: ps = (3 - (N mod 3)) mod 3, where N is the raw byte length
Pre-Padding Strategy
CESR uses leading pad bytes (not trailing = characters):
Prepend zero bytes to the raw value before Base64 conversion
Replace the leading Base64 characters with the derivation code
This ensures composability without = pad characters
Code Table Versioning
Implementations MUST support:
Version count codes to specify active code table
Multiple simultaneous versions for backward compatibility
Dynamic table loading when encountering new versions
Stream Parsing
Implementations SHOULD provide:
Cold start re-synchronization using count codes
Format detection (sniffer) for interleaved serializations
Group extraction without parsing individual primitives
Error isolation to prevent cascade failures
Performance Optimization
Text Domain :
Pre-compute code-to-type lookup tables
Minimize string copying operations
Batch convert groups when possible
Binary Domain :
Use zero-copy parsing with byte slices
Ensure proper memory alignment
Consider SIMD for Base64 operations
Testing Requirements
Implementations MUST validate:
Round-trip conversion : T(B(T(x))) = T(x) and B(T(B(x))) = B(x)
Composability : Group conversion equals concatenation of individual conversions
Interoperability : Cross-implementation compatibility using test vectors
Common Pitfalls
Alignment errors: Failing to ensure 24-bit boundaries
CESR is formally specified in:
IETF Draft : draft-ssmith-cesr (Composable Event Streaming Representation)
Trust over IP Foundation : TSWG CESR Specification v0.9 Draft
Related Specifications :
draft-pfeairheller-cesr-proof - CESR Proof Signatures extension
draft-ssmith-said - Self-Addressing Identifiers (uses CESR encoding)
draft-ssmith-keri - KERI protocol (primary consumer of CESR)
Version History and Evolution CESR evolved from the need to support KERI's cryptographic event streaming requirements:
Early Development (2020-2021) : Initial design focused on Base64 text encoding with derivation codes
Binary Domain Addition : Extended to support compact binary representation
Composability Formalization : Mathematical proof of text-binary composability property
Code Table Expansion : Multiple code tables for different primitive types and sizes
CESR-Proof Extension (2022-2023) : Added support for transposable signature attachments on self-addressing data
Current Status (v0.9 Draft) : Stable specification with multiple production implementations
Protocol Architecture
Three-Domain Model CESR operates across three abstract domain representations:
1. Text (T) Domain
Character Set : URL-safe Base64 (A-Z, a-z, 0-9, -, _) per RFC 4648
Encoding : 6 bits per character
Alignment : Primitives must be integer multiples of 4 characters (24 bits)
Purpose : Human readability, text protocols, debugging
Key Property : Stable type coding (type information never shares bits with length/value)
2. Binary (B) Domain
Representation : Raw bytes (8 bits each)
Alignment : Primitives must be integer multiples of 3 bytes (24 bits)
Purpose : Compact transmission, efficient storage
Key Property : Maintains composability with text domain
3. Raw (R) Domain
Representation : Tuple (text_code, raw_binary)
Purpose : Actual cryptographic operations
Usage : Cryptographic libraries work with raw bytes
Conversion : Requires transformation to/from T or B domains for transmission
24-Bit Alignment Constraint
The composability property requires strict 24-bit boundary alignment:
24 bits is the least common multiple of 6 (Base64 character width) and 8 (byte width)
Text domain: 4 characters × 6 bits = 24 bits
Binary domain: 3 bytes × 8 bits = 24 bits
Consequence: Without alignment, conversions would create bit-level dependencies between adjacent primitives
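The consequence of misalignment is easy to reproduce with plain Base64. In this small sketch the byte values are arbitrary placeholders; it compares encoding two values separately with encoding their concatenation, first unaligned and then padded to 3 bytes each.

```python
import base64


def b64(raw: bytes) -> str:
    """Standard URL-safe Base64, including trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).decode("ascii")


# Two unaligned 1-byte values: the encodings do not compose.
a, b = b"\x01", b"\x02"
separate_unaligned = b64(a) + b64(b)
joint_unaligned = b64(a + b)

# The same values with two leading zero bytes each (3-byte aligned).
a3, b3 = b"\x00\x00" + a, b"\x00\x00" + b
separate_aligned = b64(a3) + b64(b3)
joint_aligned = b64(a3 + b3)
```

In the unaligned case the bits of the second value are packed into characters shared with the first, so the joint encoding differs from the concatenated encodings. In the aligned case the two are identical.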
Self-Framing Architecture
Every CESR primitive is self-framing: it contains all the information needed to parse it, without external delimiters:
Type information: Encoded in the derivation code prefix
Size information: Either fixed (implicit in the code) or variable (explicit in the code)
Value: The actual cryptographic material or data
Example, using the digest EH7Oq9oxCgYa-nnNLvwhp9sFZpALILlRYyB-6n4WDi7w from the KERI event shown under KERI Protocol Integration:
E = derivation code (Blake3-256 digest)
Implies a 32-byte raw value
Total: 44 characters (aligned on a 24-bit boundary)
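Self-framing means a parser can split a concatenated stream using nothing but a code table. The sketch below assumes a tiny, hypothetical subset of such a table (`SIZES`), built from the sizes quoted in this article (44 characters for a 32-byte value, 88 for a 64-byte signature), and assumes that in this subset a leading `0` marks a two-character code. The primitive values in the usage lines are placeholders, not real keys or digests.

```python
from typing import List

# Hypothetical subset of a code table: derivation code -> total text length.
SIZES = {"D": 44, "E": 44, "0B": 88}


def split_stream(stream: str) -> List[str]:
    """Split concatenated fixed-size primitives without any delimiters."""
    primitives = []
    i = 0
    while i < len(stream):
        # In this subset, a leading "0" selects a two-character code.
        code = stream[i:i + 2] if stream[i] == "0" else stream[i]
        size = SIZES.get(code)
        if size is None:
            raise ValueError(f"unknown derivation code {code!r} at offset {i}")
        if i + size > len(stream):
            raise ValueError(f"truncated primitive at offset {i}")
        primitives.append(stream[i:i + size])
        i += size
    return primitives


# Placeholder primitives (correct lengths, dummy content).
digest = "E" + "A" * 43
key = "D" + "B" * 43
signature = "0B" + "C" * 86
parts = split_stream(digest + key + signature)
```

An unknown code raises immediately instead of guessing a length, which is what keeps a malformed primitive from being silently merged into its neighbours.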
Code Table Organization
CESR uses multiple code tables optimized for different requirements:
Fixed-Size Tables
Small Fixed Raw Size (1-character codes):
Pad size 1 primitives
Examples: A (random seed), B (Ed25519 public key), E (Blake3-256 digest)
Large Fixed Raw Size (2-character codes):
Pad size 2 primitives
Examples: 0B (Ed25519 signature), 0D (Blake3-512 digest)
Variable-Size Tables
Small Variable Raw Size (2-character type selector plus 2 size characters):
Length encoded in the two Base64 characters that follow the selector
Supports up to 4095 quadlets (text) or triplets (binary)
Large Variable Raw Size (4-character type selector plus 4 size characters):
Length encoded in the four Base64 characters that follow the selector
Supports up to 16,777,215 quadlets/triplets
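The two limits follow from how many Base64 characters carry the length: a field of n characters holds values up to 64^n - 1. The helper names below are illustrative only; the sketch just converts between integers and fixed-width Base64 fields using the URL-safe alphabet.

```python
B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def int_to_b64(n: int, length: int) -> str:
    """Encode n as exactly `length` Base64 characters (big-endian)."""
    if not 0 <= n < 64 ** length:
        raise ValueError(f"{n} does not fit in {length} Base64 characters")
    chars = []
    for _ in range(length):
        n, r = divmod(n, 64)
        chars.append(B64_ALPHABET[r])
    return "".join(reversed(chars))


def b64_to_int(s: str) -> int:
    """Decode a Base64 character field back to an integer."""
    n = 0
    for ch in s:
        n = n * 64 + B64_ALPHABET.index(ch)
    return n


small_max = b64_to_int("__")     # two size characters
large_max = b64_to_int("____")   # four size characters
```

With two size characters the largest length is 4095 and with four it is 16,777,215, matching the figures above.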
Count Codes (Group Framing)
Purpose: Enable grouping of primitives for pipelining
Primitive count codes: Specify number of primitives in group
Byte/character count codes: Specify total size of group
Nested group codes: Support hierarchical composition
Example:
-AAC<primitive1><primitive2>
-AAC = count code whose two-character count field, AC, is the Base64 value 2, indicating that 2 primitives follow
Enables extraction of the group without parsing individual primitives
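Group extraction can be sketched as a slice operation. The function below is an illustration under stated assumptions, not a parser for any particular code table: it assumes a 4-character count code (2-character selector plus 2-character Base64 count), that the count is a number of primitives, and that every primitive in the group has the same known text length (`item_size`). A table that counts characters or quadlets instead would only change how `end` is computed.

```python
from typing import Tuple

B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def b64_to_int(s: str) -> int:
    n = 0
    for ch in s:
        n = n * 64 + B64_ALPHABET.index(ch)
    return n


def extract_group(stream: str, item_size: int) -> Tuple[int, str, str]:
    """Return (count, group_text, remainder) without parsing the group's contents."""
    if len(stream) < 4 or stream[0] != "-":
        raise ValueError("stream does not start with a count code")
    count = b64_to_int(stream[2:4])
    end = 4 + count * item_size
    if len(stream) < end:
        raise ValueError("stream is shorter than the count code announces")
    return count, stream[4:end], stream[end:]


# Two placeholder 88-character signatures followed by unrelated data.
sig1 = "AA" + "x" * 86
sig2 = "AB" + "y" * 86
count, group, rest = extract_group("-AAC" + sig1 + sig2 + "tail", item_size=88)
```

The whole group can be handed to another worker as one string, which is the basis for the pipelining described in this section.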
Hierarchical Composition CESR supports hierarchical composition through:
Concatenation : Primitives can be concatenated in any order
Grouping : Count codes create logical groups
Nesting : Groups can contain other groups
Interleaving : CESR streams can interleave with JSON, CBOR, MGPK
Primitive Structure All CESR primitives follow this general structure:
Derivation Code Components :
Type selector : Identifies primitive type
Size information : Fixed (implicit) or variable (explicit)
Pad size indicator : Determines alignment strategy
Text Domain Encoding
Pre-Padding Approach
CESR uses leading pad bytes (not trailing = characters) to achieve 24-bit alignment:
Calculate pad size: ps = (3 - (N mod 3)) mod 3 where N = raw byte length
Prepend ps zero bytes to the raw value
Convert to Base64
Replace the leading ps characters with the derivation code (for ps = 0 nothing is replaced and a 4-character code is prepended)
Example:
Raw: 0x0042 (2 bytes)
Pad size: 1
Pre-padded: 0x000042
Base64: AABC
With code M: MABC
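The pre-padding steps can be sketched in a few lines. This is a minimal illustration for fixed-size codes only, assuming the code length matches the pad size as described in the code tables (1 character for pad size 1, 2 for pad size 2, 4 for pad size 0). The function name `encode` is hypothetical, and the usage lines encode the two-byte value 0x0042 under code M.

```python
import base64


def encode(code: str, raw: bytes) -> str:
    """Pre-pad raw bytes, Base64-encode, and overwrite the pad characters with the code."""
    ps = (3 - len(raw) % 3) % 3          # pad size from the raw length
    if len(code) % 4 != ps:
        raise ValueError(
            f"code {code!r} has the wrong length for a raw value of {len(raw)} bytes"
        )
    padded = bytes(ps) + raw             # prepend ps zero bytes
    text = base64.urlsafe_b64encode(padded).decode("ascii")
    return code + text[ps:]              # drop ps pad characters, prepend the code


short_number = encode("M", b"\x00\x42")
digest_like = encode("E", bytes(32))     # 32 zero bytes as a placeholder digest
signature_like = encode("0B", bytes(64)) # 64 zero bytes as a placeholder signature
```

The results are 4, 44, and 88 characters long, so each stays on a 24-bit boundary and no `=` characters ever appear.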
1-Character Codes (pad size 1):
[Code][43 Base64 chars] for a 32-byte raw value (44 characters in total)
Example: BDKrJxkcR9m5u1xs33F5pxRJP6T7hJEbhpHrUtlDdhh0
2-Character Codes (pad size 2):
[Code1][Code2][86 Base64 chars] for a 64-byte raw value such as an Ed25519 signature (88 characters in total)
Variable-Length Codes:
Example: 4B##<value> where ## encodes length in Base64
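Binary Domain Encoding
Binary encoding follows similar principles but operates on bytes. Each code character contributes 6 bits, so a code generally does not end on a byte boundary; the remaining bits of its last byte are pad bits:
Text code E → leading byte 0x10 (000100 followed by two zero pad bits)
Text code 0B → leading bytes 0xD0 0x10 (110100 000001 followed by four zero pad bits)
Alignment: Binary primitives stay 3-byte aligned because the code bits occupy the positions of the lead pad bytes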
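Since a fully qualified text primitive is already a multiple of 4 Base64 characters, moving it to the binary domain is a plain Base64 decode of the whole string, code included. The sketch below assumes that relationship; the function names are hypothetical, and the inputs are placeholder primitives with all-zero values.

```python
import base64


def to_binary(qb64: str) -> bytes:
    """Text domain -> binary domain for one or more aligned primitives."""
    if len(qb64) % 4:
        raise ValueError("text is not a multiple of 4 characters")
    return base64.urlsafe_b64decode(qb64)


def to_text(qb2: bytes) -> str:
    """Binary domain -> text domain for one or more aligned primitives."""
    if len(qb2) % 3:
        raise ValueError("binary is not a multiple of 3 bytes")
    return base64.urlsafe_b64encode(qb2).decode("ascii")


digest_text = "E" + "A" * 43            # placeholder 44-character primitive
signature_text = "0B" + "A" * 86        # placeholder 88-character primitive

digest_binary = to_binary(digest_text)        # 33 bytes
signature_binary = to_binary(signature_text)  # 66 bytes

# The code bits sit at the front of the binary form.
digest_lead = digest_binary[:1].hex()
signature_lead = signature_binary[:2].hex()
```

The binary forms are 33 and 66 bytes, a 25% saving over the text forms, and converting back reproduces the text exactly.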
Supported Primitive Types
Cryptographic Material
A: 256-bit random seed (32 bytes)
0A: 128-bit random salt (16 bytes)
B: Ed25519 non-transferable prefix
D: Ed25519 public verification key
C: X25519 public encryption key
1AAA: ECDSA secp256k1 public key
E: Blake3-256 (32 bytes)
F: Blake2b-256 (32 bytes)
G: Blake2s-256 (32 bytes)
H: SHA3-256 (32 bytes)
I: SHA2-256 (32 bytes)
0D: Blake3-512 (64 bytes)
0B: Ed25519 signature (64 bytes)
0C: ECDSA secp256k1 signature
Indexed signatures with dual-indexed codes
Data Types
M: Short number (2 bytes)
N: Big number (8 bytes)
0H: Long number (4 bytes)
4B##: Variable-length byte string (small)
5B##: Variable-length byte string (lead size 1)
7AAB####: Variable-length byte string (big)
Encoded as variable-length strings
ISO 8601 format for timestamps
Interleaved Serializations CESR streams can interleave with other serialization formats:
JSON (RFC 8259)
CBOR (RFC 8949)
MessagePack (MGPK)
CESR uses version count codes to mark boundaries
Sniffer component detects format transitions
Regex matching locates version strings in non-CESR formats
[CESR primitives][JSON object][CESR primitives][CBOR data]
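A sniffer can be sketched as a lookup on the first three bits of the next byte in the stream. The mapping below is an assumption derived from the start bytes named in the comments (for example `{` for a JSON object and `-` for a text-domain count code); it is an illustration of the idea rather than a complete or authoritative table, and the function name `sniff` is hypothetical.

```python
def sniff(first_byte: int) -> str:
    """Guess the serialization that starts at this byte from its top three bits."""
    tritet = first_byte >> 5
    table = {
        0b001: "cesr-text",    # '-' (0x2D): count code in the text domain
        0b010: "cesr-text",    # '_' (0x5F): op code in the text domain
        0b011: "json",         # '{' (0x7B): JSON object
        0b100: "mgpk",         # 0x80-0x8F: MessagePack fixmap
        0b101: "cbor",         # 0xA0-0xBF: CBOR map (major type 5)
        0b110: "mgpk",         # 0xDE/0xDF: MessagePack map16/map32
        0b111: "cesr-binary",  # 0xF8-0xFB: '-' code bits in the binary domain
    }
    return table.get(tritet, "unknown")


detected = [sniff(b) for b in (ord("-"), ord("{"), 0x81, 0xA1, 0xF8)]
```

Because the decision needs only one byte, the parser can switch formats at every message boundary without buffering ahead.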
Protocol Mechanics
Stream Processing Model
Cold Start and Re-synchronization CESR provides cold start stream parsing capabilities:
Problem : After reboot or error, parser needs framing information
Version count codes at stream boundaries
Group count codes enable skipping to next boundary
No buffer flushing required for recovery
Re-synchronization Process :
Detect format using sniffer
If non-CESR: extract length from version string, skip to boundary
If CESR: locate next count code, resume parsing
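For the non-CESR branch, the message length can be read from the version string. The sketch below infers the layout from the example `KERI10JSON00011c_` that appears later in this article: a 4-letter protocol, two hexadecimal version digits, a 4-letter serialization kind, a 6-digit hexadecimal size, and a terminating underscore. That layout and the names `VERSION_RE` and `message_size` are assumptions made for illustration.

```python
import re
from typing import Tuple

# Assumed layout: PPPP v v KKKK ssssss _
VERSION_RE = re.compile(
    rb"(?P<proto>[A-Z]{4})(?P<major>[0-9a-f])(?P<minor>[0-9a-f])"
    rb"(?P<kind>[A-Z]{4})(?P<size>[0-9a-f]{6})_"
)


def message_size(buf: bytes) -> Tuple[str, int]:
    """Return (serialization kind, total message size in bytes) from the version string."""
    match = VERSION_RE.search(buf)
    if match is None:
        raise ValueError("no version string found")
    kind = match.group("kind").decode("ascii")
    size = int(match.group("size").decode("ascii"), 16)
    return kind, size


kind, size = message_size(b'{"v":"KERI10JSON00011c_","t":"icp"')
```

Once the size is known, the parser can skip exactly that many bytes from the start of the message and resume sniffing at the next boundary.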
Pipelining and Multiplexing
Multiplexing : Combine multiple primitives into complex streams
De-multiplexing : Extract primitives from streams
Group extraction : Pull entire groups without parsing contents
Parallel processing : Multiple cores can process different groups
Early rejection : Invalid signatures can drop messages without full parsing
Efficient routing : Group codes enable stream routing without deep inspection
CESR defines six transformations between domains, one for each ordered pair of the three domains (T → B, B → T, T → R, R → T, B → R, R → B).
Binary to Text (B → T):
Read binary code
Determine primitive type and length
Extract raw bytes
Apply Base64 encoding with appropriate padding
Prepend text derivation code
Text to Binary (T → B):
Read text code
Determine primitive type and length
Extract Base64 characters
Decode to bytes
Remove lead padding
Prepend binary code
Round-Trip Guarantees:
T(B(T(primitive))) = T(primitive)
B(T(B(primitive))) = B(primitive)
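The conversions into and out of the raw domain can be sketched for fixed-size codes as follows. The function names are hypothetical, the caller is assumed to know the code length (1, 2, or 4 characters), and the sample value `MABC` is the short-number code M carrying the two bytes 0x0042.

```python
import base64
from typing import Tuple


def text_to_raw(qb64: str, code_len: int) -> Tuple[str, bytes]:
    """T -> R: split a text primitive into (code, raw bytes)."""
    if len(qb64) % 4:
        raise ValueError("text is not a multiple of 4 characters")
    ps = code_len % 4                      # pad size implied by the code length
    code, body = qb64[:code_len], qb64[code_len:]
    padded = base64.urlsafe_b64decode("A" * ps + body)
    if any(padded[:ps]):
        raise ValueError("non-zero bits found in the lead pad")
    return code, padded[ps:]


def raw_to_text(code: str, raw: bytes) -> str:
    """R -> T: pre-pad, encode, and overwrite the pad characters with the code."""
    ps = (3 - len(raw) % 3) % 3
    if len(code) % 4 != ps:
        raise ValueError("code length does not match the raw size")
    return code + base64.urlsafe_b64encode(bytes(ps) + raw).decode("ascii")[ps:]


def text_to_binary(qb64: str) -> bytes:
    """T -> B."""
    return base64.urlsafe_b64decode(qb64)


def binary_to_text(qb2: bytes) -> str:
    """B -> T."""
    return base64.urlsafe_b64encode(qb2).decode("ascii")


text = "MABC"
code, raw = text_to_raw(text, code_len=1)
binary = text_to_binary(text)
```

The pad check in `text_to_raw` rejects primitives whose hidden pad bits are not zero, so two different strings cannot decode to the same raw value.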
Attachment Mechanisms CESR supports cryptographic attachments without wrapper envelopes:
[Message Body][Count Code][Signatures]
[Key Event][Count Code][Witness Receipts]
CESR-Proof Attachments (from draft-pfeairheller-cesr-proof):
[SAD][Count Code][Path-Signature Pairs]
Security Properties
Cryptographic Agility CESR provides algorithm agility through derivation codes:
Multiple algorithms : Support Ed25519, ECDSA, Ed448 simultaneously
Graceful migration : Add new algorithms without breaking existing code
Post-quantum readiness : Can incorporate quantum-resistant algorithms
Algorithm identification : Every primitive self-identifies its algorithm
Security Consideration : Derivation codes must be carefully managed to prevent downgrade attacks
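Algorithm agility amounts to dispatching on the derivation code. The sketch below verifies a qualified digest by looking up the hash function from its first character, using the codes F, G, H, and I as listed under Supported Primitive Types. Blake3 (code E) is left out only because it is not in the Python standard library. The names `DIGESTS`, `qualify`, and `verify_digest` are hypothetical.

```python
import base64
import hashlib

# Derivation code -> hash function producing a 32-byte digest.
DIGESTS = {
    "F": lambda data: hashlib.blake2b(data, digest_size=32).digest(),
    "G": lambda data: hashlib.blake2s(data, digest_size=32).digest(),
    "H": lambda data: hashlib.sha3_256(data).digest(),
    "I": lambda data: hashlib.sha256(data).digest(),
}


def qualify(code: str, raw32: bytes) -> str:
    """Encode a 32-byte digest with a 1-character code (pad size 1)."""
    return code + base64.urlsafe_b64encode(b"\x00" + raw32).decode("ascii")[1:]


def verify_digest(qb64: str, data: bytes) -> bool:
    """Recompute the digest with the algorithm named by the code and compare."""
    hasher = DIGESTS.get(qb64[:1])
    if hasher is None or len(qb64) != 44:
        return False  # unknown or malformed: reject rather than fall back
    return qualify(qb64[0], hasher(data)) == qb64


sha2_digest = qualify("I", hashlib.sha256(b"hello").digest())
blake2_digest = qualify("F", hashlib.blake2b(b"hello", digest_size=32).digest())
```

Rejecting unknown codes outright, instead of falling back to a default algorithm, is one simple way to avoid the downgrade risk noted above.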
Integrity Properties
Malformed primitives are immediately detectable
Parsing errors don't propagate to adjacent primitives
Stream corruption is localized
Round-trip conversion preserves all information
No bit-level dependencies between primitives
Group boundaries are cryptographically verifiable
Threat Model
Protected Against:
Primitive injection: Self-framing prevents injection between primitives
Length extension: Fixed/explicit lengths prevent extension attacks
Type confusion: Derivation codes prevent type misinterpretation
Replay attacks: (Handled by higher-level protocols using CESR)
Not Protected Against (requires higher-level protocols):
Signature verification : CESR encodes signatures but doesn't verify them
Key management : CESR represents keys but doesn't manage them
Replay protection : Requires timestamp/nonce mechanisms in protocols
Attack Resistance Malformed Stream Attacks :
Detection : Invalid codes immediately detected
Recovery : Cold start re-synchronization without data loss
Isolation : Errors don't cascade to valid primitives
Length limits : Variable-length primitives have maximum sizes
Group limits : Count codes prevent unbounded group sizes
Early rejection : Invalid primitives rejected before full processing
Interoperability
KERI Protocol Integration CESR is the native encoding for KERI:
{
"v": "KERI10JSON00011c_",
"t": "icp",
"d": "EH7Oq9oxCgYa-nnNLvwhp9sFZpALILlRYyB-6n4WDi7w",
"i": "EH7Oq9oxCgYa-nnNLvwhp9sFZpALILlRYyB-6n4WDi7w",
"s": "0",
"kt": "1",
"k": ["DSuhyBcPZEZLK-fcw5tzHn2N46wRCG_ZOoeKtWTOunRA"],
"n": ["EPYuj8mq_PYYsoBKkzX1kxSPGYBWaIya3slgCOyOtlqU"]
}
Signature Attachments (CESR-encoded):
-AABAA1o61PgMhwhi89FES_vwYeSbbWnVuELV_jv7Yv6f5zNiOLnj1ZZa4MW2c6Z_vZDt55QUnLaiaikE-d_ApsFEgCA
ACDC Integration SAIDs (Self-Addressing Identifiers):
All ACDC SAIDs are CESR-encoded digests
Example: EAdXt3gIXOf2BBWNHdSXCJnFJL5OuQPyM5K0neuniccM
Replace full field maps with SAIDs
Maintains verifiability while reducing size
Transposable signature attachments
Support nested partial signatures
Enable selective disclosure
SAID Protocol Dependency
SAID (Self-Addressing Identifier) protocol uses CESR:
SAID Generation:
Replace the SAID field with dummy characters (#) of the same length as the final SAID
Compute digest
CESR-encode digest
Replace dummy with CESR-encoded SAID
SAID Verification:
Extract SAID
Replace with dummy
Recompute digest
Compare CESR-encoded digests
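The generation and verification steps can be sketched with the standard library. This is a simplified illustration, not the SAID specification: it assumes compact JSON in insertion order as the serialization, uses SHA3-256 (code H in the list of digest codes) because Blake3 is not in the standard library, and ignores version strings. The names `saidify` and `verify_said` and the sample document are hypothetical.

```python
import base64
import hashlib
import json

SAID_LENGTH = 44   # text length of a qualified 32-byte digest
CODE = "H"         # SHA3-256, per the digest codes listed in this article


def _compute_said(sad: dict, label: str) -> str:
    dummy = dict(sad)
    dummy[label] = "#" * SAID_LENGTH                      # replace with dummy
    serialized = json.dumps(dummy, separators=(",", ":")).encode("utf-8")
    raw = hashlib.sha3_256(serialized).digest()           # compute digest
    # CESR-encode: one lead pad byte, then overwrite the pad character with the code.
    return CODE + base64.urlsafe_b64encode(b"\x00" + raw).decode("ascii")[1:]


def saidify(sad: dict, label: str = "d") -> dict:
    """Return a copy of `sad` with its SAID field filled in."""
    result = dict(sad)
    result[label] = _compute_said(sad, label)
    return result


def verify_said(sad: dict, label: str = "d") -> bool:
    """Recompute the SAID over the dummied document and compare."""
    return sad.get(label) == _compute_said(sad, label)


document = saidify({"d": "", "name": "example", "value": 42})
tampered = dict(document, value=43)
```

Because the dummy has the same length as the final SAID, inserting the SAID does not change the size of the serialization that was hashed.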
OOBI Protocol Integration OOBI (Out-Of-Band Introduction) uses CESR for:
AID Encoding : All AIDs in OOBIs are CESR-encoded
URL Construction : CESR primitives embedded in URLs
Discovery : CESR enables compact AID representation
Implementation Considerations
Code Table Management Challenge : Multiple code tables with version evolution
Version count codes : Specify active code table version
Default tables : Parsers start with default version
Dynamic loading : Load new tables when version codes encountered
Backward compatibility : Support multiple versions simultaneously
Lookup tables : Pre-compute code-to-type mappings
String operations : Minimize string copying
Batch conversion : Convert groups en masse when possible
Zero-copy parsing : Parse directly from byte buffers
Alignment : Ensure proper memory alignment for performance
SIMD : Use SIMD instructions for Base64 encoding/decoding
Buffering : Minimize buffer allocations
Pipelining : Process groups in parallel
Early rejection : Validate codes before extracting values
Common Implementation Challenges
24-Bit Alignment:
Issue: Ensuring all primitives align on 24-bit boundaries
Solution: Careful pad size calculation and validation
Code Table Synchronization:
Issue: Sender and receiver must use same code tables
Solution: Version count codes and explicit version negotiation
Interleaved Format Detection:
Issue: Distinguishing CESR from JSON/CBOR/MGPK
Solution: Sniffer component with format-specific detection logic
Error Recovery:
Issue: Recovering from malformed streams
Solution: Cold start re-synchronization using count codes
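Testing Strategies
Round-Trip Testing:
for each primitive:
    text = T(primitive)      # raw primitive to text domain
    binary = B(text)         # text domain to binary domain
    recovered = R(binary)    # binary domain back to raw
    assert primitive == recovered
Composability Testing:
primitives = [p1, p2, p3]
text_concat = cat([T(p) for p in primitives])
binary_group = B(text_concat)       # convert the whole group at once
recovered_text = T(binary_group)
assert text_concat == recovered_text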
Interoperability Testing :
Test against reference implementations
Validate against test vectors
Cross-language compatibility checks
Library Design Patterns
Primitive Classes:
Diger: Digest primitive with verification
Verfer: Public key primitive with signature verification
Signer: Private key primitive with signing
Siger: Indexed signature primitive
Cigar: Non-indexed signature primitive
Salter: Seed/salt primitive with key generation
Common Interface:
.qb64() # Qualified Base64 text
.qb64b() # Qualified Base64 bytes
.qb2() # Qualified binary
.code() # Derivation code
.raw() # Raw bytes
Stream Processing Components:
Sniffer: Format detection
Parside: Stream parsing with count codes
Cesride: Primitive parsing
Streamer: Stream composition
Language-Specific Considerations
JavaScript/TypeScript:
Use Buffer for binary operations
Handle UTF-8 encoding carefully
Consider WebAssembly for performance
Python:
Use bytes for binary domain
Leverage base64 standard library
Consider Cython for hot paths
Rust:
Zero-copy parsing with byte slices
Strong typing for primitive variants
Efficient memory management
Go:
Use byte slices for efficiency
Leverage standard encoding/base64
Consider goroutines for parallel processing
Alignment errors: Failing to ensure 24-bit boundaries
Code table mismatches : Sender/receiver using different versions
Naive Base64 : Using standard Base64 without CESR pre-padding
Buffer management : Inefficient copying in stream processing
Error propagation : Allowing malformed primitives to corrupt adjacent data
Library Architecture Recommendations Primitive Classes : Implement typed classes for each primitive category (Diger, Verfer, Signer, etc.) with common interface:
.qb64() - Qualified Base64 text
.qb2() - Qualified binary
.code() - Derivation code
.raw() - Raw bytes
Sniffer : Format detection for interleaved streams
Parside : Stream parsing with count code support
Cesride : Individual primitive parsing
Streamer : Stream composition and attachment handling