Authentic data is data that possesses both cryptographically verifiable integrity (the data is whole, sound, and unimpaired) and verifiable provenance (the data has a documented, cryptographically traceable origin and history).
Comprehensive Explanation
Conceptual Definition
Authentic data represents a fundamental concept in verifiable data systems where data must satisfy two essential properties simultaneously: integrity and provenance. This definition, attributed to Timothy Ruff at Internet Identity Workshop #37, establishes that authenticity requires more than just data correctness—it demands both verifiable wholeness and traceable origin.
Core Properties
Integrity means the data is:
Whole and complete (no missing components)
Sound and unimpaired (no corruption or modification)
Internally consistent (all parts align correctly)
Verifiably unchanged from its original state
Provenance means the data has:
Documented origin (who created it)
Traceable history (how it evolved)
Verifiable chain of custody (who controlled it)
Cryptographic proof of authorship
The critical insight is that both properties must be present for data to be considered truly authentic. Data can have integrity without provenance (a perfect copy of unknown origin), or claimed provenance without integrity (a corrupted document with attribution), but authentic data requires cryptographic verification of both qualities.
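To make the two checks concrete, here is a minimal sketch that verifies integrity with a SHA-256 digest and provenance with an Ed25519 signature. It uses PyNaCl as an illustrative library; the helper names are hypothetical, and production KERI systems use CESR-encoded primitives rather than raw hex digests.

```python
# Minimal sketch: integrity (digest) vs. provenance (signature) checks.
# Uses PyNaCl (pip install pynacl) for Ed25519; function names are illustrative.
import hashlib
from nacl.signing import SigningKey, VerifyKey
from nacl.exceptions import BadSignatureError

def check_integrity(data: bytes, expected_digest: bytes) -> bool:
    """Integrity: the bytes are whole and unmodified."""
    return hashlib.sha256(data).digest() == expected_digest

def check_provenance(data: bytes, signature: bytes, verify_key: VerifyKey) -> bool:
    """Provenance: a known key controller actually authored these bytes."""
    try:
        verify_key.verify(data, signature)
        return True
    except BadSignatureError:
        return False

# The author signs the data; a verifier needs BOTH checks to call it authentic.
author = SigningKey.generate()
data = b'{"statement": "hello, authentic web"}'
digest = hashlib.sha256(data).digest()
signature = author.sign(data).signature

assert check_integrity(data, digest)        # integrity alone: any faithful copy passes
assert check_provenance(data, signature, author.verify_key)  # ties bytes to the author's key
```

Note how the integrity check succeeds for any faithful copy regardless of source, which is exactly the "perfect copy of unknown origin" case above; only the signature check binds the bytes to a controller.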
Distinction from Related Concepts
Authentic data differs from:
Veracity: Authenticity concerns origin and integrity, not truthfulness of content
Accuracy: A newspaper story may be authentically reproduced but contain false information
Authorization: Authentic data proves "who said what" but not whether they had authority to say it
Implementation Notes
Verification Requirements
Implementations must verify both integrity and provenance, which in practice means retaining the records that make verification possible (a minimal storage sketch follows this list):
Append-only storage: Never delete historical records, only nullify
Redundancy: Multiple copies prevent deletion attacks
Indexing: Efficient lookup of SAIDs and AIDs
Archival: Long-term preservation of provenance chains
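A minimal sketch of the append-only discipline, assuming a hypothetical `EventLog` class (this is not a keripy API): records are indexed by digest, and retraction appends a nullification event rather than deleting anything.

```python
# Sketch of an append-only log: nothing is deleted, nullification appends.
# Class and method names are illustrative, not a real KERI library API.
import hashlib
import json

class EventLog:
    def __init__(self):
        self._events = []   # append-only list of records
        self._index = {}    # SAID-style digest -> position, for efficient lookup

    def append(self, record: dict) -> str:
        """Append a record and index it by the digest of its canonical bytes."""
        raw = json.dumps(record, sort_keys=True).encode()
        digest = hashlib.sha256(raw).hexdigest()
        self._index[digest] = len(self._events)
        self._events.append(record)
        return digest

    def nullify(self, digest: str) -> str:
        """Retract a record by appending a nullification event, never by deleting."""
        if digest not in self._index:
            raise KeyError(f"unknown record {digest}")
        return self.append({"type": "nullification", "target": digest})

log = EventLog()
said = log.append({"type": "issuance", "data": "credential-123"})
log.nullify(said)
assert len(log._events) == 2   # history preserved: the original plus its retraction
```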
Key Management
Protecting authentic data requires:
Secure key generation: High-entropy seeds (128+ bits)
Pre-rotation: Forward commitments to next keys (sketched after this list)
Multi-signature: Threshold schemes for critical operations
Recovery mechanisms: Delegated or custodial arrangements
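Pre-rotation deserves a concrete illustration. In the sketch below, an inception event publishes only the digest of the next public key; at rotation, revealing that key lets anyone check it against the earlier commitment. This is a simplified stand-in: real KERI events are CESR-encoded and use Blake3 digests, not hex SHA-256.

```python
# Sketch of pre-rotation: commit to the digest of the NEXT key before exposing it.
# Simplified: real KERI uses CESR-encoded Blake3 digests, not hex SHA-256.
import hashlib
from nacl.signing import SigningKey

def key_digest(signing_key: SigningKey) -> str:
    return hashlib.sha256(bytes(signing_key.verify_key)).hexdigest()

# Inception: generate current AND next keys; publish only the next key's digest.
current = SigningKey.generate()
next_key = SigningKey.generate()
inception = {"current": bytes(current.verify_key).hex(),
             "next_commitment": key_digest(next_key)}

# Rotation: reveal the next key; verifiers check it against the prior commitment.
rotation = {"current": bytes(next_key.verify_key).hex()}
assert key_digest(next_key) == inception["next_commitment"]
# Because only a digest was published, the unexposed key cannot be attacked
# directly; this is the basis of the post-quantum resistance claim below.
```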
Performance Optimization
Compact disclosure: Use SAIDs instead of full content when possible (see the sketch after this list)
Caching: Store verified key states to avoid repeated KEL walks
Parallel verification: Independent chains can be verified concurrently
Selective verification: Verify only the provenance chain needed for a specific use case
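As a schematic of compact disclosure, the sketch below computes a self-addressing identifier by digesting the document with its SAID field blanked, then embedding the digest so the document can be referenced by SAID alone. Real SAIDs are CESR-encoded Blake3 digests with fixed-width padding; SHA-256 hex is only a stand-in here.

```python
# Schematic SAID computation: digest the document with the SAID field blanked,
# then embed the digest so the document references itself. Real KERI SAIDs use
# CESR-encoded Blake3 with fixed-width padding; SHA-256 hex is a stand-in.
import hashlib
import json

def saidify(doc: dict) -> dict:
    """Embed a self-addressing identifier into the 'd' field of doc."""
    doc = dict(doc, d="#" * 64)                    # placeholder of digest width
    raw = json.dumps(doc, sort_keys=True).encode()
    doc["d"] = hashlib.sha256(raw).hexdigest()
    return doc

def verify_said(doc: dict) -> bool:
    """Recompute the digest over the placeholder form and compare."""
    blank = dict(doc, d="#" * 64)
    raw = json.dumps(blank, sort_keys=True).encode()
    return hashlib.sha256(raw).hexdigest() == doc["d"]

doc = saidify({"type": "statement", "content": "the full payload"})
assert verify_said(doc)
compact_reference = doc["d"]   # disclose only the SAID, not the full content
```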
Historical Context
Traditional Provenance Systems
The concept of provenance originates in art history, archaeology, and legal evidence management, where establishing the chain of custody for physical objects has been critical for centuries. Traditional provenance relied on:
Paper trails: Physical documentation of ownership sequences
Expert testimony: Authorities vouching for authenticity
Comparative analysis: Matching against known authentic examples
Scientific testing: Material analysis to verify age and origin
These methods provided contextual and circumstantial evidence but lacked cryptographic guarantees. The transition to digital systems initially replicated these weaknesses—digital signatures could prove integrity, but tracking provenance required trusted intermediaries.
Digital Identity Evolution
Early digital identity systems separated integrity from provenance:
PKI/Certificate Authority Model:
Provided integrity through digital signatures
Required trusted third parties for provenance
Created administrative rather than cryptographic roots of trust
Vulnerable to CA compromise and DNS hijacking
Blockchain Approaches:
Offered algorithmic trust through distributed consensus
Provided provenance through immutable ledgers
Required shared infrastructure and governance
Locked identifiers to specific platforms
The Authentic Data Gap
The fundamental challenge was creating data structures that could provide end-verifiable proof of both integrity and provenance without relying on:
Centralized authorities
Shared ledger infrastructure
Platform-specific trust models
External verification services
This gap motivated the development of Decentralized Autonomic Data (DAD) concepts and eventually the KERI protocol.
KERI's Approach
Cryptographic Root-of-Trust
KERI establishes authentic data through an autonomic trust basis (sketched in the toy model below) where:
Identifiers are self-certifying, derived directly from cryptographic key pairs rather than from a registry
Key event logs (KELs) provide an end-verifiable record of key state and its history
Control authority is provable by anyone holding the log, with no trusted intermediary
Post-quantum resistance comes from digest-based commitments to unexposed keys
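A toy model of this structure, not the real KERI event format: each event binds the digest of its predecessor and is signed by the controller, so a verifier can walk the chain from inception locally, without consulting any third party. PyNaCl supplies the signatures; all names are illustrative.

```python
# Toy key event log: each event is signed and binds its predecessor's digest,
# so verification is a local walk from inception. Not the real KERI format.
import hashlib
import json
from nacl.signing import SigningKey, VerifyKey

def event_digest(event: dict) -> str:
    return hashlib.sha256(json.dumps(event, sort_keys=True).encode()).hexdigest()

def make_event(sk: SigningKey, prior: str, payload: str) -> dict:
    body = {"prior": prior, "payload": payload}
    sig = sk.sign(json.dumps(body, sort_keys=True).encode()).signature.hex()
    return {**body, "sig": sig}

def verify_log(events: list, vk: VerifyKey) -> bool:
    prior = ""                                  # inception has no predecessor
    for ev in events:
        body = {"prior": ev["prior"], "payload": ev["payload"]}
        vk.verify(json.dumps(body, sort_keys=True).encode(),
                  bytes.fromhex(ev["sig"]))     # raises on a bad signature
        if ev["prior"] != prior:
            return False                        # chain is broken
        prior = event_digest(ev)
    return True

sk = SigningKey.generate()
log = [make_event(sk, "", "inception")]
log.append(make_event(sk, event_digest(log[-1]), "interaction"))
assert verify_log(log, sk.verify_key)
```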
Benefits
Portability:
Authentic data moves between systems without losing properties
No platform lock-in
Verification works across trust domains
Infrastructure independence
Scalability:
Verification is local and efficient
No global consensus required
Parallel processing of independent chains
Minimal infrastructure requirements
Privacy:
Selective disclosure preserves authenticity
Graduated revelation mechanisms
Correlation resistance through blinding (a salted-commitment sketch follows this list)
Privacy-preserving verification
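One simple blinding mechanism, sketched below under the assumption of salted hash commitments: publish only per-attribute commitments, then disclose salt-and-value pairs for the attributes you choose. ACDC's graduated disclosure is structurally similar but considerably more elaborate; this is not its actual scheme.

```python
# Sketch of salted commitments for selective disclosure: publish only digests,
# then reveal (salt, value) pairs for chosen attributes. Illustrative only.
import hashlib
import secrets

def commit(value: str) -> tuple[str, str]:
    salt = secrets.token_hex(16)   # random salt resists correlation across verifiers
    digest = hashlib.sha256((salt + value).encode()).hexdigest()
    return salt, digest

def verify_disclosure(salt: str, value: str, digest: str) -> bool:
    return hashlib.sha256((salt + value).encode()).hexdigest() == digest

attributes = {"name": "Alice", "age": "42", "city": "Zurich"}
blinded = {k: commit(v) for k, v in attributes.items()}
public_view = {k: d for k, (s, d) in blinded.items()}   # commitments only

# Later, disclose just one attribute; the verifier checks it against the commitment.
salt, _ = blinded["age"]
assert verify_disclosure(salt, "42", public_view["age"])
```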
Trade-offs
Complexity:
Requires understanding of cryptographic primitives
Key management becomes critical
Recovery mechanisms must be carefully designed
Learning curve for implementers
Storage:
Complete provenance chains require storage
Historical data cannot be deleted
Nullification adds records rather than removing them
Trade-off between completeness and efficiency
Key Management:
Lost keys mean lost control
Compromise requires rotation and recovery
Multi-signature schemes add coordination overhead
Delegation increases complexity
Adoption:
Requires ecosystem coordination
Interoperability depends on standard compliance
Legacy systems may not support authentic data
Migration from existing systems requires planning
The Authentic Web Vision
The ultimate implication of authentic data is the Authentic Web—a vision where all data on the internet has verifiable proof of authorship and integrity. This would enable:
Universal attribution: Every piece of data traceable to its source
Automated trust decisions: Machines can verify authenticity without human intervention
Reputation systems: Consistent behavior over time becomes verifiable
Value attribution: Creators can be compensated based on verifiable contributions
Regulatory compliance: Audit trails are cryptographically guaranteed
KERI provides the foundational infrastructure for this vision by making authentic data practical, scalable, and universally verifiable without requiring centralized authorities or shared infrastructure.