Stage 6: Component Library Documentation
Overview
Stage 6 of the KotaDB risk reduction methodology implements a Component Library of reusable, battle-tested components with validated inputs and automatic best practices. This stage contributes -1.0 risk reduction points by making invalid states unconstructible through the public API and by applying proven patterns automatically.
Architecture
The component library consists of three main categories:
Stage 6 Components
├── Validated Types (src/types.rs)
│   ├── Path validation and safety
│   ├── Document lifecycle state machines
│   ├── Temporal constraints enforcement
│   └── Bounded numeric types
├── Builder Patterns (src/builders.rs)
│   ├── Fluent API construction
│   ├── Sensible defaults
│   ├── Validation during building
│   └── Ergonomic error handling
└── Wrapper Components (src/wrappers.rs)
    ├── Automatic tracing and metrics
    ├── Transparent caching layers
    ├── Retry logic with backoff
    └── RAII transaction safety
Validated Types (src/types.rs)
Core Principle: Invalid States Unrepresentable
All validated types follow the principle that invalid data cannot be constructed. Instead of runtime checks scattered throughout the codebase, invariants are enforced at the type level.
Path Safety: ValidatedPath
pub struct ValidatedPath {
    inner: PathBuf,
}

impl ValidatedPath {
    pub fn new(path: impl AsRef<Path>) -> Result<Self> {
        // Enforces:
        // - Non-empty paths
        // - No directory traversal (..)
        // - No null bytes
        // - Valid UTF-8
        // - Not Windows reserved names
    }
}
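Why this matters: Validation runs once, at construction, and the type system carries the result. Any function that accepts a ValidatedPath can rely on the path having passed these checks, so there is no need to remember to re-validate paths throughout the codebase.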
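The checks listed in the comments above could be implemented roughly as follows. This is a minimal sketch using only the standard library, not the crate's actual code: the `PathError` enum, the `as_path` accessor, and the exact reserved-name list are assumptions made for illustration, and traversal is detected per path component rather than by substring.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical error type for this sketch; the real crate may report errors differently.
#[derive(Debug, PartialEq)]
pub enum PathError {
    Empty,
    NotUtf8,
    NullByte,
    Traversal,
    ReservedName(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ValidatedPath {
    inner: PathBuf,
}

/// Device names that Windows reserves regardless of file extension.
const WINDOWS_RESERVED: [&str; 22] = [
    "CON", "PRN", "AUX", "NUL",
    "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
    "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9",
];

impl ValidatedPath {
    pub fn new(path: impl AsRef<Path>) -> Result<Self, PathError> {
        let path = path.as_ref();
        // Valid UTF-8 is required so the path can be logged and compared as text.
        let text = path.to_str().ok_or(PathError::NotUtf8)?;
        if text.is_empty() {
            return Err(PathError::Empty);
        }
        if text.contains('\0') {
            return Err(PathError::NullByte);
        }
        for component in path.components() {
            match component {
                // Any `..` component could escape the intended root.
                Component::ParentDir => return Err(PathError::Traversal),
                Component::Normal(name) => {
                    let name = name.to_str().ok_or(PathError::NotUtf8)?;
                    // Compare the part before the first dot, ignoring case.
                    let stem = name.split('.').next().unwrap_or(name);
                    if WINDOWS_RESERVED.iter().any(|r| r.eq_ignore_ascii_case(stem)) {
                        return Err(PathError::ReservedName(name.to_string()));
                    }
                }
                _ => {}
            }
        }
        Ok(Self { inner: path.to_path_buf() })
    }

    /// The only way to obtain a ValidatedPath is through `new`, so this is always a checked path.
    pub fn as_path(&self) -> &Path {
        &self.inner
    }
}
```

Because the `inner` field is private, callers outside the module cannot build a `ValidatedPath` without going through `new`, which is what lets downstream code skip re-validation.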
Document Identity: ValidatedDocumentId
pub struct ValidatedDocumentId {
    inner: Uuid,
}

impl ValidatedDocumentId {
    pub fn from_uuid(uuid: Uuid) -> Result<Self> {
        ensure!(!uuid.is_nil(), "Document ID cannot be nil");
        Ok(Self { inner: uuid })
    }
}
Why this matters: Nil UUIDs are a common source of bugs. This type guarantees every document has a valid identifier.
Document Lifecycle: TypedDocument<State>
pub struct TypedDocument<S: DocumentState> {
    pub id: ValidatedDocumentId,
    pub path: ValidatedPath,
    pub timestamps: TimestampPair,
    // ... other fields
    _state: PhantomData<S>,
}

// State machine transitions
impl TypedDocument<Draft> {
    pub fn into_persisted(self) -> TypedDocument<Persisted> { ... }
}

impl TypedDocument<Persisted> {
    pub fn into_modified(self) -> TypedDocument<Modified> { ... }
}
Why this matters: Documents can only transition through valid states. Attempting to modify a draft or persist a non-existent document becomes a compile error.
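To make the typestate idea concrete, here is a reduced sketch with a single `title` field in place of the real document fields. The `NAME` constant on `DocumentState`, the `new` constructor, and the private `transition` helper are assumptions added for illustration; only the state names and the two `into_*` methods come from the excerpt above.

```rust
use std::marker::PhantomData;

/// Marker trait for lifecycle states. The NAME constant is an assumption of this sketch.
pub trait DocumentState {
    const NAME: &'static str;
}

pub struct Draft;
pub struct Persisted;
pub struct Modified;

impl DocumentState for Draft {
    const NAME: &'static str = "draft";
}
impl DocumentState for Persisted {
    const NAME: &'static str = "persisted";
}
impl DocumentState for Modified {
    const NAME: &'static str = "modified";
}

pub struct TypedDocument<S: DocumentState> {
    title: String,
    // Zero-sized marker: the state exists only in the type, not in memory.
    _state: PhantomData<S>,
}

impl<S: DocumentState> TypedDocument<S> {
    pub fn title(&self) -> &str {
        &self.title
    }

    pub fn state_name(&self) -> &'static str {
        S::NAME
    }

    // Private helper: only the public into_* methods decide which transitions exist.
    fn transition<T: DocumentState>(self) -> TypedDocument<T> {
        TypedDocument { title: self.title, _state: PhantomData }
    }
}

impl TypedDocument<Draft> {
    pub fn new(title: &str) -> Self {
        TypedDocument { title: title.to_string(), _state: PhantomData }
    }

    pub fn into_persisted(self) -> TypedDocument<Persisted> {
        self.transition()
    }
}

impl TypedDocument<Persisted> {
    pub fn into_modified(self) -> TypedDocument<Modified> {
        self.transition()
    }
}

// TypedDocument::<Draft>::new("x").into_modified() does not compile:
// into_modified is defined only for TypedDocument<Persisted>.
```

Each transition consumes `self`, so the old state value cannot be reused after the document has moved on.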
Temporal Constraints: TimestampPair
pub struct TimestampPair {
    created: ValidatedTimestamp,
    updated: ValidatedTimestamp,
}

impl TimestampPair {
    pub fn new(created: ValidatedTimestamp, updated: ValidatedTimestamp) -> Result<Self> {
        ensure!(updated.as_secs() >= created.as_secs(),
            "Updated timestamp must be >= created timestamp");
        Ok(Self { created, updated })
    }
}
Why this matters: Time paradoxes (documents updated before they were created) are impossible to represent.
Builder Patterns (src/builders.rs)
Core Principle: Ergonomic Construction with Validation
Builders provide fluent APIs that make complex objects easy to construct. Each setter validates its input as soon as it is called, and build() fails if a required field is missing.
Document Construction: DocumentBuilder
let doc = DocumentBuilder::new()
    .path("/knowledge/rust-patterns.md")?
    .title("Rust Design Patterns")?
    .content(b"# Rust Patterns\n\nKey patterns...")
    .word_count(150)         // Optional - will be calculated if not provided
    .timestamps(1000, 2000)? // Optional - will use current time if not provided
    .build()?;
Features:
- Fluent API: Method chaining for readability
- Automatic Calculation: Word count computed from content if not specified
- Sensible Defaults: Timestamps default to current time
- Early Validation: Errors caught at method call, not build time
- Required Fields: Build fails if path, title, or content missing
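The features listed above can be sketched with a small std-only builder. This is an illustration under simplifying assumptions, not the crate's implementation: errors are plain `String`s, the path check is reduced to two rules, timestamps are omitted, and word counting is done by splitting on whitespace.

```rust
#[derive(Debug)]
pub struct Document {
    pub path: String,
    pub title: String,
    pub content: Vec<u8>,
    pub word_count: u32,
}

#[derive(Debug, Default)]
pub struct DocumentBuilder {
    path: Option<String>,
    title: Option<String>,
    content: Option<Vec<u8>>,
    word_count: Option<u32>,
}

impl DocumentBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    // Early validation: a bad path fails here, not later in build().
    pub fn path(mut self, path: &str) -> Result<Self, String> {
        if path.is_empty() || path.contains("..") {
            return Err(format!("invalid path: {path:?}"));
        }
        self.path = Some(path.to_string());
        Ok(self)
    }

    pub fn title(mut self, title: &str) -> Result<Self, String> {
        if title.trim().is_empty() {
            return Err("title cannot be empty".to_string());
        }
        self.title = Some(title.to_string());
        Ok(self)
    }

    pub fn content(mut self, content: &[u8]) -> Self {
        self.content = Some(content.to_vec());
        self
    }

    pub fn word_count(mut self, count: u32) -> Self {
        self.word_count = Some(count);
        self
    }

    pub fn build(self) -> Result<Document, String> {
        // Required fields: build fails if any of them was never set.
        let path = self.path.ok_or("path is required")?;
        let title = self.title.ok_or("title is required")?;
        let content = self.content.ok_or("content is required")?;
        // Automatic calculation: count whitespace-separated words unless a count was given.
        let word_count = self.word_count.unwrap_or_else(|| {
            String::from_utf8_lossy(&content).split_whitespace().count() as u32
        });
        Ok(Document { path, title, content, word_count })
    }
}
```

Setters that can fail return `Result<Self, _>`, which is why the chained calls in the example above end in `?`, while infallible setters such as `content` return `Self` directly.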
Query Construction: QueryBuilder
let query = QueryBuilder::new()
    .with_text("rust patterns")?
    .with_tag("programming")?
    .with_tag("design")?
    .with_date_range(start_time, end_time)?
    .with_limit(50)?
    .build()?;
Features:
- Incremental Building: Add constraints one at a time
- Validation per Method: Each method validates its input immediately
- Flexible Composition: Mix text, tags, date ranges, and limits
- Default Limits: Reasonable defaults prevent accidental large queries
Wrapper Components (src/wrappers.rs)
Core Principle: Automatic Best Practices
Wrappers implement cross-cutting concerns like tracing, caching, validation, and retry logic automatically. They can be composed together to create fully-featured implementations.
Automatic Tracing: TracedStorage<S>
pub struct TracedStorage<S: Storage> {
    inner: S,
    trace_id: Uuid,
    operation_count: Arc<Mutex<u64>>,
}
Capabilities:
- Unique Trace IDs: Every storage instance gets a UUID for correlation
- Operation Logging: All operations logged with context and timing
- Metrics Collection: Duration and success/failure metrics automatically recorded
- Operation Counting: Track how many operations performed
Usage Pattern:
let storage = MockStorage::new();
let traced = TracedStorage::new(storage);
// All operations now automatically traced and timed
Input/Output Validation: ValidatedStorage<S>
pub struct ValidatedStorage<S: Storage> {
    inner: S,
    existing_ids: Arc<RwLock<std::collections::HashSet<Uuid>>>,
}
Capabilities:
- Precondition Validation: All inputs validated before processing
- Postcondition Validation: All outputs validated before returning
- Duplicate Prevention: Tracks existing IDs to prevent duplicates
- Update Validation: Ensures updates are valid transitions
Automatic Retries: RetryableStorage<S>
pub struct RetryableStorage<S: Storage> {
    inner: S,
    max_retries: u32,
    base_delay: Duration,
    max_delay: Duration,
}
Capabilities:
- Exponential Backoff: Intelligent retry timing with jitter
- Configurable Limits: Set max retries and delay bounds
- Transient Error Handling: Retries on temporary failures only
- Operation-Specific Logic: Different retry behavior per operation type
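The retry behavior described above might look like the following when written as a free function. This is a synchronous, std-only sketch: the names `backoff_delay`, `retry_with_backoff`, and the `is_transient` predicate are hypothetical, and jitter is left out because it would need a random number source.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max_delay`.
pub fn backoff_delay(attempt: u32, base_delay: Duration, max_delay: Duration) -> Duration {
    // checked_shl returns None for shifts of 32 or more; treat that as "very large".
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base_delay.saturating_mul(factor).min(max_delay)
}

/// Runs `op` until it succeeds, hits a non-transient error, or exhausts `max_retries`.
pub fn retry_with_backoff<T, E>(
    max_retries: u32,
    base_delay: Duration,
    max_delay: Duration,
    is_transient: impl Fn(&E) -> bool,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Give up on permanent failures or once the retry budget is spent.
            Err(err) if attempt >= max_retries || !is_transient(&err) => return Err(err),
            Err(_) => {
                thread::sleep(backoff_delay(attempt, base_delay, max_delay));
                attempt += 1;
            }
        }
    }
}
```

With `max_retries = 3` the operation runs at most four times: the initial call plus three retries. An async wrapper would follow the same shape with an async sleep in place of `thread::sleep`.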
LRU Caching: CachedStorage<S>
pub struct CachedStorage<S: Storage> {
    inner: S,
    cache: Arc<Mutex<LruCache<Uuid, Document>>>,
    cache_hits: Arc<Mutex<u64>>,
    cache_misses: Arc<Mutex<u64>>,
}
Capabilities:
- LRU Eviction: Intelligent cache management
- Cache Statistics: Track hit/miss ratios for optimization
- Automatic Invalidation: Updates and deletes invalidate cache entries
- Configurable Size: Set cache capacity based on memory constraints
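For readers unfamiliar with LRU eviction, the following std-only sketch shows the bookkeeping involved: recency order, hit/miss counters, and explicit invalidation. It is a simplified stand-in, not the `LruCache` type referenced in the struct above; keys are plain `u64` values and the recency update is O(n), which a production cache would avoid.

```rust
use std::collections::{HashMap, VecDeque};

/// Tiny LRU cache for illustration only.
pub struct SimpleLru<V> {
    capacity: usize,
    entries: HashMap<u64, V>,
    // Front = least recently used, back = most recently used.
    order: VecDeque<u64>,
    pub hits: u64,
    pub misses: u64,
}

impl<V: Clone> SimpleLru<V> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, entries: HashMap::new(), order: VecDeque::new(), hits: 0, misses: 0 }
    }

    // Marks `key` as the most recently used entry.
    fn touch(&mut self, key: u64) {
        self.order.retain(|k| *k != key);
        self.order.push_back(key);
    }

    pub fn get(&mut self, key: u64) -> Option<V> {
        match self.entries.get(&key).cloned() {
            Some(value) => {
                self.hits += 1;
                self.touch(key);
                Some(value)
            }
            None => {
                self.misses += 1;
                None
            }
        }
    }

    pub fn put(&mut self, key: u64, value: V) {
        // Evict the least recently used entry only when adding a new key to a full cache.
        if !self.entries.contains_key(&key) && self.entries.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(key, value);
        self.touch(key);
    }

    /// Called on update or delete so stale data is never served.
    pub fn invalidate(&mut self, key: u64) {
        self.entries.remove(&key);
        self.order.retain(|k| *k != key);
    }
}
```

A caching storage wrapper would call `get` before reading from the inner storage, `put` after a successful read, and `invalidate` on every update or delete.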
Wrapper Composition
The real power comes from composing wrappers together:
pub type FullyWrappedStorage<S> =
    TracedStorage<ValidatedStorage<RetryableStorage<CachedStorage<S>>>>;

pub async fn create_wrapped_storage<S: Storage>(
    inner: S,
    cache_capacity: usize,
) -> FullyWrappedStorage<S> {
    // Innermost layer first; each wrapper takes ownership of the previous one.
    let cached = CachedStorage::new(inner, cache_capacity);
    let retryable = RetryableStorage::new(cached);
    let validated = ValidatedStorage::new(retryable);
    TracedStorage::new(validated)
}
Layer Composition:
1. Base Storage: Your implementation
2. Caching Layer: Reduces I/O operations
3. Retry Layer: Handles transient failures
4. Validation Layer: Ensures data integrity
5. Tracing Layer: Provides observability
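The layering works because every wrapper both holds a `Storage` and implements `Storage` itself. The sketch below shows that mechanism with two wrappers over an in-memory store. It assumes a simplified, synchronous `Storage` trait with `u64` IDs and `String` bodies, which is not the crate's real async trait; `MemoryStorage` is a hypothetical base implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified, synchronous stand-in for the Storage trait (an assumption of this sketch).
pub trait Storage {
    fn insert(&mut self, id: u64, body: String) -> Result<(), String>;
    fn get(&mut self, id: u64) -> Option<String>;
}

/// Hypothetical base storage that keeps documents in memory.
#[derive(Default)]
pub struct MemoryStorage {
    docs: HashMap<u64, String>,
}

impl Storage for MemoryStorage {
    fn insert(&mut self, id: u64, body: String) -> Result<(), String> {
        self.docs.insert(id, body);
        Ok(())
    }
    fn get(&mut self, id: u64) -> Option<String> {
        self.docs.get(&id).cloned()
    }
}

/// Rejects empty bodies and duplicate IDs before delegating to the inner storage.
pub struct ValidatedStorage<S: Storage> {
    inner: S,
    existing_ids: HashSet<u64>,
}

impl<S: Storage> ValidatedStorage<S> {
    pub fn new(inner: S) -> Self {
        Self { inner, existing_ids: HashSet::new() }
    }
}

impl<S: Storage> Storage for ValidatedStorage<S> {
    fn insert(&mut self, id: u64, body: String) -> Result<(), String> {
        if body.is_empty() {
            return Err("document body cannot be empty".to_string());
        }
        if self.existing_ids.contains(&id) {
            return Err(format!("duplicate document id {id}"));
        }
        self.inner.insert(id, body)?;
        self.existing_ids.insert(id);
        Ok(())
    }
    fn get(&mut self, id: u64) -> Option<String> {
        self.inner.get(id)
    }
}

/// Counts every operation that passes through, whether or not it succeeds.
pub struct TracedStorage<S: Storage> {
    inner: S,
    operation_count: u64,
}

impl<S: Storage> TracedStorage<S> {
    pub fn new(inner: S) -> Self {
        Self { inner, operation_count: 0 }
    }
    pub fn operation_count(&self) -> u64 {
        self.operation_count
    }
}

impl<S: Storage> Storage for TracedStorage<S> {
    fn insert(&mut self, id: u64, body: String) -> Result<(), String> {
        self.operation_count += 1;
        self.inner.insert(id, body)
    }
    fn get(&mut self, id: u64) -> Option<String> {
        self.operation_count += 1;
        self.inner.get(id)
    }
}
```

Stacking is then a matter of nesting constructors, for example `TracedStorage::new(ValidatedStorage::new(MemoryStorage::default()))`. The outermost wrapper sees every call first, so in this arrangement rejected inserts are still counted by the tracing layer.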
RAII Transaction Safety: SafeTransaction
pub struct SafeTransaction {
    inner: Transaction,
    committed: bool,
}

impl Drop for SafeTransaction {
    fn drop(&mut self) {
        if !self.committed {
            warn!("Transaction {} dropped without commit - automatic rollback",
                self.inner.id);
            // Triggers rollback
        }
    }
}
Capabilities:
- Automatic Rollback: Uncommitted transactions roll back on drop
- Explicit Commit: Must explicitly commit to persist changes
- RAII Safety: Impossible to forget transaction cleanup
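The rollback-on-drop behavior can be observed in a small std-only sketch. Here a shared event log stands in for the real `Transaction` and the `warn!` call, and the `begin` and `commit` methods are hypothetical names chosen for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log used in place of a real transaction backend.
pub type EventLog = Rc<RefCell<Vec<String>>>;

pub struct SafeTransaction {
    id: u64,
    committed: bool,
    log: EventLog,
}

impl SafeTransaction {
    pub fn begin(id: u64, log: EventLog) -> Self {
        Self { id, committed: false, log }
    }

    /// Consumes the transaction, so it cannot be used again after commit.
    pub fn commit(mut self) {
        self.committed = true;
        self.log.borrow_mut().push(format!("commit {}", self.id));
        // `self` is dropped here; Drop sees committed == true and does nothing.
    }
}

impl Drop for SafeTransaction {
    fn drop(&mut self) {
        if !self.committed {
            // Runs on every exit path: scope end, early return via `?`, or panic unwinding.
            self.log.borrow_mut().push(format!("rollback {}", self.id));
        }
    }
}
```

Because `commit` takes `self` by value, the compiler rejects any use of a transaction after it has been committed, and every other way of leaving scope ends in a rollback.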
Testing Strategy
Test Coverage by Component
Validated Types Tests (tests/validated_types_tests.rs)
- Edge Case Validation: Empty strings, null bytes, reserved names
- Boundary Testing: Maximum lengths, extreme timestamps
- State Machine Testing: Valid and invalid state transitions
- Invariant Testing: Type constraints cannot be violated
Builder Tests (tests/builder_tests.rs)
- Fluent API: Method chaining works correctly
- Validation: Each method validates its input
- Default Behavior: Sensible defaults applied correctly
- Error Propagation: Validation errors surface immediately
Wrapper Tests (tests/wrapper_tests.rs)
- Composition: Wrappers can be stacked together
- Automatic Behavior: Tracing, caching, retries work transparently
- Performance: Cache hit/miss ratios, retry counts measured
- Error Handling: Failure scenarios handled gracefully
Property-Based Testing Integration
Stage 6 components integrate with the existing property-based testing from Stage 5:
#[test]
fn validated_path_never_allows_traversal() {
    proptest!(|(path_input in any_string())| {
        if let Ok(validated) = ValidatedPath::new(&path_input) {
            // If validation succeeded, path is guaranteed safe
            assert!(!validated.as_str().contains(".."));
            assert!(!validated.as_str().contains('\0'));
        }
        // If validation failed, that's also correct behavior
    });
}
Performance Characteristics
Validated Types
- Minimal Runtime Cost: Validation runs only at construction time, never on later use
- Compile-Time Optimization: Newtype wrappers are optimized away by the compiler
- Memory Efficiency: No additional overhead beyond wrapped types
Builder Patterns
- Allocation Efficient: Builders reuse allocations where possible
- Lazy Validation: Only validate when needed, cache results
- Move Semantics: Take ownership to avoid unnecessary copies
Wrapper Components
- Composable Overhead: Each wrapper adds minimal overhead
- Async-Optimized: All wrappers designed for async/await patterns
- Zero-Copy Where Possible: Pass-through wrappers avoid data copies
Integration with Previous Stages
Stage 1-2 Integration: Contracts and Tests
#[async_trait]
impl<S: Storage> Storage for TracedStorage<S> {
    async fn insert(&mut self, doc: Document) -> Result<()> {
        // Stage 2: Contract validation
        validation::document::validate_for_insert(&doc, &HashSet::new())?;

        // Stage 6: Automatic tracing
        with_trace_id("storage.insert", async {
            self.inner.insert(doc).await
        }).await
    }
}
Stage 3-4 Integration: Pure Functions and Observability
impl DocumentBuilder {
    fn calculate_word_count(content: &[u8]) -> u32 {
        // Stage 3: Pure function for word counting
        pure::text::count_words(content)
    }

    pub fn build(self) -> Result<Document> {
        // Stage 4: Automatic metric recording
        let start = Instant::now();
        let result = self.build_internal();
        record_metric(MetricType::Histogram {
            name: "document_builder.build.duration".to_string(),
            value: start.elapsed().as_millis() as f64,
            tags: vec![],
        });
        result
    }
}
Stage 5 Integration: Adversarial Testing
All Stage 6 components are tested against the adversarial scenarios from Stage 5:
- Concurrent Access: Multiple threads using builders simultaneously
- Invalid Inputs: Fuzz testing with random byte sequences
- Resource Exhaustion: Large caches, many retry attempts
- Failure Injection: Wrapped storage that simulates failures
Usage Examples
Basic Document Processing
// Assumes these items are re-exported at the crate root like the other components.
// The Storage trait must be in scope to call insert().
use kotadb::{create_wrapped_storage, DocumentBuilder, Storage};

async fn process_document(content: &[u8], path: &str) -> Result<()> {
    // Stage 6: Builder with validation
    let doc = DocumentBuilder::new()
        .path(path)?              // Validated path
        .title("Auto-Generated")? // Validated title
        .content(content)         // Auto-calculated word count
        .build()?;

    // Stage 6: Wrapped storage with automatic best practices.
    // BaseStorage stands in for your own Storage implementation.
    // `mut` is needed because Storage::insert takes &mut self.
    let mut storage = create_wrapped_storage(BaseStorage::new(), 1000).await;
    storage.insert(doc).await?; // Traced, validated, retried, cached
    Ok(())
}
Advanced Query Building
use kotadb::{QueryBuilder, ValidatedTag};

async fn build_complex_query() -> Result<Query> {
    // Read the clock once so both ends of the range share the same reference point.
    let now = chrono::Utc::now().timestamp();
    let query = QueryBuilder::new()
        .with_text("machine learning")?
        .with_tags(vec!["ai", "algorithms", "rust"])?
        .with_date_range(now - 86400 * 7, now)? // Last week
        .with_limit(25)?
        .build()?;
    Ok(query)
}
Storage Configuration
use kotadb::{StorageConfigBuilder, IndexConfigBuilder};

async fn setup_optimized_storage() -> Result<()> {
    let storage_config = StorageConfigBuilder::new()
        .path("/data/knowledge-base")?
        .cache_size(512 * 1024 * 1024) // 512MB cache
        .compression(true)
        .encryption_key([0u8; 32])     // Use real key in production
        .build()?;

    let index_config = IndexConfigBuilder::new()
        .name("semantic_search")
        .max_memory(100 * 1024 * 1024) // 100MB
        .fuzzy_search(true)
        .similarity_threshold(0.85)?
        .build()?;

    // Use configurations...
    Ok(())
}
Best Practices
When to Use Validated Types
- Always for user inputs (paths, queries, identifiers)
- Always for data with invariants (timestamps, sizes, limits)
- Consider for internal types that have constraints
When to Use Builders
- Complex objects with many optional fields
- Configuration objects with sensible defaults
- Objects requiring validation of field combinations
When to Use Wrappers
- Cross-cutting concerns like logging, metrics, caching
- Infrastructure patterns like retries, circuit breakers
- Behavioral modification without changing core logic
Composition Guidelines
- Layer by responsibility: Group related concerns together
- Optimize for readability: Most important wrapper outermost
- Consider performance: Expensive operations (validation) inner
- Test composition: Verify wrappers work together correctly
Future Extensions
Additional Validated Types
- ValidatedEmail: Email address validation
- ValidatedUrl: URL format and reachability
- ValidatedLanguageCode: ISO language codes
- ValidatedMimeType: MIME type validation
Additional Builders
- FilterBuilder: Complex query filters
- IndexBuilder: Index configuration with optimization hints
- BackupConfigBuilder: Backup and restore configurations
Additional Wrappers
- RateLimitedStorage: Rate limiting for external APIs
- EncryptedStorage: Transparent encryption/decryption
- VersionedStorage: Automatic versioning and rollback
- DistributedStorage: Multi-node consistency
Conclusion
Stage 6's Component Library provides the foundation for reliable, maintainable code by:
- Eliminating Invalid States: Validated types make bugs unrepresentable
- Encoding Best Practices: Wrappers automatically apply proven patterns
- Improving Developer Experience: Builders make complex construction ergonomic
- Enabling Composition: Components combine to create powerful functionality
The -1.0 risk reduction is achieved through prevention rather than detection - problems that can't happen don't need to be debugged.