rhizome-node/docs/schema-validation.md
Lentil Hoffman f4ea2eca39
refactor: replace NegationHelper.createNegation with DeltaBuilder.negate
- Remove NegationHelper.createNegation in favor of using DeltaBuilder's fluent API
- Update all test cases to use createDelta().negate().buildV1() pattern
- Update documentation to reflect the preferred way to create negation deltas
- Remove unused isNegationDeltaById helper method
2025-06-21 22:45:27 -05:00

91 lines
3.2 KiB
Markdown

# Schema Validation in Rhizome-Node
This document explains how schema validation works with deltas in Rhizome-Node.
## Overview
Schema validation in Rhizome-Node is enforced at the `TypedCollection` level when using the `put` method, which validates data before creating deltas. This means:
1. **Local Changes**: When you use `collection.put()`, the data is validated against the schema before any deltas are created and ingested.
2. **Peer Changes**: Deltas received from other peers are ingested without validation by default, which means invalid data can enter the system.
3. **Validation Tracking**: The system tracks which entities are valid/invalid after ingestion.
## Example Usage
```typescript
// 1. Define a schema for users
const userSchema = SchemaBuilder
.create('user')
.name('User')
.property('name', PrimitiveSchemas.requiredString())
.property('email', PrimitiveSchemas.email())
.property('age', PrimitiveSchemas.integer({ minimum: 0 }))
.required('name')
.build();
// 2. Create a typed collection with strict validation
const collection = new TypedCollectionImpl<{
name: string;
email?: string;
age?: number;
}>('users', userSchema, schemaRegistry, {
strictValidation: true // Enable strict validation
});
// Connect to the node
collection.rhizomeConnect(node);
// 3. Local changes - validated on put()
// Valid usage - will pass schema validation
await collection.put('user1', {
name: 'Alice',
email: 'alice@example.com',
age: 30
});
// Invalid usage - will throw SchemaValidationError
await expect(collection.put('user2', {
email: 'invalid-email', // Invalid email format
age: -5 // Negative age
})).rejects.toThrow(SchemaValidationError);
// 4. Peer data - ingested without validation by default
const unsafeDelta = createDelta('peer1', 'peer1')
.setProperty('user3', 'name', 'Bob', 'users')
.setProperty('user3', 'age', 'not-a-number', 'users')
.buildV1();
// This will be ingested without validation
node.lossless.ingestDelta(unsafeDelta);
// 5. Check validation status after the fact
const stats = collection.getValidationStats();
console.log(`Valid: ${stats.validEntities}, Invalid: ${stats.invalidEntities}`);
// Get details about invalid entities
const invalidUsers = collection.getInvalidEntities();
invalidUsers.forEach(user => {
console.log(`User ${user.entityId} is invalid:`, user.errors);
});
```
## Key Points
### Validation Timing
- Schema validation happens in `TypedCollection.put()` before deltas are created
- Deltas from peers are ingested without validation by default
### Validation Modes
- `strictValidation: true`: Throws errors on invalid data (recommended for local changes)
- `strictValidation: false`: Allows invalid data but tracks it (default)
### Monitoring
- Use `getValidationStats()` to get counts of valid/invalid entities
- Use `getInvalidEntities()` to get detailed error information
### Best Practices
- Always validate data before creating deltas when accepting external input
- Use `strictValidation: true` for collections where data integrity is critical
- Monitor validation statistics in production to detect data quality issues
- Consider implementing a validation layer for peer data if needed