Context
Meteor currently extracts everything on every run. For high-volume sources, this is wasteful and slow. Change detection allows extractors to emit only what changed since the last run.
Scope
- Define a watermark/checkpoint interface that extractors can implement
- Store watermarks between runs (local file, config store, or Compass state)
- Update high-volume extractors to support incremental extraction:
  - BigQuery: use INFORMATION_SCHEMA timestamps for modified tables
  - Postgres/MySQL: track schema modification timestamps
  - Kafka: track topic configuration changes
- Emit change type metadata on records: created, updated, deleted
- Fall back to full extraction when no watermark exists
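The scope items above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Meteor's actual API: the `Asset`, `Change`, `Watermark`, and `DetectChanges` names are hypothetical, and the watermark is assumed to hold the last run time plus the set of URNs seen in that run. Labeling every record as created when no watermark exists is also an assumption about how the full-extraction fallback might report change types.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// ChangeType labels how a record differs from the previous run.
type ChangeType string

const (
	Created ChangeType = "created"
	Updated ChangeType = "updated"
	Deleted ChangeType = "deleted"
)

// Asset is a hypothetical, simplified stand-in for an extracted record.
type Asset struct {
	URN        string
	ModifiedAt time.Time
}

// Change pairs an asset URN with its change type.
type Change struct {
	URN  string
	Type ChangeType
}

// Watermark is a hypothetical checkpoint: when the last run happened
// and which URNs it saw.
type Watermark struct {
	LastRun time.Time
	Seen    map[string]bool
}

// DetectChanges compares the current listing against the previous watermark.
// A nil watermark means full extraction: every asset is reported as created.
func DetectChanges(prev *Watermark, current []Asset) []Change {
	var changes []Change
	live := make(map[string]bool, len(current))
	for _, a := range current {
		live[a.URN] = true
		switch {
		case prev == nil || !prev.Seen[a.URN]:
			changes = append(changes, Change{URN: a.URN, Type: Created})
		case a.ModifiedAt.After(prev.LastRun):
			changes = append(changes, Change{URN: a.URN, Type: Updated})
		}
	}
	if prev == nil {
		return changes
	}
	// Anything seen last time but missing now is reported as deleted.
	var gone []string
	for urn, seen := range prev.Seen {
		if seen && !live[urn] {
			gone = append(gone, urn)
		}
	}
	sort.Strings(gone) // map iteration order is random; sort for stable output
	for _, urn := range gone {
		changes = append(changes, Change{URN: urn, Type: Deleted})
	}
	return changes
}

func main() {
	lastRun := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	prev := &Watermark{
		LastRun: lastRun,
		Seen:    map[string]bool{"t1": true, "t2": true, "t3": true},
	}
	current := []Asset{
		{URN: "t1", ModifiedAt: lastRun.Add(-time.Hour)}, // unchanged, not emitted
		{URN: "t2", ModifiedAt: lastRun.Add(time.Hour)},  // modified after last run
		{URN: "t4", ModifiedAt: lastRun.Add(time.Hour)},  // not seen before
	}
	for _, c := range DetectChanges(prev, current) {
		fmt.Println(c.URN, c.Type)
	}
}
```

Detecting deletions this way requires the watermark to remember which URNs existed, not just a timestamp. Sources that cannot list all assets cheaply may only be able to report created and updated records.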
Design Considerations
- Watermark storage should be pluggable (local file for dev, remote store for production)
- Full extraction should remain available as a fallback or explicit mode
- Change detection accuracy varies by source — document limitations per extractor
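One possible shape for pluggable storage is a small interface with interchangeable backends. The sketch below is an assumption for discussion, not an existing Meteor interface: `WatermarkStore`, `MemoryStore`, `FileStore`, `ErrNoWatermark`, and `Mode` are hypothetical names, and a watermark is modeled as a single timestamp per extractor. A remote store for production would implement the same two methods.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// ErrNoWatermark signals that no checkpoint exists yet.
var ErrNoWatermark = errors.New("no watermark stored")

// WatermarkStore is a hypothetical pluggable storage interface.
type WatermarkStore interface {
	Load(extractor string) (time.Time, error)
	Save(extractor string, t time.Time) error
}

// MemoryStore keeps watermarks in process memory (useful in tests).
type MemoryStore struct{ marks map[string]time.Time }

func NewMemoryStore() *MemoryStore {
	return &MemoryStore{marks: map[string]time.Time{}}
}

func (s *MemoryStore) Load(name string) (time.Time, error) {
	t, ok := s.marks[name]
	if !ok {
		return time.Time{}, ErrNoWatermark
	}
	return t, nil
}

func (s *MemoryStore) Save(name string, t time.Time) error {
	s.marks[name] = t
	return nil
}

// FileStore persists one JSON file per extractor under Dir (local dev use).
type FileStore struct{ Dir string }

func (s FileStore) path(name string) string {
	return filepath.Join(s.Dir, name+".json")
}

func (s FileStore) Load(name string) (time.Time, error) {
	data, err := os.ReadFile(s.path(name))
	if errors.Is(err, os.ErrNotExist) {
		return time.Time{}, ErrNoWatermark
	}
	if err != nil {
		return time.Time{}, err
	}
	var t time.Time
	if err := json.Unmarshal(data, &t); err != nil {
		return time.Time{}, err
	}
	return t, nil
}

func (s FileStore) Save(name string, t time.Time) error {
	data, err := json.Marshal(t)
	if err != nil {
		return err
	}
	return os.WriteFile(s.path(name), data, 0o644)
}

// Mode reports which extraction mode a run should use. Full extraction is
// chosen when explicitly requested or when no watermark has been stored.
func Mode(store WatermarkStore, name string, forceFull bool) (string, error) {
	if forceFull {
		return "full", nil
	}
	_, err := store.Load(name)
	if errors.Is(err, ErrNoWatermark) {
		return "full", nil
	}
	if err != nil {
		return "", err
	}
	return "incremental", nil
}

func main() {
	store := NewMemoryStore()
	mode, _ := Mode(store, "bigquery", false)
	fmt.Println(mode) // no watermark stored yet
	_ = store.Save("bigquery", time.Now())
	mode, _ = Mode(store, "bigquery", false)
	fmt.Println(mode) // watermark now present
}
```

Returning a sentinel error for the missing-watermark case keeps the fallback decision in one place, so each backend only has to distinguish "nothing stored" from a real failure.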
Why
Incremental extraction reduces load on sources, shrinks payloads, and enables faster refresh cycles. A graph that updates in minutes instead of hours is significantly more useful for AI agents.