Description
Context
Setting up a new single-machine DRS server or worker node often requires rapidly seeding it with a manifest or catalogue of existing files that align with a real S3 bucket. This is especially useful when onboarding a development node or mirroring environments for testing purposes (e.g., with git-drs or other tools).
Decision
We propose implementing a mechanism to export the contents of a DRS server into a static file (e.g., drs.index) and uploading this file to a specified S3 bucket. When a new DRS server is launched and points to this bucket, it should automatically recognize and ingest drs.index to populate its own DRS tables, facilitating a quick start and mirroring process.
This process does not need to be flawless—its primary goal is to streamline rapid development and test node deployment against existing real-world DRS catalogues.
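To make the intended shape of the dump concrete, here is a minimal export sketch in Python using boto3. The newline-delimited JSON layout, the format_version header, and the record fields shown in the docstring are illustrative assumptions, not a finalized schema (defining the structure is listed under Next Steps).

```python
import json

import boto3


def export_drs_index(records, bucket, key="drs.index"):
    """Serialize DRS records as newline-delimited JSON and upload to S3.

    `records` is assumed to be an iterable of dicts pulled from the DRS
    tables, e.g. {"id": ..., "name": ..., "size": ..., "checksums": [...],
    "access_url": "s3://bucket/path"} -- the real schema is still to be
    defined (see Next Steps).
    """
    # First line is a small header so future readers can check compatibility.
    lines = [json.dumps({"type": "header", "format_version": 1})]
    lines += [json.dumps(r, sort_keys=True) for r in records]
    body = ("\n".join(lines) + "\n").encode("utf-8")

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=body)
```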
Key & Credential Discovery
A bootstrapping node pointed at an S3 bucket will need valid credentials to access the bucket and any referenced objects. The following strategies should be considered:
Credential Sources (in priority order)
- Environment variables: standard AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN) or the equivalent for non-AWS S3-compatible stores.
- IAM instance roles / IRSA: when running on EC2 or Kubernetes, the node should automatically discover credentials via the instance metadata service or service account token projection (preferred for deployed environments).
- Credential file or secrets manager reference: a path or ARN bundled alongside the drs.index file (e.g., a drs.credentials.json sidecar) that points to a secrets manager entry or contains scoped, short-lived credentials for bootstrap use only.
- Interactive / CLI prompt: for local development, allow the operator to supply credentials at startup or via a CLI flag (a sketch of this chain follows the list).
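A minimal sketch of that priority order, assuming boto3, whose default credential chain already covers environment variables and instance roles/IRSA; the drs.credentials.json sidecar layout and the interactive fallback are illustrative assumptions, not an agreed format.

```python
import getpass
import json
import os

import boto3


def discover_s3_client(sidecar_path="drs.credentials.json"):
    """Walk the credential sources in priority order and return an S3 client."""
    # 1 & 2: environment variables and IAM instance roles / IRSA are both
    # resolved by boto3's default credential chain.
    session = boto3.Session()
    if session.get_credentials() is not None:
        return session.client("s3")

    # 3: sidecar file with scoped, short-lived bootstrap credentials.
    # Hypothetical layout: {"access_key_id": ..., "secret_access_key": ...,
    # "session_token": ...}.
    if os.path.exists(sidecar_path):
        with open(sidecar_path) as f:
            creds = json.load(f)
        return boto3.client(
            "s3",
            aws_access_key_id=creds["access_key_id"],
            aws_secret_access_key=creds["secret_access_key"],
            aws_session_token=creds.get("session_token"),
        )

    # 4: interactive fallback, intended for local development only.
    access_key = input("AWS access key ID: ")
    secret_key = getpass.getpass("AWS secret access key: ")
    return boto3.client(
        "s3", aws_access_key_id=access_key, aws_secret_access_key=secret_key
    )
```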
Security Considerations
- Credentials must never be embedded in the drs.index dump itself.
- Bootstrap credentials should be scoped and short-lived (e.g., STS temporary credentials or time-limited pre-signed URLs).
- The ingest process should validate that it can authenticate to the target bucket before beginning table population, failing fast with a clear error message.
- Audit logging should capture when and how credentials were sourced during bootstrap.
- If the drs.index references objects across multiple buckets or accounts, the credential discovery chain should support assuming cross-account roles where necessary (see the sketch after this list).
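The fail-fast check and the cross-account case could look roughly like the following, again assuming boto3; head_bucket is used here only as a cheap authenticated probe, and the role ARN handling is a sketch rather than a settled design.

```python
import boto3
from botocore.exceptions import ClientError


def validate_bucket_access(s3_client, bucket):
    """Fail fast with a clear error before any table population starts."""
    try:
        s3_client.head_bucket(Bucket=bucket)
    except ClientError as exc:
        raise RuntimeError(
            f"Bootstrap aborted: cannot authenticate to bucket {bucket!r}: {exc}"
        ) from exc


def assume_cross_account_client(role_arn, session_name="drs-bootstrap"):
    """Assume a role in another account when drs.index references its buckets."""
    sts = boto3.client("sts")
    resp = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = resp["Credentials"]
    return boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```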
Considerations
- The static drs.index dump serves as a snapshot rather than a perfect real-time mirror.
- This approach is aimed at development, integration, and testing, not production-grade failover or replication.
- The ingest process should be automated and require minimal manual intervention on the new node.
- The dump format and ingest logic should be documented and versioned to ensure forward compatibility.
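For completeness, a sketch of the ingest side under the same assumed newline-delimited format as the export sketch above; insert_drs_record is a hypothetical hook standing in for whatever table-population API drs-server ends up exposing, and the version check illustrates why the format should carry a version field.

```python
import json


def ingest_drs_index(s3_client, bucket, insert_drs_record, key="drs.index"):
    """Download drs.index from the bucket and populate local DRS tables."""
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    lines = obj["Body"].read().decode("utf-8").splitlines()

    # Reject dumps written in a format this node does not understand.
    header = json.loads(lines[0])
    if header.get("format_version") != 1:
        raise ValueError(f"Unsupported drs.index format: {header}")

    for line in lines[1:]:
        record = json.loads(line)
        insert_drs_record(record)  # hypothetical hook into the DRS tables
```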
Alternatives Considered
- Manual population of the test server via API or direct database access (slower, more error-prone)
- Live cluster synchronization solutions (overkill for development/test cases)
- Baking credentials into AMIs or container images (insecure, brittle)
Consequences
- Quicker development and validation cycles for DRS features in environments like git-drs.
- Lower friction when onboarding new dev nodes or worker nodes needing local DRS state.
- Potential risk of confusion if using stale or incompatible .index snapshots, mitigated by documentation and versioning.
- Clear credential hygiene from day one prevents accidental secret leakage in test/dev workflows.
Next Steps
- Define the structure for the drs.index export; implement export code in drs-server
- Define the structure for the drs.index import; implement import code in drs-server
- Implement credential discovery chain with fail-fast validation
- Document the ingest logic for new single-machine servers
- Ensure compatibility with git-drs and similar tools