:experimental: This project is experimental and may change significantly.
A small FastAPI service that returns aggregate file counts and total sizes for any (resource, path) pair.
Backed by PostgreSQL tables and triggers that maintain an aggregate table path_agg_res from index_record changes.
GET /{resource}/path/{path}
Response:
{
"resource": "<extracted_resource>",
"path": "<extracted_path>",
"size": <extracted_size>,
"file_count": <calculated_file_count>
}# 1) Install Poetry if needed
# https://python-poetry.org/docs/#installation
poetry install
poetry run uvicorn app.main:app --reload
# then open http://127.0.0.1:8000//programs/a/project/b/path//dataCreate a .env file with:
DATABASE_URL=postgresql://user:pass@localhost:5432/mydb
If DATABASE_URL is missing or the DB call fails, the service will return zeros (and log the error).
Run migrations/001_init.sql against your Postgres instance. It will:
- create
index_record,index_record_authz(if not present), - create helper functions,
- create
path_agg_resaggregate table, - create triggers to keep
path_agg_resupdated on inserts/updates/deletes toindex_recordandindex_record_authz.
poetry run pytestThese tests spin up a disposable Postgres in Docker via Testcontainers, apply the SQL migration using SQLAlchemy, then run API tests against the live database.
Requirements:
- Postgres
Run:
poetry install
poetry run pytest -qSee migrations/001_init.sql for database schema, functions, and triggers.
See openapi.json for API schema.
sequenceDiagram
autonumber
actor Client
participant API as FastAPI (GET /{resource}/path/{path})
participant DB as PostgreSQL
participant TR as Triggers (PL/pgSQL)
participant SRC as Source Tables<br/>index_record, index_record_authz
participant AGG as Aggregate Table<br/>path_agg_res
%% ---- Query path aggregate ----
Client->>API: GET /{resource}/path/{path}
API->>API: normalize(resource, path)
API->>DB: SELECT file_count, total_size<br/>FROM path_agg_res<br/>WHERE resource=$1 AND path=$2
DB-->>API: {file_count, total_size} or NULL
API->>API: default to {0,0} if NULL
API-->>Client: 200 JSON {resource, path, file_count, size}
%% ---- Background maintenance via triggers ----
rect rgb(245,245,245)
note over SRC,TR: Any data change that affects paths/resources
Client->>API: (ingest/update via upstream app)*
API->>DB: INSERT/UPDATE/DELETE on index_record<br/>and/or index_record_authz
DB->>TR: AFTER INSERT/UPDATE/DELETE fire
TR->>TR: compute path prefixes (/a, /a/b, …)
TR->>AGG: UPSERT/UPDATE per (path, resource)<br/>file_count ±1, total_size ±Δ
AGG-->>TR: ok
TR-->>DB: done
end
note over Client,API:This service exposes the read endpoint. Writes typically come from indexd workflows/services.
A PostgreSQL trigger is used to automatically update the path_agg_res aggregation table whenever indexd tables are modified.
This ensures that aggregated counts and sizes remain consistent without manual intervention.
-
Trigger Function:
The trigger function listens forINSERT,UPDATE, orDELETEevents on the source data table. On each event, it recalculates affected aggregates and updates thepath_agg_restable. -
Benefits:
- Keeps aggregation data up-to-date.
- Reduces the need for expensive queries at runtime.
- Ensures data integrity between source and aggregate tables.
- Python 3.12: Main programming language.
- FastAPI: REST API framework.
- SQLAlchemy / psycopg: Database access.
- PostgreSQL: Relational database.
- Pytest: Testing framework.
- Poetry: Dependency management.
-
API Layer (
app/):
Exposes REST endpoints for resource and path aggregation. Handles request validation and response formatting. -
Database Layer (
app/db.py):
Contains functions for querying and aggregating data from PostgreSQL. Uses raw SQL and connection helpers. -
Models (
app/models.py):
Defines data structures for resources, paths, and aggregation results. -
Tests (
tests/):
Unit and integration tests for API and database logic.
-
Client Request:
Client sends HTTP request to API endpoint with resource, path, and depth parameters. -
API Processing:
FastAPI validates input, normalizes paths, and calls aggregation logic. -
Database Query:
Aggregation function constructs SQL query to fetch matching paths and children up to specified depth. -
Response:
Results are formatted as JSON and returned to the client.
app/
├── main.py # FastAPI app entrypoint
├── db.py # Database access functions
├── models.py # Data models
└── ...
tests/
└── ... # Test cases and fixtures
pyproject.toml # Project configuration
README.md # Project overview
- All database queries are parameterized to prevent SQL injection.
- API endpoints are documented with OpenAPI via FastAPI.
- Integration tests require a running PostgreSQL instance.