Path Aggregation API

:experimental: This project is experimental and may change significantly.

A small FastAPI service that returns aggregate file counts and total sizes for any (resource, path) pair. Backed by PostgreSQL tables and triggers that maintain an aggregate table path_agg_res from index_record changes.

Endpoints

GET /{resource}/path/{path}

Response:

{
  "resource": "<extracted_resource>",
  "path": "<extracted_path>",
  "size": <extracted_size>,
  "file_count": <calculated_file_count>
}

Quickstart (dev)

# 1) Install Poetry if needed
# https://python-poetry.org/docs/#installation

poetry install
poetry run uvicorn app.main:app --reload
# then open http://127.0.0.1:8000//programs/a/project/b/path//data

Environment

Create a .env file with:

DATABASE_URL=postgresql://user:pass@localhost:5432/mydb

If DATABASE_URL is missing or the DB call fails, the service will return zeros (and log the error).

Database / Migration

Run migrations/001_init.sql against your Postgres instance. It will:

create index_record, index_record_authz (if not present),
create helper functions,
create path_agg_res aggregate table,
create triggers to keep path_agg_res updated on inserts/updates/deletes to index_record and index_record_authz.

Tests

poetry run pytest

Integration tests with Postgres (SQLAlchemy)

These tests spin up a disposable Postgres in Docker via Testcontainers, apply the SQL migration using SQLAlchemy, then run API tests against the live database.

Requirements:

Postgres

Run:

poetry install
poetry run pytest -q

Architecture

See migrations/001_init.sql for database schema, functions, and triggers. See openapi.json for API schema.

sequenceDiagram
    autonumber

    actor Client
    participant API as FastAPI (GET /{resource}/path/{path})
    participant DB as PostgreSQL
    participant TR as Triggers (PL/pgSQL)
    participant SRC as Source Tables<br/>index_record, index_record_authz
    participant AGG as Aggregate Table<br/>path_agg_res

    %% ---- Query path aggregate ----
    Client->>API: GET /{resource}/path/{path}
    API->>API: normalize(resource, path)
    API->>DB: SELECT file_count, total_size<br/>FROM path_agg_res<br/>WHERE resource=$1 AND path=$2
    DB-->>API: {file_count, total_size} or NULL
    API->>API: default to {0,0} if NULL
    API-->>Client: 200 JSON {resource, path, file_count, size}

    %% ---- Background maintenance via triggers ----
    rect rgb(245,245,245)
      note over SRC,TR: Any data change that affects paths/resources
      Client->>API: (ingest/update via upstream app)*
      API->>DB: INSERT/UPDATE/DELETE on index_record<br/>and/or index_record_authz
      DB->>TR: AFTER INSERT/UPDATE/DELETE fire
      TR->>TR: compute path prefixes (/a, /a/b, …)
      TR->>AGG: UPSERT/UPDATE per (path, resource)<br/>file_count ±1, total_size ±Δ
      AGG-->>TR: ok
      TR-->>DB: done
    end

    note over Client,API:This service exposes the read endpoint. Writes typically come from indexd workflows/services.

Database Trigger

A PostgreSQL trigger is used to automatically update the path_agg_res aggregation table whenever indexd tables are modified.
This ensures that aggregated counts and sizes remain consistent without manual intervention.

Trigger Function:
The trigger function listens for INSERT, UPDATE, or DELETE events on the source data table. On each event, it recalculates affected aggregates and updates the path_agg_res table.
Benefits:
- Keeps aggregation data up-to-date.
- Reduces the need for expensive queries at runtime.
- Ensures data integrity between source and aggregate tables.

Technologies

Python 3.12: Main programming language.
FastAPI: REST API framework.
SQLAlchemy / psycopg: Database access.
PostgreSQL: Relational database.
Pytest: Testing framework.
Poetry: Dependency management.

Components

API Layer (app/):
Exposes REST endpoints for resource and path aggregation. Handles request validation and response formatting.
Database Layer (app/db.py):
Contains functions for querying and aggregating data from PostgreSQL. Uses raw SQL and connection helpers.
Models (app/models.py):
Defines data structures for resources, paths, and aggregation results.
Tests (tests/):
Unit and integration tests for API and database logic.

Data Flow

Client Request:
Client sends HTTP request to API endpoint with resource, path, and depth parameters.
API Processing:
FastAPI validates input, normalizes paths, and calls aggregation logic.
Database Query:
Aggregation function constructs SQL query to fetch matching paths and children up to specified depth.
Response:
Results are formatted as JSON and returned to the client.

Directory Structure

app/
  ├── main.py         # FastAPI app entrypoint
  ├── db.py           # Database access functions
  ├── models.py       # Data models
  └── ...
tests/
  └── ...             # Test cases and fixtures
pyproject.toml        # Project configuration
README.md             # Project overview

Notes

All database queries are parameterized to prevent SQL injection.
API endpoints are documented with OpenAPI via FastAPI.
Integration tests require a running PostgreSQL instance.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.github/workflows		.github/workflows
app		app
migrations		migrations
openapi		openapi
tests		tests
.flake8		.flake8
.gitignore		.gitignore
README.md		README.md
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Path Aggregation API

Endpoints

Quickstart (dev)

Environment

Database / Migration

Tests

Integration tests with Postgres (SQLAlchemy)

Architecture

Database Trigger

Technologies

Components

Data Flow

Directory Structure

Notes

About

Uh oh!

Releases

Packages

Languages

calypr/path_hierarchy

Folders and files

Latest commit

History

Repository files navigation

Path Aggregation API

Endpoints

Quickstart (dev)

Environment

Database / Migration

Tests

Integration tests with Postgres (SQLAlchemy)

Architecture

Database Trigger

Technologies

Components

Data Flow

Directory Structure

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages