GitHub - sipcapture/HEPop: HEP Server powered by DuckDB + Parquet Storage

HEPop is a high-performance HEP capture server built with DuckDB, Bun and Apache Arrow/Parquet.

Features

%%{
  init: {
    'theme': 'base',
    'themeVariables': {
      'primaryColor': '#BB2528',
      'primaryTextColor': '#fff',
      'primaryBorderColor': '#7C0000',
      'lineColor': '#F8B229',
      'secondaryColor': '#006100',
      'tertiaryColor': '#fff'
    }
  }
}%%

  graph TD;
      HEP-Client-- UDP/TCP -->HEPop;
      HEPop-->ParquetWriter;
      ParquetWriter-->Storage;
      ParquetWriter-->Metadata;
      Storage-->Compactor;
      Compactor-->Storage;
      Compactor-->Metadata;
      Storage-.->LocalFS;
      Storage-.->S3;
      HTTP-API-- GET/POST --> HEPop;
      DuckDB-->Storage;
      DuckDB-->Metadata;

      subgraph HEPop[HEPop Server]
        ParquetWriter
        Compactor
        Metadata
        DuckDB
      end

Install & Start

Use Bun to install, build and run HEPop:

bun install
bun start

Configuration

Configure HEPop using environment variables:

PORT: HEP server port (default: 9069)
HTTP_PORT: Query API port (default: PORT + 1)
HOST: Bind address (default: "0.0.0.0")
PARQUET_DIR: Data directory (default: "./data")
WRITER_ID: Instance identifier (default: hostname)
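
The defaults above can be sketched as a small resolver. This is an illustrative sketch only, not HEPop's actual startup code: the function name `resolveConfig` and the returned property names are hypothetical, and the hostname is passed in as an argument to keep the example self-contained.

```javascript
// Hypothetical helper that mirrors the documented defaults.
// It is not taken from hepop.js.
function resolveConfig(env = {}, hostname = "localhost") {
  // PORT defaults to 9069
  const port = Number.parseInt(env.PORT ?? "9069", 10);
  return {
    port,
    // HTTP_PORT defaults to PORT + 1
    httpPort: env.HTTP_PORT ? Number.parseInt(env.HTTP_PORT, 10) : port + 1,
    host: env.HOST ?? "0.0.0.0",
    parquetDir: env.PARQUET_DIR ?? "./data",
    // WRITER_ID defaults to the machine hostname
    writerId: env.WRITER_ID ?? hostname,
  };
}

// With no variables set, the query API lands on port 9070,
// which is the port used by the curl examples in this README.
const defaults = resolveConfig({}, "writer1");
console.log(defaults.port, defaults.httpPort);
```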

Storage Structure

HEPop organizes data in a time-based directory structure:

data/
└── writer1/
    └── dbs/
        └── hep-0/
            ├── hep_1-0/
            │   └── 2025-02-08/
            │       ├── 19-00/
            │       │   └── c_0000000001.parquet
            │       ├── 19-10/
            │       │   └── 0000000002.parquet
            │       └── metadata.json
            └── hep_100-0/
                └── ...
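
To make the layout above concrete, here is a minimal sketch that builds a file path from a writer ID, HEP type, timestamp and sequence number. The naming rules are inferred from the example tree only: the helper `partitionPath`, the 10-minute bucket size and the use of UTC are assumptions, not confirmed behavior of HEPop.

```javascript
// Hypothetical helper; the layout is inferred from the example tree.
// Bucket size (10 minutes) and UTC are assumptions.
function partitionPath({
  baseDir = "data",
  writer,
  type,
  date,
  seq,
  compacted = false,
  bucketMinutes = 10,
}) {
  const iso = date.toISOString();   // always UTC, e.g. 2025-02-08T19:12:00.000Z
  const day = iso.slice(0, 10);     // 2025-02-08
  const hour = iso.slice(11, 13);   // 19
  // Round the minute down to the start of its bucket.
  const minute = Math.floor(date.getUTCMinutes() / bucketMinutes) * bucketMinutes;
  const bucket = `${hour}-${String(minute).padStart(2, "0")}`;
  // Compacted files carry a c_ prefix; sequence numbers are zero-padded to 10 digits.
  const file = `${compacted ? "c_" : ""}${String(seq).padStart(10, "0")}.parquet`;
  return `${baseDir}/${writer}/dbs/hep-0/hep_${type}-0/${day}/${bucket}/${file}`;
}

console.log(
  partitionPath({ writer: "writer1", type: 1, date: new Date("2025-02-08T19:12:00Z"), seq: 2 })
);
```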

Each HEP type gets its own directory structure
Generated Parquet files are organized by date and time bucket (for example 2025-02-08/19-00/)
Compacted sets (c_) consolidate files for fast access
Metadata tracks all files, compaction and statistics

{
  "type": 1,
  "parquet_size_bytes": 379739,
  "row_count": 359,
  "min_time": 1739043338978000000,
  "max_time": 1739043934193000000,
  "wal_sequence": 32,
  "files": [
    {
      "id": 0,
      "path": "data/writer1/dbs/hep-0/hep_1-0/2025-02-08/19-00/c_0000000032.parquet",
      "size_bytes": 379739,
      "row_count": 359,
      "chunk_time": 1739043000000000000,
      "min_time": 1739043338978000000,
      "max_time": 1739043934193000000,
      "range": "1h",
      "type": "compacted"
    }
  ]
}
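
In the example above, the top-level totals match the single file entry. The sketch below shows one way such totals could be derived from the `files` array. It assumes the top-level fields are plain aggregates over the file entries, which the source does not state explicitly, and the function name `summarizeFiles` is hypothetical. Nanosecond timestamps of this size exceed `Number.MAX_SAFE_INTEGER`, so the sketch uses BigInt for the time fields.

```javascript
// Hypothetical aggregation over metadata file entries.
// Time fields are BigInt because nanosecond epochs do not fit safely in a Number.
function summarizeFiles(type, files) {
  if (files.length === 0) throw new Error("no files to summarize");
  let minTime = files[0].min_time;
  let maxTime = files[0].max_time;
  let sizeBytes = 0;
  let rowCount = 0;
  for (const f of files) {
    sizeBytes += f.size_bytes;
    rowCount += f.row_count;
    if (f.min_time < minTime) minTime = f.min_time;
    if (f.max_time > maxTime) maxTime = f.max_time;
  }
  return {
    type,
    parquet_size_bytes: sizeBytes,
    row_count: rowCount,
    min_time: minTime,
    max_time: maxTime,
  };
}

const summary = summarizeFiles(1, [
  {
    size_bytes: 379739,
    row_count: 359,
    min_time: 1739043338978000000n,
    max_time: 1739043934193000000n,
  },
]);
console.log(summary.row_count, summary.min_time.toString());
```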

Query API

Query the HEP data using the HTTP API. The server provides both GET and POST endpoints for querying data.

Query Features

Time Range: If not specified, defaults to the last 10 minutes
Dynamic Columns: Select specific columns or use * for all
Filtering: WHERE clause supports standard SQL conditions
Sorting: ORDER BY supports all columns
Pagination: Use LIMIT and OFFSET for paging
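
As an illustration of the features above, this sketch assembles a paged SQL string from its parts. The helper `pagedQuery` is hypothetical and concatenates strings without any escaping, so it is only suitable for trusted input.

```javascript
// Hypothetical query builder: SELECT columns, optional WHERE and ORDER BY,
// then LIMIT/OFFSET for paging. No escaping is performed.
function pagedQuery({ columns = ["*"], table, where, orderBy, pageSize = 100, page = 0 }) {
  const parts = [`SELECT ${columns.join(", ")} FROM ${table}`];
  if (where) parts.push(`WHERE ${where}`);
  if (orderBy) parts.push(`ORDER BY ${orderBy}`);
  // Page numbers start at 0, so page 2 with 50 rows per page skips 100 rows.
  parts.push(`LIMIT ${pageSize} OFFSET ${page * pageSize}`);
  return parts.join(" ");
}

console.log(
  pagedQuery({
    columns: ["time", "src_ip", "dst_ip"],
    table: "hep_1",
    orderBy: "time DESC",
    pageSize: 50,
    page: 2,
  })
);
```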

Available HEP Fields:

Virtual HEP fields such as src_ip and dst_port are extracted from rcinfo automatically at query time:

timestamp/time: Event timestamp
rcinfo: Raw HEP protocol header (JSON)
payload: HEP Protocol payload
src_ip: Source IP (rcinfo)
dst_ip: Destination IP (rcinfo)
src_port: Source port (rcinfo)
dst_port: Destination port (rcinfo)
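
Conceptually, the virtual fields are a flattened view of the rcinfo JSON. The sketch below shows that idea on a single row. The rcinfo key names (`srcIp`, `dstIp`, `srcPort`, `dstPort`) and the helper `explodeRow` are assumptions made for illustration; the source does not list the keys inside rcinfo, and HEPop performs this step inside the query engine rather than in application code.

```javascript
// Hypothetical flattening of one stored row into the virtual fields listed above.
// The rcinfo key names are assumed, not confirmed by the source.
function explodeRow(row) {
  const rc = typeof row.rcinfo === "string" ? JSON.parse(row.rcinfo) : row.rcinfo;
  return {
    time: row.time,
    src_ip: rc.srcIp,
    dst_ip: rc.dstIp,
    src_port: rc.srcPort,
    dst_port: rc.dstPort,
    payload: row.payload,
  };
}

const flat = explodeRow({
  time: "2025-02-08T19:35:38Z",
  rcinfo: '{"srcIp":"192.0.2.10","dstIp":"192.0.2.20","srcPort":5060,"dstPort":5060}',
  payload: "INVITE sip:bob@example.com SIP/2.0",
});
console.log(flat.src_ip, flat.dst_port);
```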

GET /query

# Query last 10 minutes of SIP messages (curl URL-encodes the SQL in the q parameter)
curl -G "http://localhost:9070/query" \
  --data-urlencode "q=SELECT time,src_ip,dst_ip,payload FROM hep_1 LIMIT 10"

# Complex query with time range and conditions, sent as a POST body
curl -X POST http://localhost:9070/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT time, src_ip, dst_ip, payload FROM hep_1 WHERE time >= '\''2025-02-08T19:00:00'\'' AND payload LIKE '\''%INVITE%'\'' ORDER BY time DESC"
  }'
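
The same two requests can be prepared from JavaScript. This is a sketch under stated assumptions: the helper names `getQueryUrl` and `postQueryRequest` are hypothetical, only the `q` parameter and the `query` JSON key come from the examples above, and the response format is not described in this README, so it is left uninterpreted.

```javascript
// Hypothetical helpers that build the GET URL and the POST request
// matching the curl examples above.
function getQueryUrl(baseUrl, sql) {
  // The SQL must be URL-encoded before it is placed in the query string.
  return `${baseUrl}/query?q=${encodeURIComponent(sql)}`;
}

function postQueryRequest(baseUrl, sql) {
  return {
    url: `${baseUrl}/query`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query: sql }),
    },
  };
}

const request = postQueryRequest(
  "http://localhost:9070",
  "SELECT time, src_ip, dst_ip, payload FROM hep_1 WHERE payload LIKE '%INVITE%' ORDER BY time DESC"
);
console.log(request.url, request.init.body);

// Sending it against a running server would look like:
//   const res = await fetch(request.url, request.init);
//   const text = await res.text();
```

Building the body with `JSON.stringify` avoids the shell quoting needed in the curl version, since single quotes inside the SQL need no escaping.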

OLAP Query

Query HEP data using DuckDB, ClickHouse, Databend or any Parquet-compatible tool:

SELECT count(*) FROM 'data/writer1/dbs/hep-0/hep_1-*/*/*/c_0000000001.parquet';

Line Protocol API

HEPop also supports InfluxDB Line Protocol ingestion for metrics and events.

POST /write

Send metrics using the InfluxDB Line Protocol format. Each line represents a single data point with measurement, tags, fields and optional timestamp.

# Single metric
curl -i -XPOST "http://localhost:9070/write" --data-raw 'cpu,host=server01,region=us-west usage_idle=92.6,usage_user=7.4'

# Multiple metrics
curl -i -XPOST "http://localhost:9070/write" --data-raw '
memory,host=server01,region=us-west used_percent=23.43,free=7.82
disk,host=server01,region=us-west used_percent=86.45,free=21.45
network,host=server01,region=us-west rx_bytes=7834,tx_bytes=9843
'
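
The request bodies above can also be generated from structured data. The following is a minimal sketch, not a complete Line Protocol encoder: the helper `toLineProtocol` is hypothetical, it handles numeric field values only, and it does not escape spaces, commas or equals signs in names or values.

```javascript
// Hypothetical encoder for one data point. Numeric fields only, no escaping.
function toLineProtocol({ measurement, tags = {}, fields, timestamp }) {
  const tagPart = Object.entries(tags)
    .map(([key, value]) => `,${key}=${value}`)
    .join("");
  const fieldPart = Object.entries(fields)
    .map(([key, value]) => `${key}=${value}`)
    .join(",");
  // The timestamp is optional; omit it entirely when not provided.
  const tsPart = timestamp === undefined ? "" : ` ${timestamp}`;
  return `${measurement}${tagPart} ${fieldPart}${tsPart}`;
}

// Several points are sent as one body, one line per point.
const body = [
  { measurement: "memory", tags: { host: "server01" }, fields: { used_percent: 23.43, free: 7.82 } },
  { measurement: "disk", tags: { host: "server01" }, fields: { used_percent: 86.45, free: 21.45 } },
]
  .map(toLineProtocol)
  .join("\n");
console.log(body);
```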

Line Protocol Format

<measurement>[,<tag_key>=<tag_value>] <field_key>=<field_value>[,<field_key>=<field_value>] [timestamp]

measurement: Name of the metric (required)
tags: Optional key-value pairs for categorizing data
fields: One or more key-value pairs of the actual metric values
timestamp: Optional timestamp in nanoseconds since Unix epoch
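
To show how the four parts fit together, here is a simplified parser for a single line. It is a sketch only and not the parser in lineproto.js: the function `parseLine` is hypothetical, it assumes numeric field values, and it does not handle escaped characters or quoted strings containing spaces.

```javascript
// Hypothetical, simplified parser for one Line Protocol line.
// Assumes numeric fields and no escaped spaces, commas or equals signs.
function parseLine(line) {
  // Up to three space-separated sections: measurement+tags, fields, timestamp.
  const [head, fieldPart, ts] = line.trim().split(" ");
  if (!head || !fieldPart) throw new Error(`invalid line: ${line}`);

  const [measurement, ...tagPairs] = head.split(",");
  const tags = Object.fromEntries(tagPairs.map((pair) => pair.split("=")));
  const fields = Object.fromEntries(
    fieldPart.split(",").map((pair) => {
      const [key, value] = pair.split("=");
      return [key, Number(value)];
    })
  );
  // Nanosecond timestamps are kept as BigInt to avoid losing precision.
  const timestamp = ts === undefined ? null : BigInt(ts);
  return { measurement, tags, fields, timestamp };
}

const point = parseLine("cpu,host=server01,region=us-west usage_idle=92.6,usage_user=7.4");
console.log(point.measurement, point.tags.host, point.fields.usage_idle);
```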

Query Line Protocol Data

Query metrics using the same SQL interface:

# Query last 10 minutes of CPU metrics
curl -X POST http://localhost:9070/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT time, host, region, usage_idle, usage_user FROM cpu WHERE time >= '\''2025-02-09T16:00:00'\''"
  }'

# Aggregate metrics by host
curl -X POST http://localhost:9070/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT host, avg(used_percent) as avg_used FROM memory GROUP BY host ORDER BY avg_used DESC"
  }'

The Line Protocol data is stored in Parquet files using the same directory structure and compaction strategy as HEP data, allowing for efficient querying and storage.
