Setup examples #1

Merged 12 commits on Dec 15, 2023
249 changes: 12 additions & 237 deletions README.md
@@ -2,10 +2,10 @@

### Prerequisites

- have the latest version of `prometheus_fdw` extension enabled in your instance.
- have `pg_partman` and `pg_cron` installed and enabled.
- Install `prometheus_fdw`
- (Optional) install `pg_partman` and `pg_cron`

### Set-up
### Quick start

`create extension prometheus_fdw;`

@@ -32,254 +32,29 @@ Create Foreign Table:
CREATE FOREIGN TABLE IF NOT EXISTS metrics (
  metric_name TEXT,
  metric_labels JSONB,
  metric_time BIGINT,
  metric_value FLOAT8
)
server my_prometheus_server
options (
  object 'metrics',
  step '10m'
);
```


Create tables to store information locally:

```
CREATE TABLE IF NOT EXISTS metrics_local (
metric_name TEXT,
metric_labels JSONB,
metric_time BIGINT,
metric_value FLOAT8
);

-- Create metric_labels table
CREATE TABLE public.metric_labels (
id BIGSERIAL NOT NULL,
name TEXT NOT NULL,
labels JSONB,
PRIMARY KEY (id),
UNIQUE (name, labels)
);

-- Create metric_values table
CREATE TABLE public.metric_values (
label_id bigint NOT NULL,
"time" bigint NOT NULL,
value double precision NOT NULL,
id serial NOT NULL,
PRIMARY KEY (id, "time"),
UNIQUE (label_id, "time"),
FOREIGN KEY (label_id) REFERENCES public.metric_labels(id)
) PARTITION BY RANGE ("time");
```

## Queries

To run the FDW directly and inspect the values:

```
SELECT
*
FROM metrics
WHERE
metric_name='container_cpu_usage_seconds_total'
AND metric_time > 1696046800 AND metric_time < 1696133000;
```

To store the results in your local database for later use:
```
INSERT INTO metrics_local
SELECT * FROM metrics
WHERE
metric_name='container_cpu_usage_seconds_total'
AND metric_time > 1696046800 AND metric_time < 1696133000;
```

To save label metadata for the long term and/or analysis:
```
INSERT INTO public.metric_labels (name, labels)
SELECT
metric_name,
metric_labels
FROM metrics_local
WHERE
metric_time > 1696046800 AND metric_time < 1696133000
AND metric_name = 'container_cpu_usage_seconds_total'
ON CONFLICT (name, labels) DO NOTHING;
```

To store values for the long term and/or analysis:
```
INSERT INTO metric_values (label_id, time, value)
SELECT
    mlab.id,
    ml.metric_time,
    ml.metric_value
FROM
    metrics_local ml
INNER JOIN
    metric_labels mlab
ON
    ml.metric_name = mlab.name
    AND ml.metric_labels = mlab.labels
ON CONFLICT (label_id, time) DO NOTHING;
```
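Once stored, the normalized tables can be joined to read the metrics back. A sketch against the schema above (the `container` label key is an assumption about your Prometheus data, not something the schema guarantees):

```
SELECT
    mlab.labels ->> 'container' AS container,  -- hypothetical label key
    to_timestamp(mv.time) AS sample_time,
    mv.value
FROM metric_values mv
INNER JOIN metric_labels mlab ON mv.label_id = mlab.id
WHERE mlab.name = 'container_cpu_usage_seconds_total'
ORDER BY mv.time DESC
LIMIT 100;
```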

### Performance improvements:

#### Indexing:

```
CREATE INDEX IF NOT EXISTS metric_labels_labels_idx ON metric_labels USING GIN (labels);

CREATE INDEX IF NOT EXISTS metric_values_id_time_idx on metric_values USING btree (id, time DESC);

CREATE INDEX IF NOT EXISTS metric_values_time_idx on metric_values USING btree (time DESC);
```

#### Partitioning:
To create partitions 30 days into the past and future:
```
SELECT create_parent(
p_parent_table := 'public.metric_values',
p_control := 'time',
p_type := 'native',
p_interval := 'daily',
p_automatic_maintenance := 'on',
p_start_partition := '2023-09-06', -- Adjust the date to 30 days in the past from the current date
p_epoch := 'seconds',
p_premake := 30 -- Create partitions 30 days into the future
);
```
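Rather than hard-coding `p_start_partition`, the date 30 days in the past can be computed at call time. A sketch of the same `create_parent` call:

```
SELECT create_parent(
    p_parent_table := 'public.metric_values',
    p_control := 'time',
    p_type := 'native',
    p_interval := 'daily',
    p_automatic_maintenance := 'on',
    p_start_partition := (now() - interval '30 days')::date::text,
    p_epoch := 'seconds',
    p_premake := 30
);
```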

To create new partitions and drop old ones automatically, schedule pg_partman's maintenance with pg_cron:
```
SELECT cron.schedule('0 3 * * *', $$CALL partman.run_maintenance()$$);
```

You can manually set the data retention period:
```
UPDATE public.part_config
SET retention = '30 days'
WHERE parent_table = 'public.metric_values';
```
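To confirm the retention setting took effect, inspect the corresponding `part_config` row. A sketch:

```
SELECT parent_table, retention, retention_keep_table
FROM part_config
WHERE parent_table = 'public.metric_values';
```

Note that with `retention_keep_table = true` (pg_partman's default), expired partitions are detached rather than dropped.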

### Queries using pg_cron

To continuously update the tables, create functions that perform the tasks described above:
```
CREATE OR REPLACE FUNCTION insert_metrics() RETURNS void LANGUAGE plpgsql AS $$
DECLARE
start_time BIGINT;
end_time BIGINT;
BEGIN
start_time := EXTRACT(epoch FROM now() - interval '1 hour' + interval '1 second')::BIGINT;
end_time := EXTRACT(epoch FROM now())::BIGINT;

EXECUTE format(
'INSERT INTO metrics_local
SELECT * FROM metrics
WHERE
metric_name = ''container_cpu_usage_seconds_total''
AND metric_time > %s
AND metric_time < %s;',
start_time,
end_time
);
END;
$$;

CREATE OR REPLACE FUNCTION insert_metric_labels() RETURNS void LANGUAGE plpgsql AS $$
BEGIN
EXECUTE '
INSERT INTO public.metric_labels (name, labels)
SELECT
metric_name,
metric_labels
FROM metrics_local
ON CONFLICT (name, labels) DO NOTHING;
';
END;
$$;

CREATE OR REPLACE FUNCTION insert_metric_values() RETURNS void LANGUAGE plpgsql AS $$
BEGIN
EXECUTE '
    INSERT INTO metric_values (label_id, time, value)
    SELECT
        mlab.id,
        ml.metric_time,
        ml.metric_value
    FROM
        metrics_local ml
    INNER JOIN
        metric_labels mlab
    ON
        ml.metric_name = mlab.name
        AND ml.metric_labels = mlab.labels
    ON CONFLICT (label_id, time) DO NOTHING;
';
END;
$$;

CREATE OR REPLACE FUNCTION truncate_metrics_local() RETURNS void LANGUAGE plpgsql AS $$
BEGIN
EXECUTE 'TRUNCATE TABLE metrics_local;';
END;
$$;
```

Create a cron job to run the functions every hour
```
SELECT cron.schedule(
'10 * * * *',
$$
SELECT
insert_metrics(),
insert_metric_labels(),
insert_metric_values(),
truncate_metrics_local();
$$
);
```
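Scheduled jobs can be verified through pg_cron's metadata tables. A sketch (`cron.job_run_details` is available in recent pg_cron versions):

```
-- List registered jobs
SELECT jobid, schedule, command, active FROM cron.job;

-- Inspect recent runs
SELECT jobid, status, return_message, start_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 10;
```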

NOTE: Run this command after setting up each cron job
```
UPDATE cron.job SET nodename = '';
```

### Optional

To collect information about multiple metrics, define the insert_metrics function as follows:
```
CREATE OR REPLACE FUNCTION public.insert_metrics()
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
start_time BIGINT;
end_time BIGINT;
metric_name text;
metric_names text[] := ARRAY['container_cpu_usage_seconds_total', 'container_memory_working_set_bytes']; -- Add your metric names here
BEGIN
start_time := EXTRACT(epoch FROM now() - interval '1 hour' + interval '1 second')::BIGINT;
end_time := EXTRACT(epoch FROM now())::BIGINT;
## Examples

FOREACH metric_name IN ARRAY metric_names
LOOP
EXECUTE format(
'INSERT INTO metrics_local
SELECT * FROM metrics
WHERE
metric_name = %L
AND metric_time > %s
AND metric_time < %s;',
metric_name,
start_time,
end_time
);
END LOOP;
END;
$function$;
```
Please see the `examples/` directory to find a basic example and a practical example. In the practical example, metrics are automatically synced into the database using `pg_cron`, and automatically expired using `pg_partman`. Performance is optimized using indexes and partitioning.
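In outline, the practical example's automation reduces to two scheduled jobs. A sketch using the helper functions defined earlier in this diff; the exact contents live in `examples/1_practical_example/`:

```
-- Hourly: sync the last hour of metrics from Prometheus into local tables
SELECT cron.schedule('10 * * * *', $$
    SELECT insert_metrics(), insert_metric_labels(),
           insert_metric_values(), truncate_metrics_local();
$$);

-- Nightly: create upcoming partitions and expire old ones via pg_partman
SELECT cron.schedule('0 3 * * *', $$CALL partman.run_maintenance()$$);
```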
8 changes: 8 additions & 0 deletions examples/0_hello_world/Dockerfile
@@ -0,0 +1,8 @@
# This image is a normal Postgres container,
# configured for local dev (network access configs and password are insecure),
# plus the CLI tool "Trunk" for installing extensions from https://pgt.dev
#
# Source: https://github.com/tembo-io/tembo-images
FROM quay.io/tembo/tembo-local:latest

RUN trunk install prometheus_fdw --version 0.1.3
6 changes: 6 additions & 0 deletions examples/0_hello_world/README.md
@@ -0,0 +1,6 @@
# Prometheus FDW basic demo

- **Dockerfile**: A Postgres database with prometheus_fdw installed
- **setup-prometheus-fdw.sql**: A script showing how to configure the FDW
- **sample-query.sql**: A sample query against Prometheus
- **run-demo.sh**: A script showing the whole process of running the demo
23 changes: 23 additions & 0 deletions examples/0_hello_world/run-demo.sh
@@ -0,0 +1,23 @@
#!/bin/bash

# Build the image and run it
docker build -t example-local-image .
docker rm --force local-tembo
docker run -d -it --name local-tembo -p 5432:5432 --rm example-local-image

# Wait until Postgres accepts connections
until psql postgres://postgres:postgres@localhost:5432 -c "select 1" &> /dev/null; do
echo "Waiting for postgres to start..."
sleep 1
done
echo "Ready!"

# Run sample scripts
psql postgres://postgres:postgres@localhost:5432 -f ./setup-prometheus-fdw.sql

# Query window: 800s ago to 300s ago (BSD/macOS date syntax)
start_time=$(TZ=UTC date -j -v-800S +%s)
end_time=$(TZ=UTC date -j -v-300S +%s)

echo "Start time: $start_time"
echo "End time: $end_time"
psql postgres://postgres:postgres@localhost:5432 -v start_time=$start_time -v end_time=$end_time -f ./sample-query.sql
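Note that the `date -j -v-800S` flags used above are BSD/macOS-specific. With GNU coreutils `date` (typical on Linux), the same query window can be computed as follows; this is a sketch, not part of the committed script:

```shell
# Same 800s/300s lookback window using GNU date syntax
start_time=$(date -u -d '800 seconds ago' +%s)
end_time=$(date -u -d '300 seconds ago' +%s)
echo "Start time: $start_time"
echo "End time: $end_time"
```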
7 changes: 7 additions & 0 deletions examples/0_hello_world/sample-query.sql
@@ -0,0 +1,7 @@
SELECT
*
FROM metrics
WHERE
metric_name = 'container_cpu_usage_seconds_total'
AND metric_time > :start_time
AND metric_time < :end_time;
26 changes: 26 additions & 0 deletions examples/0_hello_world/setup-prometheus-fdw.sql
@@ -0,0 +1,26 @@
-- Enable the extension
CREATE EXTENSION prometheus_fdw;

-- Create the FDW
CREATE FOREIGN DATA WRAPPER prometheus_wrapper
HANDLER prometheus_fdw_handler
VALIDATOR prometheus_fdw_validator;

-- Configure connection to server
CREATE SERVER my_prometheus_server
FOREIGN DATA WRAPPER prometheus_wrapper
OPTIONS (
base_url 'https://prometheus-data-1.use1.dev.plat.cdb-svc.com/');

-- Create FDW table we can query to get metrics
CREATE FOREIGN TABLE metrics (
metric_name TEXT,
metric_labels JSONB,
metric_time BIGINT,
metric_value FLOAT8
)
SERVER my_prometheus_server
OPTIONS (
object 'metrics',
step '30s'
);
17 changes: 17 additions & 0 deletions examples/1_practical_example/Dockerfile
@@ -0,0 +1,17 @@
# This image is a normal Postgres container,
# configured for local dev (network access configs and password are insecure),
# plus the CLI tool "Trunk" for installing extensions from https://pgt.dev
#
# Source: https://github.com/tembo-io/tembo-images
# Documentation: https://tembo.io/docs/tembo-cloud/try-extensions-locally
FROM quay.io/tembo/tembo-local:latest

# We will use pg_partman to optimize storing and querying timeseries data
RUN trunk install pg_partman --version 4.7.4
# We will use pg_cron to schedule syncing prometheus into the local DB
RUN trunk install pg_cron --version 1.5.2

RUN trunk install prometheus_fdw --version 0.1.3

# Extra Postgres configurations
COPY custom.conf $PGDATA/extra-configs