Commit b1d05a4: initial commit 🎉

8 files changed, +304 -0 lines changed

.dockerignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
.*

.github/workflows/build.yaml

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
name: Build

on:
  push:
    branches:
      - main
    tags:
  workflow_dispatch:
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Login to GitHub container registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ secrets.GHCR_USERNAME }}
          password: ${{ secrets.GHCR_PASSWORD }}

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Docker metadata
        id: docker_metadata
        uses: docker/metadata-action@v4
        with:
          images: ghcr.io/significa/fly-pg-dump-to-s3
          tags: |
            type=sha
            type=ref,event=tag
            type=ref,event=branch
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}
          flavor: |
            latest=true

      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v3
        with:
          context: ./
          file: ./Dockerfile
          platforms: linux/amd64, linux/arm64, linux/386
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.docker_metadata.outputs.tags }}
          labels: ${{ steps.docker_metadata.outputs.labels }}

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
.env
.env*

Dockerfile

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
FROM alpine

RUN apk add --no-cache bash curl aws-cli postgresql-client && \
    curl -L https://fly.io/install.sh | sh

COPY ./pg-dump-to-s3.sh ./entrypoint.sh /

CMD [ "/entrypoint.sh" ]

LICENSE

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.

In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to <https://unlicense.org>

README.md

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
# Fly pg_dump to AWS S3

This is a **hacky** way to have a Fly app that dumps Postgres databases that are also on Fly to AWS S3 buckets.
It uses a dedicated app for the *backup worker*, which is woken up to start the dump. When it finishes, it is scaled back to 0, meaning it is not billable when idle*.

*The machine is not billable, but any volumes will be. This could be improved further so volumes are deleted. Volumes are required because the temporary disk is of an unknown, small size.


## Why this?

Fly's pg images do support wal-g configuration to S3 via env vars. But I wanted a way to create simple archives periodically with pg_dump, making it easy for developers to replicate databases and to have point-in-time recovery.

Since the backup worker runs on Fly, and not in some other external service like AWS or GitHub Actions, we can create backups rather quickly. The latency/bandwidth from Fly to AWS is also quite good (in the regions I've tested).

And what about Fly Machines? I haven't tried them.

## Requirements

1. A Fly Postgres instance and a user with read permissions.
   Create the `db_backup_worker` user with:
   ```sql
   CREATE USER db_backup_worker WITH PASSWORD '<password>';
   GRANT CONNECT ON DATABASE <db_name> TO db_backup_worker;
   -- For all schemas (example for public):
   GRANT USAGE ON SCHEMA public TO db_backup_worker;
   GRANT SELECT ON ALL TABLES IN SCHEMA public TO db_backup_worker;
   GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO db_backup_worker;
   ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO db_backup_worker;
   ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON SEQUENCES TO db_backup_worker;
   ```

2. An AWS S3 bucket and an access token with write permissions to it.
   IAM policy:
   ```json
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Sid": "WriteDatabaseBackups",
         "Effect": "Allow",
         "Action": [
           "s3:PutObject",
           "s3:AbortMultipartUpload",
           "s3:ListMultipartUploadParts"
         ],
         "Resource": [
           "arn:aws:s3:::your-s3-bucket/backup.tar.gz"
         ]
       }
     ]
   }
   ```


## Installation

1. Launch your database backup worker with `fly launch --image ghcr.io/significa/fly-pg-dump-to-s3`

2. Create a volume for temporary files with `fly volumes create --no-encryption --size $SIZE_IN_GB temp_data`

3. Add the volume to your `fly.toml`:
   ```toml
   [mounts]
     destination = "/tmp/db-backups"
     source = "temp_data"
   ```

4. Set the required Fly secrets (env vars). Example:
   ```env
   AWS_ACCESS_KEY_ID=XXXX
   AWS_SECRET_ACCESS_KEY=XXXX
   DATABASE_URL=postgresql://username:password@my-fly-db-instance.internal:5432/my_database
   S3_DESTINATON=s3://your-s3-bucket/backup.tar.gz
   FLY_API_TOKEN=XXXX
   ```

5. Run `flyctl scale count 1` whenever you want to start a backup. Add this to any periodic runner, along with the envs `FLY_APP` and `FLY_API_TOKEN`, to run it periodically.
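
   For example, here is a minimal sketch of a scheduled GitHub Actions workflow that wakes the worker up. The cron schedule and the `FLY_APP` value are placeholders, and it assumes a `FLY_API_TOKEN` repository secret:
   ```yaml
   name: Trigger database backup

   on:
     schedule:
       - cron: "0 3 * * *" # every day at 03:00 UTC
     workflow_dispatch:

   jobs:
     trigger-backup:
       runs-on: ubuntu-latest
       env:
         FLY_APP: my-db-backup-worker # placeholder: the backup worker app name
         FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
       steps:
         - name: Install flyctl
           run: curl -L https://fly.io/install.sh | sh

         - name: Wake up the backup worker
           run: ~/.fly/bin/flyctl scale count 1
   ```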


## What about backup history?

You could add a date to the `S3_DESTINATON` filename (by changing the Docker CMD), but I recommend enabling versioning on your S3 bucket and managing retention via S3 lifecycle policies.
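
If you do want dated filenames, one way (sketched here, not something this repo ships) is to extend the published image and override the CMD; the bucket name is a placeholder:

```dockerfile
FROM ghcr.io/significa/fly-pg-dump-to-s3

# Compute a dated S3 key at container start, then run the normal entrypoint.
CMD ["bash", "-c", "export S3_DESTINATON=s3://your-s3-bucket/backup-$(date +%Y-%m-%d).tar.gz && exec /entrypoint.sh"]
```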


## Back up multiple databases in one go?

Just use the env vars like so:

```env
BACKUP_CONFIGURATION_NAMES=ENV1,STAGING_ENVIRONMENT,test

ENV1_DATABASE_URL=postgresql://username:password@env1/my_database
ENV1_S3_DESTINATON=s3://sample-bucket/sample.tar.gz

STAGING_ENVIRONMENT_DATABASE_URL=postgresql://username:password@sample/staging
STAGING_ENVIRONMENT_S3_DESTINATON=s3://sample-db-backups/staging_backup.tar.gz

TEST_DATABASE_URL=postgresql://username:password@sample/test
TEST_S3_DESTINATON=s3://sample-db-backups/test_backup.tar.gz
```

It will back up each database to its desired S3 destination. The AWS and Fly tokens are reused.

## Env vars documentation

- `DATABASE_URL`: Postgres database URL. Example: `postgresql://username:password@test:5432/my_database`
- `S3_DESTINATON`: AWS S3 destination for the backup file (full path including the filename). Example: `s3://your-s3-bucket/backup.tar.gz`
- `BACKUP_CONFIGURATION_NAMES`: Optional: configuration names/prefixes for `DATABASE_URL` and `S3_DESTINATON`
- `FLY_APP_NAME`: Optional, used to scale the worker back down. Automatically set by Fly.
- `FLY_API_TOKEN`: Optional, used to scale the worker back down. A Fly API token created via flyctl or the UI.
- `BACKUPS_TEMP_DIR`: Optional: where the temporary files should go. Defaults to `/tmp/db-backups`
- `PG_DUMP_ARGS`: Optional: override the default `pg_dump` args: `--no-owner --clean --no-privileges --no-sync --jobs=4 --format=directory --compress=0`

## Is this hacky? Does it work in production environments?

Yes. Yes :sweat_smile:

## Will this work outside Fly?

Yes. If `FLY_APP_NAME` or `FLY_API_TOKEN` is not present, the fly commands are skipped.
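
For instance, a plain `docker run` invocation (all values are placeholders) could look like this:

```bash
docker run --rm \
  -e AWS_ACCESS_KEY_ID=XXXX \
  -e AWS_SECRET_ACCESS_KEY=XXXX \
  -e DATABASE_URL=postgresql://username:password@hostname:5432/my_database \
  -e S3_DESTINATON=s3://your-s3-bucket/backup.tar.gz \
  ghcr.io/significa/fly-pg-dump-to-s3
```

Note that the entrypoint sleeps after the backup finishes (it expects Fly to scale the machine down), so stop the container once the logs report the backup is done.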

entrypoint.sh

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
#!/bin/bash

set -e

BACKUP_CONFIGURATION_NAMES=${BACKUP_CONFIGURATION_NAMES:-}

backup () {
  local prefix=$1

  database_url_var_name="${prefix}DATABASE_URL"
  database_url=${!database_url_var_name}

  s3_destination_var_name="${prefix}S3_DESTINATON"
  s3_destination=${!s3_destination_var_name}

  if [[ -z $database_url || -z $s3_destination ]]; then
    echo "Required env vars: ${database_url_var_name}, ${s3_destination_var_name}"
    exit 1
  fi

  ./pg-dump-to-s3.sh "${database_url}" "${s3_destination}"
}

main () {
  if [[ -z $BACKUP_CONFIGURATION_NAMES ]]; then
    echo "Backup starting"
    backup ""
  else
    for configuration_name in ${BACKUP_CONFIGURATION_NAMES//,/ }; do
      echo "Backing up $configuration_name"
      backup "${configuration_name^^}_"
    done
  fi
}

main || echo "ERROR backing up, see the logs above."

if [[ -n $FLY_APP_NAME && -n $FLY_API_TOKEN ]]; then
  echo "Scaling $FLY_APP_NAME to 0"
  /root/.fly/bin/flyctl -a "$FLY_APP_NAME" scale count 0
fi

echo "Done! Sleeping..."
sleep infinity

pg-dump-to-s3.sh

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
#!/bin/bash

set -e

_USAGE="
Usage: ./pg-dump-to-s3.sh <database_url> <s3-destination>
Example:
  ./pg-dump-to-s3.sh postgresql://username:password@hostname:5432/my_database s3://my-bucket-name/my_backup.tar.gz
"

BACKUPS_TEMP_DIR=${BACKUPS_TEMP_DIR:-/tmp/db-backups}

# We are not using pg_dump's compression because we want concurrency; we compress with tar afterwards.
default_pg_dump_args="--no-owner --clean --no-privileges --no-sync --jobs=4 --format=directory --compress=0"
PG_DUMP_ARGS=${PG_DUMP_ARGS:-$default_pg_dump_args}

database_url=$1
destination=$2

if [[ -z "$database_url" || -z "$destination" ]]; then
  echo "$_USAGE"
  exit 1
fi

if [[ -z $AWS_ACCESS_KEY_ID || -z $AWS_SECRET_ACCESS_KEY ]]; then
  echo "Required env vars: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY"
  exit 1
fi

mkdir -p "${BACKUPS_TEMP_DIR}"

backup_dir="${BACKUPS_TEMP_DIR}/db_dump"
backup_filename="${BACKUPS_TEMP_DIR}/db_dump.tar.gz"

# In the future we could add a configuration option to prevent deletion of existing files
rm -rf "${backup_dir}" "${backup_filename}"

echo "Dumping database to ${backup_dir}"
pg_dump $PG_DUMP_ARGS \
  --dbname="${database_url}" \
  --file="${backup_dir}"

echo "Compressing backup to ${backup_filename}"
tar -czf "${backup_filename}" -C "${backup_dir}" .

echo "Uploading backup to ${destination}"
aws s3 cp --only-show-errors "${backup_filename}" "${destination}"

rm -rf "${backup_dir}" "${backup_filename}"

echo "Database backup finished!"
