ThemisDB Backup & Recovery Guide

A complete guide to backing up and recovering ThemisDB data with minimal data loss.

Backup Strategies
Online vs Offline Backups
Point-in-Time Recovery
Disaster Recovery Planning
Backup Verification
Restore Procedures
Cross-Region Backups
Automation Scripts

Backup Strategies

Full Backup

Complete database snapshot:

# Basic full backup
themisdb-backup \
  --backup-directory /backups/themisdb/full/$(date +%Y%m%d_%H%M%S) \
  --database production \
  --compress \
  --threads 8

# With encryption
themisdb-backup \
  --backup-directory /backups/themisdb/full/$(date +%Y%m%d_%H%M%S) \
  --database production \
  --compress \
  --encrypt \
  --encryption-key-file /etc/themisdb/backup.key \
  --threads 8

# All databases
themisdb-backup \
  --backup-directory /backups/themisdb/full/$(date +%Y%m%d_%H%M%S) \
  --all-databases \
  --compress \
  --include-system-collections

Pros:

Simple to manage
Fast restore
Complete point-in-time snapshot

Cons:

Requires significant storage
Takes longer for large databases
No granular recovery between backups

Best For: Small to medium databases (< 500 GB), weekly/monthly archival

Incremental Backup

Backs up only the changes made since the previous backup (full or incremental):

# Day 1: Full backup
themisdb-backup \
  --backup-directory /backups/themisdb/full/2024-01-24 \
  --database production \
  --type full \
  --compress

# Day 2-7: Incremental backups
themisdb-backup \
  --backup-directory /backups/themisdb/incremental/2024-01-25 \
  --database production \
  --type incremental \
  --base-backup /backups/themisdb/full/2024-01-24 \
  --compress

# Restore process (requires full + all incrementals)
themisdb-restore \
  --backup-directory /backups/themisdb/full/2024-01-24 \
  --incremental-directories \
    /backups/themisdb/incremental/2024-01-25 \
    /backups/themisdb/incremental/2024-01-26 \
  --database production

Pros:

Faster backup (only changes)
Less storage space
More frequent backups possible

Cons:

Slower restore (chain required)
Backup chain complexity
Cannot skip incrementals

Best For: Large databases (> 500 GB), daily backups
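Because a restore needs every incremental in order, it can help to check the chain for gaps before attempting one. The sketch below assumes incremental backups live in directories named by date (YYYY-MM-DD), as in the example above. The function name check_chain is hypothetical and not part of any ThemisDB tool, and the date arithmetic relies on GNU date.

```shell
#!/bin/bash
# check_chain FULL_DATE INCR_ROOT
# Prints the incremental directories newer than FULL_DATE in restore order.
# Returns 1 if a day is missing from the chain.
check_chain() {
  local full_date=$1 incr_root=$2
  local expected dir name
  expected=$(date -u -d "$full_date +1 day" +%Y-%m-%d)

  # Date-named directories sort chronologically under the shell's glob ordering.
  for dir in "$incr_root"/*/; do
    [[ -d $dir ]] || continue
    name=$(basename "$dir")
    [[ $name > $full_date ]] || continue   # ignore incrementals older than the full backup
    if [[ $name != "$expected" ]]; then
      echo "gap in chain: expected $expected, found $name" >&2
      return 1
    fi
    echo "${dir%/}"
    expected=$(date -u -d "$expected +1 day" +%Y-%m-%d)
  done
  return 0
}
```

The printed list could be passed to --incremental-directories; a non-zero return means the chain cannot be restored past the gap.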

Differential Backup

Backs up all changes made since the last full backup, regardless of any differentials taken in between:

# Week 1: Full backup
themisdb-backup \
  --backup-directory /backups/themisdb/full/week1 \
  --type full

# Days 2-7: Differential backups
for day in {2..7}; do
  themisdb-backup \
    --backup-directory /backups/themisdb/diff/week1_day${day} \
    --type differential \
    --base-backup /backups/themisdb/full/week1
done

# Restore (only need full + latest differential)
themisdb-restore \
  --backup-directory /backups/themisdb/full/week1 \
  --differential-directory /backups/themisdb/diff/week1_day7

Pros:

Simpler restore than incremental
Good balance of speed/space
Skip intermediate backups

Cons:

Grows larger each day
More storage than incremental

Best For: Medium databases, balance between backup speed and restore speed
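The storage trade-off between the two approaches can be estimated with simple arithmetic. This is a rough sketch that assumes a constant daily change volume and no overlap between the changes of different days, so the differential figure is an upper bound. The function name storage_compare is illustrative.

```shell
#!/bin/bash
# storage_compare DAILY_GB DAYS
# Cumulative storage for DAYS days of incremental vs differential backups.
storage_compare() {
  local daily_gb=$1 days=$2
  local incremental=0 differential=0 day
  for (( day = 1; day <= days; day++ )); do
    incremental=$(( incremental + daily_gb ))           # one day of changes per backup
    differential=$(( differential + daily_gb * day ))   # all changes since the full backup
  done
  echo "incremental=${incremental}GB differential=${differential}GB"
}

storage_compare 10 6   # prints: incremental=60GB differential=210GB
```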

Snapshot Backup

Filesystem/storage-level snapshots:

# LVM snapshot
lvcreate -L 50G -s -n themisdb-snapshot /dev/vg0/themisdb-data

# Mount snapshot (read-only)
mkdir -p /mnt/themisdb-snapshot
mount -o ro /dev/vg0/themisdb-snapshot /mnt/themisdb-snapshot

# Backup from snapshot
tar -czf /backups/themisdb-snapshot-$(date +%Y%m%d).tar.gz \
  -C /mnt/themisdb-snapshot .

# Remove snapshot after backup
umount /mnt/themisdb-snapshot
lvremove -f /dev/vg0/themisdb-snapshot

# ZFS snapshot (build the name once so both commands refer to the same snapshot,
# even if the backup runs across midnight)
SNAP="tank/themisdb@backup-$(date +%Y%m%d)"
zfs snapshot "$SNAP"
zfs send "$SNAP" | gzip > /backups/themisdb.zfs.gz

# Btrfs snapshot
btrfs subvolume snapshot -r /var/lib/themisdb /var/lib/themisdb-snapshot-$(date +%Y%m%d)

Pros:

Instant snapshot
Minimal impact on database
Consistent point-in-time

Cons:

Requires specific filesystem
Additional storage management
OS-level dependency

Best For: Very large databases (> 1 TB), minimal backup window

Continuous Backup (WAL Archiving)

Archive transaction logs for point-in-time recovery:

# themisdb.conf
wal:
  enabled: true
  directory: /var/lib/themisdb/wal
  
  # Archive to separate location
  archiveMode: on
  archiveCommand: |
    cp %p /backups/themisdb/wal_archive/%f && \
    s3cmd put %p s3://backups/themisdb/wal/%f

  archiveTimeout: 300  # Force archive every 5 minutes
  maxWalSize: 1GB

Archive Script:

#!/bin/bash
# wal_archive.sh

WAL_FILE=$1
ARCHIVE_DIR="/backups/themisdb/wal_archive"
S3_BUCKET="s3://backups/themisdb/wal"

# Local archive, then remote archive; report failure if either copy fails
if cp "$WAL_FILE" "$ARCHIVE_DIR/" && aws s3 cp "$WAL_FILE" "$S3_BUCKET/"; then
  echo "$(date): Archived $WAL_FILE" >> /var/log/themisdb/wal_archive.log
  exit 0
else
  echo "$(date): Failed to archive $WAL_FILE" >> /var/log/themisdb/wal_archive.log
  exit 1
fi

Pros:

Minimal data loss (seconds)
Point-in-time recovery
Continuous protection

Cons:

Additional storage
More complex restore
Network dependency

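Best For: Critical production systems that need a low RPO. The worst-case RPO is bounded by archiveTimeout (5 minutes in the example above), so an RPO under 1 minute requires a correspondingly shorter timeout.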
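An archive step is safer when a half-written file can never appear under its final name. The following is a minimal sketch of that idea, not ThemisDB's own archiver: it copies to a temporary name, compares the copy with the source, and only then renames it. The function name archive_wal is hypothetical.

```shell
#!/bin/bash
# archive_wal SRC DEST_DIR
# Copies SRC into DEST_DIR under a temporary name, verifies the copy,
# and only then renames it into place. Never overwrites an existing archive.
archive_wal() {
  local src=$1 dest_dir=$2
  local name tmp
  name=$(basename "$src")

  if [[ -e "$dest_dir/$name" ]]; then
    echo "refusing to overwrite $dest_dir/$name" >&2
    return 1
  fi

  tmp=$(mktemp "$dest_dir/.$name.XXXXXX") || return 1
  if cp "$src" "$tmp" && cmp -s "$src" "$tmp"; then
    mv "$tmp" "$dest_dir/$name"   # rename is atomic within one filesystem
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Refusing to overwrite means a retried archive of the same segment fails loudly instead of silently replacing a file that a restore may depend on.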

Online vs Offline Backups

Online Backups (Hot Backup)

Backup while database is running:

# Create consistent backup without stopping database
themisdb-backup \
  --backup-directory /backups/themisdb/online/$(date +%Y%m%d_%H%M%S) \
  --database production \
  --online \
  --consistent-snapshot \
  --compress

# Monitor backup progress
themisdb-admin backup-status

# For minimal impact, limit I/O
ionice -c 3 themisdb-backup \
  --backup-directory /backups/themisdb/ \
  --database production \
  --online \
  --throttle 50MB/s

Advantages:

Zero downtime
24/7 backup capability
No service interruption

Considerations:

May impact performance during backup
Requires consistent snapshot capability
Longer backup time due to active writes

⚠️ Warning: Use --consistent-snapshot to ensure data consistency

Offline Backups (Cold Backup)

Backup with database stopped:

#!/bin/bash
# offline_backup.sh

# 1. Stop database
systemctl stop themisdb

# 2. Verify stopped
while pgrep themisdb-server > /dev/null; do
  echo "Waiting for database to stop..."
  sleep 2
done

# 3. Perform backup
BACKUP_DIR="/backups/themisdb/offline/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Copy data directory
tar -czf "$BACKUP_DIR/data.tar.gz" /var/lib/themisdb/

# Copy configuration
cp /etc/themisdb/themisdb.conf "$BACKUP_DIR/"

# Copy logs
tar -czf "$BACKUP_DIR/logs.tar.gz" /var/log/themisdb/

# 4. Calculate checksums
cd "$BACKUP_DIR"
sha256sum * > checksums.txt

# 5. Start database
systemctl start themisdb

# 6. Verify started
until curl -f http://localhost:8529/_api/version; do
  echo "Waiting for database to start..."
  sleep 2
done

echo "Backup complete: $BACKUP_DIR"

Advantages:

Guaranteed consistency
Faster backup
Simpler process

Considerations:

Requires downtime
Not suitable for 24/7 systems

Best For: Maintenance windows, development/staging environments
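With a cold backup, a failed backup step should not leave the database stopped. One way to structure this, sketched here with hypothetical names, is a wrapper that always attempts the start step and reports the backup's exit status afterwards.

```shell
#!/bin/bash
# cold_backup STOP_CMD BACKUP_CMD START_CMD
# Runs the backup between stop and start, and always attempts the start
# even if the backup step fails. Returns the backup step's exit status.
cold_backup() {
  local stop_cmd=$1 backup_cmd=$2 start_cmd=$3
  local status=0

  "$stop_cmd" || return 1            # nothing was stopped, so nothing to restart
  "$backup_cmd" || status=$?
  "$start_cmd" || { echo "database did not restart" >&2; return 2; }
  return "$status"
}

# Example wiring (commented out because it acts on a real service):
# stop_db()   { systemctl stop themisdb; }
# backup_db() { tar -czf /backups/themisdb/offline/data.tar.gz /var/lib/themisdb/; }
# start_db()  { systemctl start themisdb; }
# cold_backup stop_db backup_db start_db
```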

Point-in-Time Recovery

PITR Setup

Enable WAL archiving:

# themisdb.conf
wal:
  enabled: true
  archiveMode: on
  archiveCommand: "cp %p /backups/themisdb/wal_archive/%f"
  maxWalSize: 1GB
  keepWalFiles: 10

# Alternatively, archive to S3
wal:
  archiveCommand: "aws s3 cp %p s3://backups/themisdb/wal/$(date +%Y%m%d)/%f"

Recovery to Specific Time

Restore to exact point in time:

#!/bin/bash
# pitr_recovery.sh

TARGET_TIME="${1:-2024-01-24 14:30:00}"   # first argument overrides the example default
BASE_BACKUP="/backups/themisdb/full/2024-01-24"
WAL_ARCHIVE="/backups/themisdb/wal_archive"

echo "Performing PITR to: $TARGET_TIME"

# 1. Stop database
systemctl stop themisdb

# 2. Backup current state (safety)
mv /var/lib/themisdb /var/lib/themisdb.pre-pitr

# 3. Restore base backup
themisdb-restore \
  --backup-directory "$BASE_BACKUP" \
  --target /var/lib/themisdb

# 4. Create recovery configuration
cat > /var/lib/themisdb/recovery.conf << EOF
restore_command = 'cp $WAL_ARCHIVE/%f %p'
recovery_target_time = '$TARGET_TIME'
recovery_target_action = promote
EOF

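# 5. Start recovery in the background so this script can watch the log
#    (assumes themisdb-server runs in the foreground in recovery mode)
themisdb-server \
  --database.path /var/lib/themisdb \
  --database.recovery-mode &
RECOVERY_PID=$!

# Wait for recovery to complete
tail -f /var/log/themisdb/themisdb.log | grep -m 1 "recovery complete"

# 6. Stop the recovery instance, then start normal operations
kill "$RECOVERY_PID" 2>/dev/null
wait "$RECOVERY_PID" 2>/dev/null
systemctl start themisdb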

# 7. Verify recovery point
RECOVERED_TIME=$(themisdb-admin info | grep lastAppliedTimestamp | cut -d: -f2)
echo "Recovered to: $RECOVERED_TIME"

💡 Pro Tip: Always test PITR in non-production before relying on it.
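A recovery target that is malformed, in the future, or older than the base backup cannot be reached by WAL replay, so it is worth rejecting before the database is stopped. The check below is a sketch using GNU date. The function name validate_target_time is hypothetical, and the base backup time is assumed to be known to the caller.

```shell
#!/bin/bash
# validate_target_time TARGET BASE_BACKUP_TIME
# Both arguments use the "YYYY-MM-DD HH:MM:SS" format from pitr_recovery.sh.
# Succeeds only if TARGET parses, is not in the future, and is not older
# than the base backup.
validate_target_time() {
  local target=$1 base=$2
  local target_epoch base_epoch now_epoch

  target_epoch=$(date -d "$target" +%s 2>/dev/null) \
    || { echo "unparseable time: $target" >&2; return 1; }
  base_epoch=$(date -d "$base" +%s 2>/dev/null) \
    || { echo "unparseable time: $base" >&2; return 1; }
  now_epoch=$(date +%s)

  if (( target_epoch > now_epoch )); then
    echo "target time is in the future" >&2
    return 1
  fi
  if (( target_epoch < base_epoch )); then
    echo "target time precedes the base backup" >&2
    return 1
  fi
  return 0
}
```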

Recovery Timeline

Full Backup          Incremental     Incremental     CRASH
(Day 1)             (Day 2)         (Day 3)         (Day 3, 14:35)
    |                   |               |               |
    └───────────────────┴───────────────┴───────────────┘
                        WAL Files
                        
Recovery Process:
1. Restore full backup (Day 1)
2. Apply incremental backup (Day 2)
3. Apply incremental backup (Day 3)
4. Replay WAL files until 14:30 (5 minutes before crash)

Result: Database state at 14:30 on Day 3

Automated PITR Testing

#!/bin/bash
# test_pitr.sh

echo "=== Testing PITR Capability ==="

# 1. Create test data with timestamp
TIMESTAMP=$(date +%s)
themisdb-shell << EOF
db._databases();
db._useDatabase('test');
db.pitr_test.save({timestamp: $TIMESTAMP, data: 'PITR test'});
EOF

echo "Inserted test record with timestamp: $TIMESTAMP"

# 2. Wait 60 seconds
echo "Waiting 60 seconds..."
sleep 60

# 3. Insert more data (this should not be recovered)
themisdb-shell << EOF
db._useDatabase('test');
db.pitr_test.save({timestamp: $(date +%s), data: 'After PITR target'});
EOF

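# 4. Perform PITR to a point between the two inserts
#    (30 seconds after the first insert, so that insert is certain to be included)
TARGET_TIME=$(date -d "@$((TIMESTAMP + 30))" "+%Y-%m-%d %H:%M:%S")
./pitr_recovery.sh "$TARGET_TIME"

# 5. Verify recovery: the first record must be present, the later one absent
RECOVERED_COUNT=$(themisdb-shell --quiet << EOF
db._useDatabase('test');
db._query("FOR doc IN pitr_test FILTER doc.timestamp == $TIMESTAMP RETURN doc").toArray().length
EOF
)
LATER_COUNT=$(themisdb-shell --quiet << EOF
db._useDatabase('test');
db._query("FOR doc IN pitr_test FILTER doc.timestamp > $TIMESTAMP RETURN doc").toArray().length
EOF
)

if [[ $RECOVERED_COUNT -eq 1 && $LATER_COUNT -eq 0 ]]; then
  echo "✓ PITR test passed"
else
  echo "✗ PITR test failed"
  exit 1
fi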

Disaster Recovery Planning

Disaster Recovery Objectives

Define your requirements:

| Metric | Description | Example Target |
| --- | --- | --- |
| RTO (Recovery Time Objective) | Maximum acceptable downtime | 4 hours |
| RPO (Recovery Point Objective) | Maximum acceptable data loss | 15 minutes |
| RLO (Recovery Level Objective) | Minimum service level after recovery | 80% capacity |

DR Strategy Matrix

Select strategy based on requirements:

| Strategy | RTO | RPO | Cost | Complexity |
| --- | --- | --- | --- | --- |
| Backup/Restore | 4-8 hours | 24 hours | $ | Low |
| Backup/Restore + WAL | 2-4 hours | 15 minutes | $$ | Medium |
| Warm Standby | 30 minutes | 5 minutes | $$$ | Medium |
| Hot Standby (Multi-region) | 1 minute | 0 seconds | $$$$ | High |
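The matrix can be read as a lookup: take the cheapest row whose RTO and RPO both fit the requirement. The sketch below encodes the table's values in minutes, using the upper end of each RTO range as the worst case. The function name choose_strategy is illustrative.

```shell
#!/bin/bash
# choose_strategy RTO_MINUTES RPO_MINUTES
# Prints the first (cheapest) strategy whose worst-case RTO and RPO from the
# matrix fit within the stated requirements. Returns 1 if none fits.
choose_strategy() {
  local need_rto=$1 need_rpo=$2
  local name rto rpo
  # name|worst-case RTO (min)|RPO (min), ordered from cheapest to most expensive
  while IFS='|' read -r name rto rpo; do
    if (( rto <= need_rto && rpo <= need_rpo )); then
      echo "$name"
      return 0
    fi
  done <<'EOF'
Backup/Restore|480|1440
Backup/Restore + WAL|240|15
Warm Standby|30|5
Hot Standby (Multi-region)|1|0
EOF
  return 1
}

choose_strategy 240 15   # prints: Backup/Restore + WAL
```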

Disaster Recovery Plan

Complete DR runbook:

# ThemisDB Disaster Recovery Plan

## Emergency Contacts
- DBA Team: +1-555-0100 (24/7)
- Infrastructure: +1-555-0101
- Security: +1-555-0102

## Disaster Scenarios

### Scenario 1: Hardware Failure (Single Server)
**RTO:** 2 hours | **RPO:** 15 minutes

1. Declare incident
2. Provision new hardware
3. Restore from latest backup
4. Apply WAL files
5. Verify data integrity
6. Update DNS/load balancer
7. Resume operations

### Scenario 2: Data Corruption
**RTO:** 4 hours | **RPO:** 1 hour

1. Assess corruption extent
2. Stop application writes
3. Identify last good backup
4. Perform PITR to pre-corruption time
5. Verify recovered data
6. Resume operations

### Scenario 3: Complete Site Loss
**RTO:** 4 hours | **RPO:** 30 minutes

1. Activate DR site
2. Restore from off-site backup
3. Apply WAL from remote archive
4. Update application endpoints
5. Verify functionality
6. Resume operations

### Scenario 4: Ransomware Attack
**RTO:** 8 hours | **RPO:** 24 hours

1. Isolate infected systems
2. Verify backup integrity
3. Clean environment
4. Restore from verified backup
5. Security audit
6. Resume operations

DR Testing Schedule

# DR test schedule
# These are crontab entries, not a shell script: install them with "crontab -e"

# Quarterly full DR test
0 2 1 */3 * /opt/themisdb/dr/full_dr_test.sh

# Monthly restore test
0 3 1 * * /opt/themisdb/dr/restore_test.sh

# Weekly backup verification
0 4 * * 0 /opt/themisdb/dr/verify_backups.sh

# Daily PITR capability test
0 5 * * * /opt/themisdb/dr/test_pitr.sh

Full DR Test Script:

#!/bin/bash
# full_dr_test.sh

echo "=== Disaster Recovery Test ==="
echo "Date: $(date)"

# 1. Provision DR environment
echo "1. Provisioning DR environment..."
terraform apply -auto-approve -var-file=dr-environment.tfvars

# 2. Restore from backup
echo "2. Restoring from backup..."
LATEST_BACKUP=$(ls -t /backups/themisdb/full/ | head -1)
themisdb-restore \
  --backup-directory "/backups/themisdb/full/$LATEST_BACKUP" \
  --target-host dr-server:8529

# 3. Apply WAL files
echo "3. Applying WAL files..."
themisdb-admin wal-replay \
  --host dr-server:8529 \
  --wal-directory /backups/themisdb/wal_archive/

# 4. Run verification tests
echo "4. Running verification tests..."
./verify_dr_restore.sh dr-server:8529

# 5. Measure RTO
RECOVERY_TIME=$SECONDS
echo "Recovery Time: $RECOVERY_TIME seconds (Target: 7200 seconds)"

if [[ $RECOVERY_TIME -lt 7200 ]]; then
  echo "✓ RTO met"
else
  echo "✗ RTO exceeded"
fi

# 6. Cleanup DR environment
echo "6. Cleaning up DR environment..."
terraform destroy -auto-approve -var-file=dr-environment.tfvars

echo "=== DR Test Complete ==="

Backup Verification

Integrity Checking

Verify backup integrity:

#!/bin/bash
# verify_backup.sh

BACKUP_DIR=$1

echo "Verifying backup: $BACKUP_DIR"

# 1. Check checksums
echo "1. Verifying checksums..."
cd "$BACKUP_DIR" || exit 1
sha256sum -c checksums.txt || exit 1

# 2. Test decompression
echo "2. Testing decompression..."
shopt -s nullglob   # an empty match expands to nothing instead of the literal "*.gz"
for file in *.gz; do
  gunzip -t "$file" || exit 1
done

# 3. Verify backup metadata
echo "3. Checking metadata..."
themisdb-backup-info --backup-directory "$BACKUP_DIR" || exit 1

# 4. Test restore (to temporary location)
echo "4. Testing restore..."
TMP_DIR=$(mktemp -d)
trap 'rm -rf "$TMP_DIR"' EXIT   # clean up even when a later step fails
themisdb-restore \
  --backup-directory "$BACKUP_DIR" \
  --target "$TMP_DIR" \
  --verify-only || exit 1

# 5. Quick data sampling
echo "5. Sampling data..."
themisdb-server \
  --database.path "$TMP_DIR" \
  --server.endpoint none \
  --database.verify-only || exit 1

echo "✓ Backup verification passed"
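The checksum step depends on a checksums.txt manifest being present in the backup directory. A minimal sketch of producing and checking such a manifest with GNU coreutils and findutils follows. The function names are hypothetical, and whether themisdb-backup writes its own manifest is not assumed here.

```shell
#!/bin/bash
# write_manifest DIR
# Records a SHA-256 checksum for every file under DIR, excluding the manifest.
write_manifest() {
  ( cd "$1" && find . -type f ! -name checksums.txt -print0 \
      | sort -z | xargs -0 -r sha256sum > checksums.txt )
}

# verify_manifest DIR
# Succeeds only if every recorded checksum still matches the file on disk.
verify_manifest() {
  ( cd "$1" && sha256sum --quiet -c checksums.txt )
}
```

Both functions run in a subshell, so the caller's working directory is left unchanged.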

Automated Verification

#!/bin/bash
# automated_backup_verification.sh

# Run daily to verify last night's backup

BACKUP_DIR=$(ls -td /backups/themisdb/full/* | head -1)

echo "Daily Backup Verification - $(date)"
echo "Backup: $BACKUP_DIR"

# Verify backup
if ./verify_backup.sh "$BACKUP_DIR"; then
  echo "✓ Backup verification passed" | mail -s "Backup OK" admin@company.com
else
  echo "✗ Backup verification FAILED" | mail -s "ALERT: Backup Failed" admin@company.com
  exit 1
fi

# Update verification log
echo "$(date -Iseconds): $BACKUP_DIR - OK" >> /var/log/themisdb/backup_verification.log

Restore Testing

Regular restore tests:

#!/bin/bash
# monthly_restore_test.sh

echo "=== Monthly Restore Test ==="

# 1. Select a random backup from the last month
RANDOM_BACKUP=$(find /backups/themisdb/full/ -mindepth 1 -maxdepth 1 -mtime -31 -printf '%f\n' | shuf -n 1)
echo "Testing backup: $RANDOM_BACKUP"

# 2. Provision test environment (publish the port so the restore can reach it)
docker run -d \
  --name themisdb-restore-test \
  -p 8529:8529 \
  -v /tmp/restore-test:/var/lib/themisdb \
  themisdb/themisdb:latest

# 3. Restore backup
themisdb-restore \
  --backup-directory "/backups/themisdb/full/$RANDOM_BACKUP" \
  --target-host localhost:8529

# 4. Run data validation queries
themisdb-shell << EOF
// Verify collection counts
const collections = db._collections();
collections.forEach(coll => {
  print(\`\${coll.name()}: \${coll.count()} documents\`);
});

// Verify indexes
collections.forEach(coll => {
  const indexes = coll.getIndexes();
  print(\`\${coll.name()}: \${indexes.length} indexes\`);
});

// Sample query test
db._query("FOR doc IN users LIMIT 100 RETURN doc");
EOF

# 5. Performance test
themisdb-bench \
  --host localhost:8529 \
  --workload read \
  --duration 60 \
  --threads 4

# 6. Cleanup
docker stop themisdb-restore-test
docker rm themisdb-restore-test

echo "✓ Restore test complete"

Restore Procedures

Full Database Restore

Complete restore procedure:

#!/bin/bash
# full_restore.sh

set -e

BACKUP_DIR=$1
TARGET_DIR="/var/lib/themisdb"

if [[ -z "$BACKUP_DIR" ]]; then
  echo "Usage: $0 <backup-directory>"
  exit 1
fi

echo "=== Full Database Restore ==="
echo "Backup: $BACKUP_DIR"
echo "Target: $TARGET_DIR"

read -p "This will OVERWRITE existing data. Continue? (yes/no): " CONFIRM

if [[ "$CONFIRM" != "yes" ]]; then
  echo "Aborted."
  exit 1
fi

# 1. Stop database
echo "1. Stopping database..."
systemctl stop themisdb

# 2. Backup current data (safety)
echo "2. Backing up current state..."
mv "$TARGET_DIR" "${TARGET_DIR}.pre-restore.$(date +%s)"

# 3. Restore data
echo "3. Restoring data..."
themisdb-restore \
  --backup-directory "$BACKUP_DIR" \
  --target "$TARGET_DIR" \
  --verbose

# 4. Verify restored data
echo "4. Verifying restored data..."
themisdb-server \
  --database.path "$TARGET_DIR" \
  --database.verify-only

# 5. Fix permissions
echo "5. Fixing permissions..."
chown -R themisdb:themisdb "$TARGET_DIR"

# 6. Start database
echo "6. Starting database..."
systemctl start themisdb

# 7. Wait for startup
echo "7. Waiting for database..."
until curl -f http://localhost:8529/_api/version 2>/dev/null; do
  sleep 2
done

# 8. Run post-restore checks
echo "8. Running post-restore checks..."
themisdb-admin verify-all

echo "✓ Restore complete"
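The startup loop in the script waits indefinitely if the server never comes up. A bounded variant can be sketched as a small retry helper. The name wait_until is hypothetical, and the timeout and interval values in the usage line are placeholders.

```shell
#!/bin/bash
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or TIMEOUT_SECONDS have elapsed.
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))

  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Usage with the readiness probe from full_restore.sh
# (commented out because it needs a running server):
# wait_until 120 2 curl -sf -o /dev/null http://localhost:8529/_api/version
```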

Selective Collection Restore

Restore specific collections:

# Restore single collection
themisdb-restore \
  --backup-directory /backups/themisdb/full/2024-01-24 \
  --database production \
  --collection users \
  --target-collection users_restored

# Restore multiple collections
for coll in users orders products; do
  themisdb-restore \
    --backup-directory /backups/themisdb/full/2024-01-24 \
    --database production \
    --collection "$coll"
done

# Restore with data transformation
themisdb-restore \
  --backup-directory /backups/themisdb/full/2024-01-24 \
  --database production \
  --collection users \
  --transform-script /opt/themisdb/transform_users.js

Cross-Version Restore

Restore from older version:

#!/bin/bash
# cross_version_restore.sh

OLD_BACKUP="/backups/themisdb/v1.3.5/full/2024-01-24"
NEW_VERSION="1.4.0"

# 1. Restore to temporary location
TEMP_DIR="/tmp/themisdb-migration"
themisdb-restore \
  --backup-directory "$OLD_BACKUP" \
  --target "$TEMP_DIR"

# 2. Start old version database (publish the port so the export can reach it)
docker run -d \
  --name themisdb-old \
  -p 8529:8529 \
  -v "$TEMP_DIR:/var/lib/themisdb" \
  themisdb/themisdb:1.3.5

# 3. Export data in portable format
themisdb-admin export \
  --host localhost:8529 \
  --database production \
  --output-directory /tmp/export \
  --format jsonl

# 4. Stop old version
docker stop themisdb-old

# 5. Start new version
docker run -d \
  --name themisdb-new \
  -p 8529:8529 \
  themisdb/themisdb:$NEW_VERSION

# 6. Import data
themisdb-admin import \
  --host localhost:8529 \
  --database production \
  --input-directory /tmp/export \
  --create-collections

# 7. Verify
themisdb-admin verify-all

echo "Cross-version restore complete: 1.3.5 → $NEW_VERSION"
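Whether the export/import path is needed depends on comparing the backup's version with the server's. A small sketch of that comparison using GNU sort's version ordering is shown below. How the backup version is obtained (for example from the directory name, as in OLD_BACKUP) is an assumption, and version_lt is a hypothetical helper.

```shell
#!/bin/bash
# version_lt A B
# Succeeds if version A sorts strictly before version B.
# Uses GNU sort -V, so 1.3.10 sorts after 1.3.5.
version_lt() {
  [[ $1 != "$2" ]] && [[ $(printf '%s\n' "$1" "$2" | sort -V | head -n 1) == "$1" ]]
}

BACKUP_VERSION="1.3.5"
SERVER_VERSION="1.4.0"
if version_lt "$BACKUP_VERSION" "$SERVER_VERSION"; then
  echo "backup is older than the server: use the export/import path"
fi
```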

Cross-Region Backups

Setup Cross-Region Replication

AWS S3 cross-region backup:

#!/bin/bash
# setup_cross_region_backup.sh

PRIMARY_REGION="us-east-1"
DR_REGION="us-west-2"
BUCKET_NAME="themisdb-backups"

# 1. Create buckets in both regions
aws s3 mb s3://${BUCKET_NAME}-${PRIMARY_REGION} --region $PRIMARY_REGION
aws s3 mb s3://${BUCKET_NAME}-${DR_REGION} --region $DR_REGION

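# 2. Enable versioning (replication requires it on both source and destination)
for region in $PRIMARY_REGION $DR_REGION; do
  aws s3api put-bucket-versioning \
    --bucket ${BUCKET_NAME}-${region} \
    --versioning-configuration Status=Enabled \
    --region $region
done

# 3. Setup replication
# Replace the example account ID in the role ARN with your own 12-digit ID.
# A rule with a Filter also needs DeleteMarkerReplication, and ReplicationTime
# must be paired with Metrics.
cat > replication-config.json << EOF
{
  "Role": "arn:aws:iam::123456789:role/s3-replication-role",
  "Rules": [{
    "Status": "Enabled",
    "Priority": 1,
    "Filter": {},
    "DeleteMarkerReplication": { "Status": "Disabled" },
    "Destination": {
      "Bucket": "arn:aws:s3:::${BUCKET_NAME}-${DR_REGION}",
      "ReplicationTime": {
        "Status": "Enabled",
        "Time": {
          "Minutes": 15
        }
      },
      "Metrics": {
        "Status": "Enabled",
        "EventThreshold": {
          "Minutes": 15
        }
      }
    }
  }]
}
EOF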

aws s3api put-bucket-replication \
  --bucket ${BUCKET_NAME}-${PRIMARY_REGION} \
  --replication-configuration file://replication-config.json

echo "Cross-region replication configured"

Backup to Multiple Regions

#!/bin/bash
# multi_region_backup.sh

BACKUP_DIR="/backups/themisdb/$(date +%Y%m%d_%H%M%S)"
REGIONS=("us-east-1" "us-west-2" "eu-west-1")

# 1. Create local backup
themisdb-backup \
  --backup-directory "$BACKUP_DIR" \
  --database production \
  --compress \
  --encrypt

# 2. Upload to all regions in parallel
for region in "${REGIONS[@]}"; do
  (
    echo "Uploading to $region..."
    aws s3 sync "$BACKUP_DIR" \
      s3://themisdb-backups-$region/$(date +%Y%m%d)/ \
      --region $region \
      --storage-class STANDARD_IA
    
    echo "✓ Upload to $region complete"
  ) &
done

# Wait for all uploads
wait

echo "Multi-region backup complete"

# 3. Verify uploads
for region in "${REGIONS[@]}"; do
  echo "Verifying $region..."
  aws s3 ls s3://themisdb-backups-$region/$(date +%Y%m%d)/ --region $region
done
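A bare wait does not report whether any of the background uploads failed. One way to collect per-region results, sketched with hypothetical function names, is to record each job's PID and wait on them individually.

```shell
#!/bin/bash
# upload_all COMMAND REGION...
# Runs "COMMAND REGION" for every region in parallel and returns non-zero
# if any of them failed, naming the regions that did.
upload_all() {
  local cmd=$1
  shift
  local -a regions=("$@") pids=()
  local region i failed=0

  for region in "${regions[@]}"; do
    "$cmd" "$region" &
    pids+=("$!")
  done

  for i in "${!pids[@]}"; do
    if ! wait "${pids[$i]}"; then
      echo "upload to ${regions[$i]} failed" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example wiring (commented out because it needs AWS credentials):
# upload_one() { aws s3 sync "$BACKUP_DIR" "s3://themisdb-backups-$1/$(date +%Y%m%d)/" --region "$1"; }
# upload_all upload_one us-east-1 us-west-2 eu-west-1
```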

Cross-Region Restore

#!/bin/bash
# cross_region_restore.sh

DR_REGION="us-west-2"
BACKUP_DATE="20240124"   # matches the %Y%m%d prefix written by multi_region_backup.sh

echo "Restoring from DR region: $DR_REGION"

# 1. Download backup from DR region
aws s3 sync \
  s3://themisdb-backups-$DR_REGION/$BACKUP_DATE/ \
  /tmp/dr-restore/ \
  --region $DR_REGION

# 2. Verify download
cd /tmp/dr-restore/
sha256sum -c checksums.txt || exit 1

# 3. Restore
themisdb-restore \
  --backup-directory /tmp/dr-restore/ \
  --target /var/lib/themisdb/

# 4. Cleanup
rm -rf /tmp/dr-restore/

echo "Cross-region restore complete"

Automation Scripts

Daily Backup Script

#!/bin/bash
# daily_backup.sh

set -e

BACKUP_BASE="/backups/themisdb"
DATE=$(date +%Y%m%d)
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

# Logging
exec 1> >(logger -s -t themisdb-backup) 2>&1

echo "Starting daily backup: $TIMESTAMP"

# Full backup on Sundays, incremental otherwise
if [[ $(date +%u) -eq 7 ]]; then
  BACKUP_TYPE="full"
  BACKUP_DIR="$BACKUP_BASE/full/$DATE"
else
  BACKUP_TYPE="incremental"
  BACKUP_DIR="$BACKUP_BASE/incremental/$DATE"
  LAST_FULL=$(ls -td $BACKUP_BASE/full/* | head -1)
fi

# Create backup
if [[ "$BACKUP_TYPE" == "full" ]]; then
  themisdb-backup \
    --backup-directory "$BACKUP_DIR" \
    --all-databases \
    --compress \
    --encrypt \
    --threads 8 \
    --verbose
else
  themisdb-backup \
    --backup-directory "$BACKUP_DIR" \
    --all-databases \
    --type incremental \
    --base-backup "$LAST_FULL" \
    --compress \
    --encrypt \
    --threads 8 \
    --verbose
fi

# Verify backup
if ./verify_backup.sh "$BACKUP_DIR"; then
  echo "✓ Backup successful: $BACKUP_DIR"
else
  echo "✗ Backup verification failed!"
  exit 1
fi

# Upload to S3
aws s3 sync "$BACKUP_DIR" \
  s3://themisdb-backups/$(hostname)/$DATE/ \
  --storage-class STANDARD_IA

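# Cleanup old backups (match top-level backup directories only, so find does not
# descend into directories it has just removed and fail under set -e)
find "$BACKUP_BASE/incremental" -mindepth 1 -maxdepth 1 -mtime +"$RETENTION_DAYS" -exec rm -rf {} +
find "$BACKUP_BASE/full" -mindepth 1 -maxdepth 1 -mtime +90 -exec rm -rf {} +  # Keep full backups 90 days

# Archive old WAL files
find /var/lib/themisdb/wal -type f -mtime +7 -exec \
  aws s3 cp {} "s3://themisdb-backups/$(hostname)/wal/" \; -delete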

echo "Daily backup complete"
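Retention cleanup is easier to trust when it can be previewed first. The sketch below wraps the find-based pruning in a function with a dry-run mode. The name prune_backups and the --dry-run argument are hypothetical and belong to this example only.

```shell
#!/bin/bash
# prune_backups DIR DAYS [--dry-run]
# Removes top-level backup directories under DIR whose modification time is
# more than DAYS days old. With --dry-run it only lists what would be removed.
prune_backups() {
  local dir=$1 days=$2 mode=${3:-}
  if [[ $mode == "--dry-run" ]]; then
    find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
  else
    find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +
  fi
}

# Example (commented out because it deletes data):
# prune_backups /backups/themisdb/incremental 30 --dry-run
# prune_backups /backups/themisdb/incremental 30
```

Limiting the search to depth 1 means an old file inside a recent backup directory is never removed on its own.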

Backup Monitoring Script

#!/bin/bash
# monitor_backups.sh

# Check if daily backup completed successfully

EXPECTED_BACKUP="/backups/themisdb/*/$(date +%Y%m%d)*"
BACKUP_AGE_HOURS=24

# Find latest backup
LATEST_BACKUP=$(ls -td $EXPECTED_BACKUP 2>/dev/null | head -1)

if [[ -z "$LATEST_BACKUP" ]]; then
  echo "CRITICAL: No backup found for today" | \
    mail -s "ALERT: Backup Missing" admin@company.com
  exit 2
fi

# Check backup age
BACKUP_TIME=$(stat -c %Y "$LATEST_BACKUP")
CURRENT_TIME=$(date +%s)
AGE_HOURS=$(( ($CURRENT_TIME - $BACKUP_TIME) / 3600 ))

if [[ $AGE_HOURS -gt $BACKUP_AGE_HOURS ]]; then
  echo "WARNING: Latest backup is $AGE_HOURS hours old" | \
    mail -s "WARNING: Old Backup" admin@company.com
  exit 1
fi

# Verify backup integrity
if ! ./verify_backup.sh "$LATEST_BACKUP"; then
  echo "CRITICAL: Backup verification failed" | \
    mail -s "ALERT: Backup Corrupted" admin@company.com
  exit 2
fi

echo "OK: Backup is recent and valid"
exit 0
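The age check can be factored into a function that returns the same exit codes as the script (0 OK, 1 warning, 2 critical), which makes it easy to call from other monitoring tools. This is a sketch that relies on GNU stat. The name backup_status is hypothetical.

```shell
#!/bin/bash
# backup_status PATH MAX_AGE_HOURS
# Prints a one-line status and returns 0 (OK), 1 (WARNING: too old) or
# 2 (CRITICAL: missing), mirroring the exit codes of monitor_backups.sh.
backup_status() {
  local path=$1 max_age=$2
  local mtime age_hours

  if [[ ! -e $path ]]; then
    echo "CRITICAL: $path not found"
    return 2
  fi

  mtime=$(stat -c %Y "$path")                        # GNU stat: mtime as epoch seconds
  age_hours=$(( ( $(date +%s) - mtime ) / 3600 ))

  if (( age_hours > max_age )); then
    echo "WARNING: backup is $age_hours hours old"
    return 1
  fi
  echo "OK: backup is $age_hours hours old"
}
```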

Automated Restore Testing

#!/bin/bash
# automated_restore_test.sh

# Weekly automated restore test

TEST_ENV="restore-test-$(date +%Y%m%d)"
LATEST_BACKUP=$(ls -td /backups/themisdb/full/* | head -1)

echo "=== Automated Restore Test ==="
echo "Backup: $LATEST_BACKUP"
echo "Test Environment: $TEST_ENV"

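# 1. Provision test environment
docker run -d \
  --name "$TEST_ENV" \
  -p 9529:8529 \
  themisdb/themisdb:latest

# Remove the container on every exit path so a failed run does not keep the
# name and port occupied for the next run
trap 'docker rm -f "$TEST_ENV" > /dev/null 2>&1 || true' EXIT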

# 2. Restore backup
if ! themisdb-restore \
  --backup-directory "$LATEST_BACKUP" \
  --target-host localhost:9529; then
  
  echo "CRITICAL: Restore failed!" | \
    mail -s "ALERT: Restore Test Failed" admin@company.com
  exit 1
fi

# 3. Run validation tests
if ! ./validate_restored_data.sh localhost:9529; then
  echo "CRITICAL: Data validation failed!" | \
    mail -s "ALERT: Restore Validation Failed" admin@company.com
  exit 1
fi

# 4. Cleanup
docker stop $TEST_ENV
docker rm $TEST_ENV

echo "✓ Automated restore test passed" | \
  mail -s "OK: Weekly Restore Test Passed" admin@company.com

Quick Reference

Backup Commands Cheatsheet

# Full backup
themisdb-backup --backup-directory /backups/full/$(date +%Y%m%d) --all-databases

# Incremental backup
themisdb-backup --type incremental --base-backup /backups/full/latest

# Single collection backup
themisdb-backup --database mydb --collection users --output users_backup.jsonl

# Compressed encrypted backup
themisdb-backup --compress --encrypt --encryption-key-file /etc/themisdb/key

# Verify backup
themisdb-restore --verify-only --backup-directory /backups/full/20240124

# Restore
themisdb-restore --backup-directory /backups/full/20240124

# PITR
themisdb-admin wal-replay --target-time "2024-01-24 14:30:00"

Backup Schedule Template

Daily:    Incremental backup + WAL archiving
Weekly:   Full backup (Sunday)
Monthly:  Full backup + off-site copy
Quarterly: DR test with full restore
Annually:  DR full-site failover test

Retention:
- Daily incrementals: 7 days
- Weekly full: 4 weeks
- Monthly full: 12 months
- Yearly archive: 7 years
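The retention policy translates directly into a steady-state storage estimate. The sketch below assumes every retained copy is stored separately at the given size and ignores compression and WAL archives. The function name retention_storage_gb is illustrative.

```shell
#!/bin/bash
# retention_storage_gb FULL_GB INCREMENTAL_GB
# Steady-state storage for the template: 7 daily incrementals,
# 4 weekly fulls, 12 monthly fulls and 7 yearly archives.
retention_storage_gb() {
  local full_gb=$1 incr_gb=$2
  echo $(( 7 * incr_gb + (4 + 12 + 7) * full_gb ))
}

retention_storage_gb 100 5   # 7*5 + 23*100, prints 2335
```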

Last Updated: 2026-04-06
Version: 1.4.0
Maintainer: ThemisDB Team
