Server: Watchdog #290

Merged: 11 commits, Jan 14, 2025
60 changes: 60 additions & 0 deletions .github/workflows/deploy_watchdog.yml
@@ -0,0 +1,60 @@
name: ⚖️ Deploy Watchdog Websockets
on:
  workflow_dispatch:
    inputs:
      container_name:
        type: string
        description: "Watched container name"
        required: true
        default: "map_websocket_server"
      image_name:
        type: string
        description: "Watchdog image and container name"
        required: true
        default: "map_watchdog"
      threshold:
        type: number
        description: "CPU threshold (in %)"
        required: true
        default: 60
      restart_threshold:
        type: number
        description: "Consecutive threshold hits before a restart"
        required: true
        default: 2
      interval:
        type: number
        description: "Checking interval (in seconds)"
        required: true
        default: 30
      log_file:
        type: string
        description: "Log file"
        required: true
        default: "watchdog.log"

run-name: Deploy Watchdog Websockets from '${{ github.ref_name }}' branch
jobs:
  deploy_websocket:
    runs-on: ubuntu-latest
    steps:
      - name: Redeploy Watchdog WebSockets
        uses: appleboy/ssh-action@v1.2.0
        with:
          host: ${{ secrets.CLOUD_SERVER_IP }}
          username: ${{ secrets.CLOUD_SERVER_USER }}
          key: ${{ secrets.CLOUD_SERVER_SSH_KEY }}
          script: |
            cd /root/ukraine_alarm_map/deploy/
            git fetch --all
            git switch ${{ github.ref_name }}
            git pull
            bash redeploy_watchdog.sh -c "${{ inputs.container_name }}" -w "${{ inputs.image_name }}" -t "${{ inputs.threshold }}" -r "${{ inputs.restart_threshold }}" -i "${{ inputs.interval }}" -l "${{ inputs.log_file }}"
      - name: Clear unused images
        uses: appleboy/ssh-action@v1.2.0
        with:
          host: ${{ secrets.CLOUD_SERVER_IP }}
          username: ${{ secrets.CLOUD_SERVER_USER }}
          key: ${{ secrets.CLOUD_SERVER_SSH_KEY }}
          script: |
            docker image prune -f
91 changes: 91 additions & 0 deletions deploy/redeploy_watchdog.sh
@@ -0,0 +1,91 @@
#!/bin/bash

# Default values
CONTAINER_NAME="map_websocket_server"  # Name or ID of the container to watch
IMAGE_NAME="map_watchdog"              # Used as both the image tag and the watchdog container name
THRESHOLD=60                           # CPU usage threshold in percent
RESTART_THRESHOLD=2                    # Consecutive threshold hits before a restart
INTERVAL=30                            # Time in seconds between checks
LOG_FILE="watchdog.log"                # Log file name (created under /root on the host)


# Parse command-line arguments
while [[ $# -gt 0 ]]; do
    case "$1" in
        -c|--container-name)
            CONTAINER_NAME="$2"
            shift 2
            ;;
        -w|--image-name)
            IMAGE_NAME="$2"
            shift 2
            ;;
        -t|--threshold)
            THRESHOLD="$2"
            shift 2
            ;;
        -r|--restart-threshold)
            RESTART_THRESHOLD="$2"
            shift 2
            ;;
        -i|--interval)
            INTERVAL="$2"
            shift 2
            ;;
        -l|--log-file)
            LOG_FILE="$2"
            shift 2
            ;;
        *)
            echo "Unknown argument: $1"
            exit 1
            ;;
    esac
done

echo "WATCHDOG"

echo "CONTAINER_NAME: $CONTAINER_NAME"
echo "THRESHOLD: $THRESHOLD"
echo "RESTART_THRESHOLD: $RESTART_THRESHOLD"
echo "INTERVAL: $INTERVAL"
echo "LOG_FILE: $LOG_FILE"
echo "IMAGE_NAME: $IMAGE_NAME"


# Updating the Git repo
echo "Updating Git repo..."
#cd /path/to/your/git/repo
git pull

# Moving to the deployment directory; stop if it is missing so the build
# does not run from the wrong directory
echo "Moving to deployment directory..."
cd watchdog || exit 1

# Building Docker image
echo "Building Docker image $IMAGE_NAME..."
docker build -t "$IMAGE_NAME" -f Dockerfile .

# Stopping and removing the old container (if it exists)
echo "Stopping and removing old container $IMAGE_NAME..."
docker stop "$IMAGE_NAME" || true
docker rm "$IMAGE_NAME" || true

# The log file must exist on the host before it is bind-mounted,
# otherwise Docker creates a directory at that path
touch /root/"$LOG_FILE"

# Deploying the new container
echo "Deploying new container..."
docker run --name "$IMAGE_NAME" \
    --restart unless-stopped \
    --network=jaam \
    -d \
    -e CONTAINER_NAME="$CONTAINER_NAME" \
    -e THRESHOLD="$THRESHOLD" \
    -e RESTART_THRESHOLD="$RESTART_THRESHOLD" \
    -e INTERVAL="$INTERVAL" \
    -e LOG_FILE="$LOG_FILE" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /root/"$LOG_FILE":/"$LOG_FILE" \
    "$IMAGE_NAME"

echo "Container deployed successfully!"
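Review note on deploy/redeploy_watchdog.sh: the values given to -t, -r and -i are passed to the container without any check that they are numbers. A minimal sketch of a guard that could run right after argument parsing is shown here. The helper names is_uint and require_uint are hypothetical and not part of this PR, and the sketch assumes whole-number values, as the workflow inputs suggest.

```shell
#!/bin/sh

# Hypothetical helper: succeeds only if $1 is a non-empty string of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: $1 is the option name, $2 its value.
# Prints an error to stderr and returns 1 if the value is not a
# non-negative integer.
require_uint() {
    if ! is_uint "$2"; then
        echo "Invalid value for $1: '$2' (expected a non-negative integer)" >&2
        return 1
    fi
}
```

In the script this might be used as `require_uint THRESHOLD "$THRESHOLD" || exit 1`, repeated for RESTART_THRESHOLD and INTERVAL, so a typo fails the deploy instead of producing a watchdog that never triggers.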
14 changes: 14 additions & 0 deletions deploy/watchdog/Dockerfile
@@ -0,0 +1,14 @@
# Use a lightweight base image
FROM alpine:latest

# Install Docker CLI and necessary tools
RUN apk add --no-cache docker bash bc

# Copy the watchdog script into the container
COPY watchdog.sh /watchdog.sh

# Make the script executable
RUN chmod +x /watchdog.sh

# Set the script as the default command
CMD ["/watchdog.sh"]
54 changes: 54 additions & 0 deletions deploy/watchdog/watchdog.sh
@@ -0,0 +1,54 @@
#!/bin/bash

CONTAINER_NAME="${CONTAINER_NAME:?Environment variable CONTAINER_NAME is required}"  # Container name or ID
THRESHOLD="${THRESHOLD:?Environment variable THRESHOLD is required}"                 # CPU usage threshold (%)
INTERVAL="${INTERVAL:-60}"                                                           # Check interval (default: 60s)
LOG_FILE="${LOG_FILE:-watchdog.log}"                                                 # Log file path
RESTART_THRESHOLD="${RESTART_THRESHOLD:-2}"                                          # Consecutive threshold hits before a restart

# Log a message with a timestamp to stdout and to the log file
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Wait for the check interval. Sleeping in the background and waiting on it
# lets bash run the signal trap immediately; a foreground sleep would delay
# the trap until the sleep finished.
pause() {
    sleep "$INTERVAL" &
    wait $!
}

# Ensure the log file exists and is writable
touch "$LOG_FILE" && chmod 666 "$LOG_FILE"

# Handle termination signals
trap "log 'Stopping watchdog script'; exit 0" SIGINT SIGTERM

# Counter of consecutive threshold violations
violation_count=0

while true; do
    # Get the CPU usage percentage of the container
    CPU_USAGE=$(docker stats --no-stream --format "{{.CPUPerc}}" "$CONTAINER_NAME" 2>/dev/null | tr -d '%')

    if [ -z "$CPU_USAGE" ]; then
        log "Failed to get stats for container $CONTAINER_NAME. Ensure it exists and is running."
        pause
        continue
    fi

    # Check whether the CPU usage exceeds the threshold
    if (( $(echo "$CPU_USAGE > $THRESHOLD" | bc -l) )); then
        violation_count=$((violation_count + 1))
        log "CPU usage ($CPU_USAGE%) exceeded threshold ($THRESHOLD%). Consecutive hits: $violation_count/$RESTART_THRESHOLD."

        if (( violation_count >= RESTART_THRESHOLD )); then
            log "CPU usage threshold exceeded $RESTART_THRESHOLD times. Restarting container..."
            if docker restart "$CONTAINER_NAME"; then
                log "Container $CONTAINER_NAME restarted successfully."
            else
                log "Failed to restart container $CONTAINER_NAME. Check the container status."
            fi
            violation_count=0  # Reset counter after restart
        fi
    else
        log "CPU usage is $CPU_USAGE% of $THRESHOLD%."
        violation_count=0  # Reset counter if CPU usage is at or below the threshold
    fi

    # Wait for the specified interval before the next check
    pause
done
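Review note on deploy/watchdog/watchdog.sh: the restart decision depends on a floating-point comparison and a counter of consecutive hits, both of which can be exercised without Docker if they are isolated into functions. The following is a sketch under stated assumptions, not the code in this PR: exceeds and next_count are hypothetical names, and awk is swapped in for the bc call used in watchdog.sh.

```shell
#!/bin/sh

# Hypothetical helper: succeeds if $1 > $2, compared as floating-point numbers.
# Uses awk here instead of the bc call in watchdog.sh; adding 0 forces a
# numeric rather than a string comparison.
exceeds() {
    awk -v usage="$1" -v limit="$2" 'BEGIN { exit !(usage + 0 > limit + 0) }'
}

# Hypothetical helper: prints the consecutive-hit count after one sample.
# $1 = current count, $2 = CPU usage (%), $3 = threshold (%)
next_count() {
    if exceeds "$2" "$3"; then
        echo $(( $1 + 1 ))
    else
        echo 0   # any sample at or below the threshold resets the streak
    fi
}
```

With the default RESTART_THRESHOLD of 2, two samples in a row above the threshold take the count from 0 to 1 to 2 and would trigger a restart, while a single quiet sample in between resets it to 0.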
28 changes: 0 additions & 28 deletions deploy/watchdog_ws.sh

This file was deleted.
