Merged
3 changes: 0 additions & 3 deletions app/main.py
@@ -2,7 +2,6 @@
  from app.core.lifespan import lifespan
  from app.api.endpoints import predictions
  from app.core.logging_config import setup_logging
- from prometheus_fastapi_instrumentator import Instrumentator

  setup_logging()
  app = FastAPI(
@@ -12,8 +11,6 @@
  lifespan=lifespan
  )

- instrumentator = Instrumentator().instrument(app)
- instrumentator.expose(app, include_in_schema=False, endpoint="/actuator/prometheus")
  app.include_router(predictions.router, prefix="/api/v1", tags=["Prediction"])

  @app.get("/health")
19 changes: 19 additions & 0 deletions deployment/docker-compose.monitoring.yml
@@ -28,6 +28,25 @@ services:
  networks:
  - mynetwork

+ dcgm-exporter:
+ image: nvcr.io/nvidia/k8s/dcgm-exporter:latest

high

For stability in production, avoid using the latest tag for Docker images. The latest tag can pull in unexpected changes, so it is better to pin a specific version. For example, specify a concrete version such as 3.3.5-3.1.8-ubuntu22.04.

      image: nvcr.io/nvidia/k8s/dcgm-exporter:3.3.5-3.1.8-ubuntu22.04

+ container_name: dcgm-exporter
+ ports:
+ - "9400:9400"
+ deploy:
+ resources:
+ reservations:
+ devices:
+ - driver: nvidia
+ count: all
+ capabilities: [gpu]
+ environment:
+ - NVIDIA_VISIBLE_DEVICES=all
+ - NVIDIA_DRIVER_CAPABILITIES=utility,compute
+ restart: unless-stopped
+ networks:
+ - mynetwork

  networks:
  mynetwork:
  driver: bridge
10 changes: 9 additions & 1 deletion deployment/docker-compose.yml
@@ -13,6 +13,8 @@ services:
  count: 1
  capabilities: [gpu]
  container_name: app-blue
+ networks:
+ - mynetwork

  app-green:
  image: ${ECR_URI}/dearbelly-cv:latest
@@ -27,4 +29,10 @@ services:
  - driver: nvidia
  count: 1
  capabilities: [ gpu ]
- container_name: app-green
+ container_name: app-green
+ networks:
+ - mynetwork

+ networks:
+ mynetwork:
+ driver: bridge
Comment on lines +36 to +38

medium

The mynetwork network is defined in both docker-compose.monitoring.yml and docker-compose.yml. When several compose files share the same network, it is better to define the network in only one file and reference it from the others as an external network with the external: true option. This avoids duplication and makes the configuration clearer to manage. For example, you could keep the network definition in this file (docker-compose.yml) and change docker-compose.monitoring.yml to reference mynetwork as an external network.
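
The suggestion could look roughly like the sketch below. It assumes docker-compose.yml keeps ownership of the network. The explicit `name: mynetwork` key is an assumption added here and is not part of the PR; it keeps Compose from prefixing the network with the project name, so both files resolve to the same network. The two files are shown together, separated by `---`.

```yaml
# docker-compose.yml: owns the network definition
networks:
  mynetwork:
    name: mynetwork   # assumption: fixed name, so no project-name prefix is added
    driver: bridge
---
# docker-compose.monitoring.yml: references the existing network instead of redefining it
networks:
  mynetwork:
    external: true
    name: mynetwork   # must match the name created by docker-compose.yml
```

With this layout the main stack has to be started first (or the network created beforehand), because Compose does not create external networks and reports an error if one is missing.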

2 changes: 1 addition & 1 deletion deployment/generate_review.py
@@ -59,7 +59,7 @@ def send_prompt():

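  # Build the prompt to send to Gemini
  prompt = f"""
- You are a senior developer. You provide constructive, detailed code reviews of submitted Pull Requests (PRs) to fellow developers.
+ You are a senior developer. Please provide a constructive, detailed code review of the submitted Pull Request (PR).
  Be sure to classify the review by 'priority level (P1~P5)'.

  [Priority definitions]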
18 changes: 7 additions & 11 deletions deployment/prometheus.yml
@@ -16,19 +16,15 @@ scrape_configs:
  honor_timestamps: false
  scheme: 'http'
  static_configs:
- - targets: ['${REMOTE_HOST}:9090']
+ - targets: ['127.0.0.1:9090']
  labels:
  service: 'monitor-1'
  - job_name: 'node'
  static_configs:
- - targets: ['${REMOTE_HOST}:9090']
- - job_name: 'fastapi-actuator-blue'
- metrics_path: '/actuator/prometheus'
- scrape_interval: 1m
+ - targets: ['127.0.0.1:9090']
+ - job_name: 'dcgm'
+ scrape_interval: 15s
  static_configs:
- - targets: [ 'app-blue:8000' ]
- - job_name: 'fastapi-actuator-green'
- metrics_path: '/actuator/prometheus'
- scrape_interval: 1m
- static_configs:
- - targets: [ 'app-green:8001' ]
+ - targets: ['dcgm-exporter:9400']
+ labels:
+ exporter: 'dcgm'
4 changes: 1 addition & 3 deletions requirements.txt
@@ -10,6 +10,4 @@ Pillow==11.3.0
  dotenv
  openai
  timm
- logging
- prometheus-client==0.19.0
- prometheus-fastapi-instrumentator==6.1.0
+ logging