Merged
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,10 @@
 # Changelog

+## [0.2.1]
+- Misc hot fixes
+- `LocalDriver` pulls the worker image if it doesn't exist locally.
+- Fix Compute Manager healthcheck
+
 ## [0.2.0]
 - The objective of this release is to support mounting the Flint Metastore as a POSIX-like filesystem, which reduces the number of Docker volumes required by the Control Plane.
 - The Experiment Tracker has been collapsed into the Experiment Server for simplicity and improved robustness around `inotify` events.
2 changes: 1 addition & 1 deletion README.md
@@ -3,7 +3,7 @@
 <img width="60%" src="docs/_assets/logo-text.png" alt="FlintML Logo Text" /><br/>

 <!-- Badges, all inside the same HTML block -->
-<img src="https://img.shields.io/badge/version-v0.2.0-cf051c" alt="Version 0.2.0" />
+<img src="https://img.shields.io/badge/version-v0.2.1-cf051c" alt="Version 0.2.1" />
 <img src="https://img.shields.io/badge/license-BSL_1.1-blue" alt="License BSL 1.1" />

 </br>
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.2.0
+0.2.1
5 changes: 5 additions & 0 deletions src/compute-manager/Dockerfile
@@ -12,6 +12,11 @@ RUN poetry config virtualenvs.create false && \

 FROM python:3.12-slim AS runtime

+# install curl
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends curl && \
+    rm -rf /var/lib/apt/lists/*
+
 WORKDIR /app

 # Inject dependencies
24 changes: 23 additions & 1 deletion src/compute-manager/src/driver/local.py
@@ -1,7 +1,7 @@
 import asyncio
 import docker
 from docker.models.containers import Container
-from docker.errors import NotFound, APIError
+from docker.errors import ImageNotFound, APIError
 from docker.types import LogConfig
 from typing import Dict
 import os
@@ -25,9 +25,31 @@ def __init__(self, config: dict):
         super().__init__(config)
         self._docker: docker.DockerClient = docker.from_env()

+        self._ensure_worker_image()
+
         self._containers: Dict[str, Tuple[ContainerContext, Optional[Container]]] = {}
         self._watch_tasks: Dict[str, asyncio.Task] = {}

+    def _ensure_worker_image(self) -> None:
+        """
+        Check for self.worker_image in the local cache, and pull from Docker Hub
+        if it's not found.
+        """
+        try:
+            self._docker.images.get(self.worker_image)
+            logging.info(f"Image {self.worker_image} already present, skipping pull.")
+        except ImageNotFound:
+            logging.info(f"Image {self.worker_image} not found locally, pulling…")
+            try:
+                self._docker.images.pull(self.worker_image)
+                logging.info(f"Successfully pulled {self.worker_image}.")
+            except APIError as e:
+                logging.error(f"Failed to pull {self.worker_image}: {e}")
+                raise e
+        except APIError as e:
+            logging.error(f"Docker error inspecting image {self.worker_image}: {e}")
+            raise e
+
     async def launch_container(self, ctx: ContainerContext) -> None:
         """Start a container and begin watching it for unexpected exits."""
         try:
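
For reviewers, the check-then-pull flow added in `_ensure_worker_image` can be tried out without a Docker daemon. The following is only a sketch under assumptions: `ensure_image`, `FakeImages`, and the local `ImageNotFound` class are hypothetical stand-ins written for illustration, not part of this PR or of the Docker SDK.

```python
import logging
from typing import Protocol


class ImageNotFound(Exception):
    """Stand-in for docker.errors.ImageNotFound so the sketch runs without Docker."""


class ImageStore(Protocol):
    """The two calls the sketch needs, mirroring `client.images.get` / `.pull`."""

    def get(self, name: str) -> object: ...
    def pull(self, name: str) -> object: ...


def ensure_image(images: ImageStore, name: str) -> bool:
    """Return True if a pull was needed, False if the image was already cached."""
    try:
        images.get(name)
    except ImageNotFound:
        logging.info("Image %s not found locally, pulling", name)
        images.pull(name)  # any pull error propagates to the caller
        return True
    logging.info("Image %s already present, skipping pull", name)
    return False


class FakeImages:
    """In-memory fake of an image collection (hypothetical, for illustration only)."""

    def __init__(self, cached: set[str]) -> None:
        self.cached = set(cached)
        self.pulled: list[str] = []

    def get(self, name: str) -> object:
        if name not in self.cached:
            raise ImageNotFound(name)
        return name

    def pull(self, name: str) -> object:
        self.pulled.append(name)
        self.cached.add(name)
        return name
```

Passing the image collection in as a parameter is a design choice of this sketch; it lets the cached and uncached paths be exercised in a unit test, whereas the driver in the diff calls `self._docker.images` directly.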
7 changes: 6 additions & 1 deletion src/docker-compose.build.yml
@@ -61,7 +61,12 @@ services:
       context: .
       dockerfile: ./workspace/Dockerfile
     depends_on:
-      - catalog-explorer
+      storage:
+        condition: service_healthy
+      compute-manager:
+        condition: service_healthy
+      catalog-explorer:
+        condition: service_started
     restart: always

   reverse-proxy:
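
The effect of the new `depends_on` conditions can be modelled roughly as follows. This is a simplified sketch of the gating rule, not Compose's implementation; `ServiceState` and `ready_to_start` are hypothetical names introduced here, and the model assumes each dependency is described only by whether it has started and whether its healthcheck currently passes.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    started: bool = False  # container has been started
    healthy: bool = False  # healthcheck currently reports healthy


def ready_to_start(depends_on: dict[str, str], states: dict[str, ServiceState]) -> bool:
    """Return True once every dependency satisfies its declared condition."""
    for name, condition in depends_on.items():
        state = states.get(name, ServiceState())
        if condition == "service_healthy" and not state.healthy:
            return False
        if condition == "service_started" and not state.started:
            return False
    return True


# Conditions taken from the hunk above.
WORKSPACE_DEPENDS_ON = {
    "storage": "service_healthy",
    "compute-manager": "service_healthy",
    "catalog-explorer": "service_started",
}
```

Under this model the workspace is held back while `compute-manager` is running but not yet healthy, which is the ordering the change appears to be after.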
7 changes: 6 additions & 1 deletion src/docker-compose.release-template.yml
@@ -55,7 +55,12 @@ services:
     <<: [*storage-creds, *s3fs]
     image: flintml/workspace:${VERSION}
     depends_on:
-      - catalog-explorer
+      storage:
+        condition: service_healthy
+      compute-manager:
+        condition: service_healthy
+      catalog-explorer:
+        condition: service_started
     restart: always

   reverse-proxy:
3 changes: 1 addition & 2 deletions src/experiment-server/entrypoint.sh
@@ -39,5 +39,4 @@ server_pid=$!
 aim up \
     --repo /mnt/metastore/experiment \
     --host 0.0.0.0 --port 43800 \
-    --base-path /experiment-tracker \
-    --log-level debug
+    --base-path /experiment-tracker