Implement non-blocking model loading with accurate health state management #36
base: main
Conversation
pytrickle/stream_processor.py
Outdated
```python
# Schedule non-blocking background preload so server can accept /health immediately
async def _background_preload():
    try:
        if getattr(self._frame_processor, "state", None) is not None:
```
This seems to be repeated a lot - we could extract this into a method, get it once, and maybe assign it to a variable if appropriate to avoid the repeated calls.
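The suggested extraction could look something like the sketch below (class and method names are illustrative, not pytrickle's actual API): the `getattr` lookup moves into one helper, and callers fetch the state once into a local variable.

```python
# Hypothetical sketch of the suggested refactor: read the frame processor's
# state through a single helper, then reuse the local variable instead of
# repeating the getattr(...) call at every use site.
class StreamProcessorSketch:
    def __init__(self, frame_processor):
        self._frame_processor = frame_processor

    def _processor_state(self):
        # The one place that knows how the state is looked up.
        return getattr(self._frame_processor, "state", None)

    def describe(self):
        state = self._processor_state()  # fetched once, reused below
        if state is None:
            return "no state attached"
        return f"state={state}"

class FakeProcessor:
    state = "LOADING"

print(StreamProcessorSketch(FakeProcessor()).describe())  # state=LOADING
print(StreamProcessorSketch(object()).describe())         # no state attached
```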
_on_startup is a "callback" method that's being registered to the server's existing post-startup event to initiate load_model
pytrickle/pytrickle/stream_processor.py
Lines 123 to 126 in f888c13
```python
try:
    self.server.app.on_startup.append(_on_startup)
except Exception as e:
    logger.error(f"Failed to register startup hook: {e}")
```
This can also be done manually outside of pytrickle by accessing the StreamProcessor's server property as was done in ComfyStream to load the pipeline:
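A runnable sketch of that manual registration, using a minimal stand-in for the aiohttp-style `app.on_startup` list (with the real library you would append to the `StreamProcessor`'s `server.app.on_startup` instead; `FakeApp` and `load_pipeline` are stand-ins, not pytrickle or ComfyStream APIs):

```python
import asyncio

# Stand-in for the aiohttp-style app used above: callbacks appended to
# on_startup run once when the server starts.
class FakeApp:
    def __init__(self):
        self.on_startup = []

    async def run_startup(self):
        for callback in self.on_startup:
            await callback(self)

loaded = []

async def load_pipeline(app):
    # Stand-in for loading the pipeline at startup, as ComfyStream does.
    loaded.append("pipeline")

app = FakeApp()
app.on_startup.append(load_pipeline)  # manual registration from outside the library
asyncio.run(app.run_startup())
print(loaded)  # ['pipeline']
```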
Ah...you're referring to getting the current state. Totally agree!
The issue is that the state of the frame processor and the server are still separate. I think we could simplify this by getting and updating the state of the Server directly, instead of keeping a state on the frame processor and using "attach_state". wdyt?
I changed the state property on the base frame_processor class so these None checks could be removed
e810f27
pytrickle/stream_processor.py
Outdated
```python
        logger.error(f"Error preloading model on startup: {e}")

try:
    asyncio.get_running_loop().create_task(_background_preload())
```
Do we really want this async? It seems like we could get the stated intent of the PR, loading the model synchronously (blocking), with `await _background_preload()`; as written, asyncio might report "ready" before the model is loaded.
Well, because this is called via the server startup event, it would block the server from becoming available unless a task is created to run it in the background (non-blocking). The health state begins at LOADING and transitions to IDLE. For managed containers, it is important for the server to be available immediately, so that other potential Docker container issues can be ruled out and a health check can complete (in this case LOADING is returned until set_startup_complete() is called at the end). In a sense, model loading is now synchronous thanks to the model loading lock and the health state.
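The behavior described above can be sketched in a few lines (illustrative stand-ins only, not pytrickle's actual classes): the startup hook schedules loading as a background task, so startup returns immediately while the health state reads LOADING until the load completes.

```python
import asyncio

# Stand-in health state; in pytrickle this transition happens via
# set_startup_complete() at the end of the background preload.
state = {"health": "LOADING"}

async def background_preload():
    await asyncio.sleep(0.05)    # stand-in for the real model load
    state["health"] = "IDLE"     # i.e. startup complete

async def on_startup():
    # Non-blocking: schedule the load and return right away, so the
    # server can start answering /health immediately.
    asyncio.get_running_loop().create_task(background_preload())

async def main():
    await on_startup()
    before = state["health"]     # server is already "up" here
    await asyncio.sleep(0.2)     # give the background task time to finish
    return before, state["health"]

before, after = asyncio.run(main())
print(before, after)  # LOADING IDLE
```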
Here is where the user's load_model callback is used with the lock
pytrickle/pytrickle/frame_processor.py
Lines 68 to 76 in e810f27
```python
async def ensure_model_loaded(self, **kwargs):
    """Thread-safe wrapper that ensures model is loaded exactly once."""
    async with self._model_load_lock:
        if not self._model_loaded:
            await self.load_model(**kwargs)
            self._model_loaded = True
            logger.debug(f"Model loaded for {self.__class__.__name__}")
        else:
            logger.debug(f"Model already loaded for {self.__class__.__name__}")
```
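A self-contained sketch of that lock-plus-flag pattern (with a stand-in `Processor` class, not the real `FrameProcessor`) shows why concurrent callers end up loading the model exactly once:

```python
import asyncio

# Minimal reproduction of the ensure_model_loaded pattern quoted above:
# the asyncio lock serializes callers, and the _model_loaded flag makes
# every caller after the first a no-op.
class Processor:
    def __init__(self):
        self._model_load_lock = asyncio.Lock()
        self._model_loaded = False
        self.load_calls = 0

    async def load_model(self):
        self.load_calls += 1
        await asyncio.sleep(0.01)  # stand-in for real loading work

    async def ensure_model_loaded(self):
        async with self._model_load_lock:
            if not self._model_loaded:
                await self.load_model()
                self._model_loaded = True

async def main():
    p = Processor()
    # Five concurrent callers race to load the model.
    await asyncio.gather(*(p.ensure_model_loaded() for _ in range(5)))
    return p.load_calls

calls = asyncio.run(main())
print(calls)  # 1
```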
This can be tested by simply running process_video_example.py from the launch config and sending a curl request within the first 3 seconds:

```shell
curl -X GET http://localhost:8000/health -H "Accept: application/json"
```
It should read LOADING and flip to IDLE after 3 seconds. This can be adjusted here
```python
MODEL_LOAD_DELAY_SECONDS = 3.0
```
Made another change to track the background model loading from stream processor and cancel if needed. I think this is more of what you were looking for? 3a8c523
This reverts commit ecfc1ef.
…up _model_loaded var
Force-pushed from 64f3073 to 6962fc0
This pull request introduces a new example for model loading and refactors the model loading workflow to use a sentinel message pattern, improving reliability and testability. The changes ensure the server starts immediately and remains available for health checks while the model loads in the background, with clear transitions between LOADING and IDLE states. The model loading is now triggered via parameter updates, and thread-safe mechanisms are added to prevent redundant loading.
Model Loading Example and Workflow Improvements

- Added `examples/model_loading_example.py`, demonstrating non-blocking server startup and background model loading via a sentinel message, with health endpoint transitions from LOADING to IDLE.
- Updated `.vscode/launch.json` to include a launch configuration for the new model loading example.

Core Library Refactoring for Model Loading

- Refactored `FrameProcessor` to add thread-safe model loading via `ensure_model_loaded`, using an asyncio lock and a `_model_loaded` flag to guarantee the model loads exactly once.
- Updated `StreamProcessor` and `FrameProcessor` to trigger model loading via a sentinel parameter (`load_model`) and attach server state for accurate health reporting; the server no longer transitions to IDLE until model loading completes. [1] [2] [3] [4]

Minor Fixes

- Use `resp.status_code` instead of `resp.status` for error handling.
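The sentinel-parameter idea summarized above can be sketched as follows (a hedged illustration only: `handle_params`, the `width` key, and the call shape are hypothetical, not pytrickle's real parameter-update API):

```python
# Illustrative sketch: a parameter update containing the sentinel key
# "load_model" triggers the model load; all other keys pass through as
# ordinary parameter updates.
def handle_params(params, loader_calls):
    if params.pop("load_model", False):  # sentinel triggers loading once
        loader_calls.append("load")
    return params                        # remaining ordinary params

calls = []
rest = handle_params({"load_model": True, "width": 512}, calls)
print(calls, rest)  # ['load'] {'width': 512}
```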