Simple prototype of deploying model pipelines from a config on Ray Serve.
- Requires running Ray from the nightly wheels (needed for the Serve CLI).
- "Models" are defined in serve_pipeline/__init__.py. They're currently just random number generators, but any Python code could be filled in (see the first sketch below this list).
- The pipeline is defined in example_pipeline.json. The "class" field must be the import path to a class installed in the Python environment on the Ray cluster (see the second sketch below this list).
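
A minimal sketch of what a model in serve_pipeline/__init__.py might look like; the class name and call signature here are hypothetical, not taken from the repo:

    import random

    class RandomModel:
        # Placeholder "model": returns a random float in [0, 1).
        # Any real Python inference code could go here instead.
        def __call__(self, request):
            return random.random()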
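
A hypothetical sketch of the shape example_pipeline.json could take; only the "class" field is described above, so the surrounding structure and other key names are illustrative assumptions:

    {
        "name": "example_pipeline",
        "models": [
            {"class": "serve_pipeline.RandomModel"}
        ]
    }
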
To run:
pip install -e serve_pipeline
ray start --head
serve start
python deploy.py example_pipeline.json
curl -X GET localhost:8000/api # Should return a random float in [0, 1).
Other notes:
- Deploying the pipeline (deploy.py) could be done from a remote machine via the Ray client (see the sketch after these notes).
- The model code definitions should be built into the Docker image that the Ray cluster is running.
- This can support all of the Ray Serve features: scaling each model, batching requests, using GPUs, etc.
- Error handling is really sloppy right now (e.g., it won't clean up resources if it fails halfway through), but we could make this declarative.
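
A minimal sketch of kicking off the deployment remotely via the Ray client; the head node address is a placeholder, and the deploy step is assumed to be the same logic deploy.py runs locally:

    import ray

    # Connect to the remote cluster through the Ray client
    # (10001 is the default Ray client server port).
    ray.init("ray://<head-node-host>:10001")

    # From here, the same logic deploy.py runs locally (reading the JSON
    # config and deploying each model) would execute against the cluster.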