Best practice for multiple service serving #4771
Replies: 2 comments
-
I have a follow-up question: we can also use one of the services as a router, like:
[…]
and when we run […]. However, my question is: are we using the services with their full capabilities this way, through […]? What is the best practice here?
-
Whether to expose an endpoint for each individual service depends on the use case. The router service you have above is a perfectly valid pattern if all model endpoints should be exposed. Alternatively, if a service only acts as an internal function that assists another, externally exposed function, it can be invoked internally without an external endpoint of its own.

You are absolutely correct that services were designed to run and scale independently, with dedicated CPU and GPU resources. That is why resources can be defined individually for each service. When a bento with multiple services is deployed to BentoCloud, this is exactly the topology it is deployed in: each service runs as its own replica set.
-
Hi everyone,
I am wondering what your thoughts are on the best practice for serving multiple BentoML services, each with its own endpoint.

Say you have three services defined with `bentoml.service` that you want to serve. One of them uses the other two via `bentoml.depends` to call them asynchronously and merge their outputs, but I also need to serve those two independently.

You can only serve one service when you call `bentoml serve service:(one of those three)`, but you can also run them separately on different ports.

What I am trying to achieve: I was thinking of writing a quick nginx conf that forwards all three individual ports to one port, each under its corresponding endpoint. That way, when the app is deployed to the cloud, the services still run on their own ports inside the container, but outside calls to host/endpoint1, host/endpoint2, host/endpoint3 are forwarded to local:port1/endpoint1, local:port2/endpoint2, and so on.
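To make the intended forwarding rule concrete, here is a minimal Python sketch of the mapping only. It is not an nginx configuration; nginx would express the same idea with one proxied location per endpoint. The endpoint names, the port numbers, and the `resolve_upstream` helper are placeholders made up for illustration.

```python
# Hypothetical endpoint names and local ports; substitute the real ones.
UPSTREAMS = {
    "endpoint1": 3001,
    "endpoint2": 3002,
    "endpoint3": 3003,
}


def resolve_upstream(path: str) -> str:
    """Map an external request path to the local URL that should handle it."""
    trimmed = path.lstrip("/")
    # The first path segment selects the service; the path is forwarded unchanged.
    first_segment = trimmed.split("/", 1)[0]
    try:
        port = UPSTREAMS[first_segment]
    except KeyError:
        raise LookupError(f"no service registered for path {path!r}") from None
    return f"http://127.0.0.1:{port}/{trimmed}"


if __name__ == "__main__":
    print(resolve_upstream("/endpoint2"))
```

With this layout each service keeps its own process and port, and only the reverse proxy needs to know the full table.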
Long story short, all three services can be called individually and can each have their own traffic load. What do you think?
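For reference, the "call both asynchronously and merge their outputs" step can be sketched in plain Python with `asyncio`. This is only an illustration of the fan-out and merge logic, not BentoML code: `service_a` and `service_b` are hypothetical stand-ins for the two dependent services, which in the real setup would be the dependencies declared with `bentoml.depends`.

```python
import asyncio


async def service_a(text: str) -> dict:
    # Stand-in for the first dependent service (hypothetical logic).
    await asyncio.sleep(0)  # simulates an awaitable remote call
    return {"a": text.upper()}


async def service_b(text: str) -> dict:
    # Stand-in for the second dependent service (hypothetical logic).
    await asyncio.sleep(0)
    return {"b": len(text)}


async def combined(text: str) -> dict:
    # Start both calls concurrently and wait for both results.
    result_a, result_b = await asyncio.gather(service_a(text), service_b(text))
    # Merge the two outputs into a single response.
    return {**result_a, **result_b}


if __name__ == "__main__":
    print(asyncio.run(combined("hello")))
```

Because `service_a` and `service_b` stay ordinary callables here, they can still be invoked on their own, which mirrors the requirement that the two underlying services remain individually reachable.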