BentoML-0.5.4
- Prometheus metrics improvements in API server
  - metrics now work in multi-process mode when running with gunicorn
  - added 3 default metrics to BentoAPIServer:
    - request latency
    - total request count, labeled by status code
    - a gauge for monitoring concurrent prediction requests
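The three default metrics above can be sketched in plain Python. This is an illustrative stand-in only: the real API server exposes these via the prometheus_client library (which handles multi-process aggregation under gunicorn), and the names `Metrics`, `track_request`, and `predict` here are hypothetical.

```python
import time
from collections import defaultdict

# Minimal stand-ins for the three default metrics: a latency histogram,
# a request counter labeled by status code, and an in-flight gauge.
class Metrics:
    def __init__(self):
        self.request_latency = []              # histogram-style: observed latencies
        self.request_total = defaultdict(int)  # counter: total requests by status code
        self.in_progress = 0                   # gauge: concurrent prediction requests

    def track_request(self, handler, *args):
        """Run `handler`, recording latency, status count, and the in-flight gauge."""
        self.in_progress += 1
        start = time.perf_counter()
        try:
            status, body = handler(*args)
        except Exception:
            status, body = 500, "internal error"
        finally:
            self.in_progress -= 1
        self.request_latency.append(time.perf_counter() - start)
        self.request_total[status] += 1
        return status, body

metrics = Metrics()

def predict(payload):
    # toy "model" endpoint: fails on empty input
    if not payload:
        raise ValueError("empty payload")
    return 200, {"prediction": len(payload)}

metrics.track_request(predict, "abc")
metrics.track_request(predict, "")
print(dict(metrics.request_total))  # {200: 1, 500: 1}
print(metrics.in_progress)          # 0
```

A real Prometheus setup would replace these attributes with `Histogram`, `Counter`, and `Gauge` objects scraped from a `/metrics` endpoint.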
- New Tensorflow TensorHandler!
  - Receive tf.Tensor data directly in your API function for TensorFlow and Keras models
  - TfSavedModel can now automatically transform the input tensor shape based on the tf.Function input signature
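The signature-based shape transformation can be sketched in plain Python (no TensorFlow dependency). This is a conceptual illustration only; `coerce_to_signature` and the helpers below are hypothetical names, not BentoML's actual API.

```python
# Sketch: reshape flat input data to match a declared input signature
# such as (None, 2, 2), where None is an inferred batch dimension.

def flatten(values):
    """Flatten an arbitrarily nested list into a flat list of scalars."""
    if isinstance(values, list):
        return [x for v in values for x in flatten(v)]
    return [values]

def reshape(flat, shape):
    """Rebuild nested lists matching `shape`, consuming `flat` in order."""
    if not shape:
        return flat.pop(0)
    return [reshape(flat, shape[1:]) for _ in range(shape[0])]

def coerce_to_signature(payload, signature_shape):
    """Reshape `payload` to the signature; the None (batch) dim is inferred."""
    flat = flatten(payload)
    inner = [d for d in signature_shape if d is not None]
    per_item = 1
    for d in inner:
        per_item *= d
    batch = len(flat) // per_item
    return reshape(flat, [batch] + inner)

# A signature of (None, 2, 2) means: a batch of 2x2 tensors.
print(coerce_to_signature([1, 2, 3, 4, 5, 6, 7, 8], (None, 2, 2)))
# -> [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
```

In the real handler, the signature comes from the saved tf.Function and the result is a tf.Tensor rather than nested lists.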
- Largely improved error handling in API server
  - Proper HTTP error codes and messages
  - Prevents internal error details from being leaked to the client
  - Introduced exception classes for users to raise when customizing BentoML handlers
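The error-handling behavior above can be sketched as follows. The class and function names (`BentoMLException`, `BadInput`, `handle_request`) are illustrative, not BentoML's actual exception hierarchy; the point is the split between intentional, user-raised errors and unexpected internal ones.

```python
# Sketch: user-facing exception classes map to proper HTTP responses,
# while unexpected errors are masked so internals never leak.

class BentoMLException(Exception):
    status_code = 500

class BadInput(BentoMLException):
    status_code = 400

def handle_request(handler, payload):
    """Invoke `handler`, translating exceptions into safe HTTP responses."""
    try:
        return 200, handler(payload)
    except BentoMLException as e:
        # user-raised errors: status code and message are intentional
        return e.status_code, str(e)
    except Exception:
        # unexpected errors: never leak tracebacks or internal messages
        return 500, "Internal Server Error"

def predict(payload):
    if "input" not in payload:
        raise BadInput("missing required field: input")
    raise RuntimeError("db password is hunter2")  # simulated internal failure

print(handle_request(predict, {}))            # (400, 'missing required field: input')
print(handle_request(predict, {"input": 1}))  # (500, 'Internal Server Error')
```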
- New deployment guide for Google Cloud Run
- New deployment guide for AWS Fargate on ECS
- AWS Lambda deployment improvements