BentoML-0.5.4
- Prometheus metrics improvements in API server
  - metrics now work in multi-process mode when running with gunicorn
  - added 3 default metrics to BentoAPIServer:
    - request latency
    - total request count, labeled by status code
    - a gauge for monitoring concurrent prediction requests
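The three default metrics above can be sketched in plain Python. This is an illustrative stand-in only: the real API server exposes these via the prometheus_client library (which handles multi-process aggregation under gunicorn), and the names `Metrics`, `track_request`, and `predict` here are hypothetical.

```python
import time
from collections import defaultdict

# Minimal stand-ins for the three default metrics: a latency histogram,
# a request counter labeled by status code, and an in-flight gauge.
class Metrics:
    def __init__(self):
        self.request_latency = []              # histogram-style: observed latencies
        self.request_total = defaultdict(int)  # counter: total requests by status code
        self.in_progress = 0                   # gauge: concurrent prediction requests

    def track_request(self, handler, *args):
        """Run `handler`, recording latency, status count, and the in-flight gauge."""
        self.in_progress += 1
        start = time.perf_counter()
        try:
            status, body = handler(*args)
        except Exception:
            status, body = 500, "internal error"
        finally:
            self.in_progress -= 1
        self.request_latency.append(time.perf_counter() - start)
        self.request_total[status] += 1
        return status, body

metrics = Metrics()

def predict(payload):
    # toy "model" endpoint: fails on empty input
    if not payload:
        raise ValueError("empty payload")
    return 200, {"prediction": len(payload)}

metrics.track_request(predict, "abc")
metrics.track_request(predict, "")
print(dict(metrics.request_total))  # {200: 1, 500: 1}
print(metrics.in_progress)          # 0
```

A real Prometheus setup would replace these attributes with `Histogram`, `Counter`, and `Gauge` objects scraped from a `/metrics` endpoint.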
- New Tensorflow TensorHandler!
  - Receive tf.Tensor data directly in your API function for TensorFlow and Keras models
  - TfSavedModel can now automatically transform the input tensor shape based on the tf.Function input signature
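The signature-based shape transformation can be sketched in plain Python (no TensorFlow dependency). This is a conceptual illustration only; `coerce_to_signature` and the helpers below are hypothetical names, not BentoML's actual API.

```python
# Sketch: reshape flat input data to match a declared input signature
# such as (None, 2, 2), where None is an inferred batch dimension.

def flatten(values):
    """Flatten an arbitrarily nested list into a flat list of scalars."""
    if isinstance(values, list):
        return [x for v in values for x in flatten(v)]
    return [values]

def reshape(flat, shape):
    """Rebuild nested lists matching `shape`, consuming `flat` in order."""
    if not shape:
        return flat.pop(0)
    return [reshape(flat, shape[1:]) for _ in range(shape[0])]

def coerce_to_signature(payload, signature_shape):
    """Reshape `payload` to the signature; the None (batch) dim is inferred."""
    flat = flatten(payload)
    inner = [d for d in signature_shape if d is not None]
    per_item = 1
    for d in inner:
        per_item *= d
    batch = len(flat) // per_item
    return reshape(flat, [batch] + inner)

# A signature of (None, 2, 2) means: a batch of 2x2 tensors.
print(coerce_to_signature([1, 2, 3, 4, 5, 6, 7, 8], (None, 2, 2)))
# -> [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
```

In the real handler, the signature comes from the saved tf.Function and the result is a tf.Tensor rather than nested lists.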
- Largely improved error handling in API server
  - Proper HTTP error codes and messages
  - Prevents internal error details from being leaked to the client
  - Introduced exception classes for users to raise when customizing BentoML handlers
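The error-handling behavior above can be sketched as follows. The class and function names (`BentoMLException`, `BadInput`, `handle_request`) are illustrative, not BentoML's actual exception hierarchy; the point is the split between intentional, user-raised errors and unexpected internal ones.

```python
# Sketch: user-facing exception classes map to proper HTTP responses,
# while unexpected errors are masked so internals never leak.

class BentoMLException(Exception):
    status_code = 500

class BadInput(BentoMLException):
    status_code = 400

def handle_request(handler, payload):
    """Invoke `handler`, translating exceptions into safe HTTP responses."""
    try:
        return 200, handler(payload)
    except BentoMLException as e:
        # user-raised errors: status code and message are intentional
        return e.status_code, str(e)
    except Exception:
        # unexpected errors: never leak tracebacks or internal messages
        return 500, "Internal Server Error"

def predict(payload):
    if "input" not in payload:
        raise BadInput("missing required field: input")
    raise RuntimeError("db password is hunter2")  # simulated internal failure

print(handle_request(predict, {}))            # (400, 'missing required field: input')
print(handle_request(predict, {"input": 1}))  # (500, 'Internal Server Error')
```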
- New deployment guide for Google Cloud Run
- New deployment guide for AWS Fargate on ECS
- AWS Lambda deployment improvements