Releases: bentoml/BentoML
BentoML-0.12.1
Detailed Changelog: v0.12.0...v0.12.1
PaddlePaddle Support
We are thrilled to announce that BentoML now fully supports the PaddlePaddle framework from Baidu. Users can easily serve their own models created with Paddle via Paddle Inference and serve pre-trained models from PaddleHub, which contains over 300 production-grade pre-trained models.
Tutorial notebooks for using BentoML with PaddlePaddle:
- Paddle Inference: https://github.com/bentoml/gallery/blob/master/paddlepaddle/LinearRegression/LinearRegression.ipynb
- PaddleHub: https://github.com/bentoml/gallery/blob/master/paddlehub/image-segmentation/image-segmentation.ipynb
See the announcement and release note from PaddleHub: https://github.com/PaddlePaddle/PaddleHub/releases/tag/v2.1.0
Thank you @cqvu @deehrlic for contributing this feature in BentoML.
Bug fixes
BentoML-0.12.0
Detailed Changelog: v0.11.0...v0.12.0
New Features
- Breaking Change: Default Model Worker count is set to one #1454
  - Please use the --worker CLI argument to specify the number of workers for your deployment
  - For heavy production workloads, we recommend experimenting with different worker counts and benchmarking your BentoML API server on your target hardware to get a better understanding of the model server's performance
- Breaking Change: Micro-batching layer (Marshal Server) is now enabled by default #1498
  - For Inference APIs defined with batch=True, this will enable micro-batching behavior when serving. Users can disable it with the --disable-microbatch flag
  - For Inference APIs with batch=False, API requests are now queued in Marshal and then forwarded to the model backend server
- New: Use non-root user in BentoML's API server docker image
- New: API/CLI for bulk delete of BentoML bundle in Yatai #1313
- Easier dependency management for PyPI and conda
  - Support all pip install options via a user-provided requirements.txt file
  - Breaking Change: when the requirements_txt_file option is in use, other pip package options will be ignored
  - conda_override_channels option for using explicit conda channels for conda dependencies: https://docs.bentoml.org/en/latest/concepts.html#conda-packages
- Better support for pip install options and remote python dependencies #1421
- Let BentoML do it for you:
@bentoml.env(infer_pip_packages=True)
- Use the existing "pip_packages" API to specify a list of dependencies:
@bentoml.env(
    pip_packages=[
        'scikit-learn',
        'pandas @https://github.com/pypa/pip/archive/1.3.1.zip',
    ]
)
- Use a requirements.txt file to specify all dependencies:
@bentoml.env(requirements_txt_file='./requirements.txt')
In the ./requirements.txt file, all pip install options can be used:
#
# These requirements were autogenerated by pipenv
# To regenerate from the project's Pipfile, run:
#
# pipenv lock --requirements
#
-i https://pypi.org/simple
scikit-learn==0.20.3
aws-sam-cli==0.33.1
psycopg2-binary
azure-cli
bentoml
pandas @https://github.com/pypa/pip/archive/1.3.1.zip
https://[username[:password]@]pypi.company.com/simple
https://user:he%2F%2Fo@pypi.company.com
git+https://myvcs.com/some_dependency@sometag#egg=SomeDependency
- API/CLI for bulk delete #1313
CLI command for delete:
# Delete all saved Bento with specific name
bentoml delete --name IrisClassifier
bentoml delete --name IrisClassifier -y # do it without confirming with user
bentoml delete --name IrisClassifier --yatai-url=yatai.mycompany.com # delete in remote Yatai
# Delete all saved Bento with specific labels
bentoml delete --labels "env=dev"
bentoml delete --labels "env=dev, user=foobar"
bentoml delete --labels "key1=value1, key2!=value2, key3 In (value3, value3a), key4 DoesNotExist"
# Delete multiple saved Bento by their name:version tag
bentoml delete --tag "IrisClassifier:v1, MyService:v3, FooBar:20200103_Lkj81a"
# Delete all
bentoml delete --all
Yatai Client Python API:
yc = get_yatai_client() # local Yatai
yc = get_yatai_client('remote.yatai.com:50051') # remote Yatai
yc.repository.delete(prune, labels, bento_tag, bento_name, bento_version, require_confirm)
"""
Params:
    prune: boolean, set to True to delete all bento services
    bento_tag: Bento tag
    labels: string, label selector to filter bento services to delete
    bento_name: string
    bento_version: string
    require_confirm: boolean, require the user to confirm interactively in the CLI
"""
- #1334 Customize route of an API endpoint
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput

@env(infer_pip_packages=True)
@artifacts([...])
class MyPredictionService(BentoService):
    @api(route="/my_url_route/foo/bar", batch=True, input=DataframeInput())
    def predict(self, df):
        # instead of "/predict", the URL for this API endpoint will be "/my_url_route/foo/bar"
        ...
v0.11.0
New Features
Detailed Changelog: v0.10.1...v0.11.0
Interactively start and stop Model API Server during development
A new API was introduced in 0.11.0 for users to start and test an API server while developing their BentoService class:
service = MyPredictionService()
service.pack("model", model)
# Start an API model server in the background
service.start_dev_server(port=5000)
# Send test request to the server or open the URL in browser
requests.post('http://localhost:5000/predict', data=review, headers=headers)
# Stop the dev server
service.stop_dev_server()
# Modify code and repeat ♻️
Here's an example notebook showcasing this new feature.
More PyTorch eco-system Integrations
Logging is fully customizable now!
Users can now use one single YAML file to customize the logging behavior in BentoML, including the prediction logs and feedback logs.
https://docs.bentoml.org/en/latest/guides/logging.html
Two new configs are also introduced for quickly turning on/off console logging and file logging:
https://github.com/bentoml/BentoML/blob/v0.11.0/bentoml/configuration/default_bentoml.cfg#L29
[logging]
console_logging_enabled = true
file_logging_enabled = true
If you are not sure how this config works, here's a new guide on how BentoML's configuration works: https://docs.bentoml.org/en/latest/guides/configuration.html
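As a rough illustration of how the two switches read, the snippet below parses the same [logging] section with Python's standard configparser module. This is only a sketch: it does not use BentoML's own configuration loader, and the CONFIG_TEXT and read_logging_flags names are made up for the example.

```python
import configparser

# The [logging] section from the release note, inlined for the example.
CONFIG_TEXT = """
[logging]
console_logging_enabled = true
file_logging_enabled = true
"""


def read_logging_flags(text):
    """Return the two logging switches as booleans (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "console": parser.getboolean("logging", "console_logging_enabled"),
        "file": parser.getboolean("logging", "file_logging_enabled"),
    }


flags = read_logging_flags(CONFIG_TEXT)
print(flags)
```

Setting either value to false in the config text flips the corresponding flag, which mirrors the on/off intent of the two new options.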
More model management APIs
All model management CLI commands and the Yatai client Python API now support the yatai_url parameter, making it easy to interact with a remote YataiService and centrally manage all your BentoML packaged ML models.
Support bundling zipimport modules #1261
Bundling zip modules with BentoML is now possible with this newly added API:
@bentoml.env(zipimport_archives=['nested_zipmodule.zip'])
@bentoml.artifacts([SklearnModelArtifact('model')])
class IrisClassifier(bentoml.BentoService):
    ...
BentoML also manages the sys.path when loading a saved BentoService with zipimport archives, making sure the zip modules can be imported in user code.
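For readers unfamiliar with zipimport, the sketch below shows the underlying Python mechanism: once a zip archive is on sys.path, modules inside it can be imported normally. It uses only the standard library and does not reproduce BentoML's own loading code; the archive and module names (nested_zipmodule.zip, hello_zip) are invented for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip archive containing one module (names are hypothetical).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "nested_zipmodule.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("hello_zip.py", "def greet():\n    return 'hello from zip'\n")

# Putting the archive itself on sys.path lets the standard zipimport
# machinery resolve `import hello_zip` from inside the zip file.
sys.path.insert(0, archive_path)
importlib.invalidate_caches()

hello_zip = importlib.import_module("hello_zip")
print(hello_zip.greet())
```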
Announcements
Monthly Community Meeting
Thank you again to everyone who came to the first community meeting this week! If you have not been invited to the community meeting calendar yet, make sure to join it here: https://github.com/bentoml/BentoML/discussions/1396
Hiring
BentoML team is hiring multiple Software Engineer roles to help build the future of this open-source project and the business behind it - we are looking for someone with experience in one of the following areas: ML infrastructure, backend systems, data engineering, SRE, full-stack, and technical writing. Feel free to pass along the message to anyone you know who might be interested, we'd really appreciate that!
BentoML-0.10.1
Bug Fix
This is a minor release containing one bug fix for issue #1318, where the docker build process for the BentoML API model server was broken due to an error in the init shell script. The issue has been fixed in #1319 and included in this new release.
Our integration tests did not catch this issue because the development and CI/test environments bundle the "dirty" BentoML installation into the generated Dockerfile, whereas a production release uses the BentoML installed from PyPI. The issue in #1318 was an edge case that is triggered only when using the released version of BentoML and the published docker image. We are investigating ways to run all our integration tests against a preview release before making a final release, as part of our QA process, which should help us prevent this type of bug from getting into final releases in the future.
BentoML-0.10.0
New Features & Improvements
from bentoml.yatai.client import get_yatai_client
bento_service.save() # Save and register the bento service locally
# push to save bento service to remote yatai service.
yc = get_yatai_client('http://staging.yatai.mycompany.com:50050')
yc.repository.push(
f'{bento_service.name}:{bento_service.version}',
)
# Pull bento service from remote yatai server and register locally
yc = get_yatai_client('http://staging.yatai.mycompany.com:50050')
yc.repository.pull(
'bento_name:version',
)
# Delete in local yatai
yatai_client = get_yatai_client()
yatai_client.repository.delete('name:version')
# delete in batch by labels
yatai_client = get_yatai_client()
yatai_client.prune(labels='cicd=failed, framework In (sklearn, xgboost)')
# Get bento service metadata
yatai_client.repository.get('bento_name:version', yatai_url='http://staging.yatai.mycompany.com:50050')
# List bento services by label
yatai_client.repository.list(labels='label_key In (value1, value2), label_key2 Exists', yatai_url='http://staging.yatai.mycompany.com:50050')
New CLI commands for model management:
Push local bento service to remote yatai service:
$ bentoml push bento_service_name:version --yatai-url http://staging.yatai.mycompany.com:50050
Added the --yatai-url option to the following CLI commands to interact with a remote yatai service directly:
bentoml get
bentoml list
bentoml delete
bentoml retrieve
bentoml run
bentoml serve
bentoml serve-gunicorn
bentoml info
bentoml containerize
bentoml open-api-spec
- Model Metadata API #1179 shoutout to @jackyzha0 for designing and building this feature!
Ability to save additional metadata for any artifact type, e.g.:
model_metadata = {
    'k1': 'v1',
    'job_id': 'ABC',
    'score': 0.84,
    'datasets': ['A', 'B'],
}
svc.pack("model", test_model, metadata=model_metadata)
svc.save_to_dir(str(tmpdir))
loaded_service = bentoml.load(str(tmpdir))
print(loaded_service.artifacts.get('model').metadata)
- Improved Tensorflow Support, by @bojiang
- Automated AWS EC2 deployment #1160 massive 3800+ line PR by @mayurnewase
  - Create auto-scaling endpoint on AWS EC2 with just one command, see documentation here https://docs.bentoml.org/en/latest/deployment/aws_ec2.html
- Enable input & output data capture in Sagemaker deployment #1189 by @j-hartshorn
- Faster docker image rebuild when only model artifacts are updated #1199
- Support URL location prefix in yatai-service gRPC/Web server #1063 #1184
- Support relative path for showing Swagger UI page in the model server #1207
- Add onnxruntime gpu as supported backend #1213
- Add label and artifact metadata display to yatai web ui #1249
- Make bentoml module executable #1274
python -m bentoml <subcommand>
bentoml serve-gunicorn --enable-microbatch --mb-max-latency 3333 --mb-max-batch-size 3333 IrisClassifier:20201202154246_C8DC0A
Bug fixes
- Allow deleting bento that was previously deleted with the same name and version #1211
- Construct docker API client from env #1233
- Pin-down SqlAlchemy version #1238
- Avoid potential TypeError in batching server #1252
- Fix inference API docstring override by default #1302
Documentation
- Add examples of queries with requests for adapters #1202
- Update import paths to reflect fastai2->fastai rename #1227
- Add model artifact metadata information to the core concept page #1259
- Update adapters.rst to include new input adapters #1269
- Update quickstart guide #1262
- Docs for gluon support #1271
- Fix CURL commands for posting files in input adapters doc string #1307
Internal, CI, and Tests
- Fix installing bundled pip dependencies in Azure and Sagemaker deployments #1214 (affects bentoml developers only)
- Add Integration test for Fasttext #1221
- Add integration test for spaCy #1236
- Add integration test for models using tf native API #1245
- Add tests for run_api_server_docker_container microbatch #1247
- Add integration test for LightGBM #1243
- Update Yatai web ui node dependencies version #1256
- Add integration test for bento management #1263
- Add yatai server integration tests to Github CI #1265
- Update e2e yatai service tests #1266
- Include additional information for EC2 test #1270
- Refactor CI for TensorFlow2 #1277
- Make tensorflow integration tests run faster #1278
- Fix overridden protobuf version in CI #1286
- Add integration test for tf1 #1285
- Refactor yatai service integration test #1290
- Refactor Saved Bundle Loader #1291
- Fix flaky yatai service integration tests #1298
- Refine KerasModelArtifact & its integration test #1295
- Improve API server integration tests #1299
- Add integration tests for ragged_tensor #1303
Announcements
- We have started using Github Projects feature to track roadmap items for BentoML, you can find it here: https://github.com/bentoml/BentoML/projects/1
- We are hiring senior engineers and a lead developer advocate to join our team, let us know if you or someone you know might be interested 👉 contact@bentoml.ai
- Apologies for the long wait between the 0.9 and 0.10 releases; we are getting back to our bi-weekly release schedule now! We need help with documenting new features, writing release notes, and QA-ing new releases before they go out. Let us know if you'd be interested in helping out!
Thank you everyone for contributing to this release! @j-hartshorn @withsmilo @yubozhao @bojiang @changhw01 @mayurnewase @telescopic @jackyzha0 @pncnmnp @kishore-ganesh @rhbian @liusy182 @awalvie @cathy-kim @jsemric 🎉🎉🎉
BentoML-0.9.2
BentoML-0.9.1
A minor release with a bug fix
0.9.1 fixed an issue where the API server failed to start in a docker container when the requirements_txt_file parameter was used in the @env definition. See more details in #1153.
BentoML-0.9.0
What's New
TLDR;
- New input/output adapter design that lets users choose between batch or non-batch implementation
- Speed up the API model server docker image build time
- Changed the recommended import path of artifact classes, now artifact classes should be imported from bentoml.frameworks.*
- Improved python pip package management
- Huggingface/Transformers support!!
- Manage packaged models with Labels API
- Support GCS(Google Cloud Storage) as model storage backend in YataiService
- Current Roadmap for feedback: #1128
New Input/Output adapter design
A massive refactoring of BentoML's inference API and a redesign of the input/output adapters, led by @bojiang with help from @akainth015.
BREAKING CHANGE: API definition now requires declaring if it is a batch API or non-batch API:
from typing import List
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
from bentoml.types import JsonSerializable # type annotations are optional

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('classifier')])
class MyPredictionService(BentoService):

    @api(input=JsonInput(), batch=True)
    def predict_batch(self, parsed_json_list: List[JsonSerializable]):
        results = self.artifacts.classifier.predict([j['text'] for j in parsed_json_list])
        return results

    @api(input=JsonInput()) # default batch=False
    def predict_non_batch(self, parsed_json: JsonSerializable):
        results = self.artifacts.classifier.predict([parsed_json['text']])
        return results[0]
For APIs with batch=True, the user-defined API function is required to process a list of input items at a time and return a list of results of the same length. In contrast, @api by default uses batch=False, which processes one input item at a time. Implementing a batch API allows your workload to benefit from BentoML's adaptive micro-batching mechanism when serving online traffic, and will also speed up offline batch inference jobs. We recommend using batch=True if performance and throughput are a concern. Non-batch APIs are usually easier to implement, and are good for quick POCs, simple use cases, and deploying on serverless platforms such as AWS Lambda, Azure Functions, and Google Knative.
Read more about this change and example usage here: https://docs.bentoml.org/en/latest/api/adapters.html
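To make the batch/non-batch distinction concrete, here is a deliberately simplified sketch of the idea behind micro-batching: individual requests are grouped, the batch function is called once per group, and each caller gets back its own result. It is not BentoML's Marshal implementation (which also takes latency into account); predict_batch, serve_with_micro_batching, and max_batch_size are names invented for this example.

```python
from typing import Callable, List


def predict_batch(texts: List[str]) -> List[int]:
    """Stand-in for a batch=True API function: one result per input."""
    return [len(text) for text in texts]


def serve_with_micro_batching(
    requests: List[str],
    batch_fn: Callable[[List[str]], List[int]],
    max_batch_size: int = 3,
) -> List[int]:
    """Group pending requests into batches and call batch_fn once per batch."""
    responses: List[int] = []
    for start in range(0, len(requests), max_batch_size):
        batch = requests[start:start + max_batch_size]
        results = batch_fn(batch)
        # A batch API must return exactly one result per input item.
        if len(results) != len(batch):
            raise ValueError("batch function returned a different number of results")
        responses.extend(results)
    return responses


pending = ["a", "bb", "ccc", "dddd"]
print(serve_with_micro_batching(pending, predict_batch))
```

With four pending requests and a batch size of three, the batch function is invoked twice instead of four times, which is where the throughput benefit comes from when the model call has a high fixed cost.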
BREAKING CHANGE: For DataframeInput and TfTensorInput users, it is now required to add batch=True. DataframeInput and TfTensorInput are special input types that only support accepting a batch of input at one time.
Input data validation while handling batch input
When the API function receives a list of inputs, it is now possible to reject a subset of the input data and return an error code to the client if the input data is invalid or malformed. Users can do this via the InferenceTask#discard API; here's an example:
from typing import List
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
from bentoml.types import JsonSerializable, InferenceTask # type annotations are optional

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('classifier')])
class MyPredictionService(BentoService):

    @api(input=JsonInput(), batch=True)
    def predict_batch(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]):
        model_input = []
        for json, task in zip(parsed_json_list, tasks):
            if "text" in json:
                model_input.append(json['text'])
            else:
                task.discard(http_status=400, err_msg="input json must contain `text` field")

        results = self.artifacts.classifier.predict(model_input)
        return results
The number of discarded tasks plus the length of the returned results array should equal the length of the input list; this allows BentoML to match the results back to the tasks that were not discarded.
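The matching rule can be illustrated with a small, self-contained sketch. The Task class and match_results function below are stand-ins written for this example, not BentoML's InferenceTask or its internal matching code; they only show why the counts have to add up.

```python
from typing import Any, List, Optional


class Task:
    """Minimal stand-in for an inference task that can be discarded."""

    def __init__(self) -> None:
        self.is_discarded = False
        self.error: Optional[str] = None

    def discard(self, http_status: int, err_msg: str) -> None:
        self.is_discarded = True
        self.error = f"{http_status}: {err_msg}"


def match_results(tasks: List[Task], results: List[Any]) -> List[Any]:
    """Pair each result with the next task that was not discarded."""
    kept = [task for task in tasks if not task.is_discarded]
    if len(kept) != len(results):
        raise ValueError("discarded tasks + results must equal the number of inputs")
    remaining = iter(results)
    return [task.error if task.is_discarded else next(remaining) for task in tasks]


inputs = [{"text": "good"}, {"oops": 1}, {"text": "fine"}]
tasks = [Task() for _ in inputs]
model_input = []
for item, task in zip(inputs, tasks):
    if "text" in item:
        model_input.append(item["text"])
    else:
        task.discard(http_status=400, err_msg="input json must contain `text` field")

results = [text.upper() for text in model_input]  # stand-in for the model call
print(match_results(tasks, results))
```

If the API function returned a result for a discarded input as well, the lengths would no longer add up and the sketch raises an error rather than silently misaligning responses.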
Allow fine-grained control of the HTTP response, CLI inference job output, etc. E.g.:
import bentoml
from bentoml.adapters import JsonInput
from bentoml.types import JsonSerializable, InferenceTask, InferenceError, InferenceResult # type annotations are optional

class MyService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=False)
    def predict(self, parsed_json: JsonSerializable, task: InferenceTask) -> InferenceResult:
        if task.http_headers['Accept'] == "application/json":
            predictions = self.artifacts.model.predict([parsed_json])
            return InferenceResult(
                data=predictions[0],
                http_status=200,
                http_headers={"Content-Type": "application/json"},
            )
        else:
            return InferenceError(err_msg="application/json output only", http_status=400)
Or when batch=True:
from typing import List
import bentoml
from bentoml.adapters import JsonInput
from bentoml.types import JsonSerializable, InferenceTask, InferenceError, InferenceResult # type annotations are optional

class MyService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=True)
    def predict(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]) -> List[InferenceResult]:
        rv = []
        predictions = self.artifacts.model.predict(parsed_json_list)
        for task, prediction in zip(tasks, predictions):
            if task.http_headers['Accept'] == "application/json":
                rv.append(
                    InferenceResult(
                        data=prediction,
                        http_status=200,
                        http_headers={"Content-Type": "application/json"},
                    ))
            else:
                rv.append(InferenceError(err_msg="application/json output only", http_status=400))
                # or task.discard(err_msg="application/json output only", http_status=400)
        return rv
Other adapter changes:
- Added 3 base adapters for implementing advanced adapters: FileInput, StringInput, MultiFileInput
- Implementing new adapters that support micro-batching is a lot easier now: https://github.com/bentoml/BentoML/blob/v0.9.0.pre/bentoml/adapters/base_input.py
- Per inference task prediction log #1089
- More adapters now support launching batch inference jobs from the BentoML CLI run command, see API reference for detailed examples: https://docs.bentoml.org/en/latest/api/adapters.html
Docker Build Improvements
- Optimize docker image build time (#1081) kudos to @ZeyadYasser!!
- Per python minor version base image to speed up image building #1101 #1096, thanks @gregd33!!
- Add "latest" tag to all user-facing docker base images (#1046)
Improved pip package management
Setting pip install options in BentoService @env specification
As suggested here: #1036 (comment). Thanks @danield137 for suggesting the pip_extra_index_url option!
@env(
    auto_pip_dependencies=True,
    pip_index_url='my_pypi_host_url',
    pip_trusted_host='my_pypi_host_url',
    pip_extra_index_url='extra_pypi_index_url'
)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):
    ...
BREAKING CHANGE: Due to this change, we have removed the previous docker build args PIP_INDEX_URL and PIP_TRUSTED_HOST, because they may conflict with settings in the base image #1036
- Support passing a conda environment.yml file to @env, as suggested in #725
- When a version is not specified in the pip_packages list, it is pinned to the version found in the current python session. The same now applies to packages added from adapter and artifact classes
- Support specifying package requirement ranges now, e.g.:
@env(pip_packages=["abc==1.3", "foo>1.2,<=1.4"])
It can be any pip version requirement specifier https://pip.pypa.io/en/stable/reference/pip_install/#requirement-specifiers
- Renamed pip_dependencies to pip_packages and auto_pip_dependencies to infer_pip_packages; the old API still works but will eventually be deprecated.
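The version-pinning behavior described in the list above can be sketched with the standard library. This is an illustration of the idea only, not BentoML's implementation: pin_packages and its installed_version parameter are hypothetical, and the specifier check is a simple character test rather than a full requirement parser.

```python
from importlib import metadata
from typing import Callable, List


def pin_packages(
    packages: List[str],
    installed_version: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Append ==<installed version> to requirements that carry no specifier."""
    pinned = []
    for requirement in packages:
        has_specifier = any(ch in requirement for ch in "=<>!~@")
        if has_specifier:
            pinned.append(requirement)  # keep user-provided versions and URLs as-is
            continue
        try:
            pinned.append(f"{requirement}=={installed_version(requirement)}")
        except metadata.PackageNotFoundError:
            pinned.append(requirement)  # not installed here, leave it unpinned
    return pinned


# A fake lookup keeps the example independent of what is installed locally.
fake_versions = {"scikit-learn": "0.20.3"}
print(pin_packages(["scikit-learn", "foo>1.2,<=1.4"], fake_versions.__getitem__))
```

Called without the second argument, the function looks versions up in the current Python environment via importlib.metadata, which is the closest standard-library analogue of "the version found in the current python session".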
GCS support in YataiService
Added Google Cloud Storage (GCS) support in YataiService as a storage backend. This is an alternative to AWS S3, MinIO, or a POSIX file system. #1017 - Thank you @Korusuke @PrabhanshuAttri for creating the GCS support!
YataiService Labels API for model management
Manage packaged models in YataiService with the labels API implemented in #1064
- Add labels to BentoService.save
svc = MyBentoService()
svc.save(labels={'my_key': 'my_value', 'test': 'passed'})
- Add label query for CLI commands
  - bentoml get BENTO_NAME, bentoml list, bentoml deployment list, bentoml lambda list, bentoml sagemaker list, bentoml azure-functions list
  - label query supports the =, !=, In, NotIn, Exists, and DoesNotExist operators
    - e.g. key1=value1, key2!=value2, env In (prod, staging), Key Exists, Another_key DoesNotExist
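To show how such a label query reads, here is a small matcher for the operators listed above. It is a sketch written for these notes, not Yatai's actual parser: the matches and _split_terms functions are hypothetical, and the treatment of missing keys (a missing key satisfies != and NotIn) is an assumption borrowed from Kubernetes-style label selectors.

```python
import re
from typing import Dict, List


def _split_terms(selector: str) -> List[str]:
    """Split on commas that are not inside parentheses."""
    terms, depth, current = [], 0, ""
    for ch in selector:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            terms.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        terms.append(current.strip())
    return terms


def matches(labels: Dict[str, str], selector: str) -> bool:
    """Return True when labels satisfy every comma-separated term."""
    for term in _split_terms(selector):
        set_match = re.fullmatch(r"(\S+)\s+(In|NotIn)\s*\((.*)\)", term)
        if set_match:
            key, op, raw = set_match.groups()
            values = {v.strip() for v in raw.split(",")}
            ok = (labels.get(key) in values) if op == "In" else (labels.get(key) not in values)
        elif term.endswith(" DoesNotExist"):
            ok = term.split()[0] not in labels
        elif term.endswith(" Exists"):
            ok = term.split()[0] in labels
        elif "!=" in term:
            key, value = (part.strip() for part in term.split("!=", 1))
            ok = labels.get(key) != value
        elif "=" in term:
            key, value = (part.strip() for part in term.split("=", 1))
            ok = labels.get(key) == value
        else:
            raise ValueError(f"unsupported selector term: {term!r}")
        if not ok:
            return False
    return True


labels = {"env": "prod", "key1": "value1"}
query = "key1=value1, key2!=value2, env In (prod, staging), key1 Exists, other DoesNotExist"
print(matches(labels, query))
```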
Simple key/value label selector
Use In operator
BentoML-0.8.6
What's New
Yatai service helm chart for Kubernetes deployment #945 @jackyzha0
Helm chart offers a convenient way to deploy YataiService to a Kubernetes cluster
# Download BentoML source
$ git clone https://github.com/bentoml/BentoML.git
$ cd BentoML
# 1. Install an ingress controller if your cluster doesn't already have one, Yatai helm chart installs nginx-ingress by default:
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx && helm dependencies build helm/YataiService
# 2. Install YataiService helm chart to the Kubernetes cluster:
$ helm install -f helm/YataiService/values/postgres.yaml yatai-service YataiService
# 3. To uninstall the YataiService from your cluster:
$ helm uninstall yatai-service
@jackyzha0 added a great tutorial about YataiService helm chart deployment. You can find the guide at https://docs.bentoml.org/en/latest/guides/helm.html
[Experimental] AnnotatedImageInput adapter for image plus additional JSON data #973 @ecrows
The AnnotatedImageInput adapter is designed for the common use-cases of image input to include additional information such as object detection bounding boxes, segmentation masks, etc. for prediction. This new adapter significantly improves the developer experience over the previous workaround solution.
Warning: Input adapters are currently under refactoring #1002, we may change the API for AnnotatedImageInput in future releases.
from bentoml.adapters import AnnotatedImageInput
from bentoml.artifact import TensorflowSavedModelArtifact
import bentoml

CLASS_NAMES = ['cat', 'dog']

@bentoml.artifacts([TensorflowSavedModelArtifact('classifier')])
class PetClassification(bentoml.BentoService):
    @bentoml.api(input=AnnotatedImageInput())
    def predict(self, image, annotations):
        # some_pet_finder is a placeholder for user code that crops the image using the annotations
        cropped_pets = some_pet_finder(image, annotations)
        results = self.artifacts.classifier.predict(cropped_pets)
        return [CLASS_NAMES[r] for r in results]
Making a request using curl
$ curl -F image=@image.png -F annotations=@annotations.json http://localhost:5000/predict
You can find the current API reference at https://docs.bentoml.org/en/latest/api/adapters.html#annotatedimageinput
Improvements:
- #992 Make the prediction and feedback loggers log to console by default - @jackyzha0
- #952 Add tutorial for deploying BentoService to Azure SQL server to the documentation @yashika51
Bug Fixes:
- #987 & #991 Better AWS IAM roles handles for Sagemaker Deployment - @dinakar29
- #995 Fix an edge case for encountering RecursionError when running gunicorn server with --enable-microbatch on MacOS @bojiang
- #1012 Fix ruamel.yaml missing issue when using containerized BentoService with Conda. @parano
Internal & Testing:
- #983 Move CI tests to Github Actions
Contributors:
Thank you, everyone, for contributing to this exciting release!
@bojiang @jackyzha0 @ecrows @dinakar29 @yashika51 @akainth015