Releases: bentoml/BentoML
BentoML - v1.0.8
🍱 BentoML v1.0.8 is released with a list of improvements we hope you'll find useful.
-
Introduced Bento Client for easy access to the BentoML service over HTTP. Both sync and async calls are supported. See the Bento Client Guide for more details.
```python
import numpy as np

from bentoml.client import Client

client = Client.from_url("http://localhost:3000")

# Sync call
response = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))

# Async call (from within an async function)
response = await client.async_classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
```
-
Introduced custom metrics support for easy instrumentation of custom metrics over Prometheus. See Metrics Guide for more details.
```python
# Histogram metric
inference_duration = bentoml.metrics.Histogram(
    name="inference_duration",
    documentation="Duration of inference",
    labelnames=["nltk_version", "sentiment_cls"],
)

# Counter metric
polarity_counter = bentoml.metrics.Counter(
    name="polarity_total",
    documentation="Count total number of analysis by polarity scores",
    labelnames=["polarity"],
)
```
Full Prometheus style syntax is supported for instrumenting custom metrics inside API and Runner definitions.
```python
# Histogram
inference_duration.labels(
    nltk_version=nltk.__version__, sentiment_cls=self.sia.__class__.__name__
).observe(time.perf_counter() - start)

# Counter
polarity_counter.labels(polarity=is_positive).inc()
```
-
Improved health checking to also cover the status of runners to avoid returning a healthy status before runners are ready.
-
Added SSL/TLS support to gRPC serving.
```bash
bentoml serve-grpc --ssl-certfile=credentials/cert.pem --ssl-keyfile=credentials/key.pem --production --enable-reflection
```
-
Added channelz support for easier debugging of gRPC serving.
-
Allowed nested requirements with the `-r` syntax.

```
# requirements.txt
-r nested/requirements.txt
pydantic
Pillow
fastapi
```
-
Improved the adaptive batching dispatcher auto-tuning ability to avoid sporadic request failures due to batching at the beginning of the runner lifecycle.
-
Fixed a bug where runners would raise a `TypeError` when overloaded. Now an `HTTP 503 Service Unavailable` is returned when a runner is overloaded.

```
File "python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 188, in async_run_method
  return tuple(AutoContainer.from_payload(payload) for payload in payloads)
TypeError: 'Response' object is not iterable
```
💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.
- Check out the updated PyTorch Framework Guide on how to use `external_modules` to save classes or utility functions required by the model.
- See the Metrics Guide on how to add custom metrics to your API and custom Runners.
- Learn more about how to use the Bento Client to call your BentoML service with Python easily.
- Check out the latest blog post on why model serving over gRPC matters to data scientists.
🥂 We’d like to thank the community for your continued support and engagement.
- Shout out to @judahrand for multiple contributions to BentoML and bentoctl.
- Shout out to @phildamore-phdata, @quandollar, @2JooYeon, and @fortunto2 for their first contribution to BentoML.
BentoML - v1.0.7
🍱 BentoML released v1.0.7 as a patch to quickly fix a critical module import issue introduced in v1.0.6. The import error manifests in the import of any modules under `io.*` or `models.*`. The following is an example of a typical error message and traceback. Please upgrade to v1.0.7 to address this import issue.
```
packages/anyio/_backends/_asyncio.py", line 21, in <module>
    from io import IOBase
ImportError: cannot import name 'IOBase' from 'bentoml.io'
```
What's Changed
- test(grpc): e2e + unit tests by @aarnphm in #2984
- feat: support multipart upload for large bento and model by @yetone in #3044
- fix(config): respect `api_server.workers` by @judahrand in #3049
- chore(lint): remove unused import by @aarnphm in #3051
- fix(import): namespace collision by @aarnphm in #3058
New Contributors
- @judahrand made their first contribution in #3049
Full Changelog: v1.0.6...v1.0.7
BentoML - v1.0.6
🍱 BentoML has just released v1.0.6 featuring the gRPC preview! Without changing a line of code, you can now serve your Bentos as a gRPC service. Similar to serving over HTTP, BentoML gRPC supports all the ML frameworks, observability features, adaptive batching, and more out-of-the-box, simply by calling the `serve-grpc` CLI command.
```bash
> pip install "bentoml[grpc]"
> bentoml serve-grpc iris_classifier:latest --production
```
- Check out our updated tutorial for a quick 10-minute crash course of BentoML gRPC.
- Review the standardized Protobuf definition of service APIs and IO types, NDArray, DataFrame, File/Image, JSON, etc.
- Learn more about multi-language client support (Python, Go, Java, Node.js, etc) with working examples.
- Customize gRPC service by mounting new servicers and interceptors.
gRPC serving is currently in preview; the feature is expected to stabilize in the v1.1.0 minor version release.
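For reference, here is a minimal Python client sketch against the preview gRPC API, following the pattern in the gRPC tutorial; the `classify` endpoint name and the input shape are illustrative assumptions, not part of this release's notes.

```python
import grpc

from bentoml.grpc.v1alpha1 import service_pb2 as pb
from bentoml.grpc.v1alpha1 import service_pb2_grpc as services

with grpc.insecure_channel("localhost:3000") as channel:
    stub = services.BentoServiceStub(channel)
    # Call the service's "classify" API with a 1x4 float NDArray payload.
    response = stub.Call(
        pb.Request(
            api_name="classify",
            ndarray=pb.NDArray(
                dtype=pb.NDArray.DTYPE_FLOAT,
                shape=[1, 4],
                float_values=[5.9, 3.0, 5.1, 1.8],
            ),
        )
    )
    print(response)
```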
- Enhanced access logging format to output Trace and Span IDs in the more standard hex encoding by default.
- Added request total, duration, and in-progress metrics to Runners, in addition to API Servers.
- Added support for XGBoost SKLearn models.
- Added support for restricting image mime types in the Image IO descriptor.
🥂 We’d like to thank our community for their contribution and support.
- Shout out to @benjamintanweihao for fixing a BentoML CLI bug.
- Shout out to @lsh918 for fixing a PyTorch framework issue.
- Shout out to @jeffthebear for enhancing the Pandas DataFrame OpenAPI schema.
- Shout out to @jiewpeng for adding the support for customizing access logs with Trace and Span ID formats.
What's Changed
- fix: log runner errors explicitly by @ssheng in #2952
- ci: temp fix for models test by @sauyon in #2949
- fix: fix context parameter for multi-input IO descriptors by @sauyon in #2948
- fix: use `torch.from_numpy()` instead of `torch.Tensor()` to keep data type by @lsh918 in #2951
- docs: fix wrong name for example neural net by @ssun-g in #2959
- docs: fix bentoml containerize command help message by @aarnphm in #2957
- chore(cli): remove unused `--no-trunc` by @benjamintanweihao in #2965
- fix: relax regex for setting environment variables by @benjamintanweihao in #2964
- docs: update wrong paths for disabling logs by @creativedutchmen in #2974
- feat: track serve update for start subcommands by @ssheng in #2976
- feat: logging customization by @jiewpeng in #2961
- chore(cli): using quotes instead of backslash by @sauyon in #2981
- feat(cli): show full tracebacks in debug mode by @sauyon in #2982
- feature(runner): add multiple output support by @larme in #2912
- docs: add airflow integration page by @parano in #2990
- chore(ci): fix the unit test of transformers by @bojiang in #3003
- chore(ci): fix the issue caused by the change of check_task by @bojiang in #3004
- fix(multipart): support multipart file inputs to non-file descriptors by @sauyon in #3005
- feat(server): add runner metrics; refactoring batch size metrics by @bojiang in #2977
- EXPERIMENTAL: gRPC support by @aarnphm in #2808
- fix(runner): receive requests before cork by @bojiang in #2996
- fix(server): service_name label of runner metrics by @bojiang in #3008
- chore(misc): remove mentioned for team member from PR request by @aarnphm in #3009
- feat(xgboost): support xgboost sklearn models by @sauyon in #2997
- feat(io/image): allow restricting mime types by @sauyon in #2999
- fix(grpc): docker message by @aarnphm in #3012
- fix: broken legacy metrics by @aarnphm in #3019
- fix(e2e): exception test for image IO by @aarnphm in #3017
- revert(3017): filter write-only mime type for Image IO by @bojiang in #3020
- chore: cleanup containerize utils by @aarnphm in #3014
- feat(proto): add `serialized_bytes` to `pb.Part` by @aarnphm in #3022
- docs: Update README.md by @parano in #3023
- chore(grpc): vcs generated stubs by @aarnphm in #3016
- feat(io/image): allow writeable mimes as output by @sauyon in #3024
- docs: fix descriptor typo by @darioarias in #3027
- fix(server): log localhost instead of 0.0.0.0 by @sauyon in #3033
- fix(io): Pandas OpenAPI schema by @jeffthebear in #3032
- chore(docker): support more cuda versions by @larme in #3035
- docs: updates on blocks that failed to render by @aarnphm in #3031
- chore: migrate to pyproject.toml by @aarnphm in #3025
- docs: gRPC tutorial by @aarnphm in #3013
- docs: gRPC advanced guides by @aarnphm in #3034
- feat(configuration): override options with envvar by @bojiang in #3018
- chore: update links by @aarnphm in #3040
- fix(configuration): should validate config early by @aarnphm in #3041
- qa(bentos): update latest options by @aarnphm in #3042
- qa: ignore tools from distribution by @aarnphm in #3045
- dependencies: ignore broken pypi combination by @aarnphm in #3043
- feat: gRPC tracking by @aarnphm in #3015
- configuration: migrate schema to `api_server` by @ssheng in #3046
- qa: cleanup MLflow by @aarnphm in #2945
New Contributors
- @lsh918 made their first contribution in #2951
- @ssun-g made their first contribution in #2959
- @benjamintanweihao made their first contribution in #2965
- @creativedutchmen made their first contribution in #2974
- @darioarias made their first contribution in #3027
- @jeffthebear made their first contribution in #3032
Full Changelog: v1.0.5...v1.0.6
BentoML - v1.0.5
🍱 BentoML v1.0.5 is released as a quick fix to a Yatai incompatibility introduced in v1.0.4.
- The incompatibility manifests in the following error message when deploying a bento on Yatai. Upgrading BentoML to v1.0.5 will resolve the issue.

```
Error while finding module specification for 'bentoml._internal.server.cli.api_server' (ModuleNotFoundError: No module named 'bentoml._internal.server.cli')
```

- The incompatibility resides in all Yatai versions prior to v1.0.0-alpha.*. Alternatively, upgrading Yatai to v1.0.0-alpha.* will also restore compatibility with bentos built in v1.0.4.
BentoML - v1.0.4
🍱 BentoML v1.0.4 is here!
-
Added support for explicit GPU mapping for runners. In addition to specifying the number of GPU devices allocated to a runner, we can map a list of device IDs directly to a runner through configuration.
```yaml
runners:
  iris_clf_1:
    resources:
      nvidia.com/gpu: [2, 4] # Map device 2 and 4 to iris_clf_1 runner
  iris_clf_2:
    resources:
      nvidia.com/gpu: [1, 3] # Map device 1 and 3 to iris_clf_2 runner
```
-
Added SSL support for API server through both CLI and configuration.
```
--ssl-certfile TEXT          SSL certificate file
--ssl-keyfile TEXT           SSL key file
--ssl-keyfile-password TEXT  SSL keyfile password
--ssl-version INTEGER        SSL version to use (see stdlib 'ssl' module)
--ssl-cert-reqs INTEGER      Whether client certificate is required (see stdlib 'ssl' module)
--ssl-ca-certs TEXT          CA certificates file
--ssl-ciphers TEXT           Ciphers to use (see stdlib 'ssl' module)
```
-
Added adaptive batching size histogram metrics, `BENTOML_{runner}_{method}_adaptive_batch_size_bucket`, for observability of batching mechanism details.
-
Added support for the OpenTelemetry OTLP exporter for tracing. The OpenTelemetry resource is now configured automatically if the user has not explicitly configured it through environment variables. Upgraded OpenTelemetry Python packages to version `0.33b0`.
-
Added support for saving `external_modules` alongside models in the `save_model` API. Saving external Python modules is useful for models with external dependencies, such as tokenizers, preprocessors, and configurations (see the sketch after this list).
-
Enhanced Swagger UI to include additional documentation and helper links.
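As an example of the `external_modules` option above, here is a minimal sketch of saving a model together with a local helper module it needs at load time; `tokenizer_utils` and its `Tokenizer` class are hypothetical names.

```python
import bentoml

import tokenizer_utils  # hypothetical local module the model depends on

# A picklable model object that references the external module.
model = {"tokenizer": tokenizer_utils.Tokenizer(), "weights": [0.1, 0.2]}

bentoml.picklable_model.save_model(
    "sentiment_model",
    model,
    external_modules=[tokenizer_utils],
)
```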
💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.
- Check out the adaptive batching documentation on how to leverage batching to improve inference latency and efficiency.
- Check out the runner configuration documentation on how to customize resource allocation for runners at run time.
🙌 We continue to receive great engagement and support from the BentoML community.
- Shout out to @sptowey for their contribution on adding SSL support.
- Shout out to @dbuades for their contribution on adding the OTLP exporter.
- Shout out to @tweeklab for their contribution on fixing a bug on `import_model` in the MLflow framework.
What's Changed
- refactor: cli to `bentoml_cli` by @sauyon in #2880
- chore: remove typing-extensions dependency by @sauyon in #2879
- fix: remove chmod install scripts by @aarnphm in #2830
- fix: relative imports to lazy by @aarnphm in #2882
- fix(cli): click utilities imports by @aarnphm in #2883
- docs: add custom model runner example by @parano in #2885
- qa: analytics unit tests by @aarnphm in #2878
- chore: script for releasing quickstart bento by @parano in #2892
- fix: pushing models from Bento instead of local modelstore by @parano in #2887
- fix(containerize): supports passing multiple tags by @aarnphm in #2872
- feat: explicit GPU runner mappings by @jjmachan in #2862
- fix: setuptools doesn't include `bentoml_cli` by @bojiang in #2898
- feat: Add SSL support for http api servers via bentoml serve by @sptowey in #2886
- patch: ssl styling and default value check by @aarnphm in #2899
- fix(scheduling): raise an error for invalid resources by @bojiang in #2894
- chore(templates): cleanup debian dependency logic by @aarnphm in #2904
- fix(ci): unittest failed by @bojiang in #2908
- chore(cli): add figlet for CLI by @aarnphm in #2909
- feat: codespace by @aarnphm in #2907
- feat: use yatai proxy to upload/download bentos/models by @yetone in #2832
- fix(scheduling): numpy worker environs are not taking effect by @bojiang in #2893
- feat: Adaptive batching size histogram metrics by @ssheng in #2902
- chore(swagger): include help links by @parano in #2927
- feat(tracing): add support for otlp exporter by @dbuades in #2918
- chore: Lock OpenTelemetry versions and add tracing metadata by @ssheng in #2928
- revert: unminify CSS by @aarnphm in #2931
- fix: importing mlflow:/ urls with no extra path info by @tweeklab in #2930
- fix(yatai): make presigned_urls_deprecated optional by @bojiang in #2933
- feat: add timeout option for bentoml runner config by @jjmachan in #2890
- perf(cli): speed up by @aarnphm in #2934
- chore: remove multipart IO descriptor warning by @ssheng in #2936
- fix(json): revert eager check by @aarnphm in #2926
- chore: remove `--config` flag to load the bentoml runtime config by @jjmachan in #2939
- chore: update README messaging by @ssheng in #2937
- fix: use a temporary file for file uploads by @sauyon in #2929
- feat(cli): add CLI command to serve a runner by @bojiang in #2920
- docs: Runner configuration for batching and resource allocation by @ssheng in #2941
- bug: handle bad image file by @parano in #2942
- chore(docs): earlier check for buildx by @aarnphm in #2940
- fix(cli): helper message default values by @ssheng in #2943
- feat(sdk): add external_modules option to save_model by @bojiang in #2895
- fix(cli): component name regression by @ssheng in #2944
New Contributors
- @sptowey made their first contribution in #2886
- @dbuades made their first contribution in #2918
- @tweeklab made their first contribution in #2930
Full Changelog: v1.0.3...v1.0.4
BentoML - v1.0.3
🍱 The BentoML v1.0.3 release brings a list of performance and feature improvements.
-
Improved Runner IO performance by enhancing the underlying serialization and deserialization, especially in models with large input and output sizes. Our image input benchmark showed a 100% throughput improvement.
-
Added support for specifying URLs to exclude from tracing.
🙌 We continue to receive great engagement and support from the BentoML community.
- Shout out to Ben Kessler for helping benchmark performance.
- Shout out to Jiew Peng Lim for adding the support for configuring URLs to exclude from tracing.
- Shout out to Susana Bouchardet for adding support for the JSON IO Descriptor to return an empty response body.
- Thanks to Keming and mplk for contributing their first PRs in BentoML.
What's Changed
- chore(deps): bump actions/setup-node from 2 to 3 by @dependabot in #2846
- fix: extend --cache-from consumption to python tuple by @anwang2009 in #2847
- feat: add support for excluding urls from tracing by @jiewpeng in #2843
- docs: update notice about buildkit by @aarnphm in #2837
- chore: add CODEOWNERS by @aarnphm in #2842
- doc(frameworks): tensorflow by @bojiang in #2718
- feat: add support for specifying urls to exclude from tracing as a list by @jiewpeng in #2851
- fix(configuration): merging global runner config to runner specific config by @jjmachan in #2849
- fix: Setting status code and cookies by @ssheng in #2854
- chore: README typo by @kemingy in #2859
- chore: gallery links to `bentoml/examples` by @aarnphm in #2858
- fix(runner): use pickle instead for multi payload parameters by @aarnphm in #2857
- doc(framework): pytorch guide by @bojiang in #2735
- docs: add missing `output` to Runner docs by @mplk in #2868
- chore: fix push and load interop by @aarnphm in #2863
- fix: Usage stats by @ssheng in #2876
- fix: `JSON(IODescriptor[JSONType]).to_http_response` returns empty body when the response is `None` by @sbouchardet in #2874
- chore: Address comments in the #2874 by @ssheng in #2877
- fix: debugger breaks on circus process by @aarnphm in #2875
- feat: support custom components for OpenAPI generation by @aarnphm in #2845
New Contributors
- @anwang2009 made their first contribution in #2847
- @jiewpeng made their first contribution in #2843
- @kemingy made their first contribution in #2859
- @mplk made their first contribution in #2868
- @sbouchardet made their first contribution in #2874
Full Changelog: v1.0.2...v1.0.3
BentoML - v1.0.2
🍱 We have just released BentoML v1.0.2 with a number of features and bug fixes requested by the community.
- Added support for custom model versions, e.g. `bentoml.tensorflow.save_model("model_name:1.2.4", model)`.
- Fixed a PyTorch Runner payload serialization issue due to the tensor not being on the CPU.

```
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first
```

- Fixed Transformers GPU device assignment due to kwargs handling.
- Fixed an excessive Runner thread spawning issue under high load.
- Fixed a PyTorch Runner inference error due to saving a tensor during inference mode.

```
RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
```

- Fixed a Keras Runner error when the input has only a single element.
- Deprecated the `validate_json` option in the JSON IO descriptor; specifying validation logic natively in the Pydantic model is recommended instead (see the sketch below).
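A minimal sketch of the recommended Pydantic-based approach; the `IrisFeatures` schema and service name are hypothetical.

```python
from pydantic import BaseModel

import bentoml
from bentoml.io import JSON

class IrisFeatures(BaseModel):  # hypothetical input schema
    sepal_len: float
    sepal_width: float
    petal_len: float
    petal_width: float

svc = bentoml.Service("iris_classifier")

# Validation now lives on the Pydantic model passed to the JSON descriptor.
@svc.api(input=JSON(pydantic_model=IrisFeatures), output=JSON())
def classify(features: IrisFeatures) -> dict:
    return {"received": features.dict()}
```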
🎨 We added an examples directory, and in it you will find interesting sample projects demonstrating various applications of BentoML. We welcome your contribution if you have a project idea you would like to share with the community.
💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.
- Did you know BentoML service supports mounting and calling runners from custom FastAPI and Flask apps? See the sketch after this list.
- Did you know IO descriptor supports input and output validation of schema, shape, and data types?
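As a quick illustration of the first point, here is a minimal sketch of mounting a FastAPI app onto a BentoML service; the route and service names are hypothetical.

```python
from fastapi import FastAPI

import bentoml

fastapi_app = FastAPI()

@fastapi_app.get("/metadata")
def metadata() -> dict:
    return {"name": "iris_classifier"}  # hypothetical route

svc = bentoml.Service("iris_classifier")

# Routes of the mounted app are served alongside the BentoML service APIs.
svc.mount_asgi_app(fastapi_app)
```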
What's Changed
- chore: remove all `--pre` from documentation by @aarnphm in #2738
- chore(framework): onnx guide minor improvements by @larme in #2744
- fix(framework): fix how pytorch DataContainer convert GPU tensor by @larme in #2739
- doc: add missing variable by @robsonpeixoto in #2752
- chore(deps): `cattrs>=22.1.0` in setup.cfg by @sugatoray in #2758
- fix(transformers): kwargs and migrate to framework tests by @ssheng in #2761
- chore: add type hint for run and async_run by @aarnphm in #2760
- docs: fix typo in SECURITY.md by @parano in #2766
- chore: use pypa/build as PEP517 backend by @aarnphm in #2680
- chore(e2e): capture log output by @aarnphm in #2767
- chore: more robust prometheus directory ensuring by @bojiang in #2526
- doc(framework): add scikit-learn section to ONNX documentation by @larme in #2764
- chore: clean up dependencies by @sauyon in #2769
- docs: misc docs reorganize and cleanups by @parano in #2768
- fix(io descriptors): finish removing init_http_response by @sauyon in #2774
- chore: fix typo by @aarnphm in #2776
- feat(model): allow custom model versions by @sauyon in #2775
- chore: add watchfiles as bentoml dependency by @aarnphm in #2777
- doc(framework): keras guide by @larme in #2741
- docs: Update service schema and validation by @ssheng in #2778
- doc(frameworks): fix pip package syntax by @larme in #2782
- fix(runner): thread limiter doesn't take effect by @bojiang in #2781
- feat: add additional env var configuring num of threads in Runner by @parano in #2786
- fix(templates): sharing variables at template level by @aarnphm in #2796
- bug: fix JSON io_descriptor validate_json option by @parano in #2803
- chore: improve error message when failed importing user service code by @parano in #2806
- chore: automatic cache action version update and remove stale bot by @aarnphm in #2798
- chore(deps): bump actions/checkout from 2 to 3 by @dependabot in #2810
- chore(deps): bump codecov/codecov-action from 2 to 3 by @dependabot in #2811
- chore(deps): bump github/codeql-action from 1 to 2 by @dependabot in #2813
- chore(deps): bump actions/cache from 2 to 3 by @dependabot in #2812
- chore(deps): bump actions/setup-python from 2 to 4 by @dependabot in #2814
- fix(datacontainer): pytorch to_payload should disable gradient by @aarnphm in #2821
- fix(framework): fix keras single input edge case by @larme in #2822
- fix(framework): keras GPU handling by @larme in #2824
- docs: update custom bentoserver guide by @parano in #2809
- fix(runner): bind limiter to runner_ref instead by @bojiang in #2826
- fix(pytorch): inference_mode context is thead local by @bojiang in #2828
- fix: address multiple tags for containerize by @aarnphm in #2797
- chore: Add gallery projects under examples by @ssheng in #2833
- chore: running formatter on examples folder by @aarnphm in #2834
- docs: update security auth middleware by @g0nz4rth in #2835
- fix(io_descriptor): DataFrame columns check by @alizia in #2836
- fix: examples directory structure by @ssheng in #2839
- revert: "fix: address multiple tags for containerize (#2797)" by @ssheng in #2840
New Contributors
- @robsonpeixoto made their first contribution in #2752
- @sugatoray made their first contribution in #2758
- @g0nz4rth made their first contribution in #2835
- @alizia made their first contribution in #2836
Full Changelog: v1.0.1...v1.0.2
BentoML - v1.0.0
🍱 The wait is over. BentoML has officially released v1.0.0. We are excited to share the notable feature improvements with you.
- Introduced BentoML Runner, an abstraction for parallel model inference. It allows the compute intensive model inference step to scale separately from the transformation and business logic. The Runner is easily instantiated and invoked, but behind the scenes, BentoML is optimizing for micro-batching and fanning out inference if needed. Here’s a simple example of instantiating a Runner. Learn more about using runners.
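A minimal sketch of that pattern, assuming a scikit-learn model was saved earlier as `iris_clf`; the tag and API names are illustrative.

```python
import bentoml
from bentoml.io import NumpyNdarray

# Create a runner from a saved model; BentoML schedules its inference
# in dedicated runner processes, micro-batching requests when enabled.
iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series):
    # Invoking the runner fans the call out to the runner processes.
    return iris_clf_runner.predict.run(input_series)
```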
- Redesigned how models are saved, moved, and loaded with BentoML. We introduced new primitives which allow users to call a save_model() method which saves the model in the most optimal way based on the recommended practices of the ML framework. The model is then stored in a flexible local repository where users can use “import” and “export” functionality to push and pull “finalized” models from remote locations like S3. Bentos can be built locally or remotely with these models. Once built, Yatai or bentoctl can easily deploy to the cloud service of your choice. Learn more about preparing models and building bentos.
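A sketch of the import/export flow under these primitives; the S3 path is illustrative.

```python
import bentoml

# Push a finalized model from the local model store to a remote location...
bentoml.models.export_model("iris_clf:latest", "s3://my-bucket/iris_clf.bentomodel")

# ...and pull it back into the local model store on another machine.
bentoml.models.import_model("s3://my-bucket/iris_clf.bentomodel")
```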
- Enhanced micro-batching capability: with the new runner abstraction, batching is even more powerful. When incoming data is spread across different transformation processes, the runner will fan in requests when inference is invoked, so multiple inputs are batched into a single inference call. Most ML frameworks implement some form of vectorization which improves performance for multiple inputs at once. Our adaptive batching not only batches inputs as they are received, but also regresses the time of the last several groups of inputs in order to optimize the batch size and latency windows.
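Batching behavior is declared per model signature at save time; a minimal sketch with a small scikit-learn classifier, using the standard `batchable`/`batch_dim` signature options.

```python
import bentoml
from sklearn import datasets, svm

# Train a small classifier so the example is self-contained.
iris = datasets.load_iris()
clf = svm.SVC().fit(iris.data, iris.target)

# Mark predict as batchable; batch_dim=0 batches inputs along the first axis.
bentoml.sklearn.save_model(
    "iris_clf",
    clf,
    signatures={"predict": {"batchable": True, "batch_dim": 0}},
)
```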
- Improved reproducibility of the model by recording and locking the dependent library versions. We use the versions to package the correct dependencies so that the environment in which the model runs in production is identical to the environment it was trained in. All direct and transitive dependencies are recorded and deployed with the model when running in production. In our 1.0 version we now support Conda as well as several different ways to customize your pip packages when “building your Bento”. Learn more about building bentos.
- Simplified Docker image creation during containerization to generate the right image for you depending on the features that you’ve decided to implement in your service. For example, if your runner specifies that it can run on a GPU, we will automatically choose the right Nvidia docker image as a base when containerizing your service. If needed, we also provide the flexibility to customize your docker image as well. Learn more about containerization.
- Improved input and output validation with native type validation rules. Numpy and Pandas DataFrame can specify a static shape or even dynamically infer schema by providing sample data. The Pydantic schema that is produced per endpoint also integrates with our Swagger UI so that each endpoint is better documented for sharing. Learn more about service APIs and IO Descriptors.
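For example, a sketch of the two validation styles on the NumPy descriptor; the shapes are illustrative.

```python
import numpy as np

from bentoml.io import NumpyNdarray

# Infer the schema dynamically from sample data...
inferred_spec = NumpyNdarray.from_sample(np.array([[4.9, 3.0, 1.4, 0.2]]))

# ...or pin the shape and dtype explicitly and enforce them on every request.
strict_spec = NumpyNdarray(
    dtype="float32",
    shape=(-1, 4),
    enforce_dtype=True,
    enforce_shape=True,
)
```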
If you wish to stay on the 0.13-LTS version, please lock the dependency with `bentoml==0.13.1`. We have also prepared a migration guide from v0.13.1 to v1.0.0 to help with your project migration. We are committed to supporting the v0.13-LTS versions with critical bug fixes and security patches.
🎉 After years of seeing hundreds of model serving use cases, we are proud to present the official release of BentoML 1.0. We could not have done it without the growth and support of our community.
BentoML - 1.0.0-rc3
We have just released BentoML 1.0.0rc3 with a number of highly anticipated features and improvements. Check it out with the following command!

```bash
$ pip install -U bentoml --pre
```
We plan to release the official 1.0.0 version next week and remove the need to use the --pre tag to install BentoML versions after 1.0.0. If you wish to stay on the 0.13.1 LTS version, please lock the dependency with `bentoml==0.13.1`.
- Added support for framework runners in the following ML frameworks: CatBoost, FastAI, and ONNX.
- Added support for Huggingface Transformers custom pipelines.
- Fixed a logging issue causing the api_server and runners to not generate error logs.
- Optimized Tensorflow inference procedure.
- Improved resource request configuration for runners.
-
Resource requests can now be configured in the BentoML configuration. If unspecified, runners will be scheduled to best utilize the available system resources.
```yaml
runners:
  resources:
    cpu: 8.0
    nvidia.com/gpu: 4.0
```
-
Updated the API for custom runners to declare the types of supported resources.
```python
import bentoml

class MyRunnable(bentoml.Runnable):
    SUPPORTS_CPU_MULTI_THREADING = True  # Deprecated SUPPORT_CPU_MULTI_THREADING
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")  # Deprecated SUPPORT_NVIDIA_GPU
    ...

my_runner = bentoml.Runner(
    MyRunnable,
    runnable_init_params={"foo": foo, "bar": bar},
    name="custom_runner_name",
    ...
)
```
-
Deprecated the API for specifying resources from the framework `to_runner()` and custom Runner APIs. For better flexibility at runtime, it is recommended to specify resources through configuration.
What's Changed
- fix(dependencies): require pyyaml>=5 by @sauyon in #2626
- refactor(server): merge contexts; add yatai headers by @bojiang in #2621
- chore(pylint): update pylint configuration by @sauyon in #2627
- fix: Transformers NVIDIA_VISIBLE_DEVICES value type casting by @ssheng in #2624
- fix: Server silently crash without logging exceptions by @ssheng in #2635
- fix(framework): some GPU related fixes by @larme in #2637
- tests: minor e2e test cleanup by @sauyon in #2643
- docs: Add model in bentoml.pytorch.save_model() pytorch integration example by @AlexandreNap in #2644
- chore(ci): always enable actions on PR by @sauyon in #2646
- chore: updates ci by @aarnphm in #2650
- fix(docker): templates bash heredoc should pass `-ex` by @aarnphm in #2651
- feat: CatBoost integration by @yetone in #2615
- feat: FastAI by @aarnphm in #2571
- feat: Support Transformers custom pipeline by @ssheng in #2640
- feat(framework): onnx support by @larme in #2629
- chore(tensorflow): optimize inference procedure by @bojiang in #2567
- fix(runner): validate runner names by @sauyon in #2588
- fix(runner): lowercase runner names and add tests by @sauyon in #2656
- style: github naming by @aarnphm in #2659
- tests(framework): add new framework tests by @sauyon in #2660
- docs: missing code annotation by @jjmachan in #2654
- perf(templates): cache python installation via conda by @aarnphm in #2662
- fix(ci): destroy the runner after init_local by @bojiang in #2665
- fix(conda): python installation order by @aarnphm in #2668
- fix(tensorflow): casting error on kwargs by @bojiang in #2664
- feat(runner): implement resource configuration by @sauyon in #2632
New Contributors
- @AlexandreNap made their first contribution in #2644
Full Changelog: v1.0.0-rc2...v1.0.0-rc3
BentoML - 1.0.0-rc2
We have just released BentoML 1.0.0rc2 with an exciting lineup of improvements. Check it out with the following command!
```bash
$ pip install -U bentoml --pre
```
-
Standardized logging configuration and improved logging performance.
- If imported as a library, BentoML will no longer configure logging explicitly and will respect the logging configuration of the importing Python process. To customize BentoML logging as a library, configurations can be added for the `bentoml` logger.

```yaml
formatters:
  ...
handlers:
  ...
loggers:
  ...
  bentoml:
    handlers: [...]
    level: INFO
    ...
```

- If started as a server, BentoML will continue to configure the logging format and output to `stdout` at the `INFO` level. All third-party libraries will be configured to log at the `WARNING` level.
-
Added LightGBM framework support.
-
Updated model and bento creation timestamps CLI display to use the local timezone for better use experience, while timestamps in metadata will remain in the UTC timezone.
-
Improved the reliability of bento build with advanced options including base_image and dockerfile_template.
Besides all the exciting product work, we also started a blog at modelserving.com sharing our learnings gained from building BentoML and supporting the MLOps community. Check out our latest blog post, Breaking up with Flask & FastAPI: Why ML model serving requires a specialized framework, and share your thoughts with us on our LinkedIn post.
Lastly, a big shoutout to @mqk for adding the LightGBM framework support. 🥂
What's Changed
- feat(cli): output times in the local timezone by @sauyon in #2572
- fix(store): use >= for time checking by @sauyon in #2574
- fix(build): use subprocess to call pip-compile by @sauyon in #2573
- docs: fix wrong variable name in comment by @kim-sardine in #2575
- feat: improve logging by @sauyon in #2568
- fix(service): JsonIO doesn't return a pydantic model by @bojiang in #2578
- fix: update conda env yaml file name and default channel by @parano in #2580
- chore(runner): add shcedule shortcuts to runners by @bojiang in #2576
- fix(cli): cli encoding error on Windows by @bojiang in #2579
- fix(bug): Make `model.with_options()` additive by @ssheng in #2519
- feat: dockerfile templates advanced guides by @aarnphm in #2548
- docs: add setuptools to docs dependencies by @parano in #2586
- test(frameworks): minor test improvements by @sauyon in #2590
- feat: Bring LightGBM back by @mqk in #2589
- fix(runner): pass init params to runnable by @sauyon in #2587
- fix: propagate should be false by @aarnphm in #2594
- fix: Remove starlette request log by @ssheng in #2595
- fix: Bug fix for 2596 by @timc in #2597
- chore(frameworks): update framework template with new checks and remove old framework code by @sauyon in #2592
- docs: Update streaming.rst by @ssheng in #2605
- bug: Fix Yatai client push bentos with model options by @ssheng in #2604
- docs: allow running tutorial from docker by @parano in #2611
- fix(model): lock attrs to >=21.1.0 by @bojiang in #2610
- docs: Fix documentation links and formats by @ssheng in #2612
- fix(model): load ModelOptions lazily by @sauyon in #2608
- feat: install.sh for python packages by @aarnphm in #2555
- fix/routing path by @aarnphm in #2606
- qa: build config by @aarnphm in #2581
- fix: invalid build option python_version="None" when base_image is used by @parano in #2623
New Contributors
- @kim-sardine made their first contribution in #2575
- @timc made their first contribution in #2597
Full Changelog: v1.0.0-rc1...v1.0.0-rc2