Releases: bentoml/BentoML

BentoML - v1.0.8

01 Nov 00:43
8365375

🍱 BentoML v1.0.8 is released with a list of improvements we hope you’ll find useful.

  • Introduced Bento Client for easy access to the BentoML service over HTTP. Both sync and async calls are supported. See the Bento Client Guide for more details.

    import numpy as np

    from bentoml.client import Client
    
    client = Client.from_url("http://localhost:3000")
    
    # Sync call
    response = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
    
    # Async call
    response = await client.async_classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
  • Introduced custom metrics support for easy instrumentation of custom metrics over Prometheus. See Metrics Guide for more details.

    # Histogram metric
    inference_duration = bentoml.metrics.Histogram(
        name="inference_duration",
        documentation="Duration of inference",
        labelnames=["nltk_version", "sentiment_cls"],
    )
    
    # Counter metric
    polarity_counter = bentoml.metrics.Counter(
        name="polarity_total",
        documentation="Count total number of analysis by polarity scores",
        labelnames=["polarity"],
    )

    Full Prometheus style syntax is supported for instrumenting custom metrics inside API and Runner definitions.

    # Histogram
    inference_duration.labels(
        nltk_version=nltk.__version__, sentiment_cls=self.sia.__class__.__name__
    ).observe(time.perf_counter() - start)
    
    # Counter
    polarity_counter.labels(polarity=is_positive).inc()
  • Improved health checking to also cover the status of runners to avoid returning a healthy status before runners are ready.

  • Added SSL/TLS support to gRPC serving.

    bentoml serve-grpc --ssl-certfile=credentials/cert.pem --ssl-keyfile=credentials/key.pem --production --enable-reflection
  • Added channelz support for easier debugging of gRPC serving.

  • Allowed nested requirements with the -r syntax.

    # requirements.txt
    -r nested/requirements.txt
    
    pydantic
    Pillow
    fastapi
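
    A bentofile.yaml can then reference the top-level requirements file as usual; a minimal sketch (the service name and file layout are illustrative):

    # bentofile.yaml
    service: "service:svc"
    python:
      requirements_txt: "./requirements.txt"  # may itself contain -r includes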
  • Improved the adaptive batching dispatcher’s auto-tuning to avoid sporadic request failures due to batching at the beginning of the runner lifecycle.

  • Fixed a bug where runners would raise a TypeError when overloaded. An HTTP 503 Service Unavailable is now returned when a runner is overloaded.

    File "python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 188, in async_run_method
        return tuple(AutoContainer.from_payload(payload) for payload in payloads)
    TypeError: 'Response' object is not iterable
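
    Since an overloaded runner now surfaces as an HTTP 503 from the API server, clients can retry with backoff. A minimal sketch using the requests library (endpoint path and payload are placeholders):

    import time

    import requests

    def classify_with_retry(payload, retries=3, backoff=0.5):
        """POST to a hypothetical /classify endpoint, retrying on 503."""
        for attempt in range(retries):
            resp = requests.post("http://localhost:3000/classify", json=payload)
            if resp.status_code != 503:  # not overloaded; fail fast on other errors
                resp.raise_for_status()
                return resp.json()
            time.sleep(backoff * 2**attempt)  # exponential backoff before retrying
        raise RuntimeError("service still overloaded after retries")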

💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

🥂 We’d like to thank the community for your continued support and engagement.

BentoML - v1.0.7

03 Oct 00:59
47e884f

🍱 BentoML released v1.0.7 as a patch to quickly fix a critical module import issue introduced in v1.0.6. The import error manifests when importing any modules under io.* or models.*. The following is an example of a typical error message and traceback. Please upgrade to v1.0.7 to address this import issue.

packages/anyio/_backends/_asyncio.py", line 21, in <module>
    from io import IOBase
ImportError: cannot import name 'IOBase' from 'bentoml.io'
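
To pick up the fix, upgrade in place:

> pip install -U bentoml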

Full Changelog: v1.0.6...v1.0.7

BentoML - v1.0.6

27 Sep 23:11
b3bd5a7

🍱 BentoML has just released v1.0.6 featuring the gRPC preview! Without changing a line of code, you can now serve your Bentos as a gRPC service. Similar to serving over HTTP, BentoML gRPC supports all the ML frameworks, observability features, adaptive batching, and more out-of-the-box, simply by calling the serve-grpc CLI command.

> pip install "bentoml[grpc]"
> bentoml serve-grpc iris_classifier:latest --production
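
If server reflection is enabled (see the --enable-reflection flag), a generic client such as grpcurl can discover the exposed services; a sketch assuming the default port:

> grpcurl -plaintext localhost:3000 list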

⚠️ gRPC is currently under preview. The public APIs may undergo incompatible changes in future patch releases until the official v1.1.0 minor release.

  • Enhanced access logging format to output Trace and Span IDs in the more standard hex encoding by default.
  • Added request total, duration, and in-progress metrics to Runners, in addition to API Servers.
  • Added support for XGBoost SKLearn models.
  • Added support for restricting image mime types in the Image IO descriptor.
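
    For example, an endpoint could accept only JPEG and PNG uploads; a sketch in which the keyword allowed_mime_types is an assumption based on the feature description:

    from bentoml.io import Image

    # Restrict uploads to JPEG/PNG (parameter name assumed; see lead-in)
    input_spec = Image(allowed_mime_types=["image/jpeg", "image/png"])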

🥂 We’d like to thank our community for their contribution and support.

  • Shout out to @benjamintanweihao for fixing a BentoML CLI bug.
  • Shout out to @lsh918 for fixing a PyTorch framework issue.
  • Shout out to @jeffthebear for enhancing the Pandas DataFrame OpenAPI schema.
  • Shout out to @jiewpeng for adding the support for customizing access logs with Trace and Span ID formats.

Full Changelog: v1.0.5...v1.0.6

BentoML - v1.0.5

30 Aug 20:18
03a2b31

🍱 BentoML v1.0.5 is released as a quick fix to a Yatai incompatibility introduced in v1.0.4.

  • The incompatibility manifests in the following error message when deploying a bento on Yatai. Upgrading BentoML to v1.0.5 will resolve the issue.
    Error while finding module specification for 'bentoml._internal.server.cli.api_server' (ModuleNotFoundError: No module named 'bentoml._internal.server.cli')
  • The incompatibility resides in all Yatai versions prior to v1.0.0-alpha.*. Alternatively, upgrading Yatai to v1.0.0-alpha.* will also restore compatibility with bentos built in v1.0.4.

BentoML - v1.0.4

26 Aug 18:10
6370972

🍱 BentoML v1.0.4 is here!

  • Added support for explicit GPU mapping for runners. In addition to specifying the number of GPU devices allocated to a runner, we can map a list of device IDs directly to a runner through configuration.

    runners:
      iris_clf_1:
        resources:
          nvidia.com/gpu: [2, 4] # Map device 2 and 4 to iris_clf_1 runner
      iris_clf_2:
        resources:
          nvidia.com/gpu: [1, 3] # Map device 1 and 3 to iris_clf_2 runner
  • Added SSL support for the API server through both the CLI and configuration.

      --ssl-certfile TEXT          SSL certificate file
      --ssl-keyfile TEXT           SSL key file
      --ssl-keyfile-password TEXT  SSL keyfile password
      --ssl-version INTEGER        SSL version to use (see stdlib 'ssl' module)
      --ssl-cert-reqs INTEGER      Whether client certificate is required (see stdlib 'ssl' module)
      --ssl-ca-certs TEXT          CA certificates file
      --ssl-ciphers TEXT           Ciphers to use (see stdlib 'ssl' module)
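
    For example, serving over HTTPS from the CLI (bento tag and file paths are placeholders):

    bentoml serve iris_classifier:latest --production --ssl-certfile credentials/cert.pem --ssl-keyfile credentials/key.pem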
  • Added adaptive batch size histogram metrics, BENTOML_{runner}_{method}_adaptive_batch_size_bucket, for observability of batching mechanism details.

  • Added support for the OpenTelemetry OTLP exporter for tracing. BentoML now configures the OpenTelemetry resource automatically if the user has not explicitly configured it through environment variables. Upgraded OpenTelemetry Python packages to version 0.33b0.

  • Added support for saving external_modules alongside models in the save_model API. Saving external Python modules is useful for models with external dependencies, such as tokenizers, preprocessors, and configurations.
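
    A minimal sketch of saving a model together with a local helper module (the module, helper, and model names are hypothetical):

    import bentoml

    import my_tokenizer  # hypothetical local module the model depends on

    def model(text: str) -> float:
        return my_tokenizer.score(text)  # hypothetical helper

    bentoml.picklable_model.save_model(
        "sentiment_model",
        model,
        external_modules=[my_tokenizer],  # saved alongside the model
    )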

  • Enhanced Swagger UI to include additional documentation and helper links.

💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.

  • Check out the adaptive batching documentation on how to leverage batching to improve inference latency and efficiency.
  • Check out the runner configuration documentation on how to customize resource allocation for runners at run time.

🙌 We continue to receive great engagement and support from the BentoML community.

  • Shout out to @sptowey for their contribution on adding SSL support.
  • Shout out to @dbuades for their contribution on adding the OTLP exporter.
  • Shout out to @tweeklab for their contribution on fixing a bug on import_model in the MLflow framework.

What's Changed

New Contributors

Full Changelog: v1.0.3...v1.0.4

BentoML - v1.0.3

08 Aug 18:56
3cc662c

🍱 The BentoML v1.0.3 release brings a list of performance and feature improvements.

  • Improved Runner IO performance by enhancing the underlying serialization and deserialization, especially in models with large input and output sizes. Our image input benchmark showed a 100% throughput improvement.

  • Added support for specifying URLs to exclude from tracing.
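
    A sketch of the corresponding BentoML configuration, where the excluded_urls key follows the feature description and the exact schema may differ:

    # bentoml_configuration.yaml
    tracing:
      type: zipkin
      excluded_urls: ["/healthz", "/livez", "/readyz"]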

  • Added support for custom components in OpenAPI generation.

🙌 We continue to receive great engagement and support from the BentoML community.

  • Shout out to Ben Kessler for helping benchmark performance.
  • Shout out to Jiew Peng Lim for adding the support for configuring URLs to exclude from tracing.
  • Shout out to Susana Bouchardet for adding support for the JSON IO descriptor to return an empty response body.
  • Thanks to Keming and mplk for contributing their first PRs in BentoML.

Full Changelog: v1.0.2...v1.0.3

BentoML - v1.0.2

29 Jul 21:17
bf9d519

🍱 We have just released BentoML v1.0.2 with a number of features and bug fixes requested by the community.

  • Added support for custom model versions, e.g. bentoml.tensorflow.save_model("model_name:1.2.4", model).
  • Fixed a PyTorch Runner payload serialization issue due to a tensor not being on the CPU.
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first
  • Fixed Transformers GPU device assignment due to kwargs handling.
  • Fixed excessive Runner thread spawning issue under high load.
  • Fixed a PyTorch Runner inference error due to saving a tensor during inference mode.
    RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
  • Fixed Keras Runner error when the input has only a single element.
  • Deprecated the validate_json option in JSON IO descriptor and recommended specifying validation logic natively in the Pydantic model.
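
    Validation can instead live on the Pydantic model attached to the JSON descriptor; a minimal sketch (field names are placeholders):

    from pydantic import BaseModel

    from bentoml.io import JSON

    class IrisFeatures(BaseModel):
        sepal_len: float
        sepal_width: float
        petal_len: float
        petal_width: float

    # Request bodies are validated against the Pydantic model
    input_spec = JSON(pydantic_model=IrisFeatures)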

🎨 We added an examples directory; in it you will find interesting sample projects demonstrating various applications of BentoML. We welcome your contribution if you have a project idea and would like to share it with the community.

💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.

What's Changed

  • chore: remove all --pre from documentation by @aarnphm in #2738
  • chore(framework): onnx guide minor improvements by @larme in #2744
  • fix(framework): fix how pytorch DataContainer convert GPU tensor by @larme in #2739
  • doc: add missing variable by @robsonpeixoto in #2752
  • chore(deps): cattrs>=22.1.0 in setup.cfg by @sugatoray in #2758
  • fix(transformers): kwargs and migrate to framework tests by @ssheng in #2761
  • chore: add type hint for run and async_run by @aarnphm in #2760
  • docs: fix typo in SECURITY.md by @parano in #2766
  • chore: use pypa/build as PEP517 backend by @aarnphm in #2680
  • chore(e2e): capture log output by @aarnphm in #2767
  • chore: more robust prometheus directory ensuring by @bojiang in #2526
  • doc(framework): add scikit-learn section to ONNX documentation by @larme in #2764
  • chore: clean up dependencies by @sauyon in #2769
  • docs: misc docs reorganize and cleanups by @parano in #2768
  • fix(io descriptors): finish removing init_http_response by @sauyon in #2774
  • chore: fix typo by @aarnphm in #2776
  • feat(model): allow custom model versions by @sauyon in #2775
  • chore: add watchfiles as bentoml dependency by @aarnphm in #2777
  • doc(framework): keras guide by @larme in #2741
  • docs: Update service schema and validation by @ssheng in #2778
  • doc(frameworks): fix pip package syntax by @larme in #2782
  • fix(runner): thread limiter doesn't take effect by @bojiang in #2781
  • feat: add additional env var configuring num of threads in Runner by @parano in #2786
  • fix(templates): sharing variables at template level by @aarnphm in #2796
  • bug: fix JSON io_descriptor validate_json option by @parano in #2803
  • chore: improve error message when failed importing user service code by @parano in #2806
  • chore: automatic cache action version update and remove stale bot by @aarnphm in #2798
  • chore(deps): bump actions/checkout from 2 to 3 by @dependabot in #2810
  • chore(deps): bump codecov/codecov-action from 2 to 3 by @dependabot in #2811
  • chore(deps): bump github/codeql-action from 1 to 2 by @dependabot in #2813
  • chore(deps): bump actions/cache from 2 to 3 by @dependabot in #2812
  • chore(deps): bump actions/setup-python from 2 to 4 by @dependabot in #2814
  • fix(datacontainer): pytorch to_payload should disable gradient by @aarnphm in #2821
  • fix(framework): fix keras single input edge case by @larme in #2822
  • fix(framework): keras GPU handling by @larme in #2824
  • docs: update custom bentoserver guide by @parano in #2809
  • fix(runner): bind limiter to runner_ref instead by @bojiang in #2826
  • fix(pytorch): inference_mode context is thead local by @bojiang in #2828
  • fix: address multiple tags for containerize by @aarnphm in #2797
  • chore: Add gallery projects under examples by @ssheng in #2833
  • chore: running formatter on examples folder by @aarnphm in #2834
  • docs: update security auth middleware by @g0nz4rth in #2835
  • fix(io_descriptor): DataFrame columns check by @alizia in #2836
  • fix: examples directory structure by @ssheng in #2839
  • revert: "fix: address multiple tags for containerize (#2797)" by @ssheng in #2840

Full Changelog: v1.0.1...v1.0.2

BentoML - v1.0.0

13 Jul 09:05
58aa69b

🍱 The wait is over. BentoML has officially released v1.0.0. We are excited to share with you the notable feature improvements.

  • Introduced BentoML Runner, an abstraction for parallel model inference. It allows compute-intensive model inference to scale separately from the transformation and business logic. The Runner is easily instantiated and invoked, but behind the scenes, BentoML optimizes for micro-batching and fans out inference if needed. A simple example of instantiating a Runner is shown after this list. Learn more about using runners.
  • Redesigned how models are saved, moved, and loaded with BentoML. We introduced new primitives which allow users to call a save_model() method which saves the model in the optimal way based on the recommended practices of the ML framework. The model is then stored in a flexible local repository where users can use “import” and “export” functionality to push and pull “finalized” models from remote locations like S3. Bentos can be built locally or remotely with these models. Once built, Yatai or bentoctl can easily deploy to the cloud service of your choice. Learn more about preparing models and building bentos.
  • Enhanced the micro-batching capability: with the new runner abstraction, batching is even more powerful. When incoming data is spread across different transformation processes, the runner fans in inferences when inference is invoked, batching multiple inputs into a single inference call. Most ML frameworks implement some form of vectorization which improves performance for multiple inputs at once. Our adaptive batching not only batches inputs as they are received, but also regresses the time of the last several groups of inputs in order to optimize the batch size and latency windows.
  • Improved reproducibility of the model by recording and locking the dependent library versions. We use the versions to package the correct dependencies so that the environment in which the model runs in production is identical to the environment it was trained in. All direct and transitive dependencies are recorded and deployed with the model when running in production. In our 1.0 version we now support Conda as well as several different ways to customize your pip packages when “building your Bento”. Learn more about building bentos.
  • Simplified Docker image creation during containerization to generate the right image for you depending on the features that you’ve decided to implement in your service. For example, if your runner specifies that it can run on a GPU, we will automatically choose the right Nvidia docker image as a base when containerizing your service. If needed, we also provide the flexibility to customize your docker image as well. Learn more about containerization.
  • Improved input and output validation with native type validation rules. Numpy and Pandas DataFrame can specify a static shape or even dynamically infer schema by providing sample data. The Pydantic schema that is produced per endpoint also integrates with our Swagger UI so that each endpoint is better documented for sharing. Learn more about service APIs and IO Descriptors.
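
The Runner sketch promised above, in service.py form (the iris_clf model is assumed to have been saved to the local model store beforehand):

import bentoml
from bentoml.io import NumpyNdarray

# Create a Runner from a saved model and wire it into a Service
iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series):
    # The call is dispatched to the runner, where micro-batching can apply
    return iris_clf_runner.predict.run(input_series)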

⚠️ BentoML v1.0.0 is backward incompatible with v0.13.1. If you wish to stay on the v0.13.1 LTS version, please lock the dependency with bentoml==0.13.1. We have also prepared a migration guide from v0.13.1 to v1.0.0 to help with your project migration. We are committed to supporting the v0.13-LTS versions with critical bug fixes and security patches.

🎉 After years of seeing hundreds of model serving use cases, we are proud to present the official release of BentoML 1.0. We could not have done it without the growth and support of our community.

BentoML - 1.0.0-rc3

01 Jul 23:30
3272fbc
Pre-release

We have just released BentoML 1.0.0rc3 with a number of highly anticipated features and improvements. Check it out with the following command!

$ pip install -U bentoml --pre

⚠️ BentoML will release the official 1.0.0 version next week, removing the need for the --pre flag to install BentoML versions after 1.0.0. If you wish to stay on the 0.13.1 LTS version, please lock the dependency with bentoml==0.13.1.

  • Added framework runner support for additional ML frameworks.
  • Added support for Huggingface Transformers custom pipelines.
  • Fixed a logging issue causing the api_server and runners to not generate error logs.
  • Optimized the TensorFlow inference procedure.
  • Improved resource request configuration for runners.
    • Resource requests can now be configured in the BentoML configuration. If unspecified, runners will be scheduled to best utilize the available system resources.

      runners:
        resources:
          cpu: 8.0
          nvidia.com/gpu: 4.0
    • Updated the API for custom runners to declare the types of supported resources.

      import bentoml

      class MyRunnable(bentoml.Runnable):
          SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")  # Deprecated SUPPORT_NVIDIA_GPU
          SUPPORTS_CPU_MULTI_THREADING = True  # Deprecated SUPPORT_CPU_MULTI_THREADING
          ...

      my_runner = bentoml.Runner(
          MyRunnable,
          runnable_init_params={"foo": foo, "bar": bar},
          name="custom_runner_name",
          ...
      )
    • Deprecated the API for specifying resources from the framework to_runner() and custom Runner APIs. For better flexibility at runtime, it is recommended to specify resources through configuration.

Full Changelog: v1.0.0-rc2...v1.0.0-rc3

BentoML - 1.0.0-rc2

22 Jun 06:30
ce7e9d2
Pre-release

We have just released BentoML 1.0.0rc2 with an exciting lineup of improvements. Check it out with the following command!

$ pip install -U bentoml --pre
  • Standardized logging configuration and improved logging performance.

    • If imported as a library, BentoML will no longer configure logging explicitly and will respect the logging configuration of the importing Python process. To customize logging when BentoML is used as a library, add configuration for the bentoml logger.
    formatters:
      ...
    handlers:
      ...
    loggers:
      ...
      bentoml:
        handlers: [...]
        level: INFO
        ...
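
      Equivalently, a host process can adjust the bentoml logger programmatically; a minimal sketch using the standard logging module:

      import logging

      # Tune BentoML's library logs from within the importing process
      logging.getLogger("bentoml").setLevel(logging.WARNING)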
    • If started as a server, BentoML will continue to configure logging format and output to stdout at INFO level. All third party libraries will be configured to log at the WARNING level.
  • Added LightGBM framework support.

  • Updated model and bento creation timestamps in the CLI display to use the local timezone for a better user experience, while timestamps in metadata remain in UTC.

  • Improved the reliability of bento build with advanced options including base_image and dockerfile_template.

Besides all the exciting product work, we also started a blog at modelserving.com sharing our learnings from building BentoML and supporting the MLOps community. Check out our latest blog, Breaking up with Flask & FastAPI: Why ML model serving requires a specialized framework, and share your thoughts with us on our LinkedIn post.

Lastly, a big shoutout to Mike Kuhlen for adding the LightGBM framework support. 🥂

Full Changelog: v1.0.0-rc1...v1.0.0-rc2