diff --git a/serverless/pages/ml-nlp-auto-scale.mdx b/serverless/pages/ml-nlp-auto-scale.mdx
index 38407ae..51c3284 100644
--- a/serverless/pages/ml-nlp-auto-scale.mdx
+++ b/serverless/pages/ml-nlp-auto-scale.mdx
@@ -53,7 +53,7 @@
 The number of model allocations can be scaled down to 0.
 They cannot be scaled up to more than 32 allocations, unless you explicitly set the maximum number of allocations to more.
 Adaptive allocations must be set up independently for each deployment and [inference endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/master/put-inference-api.html).
 
-When you create inference endpoints on serverless deployments using Kibana, adaptive allocations are automatically turned on, and there is no option to disable them.
+When you create inference endpoints on Serverless using Kibana, adaptive allocations are automatically turned on, and there is no option to disable them.
 
 ### Optimizing for typical use cases