Releases · friendliai/friendli-client
Release v1.5.7 🚀
- Update package dependencies with security warnings.
Release v1.5.6 🚀
- Hotfix on the `friendli endpoint list` command.
- Update styling for the `AWAKING` and `UPDATING` endpoint statuses.
- Fix descriptions of the `friendli api` command: the `--model` option is required for inference calls.
Release v1.5.5 🚀
- Support soft prompts in the gRPC client.
- Resolve dependency issues.
- Update GraphQL schemas to handle resources of Friendli Dedicated Endpoints.
Release v1.5.4 🚀
- The `text-to-image` API is removed from the CLI command.
- Support cancelling a gRPC stream (see the sketch below).
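A minimal sketch of cancelling a streamed gRPC response. This assumes a local gRPC server at `0.0.0.0:8000` (as in the v1.4.1 example further down this page) and that closing the stream, as supported since v1.5.3, cancels the in-flight call:

```python
from friendli import Friendli

# Assumption: a Friendli gRPC server is running locally, as in the
# v1.4.1 usage example below.
client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.completions.create(
    prompt="Explain what gRPC is.",
    stream=True,
    top_k=1,
)

# Consume part of the stream, then cancel the rest.
for i, chunk in enumerate(stream):
    print(chunk.text, end="", flush=True)
    if i == 9:  # stop early after ten chunks
        break

# Closing the stream cancels the underlying gRPC call.
stream.close()
```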
Release v1.5.3 🚀
- Support closing streams and using them as context managers (see the sketch below).
- Added API E2E tests.
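A minimal sketch of the context-manager form, assuming the stream returned by `completions.create` implements the context-manager protocol described above and the same local gRPC server as the other examples on this page:

```python
from friendli import Friendli

# Same assumed local gRPC server as the other examples on this page.
client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

# The stream can now be used as a context manager; it is closed
# automatically when the `with` block exits.
with client.completions.create(
    prompt="Give three tips for staying healthy.",
    stream=True,
    top_k=1,
) as stream:
    for chunk in stream:
        print(chunk.text, end="", flush=True)
```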
Release v1.5.2 🚀
- Hotfix: automatically close the streaming response at the end of the stream.
Release v1.5.1 🚀
API calls to Friendli Dedicated Endpoints are now supported:
```python
from friendli import Friendli

client = Friendli(use_dedicated_endpoint=True)
chat = client.chat.completions.create(
    model="{endpoint_id}",
    messages=[
        {
            "role": "user",
            "content": "Give three tips for staying healthy.",
        }
    ],
)
```
If you want to send a request to a specific adapter of a Multi-LoRA endpoint, provide `"{endpoint_id}:{adapter_route}"` as the `model` argument:
```python
from friendli import Friendli

client = Friendli(use_dedicated_endpoint=True)
chat = client.chat.completions.create(
    model="{endpoint_id}:{adapter_route}",
    messages=[
        {
            "role": "user",
            "content": "Give three tips for staying healthy.",
        }
    ],
)
```
Release v1.5.0 🚀
- Deprecate model conversion and quantization. Instead, please use `friendli-model-optimizer` to quantize your models.
- Increase the default HTTP timeout.
Release v1.4.2 🚀
- Support for Tool Calling API: Added a new API to support tool calling (see the sketch after this list).
- Phi3 INT8 Support: Implemented support for Phi3 INT8.
- Snowflake Arctic FP8 Quantizer: Introduced new quantizer for Snowflake Arctic FP8.
- Added support for INT8 quantization for Llama and refactored the quantizer to use only safetensors.
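A minimal sketch of a tool-calling request. The tool schema follows the OpenAI-compatible format; the `get_weather` tool and the model name are hypothetical placeholders, not values from the release notes:

```python
from friendli import Friendli

client = Friendli()  # token is read from the environment

# Hypothetical tool definition in the OpenAI-compatible schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

chat = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Seoul?"}],
    tools=tools,
)
print(chat.choices[0].message.tool_calls)
```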
Release v1.4.1 🚀
Updating Patch Version
This patch version introduces explicit resource management to prevent unexpected resource leaks.
By default, the library closes the underlying HTTP and gRPC connections when the client is garbage-collected. However, you can now manually close the `Friendli` or `AsyncFriendli` client using the `.close()` method, or use a context manager to ensure proper closure when exiting a `with` block.
Usage examples
```python
import asyncio

from friendli import AsyncFriendli

client = AsyncFriendli(base_url="0.0.0.0:8000", use_grpc=True)


async def run():
    async with client:
        stream = await client.completions.create(
            prompt="Explain what gRPC is. Also give me a Python code snippet of gRPC client.",
            stream=True,
            top_k=1,
        )
        async for chunk in stream:
            print(chunk.text, end="", flush=True)


asyncio.run(run())
```
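The synchronous client can be released explicitly as well; a minimal sketch using `.close()`, assuming the same local gRPC server as above:

```python
from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

try:
    stream = client.completions.create(
        prompt="Explain what gRPC is.",
        stream=True,
        top_k=1,
    )
    for chunk in stream:
        print(chunk.text, end="", flush=True)
finally:
    # Explicitly release the underlying connection instead of relying
    # on garbage collection.
    client.close()
```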