Releases · fairyshine/FastMindAPI
Version 0.0.9
New Features:
- Tokenize / Detokenize
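A minimal sketch of how the new tokenize/detokenize feature might be exercised over HTTP. The routes, port, and payload field names below are illustrative assumptions, not the documented FastMindAPI API:

```python
import requests

BASE = "http://127.0.0.1:8000"  # assumed default server address

# Tokenize text into token IDs (assumed route and field names)
ids = requests.post(f"{BASE}/model/mymodel/tokenize",
                    json={"input_text": "Hello, world!"}).json()["token_ids"]

# Detokenize the IDs back into text (assumed route and field names)
text = requests.post(f"{BASE}/model/mymodel/detokenize",
                     json={"token_ids": ids}).json()["text"]
print(text)  # expected to round-trip to "Hello, world!"
```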
Improvement:
- Print model generation I/O to the logger (enabled via config.load_model_io=True).
Bug fixes:
- Adjusted how optional kwargs are passed in.
- transformers generate now works with custom generation configs.
- vLLM generate outputs with logprobs sometimes contain -inf, which the JSON encoder cannot serialize (see the sketch after this list).
- llama.cpp chat with logits output.
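To illustrate the -inf logprobs issue, a standard-library sketch of why strict JSON encoding fails and one possible workaround (this is not FastMindAPI's actual fix):

```python
import json
import math

logprobs = [-0.12, -3.5, float("-inf")]  # vLLM can emit -inf logprobs

# Strict JSON has no representation for -Infinity, so encoding fails:
try:
    json.dumps(logprobs, allow_nan=False)
except ValueError as err:
    print(err)  # "Out of range float values are not JSON compliant"

# One possible workaround: clamp non-finite values before encoding.
safe_logprobs = [lp if math.isfinite(lp) else -1e9 for lp in logprobs]
print(json.dumps(safe_logprobs))  # [-0.12, -3.5, -1000000000.0]
```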
Version 0.0.8
New Features:
- Support vLLM now!
Improvement:
- Explicit model type
- Better type hints for parameters
Bug fixes:
- Refined optional kwargs settings.
Version 0.0.7
Improvement:
- Refine the MCTS algorithm.
Bug fixes:
- Incorrect parameters and wrong logits outputs with the /chat/completions API.
Version 0.0.6
New Features:
For transformers, llama.cpp, and OpenAI models:
- Generate with logits and probs output.
- Handle requests through the OpenAI-like API (/chat/completions/).
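A sketch of a request against the /chat/completions/ route. The payload follows the standard OpenAI chat format that this endpoint mimics; the host/port and the registered model name are assumptions:

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/chat/completions/",  # assumed local server address
    json={
        "model": "mymodel",  # assumed registered model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```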
Bug fixes:
- Standardize logits output format for generate method.
Version 0.0.5
New Features:
- Support OpenAI-format-like requests to "/chat/completions/" (still under improvement).
- Add Token Authentication for security.
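A sketch of authenticating a request against the token check. It assumes a standard "Authorization: Bearer <token>" header; the actual header name and scheme FastMindAPI expects may differ:

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/chat/completions/",  # assumed server address
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},  # assumed scheme
    json={"model": "mymodel",
          "messages": [{"role": "user", "content": "Hi"}]},
)
resp.raise_for_status()  # a 401/403 here would indicate a rejected token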
Bug fixes:
- Failure to load PeftModel.
Version 0.0.4
New Features:
- Support Function Calling now!
Build your own tool set following https://fairyshine.github.io/FastMindAPI/Development/FunctionLibrary/
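The linked docs describe the real tool-set mechanism; as a hypothetical sketch only, a tool could be a plain Python function kept in a name-to-callable registry with a small dispatcher:

```python
def get_weather(city: str) -> str:
    """Toy tool: return a canned weather report for a city."""
    return f"It is sunny in {city}."

# Assumed registry and dispatcher, as a function-calling loop might use.
TOOLS = {"get_weather": get_weather}

def call_tool(name: str, **kwargs):
    return TOOLS[name](**kwargs)

print(call_tool("get_weather", city="Paris"))  # It is sunny in Paris.
```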
Version 0.0.3
New Features:
- Run the server from the CLI with the fastmindapi-server command.
- Generation stops at specified strings.
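To illustrate the stop-string semantics, a self-contained helper that truncates generated text at the first occurrence of any stop string (this mirrors the feature's behavior, not FastMindAPI's internal implementation):

```python
def apply_stop_strings(text: str, stop_strings: list[str]) -> str:
    """Truncate text at the earliest occurrence of any stop string."""
    cut = len(text)
    for s in stop_strings:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(apply_stop_strings("Answer: 42\nQuestion: what next?", ["\nQuestion:"]))
# -> "Answer: 42"
```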
Bug fixes:
- Some hints are displayed incorrectly.
Version 0.0.2
New Features:
- Build the client/server (C/S) framework.
- Generate output with logits for transformers model.
- Support PeftModelForCausalLM.
Version 0.0.1
New Features:
- Load models from the transformers / llama.cpp libraries.
- Generate output with the specified model.
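A hypothetical end-to-end sketch of this first release's workflow, loading a transformers model and generating from it over HTTP. The routes and payload fields are illustrative assumptions; consult the FastMindAPI docs for the real client/server API:

```python
import requests

BASE = "http://127.0.0.1:8000"  # assumed default server address

# Assumed "load model" route and payload fields
requests.post(f"{BASE}/model/load",
              json={"model_name": "mymodel",
                    "model_type": "transformers",
                    "model_path": "/path/to/model"})

# Assumed "generate" route and payload fields
resp = requests.post(f"{BASE}/model/mymodel/generate",
                     json={"input_text": "Once upon a time",
                           "max_new_tokens": 32})
print(resp.json())
```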