New Features:
- Tokenize / Detokenize
Improvements:
- model generation logger printing (enabled via config.load_model_io=True)
Bug fixes:
- adjust how optional kwargs are passed
- transformers generate with custom generation configs
- vllm generate outputs with logprobs (these sometimes contain -inf, which the JSON encoder cannot serialize)
- llamacpp chat with logits output
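For context on the vllm logprobs fix above: Python's json.dumps emits the non-standard token -Infinity for float("-inf") (and raises ValueError when allow_nan=False), so non-finite values must be sanitized before encoding. A minimal sketch of one possible approach (the helper name and the choice of None as replacement are illustrative, not the library's actual implementation):

```python
import json
import math


def sanitize_logprobs(obj):
    """Recursively replace non-finite floats (-inf, inf, nan) with None
    so the result can be serialized as strict, standards-compliant JSON."""
    if isinstance(obj, float) and not math.isfinite(obj):
        return None  # could also use a large negative sentinel like -1e9
    if isinstance(obj, dict):
        return {k: sanitize_logprobs(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [sanitize_logprobs(v) for v in obj]
    return obj


logprobs = {"tokens": ["a", "b"], "logprobs": [-0.12, float("-inf")]}
# json.dumps(logprobs, allow_nan=False) would raise ValueError here
safe = sanitize_logprobs(logprobs)
print(json.dumps(safe, allow_nan=False))
```

The resulting payload parses in any strict JSON decoder; consumers just need to treat null logprobs as "effectively impossible" tokens.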