We have models with many different tokenizers and chat templates, and we already load the tokenizer that carries this information. For multi-turn dialog, though, we currently push the templating onto the client to figure out, which causes problems in the various places that have to reimplement it (openai-proxy, llama-chat, etc). We could avoid this by adding a `messages` field and rendering the chat template server-side, as in the sketch below. We still need to figure out how this should interact with the existing `prompt`, `system_prompt`, and `prompt_template` fields.
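
As a rough sketch of what server-side rendering could look like, assuming the loaded tokenizer is a Hugging Face tokenizer with a chat template (the `build_prompt` helper and the way `system_prompt` is merged into `messages` are illustrative choices, not decided behavior):

```python
from transformers import AutoTokenizer

def build_prompt(tokenizer, messages, system_prompt=None):
    """Render a multi-turn dialog into a single prompt string using the
    model's own chat template, so clients don't need to know it."""
    if system_prompt and (not messages or messages[0].get("role") != "system"):
        # One possible rule for combining the existing system_prompt field
        # with messages: prepend it as a system message if the caller
        # didn't already include one. (Assumption, open for discussion.)
        messages = [{"role": "system", "content": system_prompt}] + messages
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,              # return a string, not token ids
        add_generation_prompt=True,  # append the assistant-turn header
    )

if __name__ == "__main__":
    # Any public chat model with a chat template works for a quick check.
    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
    print(build_prompt(
        tok,
        messages=[{"role": "user", "content": "hello"}],
        system_prompt="You are a helpful assistant.",
    ))
```

How `prompt_template` fits in is still open; one option is that an explicit `prompt_template` overrides the tokenizer's built-in chat template, but that interaction is exactly what we need to decide.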