docs: add docs for parsing

EvoEvolver · May 29, 2024 · ea9b8ca
1 parent 266a549
commit ea9b8ca

Showing 2 changed files with 59 additions and 1 deletion.
diff --git a/docs/parsing.md b/docs/parsing.md
@@ -0,0 +1,58 @@
+
+
+# Parsing rules for chats
+
+## General usage
+
+Pass the `parse` argument to the `Chat.complete` method to have the model's output parsed by one of the built-in rules.
+```python
+from mllm import Chat
+chat = Chat()
+chat += "Output a JSON dict with keys 'a' and 'b' and values 1 and 2"
+res = chat.complete(parse="dict")
+print(res['a'])
+```
+Advantages of using the built-in parsing rules:
+
+- Automatic retry when parsing fails.
+- Robust parsing that tolerates common variations in the output, such as a surrounding code fence.
+
+## `dict`
+
+Parse the result as a JSON dictionary using `json.loads`. This rule ignores a `` ```json `` code fence surrounding the output.
+
+## `list`
+
+Similar to `dict`, but parses the result as a Python list. The parser finds the first `[` and the last `]` in the output and tries to parse the content between them.
+
+## `obj`
+
+Similar to `dict`, but parses using `ast.literal_eval`. This rule is useful when the output is a Python literal rather than strict JSON, for example a tuple or a dict with single-quoted strings.
+
+
+## `quotes`
+
+Parse the result as a string. This rule ignores a `` ```xxx `` code fence surrounding the output.
+This rule is useful when you want the LLM to output code.
+
+```python
+from mllm import Chat
+chat = Chat()
+chat += "Output Python code for quicksort. Start your answer with ```python"
+res = chat.complete(parse="quotes")
+print(res)
+```
+
+
+## `colon`
+
+Capture the content after the first colon `:` in the output. This rule is useful when you ask the model to begin its answer with a label such as `Summary:` and want only the text that follows.
+
+```python
+from mllm import Chat
+chat = Chat()
+chat += "Summarize the following text:<text> This is a test.</text>"
+chat += "Start your answer with Summary:"
+res = chat.complete(parse="colon")
+print(res)
+```
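The rules documented in `docs/parsing.md` can be approximated in a few lines of standard-library Python. The sketch below is not mllm's implementation; the function names (`strip_fence`, `parse_dict`, and so on) are hypothetical, and using `json.loads` for the `list` rule is an assumption, since the docs only say the content between the brackets is parsed.

```python
import ast
import json


def strip_fence(text: str) -> str:
    """Drop a surrounding ```lang ... ``` fence, if there is one."""
    text = text.strip()
    if text.startswith("```"):
        # Discard the opening fence line (e.g. ```json or ```python).
        _, _, text = text.partition("\n")
        # Discard the closing fence and anything after it.
        end = text.rfind("```")
        if end != -1:
            text = text[:end]
    return text.strip()


def parse_dict(text: str) -> dict:
    # "dict": JSON dictionary, ignoring a surrounding fence.
    return json.loads(strip_fence(text))


def parse_list(text: str) -> list:
    # "list": take everything from the first "[" to the last "]".
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no list found in output")
    # Assumption: the bracketed span is parsed as JSON.
    return json.loads(text[start:end + 1])


def parse_obj(text: str):
    # "obj": Python literal via ast.literal_eval.
    return ast.literal_eval(strip_fence(text))


def parse_quotes(text: str) -> str:
    # "quotes": plain string with the surrounding fence removed.
    return strip_fence(text)


def parse_colon(text: str) -> str:
    # "colon": everything after the first ":" in the output.
    _, sep, rest = text.partition(":")
    if not sep:
        raise ValueError("no colon found in output")
    return rest.strip()
```

Each helper raises an exception on malformed output, which is the kind of signal a caller would need in order to retry the completion.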
diff --git a/mllm/chat.py b/mllm/chat.py
@@ -204,7 +204,7 @@ def complete(self, model=None, cache=False, expensive=False, parse=None, retry=T
         :param model: The name of the model to use. If None, the default model will be used.
         :param cache: Whether to use cache. If True, the result will be cached.
         :param expensive: Whether to use the expensive model. If True, the default expensive model will be used.
-        :param parse: How to parse the result. Options: "dict", "list", "obj", "quotes", "colon"
+        :param parse: How to parse the result. Options: "dict", "list", "obj", "quotes", "colon". See http://mllm.evoevo.org/parsing for details.
         :param retry: Whether to retry if the completion fails.
         :param options: Additional options for the completion model.
         :return: The completion result from the model. If parse is set, a parsed result will be returned.
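How `parse` and `retry` interact is not shown in this diff. One plausible shape, assuming a failed parse simply triggers another completion, is sketched below. `complete_with_retry`, `generate`, and `max_attempts` are hypothetical names for illustration and are not part of mllm's API.

```python
import json
from typing import Callable, TypeVar

T = TypeVar("T")


def complete_with_retry(generate: Callable[[], str],
                        parser: Callable[[str], T],
                        max_attempts: int = 3) -> T:
    """Call generate() until parser() accepts its output or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate()  # stand-in for one model completion
        try:
            return parser(raw)
        except (ValueError, SyntaxError) as err:
            # json.JSONDecodeError is a ValueError; ast.literal_eval raises
            # ValueError or SyntaxError on malformed input.
            last_error = err
    raise ValueError(f"parsing failed after {max_attempts} attempts") from last_error


# Usage with a fake model that fails once, then returns valid JSON.
outputs = iter(["not json", '{"a": 1}'])
result = complete_with_retry(lambda: next(outputs), json.loads)
```

Passing the parser in as a callable keeps the retry loop independent of which parsing rule is selected.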