predictability and temperature
mcharytoniuk committed Jul 6, 2024
1 parent 780e22f commit fe88750
Showing 8 changed files with 43 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@ For the rendered version, visit our website: https://llmops-handbook.distantmagi

## License

- Creative Commons Attribution Share Alike 4.0 International
+ [Creative Commons Attribution Share Alike 4.0 International](./LICENSE)

## Community

4 changes: 2 additions & 2 deletions book.toml
@@ -10,8 +10,8 @@ additional-js = [
"assets/global_mermaid.js"
]
default-theme = "ayu"
- edit-url-template = "https://github.com/distantmagic/llmops-handbook/edit/master/{path}"
- git-repository-url = "https://github.com/distantmagic/llmops-handbook/tree/master"
+ edit-url-template = "https://github.com/distantmagic/llmops-handbook/edit/main/{path}"
+ git-repository-url = "https://github.com/distantmagic/llmops-handbook/tree/main"
preferred-dark-theme = "ayu"
smart-punctuation = true
additional-css = ["./mdbook-admonish.css"]
9 changes: 7 additions & 2 deletions src/SUMMARY.md
@@ -14,6 +14,7 @@
- [Gateway](./general-concepts/load-balancing/gateway/README.md)
- [Model Parameters]()
- [Supervisor]()
+ - [Temperature](./general-concepts/temperature/README.md)
- [Vector Database]()
- [Infrastructure]()
- [llama.cpp](./deployments/llama.cpp/README.md)
@@ -26,9 +27,13 @@
- [Customization]()
- [Fine-tuning](./customization/fine-tuning/README.md)
- [Retrieval Augmented Generation](./customization/retrieval-augmented-generation/README.md)
- - [Predictability]()
- - [Consistent Outputs]()
+ - [Predictability](./predictability/README.md)
- [Hallucinations]()
+ - [Consistent Outputs]()
+ - [Structured Outputs]()
+ - [Matching the JSON Schema]()
+ - [Data Objects (including Pydantic)]()
+ - [Function Calling]()
- [Application Layer](./application-layer/README.md)
- [Architecture]()
- [Long-Running]()
2 changes: 1 addition & 1 deletion src/general-concepts/load-balancing/forward-proxy/README.md
@@ -2,7 +2,7 @@

A forward proxy is an intermediary server between the client and the origin server. Clients connect to the forward proxy server and request a resource (such as a completion) available on a different server that is otherwise inaccessible to them. The forward proxy server retrieves the resource and forwards it to the client.

- You can combine both forward proxy and [reverse proxy](/general-concepts/load-balancing/reverse-proxy/index.html) to create a [gateway](/general-concepts/load-balancing/gateway/index.html).
+ You can combine both forward proxy and [reverse proxy](/general-concepts/load-balancing/reverse-proxy) to create a [gateway](/general-concepts/load-balancing/gateway).
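
To make the flow concrete, here is a minimal client-side sketch in Python. The proxy address and the origin endpoint are hypothetical; it assumes a forward proxy that can reach an otherwise inaccessible completion server.

```python
import requests

# Hypothetical forward proxy that can reach the otherwise inaccessible
# origin server hosting the completion endpoint.
proxies = {
    "http": "http://proxy.internal:3128",
    "https": "http://proxy.internal:3128",
}

# The client addresses the proxy explicitly; the proxy retrieves the
# resource (here, a completion) and forwards it back to the client.
response = requests.post(
    "http://llm.internal/v1/completions",  # unreachable without the proxy
    json={"prompt": "Hello", "max_tokens": 16},
    proxies=proxies,
)
print(response.status_code, response.json())
```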

## llama.cpp Forward Proxy

4 changes: 2 additions & 2 deletions src/general-concepts/load-balancing/reverse-proxy/README.md
@@ -6,9 +6,9 @@ While forward and reverse proxies may seem functionally similar, their differenc

That means a reverse proxy hides its presence from the clients and acts as an intermediary between them and the servers. When you communicate with a reverse proxy, it is as if you communicated directly with the target server.

- That is one of the primary differences between [forward proxy](/general-concepts/load-balancing/forward-proxy/index.html) and a reverse proxy.
+ That is one of the primary differences between [forward proxy](/general-concepts/load-balancing/forward-proxy) and a reverse proxy.

- You can combine both [forward proxy](/general-concepts/load-balancing/forward-proxy/index.html) and reverse proxy to create a [gateway](/general-concepts/load-balancing/gateway/index.html).
+ You can combine both [forward proxy](/general-concepts/load-balancing/forward-proxy) and reverse proxy to create a [gateway](/general-concepts/load-balancing/gateway).

## Paddler

3 changes: 3 additions & 0 deletions src/general-concepts/temperature/README.md
@@ -0,0 +1,3 @@
# Temperature

The temperature parameter in LLMs controls the randomness of the output. Lower temperatures make the output more deterministic (less creative), while higher temperatures increase variability. Even at low temperatures, some variability remains due to the probabilistic nature of the models.
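
To illustrate, here is a minimal sketch of setting the temperature on a completion request. It assumes a hypothetical local server exposing an OpenAI-compatible chat completions endpoint (as llama.cpp's built-in server does); the URL and the `complete()` helper are illustrative, not part of any specific API.

```python
import requests

# Hypothetical local llama.cpp server exposing the OpenAI-compatible API.
BASE_URL = "http://127.0.0.1:8080/v1/chat/completions"

def complete(prompt: str, temperature: float) -> str:
    """Request a completion at the given temperature."""
    response = requests.post(BASE_URL, json={
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # 0.0 = most deterministic; higher = more varied
    })
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# Low temperature: answers should be near-identical across calls.
print(complete("Name the capital of France.", temperature=0.0))
# High temperature: expect more varied phrasing.
print(complete("Name the capital of France.", temperature=1.5))
```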
26 changes: 26 additions & 0 deletions src/predictability/README.md
@@ -0,0 +1,26 @@
# Predictability

A common issue with [Large Language Models](/general-concepts/large-language-model) is the inconsistency and lack of structure of their outputs.

## Software Engineering vs AI

The last few decades of software development have accustomed us to extreme predictability. Each time we call a specific API endpoint or click a specific button, the same thing happens, consistently and under our complete control.

That is not the case with AI, which operates on probabilities. The difference stems from how the software is created. Neural network designers specify only the network architecture and the training process; they do not design the actual reasoning. The reasoning is learned by the network during training and is not under the designer's direct control.

That is fundamentally different from the traditional software development process, where we design the logic and the process explicitly, and the software just executes them.

That is why you might feed [Large Language Models](/general-concepts/large-language-model) the same prompt multiple times and get a different output each time. The [temperature](/general-concepts/temperature) parameter can be used to limit the "creativity" of the model, but even setting it to zero does not guarantee predictable outputs.
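
You can observe this directly by sending the same prompt several times at temperature zero and comparing the results. The sketch below reuses the hypothetical `complete()` helper from the temperature example above.

```python
# Send the same prompt five times at temperature 0 and count distinct
# outputs, reusing the hypothetical complete() helper defined in the
# temperature sketch above.
outputs = {
    complete("Summarize what a reverse proxy does.", temperature=0.0)
    for _ in range(5)
}

# Even at temperature 0, more than one variant may appear, depending on the
# backend (batching, floating-point non-determinism, seed handling).
print(f"{len(outputs)} distinct output(s) across 5 identical requests")
```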

## Structured Outputs

While the fact that LLMs are not completely predictable may cause some issues, no technical trade-off is entirely one-sided, and we can turn that flexibility to our advantage.

LLMs are extremely good at understanding natural language. In practice, we can finally communicate with computers in much the same way we communicate with other people. We can create systems that interpret such unstructured inputs and react to them in a structured and predictable way. This lets us use the strengths of LLMs to our advantage while mitigating most of the unpredictability issues.
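
As a sketch of that approach, the snippet below validates a model's JSON reply against a typed schema using Pydantic (mentioned in the chapter outline). The prompt, the raw output, and the `QuoteRequest` schema are hypothetical.

```python
from pydantic import BaseModel, ValidationError

class QuoteRequest(BaseModel):
    """Structured shape we ask the model to produce for a request-for-quote email."""
    product: str
    quantity: int
    deadline: str | None = None

# Imagine the model was prompted: "Extract the order details from this email
# and reply with JSON only, matching the QuoteRequest schema."
raw_model_output = '{"product": "steel pipes", "quantity": 500, "deadline": "2024-08-01"}'

try:
    request = QuoteRequest.model_validate_json(raw_model_output)
    print(request.product, request.quantity)  # safe, typed access
except ValidationError as error:
    # The model did not match the schema; retry, repair, or escalate.
    print("Unstructured reply, falling back:", error)
```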

## Use Cases

Some use cases include (but are not limited to):
- Searching through unstructured documents (e.g., reports in .pdf, .doc, .csv, or plain text)
- Converting emails into actionable structures (e.g., converting requests for quotes into API calls with parameters for internal systems)
- Question answering systems that interpret the context of user queries
1 change: 1 addition & 0 deletions
@@ -0,0 +1 @@
# Matching the JSON Schema
