predictability and temperature
mcharytoniuk committed Jul 6, 2024
1 parent 780e22f commit fe88750
Showing 8 changed files with 43 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@ For the rendered version, visit our website: https://llmops-handbook.distantmagi

## License

- Creative Commons Attribution Share Alike 4.0 International
+ [Creative Commons Attribution Share Alike 4.0 International](./LICENSE)

## Community

4 changes: 2 additions & 2 deletions book.toml
@@ -10,8 +10,8 @@ additional-js = [
"assets/global_mermaid.js"
]
default-theme = "ayu"
- edit-url-template = "https://github.com/distantmagic/llmops-handbook/edit/master/{path}"
- git-repository-url = "https://github.com/distantmagic/llmops-handbook/tree/master"
+ edit-url-template = "https://github.com/distantmagic/llmops-handbook/edit/main/{path}"
+ git-repository-url = "https://github.com/distantmagic/llmops-handbook/tree/main"
preferred-dark-theme = "ayu"
smart-punctuation = true
additional-css = ["./mdbook-admonish.css"]
9 changes: 7 additions & 2 deletions src/SUMMARY.md
@@ -14,6 +14,7 @@
- [Gateway](./general-concepts/load-balancing/gateway/README.md)
- [Model Parameters]()
- [Supervisor]()
+ - [Temperature](./general-concepts/temperature/README.md)
- [Vector Database]()
- [Infrastructure]()
- [llama.cpp](./deployments/llama.cpp/README.md)
@@ -26,9 +27,13 @@
- [Customization]()
- [Fine-tuning](./customization/fine-tuning/README.md)
- [Retrieval Augmented Generation](./customization/retrieval-augmented-generation/README.md)
- - [Predictability]()
- - [Consistent Outputs]()
+ - [Predictability](./predictability/README.md)
- [Hallucinations]()
+ - [Consistent Outputs]()
+ - [Structured Outputs]()
+ - [Matching the JSON Schema]()
+ - [Data Objects (including Pydantic)]()
+ - [Function Calling]()
- [Application Layer](./application-layer/README.md)
- [Architecture]()
- [Long-Running]()
2 changes: 1 addition & 1 deletion src/general-concepts/load-balancing/forward-proxy/README.md
@@ -2,7 +2,7 @@

A forward proxy is an intermediary server between the client and the origin server. Clients connect to the forward proxy server and request a resource (such as a completion) available on a different server that is otherwise inaccessible to them. The forward proxy server retrieves the resource and forwards it to the client.

- You can combine both forward proxy and [reverse proxy](/general-concepts/load-balancing/reverse-proxy/index.html) to create a [gateway](/general-concepts/load-balancing/gateway/index.html).
+ You can combine both forward proxy and [reverse proxy](/general-concepts/load-balancing/reverse-proxy) to create a [gateway](/general-concepts/load-balancing/gateway).
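
To make the flow concrete, here is a minimal client-side sketch in Python. The proxy address and the origin endpoint are hypothetical; it assumes a forward proxy that can reach an otherwise inaccessible completion server.

```python
import requests

# Hypothetical forward proxy that can reach the otherwise inaccessible
# origin server hosting the completion endpoint.
proxies = {
    "http": "http://proxy.internal:3128",
    "https": "http://proxy.internal:3128",
}

# The client addresses the proxy explicitly; the proxy retrieves the
# resource (here, a completion) and forwards it back to the client.
response = requests.post(
    "http://llm.internal/v1/completions",  # unreachable without the proxy
    json={"prompt": "Hello", "max_tokens": 16},
    proxies=proxies,
)
print(response.status_code, response.json())
```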

## llama.cpp Forward Proxy

4 changes: 2 additions & 2 deletions src/general-concepts/load-balancing/reverse-proxy/README.md
@@ -6,9 +6,9 @@ While forward and reverse proxies may seem functionally similar, their differenc

That means a reverse proxy hides its presence from the clients and acts as an intermediary between them and the servers. When you communicate with a reverse proxy, it is as if you communicated directly with the target server.

- That is one of the primary differences between [forward proxy](/general-concepts/load-balancing/forward-proxy/index.html) and a reverse proxy.
+ That is one of the primary differences between [forward proxy](/general-concepts/load-balancing/forward-proxy) and a reverse proxy.

- You can combine both [forward proxy](/general-concepts/load-balancing/forward-proxy/index.html) and reverse proxy to create a [gateway](/general-concepts/load-balancing/gateway/index.html).
+ You can combine both [forward proxy](/general-concepts/load-balancing/forward-proxy) and reverse proxy to create a [gateway](/general-concepts/load-balancing/gateway).

## Paddler

3 changes: 3 additions & 0 deletions src/general-concepts/temperature/README.md
@@ -0,0 +1,3 @@
# Temperature

The temperature parameter in LLMs controls the randomness of the output. Lower temperatures make the output more deterministic (less creative), while higher temperatures increase variability. Even at low temperatures, some variability remains due to the probabilistic nature of the models.
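
To illustrate, here is a minimal sketch of setting the temperature on a completion request. It assumes a hypothetical local server exposing an OpenAI-compatible chat completions endpoint (as llama.cpp's built-in server does); the URL and the `complete()` helper are illustrative, not part of any specific API.

```python
import requests

# Hypothetical local llama.cpp server exposing the OpenAI-compatible API.
BASE_URL = "http://127.0.0.1:8080/v1/chat/completions"

def complete(prompt: str, temperature: float) -> str:
    """Request a completion at the given temperature."""
    response = requests.post(BASE_URL, json={
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # 0.0 = most deterministic; higher = more varied
    })
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# Low temperature: answers should be near-identical across calls.
print(complete("Name the capital of France.", temperature=0.0))
# High temperature: expect more varied phrasing.
print(complete("Name the capital of France.", temperature=1.5))
```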
26 changes: 26 additions & 0 deletions src/predictability/README.md
@@ -0,0 +1,26 @@
# Predictability

A common issue with [Large Language Models](/general-concepts/large-language-model) is the inconsistency and lack of structure of their outputs.

## Software Engineering vs AI

The last few decades of software development have accustomed us to extreme predictability. Each time we call a specific API endpoint or click a specific button, the same thing happens, consistently and under our complete control.

That is not the case with AI, which operates on probabilities. The difference stems from how the software is created. Neural network designers specify only the network architecture and the training process; they do not design the actual reasoning. The reasoning is learned by the network during training and is not under the designer's direct control.

That is fundamentally different from the traditional software development process, where we design the logic and the process explicitly, and the software just executes them.

That is why you might feed [Large Language Models](/general-concepts/large-language-model) the same prompt multiple times and get a different output each time. The [temperature](/general-concepts/temperature) parameter can be used to limit the "creativity" of the model, but even setting it to zero does not guarantee predictable outputs.
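
You can observe this directly by sending the same prompt several times at temperature zero and comparing the results. The sketch below reuses the hypothetical `complete()` helper from the temperature example above.

```python
# Send the same prompt five times at temperature 0 and count distinct
# outputs, reusing the hypothetical complete() helper defined in the
# temperature sketch above.
outputs = {
    complete("Summarize what a reverse proxy does.", temperature=0.0)
    for _ in range(5)
}

# Even at temperature 0, more than one variant may appear, depending on the
# backend (batching, floating-point non-determinism, seed handling).
print(f"{len(outputs)} distinct output(s) across 5 identical requests")
```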

## Structured Outputs

While the fact that LLMs are not completely predictable may cause some issues, no technical trade-off is entirely one-sided, and we can turn that flexibility to our advantage.

LLMs are extremely good at understanding natural language. In practice, we can finally communicate with computers in much the same way we communicate with other people. We can create systems that interpret such unstructured inputs and react to them in a structured and predictable way. This lets us use the strengths of LLMs to our advantage while mitigating most of the unpredictability issues.
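
As a sketch of that approach, the snippet below validates a model's JSON reply against a typed schema using Pydantic (mentioned in the chapter outline). The prompt, the raw output, and the `QuoteRequest` schema are hypothetical.

```python
from pydantic import BaseModel, ValidationError

class QuoteRequest(BaseModel):
    """Structured shape we ask the model to produce for a request-for-quote email."""
    product: str
    quantity: int
    deadline: str | None = None

# Imagine the model was prompted: "Extract the order details from this email
# and reply with JSON only, matching the QuoteRequest schema."
raw_model_output = '{"product": "steel pipes", "quantity": 500, "deadline": "2024-08-01"}'

try:
    request = QuoteRequest.model_validate_json(raw_model_output)
    print(request.product, request.quantity)  # safe, typed access
except ValidationError as error:
    # The model did not match the schema; retry, repair, or escalate.
    print("Unstructured reply, falling back:", error)
```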

## Use Cases

Some use cases include (but are not limited to):
- Searching through unstructured documents (e.g., reports in .pdf, .doc, .csv, or plain text)
- Converting emails into actionable structures (e.g., converting requests for quotes into API calls with parameters for internal systems)
- Question answering systems that interpret the context of user queries
1 change: 1 addition & 0 deletions
@@ -0,0 +1 @@
# Matching the JSON Schema
