
fix: image path
monotykamary committed Dec 6, 2024
1 parent 507fd73 commit 92648b2
Showing 3 changed files with 11 additions and 11 deletions.
2 changes: 1 addition & 1 deletion AI/Building LLM system/guardrails-in-llm.md
@@ -16,7 +16,7 @@ Guardrails in LLM are a set of techniques and strategies designed to control and

## Types of guardrails

- ![Guardrails in LLM](./assets/guardrails-in-llm.webp)
+ ![Guardrails in LLM](assets/guardrails-in-llm.webp)

1. **Input guardrails**: This involves pre-processing the input to the model to remove or modify any potentially harmful or inappropriate content. This can include filtering out profanity, hate speech, or sensitive information. Some common use cases (a minimal sketch follows this list):
- **Topical guardrails**: Limit the model's responses to a specific topic or domain to prevent it from generating off-topic or irrelevant content.
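
To make the input-guardrail idea concrete, here is a minimal illustrative sketch (not part of the changed files; names such as `check_input`, `BLOCKED_PATTERNS`, and `ALLOWED_TOPICS` are hypothetical): a pre-processing step that rejects a prompt before it ever reaches the model.

```python
import re

# Hypothetical deny-list and topic allow-list for a topical guardrail.
BLOCKED_PATTERNS = [r"\bpassword\b", r"\bcredit card\b"]
ALLOWED_TOPICS = {"billing", "shipping", "returns"}

def check_input(prompt: str, topic: str) -> tuple[bool, str]:
    """Return (allowed, reason). Runs before the prompt is sent to the LLM."""
    if topic not in ALLOWED_TOPICS:
        return False, f"off-topic request: {topic}"
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, prompt, flags=re.IGNORECASE):
            return False, f"blocked content matched: {pattern}"
    return True, "ok"

allowed, reason = check_input("Where is my order?", topic="shipping")
if allowed:
    print("forward prompt to the LLM")   # call the model only when the guardrail passes
else:
    print(f"refuse request: {reason}")
```

In practice such rule-based checks are usually combined with a moderation model or classifier, but the control flow stays the same: validate first, and call the LLM only when the check passes.
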
8 changes: 4 additions & 4 deletions AI/Building LLM system/quantization-in-llm.md
@@ -34,17 +34,17 @@ Two primary approaches exist for LLM quantization:

![Linear Quantization](assets/quantization-in-llm-linear.webp)

There are many quantization schemes for reducing the size of the model. One technique is Linear Quantization, which maps floating-point values to a smaller range of values by shifting and scaling. There are two main modes in this technique:
- **Symmetric**: The zero-point is zero — i.e. 0.0 of the floating point range is the same as 0 in the quantized range. Typically, this is more efficient to compute at runtime but may result in lower accuracy if the floating point range is unequally distributed around the floating point 0.0.
- **Asymmetric**: The zero-point is non-zero. This can result in higher accuracy but may be less efficient to compute at runtime.

In this part, we focus on the asymmetric mode.

- ![Asymmetric mode](./assets/quantization-in-llm-formula.webp)
+ ![Asymmetric mode](assets/quantization-in-llm-formula.webp)
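
As a rough sketch of this asymmetric mapping (illustrative only, assuming an unsigned 8-bit target range; the helper names below are ours, not from the note), the scale and zero-point are derived from the tensor's min/max and then used to round each float into the integer range:

```python
import numpy as np

def asymmetric_quantize(x: np.ndarray, num_bits: int = 8):
    """Asymmetric linear quantization of a float tensor to unsigned integers."""
    qmin, qmax = 0, 2 ** num_bits - 1             # e.g. 0..255 for 8 bits
    rmin, rmax = float(x.min()), float(x.max())
    scale = (rmax - rmin) / (qmax - qmin)          # shift-and-scale factor
    zero_point = int(round(qmin - rmin / scale))   # integer that real 0.0 maps to
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Approximate reconstruction of the original floats."""
    return scale * (q.astype(np.float32) - zero_point)

weights = np.array([-1.2, -0.3, 0.0, 0.5, 2.7], dtype=np.float32)
q, scale, zp = asymmetric_quantize(weights)
print(q, scale, zp)
print(dequantize(q, scale, zp))   # close to the originals, with small rounding error
```

The symmetric mode is the special case where `zero_point` is fixed at 0 and the scale is taken from `max(abs(x))` instead of the min/max pair.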

The fundamental formula is:

@@ -132,4 +132,4 @@ Quantization stands as a pivotal technique in LLM optimization, enabling efficie
* https://medium.com/@lmpo/understanding-model-quantization-for-llms-1573490d44ad
* https://www.datacamp.com/tutorial/quantization-for-large-language-models
* https://medium.com/@vimalkansal/understanding-the-gguf-format-a-comprehensive-guide-67de48848256
* https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
12 changes: 6 additions & 6 deletions devbox/introduction/The reason for being.md
@@ -1,5 +1,5 @@
---
tags:
- devbox
title: "The reason for being"
date: 2024-08-01
@@ -9,34 +9,34 @@ authors:
---

## The Pursuit of Consistency
With the rise of cloud services, we first adopted Docker as a containerization tool to wrap our applications in an isolated environment for running in the cloud in production. Then we began setting up development environments inside containers, or took advantage of docker-compose to run multiple services without installing them locally. This lets us share the environment easily and reproduce the same environment with minimal changes, using only one or two script files.

## Docker's Achilles Heel

### Bad build cache
Docker builds are not very efficient for development, because the entire environment has to be rebuilt each time the code changes. The rebuild does not only affect the image layer where the code is placed; it also invalidates every layer built after it, so those layers are rebuilt as well, costing extra time and resources.

- ![alt text](./assets/docker_layers.png)
+ ![alt text](assets/docker_layers.png)

### Pulling from the internet is not stable
The installed application version can be pinned in the `apt-get install` command, but its dependencies are not stable: repositories get updated and transitive dependencies change without the main package version changing. So once we run `apt-get update`, the produced image may not be the same as the previous build.

### Docker is heavy on non-Linux OS
On Windows and macOS, Docker containers run inside a Linux VM, so part of the host machine's resources must be set aside for this VM, which slows the computer down.
- ![alt text](./assets/docker_container_run.png)
+ ![alt text](assets/docker_container_run.png)

### How do we resolve these problems?
The problems can be summed up as slow, inconsistent, and cumbersome local development. We can work around them with some Docker hacks, but we can also avoid using Docker altogether (or change your machine if you want). Among a variety of solutions, we consider trying Devbox. What is it?

## Devbox

### No build step, no VM needed
Devbox is powered by Nix, a cross-platform package manager for Unix-like systems that can build any package or application to run natively on your machine, with 100,000+ Nix packages available. Devbox simply scopes an isolated workspace and then installs all the dependencies needed to run your project. These dependencies are native to your OS, so they run directly, without any middleman.

### Consistent dependencies
All packages in the Devbox workspace are linked from your local Nix store, where every application is installed. Each package is identified by a unique hash and linked to the others through a dependency tree, so it does not change over time. As a result, the Devbox environment can be reproduced perfectly and shared between teammates with just a script file.

- ![alt text](./assets/nix_deps_graph.png)
+ ![alt text](assets/nix_deps_graph.png)

The hash differs between versions of the same application, so we can keep multiple versions of a package in the Nix store and easily link each version to the projects that require it.

