diff --git a/AI/Building LLM system/guardrails-in-llm.md b/AI/Building LLM system/guardrails-in-llm.md
index 5297da83..73a14993 100644
--- a/AI/Building LLM system/guardrails-in-llm.md
+++ b/AI/Building LLM system/guardrails-in-llm.md
@@ -16,7 +16,7 @@ Guardrails in LLM are a set of techniques and strategies designed to control and
 
 ## Types of guardrails
 
-![Guardrails in LLM](./assets/guardrails-in-llm.webp)
+![Guardrails in LLM](assets/guardrails-in-llm.webp)
 
 1. **Input guardrails**: This involves pre-processing the input to the model to remove or modify any potentially harmful or inappropriate content. This can include filtering out profanity, hate speech, or sensitive information. Some common usecases:
    - **Topical guardrails**: Limit the model's responses to a specific topic or domain to prevent it from generating off-topic or irrelevant content.
diff --git a/AI/Building LLM system/quantization-in-llm.md b/AI/Building LLM system/quantization-in-llm.md
index f93ba054..7637f2d2 100644
--- a/AI/Building LLM system/quantization-in-llm.md
+++ b/AI/Building LLM system/quantization-in-llm.md
@@ -34,17 +34,17 @@ Two primary approaches exist for LLM quantization:
 
 ![Linear Quantization](assets/quantization-in-llm-linear.webp)
 
-There are many quantization schema to reduce the size of the model. One technique is called Linear Qunatization - which is used to map the floating point values to the smaller range of values by shifting and scaling. There are 2 main modes in this technique: 
+There are many quantization schema to reduce the size of the model. One technique is called Linear Qunatization - which is used to map the floating point values to the smaller range of values by shifting and scaling. There are 2 main modes in this technique:
 
 - **Symmetric**: The zero-point is zero — i.e. 0.0 of the floating point range is the same as 0 in the quantized range. Typically, this is more efficient to compute at runtime but may result in lower accuracy if the floating point range is unequally distributed around the floating point 0.0.
 - **Asymmetric**: Zero-point that is non-zero in value. This can result in higher accuracy but may be less efficient to compute at runtime.
 
 In this part, we focus on the asymmetric mode.
 
-![Asymmetric mode](./assets/quantization-in-llm-formula.webp)
+![Asymmetric mode](assets/quantization-in-llm-formula.webp)
 
 In this part, we focus on the asymmetric mode.
 
-![Asymmetric mode](./assets/quantization-in-llm-formula.webp)
+![Asymmetric mode](assets/quantization-in-llm-formula.webp)
 
 The fundamental formula is:
@@ -132,4 +132,4 @@ Quantization stands as a pivotal technique in LLM optimization, enabling efficie
 * https://medium.com/@lmpo/understanding-model-quantization-for-llms-1573490d44ad
 * https://www.datacamp.com/tutorial/quantization-for-large-language-models
 * https://medium.com/@vimalkansal/understanding-the-gguf-format-a-comprehensive-guide-67de48848256
-* https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
\ No newline at end of file
+* https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
diff --git a/devbox/introduction/The reason for being.md b/devbox/introduction/The reason for being.md
index ad955038..dbf8b611 100644
--- a/devbox/introduction/The reason for being.md
+++ b/devbox/introduction/The reason for being.md
@@ -1,5 +1,5 @@
 ---
-tags: 
+tags:
 - devbox
 title: "The reason for being"
 date: 2024-08-01
@@ -9,34 +9,34 @@ authors:
 ---
 
 ## The Pursuit of Consistency
 
-With the rise of cloud services, we first adapted Docker as a containerization tool to wrap our application into an isolated environment for running on the cloud for production. Then we begin setting up a development environment inside a container or take advantage of docker-compose to run multiple services without installing locally. This helps us share the environment easily, and run the same environment repeatably with minimum changes with just only 1-2 script files. 
+With the rise of cloud services, we first adapted Docker as a containerization tool to wrap our application into an isolated environment for running on the cloud for production. Then we begin setting up a development environment inside a container or take advantage of docker-compose to run multiple services without installing locally. This helps us share the environment easily, and run the same environment repeatably with minimum changes with just only 1-2 script files.
 
 ## Docker's Achilles Heel
 
 ### Bad build cache
 
 Docker build is actually not very effective because basically, we need to rebuild the entire environment each time our code is changed when developing. This rebuild process not only affects the image layer where the code is placed but also lets other layer below it is rebuilt. It takes more time and resource to do it.
 
-![alt text](./assets/docker_layers.png)
+![alt text](assets/docker_layers.png)
 
 ### Pulling from internet is not stable
 
 The installed application version can be specified in the `apt-get add` script, but its dependencies are not stable. Repositories can be updated, transitive dependencies are changed without changing the main package version. So once we run `apt-get update`, the produced image may not be the same as the previous build.
 
 ### Docker is heavy on non-Linux OS
 
 On Windows and MacOS, Docker containers are run on a Linux VM. So we need to split resources from the host machine to serve this VM. It makes our computer become slow.
 
-![alt text](./assets/docker_container_run.png)
+![alt text](assets/docker_container_run.png)
 
 ### How to resolve problems?
 
 The problems can be summed up as slow, asynchronous, and cumbersome when developing locally. We actually can resolve them by using some Docker hack. But we can also do it by avoiding using Docker (or changing your machine if you want). Among a variety solutions, we consider trying using Devbox. What's it?
 
 ## Devbox
 
-### Do not build, do not need VM 
+### Do not build, do not need VM
 
 Powered by Nix which is known as a cross-platform package manager for Unix-like systems with the ability to build any package, or application running natively on your machine with 100.000+ available Nix packages. Devbox will simply scope an isolated workspace, and then install all dependencies to help you run your project. All these dependencies are native to your OS, so just only need to run without any other middleman.
 
 ### Consistent dependencies
 
 All packages in the Devbox workspace are actually linked from your local Nix storage, where every application is installed. They are identified by a unique hash and are linked together by a dependency tree. So it will not change over time. So the Devbox environment can be reproduced perfectly and easily to share between teammates with just a script file.
 
-![alt text](./assets/nix_deps_graph.png)
+![alt text](assets/nix_deps_graph.png)
 
 The hash is different between versions of the same application. So we have different versions of a package in our Nix storage. And easily link it to different projects that require different versions of a package.
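
As an aside to the quantization-in-llm.md hunk above, which stops just before its "fundamental formula": asymmetric linear quantization maps a float range onto an integer range with a scale and a non-zero zero-point. A minimal sketch of that idea, assuming a signed int8 target range; the helper names and the sample values are illustrative, not taken from the diffed note.

```python
import numpy as np

def asymmetric_quantize(x: np.ndarray, num_bits: int = 8):
    """Map float values to signed integers using a scale and a zero-point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (x.max() - x.min()) / (qmax - qmin)       # float spread per integer step
    zero_point = int(round(qmin - x.min() / scale))   # integer that represents float 0.0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover approximate float values: x ≈ scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)

weights = np.array([-1.2, 0.0, 0.35, 2.7], dtype=np.float32)  # illustrative values
q, scale, zp = asymmetric_quantize(weights)
print(q, dequantize(q, scale, zp))  # close to the originals, with small rounding error
```

In the symmetric mode described in the same hunk, `zero_point` would simply be fixed at 0, trading a little accuracy on unbalanced ranges for cheaper runtime arithmetic.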