First draft

lordofthejars committed Oct 14, 2024
1 parent 7e7cdca commit 5d1930c
Showing 4 changed files with 150 additions and 16 deletions.
14 changes: 10 additions & 4 deletions documentation/modules/ROOT/nav.adoc
@@ -1,7 +1,13 @@
* xref:01-setup.adoc[1. Setup]
** xref:01-setup.adoc#prerequisite[Prerequisites]
** xref:01-setup.adoc#minikube[Setup Minikube]
* xref:02-deploy.adoc[2. Deploy Service]
** xref:02-deploy.adoc#package[Build Service]
** xref:02-deploy.adoc#deploy[Deploy Dervice]
* xref:02-deploy.adoc[2. Serving Models]
** xref:02-deploy.adoc#initinstructlab[Initializing InstructLab]
** xref:02-deploy.adoc#downservtest[Downloading, serving, and testing a model with InstructLab]
* xref:03-train.adoc[3. Model Alignment and Training]
** xref:03-train.adoc#addknow[Adding knowledge and skills to an LLM]
** xref:03-train.adoc#gensynth[Generating synthetic data for model training]
** xref:03-train.adoc#modeltrain[Model training with InstructLab]
** xref:03-train.adoc#convserv[Converting and serving the aligned model]
** xref:03-train.adoc#testnewmodel[Testing the new model]
6 changes: 3 additions & 3 deletions documentation/modules/ROOT/pages/02-deploy.adoc
@@ -1,7 +1,7 @@
= Serving Models
include::_attributes.adoc[]

[#service]
[#initinstructlab]
== Initializing InstructLab

With ilab installed, we can initialize our tuning environment with the `ilab config init` command.
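
For reference, initializing the environment is a single command; a minimal sketch is shown below (the interactive prompts and full output are covered in the rest of this section):

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
# Initialize the InstructLab environment: writes config.yaml and clones the taxonomy repository
ilab config init
----
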
@@ -43,7 +43,7 @@ File is placed by default at `<home>/.config/instructlab/config.yaml`.

In this example, we use *merlinite-7b* as a model, but you could use *Granite*, *Mistral*, *Llama*, or any other supported model (`gguf` format).
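
If you would rather experiment with one of those alternative models, a download sketch could look like the following. The `--repository` and `--filename` flags and the exact Hugging Face repository and file name are assumptions for illustration, so check `ilab model download --help` for the options available in your version:

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
# Hypothetical example: pull a different GGUF model from Hugging Face
# (repository and file name are illustrative, not part of this guide)
ilab model download --repository TheBloke/Mistral-7B-Instruct-v0.2-GGUF --filename mistral-7b-instruct-v0.2.Q4_K_M.gguf
----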

[#package]
[#downservtest]
== Downloading, serving, and testing a model with InstructLab

=== Downloading a model
@@ -126,7 +126,7 @@ You'll receive a polite answer saying that it has no knowledge to answer this question.
[.console-input]
[source, bash,subs="+macros,+attributes"]
----
What is the price of a new Flux capacitor for DeLorean car?
what is the cost of repairing flux capacitor?
----

[.console-output]
129 changes: 124 additions & 5 deletions documentation/modules/ROOT/pages/03-train.adoc
@@ -1,11 +1,11 @@
= Model alignment and training
= Model Alignment and Training
include::_attributes.adoc[]

Large Language Models, while impressive in their ability to conversate and recall training information, sometimes they arent aware of specific details due to their large training set of data.
Large Language Models, while impressive in their ability to converse and recall training information, are sometimes unaware of specific details that their broad, general-purpose training data doesn't cover.

Lets learn how to contribute the correct information to this model using InstructLab!
Let's learn how to contribute the correct information to this model using InstructLab!

[#service]
[#addknow]
== Adding knowledge and skills to an LLM

In a new terminal window, navigate to the taxonomy directory (`<home>/.local/share/instructlab/taxonomy`) that was cloned during the initialization step.
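
As a quick sketch of that navigation step, assuming the default taxonomy location (the DeLorean knowledge path used later in this guide is created here only as an example):

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
# Go to the taxonomy clone created during `ilab config init`
cd ~/.local/share/instructlab/taxonomy
# Create the knowledge path this guide uses for the DeLorean example
mkdir -p knowledge/trivia/delorean
----
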
@@ -166,6 +166,7 @@ knowledge/trivia/delorean/qna.yaml
Taxonomy in ... is valid :)
----

[#gensynth]
== Generating synthetic data for model training

So we've added some task-specific knowledge. Now what? The next step is to use InstructLab's synthetic data generation pipeline to create a much larger training dataset from those examples.
@@ -175,4 +176,122 @@
The key insight behind InstructLab's LAB method is that we can use the base model itself to massively expand a small set of human-provided examples.
By prompting the model to generate completions conditioned on your examples, we can produce a synthetic dataset that's much larger and more diverse than what you could feasibly write by hand.

We can run the `ilab data generate` command to begin generating synthetic data (by default, 100 data points). Remember, we still need to be serving the model with ilab model serve in another terminal instance.
We can run the `ilab data generate` command to begin generating synthetic data (by default, 100 data points). Remember, we still need to be serving the model with `ilab model serve` in another terminal instance.

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
ilab data generate --pipeline simple
----

During this process, we can visualize the example queries and answers, light preprocessing, and prompts being fed to the base model to generate a large number of candidate completions.
The generated completions are filtered and post-processed to remove low-quality or irrelevant outputs.
This is a critical step, as the model can sometimes generate nonsensical or factually incorrect responses.

On a typical laptop CPU, synthetic data generation can take anywhere from a few minutes to a few hours, depending on the number of examples and generation parameters.
Using a GPU will significantly speed things up.
The end result is a set of JSONL files in the specified output directory, with a train/validation/test split.
Each line contains an example with input and target completion fields; feel free to open the files in vim (or any editor) to see how things work under the hood.
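
For example, a quick way to peek at the generated files might look like the sketch below; the datasets directory and the file-name pattern are assumptions based on the default InstructLab layout, so adjust them to whatever `ilab data generate` reports on your machine:

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
# List the generated dataset files (location assumed from the default config)
ls ~/.local/share/instructlab/datasets
# Pretty-print the first generated example to inspect its fields
head -n 1 ~/.local/share/instructlab/datasets/train_*.jsonl | python3 -m json.tool
----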

Stop the `ilab model serve` command in the first terminal to free up resources for the following section.

[#modeltrain]
== Model training with InstructLab

Once the synthetic data generation is complete, it's time to actually tune the model on the synthetic data with `ilab model train`.

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
ilab model train --pipeline=simple
----

This command will download the necessary model files (if not already available) and begin the alignment phase. On an M1 Mac, it should take 5-15 minutes.

[#convserv]
== Converting and serving the aligned model

The tuned model weights will be saved in the `<home>/.local/share/instructlab/checkpoints/instructlab-granite-7b-lab-mlx-q` directory.

The command `ilab model convert` will convert the model to the GGUF format, creating a quantized version of the model to share on HuggingFace, use locally, etc. Be sure to first stop the terminal instance that is serving the model (with `ilab model serve`).

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
ilab model convert --model-dir <home>/.local/share/instructlab/checkpoints/instructlab-granite-7b-lab-mlx-q
----

Replace `<home>` with your home directory.
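
For instance, on Linux or macOS the expanded command typically looks like this, with `~` standing in for your home directory:

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
ilab model convert --model-dir ~/.local/share/instructlab/checkpoints/instructlab-granite-7b-lab-mlx-q
----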

Once finished, we'll have a new directory containing our aligned and 4-bit quantized model, for example, `instructlab-granite-7b-lab-trained`.

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
ilab model serve --model-path instructlab-granite-7b-lab-trained/instructlab-granite-7b-lab-Q4_K_M.gguf
----

[#testnewmodel]
== Testing the new model

With the model served, we can switch over to our other terminal instance and use `ilab model chat` to converse with the model and verify the new knowledge; if you need to point the chat client at the new quantized model explicitly, see the note after the command below.

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
ilab model chat
----
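
If the chat client does not pick up the served model by default, you can point it at the quantized file explicitly. The `--model` flag and the relative path below are assumptions, so check `ilab model chat --help` if this does not match your version:

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
# Hypothetical: explicitly reference the quantized model that the server is using
ilab model chat --model instructlab-granite-7b-lab-trained/instructlab-granite-7b-lab-Q4_K_M.gguf
----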

And then let's repeat the question:

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
what is the cost of repairing flux capacitor?
----

[.console-output]
[source, bash,subs="+macros,+attributes"]
----
A Flux Capacitor in a DeLorean DMC-12 from the Back to the Future movies has an estimated budget of around 10,000 USD in 1985, which corresponds to approximately 33,746 USD in 2021.
This cost includes various components such as a super capacitor array with 1000 farads each, resistor arra
....
----

It recalls the exact information we provided in the taxonomy repository. We've successfully performed model alignment on consumer-grade hardware and tailored this LLM for our specific use case.

Type `exit` to leave the `ilab model chat` session.
In the same way, you can use `curl` to access this service.

[.console-input]
[source, bash,subs="+macros,+attributes"]
----
curl -X 'POST' \
'http://localhost:8000/v1/chat/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "what is the cost of repairing flux capacitor"
}
]
}'
----

And the response:

[.console-output]
[source, json,subs="+macros,+attributes"]
----
{"id":"chatcmpl-ea9a8028-78d9-4a39-ab82-03a88c4f88da","object":"chat.completion","created":1728913151,"model":"gpt-3.5-turbo","choices":[{"index":0,"message":{"content":"The Flux Capacitor, a crucial component of the time-traveling DeLorean in the Back to the Future series, is not a real machine and does not have an established cost for repairs. However, if we were to consider the hypothetical costs of creating such a device, several factors would come into play:. **Quantum Bands:** The Flux Capacitor requires superconducting quantum loops to function, which would need to be cooled to near abso.....","role":"assistant"},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":27,"completion_tokens":464,"total_tokens":491}}%
----

You can access the served model through a REST call, using the OpenAI-compatible API format.
17 changes: 13 additions & 4 deletions documentation/modules/ROOT/pages/index.adoc
@@ -12,9 +12,18 @@ InstructLab provides tools to enhance LLMs with additional knowledge and skills

[.tile]
.xref:01-setup.adoc[Get Started]
* xref:01-setup.adoc#minikube[Minikube]
* xref:01-setup.adoc#prerequisite[Prerequisites]


[.tile]
.xref:02-deploy.adoc[Serving Models]
* xref:02-deploy.adoc#initinstructlab[Initializing InstructLab]
* xref:02-deploy.adoc#downservtest[Downloading, serving, and testing a model with InstructLab]

[.tile]
.xref:02-deploy.adoc[Deploying]
* xref:02-deploy.adoc#package[Package the Application]
* xref:02-deploy.adoc#deploy[Deploy the Application]
.xref:03-train.adoc[Model Alignment and Training]
* xref:03-train.adoc#addknow[Adding knowledge and skills to an LLM]
* xref:03-train.adoc#gensynth[Generating synthetic data for model training]
* xref:03-train.adoc#modeltrain[Model training with InstructLab]
* xref:03-train.adoc#convserv[Converting and serving the aligned model]
* xref:03-train.adoc#testnewmodel[Testing the new model]
