Add install and build instructions, refactor docs structure #22
Conversation
Looking good so far!
src/README.md (outdated)

```md
## Supported models

1. chatglm
   1. https://huggingface.co/THUDM/chatglm2-6b - refer to
      [chatglm2-6b - AttributeError: can't set attribute](../../../llm_bench/python/doc/NOTES.md#chatglm2-6b---attributeerror-cant-set-attribute)
```
Not sure if we need to keep troubleshooting hint links in the "Supported Models" table.
But if we do need such links (currently only the chatglm2-6b and Qwen-7B-Chat-Int4 models have them), I would suggest placing them at the end of the SUPPORTED_MODELS.md file in a "Troubleshooting" section.
No need to have them. These are for llm_bench.
```diff
@@ -1,6 +1,17 @@
-## GenAI Pipeline Repository
+# OpenVINO™ GenAI
```
Out of curiosity, is the ™ required? The benefit of not having it is that it's potentially easier to scan the repo for non-ASCII chars. But we don't have such a check for now, so no need to change.
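If a repo-wide check for non-ASCII characters is ever wanted, here is a minimal sketch of one way to do it (assuming GNU grep with PCRE support; not something this PR adds):

```sh
# List files containing any character outside the ASCII range,
# skipping the .git directory.
grep -rlP '[^\x00-\x7F]' --exclude-dir=.git .
```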
src/docs/BUILD.md (outdated)

```sh
git clone https://github.com/openvinotoolkit/openvino.genai.git
cd openvino.genai
git submodule update --init --recursive
```
Suggested change:

```diff
-git clone https://github.com/openvinotoolkit/openvino.genai.git
-cd openvino.genai
-git submodule update --init --recursive
+git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
+cd openvino.genai
```
Fixed
```sh
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --target package -j
cmake --install ./build/ --config Release --prefix ov
./ov/samples/cpp/build_samples.sh -i ./s\ pace
```
`s pace` is used in tests to verify that our cmake vars handle it correctly. You shouldn't encourage having spaces.
I'm also not sure whether installation is part of the build process. Samples can be run from the build folder; they are installed to `ov` in tests to verify that it will work from the package.
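For illustration, a sketch of how the build snippet could read without the space-containing prefix and without the install step (assuming samples are runnable from the build tree, as noted above; the exact sample locations under ./build/ depend on the project layout):

```sh
# Configure and build in Release mode; no install step is needed
# just to try the samples from the build folder.
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release -j
# Samples can then be run directly from their output location under ./build/.
```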
src/docs/INSTALL_ARCHIVE.md (outdated)

```sh
curl -L https://storage.openvinotoolkit.org/repositories/openvino_genai/packages/2024.1/linux/l_openvino_genai_toolkit_ubuntu22_2024.1.0.15008.f4afc983258_x86_64.tgz --output openvino_genai_2024.1.0.tgz
tar -xf openvino_genai_2024.1.0.tgz
sudo mv l_openvino_genai_toolkit_ubuntu24_2024.1.0.15008.f4afc983258_x86_64 /opt/intel/openvino_genai_2024.1.0
```
Please replace `2024.1.0` with `2024.2.0` everywhere. It'll be published for `2024.2.0`.
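Not part of the thread, but a sketch of one way such a bulk replacement could be done (GNU sed assumed; the src/docs path is taken from the files under review, and the resulting diff should be checked before committing):

```sh
# Replace the version string in every doc file that mentions it (in-place edit).
grep -rl '2024\.1\.0' src/docs | xargs sed -i 's/2024\.1\.0/2024\.2\.0/g'
```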
Fixed
```sh
sudo mv l_openvino_genai_toolkit_ubuntu22_2024.1.0.15008.f4afc983258_x86_64 /opt/intel/openvino_genai_2024.1.0
```

For other operating systems, please refer to the guides in documentation:
Since you cover ubuntu20 and 22 separately, you should add a TODO for ubuntu24, which is going to be supported starting with 2024.2.
Added a placeholder command for ubuntu24 and a TODO for updating links to archives.
Basically, the install commands for different Ubuntu versions are quite similar; only the archive links differ. Does it make sense to keep only one (generic) set of Ubuntu instructions and provide links to the docs for other OSes?
Yes, I think one is better.
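For illustration only, a sketch of what a single generic set of instructions might look like; the angle-bracket values are placeholders, not real links or folder names:

```sh
# Download and unpack the archive for your Ubuntu version, then move it to
# the conventional install location. <...> values must be filled in by the reader.
curl -L <archive-url-for-your-ubuntu-version> --output openvino_genai.tgz
tar -xf openvino_genai.tgz
sudo mv <extracted-archive-folder> /opt/intel/openvino_genai_<version>
```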
LLMs return logits with probabilities for each token; these probabilities can be converted to tokens/words with different techniques: greedy decoding, beam search decoding, random sampling, etc. This requires writing user-unfriendly post-processing even for the simplest scenario of greedy decoding. In order to make life easier, we combined all decoding scenarios into a single function call, where the decoding method and parameters are specified by arguments. In this PR we provide a user-friendly API for text generation inspired by the `generate` method from the HuggingFace transformers library.

- [x] enable calling tokenizers/detokenizers from LLMPipeline
- [ ] add callback for streaming mode - done partially, needs to be improved
- [x] rewritten samples with the current approach: [causal_lm/cpp/generate_pipeline/generate_sample.cpp#L73-L83](https://github.com/pavel-esir/openvino.genai/blob/generate_pipeline/text_generation/causal_lm/cpp/generate_pipeline/generate_sample.cpp#L73-L83)
- [x] Multibatch greedy decoding
- [ ] Speculative decoding
- [ ] Grouped Beam Search decoding: ready for batch 1, need to rebase multibatch support after merging openvinotoolkit#349
- [x] Random sampling

Example 1: Greedy search generation

```
LLMPipeline pipe(model_path, device);

// Will try to load config from generation_config.json,
// but if not found, default values for greedy search will be used.
GenerationConfig config = pipe.generation_config();

cout << pipe(prompt, config.max_new_tokens(20));
```

Example 2: TextStreaming mode

```
LLMPipeline pipe(model_path, device);

GenerationConfig config = pipe.generation_config();

auto text_streamer = TextStreamer{pipe};
auto text_streamer_callback = [&text_streamer](std::vector<int64_t>&& tokens, LLMPipeline& pipe){
    text_streamer.put(tokens[0]);
};

pipe(prompt, config.max_new_tokens(20).set_callback(text_streamer_callback));
text_streamer.end();
```

CVS-132907 CVS-137920

---------

Co-authored-by: Wovchena <vladimir.zlobin@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
Co-authored-by: Alexander Suvorov <alexander.suvorov@intel.com>
Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>
Co-authored-by: Xiake Sun <xiake.sun@intel.com>
Co-authored-by: wenyi5608 <93560477+wenyi5608@users.noreply.github.com>
Co-authored-by: Ekaterina Aidova <ekaterina.aidova@intel.com>
Co-authored-by: guozhong wang <guozhong.wang@intel.com>
Co-authored-by: Chen Peter <peter.chen@intel.com>
Closing as the branch is retargeted.
Previous PR for comments reference: pavel-esir#22

---------

Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>