[JS] Add GenAI Node.js bindings #1193
Changes from 67 commits
@@ -3,3 +3,4 @@
#
option(ENABLE_PYTHON "Enable Python API build" ON)
option(ENABLE_JS "Enable JS API build" OFF)
@@ -0,0 +1 @@
node_modules
@@ -0,0 +1,48 @@
# JavaScript chat_sample that supports most popular models like LLaMA 3

This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen`, and other models with the same signature. The application has few configuration options, to encourage the reader to explore and modify the source code; for example, change the device for inference to GPU. The sample features `Pipeline.LLMPipeline` and configures it for the chat scenario.
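In brief, the sample drives the pipeline like this (a condensed sketch of the calls used in `chat_sample.js`, which is shown in full later in this PR):

```js
import { Pipeline } from 'genai-node';

// Load the converted model and build an LLM pipeline on CPU.
const pipe = await Pipeline.LLMPipeline('TinyLlama-1.1B-Chat-v1.0', 'CPU');

await pipe.startChat();
// The third argument is a streamer callback, invoked once per generated subword.
await pipe.generate('Say hello!', { 'max_new_tokens': 100 }, chunk => process.stdout.write(chunk));
await pipe.finishChat();
```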

## Download and convert the model and tokenizers

To convert the model, use the Python package `optimum-intel`.
The `--upgrade-strategy eager` option is needed to ensure `optimum-intel` is upgraded to the latest version.

Install [../../export-requirements.txt](../../export-requirements.txt) to convert a model.

```sh
pip install --upgrade-strategy eager -r ../../export-requirements.txt
optimum-cli export openvino --trust-remote-code --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
```

## Run

First, compile the GenAI JavaScript bindings archive using the instructions in [../../../src/js/README.md](../../../src/js/README.md#build-bindings).

Run `npm install` in the current folder, then run the sample:

```sh
npm install
node chat_sample.js TinyLlama-1.1B-Chat-v1.0
```

Discrete GPUs (dGPUs) usually provide better performance than CPUs. It is recommended to run larger models on a dGPU with 32GB+ RAM; for example, the model meta-llama/Llama-2-13b-chat-hf can benefit from being run on a dGPU. Modify the source code to change the device for inference to the GPU, as shown below.
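A minimal sketch of that edit in `chat_sample.js` (the device string is the only change needed; `'GPU'` assumes a working OpenVINO GPU plugin on the machine):

```js
// chat_sample.js: select the inference device.
const device = 'GPU'; // the sample defaults to 'CPU'
```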
See https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md#supported-models for the list of supported models.

### Troubleshooting

#### Unicode characters encoding error on Windows

Example error:
```
UnicodeEncodeError: 'charmap' codec can't encode character '\u25aa' in position 0: character maps to <undefined>
```

If you encounter this error while the sample is printing output to the Windows console, it is likely because the default Windows encoding does not support certain Unicode characters. To resolve this:
1. Enable Unicode characters for Windows cmd: open `Region` settings from `Control panel`, then `Administrative` -> `Change system locale` -> `Beta: Use Unicode UTF-8 for worldwide language support` -> `OK`, and reboot.
2. Enable UTF-8 mode by setting the environment variable `PYTHONIOENCODING="utf8"`, as in the example below.
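For example, in a Windows Command Prompt session (illustrative; note this variable affects the Python-based tooling, such as the conversion step):

```sh
set PYTHONIOENCODING=utf8
```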

#### Missing chat template

If you encounter an exception indicating a missing "chat template" when launching `ov::genai::LLMPipeline` in chat mode, it likely means the model was not tuned for chat functionality. To work around this, manually add a chat template to the `tokenizer_config.json` of your model.
The following template can be used as a default, but it may not work properly with every model:
```
"chat_template": "{% for message in messages %}{% if (message['role'] == 'user') %}{{'<|im_start|>user\n' + message['content'] + '<|im_end|>\n<|im_start|>assistant\n'}}{% elif (message['role'] == 'assistant') %}{{message['content'] + '<|im_end|>\n'}}{% endif %}{% endfor %}",
```
@@ -0,0 +1,54 @@
import readline from 'readline';
import { Pipeline } from 'genai-node';

main();

// Streamer callback: print each generated subword as it arrives
function streamer(subword) {
  process.stdout.write(subword);
}

async function main() {
  const MODEL_PATH = process.argv[2];

  if (!MODEL_PATH) {
    console.error('Please specify path to model directory\n'
      + 'Run command must be: `node chat_sample.js *path_to_model_dir*`');
    process.exit(1);
  }

  const device = 'CPU'; // GPU can be used as well

  // Create interface for reading user input from stdin
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  const pipe = await Pipeline.LLMPipeline(MODEL_PATH, device);
  const config = { 'max_new_tokens': 100 };

  await pipe.startChat();
  promptUser();

  // Function to prompt the user for input
  function promptUser() {
    rl.question('question:\n', handleInput);
  }

  // Function to handle user input
  async function handleInput(input) {
    input = input.trim();

    // Empty input ends the chat
    if (!input) {
      await pipe.finishChat();
      rl.close();
      process.exit(0);
    }

    await pipe.generate(input, config, streamer);
    console.log('\n----------');

    if (!rl.closed) promptUser();
  }
}
@@ -0,0 +1,15 @@
{
  "name": "genai-node-demo",
  "version": "1.0.0",
  "license": "Apache-2.0",
  "type": "module",
  "devDependencies": {
    "genai-node": "../../../src/js/"
  },
  "engines": {
    "node": ">=21.0.0"
  },
  "scripts": {
    "test": "node tests/usage.test.js"
  }
}
@@ -0,0 +1,63 @@
import { env } from 'process';
import { spawn } from 'child_process';

const MODEL_PATH = env.MODEL_PATH;
const prompt = 'Tell me exactly, no changes, print as is: "Hello world"';
const expected = 'Hello world';

if (!MODEL_PATH)
  throw new Error(
    'Please set the environment variable MODEL_PATH to the path of the model directory'
  );

const runTest = async () => {
  return new Promise((resolve, reject) => {
    const script = spawn('node', ['chat_sample.js', MODEL_PATH]);
    let output = '';

    // Collect output from stdout
    script.stdout.on('data', (data) => {
      output += data.toString();
    });

    // Capture errors
    script.stderr.on('data', (data) => {
      reject(data.toString());
    });

    // Send input after detecting the question prompt
    script.stdout.once('data', (data) => {
      if (data.toString().startsWith('question:')) {
        script.stdin.write(`${prompt}\n`); // Provide input
        script.stdin.end(); // Close stdin to signal EOF
      }
    });

    // Check results when the process exits
    script.on('close', (code) => {
      if (code !== 0) {
        return reject(`Process exited with code ${code}`);
      }

      // Log the output
      console.log(`Result output: ${output}`);

      // Validate the output
      if (output.includes(expected)) {
        resolve('Test passed!');
      } else {
        reject('Test failed: Output did not match expected result.');
      }
    });
  });
};

runTest()
  .then((message) => {
    console.log(message);
    process.exit(0);
  })
  .catch((err) => {
    console.error(err);
    process.exit(1);
  });
@@ -147,13 +147,42 @@ if(MSVC OR APPLE)
     set(ARCH_DIR ${ARCH_DIR}/${CMAKE_BUILD_TYPE})
 endif()

+# Put binaries at the top level for NPM package
+if(CPACK_GENERATOR STREQUAL "NPM")

Review discussion on this line:
- "Does it make sense to use js bindings from build tree directly? If not, you should enforce …"
- "We can try, but I am not sure about this change. But @ilya-lavrenov, could you suggest the optimal way to handle this behavior?"
- "looks like if we follow @Wovchena we will not break anything for GenAI specifically"

+    set(LIBRARY_DESTINATION .)
+    set(ARCHIVE_DESTINATION .)
+    set(RUNTIME_DESTINATION .)
+
+    # setting RPATH / LC_RPATH depending on platform
+    if(LINUX)
+        # to find libopenvino.so in the same folder
+        set(rpaths "$ORIGIN")
+    elseif(APPLE)
+        # to find libopenvino.dylib in the same folder
+        set(rpaths "@loader_path")
+    endif()
+
+    if(rpaths)
+        set_target_properties(${TARGET_NAME} PROPERTIES INSTALL_RPATH "${rpaths}")
+    endif()
+else()
+    set(LIBRARY_DESTINATION runtime/lib/${ARCH_DIR})
+    set(ARCHIVE_DESTINATION runtime/lib/${ARCH_DIR})
+    set(RUNTIME_DESTINATION runtime/bin/${ARCH_DIR})
+endif()
+
 install(TARGETS ${TARGET_NAME} EXPORT OpenVINOGenAITargets
-        LIBRARY DESTINATION runtime/lib/${ARCH_DIR} COMPONENT core_genai
+        LIBRARY DESTINATION ${LIBRARY_DESTINATION} COMPONENT core_genai
         NAMELINK_COMPONENT core_genai_dev
-        ARCHIVE DESTINATION runtime/lib/${ARCH_DIR} COMPONENT core_genai_dev
-        RUNTIME DESTINATION runtime/bin/${ARCH_DIR} COMPONENT core_genai
+        ARCHIVE DESTINATION ${ARCHIVE_DESTINATION} COMPONENT core_genai_dev
+        RUNTIME DESTINATION ${RUNTIME_DESTINATION} COMPONENT core_genai
         INCLUDES DESTINATION runtime/include)

+# samples do not need to be built for NPM package
+if(CPACK_GENERATOR STREQUAL "NPM")
+    return()
+endif()
+
 install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/include/
         DESTINATION runtime/include COMPONENT core_genai_dev)
 install(FILES ${CMAKE_CURRENT_BINARY_DIR}/openvino/genai/version.hpp
@@ -0,0 +1,7 @@
.vscode
bin
bin.*
build
thirdparty
node_modules
tests/models
@@ -0,0 +1,15 @@
.vscode
bin.*
build
include
src
tests

.eslintrc.js
CMakeLists.txt
tsconfig.json
TODO.md
build.sh

**/*.tsbuildinfo
*.tgz