I would suggest the following steps for when we want to rerun the text generation with new models:
Do before re-generating
Update vLLM and other necessary packages, so we can also update the Python version.
Everything currently runs with the Coder Python 1.87.2 app on UCloud, which ships Python 3.10. There have been nine updates to the UCloud app since then.
Look into whether vLLM has added a "min_tokens" parameter.
Currently, I compute string lengths and re-generate in a for loop up to n = 20 times to avoid generations falling below the desired number of tokens for each task. This hacky workaround becomes unnecessary if a built-in option now exists (see the sketch below).
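If a recent vLLM release does expose a minimum-length option, the retry loop could be replaced with something like the sketch below. This assumes vLLM's SamplingParams accepts a min_tokens argument; the token counts and prompt are placeholders.

```python
from vllm import LLM, SamplingParams

# Hypothetical replacement for the manual re-generation loop: let vLLM
# enforce the minimum length directly (assumes the installed vLLM version
# supports min_tokens in SamplingParams).
llm = LLM(model="stabilityai/StableBeluga-7B")  # example model from this issue

sampling_params = SamplingParams(
    min_tokens=100,   # placeholder minimum; set per task
    max_tokens=512,   # placeholder maximum
    temperature=0.8,
)

outputs = llm.generate(["Write a short story about a lighthouse."], sampling_params)
print(outputs[0].outputs[0].text)
```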
Consider using full model names instead of the current shorthands
E.g. use stabilityai/StableBeluga-7B or StableBeluga-7B instead of beluga7b
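If a short label is still needed somewhere (e.g. for file names), one option is to keep the full HF model ID as the single source of truth and derive the label from it. The sketch below is purely illustrative; only stabilityai/StableBeluga-7B and the beluga7b shorthand come from this issue.

```python
# Keep full HF model IDs canonical; derive or look up short labels only when needed.
MODELS = {
    "stabilityai/StableBeluga-7B": "beluga7b",  # current shorthand, kept for backwards compatibility
}

def short_name(hf_id: str) -> str:
    """Fallback: last path component, lower-cased, if no explicit shorthand exists."""
    return MODELS.get(hf_id, hf_id.split("/")[-1].lower())

print(short_name("stabilityai/StableBeluga-7B"))  # -> beluga7b
```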
Remove model names as prefixes from the completions column
Back when I started the project, I somehow thought it was a good idea to prefix the column with the model name (e.g. "beluga7b_completions"), which I then strip in the make_dataset folder to standardise formats across models. The column should simply be called completions.
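Until the generation code itself is changed, a minimal clean-up could look like the sketch below, assuming the data lives in a pandas DataFrame and the prefixed columns follow the "<model>_completions" pattern.

```python
import pandas as pd

# Illustrative data; column names follow the current "<model>_completions" pattern.
df = pd.DataFrame({"id": [0, 1], "beluga7b_completions": ["text a", "text b"]})

# Rename any "<model>_completions" column to a plain "completions" column
# so all models share the same schema.
df = df.rename(columns={c: "completions" for c in df.columns if c.endswith("_completions")})
print(df.columns.tolist())  # -> ['id', 'completions']
```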
After re-generating
Remove HF pipeline
At the time of coding, I also added the option of generating through the HF pipeline. For simplicity, I think we should remove this; it is not needed for the scope of the project (especially if we want to split the repos at some point).
Run embeddings with a smaller model
I used nvidia/NV-Embed-v2 because it scored highest on MTEB, but it is a heavy model: is it overkill for a baseline? I switched from FP32 to FP16 precision to make it less memory-hungry and could then run it with a batch size of 16 on the new NVIDIA L40 GPUs.
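For reference, the FP16 set-up described above could look roughly like the sketch below. It assumes the embeddings are computed through sentence-transformers; swapping in a smaller model would only change the model ID.

```python
import torch
from sentence_transformers import SentenceTransformer

# Load NV-Embed-v2 in FP16 to reduce memory use (as described above);
# trust_remote_code is needed for this model's custom code.
model = SentenceTransformer(
    "nvidia/NV-Embed-v2",
    trust_remote_code=True,
    model_kwargs={"torch_dtype": torch.float16},
)

texts = ["Example completion to embed.", "Another completion."]
# batch_size=16 matches what worked on the L40 GPUs mentioned above.
embeddings = model.encode(texts, batch_size=16, show_progress_bar=True)
print(embeddings.shape)
```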