Ollama Copilot lets you plug your Ollama code-completion models into Neovim, giving GitHub Copilot-like tab completions.
It offers suggestion streaming, which streams completions into your editor as they are generated by the model.
- Debouncing of successive completion requests, to avoid flooding Ollama with requests and over-utilizing the CPU
- Full control over triggers, using textChange events instead of Neovim client requests
- Language server that provides code completions from an Ollama model
- Ghost text completions that can be inserted into the editor
- Streamed ghost text completions that populate in real time
To use Ollama-Copilot, you need to have Ollama installed (github.com/ollama/ollama):
curl -fsSL https://ollama.com/install.sh | sh
The language server runs on Python and requires two libraries (also listed in python/requirements.txt):
pip install pygls ollama
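Alternatively, the same dependencies can be installed from the requirements file shipped with the plugin (run from the plugin's repository root):
# Install the language server dependencies from the requirements file
pip install -r python/requirements.txt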
Make sure you have the model you want to use installed; a catalog can be found at ollama.com/library.
# To view your available models:
ollama ls
# To pull a new model:
ollama pull <model name>
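For example, to pull the model used in the default plugin configuration below (tag taken from the Ollama library; check ollama.com/library if it has changed):
# Pull the default completion model referenced by the plugin configuration
ollama pull deepseek-coder:base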
Lazy:
-- Default configuration
{"Jacob411/Ollama-Copilot", opts={}}
-- Custom configuration (defaults shown)
{
  'Jacob411/Ollama-Copilot',
  opts = {
    model_name = "deepseek-coder:base",
    stream_suggestion = false,
    python_command = "python3",
    filetypes = { 'python', 'lua', 'vim', 'markdown' },
    ollama_model_opts = {
      num_predict = 40,
      temperature = 0.1,
    },
    keymaps = {
      suggestion = '<leader>os',
      reject = '<leader>or',
      insert_accept = '<Tab>',
    },
  },
},
For more Ollama customization, see github.com/ollama/ollama/blob/main/docs/modelfile.md
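As a rough sketch of that customization (the model name my-completion-model and the parameter values are illustrative), generation parameters can also be baked into a custom model with a Modelfile and then referenced from model_name:
# Modelfile
FROM deepseek-coder:base
PARAMETER num_predict 40
PARAMETER temperature 0.1
# Build the custom model, then set model_name = "my-completion-model" in the plugin opts
ollama create my-completion-model -f Modelfile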
The Ollama Copilot language server attaches when you enter a buffer with a configured filetype, and can be inspected using:
:LspInfo
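Alternatively, you can list the clients attached to the current buffer directly (vim.lsp.get_clients is available on Neovim 0.10+; older versions provide vim.lsp.get_active_clients):
:lua print(vim.inspect(vim.lsp.get_clients({ bufnr = 0 })))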
Smaller models (<3 billion parameters) work best for tab completion tasks, providing low latency and minimal CPU usage.
- deepseek-coder - 1.3B
- starcoder - 1B
- codegemma - 2B
- starcoder2 - 3B
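For example, to pull one of these (exact tags come from the Ollama library and may differ; see ollama.com/library):
# Pull a small base model suited to tab completion
ollama pull starcoder2:3b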
Contributions are welcome! If you have any ideas for new features, improvements, or bug fixes, please open an issue or submit a pull request.
I also hope to expand on the model side, as I am interested in fine-tuning models and implementing RAG techniques, moving beyond just Ollama.
This project is licensed under the MIT License.