Upstream tag v0.6.2 (revision 22904ca) #6

Open · wants to merge 41 commits into base: apolo

Changes from 1 commit (41 commits total)

Commits
c7212ac
fix(LLM): mistral ignoring assistant messages (#1954)
pabloogc May 30, 2024
b687dc8
feat: bump dependencies (#1987)
jaluma Jul 5, 2024
19a7c06
feat(docs): update doc for ipex-llm (#1968)
shane-huang Jul 8, 2024
fc13368
feat(llm): Support for Google Gemini LLMs and Embeddings (#1965)
uw4 Jul 8, 2024
2612928
feat(vectorstore): Add clickhouse support as vectore store (#1883)
Proger666 Jul 8, 2024
067a5f1
feat(docs): Fix setup docu (#1926)
martinzrrl Jul 8, 2024
dde0224
fix(docs): Fix concepts.mdx referencing to installation page (#1779)
mtulio Jul 8, 2024
187bc93
(feat): add github button (#1989)
fern-support Jul 9, 2024
15f73db
docs: update repo links, citations (#1990)
jaluma Jul 9, 2024
01b7ccd
fix(config): make tokenizer optional and include a troubleshooting do…
jaluma Jul 17, 2024
4523a30
feat(docs): update documentation and fix preview-docs (#2000)
jaluma Jul 18, 2024
43cc31f
feat(vectordb): Milvus vector db Integration (#1996)
Jacksonxhx Jul 18, 2024
90d211c
Update README.md (#2003)
imartinez Jul 18, 2024
2c78bb2
docs: add PR and issue templates (#2002)
jaluma Jul 18, 2024
b626697
docs: update welcome page (#2004)
jaluma Jul 18, 2024
05a9862
Add proper param to demo urls (#2007)
imartinez Jul 22, 2024
dabf556
fix: ffmpy dependency (#2020)
jaluma Jul 29, 2024
20bad17
feat(llm): autopull ollama models (#2019)
jaluma Jul 29, 2024
d4375d0
fix(ui): gradio bug fixes (#2021)
jaluma Jul 29, 2024
d080969
added llama3 prompt (#1962)
hirschrobert Jul 29, 2024
65c5a17
chore(docker): dockerfiles improvements and fixes (#1792)
qdm12 Jul 30, 2024
1020cd5
fix: light mode (#2025)
jaluma Jul 31, 2024
e54a8fe
fix: prevent to ingest local files (by default) (#2010)
jaluma Jul 31, 2024
9027d69
feat: make llama3.1 as default (#2022)
jaluma Jul 31, 2024
40638a1
fix: unify embedding models (#2027)
jaluma Jul 31, 2024
8119842
feat(recipe): add our first recipe `Summarize` (#2028)
jaluma Jul 31, 2024
5465958
fix: nomic embeddings (#2030)
jaluma Aug 1, 2024
50b3027
docs: update docs and capture (#2029)
jaluma Aug 1, 2024
cf61bf7
feat(llm): add progress bar when ollama is pulling models (#2031)
jaluma Aug 1, 2024
e44a7f5
chore: bump version (#2033)
jaluma Aug 2, 2024
6674b46
chore(main): release 0.6.0 (#1834)
github-actions[bot] Aug 2, 2024
dae0727
fix(deploy): improve Docker-Compose and quickstart on Docker (#2037)
jaluma Aug 5, 2024
1d4c14d
fix(deploy): generate docker release when new version is released (#2…
jaluma Aug 5, 2024
1c665f7
fix: Adding azopenai to model list (#2035)
itsliamdowd Aug 5, 2024
f09f6dd
fix: add built image from DockerHub (#2042)
jaluma Aug 5, 2024
ca2b8da
chore(main): release 0.6.1 (#2041)
github-actions[bot] Aug 5, 2024
b16abbe
fix: update matplotlib to 3.9.1-post1 to fix win install
jaluma Aug 7, 2024
4ca6d0c
fix: add numpy issue to troubleshooting (#2048)
jaluma Aug 7, 2024
b1acf9d
fix: publish image name (#2043)
jaluma Aug 7, 2024
7fefe40
fix: auto-update version (#2052)
jaluma Aug 8, 2024
22904ca
chore(main): release 0.6.2 (#2049)
github-actions[bot] Aug 8, 2024
feat(llm): autopull ollama models (zylon-ai#2019)
* chore: update ollama (llm)

* feat: allow to autopull ollama models

* fix: mypy

* chore: install always ollama client

* refactor: check connection and pull ollama method to utils

* docs: update ollama config with autopulling info
jaluma authored Jul 29, 2024
commit 20bad17c9857809158e689e9671402136c1e3d84
16 changes: 10 additions & 6 deletions fern/docs/pages/installation/installation.mdx
@@ -130,16 +130,20 @@ Go to [ollama.ai](https://ollama.ai/) and follow the instructions to install Oll

 After the installation, make sure the Ollama desktop app is closed.
 
-Install the models to be used, the default settings-ollama.yaml is configured to use `mistral 7b` LLM (~4GB) and `nomic-embed-text` Embeddings (~275MB). Therefore:
+Now, start Ollama service (it will start a local inference server, serving both the LLM and the Embeddings):
 
 ```bash
-ollama pull mistral
-ollama pull nomic-embed-text
+ollama serve
 ```
 
-Now, start Ollama service (it will start a local inference server, serving both the LLM and the Embeddings):
+Install the models to be used, the default settings-ollama.yaml is configured to use mistral 7b LLM (~4GB) and nomic-embed-text Embeddings (~275MB).
+
+By default, PGPT will automatically pull models as needed. This behavior can be changed by modifying the `ollama.autopull_models` property.
+
+In any case, if you want to manually pull models, run the following commands:
 
 ```bash
-ollama serve
+ollama pull mistral
+ollama pull nomic-embed-text
 ```
 
 Once done, on a different terminal, you can install PrivateGPT with the following command:
33 changes: 24 additions & 9 deletions poetry.lock

Some generated files are not rendered by default.

32 changes: 31 additions & 1 deletion private_gpt/components/embedding/embedding_component.py
@@ -71,16 +71,46 @@ def __init__(self, settings: Settings) -> None:
                 from llama_index.embeddings.ollama import (  # type: ignore
                     OllamaEmbedding,
                 )
+                from ollama import Client  # type: ignore
             except ImportError as e:
                 raise ImportError(
                     "Local dependencies not found, install with `poetry install --extras embeddings-ollama`"
                 ) from e
 
             ollama_settings = settings.ollama
+
+            # Calculate the embedding model name. If no tag is provided, default to ":latest".
+            model_name = (
+                ollama_settings.embedding_model + ":latest"
+                if ":" not in ollama_settings.embedding_model
+                else ollama_settings.embedding_model
+            )
+
             self.embedding_model = OllamaEmbedding(
-                model_name=ollama_settings.embedding_model,
+                model_name=model_name,
                 base_url=ollama_settings.embedding_api_base,
             )
+
+            if ollama_settings.autopull_models:
+                from private_gpt.utils.ollama import (
+                    check_connection,
+                    pull_model,
+                )
+
+                # TODO: Reuse llama-index client when llama-index is updated
+                client = Client(
+                    host=ollama_settings.embedding_api_base,
+                    timeout=ollama_settings.request_timeout,
+                )
+
+                if not check_connection(client):
+                    raise ValueError(
+                        f"Failed to connect to Ollama, "
+                        f"check if Ollama server is running on {ollama_settings.embedding_api_base}"
+                    )
+                pull_model(client, model_name)
 
         case "azopenai":
             try:
                 from llama_index.embeddings.azure_openai import (  # type: ignore
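The tag normalization above is the core of the change: Ollama resolves an untagged model name to `:latest`, and the new `pull_model` helper compares against the installed-model list by exact string. A standalone sketch of the same logic (the helper name is ours, for illustration only):

```python
def normalize_model_name(name: str) -> str:
    # Ollama treats an untagged name as ":latest"; making the tag explicit
    # lets pull_model's installed-models check match by exact string.
    return name if ":" in name else name + ":latest"

assert normalize_model_name("mistral") == "mistral:latest"
assert normalize_model_name("nomic-embed-text:latest") == "nomic-embed-text:latest"
```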
23 changes: 21 additions & 2 deletions private_gpt/components/llm/llm_component.py
@@ -146,15 +146,32 @@ def __init__(self, settings: Settings) -> None:
"repeat_penalty": ollama_settings.repeat_penalty, # ollama llama-cpp
}

self.llm = Ollama(
model=ollama_settings.llm_model,
# calculate llm model. If not provided tag, it will be use latest
model_name = (
ollama_settings.llm_model + ":latest"
if ":" not in ollama_settings.llm_model
else ollama_settings.llm_model
)

llm = Ollama(
model=model_name,
base_url=ollama_settings.api_base,
temperature=settings.llm.temperature,
context_window=settings.llm.context_window,
additional_kwargs=settings_kwargs,
request_timeout=ollama_settings.request_timeout,
)

if ollama_settings.autopull_models:
from private_gpt.utils.ollama import check_connection, pull_model

if not check_connection(llm.client):
raise ValueError(
f"Failed to connect to Ollama, "
f"check if Ollama server is running on {ollama_settings.api_base}"
)
pull_model(llm.client, model_name)

if (
ollama_settings.keep_alive
!= ollama_settings.model_fields["keep_alive"].default
@@ -172,6 +189,8 @@ def wrapper(*args: Any, **kwargs: Any) -> Any:
Ollama.complete = add_keep_alive(Ollama.complete)
Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)

self.llm = llm

case "azopenai":
try:
from llama_index.llms.azure_openai import ( # type: ignore
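The second hunk shows only the `wrapper(*args: Any, **kwargs: Any)` signature and the two method reassignments, so the body of `add_keep_alive` is elided from this diff. A minimal sketch consistent with that signature (an assumption about the elided code, not a verbatim quote) would inject the configured value into every call:

```python
from collections.abc import Callable
from typing import Any

def add_keep_alive(func: Callable[..., Any]) -> Callable[..., Any]:
    # Assumed behavior: close over ollama_settings from the enclosing
    # __init__ (as in the hunk context) and forward the non-default
    # keep_alive with every complete/stream_complete call.
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        kwargs["keep_alive"] = ollama_settings.keep_alive
        return func(*args, **kwargs)

    return wrapper
```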
4 changes: 4 additions & 0 deletions private_gpt/settings/settings.py
@@ -290,6 +290,10 @@ class OllamaSettings(BaseModel):
         120.0,
         description="Time elapsed until ollama times out the request. Default is 120s. Format is float.",
     )
+    autopull_models: bool = Field(
+        False,
+        description="If set to True, Ollama will automatically pull the models from the API base.",
+    )
 
 
 class AzureOpenAISettings(BaseModel):
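The settings change is a single boolean field. A minimal standalone sketch of the same pattern (pydantic v2 assumed; the class name here is ours), showing that the flag defaults to off and must be opted into:

```python
from pydantic import BaseModel, Field

class OllamaSettingsSketch(BaseModel):
    autopull_models: bool = Field(
        False,
        description="If set to True, Ollama will automatically pull the models from the API base.",
    )

assert OllamaSettingsSketch().autopull_models is False
assert OllamaSettingsSketch(autopull_models=True).autopull_models is True
```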
32 changes: 32 additions & 0 deletions private_gpt/utils/ollama.py
@@ -0,0 +1,32 @@
+import logging
+
+try:
+    from ollama import Client  # type: ignore
+except ImportError as e:
+    raise ImportError(
+        "Ollama dependencies not found, install with `poetry install --extras llms-ollama` or `--extras embeddings-ollama`"
+    ) from e
+
+logger = logging.getLogger(__name__)
+
+
+def check_connection(client: Client) -> bool:
+    try:
+        client.list()
+        return True
+    except Exception as e:
+        logger.error(f"Failed to connect to Ollama: {e!s}")
+        return False
+
+
+def pull_model(client: Client, model_name: str, raise_error: bool = True) -> None:
+    try:
+        installed_models = [model["name"] for model in client.list().get("models", [])]
+        if model_name not in installed_models:
+            logger.info(f"Pulling model {model_name}. Please wait...")
+            client.pull(model_name)
+            logger.info(f"Model {model_name} pulled successfully")
+    except Exception as e:
+        logger.error(f"Failed to pull model {model_name}: {e!s}")
+        if raise_error:
+            raise e
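For reference, a usage sketch of the two new helpers, mirroring what the components above do at startup (the host and timeout values are the defaults from settings.yaml, shown here for illustration):

```python
from ollama import Client

from private_gpt.utils.ollama import check_connection, pull_model

client = Client(host="http://localhost:11434", timeout=120.0)
if not check_connection(client):
    raise ValueError("Failed to connect to Ollama; is `ollama serve` running?")
pull_model(client, "mistral:latest")  # no-op if the model is already installed
```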
9 changes: 6 additions & 3 deletions pyproject.toml
@@ -22,7 +22,7 @@ llama-index-readers-file = "^0.1.27"
 llama-index-llms-llama-cpp = {version = "^0.1.4", optional = true}
 llama-index-llms-openai = {version = "^0.1.25", optional = true}
 llama-index-llms-openai-like = {version ="^0.1.3", optional = true}
-llama-index-llms-ollama = {version ="^0.1.5", optional = true}
+llama-index-llms-ollama = {version ="^0.2.2", optional = true}
 llama-index-llms-azure-openai = {version ="^0.1.8", optional = true}
 llama-index-llms-gemini = {version ="^0.1.11", optional = true}
 llama-index-embeddings-ollama = {version ="^0.1.2", optional = true}
@@ -62,16 +62,19 @@ ffmpy = {git = "https://github.com/EuDs63/ffmpy.git", rev = "333a19ee4d21f32537c
 # Optional Google Gemini dependency
 google-generativeai = {version ="^0.5.4", optional = true}
 
+# Optional Ollama client
+ollama = {version ="^0.3.0", optional = true}
+
 [tool.poetry.extras]
 ui = ["gradio", "ffmpy"]
 llms-llama-cpp = ["llama-index-llms-llama-cpp"]
 llms-openai = ["llama-index-llms-openai"]
 llms-openai-like = ["llama-index-llms-openai-like"]
-llms-ollama = ["llama-index-llms-ollama"]
+llms-ollama = ["llama-index-llms-ollama", "ollama"]
 llms-sagemaker = ["boto3"]
 llms-azopenai = ["llama-index-llms-azure-openai"]
 llms-gemini = ["llama-index-llms-gemini", "google-generativeai"]
-embeddings-ollama = ["llama-index-embeddings-ollama"]
+embeddings-ollama = ["llama-index-embeddings-ollama", "ollama"]
 embeddings-huggingface = ["llama-index-embeddings-huggingface"]
 embeddings-openai = ["llama-index-embeddings-openai"]
 embeddings-sagemaker = ["boto3"]
1 change: 1 addition & 0 deletions settings.yaml
@@ -117,6 +117,7 @@ ollama:
   embedding_api_base: http://localhost:11434  # change if your embedding model runs on another ollama
   keep_alive: 5m
   request_timeout: 120.0
+  autopull_models: true
 
 azopenai:
   api_key: ${AZ_OPENAI_API_KEY:}