BionicGPT configuration for ollama backend #38

Open
FrantaNautilus opened this issue Sep 28, 2024 · 2 comments
Labels
documentation: Improvements or additions to documentation

Comments

@FrantaNautilus

The BionicGPT documentation mentions that it works with Ollama and OpenAI-compatible backends, and demonstrates it running a local Gemma model. I could not find information on how to properly configure the settings of the BionicGPT frontend; the only thing I could find was the official documentation (https://bionic-gpt.com/docs/running-locally/ollama/). I tried to follow those steps by adding a model with a name listed by ollama list, the domain from harbor url ollama, and the API key set to ollama (a rough sketch of these steps follows after the logs below). Then I added an assistant with this LLM as its backend. When I submit a message into the chat I get a connection refused error:

Transport(
    reqwest::Error {
        kind: Request,
        url: Url {
            scheme: "http",
            cannot_be_a_base: false,
            username: "",
            password: None,
            host: Some(
                Domain(
                    "localhost",
                ),
            ),
            port: Some(
                33821,
            ),
            path: "/chat/completions",
            query: None,
            fragment: None,
        },
        source: Error {
            kind: Connect,
            source: Some(
                ConnectError(
                    "tcp connect error",
                    Os {
                        code: 111,
                        kind: ConnectionRefused,
                        message: "Connection refused",
                    },
                ),
            ),
        },
    },
)

and harbor logs bionicgpt outputs:

WARN[0000] The "HARBOR_WHISPER_VERSION" variable is not set. Defaulting to a blank string. 
WARN[0000] The "HARBOR_WHISPER_HOST_PORT" variable is not set. Defaulting to a blank string. 
harbor.bionicgpt  |   - name: base
harbor.bionicgpt  |     static_layer:
harbor.bionicgpt  |       {}
harbor.bionicgpt  |   - name: admin
harbor.bionicgpt  |     admin_layer:
harbor.bionicgpt  |       {}
harbor.bionicgpt  | [2024-09-28 17:31:17.326][11][info][config] [source/server/configuration_impl.cc:125] loading tracing configuration
harbor.bionicgpt  | [2024-09-28 17:31:17.327][11][info][config] [source/server/configuration_impl.cc:85] loading 0 static secret(s)
harbor.bionicgpt  | [2024-09-28 17:31:17.327][11][info][config] [source/server/configuration_impl.cc:91] loading 2 cluster(s)
harbor.bionicgpt  | [2024-09-28 17:31:17.328][11][info][config] [source/server/configuration_impl.cc:95] loading 1 listener(s)
harbor.bionicgpt  | [2024-09-28 17:31:17.333][11][warning][misc] [source/common/protobuf/message_validator_impl.cc:21] Deprecated field: type envoy.extensions.filters.http.ext_authz.v3.ExtAuthz Using the default now-deprecated value AUTO for enum 'envoy.extensions.filters.http.ext_authz.v3.ExtAuthz.transport_api_version' from file ext_authz.proto. This enum value will be removed from Envoy soon so a non-default value must now be explicitly set. Please see https://www.envoyproxy.io/docs/envoy/latest/version_history/version_history for details. If continued use of this field is absolutely necessary, see https://www.envoyproxy.io/docs/envoy/latest/configuration/operations/runtime#using-runtime-overrides-for-deprecated-features for how to apply a temporary and highly discouraged override.
harbor.bionicgpt  | [2024-09-28 17:31:17.333][11][info][lua] [source/extensions/filters/http/lua/lua_filter.cc:170] envoy_on_request() function not found. Lua filter will not hook requests.
harbor.bionicgpt  | [2024-09-28 17:31:17.333][11][info][lua] [source/extensions/filters/http/lua/lua_filter.cc:170] envoy_on_request() function not found. Lua filter will not hook requests.
harbor.bionicgpt  | [2024-09-28 17:31:17.334][11][info][config] [source/server/configuration_impl.cc:107] loading stats configuration
harbor.bionicgpt  | [2024-09-28 17:31:17.334][11][info][main] [source/server/server.cc:732] starting main dispatch loop
harbor.bionicgpt  | [2024-09-28 17:31:20.795][11][info][runtime] [source/common/runtime/runtime_impl.cc:425] RTDS has finished initialization
harbor.bionicgpt  | [2024-09-28 17:31:20.795][11][info][upstream] [source/common/upstream/cluster_manager_impl.cc:191] cm init: all clusters initialized
harbor.bionicgpt  | [2024-09-28 17:31:20.795][11][info][main] [source/server/server.cc:713] all clusters initialized. initializing init manager
harbor.bionicgpt  | [2024-09-28 17:31:20.795][11][info][config] [source/server/listener_manager_impl.cc:888] all dependencies initialized. starting workers
harbor.bionicgpt  | [2024-09-28 17:31:20.796][11][warning][main] [source/server/server.cc:610] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
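
For reference, a rough sketch of the steps above (the values are illustrative, reconstructed from the description rather than the exact UI fields):

ollama list          # picked a model name from this list, e.g. gemma2:9b
harbor url ollama    # used this URL as the model's "domain" in BionicGPT
# API key set to "ollama" (Ollama does not check the key, so it is only a placeholder)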
@av
Owner

av commented Sep 28, 2024

This was one of the most, if not the most, complicated services to set up 😅

There's a fully-fledged reverse proxy and lots of sub-services requiring careful routing. All in all, I wouldn't be surprised if that setup stopped working completely, just from drift between the upstream docker images and the way they are expected to be put into compose.

I should more appropriately mark BionicGPT as only partially supported (its settings are stored only in the DB, so any pre-provisioned configs have to be sent there).


With that said, the posted reqwest error (reqwest being a Rust HTTP client library) indicates that the request is made from within the container, and indeed, from inside the container that localhost address won't be reachable.

In cases like this, the -i flag can be used with harbor url to obtain a URL that works within the Harbor network, so containers can talk to one another. Here's a sample:

 everlier@pop-os:~$ ▼ h url ollama
http://localhost:33821

 everlier@pop-os:~$ ▼ h url -i ollama
http://harbor.ollama:11434

 everlier@pop-os:~$ ▼ h url -a ollama
http://192.168.0.136:33821

There's also -a for the (supposed) address on the LAN; it can be used to access the service from another machine and is used by harbor qr, for example.
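
If in doubt, the internal URL can be verified from inside the Harbor network. A minimal sketch, assuming the container is named harbor.bionicgpt as in the compose log prefix above, and using a throwaway curl image that shares its network namespace (since the bionicgpt image may not ship curl):

docker run --rm --network container:harbor.bionicgpt curlimages/curl \
  -s http://harbor.ollama:11434/api/tags

A JSON list of models means Ollama is reachable from where BionicGPT makes its requests; a connection error means the routing still needs work.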

av added the documentation label Sep 28, 2024
@FrantaNautilus
Author

Thank you again! I was able to get the chat working by setting the model's domain to http://harbor.ollama:11434/v1, which is the output of harbor url -i ollama with /v1 appended. Without the /v1 I kept getting the following error:

InvalidStatusCode(
    404,
    Response {
        url: Url {
            scheme: "http",
            cannot_be_a_base: false,
            username: "",
            password: None,
            host: Some(
                Domain(
                    "harbor.ollama",
                ),
            ),
            port: Some(
                11434,
            ),
            path: "/chat/completions",
            query: None,
            fragment: None,
        },
        status: 404,
        headers: {
            "content-type": "text/plain",
            "date": "Sun, 29 Sep 2024 19:37:16 GMT",
            "content-length": "18",
        },
    },
)
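
For what it's worth, the 404 lines up with the paths in the two errors: BionicGPT appends /chat/completions to whatever base URL is configured, and Ollama serves its OpenAI-compatible API under /v1, so the request only resolves as /v1/chat/completions. A quick sanity check from inside the Harbor network (the model name is just the one I had pulled):

curl -s http://harbor.ollama:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma2:9b", "messages": [{"role": "user", "content": "hello"}]}'

From the host, the harbor url ollama address would be used instead of harbor.ollama:11434.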

For some reason, not all models work. I was able to get gemma2:9b working, but qwen2.5:7b does not work. The only thing I could find that differentiates the working models from the broken ones is the presence of a dot (.) in the model name.

I also tried to get RAG working by setting up a dataset and modifying the assistant to have access to it. When using the default embedding model I get the following error:

Api Error: error sending request for url (http://embeddings-api/embeddings)

and when I create a new embedding model and a new dataset that uses it for embedding, and then edit the assistant to use this dataset, I first get an error that no chunks were received, which quickly changes to:

Api Error: error decoding response body
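
In case it helps, the embedding endpoint can be probed the same way as the chat one. A sketch for the custom embedding model case, assuming it points at Ollama's OpenAI-compatible embeddings endpoint (nomic-embed-text is just an illustrative model name):

curl -s http://harbor.ollama:11434/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "test sentence"}'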
