Merge pull request #139 from feiskyer/custom-model-support

Support custom model names for local or self-hosted LLMs
feiskyer · Jun 2, 2024 · 87d3988
2 parents f977b26 + 8ecb34c

Showing 3 changed files with 36 additions and 16 deletions.
diff --git a/README.md b/README.md
@@ -69,6 +69,17 @@ Refer following sections for more details of how to configure various openai ser
     "chatgpt.gpt3.apiBaseUrl": "<base-url>",
 ```
 
+### Configuring Custom Model Names
+
+To use a custom model name for local or self-hosted LLMs compatible with OpenAI, set the `chatgpt.gpt3.model` configuration to `"custom"` and specify your custom model name in the `chatgpt.gpt3.customModel` configuration.
+
+Example configuration for a custom model name:
+
+```json
+    "chatgpt.gpt3.model": "custom",
+    "chatgpt.gpt3.customModel": "my-custom-model-name",
+```
+
 ## How to install locally
 
 - Install `vsce` if you don't have it on your machine (The Visual Studio Code Extension Manager)
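
The README hunk above documents the two settings, but this excerpt does not show the code that reads them. The following is a minimal sketch of how the lookup might work, not the PR's implementation. `resolveModelName` and the plain `Settings` object (standing in for the VS Code configuration) are hypothetical. The `gpt-3.5-turbo` fallback mirrors the `default` in `package.json`; falling back to it when `customModel` is empty is an assumption.

```typescript
// Hypothetical helper, not taken from the PR: picks the model name to send to the API.
type Settings = Record<string, unknown>;

const DEFAULT_MODEL = "gpt-3.5-turbo"; // matches the "default" in package.json

function resolveModelName(settings: Settings): string {
  const model = settings["chatgpt.gpt3.model"];
  const selected = typeof model === "string" && model !== "" ? model : DEFAULT_MODEL;

  // Any value other than "custom" is already a concrete model name.
  if (selected !== "custom") {
    return selected;
  }

  const custom = settings["chatgpt.gpt3.customModel"];
  const name = typeof custom === "string" ? custom.trim() : "";

  // Assumption: fall back to the default when "custom" is selected but no name is given.
  return name !== "" ? name : DEFAULT_MODEL;
}
```

In the extension itself, these values would presumably be read through `vscode.workspace.getConfiguration("chatgpt")` rather than from a plain object.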

diff --git a/package.json b/package.json
@@ -413,12 +413,13 @@
             "text-ada-001",
             "code-davinci-002",
             "code-cushman-001",
+            "custom",
             "claude-3-opus-20240229",
             "claude-3-sonnet-20240229",
             "claude-3-haiku-20240307"
           ],
           "default": "gpt-3.5-turbo",
-          "markdownDescription": "OpenAI models to use for your prompts. [Documentation](https://beta.openai.com/docs/models/models). \n\n**If you face 400 Bad Request please make sure you are using the right model for your integration method.**",
+          "markdownDescription": "OpenAI models to use for your prompts. [Documentation](https://beta.openai.com/docs/models/models). \n\n**If you face 400 Bad Request please make sure you are using the right model for your integration method.** \n\nFor local or self-hosted LLMs compatible with OpenAI, you can select `custom` and specify your custom model name in `chatgpt.gpt3.customModel`.",
           "order": 33,
           "enumItemLabels": [
             "OpenAI API Key - gpt-3.5-turbo",
@@ -445,6 +446,7 @@
             "OpenAI API Key - text-ada-001",
             "OpenAI API Key - code-davinci-002",
             "OpenAI API Key - code-cushman-001",
+            "Custom Model",
             "Claude 3 - claude-3-opus-20240229",
             "Claude 3 - claude-3-sonnet-20240229",
             "Claude 3 - claude-3-haiku-20240307"
@@ -474,66 +476,73 @@
             "text-ada-001",
             "code-davinci-002",
             "code-cushman-001",
+            "Select this option to specify a custom model name for local or self-hosted LLMs compatible with OpenAI.",
             "claude-3-opus-20240229",
             "claude-3-sonnet-20240229",
             "claude-3-haiku-20240307"
           ]
         },
+        "chatgpt.gpt3.customModel": {
+          "type": "string",
+          "default": "",
+          "markdownDescription": "Specify your custom model name here if you selected `custom` in `chatgpt.gpt3.model`. This allows you to use a custom model name for local or self-hosted LLMs compatible with OpenAI.",
+          "order": 34
+        },
         "chatgpt.gpt3.maxTokens": {
           "type": "number",
           "default": 1024,
           "markdownDescription": "The maximum number of tokens to generate in the completion. \n\nThe token count of your prompt plus max_tokens cannot exceed the model's context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096). [Documentation](https://beta.openai.com/docs/api-reference/completions/create#completions/create-max_tokens) \n\n**Please enable OpenAI API Key method to use this setting.**",
-          "order": 34
+          "order": 35
         },
         "chatgpt.gpt3.temperature": {
           "type": "number",
           "default": 1,
           "markdownDescription": "What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer.\n\nIt is recommended altering this or top_p but not both. [Documentation](https://beta.openai.com/docs/api-reference/completions/create#completions/create-temperature) \n\n**Please enable OpenAI API Key method to use this setting.**",
-          "order": 35
+          "order": 36
         },
         "chatgpt.gpt3.top_p": {
           "type": "number",
           "default": 1,
           "markdownDescription": "An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. \n\nIt is recommended altering this or temperature but not both. [Documentation](https://beta.openai.com/docs/api-reference/completions/create#completions/create-top_p) \n\n**Please enable OpenAI API Key method to use this setting.**",
-          "order": 36
+          "order": 37
         },
         "chatgpt.response.showNotification": {
           "type": "boolean",
           "default": false,
           "description": "Choose whether you'd like to receive a notification when ChatGPT bot responds to your query.",
-          "order": 37
+          "order": 38
         },
         "chatgpt.response.autoScroll": {
           "type": "boolean",
           "default": true,
           "description": "Whenever there is a new question or response added to the conversation window, extension will automatically scroll to the bottom. You can change that behavior by disabling this setting.",
-          "order": 38
+          "order": 39
         },
         "chatgpt.telemetry.disable": {
           "type": "boolean",
           "default": false,
           "markdownDescription": "Specify if you want to disable the telemetry. This extension also respects your default vs-code telemetry setting `telemetry.telemetryLevel`. We check both settings for telemetry. **Important**: No user data is tracked, we only use telemetry to see what is used, and what isn't. This allows us to make accurate decisions on what to add or enhance to the extension.",
-          "order": 39
+          "order": 40
         },
         "chatgpt.gpt3.googleCSEApiKey": {
           "type": "string",
           "markdownDescription": "Google search API key.",
-          "order": 40
+          "order": 41
         },
         "chatgpt.gpt3.googleCSEId": {
           "type": "string",
           "markdownDescription": "Google custom search ID.",
-          "order": 41
+          "order": 42
         },
         "chatgpt.gpt3.serperKey": {
           "type": "string",
           "markdownDescription": "API key of Serper search API.",
-          "order": 42
+          "order": 43
         },
         "chatgpt.gpt3.bingKey": {
           "type": "string",
           "markdownDescription": "API key of Bing search API.",
-          "order": 43
+          "order": 44
         }
       }
     }
@@ -561,7 +570,7 @@
     "@typescript-eslint/parser": "^7.11.0",
     "@vscode/test-electron": "^2.3.9",
     "@vscode/vsce": "^2.24.0",
-    "esbuild": "^0.21.4",
+    "esbuild": "^0.20.1",
     "eslint": "^8.56.0",
     "glob": "^10.3.10",
     "mocha": "^10.3.0",
@@ -573,7 +582,7 @@
   "dependencies": {
     "@langchain/anthropic": "^0.1.10",
     "@types/minimatch": "^5.1.2",
-    "axios": "^1.7.2",
+    "axios": "^1.6.7",
     "cheerio": "^1.0.0-rc.12",
     "delay": "^6.0.0",
     "eventsource-parser": "^1.1.2",
@@ -597,4 +606,4 @@
   "resolutions": {
     "clone-deep": "^4.0.1"
   }
-}
+}
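
The `package.json` hunks above add one entry at the same position in three parallel arrays: `enum`, `enumItemLabels`, and a third whose property name lies outside the visible hunk (presumably `enumDescriptions`). VS Code pairs these arrays by index, so they need the same length. The following is a small sketch of a consistency check; `checkEnumAlignment` is hypothetical and the sample data is abbreviated.

```typescript
// Hypothetical check, not part of the PR: every parallel array must match `enum` in length.
function checkEnumAlignment(
  enumValues: string[],
  parallel: Record<string, string[]>
): string[] {
  const problems: string[] = [];
  for (const [name, items] of Object.entries(parallel)) {
    if (items.length !== enumValues.length) {
      problems.push(`${name} has ${items.length} entries, expected ${enumValues.length}`);
    }
  }
  return problems;
}

// Abbreviated sample; the real arrays in package.json are much longer.
const sampleEnum = ["code-cushman-001", "custom", "claude-3-opus-20240229"];
const sampleProblems = checkEnumAlignment(sampleEnum, {
  enumItemLabels: [
    "OpenAI API Key - code-cushman-001",
    "Custom Model",
    "Claude 3 - claude-3-opus-20240229",
  ],
  // One entry left out on purpose to show a reported mismatch.
  enumDescriptions: ["code-cushman-001", "claude-3-opus-20240229"],
});
console.log(sampleProblems);
```

A check like this could run in a unit test so that a future model added to `enum` without a matching label or description is caught before release.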
diff --git a/src/chatgpt-view-provider.ts b/src/chatgpt-view-provider.ts
@@ -699,7 +699,7 @@ export default class ChatGptViewProvider implements vscode.WebviewViewProvider {
 						</div>
 
 						<button id="stop-button" class="btn btn-primary flex items-end p-1 pr-2 rounded-md ml-5">
-							<svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" class="w-5 h-5 mr-2"><path stroke-linecap="round" stroke-linejoin="round" d="M9.75 9.75l4.5 4.5m0-4.5l-4.5 4.5M21 12a9 9 0 11-18 0 9 9 0 0118 0z" /></svg>Stop responding</button>
+							<svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" class="w-5 h-5 mr-2"><path stroke-linecap="round" stroke-linejoin="round" d="M12 4.5v15m7.5-7.5h-15m0 0L7.5 12m4.5 4.5V3" /></svg>Stop responding</button>
 					</div>
 
 					<div class="p-4 flex items-center pt-2">
@@ -712,7 +712,7 @@ export default class ChatGptViewProvider implements vscode.WebviewViewProvider {
 								onInput="this.parentNode.dataset.replicatedValue = this.value"></textarea>
 						</div>
 						<div id="chat-button-wrapper" class="absolute bottom-14 items-center more-menu right-8 border border-gray-200 shadow-xl hidden text-xs">
-							<button class="flex gap-2 items-center justify-start p-2 w-full" id="clear-button"><svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" class="w-4 h-4"><path stroke-linecap="round" stroke-linejoin="round" d="M12 4.5v15m7.5-7.5h-15" /></svg>&nbsp;New chat</button>
+							<button class="flex gap-2 items-center justify-start p-2 w-full" id="clear-button"><svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" class="w-4 h-4"><path stroke-linecap="round" stroke-linejoin="round" d="M12 6.75a.75.75 0 110-1.5.75.75 0 010 1.5zM12 12.75a.75.75 0 110-1.5.75.75 0 010 1.5zM12 18.75a.75.75 0 110-1.5.75.75 0 010 1.5z" /></svg>&nbsp;New chat</button>
 							<button class="flex gap-2 items-center justify-start p-2 w-full" id="settings-button"><svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" class="w-4 h-4"><path stroke-linecap="round" stroke-linejoin="round" d="M9.594 3.94c.09-.542.56-.94 1.11-.94h2.593c.55 0 1.02.398 1.11.94l.213 1.281c.063.374.313.686.645.87.074.04.147.083.22.127.324.196.72.257 1.075.124l1.217-.456a1.125 1.125 0 011.37.49l1.296 2.247a1.125 1.125 0 01-.26 1.431l-1.003.827c-.293.24-.438.613-.431.992a6.759 6.759 0 010 .255c-.007.378.138.75.43.99l1.005.828c.424.35.534.954.26 1.43l-1.298 2.247a1.125 1.125 0 01-1.369.491l-1.217-.456c-.355-.133-.75-.072-1.076.124a6.57 6.57 0 01-.22.128c-.331.183-.581.495-.644.869l-.213 1.28c-.09.543-.56.941-1.11.941h-2.594c-.55 0-1.02-.398-1.11-.94l-.213-1.281c-.062-.374-.312-.686-.644-.87a6.52 6.52 0 01-.22-.127c-.325-.196-.72-.257-1.076-.124l-1.217.456a1.125 1.125 0 01-1.369-.49l-1.297-2.247a1.125 1.125 0 01.26-1.431l1.004-.827c.292-.24.437-.613.43-.992a6.932 6.932 0 010-.255c.007-.378-.138-.75-.43-.99l-1.004-.828a1.125 1.125 0 01-.26-1.43l1.297-2.247a1.125 1.125 0 011.37-.491l1.216.456c.356.133.751.072 1.076-.124.072-.044.146-.087.22-.128.332-.183.582-.495.644-.869l.214-1.281z" /><path stroke-linecap="round" stroke-linejoin="round" d="M15 12a3 3 0 11-6 0 3 3 0 016 0z" /></svg>&nbsp;Update settings</button>
 							<button class="flex gap-2 items-center justify-start p-2 w-full" id="export-button"><svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" class="w-4 h-4"><path stroke-linecap="round" stroke-linejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5M16.5 12L12 16.5m0 0L7.5 12m4.5 4.5V3" /></svg>&nbsp;Export to markdown</button>
 						</div>