
Commit

Merge branch 'develop'
Yaroslav Yashin committed Oct 5, 2024
2 parents 76be79b + 8e8ee06 commit 7b4ca45
Showing 18 changed files with 335 additions and 128 deletions.
2 changes: 1 addition & 1 deletion .ruff.toml
@@ -1,4 +1,4 @@
line-length = 100
line-length = 110

[format]
quote-style = "single"
4 changes: 0 additions & 4 deletions Default.sublime-commands
@@ -23,15 +23,11 @@
{
"caption": "OpenAI: New Message",
"command": "openai",
"args": {
"mode": "chat_completion"
}
},
{
"caption": "OpenAI: New Message With Sheets",
"command": "openai",
"args": {
"mode": "chat_completion",
"files_included": true
}
},
51 changes: 30 additions & 21 deletions README.md
@@ -1,24 +1,20 @@
[![Star on GitHub][img-stars]][stars]
[![Star on GitHub][img-stars]][stars] ![Package Control][img-downloads]

# OpenAI Sublime Text Plugin
## tldr;

OpenAI Completion is a Sublime Text plugin that uses LLM models to provide first class code assistant support within the editor.
Cursor level of AI assistance for Sublime Text. I mean it.

It's not locked with just OpenAI anymore. [llama.cpp](https://github.com/ggerganov/llama.cpp) server and [ollama](https://ollama.com) supported as well.
Works with any OpenAI'ish API: [llama.cpp](https://github.com/ggerganov/llama.cpp) server, [ollama](https://ollama.com), or any other third-party LLM hosting.

![](static/media/ai_chat_left.png)

> [!NOTE]
> I think this plugin is in its finite state. Meaning there's no further development of it I have in plans. I still have plans to fix bugs and review PR if any, but those tons of little enhancement that could be applied here to fix minor issues and roughness and there likely never would.
> What I do have in plans is to implement ST front end for [plandex](https://github.com/plandex-ai/plandex) tool based on some parts of this plugin codebase, to get (and to bring) a fancy and powerful agentish capabilities to ST ecosystem. So stay tuned.
![](static/media/ai_chat_right_phantom.png)

## Features

- Code manipulation (append, insert and edit) of selected code with OpenAI models.
- **Phantoms**: get non-disruptive answers from the model inline, right in the view.
- **Chat mode** powered by whatever model you'd like.
- **GPT-4 support**.
- **o1 support**.
- **[llama.cpp](https://github.com/ggerganov/llama.cpp)**'s server, **[Ollama](https://ollama.com)** and any other OpenAI'ish API compatible service.
- **Dedicated chat histories** and assistant settings per project.
- **Ability to send whole files** or parts of them as additional context.
@@ -75,7 +71,7 @@ You can separate a chat history and assistant settings for a given project by ap
{
"settings": {
"ai_assistant": {
"cache_prefix": "your_name_project"
"cache_prefix": "your_project_name"
}
}
}
@@ -100,16 +96,31 @@ To send the whole file(s) in advance to request you should `super+button1` on th

Image handling can be invoked with the `OpenAI: Handle Image` command.

It expects an absolute path of image to be selected in a buffer on the command call (smth like `/Users/username/Documents/Project/image.png`). In addition command can be passed by input panel to proceed the image with special treatment. `png` and `jpg` images are only supported.
It expects an absolute path to an image to be selected in the buffer or stored in the clipboard when the command is called (something like `/Users/username/Documents/Project/image.png`). In addition, an instruction can be passed via the input panel to process the image with special treatment. Only `png` and `jpg` images are supported.

> [!WARNING]
> Userflow don't expects that image url would be passed by that input panel input, it has to be selected in buffer. I'm aware about the UX quality of this design decision, but yet I'm too lazy to develop it further to some better state.
> [!NOTE]
> Currently the plugin expects a link, or a list of links separated by newlines, to be selected in the buffer or stored in the clipboard **only**.
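
For instance, a selection or clipboard content with two images might look like this (the paths are illustrative):

```
/Users/username/Documents/Project/image0.png
/Users/username/Documents/Project/image1.png
```
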
### In-buffer LLM use case

1. You can pick one of the following modes: `append`, `replace`, `insert`. They're quite self-descriptive. They should be set up in assistant settings to take effect.
2. Select some text (they're useless otherwise) to manipulate with and hit `OpenAI: New Message`.
4. The plugin will response accordingly with **appending**, **replacing** or **inserting** some text.
#### Phantom use case

Phantom is an overlay UI placed inline in the editor view (see the picture below). It doesn't affect the content of the view.

1. You can set `"prompt_mode": "phantom"` for the AI assistant in its settings (see the sketch below the screenshot).
2. [optional] Select some text to pass to the model as context.
3. Hit `OpenAI: New Message` or `OpenAI: Chat Model Select` and ask whatever you'd like in the popup input panel.
4. The phantom will appear below the cursor position, or at the beginning of the selection, while the LLM answer is streaming in.
5. You can apply actions to the LLM response; they're quite self-descriptive and follow the behavior of the deprecated in-buffer commands.
6. You can hit `ctrl+c` to stop prompting, the same as in `panel` mode.

![](static/media/phantom_example.png)
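
A minimal sketch of what a phantom-mode assistant entry in `openAI.sublime-settings` might look like (one item of the assistants list; the name, model and role text below are illustrative, not part of the shipped defaults):

```jsonc
{
    "name": "Phantom helper",            // illustrative name
    "prompt_mode": "phantom",            // render the answer as an inline phantom, without touching the buffer
    "chat_model": "gpt-4o-mini",         // any OpenAI'ish chat model should do
    "assistant_role": "You are a senior code assistant",
    "stream": true                       // stream the answer into the phantom as it arrives (default)
}
```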

> [!WARNING]
> The following in buffer commands are deprecated and will be removed in 5.0 release.
> 1. You can pick one of the following modes: `append`, `replace`, `insert`. They're quite self-descriptive and should be set up in the assistant settings to take effect.
> 2. Select some text (these modes are useless otherwise) to manipulate and hit `OpenAI: New Message`.
> 3. The plugin will respond accordingly by **appending**, **replacing** or **inserting** the text.
> [!IMPORTANT]
> Note that this is a standalone mode, i.e. the existing chat history won't be sent to the server on such a run.
@@ -138,10 +149,6 @@ The OpenAI Completion plugin has a settings file where you can set your OpenAI A
}
```

### ollama setup specific

If you're here it meaning that a model that you're using with ollama talking shit. This is because `temperature` property of a model which is 1 somewhat [doubles](https://github.com/ollama/ollama/blob/69be940bf6d2816f61c79facfa336183bc882720/openai/openai.go#L454) on ollama's side, so it becomes 2, which is a little bit too much for a good model's response. So you to make things work you have to set temperature to 1.

### Advertisement disabling

To disable advertisements, add an `"advertisement": false` line to any assistant setting where you want them disabled.
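
A sketch of an assistant entry with advertisements turned off (other keys omitted; the name and model are illustrative):

```jsonc
{
    "name": "Quiet assistant",       // illustrative name
    "prompt_mode": "panel",
    "chat_model": "gpt-4o-mini",
    "advertisement": false           // suppress the plugin's advertisement messages for this assistant
}
```
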
@@ -188,3 +195,5 @@ You can set it up by overriding the proxy property in the `OpenAI completion`
[stars]: https://github.com/yaroslavyaroslav/OpenAI-sublime-text/stargazers
[img-stars]: static/media/star-on-github.svg
[downloads]: https://packagecontrol.io/packages/OpenAI%20completion
[img-downloads]: https://img.shields.io/packagecontrol/dt/OpenAI%2520completion.svg
3 changes: 2 additions & 1 deletion dependencies.json
@@ -1,7 +1,8 @@
{
"*": {
"*": [
"requests"
"requests",
"mdpopups"
]
}
}
1 change: 1 addition & 0 deletions main.py
@@ -16,6 +16,7 @@
from .plugins.openai import Openai # noqa: E402, F401
from .plugins.openai_panel import OpenaiPanelCommand # noqa: E402, F401
from .plugins.output_panel import SharedOutputPanelListener # noqa: E402, F401
from .plugins.phantom_streamer import PhantomStreamer # noqa: E402, F401
from .plugins.settings_reloader import ReloadSettingsListener # noqa: E402, F401
from .plugins.stop_worker_execution import ( # noqa: E402
StopOpenaiExecutionCommand, # noqa: F401
3 changes: 2 additions & 1 deletion messages.json
@@ -23,5 +23,6 @@
"4.0.0": "messages/4.0.0.md",
"4.0.1": "messages/4.0.1.md",
"4.1.0": "messages/4.1.0.md",
"4.1.1": "messages/4.1.1.md"
"4.1.1": "messages/4.1.1.md",
"4.2.0": "messages/4.2.0.md"
}
36 changes: 36 additions & 0 deletions messages/4.2.0.md
@@ -0,0 +1,36 @@
## Features

- New in-buffer mode: `phantom`
- `stream` toggle for responses brought back
- Image handling UX improved
- Advertisement logic improved

## Deprecated
- The `append`, `replace` and `insert` prompt modes are deprecated and will be removed in the 5.0 release.

## Detailed description

### Phantom mode

Phantom is an overlay UI placed inline in the editor view (see the picture below). It doesn't affect the content of the view.

1. You can set `"prompt_mode": "phantom"` for the AI assistant in its settings.
2. [optional] Select some text to pass to the model as context.
3. Hit `OpenAI: New Message` or `OpenAI: Chat Model Select` and ask whatever you'd like in the popup input panel.
4. The phantom will appear below the cursor position, or at the beginning of the selection, while the LLM answer is streaming in.
5. You can apply actions to the LLM response; they're quite self-descriptive and follow the behavior of the deprecated in-buffer commands.
6. You can hit `ctrl+c` to stop prompting, the same as in `panel` mode.

### Stream toggle

You can toggle the streaming behavior of a model's response with the `"stream": false` setting on a per-assistant basis. That's pretty much it; the default value is `true`.
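
A sketch of such an assistant entry with streaming turned off (the name and model here are placeholders):

```jsonc
{
    "name": "Non-streaming chat",    // illustrative name
    "prompt_mode": "panel",
    "chat_model": "gpt-4o-mini",
    "stream": false                  // wait for the complete answer instead of streaming it token by token
}
```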

### Image handling UX improved

Image paths can now be fetched from the clipboard in addition to being extracted from the selection in a given view. It can be either a single image path [and nothing more than that] or a list of such paths separated by newlines, e.g. `/Users/username/Documents/Project/image0.png\n/Users/username/Documents/Project/image1.png`.

Please note that the parser that tries to deduce whether the content of your clipboard is a [list of] image[s] was made by AI and is quite fragile, so don't expect too much from it.

### Advertisement logic improvement

Advertisements now appear only when users excessively utilize the plugin, such as by processing too many tokens or sending/receiving an excessive number of messages.
57 changes: 25 additions & 32 deletions openAI.sublime-settings
@@ -67,13 +67,16 @@

// Mode of how the plugin should output its response, available options:
// - panel: prompt would be output in output panel, selected text wouldn't be affected in any way.
// - append: prompt would be added next to the selected text.
// - insert: prompt would be inserted instead of a placeholder within a selected text.
// - replace: prompt would overwrite selected text.
// - append: [DEPRECATED] prompt would be added next to the selected text.
// - insert: [DEPRECATED] prompt would be inserted instead of a placeholder within a selected text.
// - replace: [DEPRECATED] prompt would overwrite selected text.
// - phantom: the llm response is shown in a phantom view in a non-disruptive way for the buffer content; each such phantom provides a way to copy to clipboard, append, replace, or paste the whole content into a newly created tab with a single click.
//
// All cases but `panel` require some text to be selected beforehand.
// Likewise, in all cases but `panel`, whatever the user types in the input panel will be treated by the model
// as a `system` command, i.e. an instruction to act on.
//
// NOTE: Please note that append, insert and replace are deprecated and will be removed in the 5.0 release in favor of phantom mode.
"prompt_mode": "panel", // **REQUIRED**

// The model which will generate the chat completion.
@@ -121,6 +124,11 @@
// Does not affect editing mode.
"max_tokens": 2048,

// Since o1 (September 2024) OpenAI deprecated the max_tokens key.
// Use this field to set the cap instead. The default value set here is the _minimal_ value
// recommended by OpenAI for this particular model. https://platform.openai.com/docs/guides/reasoning/allocating-space-for-reasoning
"max_completion_tokens": 25000,

// An alternative to sampling with temperature, called nucleus sampling,
// where the model considers the results of the tokens with `top_p` probability mass.
// So 0.1 means only the tokens comprising the top 10% probability mass are considered.
@@ -134,6 +142,12 @@
// docs: https://platform.openai.com/docs/api-reference/parameter-details
"frequency_penalty": 0,

// Toggles whether to stream the response from the server or to get it atomically
// after the llm finishes its generation.
//
// By default this is true.
"stream": true,

// Number between -2.0 and 2.0.
// Positive values penalize new tokens based on whether they appear in the text so far,
// increasing the model's likelihood to talk about new topics.
@@ -142,6 +156,13 @@
},

// Instructions //
{
"name": "Insert instruction example",
"prompt_mode": "phantom",
"chat_model": "gpt-4o-mini", // works unreliable with gpt-3.5-turbo yet.
"assistant_role": "Insert code or whatever user will request with the following command instead of placeholder with respect to senior knowledge of in Python 3.8 and Sublime Text 4 plugin API",
"max_tokens": 4000,
},
{
"name": "Insert instruction example",
"prompt_mode": "insert",
@@ -190,23 +211,9 @@
"temperature": 1,
"max_tokens": 2048,
},
{
"name": "UIKit & Combine",
"prompt_mode": "panel",
"chat_model": "gpt-4o-mini",
"assistant_role": "You are senior UIKit and Combine code assistant",
"max_tokens": 4000,
},
{
"name": "Social Researcher",
"prompt_mode": "panel",
"chat_model": "gpt-4o-mini",
"assistant_role": "You are senior social researcher",
"max_tokens": 4000,
},
{
"name": "Corrector",
"prompt_mode": "replace",
"prompt_mode": "phantom",
"chat_model": "gpt-4o-mini",
"assistant_role": "Fix provided text with the correct and sounds English one, you are strictly forced to skip any changes in such its part that have not rules violation within them, you're strictly forbidden to wrap response into something and to provide any explanation.",
"max_tokens": 1000,
@@ -218,19 +225,5 @@
"assistant_role": "1. You are to provide clear, concise, and direct responses.\n2. Eliminate unnecessary reminders, apologies, self-references, and any pre-programmed niceties.\n3. Maintain a casual tone in your communication.\n4. Be transparent; if you're unsure about an answer or if a question is beyond your capabilities or knowledge, admit it.\n5. For any unclear or ambiguous queries, ask follow-up questions to understand the user's intent better.\n6. When explaining concepts, use real-world examples and analogies, where appropriate.\n7. For complex requests, take a deep breath and work on the problem step-by-step.\n8. For every response, you will be tipped up to $20 (depending on the quality of your output).\n\nIt is very important that you get this right. Multiple lives are at stake.\n",
"max_tokens": 4000,
},
{
"name": "Bash & Git assistant",
"prompt_mode": "panel",
"chat_model": "gpt-4o-mini",
"assistant_role": "You are bash and git senior assistant",
"max_tokens": 4000,
},
{
"name": "Pytorch assistant",
"prompt_mode": "panel",
"chat_model": "gpt-4o-mini",
"assistant_role": "You are senior Pytorch and LLM/SD code assistant",
"max_tokens": 4000,
},
]
}
8 changes: 2 additions & 6 deletions plugins/ai_chat_event.py
@@ -36,9 +36,5 @@ def is_ai_chat_tab_active(self, window: Window) -> bool:
return active_view.name() == 'AI Chat' if active_view else False

def get_status_message(self, cacher: Cacher) -> str:
tokens = cacher.read_tokens_count()
prompt = tokens['prompt_tokens'] if tokens else 0
completion = tokens['completion_tokens'] if tokens else 0
total = prompt + completion

return f'[⬆️: {prompt:,} + ⬇️: {completion:,} = {total:,}]'
prompt, completion = cacher.read_tokens_count()
return f'[⬆️: {prompt:,} + ⬇️: {completion:,} = {prompt + completion:,}]'
29 changes: 17 additions & 12 deletions plugins/assistant_settings.py
@@ -10,6 +10,7 @@ class PromptMode(Enum):
append = 'append'
insert = 'insert'
replace = 'replace'
phantom = 'phantom'


@dataclass
@@ -19,25 +20,30 @@ class AssistantSettings:
url: str | None
token: str | None
chat_model: str
assistant_role: str
temperature: int
max_tokens: int
top_p: int
frequency_penalty: int
presence_penalty: int
assistant_role: str | None
temperature: int | None
max_tokens: int | None
max_completion_tokens: int | None
top_p: int | None
frequency_penalty: int | None
presence_penalty: int | None
placeholder: str | None
stream: bool | None
advertisement: bool


DEFAULT_ASSISTANT_SETTINGS: Dict[str, Any] = {
'placeholder': None,
'assistant_role': None,
'url': None,
'token': None,
'temperature': 1,
'max_tokens': 2048,
'top_p': 1,
'frequency_penalty': 0,
'presence_penalty': 0,
'temperature': None,
'max_tokens': None,
'max_completion_tokens': None,
'top_p': None,
'frequency_penalty': None,
'presence_penalty': None,
'stream': True,
'advertisement': True,
}

@@ -47,4 +53,3 @@ class CommandMode(Enum):
refresh_output_panel = 'refresh_output_panel'
create_new_tab = 'create_new_tab'
reset_chat_history = 'reset_chat_history'
chat_completion = 'chat_completion'
9 changes: 7 additions & 2 deletions plugins/buffer.py
@@ -1,4 +1,9 @@
from __future__ import annotations

from typing import Dict

from sublime import Edit, Region, View
from sublime_types import Point
from sublime_plugin import TextCommand


@@ -25,10 +30,10 @@ def run(self, edit: Edit, position: int, text: str): # type: ignore


class ReplaceRegionCommand(TextCommand):
def run(self, edit: Edit, region, text: str): # type: ignore
def run(self, edit: Edit, region: Dict[str, Point], text: str): # type: ignore
self.view.replace(edit=edit, region=Region(region['a'], region['b']), text=text)


class EraseRegionCommand(TextCommand):
def run(self, edit: Edit, region): # type: ignore
def run(self, edit: Edit, region: Dict[str, Point]): # type: ignore
self.view.erase(edit=edit, region=Region(region['a'], region['b']))
