This LLM chat app is an open-source application developed using Streamlit and Docker. It is powered by various LLM APIs and has extra features to customize the user experience. It supports image and text prompt input as well as online search citations with URL links. Chat sessions are persisted in a MySQL database for retrieval and search. A retrieved chat session can be saved locally to an .HTML file or deleted from the database.
The short video below demonstrates some of the features. https://youtu.be/cHsequP0Wsw
Version 2.15 of this APP has upgraded `gpt-5.1-2025-11-13` and `gpt-5.1-2025-11-13-thinking` to `gpt-5.2-2025-12-11` and `gpt-5.2-2025-12-11-thinking`, respectively, where in the former the thinking effort is set to `low` and in the latter to `high`. GPT-5.2 is better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long contexts, using tools, and handling complex, multi-step projects.
Version 2.14 of this APP has made the following changes:

- Upgrade both `gpt-5-mini-2025-08-07` and `gpt-5-mini-2025-08-07-thinking` to `gpt-5.1-2025-11-13` and `gpt-5.1-2025-11-13-thinking`, where in the former the reasoning effort is set to `low` and in the latter to `high`. GPT-5.1 is a smarter, more conversational model that is warmer, more intelligent, and better at following your instructions. It is also faster on simple tasks and more persistent on complex ones.
- Upgrade both `gemini-2.0-flash` and `gemini-2.5-pro` to `gemini-3-pro-preview` and `gemini-3-pro-preview-thinking`, where in the former the thinking level is set to `low` and in the latter to `high`. Gemini-3-Pro is best for multimodal understanding and agentic and vibe coding, delivering richer visualizations and deeper interactivity, all built on a foundation of state-of-the-art reasoning.
- Implement streaming mode in both the `gpt-5.1-2025-11-13` and `gpt-5.1-2025-11-13-thinking` models.
- Add web-search capability to the `gemini-3-pro-preview-thinking` model.
- Upgrade the `google-genai` package to version `1.52.0` and `mistralai` to `1.9.11`.
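The reasoning-effort and streaming settings above can be sketched with the OpenAI Python SDK's Responses API. This is a minimal illustration, not this app's actual code: the parameter names follow OpenAI's public API, and the mapping of the app's "-thinking" option to a `high` effort on the same model is an assumption based on the notes above.

```python
# Sketch: streaming a reasoning-model response with a chosen reasoning
# effort. Assumes the `openai` package and an OPENAI_API_KEY in the
# environment; parameter names follow OpenAI's public Responses API.

def build_request(model: str, prompt: str, effort: str) -> dict:
    """Keyword arguments for a streaming Responses API call."""
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},  # "low" for the base entry, "high" for -thinking
        "stream": True,
    }

def stream_answer(prompt: str, thinking: bool = False) -> None:
    from openai import OpenAI  # imported lazily so the sketch loads without the SDK
    client = OpenAI()
    # The app lists "-thinking" as a separate model option; here both options
    # map to one API model with different reasoning efforts (an assumption).
    kwargs = build_request(
        model="gpt-5.1-2025-11-13",
        prompt=prompt,
        effort="high" if thinking else "low",
    )
    for event in client.responses.create(**kwargs):
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
```

With `stream=True`, the SDK yields incremental events instead of one final object, which is what lets the chat window render tokens as they arrive.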
At present, this APP has six models that support web-search result citations:
- `gpt-5.2-2025-12-11`
- `gpt-5.2-2025-12-11-thinking`
- `claude-sonnet-4-5-20250929`
- `gemini-3-pro-preview`
- `gemini-3-pro-preview-thinking`
- `perplexity-sonar-pro`
Version 2.13 of this APP has replaced the `o3-mini-high` reasoning model with the `gpt-5-mini-2025-08-07` model, with its reasoning effort level set to `high` (listed as `gpt-5-mini-2025-08-07-thinking` in the model options).
Version 2.12 of this APP has upgraded the Claude model from `claude-sonnet-4-20250514` to `claude-sonnet-4-5-20250929`, which is the strongest model for coding, building complex agents, and using computers, and it shows substantial gains in reasoning and math.
Version 2.11 of this APP has made the following two changes:
- Upgrade the Gemini model `gemini-2.5-pro-preview-05-06` to the more stable `gemini-2.5-pro`.
- Upgrade the Qwen model `Qwen3-235b-a22b` to `qwen3-235b-a22b-2507`, which supports a native 262K context length and does not implement "thinking mode". Compared to its base variant, this version delivers significant gains in knowledge coverage, long-context reasoning, coding benchmarks, and alignment with open-ended tasks.
Version 2.10.1 of this APP improves the display of nested ordered or unordered lists in the saved .HTML file for most models.
Version 2.10 of this APP has made the following two changes:
- Upgrade the OpenAI model from `gpt-4.1-2025-04-14` to `gpt-5-mini-2025-08-07`, which is proficient in code generation, bug fixing and refactoring, instruction following, and tool calling. `gpt-5-mini-2025-08-07` is a more cost-efficient and faster version of GPT-5 with a larger token limit per minute. The response is in NON-STREAMING mode to potentially avoid organization/identity verification.
- Further improve the distinction between inline math and currency.
Version 2.9.3 of this APP has made the following four improvements:
- Make the URL of a web page in the `perplexity-sonar-pro` model's sources of search results also CLICKABLE in the saved .HTML file.
- Add handling of `executable code` in the `gemini-2.0-flash` model's response.
- Try extracting the actual HTML `<title>` of a web page and display it as the title of a CLICKABLE link (rather than just using a simplified label) in the `gemini-2.0-flash` model's sources of search results.
- Correctly display dollar amounts (e.g., $2.82 and $2.77) in chat session HTML output as well as in the saved .HTML file.
Version 2.9.2 of this APP adds a "Copied!" feedback when the copy button of a code block is clicked.
Version 2.9.1 of this APP improves source citation in gemini-2.0-flash and claude-sonnet-4-20250514 models.
Version 2.9 of this APP has made the following four changes:
- Upgrade the Claude model `claude-3-7-sonnet-20250219` to `claude-sonnet-4-20250514`, which is a significant upgrade to Claude Sonnet 3.7. It delivers superior coding while responding more precisely to your instructions.
- Upgrade the Claude thinking model `claude-3-7-sonnet-20250219-thinking` to `claude-sonnet-4-20250514-thinking`, which returns a summary of Claude's full thinking process and leverages a more sophisticated reasoning strategy.
- Upgrade the DeepSeek reasoning model to `DeepSeek-R1-0528`, which improves benchmark performance with enhanced front-end capabilities and reduced hallucinations.
- Add a spinner when waiting for the response of a thinking model, and further improve nested output in a .HTML file.
Version 2.8 of this APP has made the following four changes:
- Upgrade the thinking model `gemini-2.5-pro-preview-03-25` to `gemini-2.5-pro-preview-05-06`, which marks best-in-class frontend web development and a major leap in video understanding, with a knowledge cutoff in January 2025 and a context window of 1M tokens.
- Add the web search tool to the model `claude-3-7-sonnet-20250219`, which gives Claude direct access to real-time web content, allowing it to answer questions with up-to-date information.
- Improve the web-search result citation of the model `gemini-2.0-flash` by directly extracting the URLs and titles from the model response rather than relying on prompt engineering. The same practice is used in the newly added search tool of the model `claude-3-7-sonnet-20250219`.
- Improve the rendering of mathematical expressions in the model `gpt-4.1-2025-04-14`.
At present, this APP has four models that support web-search result citations:
- `gpt-4.1-2025-04-14`
- `claude-3-7-sonnet-20250219`
- `gemini-2.0-flash`
- `perplexity-sonar-pro`
Version 2.7 of this APP has made the following three changes:
- Improve the rendering of mathematical expressions for all models in chat messages and in saved .HTML files.
- Improve the rendering of nested lists in saved .HTML files.
- Upgrade the model `Qwen2.5-Max` to `Qwen3-235b-a22b`, which supports prompt-driven seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within a single model. For maximum control, use explicit prompting patterns rather than relying on automatic mode switching.
At present, this APP has three models that support web-search result citations:
- `gpt-4o-2024-11-20`
- `gemini-2.0-flash`
- `perplexity-sonar-pro`
Version 2.6 of this APP has made the following two changes:
- Upgrade the model `gpt-4o-2024-11-20` to `gpt-4.1-2025-04-14`, which makes significant advancements in coding, instruction following, and long-context processing with a 1-million-token context window.
- Upgrade the thinking model `gemini-2.0-flash-thinking-exp-01-21` to `gemini-2.5-pro-preview-03-25`, which shows advanced coding capability and is state-of-the-art across a range of benchmarks requiring enhanced reasoning.
At present, this APP has four models with thinking/reasoning capability:
- `o3-mini-high`
- `claude-3-7-sonnet-20250219-thinking`
- `gemini-2.5-pro-preview-03-25`
- `DeepSeek-R1`
Version 2.5 of this APP has made the following changes:
- Add web search capability to the OpenAI model `gpt-4o-2024-11-20`. The `search_context_size` is set to `high` to provide the most comprehensive context retrieved from the web to help the tool formulate a response. The `openai` package needs to be upgraded to enable this feature.
- Change the prompts to the models `gemini-2.0-flash` and `perplexity-sonar-pro` to improve the format of web-search result citation.
- Add an MIT license file.
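The `search_context_size` setting described above can be sketched with the OpenAI Python SDK. The tool spec below follows OpenAI's public documentation for web search in the Responses API; treat the exact wiring as an illustrative assumption, not this app's actual code:

```python
# Sketch: enabling OpenAI web search with search_context_size="high".
# Assumes a recent `openai` package and an OPENAI_API_KEY in the
# environment; the tool spec follows OpenAI's public docs.

VALID_CONTEXT_SIZES = ("low", "medium", "high")

def web_search_tool(context_size: str = "high") -> dict:
    """Tool spec asking for the most comprehensive web context."""
    if context_size not in VALID_CONTEXT_SIZES:
        raise ValueError(f"context_size must be one of {VALID_CONTEXT_SIZES}")
    return {"type": "web_search_preview", "search_context_size": context_size}

def ask_with_search(question: str) -> str:
    from openai import OpenAI  # lazy import so the sketch loads without the SDK
    client = OpenAI()
    response = client.responses.create(
        model="gpt-4o-2024-11-20",
        tools=[web_search_tool("high")],
        input=question,
    )
    return response.output_text  # cited URLs arrive as annotations on the output
```

A larger context size improves citation coverage at the cost of latency and tokens, which is why this app pins it to `high`.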
At present, this APP has three models that support web-search result citations:
- `gpt-4o-2024-11-20`
- `gemini-2.0-flash`
- `perplexity-sonar-pro`
Version 2.4 of this APP has made the following changes:
- Add the OpenAI reasoning model `o3-mini-high`, which is hosted by OpenRouter.ai. With its reasoning effort set to high, this model delivers exceptional STEM capabilities, with particular strength in science, math, and coding.
- Add the Anthropic thinking model `claude-3-7-sonnet-20250219-thinking`. Extended thinking gives Claude 3.7 Sonnet enhanced reasoning capabilities for complex tasks, while also providing transparency into its step-by-step thought process. The maximum token limit for this model is 20,000, with a thinking token budget of 16,000.
- Upgrade the Anthropic model `claude-3-5-sonnet-20241022` to `claude-3-7-sonnet-20250219`, which shows particularly strong improvements in coding and front-end web development.
- Upgrade the Alibaba model `Qwen2.5-Coder-32B-Instruct` to `Qwen2.5-Max`, a large-scale Mixture-of-Experts (MoE) model that outperforms DeepSeek V3 in benchmarks such as Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond. This model is hosted by OpenRouter.ai.
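The extended-thinking limits quoted above (20,000 max tokens, 16,000 thinking budget) can be sketched with the Anthropic Python SDK. Field names follow Anthropic's public Messages API; the wiring is an illustrative assumption, not this app's actual code:

```python
# Sketch: an extended-thinking call to Claude 3.7 Sonnet with the token
# limits quoted above. Assumes the `anthropic` package and an
# ANTHROPIC_API_KEY in the environment.

MAX_TOKENS = 20_000       # maximum token limit for this model (from the note above)
THINKING_BUDGET = 16_000  # thinking token budget (from the note above)

def thinking_config(budget: int = THINKING_BUDGET) -> dict:
    """Extended-thinking block for the Messages API."""
    return {"type": "enabled", "budget_tokens": budget}

def ask_claude_thinking(question: str) -> None:
    import anthropic  # lazy import so the sketch loads without the SDK
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=MAX_TOKENS,
        thinking=thinking_config(),
        messages=[{"role": "user", "content": question}],
    )
    for block in message.content:
        if block.type == "thinking":
            print("[thinking]", block.thinking)  # step-by-step thought process
        elif block.type == "text":
            print(block.text)  # final answer
```

The thinking budget must stay below `max_tokens`, since the visible answer shares the same overall token limit.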
In version 2.4 of this APP, you can use the following 6 multimodal LLM models with both image and text input:
- `gpt-4o-2024-11-20`
- `claude-3-7-sonnet-20250219`
- `claude-3-7-sonnet-20250219-thinking`
- `pixtral-large-latest`
- `gemini-2.0-flash`
- `gemini-2.0-flash-thinking-exp-01-21`
Version 2.3 of this APP has made the following changes:
- Add the reasoning model `Deepseek-R1`, which is hosted by Together.ai. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
- Upgrade `gemini-2.0-flash-exp` to the latest production-ready `gemini-2.0-flash`, which delivers next-gen features and improved capabilities, including superior speed, native tool use, multimodal generation, and a 1M-token context window.
- Upgrade `gemini-2.0-flash-thinking-exp` to the latest `gemini-2.0-flash-thinking-exp-01-21`, which delivers enhanced abilities across math, science, and multimodal reasoning.
- Upgrade `perplexity-llama-3.1-sonar-huge-128k-online` to `perplexity-sonar-pro`, a premier search offering with search grounding that supports advanced queries and follow-ups. The legacy model `llama-3.1-sonar-huge-128k-online` will be deprecated and will no longer be available after 2/22/2025.
- Increase the default maximum number of tokens a model can generate from 4000 to 6000.
Version 2.2 of this APP has made one change:
- Show the model name of an assistant when saving a chat session to an .HTML file.
Version 2.1.1 of this APP has added the capability to prompt the following 5 multimodal LLM models with both image and text:
- `gpt-4o-2024-11-20`
- `claude-3-5-sonnet-20241022`
- `pixtral-large-latest`
- `gemini-2.0-flash-exp`
- `gemini-2.0-flash-thinking-exp`
- To include an image in a prompt, follow the steps below:
  - Click the `From Clipboard` button in the left pane to show the `Click to Paste from Clipboard` button in the central pane.
  - Use the screen capture tool of your computer to capture an image from your screen.
  - Click the `Click to Paste from Clipboard` button in the central pane to paste the image into the chat window (after browser permission is granted). This function is tested in Chrome and Edge.
  - Type your question and click the `Send` button to submit the question.
- A session that contains both image and text can be saved to a local .HTML file (after first loading the session) by clicking the `Save it to a .html file` button. If this APP is run in the `personal_chatgpt` folder by typing the command `streamlit run personal_chatgpt.py`, the associated images will be saved to a newly created folder `images` in the `personal_chatgpt` folder. If this APP is run in Docker, the images will be saved to the `Downloads` folder of your computer.
- To get the summary title of a session, an image is first sent to the `pixtral-large-latest` model (as an OCR model) to extract its text content. Starting from this version, the free API of OCRSpace's `OCR engine2` is no longer used as the OCR model.
At present, this APP has two models that support web-search result citations:
- `gemini-2.0-flash-exp`
- `perplexity-llama-3.1-sonar-huge-128k-online`
Version 1.10.0 of this APP has made two changes:
- Add a new model `gemini-2.0-flash-thinking-exp` that is trained to generate the "thinking process" the model goes through as part of its response. As a result, Thinking Mode is capable of stronger reasoning in its responses than the Gemini 2.0 Flash Experimental model. This model currently does not support tool usage like Google Search (from the Gemini 2.0 Flash web site).
- Prompt the `gemini-2.0-flash-exp` model to provide web-link citations of its Google Search results. The citation format is similar to that of the `perplexity-llama-3.1-sonar-huge-128k-online` model.
Version 1.9.0 of this APP has made one change:
- Leverage the `GoogleSearch` tool of `gemini-2.0-flash-exp` to improve the accuracy and recency of responses from the model. The model can decide when to use Google Search. This requires installing the new `google-genai` package from PyPI.
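Grounding Gemini with the `GoogleSearch` tool can be sketched with the `google-genai` package mentioned above. The client and type names follow the package's public API; treat the exact wiring as an illustrative assumption rather than this app's actual code:

```python
# Sketch: letting Gemini decide when to ground a response with Google
# Search, via the `google-genai` package. Assumes an API key in the
# environment; names follow the package's public API.

def ask_gemini_with_search(question: str) -> str:
    from google import genai  # lazy import so the sketch loads without the SDK
    from google.genai import types

    client = genai.Client()  # reads the API key from the environment
    response = client.models.generate_content(
        model="gemini-2.0-flash-exp",
        contents=question,
        config=types.GenerateContentConfig(
            # The model decides on its own when to issue a Google Search.
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    return response.text  # grounding metadata (queries, sources) rides on the candidates
```

Because the tool is declarative, no search code runs client-side; the model requests grounding only when it judges the question needs fresh information.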
Version 1.8.0 of this APP has made one change:
- Change the Gemini model to `gemini-2.0-flash-exp`, which delivers improvements to multimodal understanding, coding, complex instruction following, and function calling.
Version 1.7.0 of this APP has made two changes:
- Change the OpenAI model to `gpt-4o-2024-11-20`, which has better creative writing ability and is better at working with uploaded files, providing deeper insights and more thorough responses (from an OpenAI X post).
- Change the Gemini model to `gemini-exp-1121`, which brings significant gains in coding performance, stronger reasoning capabilities, and improved visual understanding (from a Google AI X post). This model currently does not support Grounding with Google Search (as of Nov 28, 2024).
Version 1.6.0 of this APP has made one change:
- Add a new model `Qwen2.5-Coder-32B-Instruct`. This 32B model is developed by Alibaba Cloud and is available at Together.ai.
Version 1.5.0 of this APP has made one change:
- Add a new model `llama-3.1-nemotron-70b-instruct` from Nvidia. A free API key is available at https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct by clicking "Build with this NIM".
Version 1.4.0 of this APP has made two changes:
- Change the Perplexity model to `llama-3.1-sonar-huge-128k-online`, which features source citation and clickable URL links.
- Change the Gemini model to `gemini-1.5-pro-002`, which features Grounding with Google Search.
This version currently also has the following features:
- Switch between the LLM models `claude-3-5-sonnet-20241022` and `mistral-large-latest` anytime in a chat session.
- Extract text from a screenshot image. This feature uses the Streamlit component `streamlit-paste-button` to paste an image from the clipboard after user consent (tested on Google Chrome and Microsoft Edge). The image is then sent to the free API of OCRSpace's OCR engine2 to extract the text with automatic Western-language detection.
- Extract the folder structure and file contents of a .zip file.
- Show the name of a model in the response of an LLM API call.
- Select the behavior of your model as either deterministic, conservative, balanced, diverse, or creative.
- Select the maximum number of tokens the model creates for each API call.
- Select a date range and a previous chat session in that range (by choosing from a session summary) and reload the messages of this chat session from a local MySQL database.
- Search previous chat sessions with keywords using MySQL full-text boolean search, and reload a matching session.
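The full-text boolean search above can be sketched as follows. The table and column names (`session`, `summary`, `session_id`) are hypothetical placeholders, and the `summary` column is assumed to carry a FULLTEXT index; this app's real schema may differ:

```python
# Sketch: keyword search over saved sessions with MySQL boolean-mode
# full-text search. Table/column names are hypothetical placeholders;
# the searched column needs a FULLTEXT index for MATCH ... AGAINST.

SEARCH_SQL = (
    "SELECT session_id, summary FROM session "
    "WHERE MATCH(summary) AGAINST (%s IN BOOLEAN MODE)"
)

def keywords_to_boolean(keywords: list[str]) -> str:
    """['docker', 'mysql'] -> '+docker +mysql': every keyword must be present."""
    return " ".join("+" + word for word in keywords)

def search_sessions(connection, keywords: list[str]):
    # `connection` would come from e.g. mysql.connector using st.secrets["mysql"].
    with connection.cursor() as cursor:
        cursor.execute(SEARCH_SQL, (keywords_to_boolean(keywords),))
        return cursor.fetchall()
```

In boolean mode, the `+` prefix makes each keyword mandatory; omitting it would match sessions containing any of the keywords instead.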
- Save the messages of a session as an HTML file on the local computer.
- Upload a file (or multiple files) from the local computer with a question (optional) and send it to an API call.
- Delete the messages of a loaded chat session from the database, and
- Finally, delete the contents in all tables in the database, if that is what you want to do.
To run this app, create a `secrets.toml` file under the directory `.streamlit` in the project folder `Personal-ChatGPT` and enter your LLM API keys and MySQL database credentials as shown in the code block below. The database credentials can be left as is to match those in the Docker compose file in the project folder.
```toml
# .streamlit/secrets.toml
OPENAI_API_KEY = "my_openai_key"
GOOGLE_API_KEY = "my_gemini_key"
MISTRAL_API_KEY = "my_mistral_key"
ANTHROPIC_API_KEY = "my_claude_key"
TOGETHER_API_KEY = "my_together_key"
PERPLEXITY_API_KEY = "my_perplexity_key"
NVIDIA_API_KEY = "my_nvidia_key"
OPENROUTER_API_KEY = "my_openrouter_key"

[mysql]
host = "mysql"
user = "root"
password = "my_password"
database = "chat"
```
If you have a MySQL database on your computer, another way to run this app without Docker is to `cd` into the `personal-chatgpt` directory, where there is a separate `.streamlit` folder, and create a `secrets.toml` file in this `.streamlit` folder. Enter the password of your local MySQL database instead of the `"my_password"` shown in the previous code block. Remember first to create a database with the name `chat`. The app can be run by typing the following command in the sub-directory `personal-chatgpt` in your virtual environment.
```shell
streamlit run personal_chatgpt.py
```
To clone the GitHub repository, type the following command.
```shell
git clone https://github.com/tonypeng1/Personal-ChatGPT.git
```
To create a Python virtual environment, check out version 2.15 of this APP, and install the project, type the following commands.
```shell
cd Personal-ChatGPT
python3 -m venv .venv
source .venv/bin/activate
git checkout v2.15
python3 -m pip install --upgrade pip setuptools wheel
python3 -m pip install -e .
```
To create and run a Docker image, type the following commands in the project directory `Personal-ChatGPT`, where there is a file called `Dockerfile`.
```shell
docker build -t streamlit-mysql:2.15 .
docker compose up
```
A more detailed description of this APP can be found in the Medium article: https://medium.com/@tony3t3t/personal-llm-chat-app-using-streamlit-e3996312b744