A Discord chatbot that uses GPT-4, 3.5, 3, or LLaMA for text generation and ElevenLabs, Azure TTS, or Silero for voice chat.
This is actively being improved! Pull requests and issues are very welcome!
- Realistic voice chat support with ElevenLabs, Azure TTS, Play.ht, Silero, or Bark models
NOTE: The voice chat is only stable if one person is speaking at a time
- Image recognition support with BLIP
- Long term message recalling using OpenAI's embeddings to detect similar topics talked about in the past
- Custom bot identity and name
- Support for all OpenAI text completion and chat completion models
- Support for local LLaMA (GGML) models
- Local OpenAI Whisper support for speech recognition (as well as Google and Azure speech recognition)
- Chat-optimized commands
NOTE: Please only use this on small private servers. Right now it is set up for testing only, meaning anyone on the server can invoke its commands. Also, the bot will join voice chat whenever someone else joins!
Setup a server to run the bot on so it can run when your computer is off 24/7. For this guide, I will be using DigitalOcean, but you can use any server host you want. Skip this section if you already have a server or want to run it locally.
-
Create a DigitalOcean account here
-
Create a droplet
- Open your dashboard
- Click "Create" -> "Droplets"
- Select whatever region is closest to you and doesn't have any notes.
- Choose an image>Ubuntu>Ubuntu 20.04 (LTS) x64
- Droplet Type>Basic
- CPU options>Premium Intel (Regular is $1 cheaper but much slower.)
- 2 GB / 1 Intel CPU / 50 GB Disk / 2 TB Transfer (You need at least 2GB of Ram & 50GB of storage is more than enough storage for this bot.)
- Choose Authentication Method>Password>Pick a password
- Enable backups if you want. (This will cost extra but allow you to go back to a previous version of your server if you mess something up.)
- Create Droplet
- Connect to your droplet
- Open your dashboard
- Find your droplet>more>access console
- Log in as...root>Launch Droplet Console
-
At least 2gb of RAM
-
ffmpeg
sudo apt-get install ffmpeg
- Dev version of Python
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.9-dev
Tested on Python 3.9 but may work with other versions
- Pip
sudo apt-get install python3-pip
- PortAudio
sudo apt-get install portaudio19-dev
Clone the project files and cd into the directory
git clone https://github.com/hc20k/LLMChat.git
cd LLMChat
Simply run
python3.9 update.py -y
# -y installs required dependencies without user interaction
# Change python.x if using a different version of Python
to install all required dependencies. You will be asked if you want to install the optional dependencies for voice and/or image recognition in the script.
NOTE: It's healthy to run
update.py
after a new commit is made, because requirements may be added.
If you were having trouble with the update.py
script, you can install the dependencies manually using these commands.
Clone the project files and cd into the directory
git clone https://github.com/hc20k/LLMChat.git
cd LLMChat
Manually install the dependencies
pip install -r requirements.txt
# for voice support (ElevenLabs, bark, Azure, whisper)
pip install -r optional/voice-requirements.txt
# for BLIP support
pip install -r optional/blip-requirements.txt
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
# for LLaMA support
pip install -r optional/llama-requirements.txt
Create the bot:
- Visit Discord Developer Portal
- Applications -> New Application
- Generate Token for the app (discord_bot_token)
- Select App (Bot) -> Bot -> Reset Token (Save this token for later)
- Select App (Bot) -> Bot -> and turn on all intents
- Add Bot to server(s)
- Select App (Bot) -> OAuth2 -> URL Generator -> Select Scope: Bot, applications.commands
- Select Permissions: Administrator
NOTE: Administrator is required for the bot to work properly. If you want to use the bot without Administrator permissions, you can manually select the permissions you want to give the bot.
- Open the generated link and add the bot to your desired server(s)
Copy the config file
cp config.example.ini config.ini
Edit the config file
nano config.ini
speech_recognition_service =
:
whisper
- run OpenAI's Whisper locally. (Free)google
- use Google's online transcription service. (Free)azure
- use Microsoft Azure's online transcription service. ($)
tts_service =
:
elevenlabs
- use ElevenLabs for TTS. ($) (Further configuration in theElevenLabs
section required)azure
- use Azure cognitive services for TTS. ($) (Further configuration in theAzure
section required)silero
- uses local Silero models via pytorch. (Free)play.ht
- uses Play.ht for TTS. API key needed. ($)bark
- uses local Bark models for TTS. Optimal graphics card needed. (Free)
audiobook_mode =
true
- the bot will read its responses to the user from the text chat.false
- the bot will listen in VC and respond with voice.
llm =
:
openai
- use OpenAI's API for LLM ($ Fast))llama
- use a local LLaMA (GGML) model (Free, requires llama installation and is slower)
blip_enabled =
- true - the bot will recognize images and respond to them (requires BLIP, installed from update.py)
- false - the bot will not be able to recognize images
key =
- Your OpenAI API key
model =
use_embeddings =
- true - the bot will log and remember past messages and use them to generate new responses (more expensive)
- false - the bot will not log past messages and will generate responses based on the past few messages (less expensive)
bot_api_key =
- The token you generated for the bot in the Discord Developer Portal
active_channels =
- A list of text and voice channel ids the bot should interact with, seperated by commas
- Example:
1090126458803986483,922580158454562851
orall
(Bot will interact with every channel)
Supply your API keys & desired voice for the service you chose for tts_service
After changing the configuration files, start the bot
python3.9 main.py
Or run the bot in the background useing screen to keep it running after you disconnect from a server.
screen -S name python3.9 main.py
# Press `Ctrl+a` then `d` to detach from the running bot.
/configure
- Allows you to set the chatbot's name, identity description, and optional reminder text (a context clue sent further along in the transcript so the AI will consider it more)/model
- Allows you to change the current model. If you're in OpenAI mode, it will allow you to select from the OpenAI models. If you're in LLaMA mode, it will allow you to select a file from theLLaMA.search_path
folder./avatar [url]
- Allows you to easily set the chatbot's avatar to a specific URL./message_context_count
- (default 20) Sets the amount of messages that are sent to the AI for context. Increasing this number will increase the amount of tokens you'll use./audiobook_mode
- (defaultfalse
) Allows you to changeBot.audiobook_mode
without manually editing the config.
/reload_config
- Reloads all of the settings in the config.ini./purge
- Deletes all of the messages in the current channel. DANGEROUS. I should probably disable this but I use it during testing./system [message]
- Allows you to send a message as thesystem
role. Only supported for OpenAI models >= gpt-3.5-turbo./retry
- Allows you to re-infer the last message, in case you didn't like it.
/print_info
- Prints some info about the bot. (Its name, identity, and model as well as your name and identity)/your_identity
- Allows you to set your own name and identity (What the chatbot knows about you)