- Generates a text description of a strange/humorous image using LLaMA.
- Generates a voiceover script describing that image using LLaMA.
- Enhances the text description and uses Stable Diffusion to generate the image.
- Generates the voiceover audio using Coqui TTS.
- Stitches the image and audio into a video using ffmpeg.
- Generates a title for the video using BLIP.
- Uploads the video to YouTube (WIP).
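The stitching step can be illustrated with a short sketch. This is not taken from autoytpoo.py; the function names `build_ffmpeg_command` and `stitch` and the file names are hypothetical, and the encoder settings are one common choice for turning a still image plus an audio track into a video.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_ffmpeg_command(image: Path, audio: Path, output: Path) -> list[str]:
    """Build an ffmpeg command that shows one still image for the length of the audio.

    Hypothetical helper for illustration; the real script may use different flags.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-loop", "1",            # repeat the single image as a video stream
        "-i", str(image),
        "-i", str(audio),
        "-c:v", "libx264",       # H.264 video
        "-tune", "stillimage",   # x264 tuning for static content
        "-c:a", "aac",           # AAC audio
        "-pix_fmt", "yuv420p",   # pixel format most players accept
        "-shortest",             # stop encoding when the audio ends
        str(output),
    ]


def stitch(image: Path, audio: Path, output: Path) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(image, audio, output), check=True)
```

Calling `stitch(Path("image.png"), Path("voice.wav"), Path("out.mp4"))` would require ffmpeg to be reachable from the current shell. Building the argument list separately from running it makes the command easy to print and inspect when something goes wrong.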
- Git for Windows
- Conda
- MSVC Build Tools
- ffmpeg (in this dir or on your PATH)
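A quick way to see which of the command-line prerequisites are reachable is sketched below. It is an illustration only: `find_missing_tools` and `REQUIRED_TOOLS` are hypothetical names, the executable names are assumed to be `git`, `conda`, and `ffmpeg`, and MSVC Build Tools are not checked because they are usually not exposed as a single command on PATH.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Assumed executable names for the prerequisites listed above.
REQUIRED_TOOLS = ["git", "conda", "ffmpeg"]


def find_missing_tools(tools: list[str], extra_dir: Path | None = None) -> list[str]:
    """Return the tools found neither on PATH nor in extra_dir."""
    missing = []
    for tool in tools:
        if shutil.which(tool) is not None:
            continue  # found on PATH
        if extra_dir is not None and (
            (extra_dir / tool).exists() or (extra_dir / f"{tool}.exe").exists()
        ):
            continue  # found next to the script, e.g. a local ffmpeg.exe
        missing.append(tool)
    return missing


if __name__ == "__main__":
    for name in find_missing_tools(REQUIRED_TOOLS, extra_dir=Path.cwd()):
        print(f"Not found: {name}")
```

A tool reported as missing may simply not be on PATH for the shell you ran this from, so treat the output as a hint rather than a verdict.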
Clone these to the ./models directory in this working copy.
NOTE: After installing MSVC Build Tools, ensure the following components are also installed:
You must also add the Windows 10 SDK path to your PATH environment variable. For example, C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64.
I had to run pip install --force-reinstall regex at some point...
Always use Git Bash for terminal commands below.
conda env create -f environment.yaml
conda activate autoytpoo
python autoytpoo.py