Multi-Modal Model Python Project
This project is a multi-modal model that accepts audio, images, and text as inputs and generates corresponding audio, image, and text outputs.
- Streamlit Interface: Coming Soon
- Input Modalities: Audio, images, text, videos, emojis, multiple inputs
- Output Modalities: Audio, images, text, videos, emojis, segmented images, object detection coordinates for images, multiple outputs
- Python 3.x
- Dependencies listed in requirements.txt
git clone https://github.com/Kind-Unes/Multi-Model-V1.git
cd 'MultiMODEL Template'
pip install -r requirements.txt
python model.py
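
The input modalities listed above have to be told apart before an input can be handed to the right part of a model. The following is only a minimal sketch of one way to do that for file inputs, using file extensions via Python's standard mimetypes module. The function name guess_modality and the "unknown" fallback are hypothetical and are not taken from this repository; how model.py actually handles its inputs is not described here.

```python
import mimetypes


def guess_modality(path: str) -> str:
    """Guess an input modality from a file name (hypothetical helper).

    Returns one of "audio", "image", "video", "text", or "unknown".
    This is an illustrative assumption, not the project's own logic.
    """
    # guess_type returns (mime_type, encoding); mime_type is None
    # when the extension is not recognized.
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        return "unknown"

    # The part before the slash is the top-level media type,
    # e.g. "image" in "image/png".
    major = mime_type.split("/", 1)[0]
    if major in {"audio", "image", "video", "text"}:
        return major
    return "unknown"


if __name__ == "__main__":
    for name in ["photo.png", "clip.wav", "movie.mp4", "notes.txt"]:
        print(name, guess_modality(name))
```

Extension-based detection is cheap but only as reliable as the file name; inputs such as raw text strings or emojis typed by a user would need a separate code path.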