This Streamlit app generates AI images featuring Person within a specified scene. It uses Gemini for prompt modification and Replicate for image generation.
- Prompt Modification with Gemini: Takes a user-provided prompt and modifies it using Gemini to seamlessly integrate Person into the scene. Ensures Person is visible and actively participating.
- Image Generation with Replicate: Leverages a custom Replicate model (`harikrishnad1997/flux-1-hari-ft:37b22168a51d814b49bc8629cca6caaa6789a8a7b65cdd5123310fe5a5c5fecc`) fine-tuned to include Person.
- Customizable Settings: Allows users to adjust the number of inference steps, guidance scale, and number of output images.
- Image Display and Download: Displays the generated images within the app and provides download links.
- Error Handling: Includes error handling for missing dependencies, API keys, and image loading failures.
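As a rough illustration of the customizable settings, the sketch below maps them onto an input payload for the Replicate model. The input keys (`num_inference_steps`, `guidance_scale`, `num_outputs`), the default values, and the helper names are assumptions made for this example, not taken from the app; check the model's input schema on Replicate for the real names and accepted ranges.

```python
from dataclasses import dataclass

# Model reference quoted in the feature list above.
MODEL_REF = (
    "harikrishnad1997/flux-1-hari-ft:"
    "37b22168a51d814b49bc8629cca6caaa6789a8a7b65cdd5123310fe5a5c5fecc"
)


@dataclass(frozen=True)
class GenerationSettings:
    """User-adjustable settings; the defaults here are illustrative guesses."""
    num_inference_steps: int = 28
    guidance_scale: float = 3.5
    num_outputs: int = 1


def build_model_input(prompt: str, settings: GenerationSettings) -> dict:
    """Validate the settings and return the input dict for the model.

    The key names are assumed, not confirmed against the model's schema.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if settings.num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    if settings.guidance_scale < 0:
        raise ValueError("guidance_scale must not be negative")
    if settings.num_outputs < 1:
        raise ValueError("num_outputs must be at least 1")
    return {
        "prompt": prompt.strip(),
        "num_inference_steps": settings.num_inference_steps,
        "guidance_scale": settings.guidance_scale,
        "num_outputs": settings.num_outputs,
    }


def generate_images(prompt: str, settings: GenerationSettings):
    """Call the model; needs the `replicate` package and REPLICATE_API_TOKEN."""
    import replicate  # imported lazily so the helpers above work without it

    return replicate.run(MODEL_REF, input=build_model_input(prompt, settings))
```

Keeping validation in a separate pure function means bad settings can be rejected before any API call is made.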
- Clone the repository:
git clone https://github.com/anthonysandesh/Image_Gen_App.git
cd Image_Gen_App
- Create a virtual environment (recommended):
python3 -m venv .venv
source .venv/bin/activate
- Install the required packages:
pip install -r requirements.txt
- Set up API Keys:
  - Replicate API Token: Obtain your Replicate API token from your Replicate account settings and set it as the environment variable `REPLICATE_API_TOKEN` or in the Streamlit secrets file (`.streamlit/secrets.toml`).
  - Gemini API Key: Obtain your Gemini API key and set it as the environment variable `GEMINI_API_KEY` or in the Streamlit secrets file (`.streamlit/secrets.toml`).

Example `.streamlit/secrets.toml`:
REPLICATE_API_TOKEN="YOUR_REPLICATE_API_TOKEN"
GEMINI_API_KEY="YOUR_GEMINI_API_KEY"
- Run the Streamlit app:
streamlit run app.py
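The API key setup in the steps above allows either an environment variable or the Streamlit secrets file. One way such a lookup could work is sketched below. The function name, the exception type, and the order of precedence (environment first, then secrets) are assumptions for illustration; the app itself may resolve keys differently.

```python
import os
from typing import Mapping, Optional


class MissingApiKeyError(RuntimeError):
    """Raised when a key is found in neither the environment nor the secrets."""


def resolve_api_key(
    name: str,
    secrets: Optional[Mapping[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the value for `name`, checking the environment, then secrets.

    `secrets` can be any mapping; inside a Streamlit app it could be
    `st.secrets`. The precedence order is an assumption of this sketch.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if value:
        return value
    if secrets is not None and name in secrets:
        value = secrets[name]
        if value:
            return value
    raise MissingApiKeyError(
        f"{name} is not set. Export it as an environment variable or add it "
        "to .streamlit/secrets.toml."
    )
```

A call such as `resolve_api_key("GEMINI_API_KEY", secrets=st.secrets)` would then fail early with a readable message instead of a later authentication error. Depending on the Streamlit version, reading `st.secrets` without a secrets file may itself raise, so the caller may want to guard that access.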
- Enter a prompt: In the app, enter a description of the scene you want to generate.
- Adjust settings (optional): Use the advanced settings to customize the image generation process.
- Generate images: Click the "Generate Images" button.
- View and download: The generated images will be displayed. You can download them using the provided links.
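The view-and-download step, together with the handling of image loading failures, might look roughly like the sketch below. It assumes the model returns image URLs as plain strings and that the images are PNG files; both are assumptions, as are the helper names `fetch_image_bytes`, `download_name`, and `render_images`.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def fetch_image_bytes(
    url: str,
    timeout: float = 30.0,
    opener: Callable = urllib.request.urlopen,
) -> Optional[bytes]:
    """Download one image; return None instead of raising if loading fails."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (OSError, ValueError):
        # OSError covers URLError and timeouts; ValueError covers malformed URLs.
        return None


def download_name(index: int) -> str:
    """File name offered for the image at a zero-based index (PNG assumed)."""
    return f"generated_image_{index + 1}.png"


def render_images(urls: Iterable[str]) -> None:
    """Show each image with a download button; requires `streamlit`."""
    import streamlit as st  # imported lazily so the helpers above work without it

    for index, url in enumerate(urls):
        data = fetch_image_bytes(url)
        if data is None:
            st.warning(f"Could not load image {index + 1}.")
            continue
        st.image(data)
        st.download_button(
            f"Download image {index + 1}",
            data=data,
            file_name=download_name(index),
            mime="image/png",
        )
```

Passing the opener as a parameter keeps the download logic testable without network access.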
- User enters the prompt: "Winning the Italian GP as a Ferrari Driver"
- The app, using Gemini, modifies the prompt to something like: "Person, a Ferrari driver, celebrates victory at the Italian Grand Prix, the checkered flag waving behind him as he raises his arms in triumph on the podium, surrounded by cheering fans."
- The app uses the modified prompt to generate images with the Replicate model.
- The generated images, showing Person as an F1 driver winning the Italian GP, are displayed to the user.
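The example flow above can be sketched as a small pipeline in which the Gemini rewrite and the Replicate call are passed in as plain callables. The instruction wording, the function names, and the fallback to the original prompt are all hypothetical; the sketch shows the shape of the flow, not the app's actual prompt or API calls.

```python
from typing import Callable, Iterable, List, Tuple

SUBJECT = "Person"  # the subject the fine-tuned model was trained to include


def build_rewrite_instruction(user_prompt: str, subject: str = SUBJECT) -> str:
    """Compose the text sent to the language model (wording is illustrative)."""
    return (
        f"Rewrite the following scene description so that {subject} is "
        "clearly visible and actively participating in the scene. "
        "Return only the rewritten prompt.\n\n"
        f"Scene: {user_prompt.strip()}"
    )


def run_pipeline(
    user_prompt: str,
    rewrite: Callable[[str], str],
    generate: Callable[[str], Iterable[str]],
) -> Tuple[str, List[str]]:
    """Rewrite the prompt, then generate images from the rewritten prompt.

    `rewrite` stands in for the Gemini call and `generate` for the Replicate
    call; both are injected so the flow can be exercised without API keys.
    """
    if not user_prompt.strip():
        raise ValueError("prompt must not be empty")
    modified = rewrite(build_rewrite_instruction(user_prompt)).strip()
    if not modified:
        # Assumed fallback: use the original prompt if the rewrite is empty.
        modified = user_prompt.strip()
    return modified, list(generate(modified))
```

With stand-in callables, `run_pipeline("Winning the Italian GP as a Ferrari Driver", rewrite, generate)` returns the modified prompt together with whatever `generate` yields for it.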
This project is licensed under the MIT License.