VisionFlux 🎬

VisionFlux is an AI video generation platform that pairs a cloud-hosted GPU backend with a cinematic local user interface. It uses Stable Diffusion for text-to-image generation and RIFE (Real-Time Intermediate Flow Estimation) for frame interpolation, turning simple text prompts into smooth short video clips.

🚀 Project Overview

VisionFlux was built to solve a specific challenge: how to make resource-intensive AI video generation models (requiring 12GB+ VRAM) accessible to users without high-end local hardware.

The solution involves a hybrid architecture:

Frontend: A local, highly responsive React application with a "Netflix-style" cinematic aesthetic.
Backend: A Python FastAPI server that runs on Google Colab (or any high-end GPU server), utilizing free cloud GPUs for the heavy lifting.
Tunneling: Ngrok creates a secure tunnel, allowing the local frontend to communicate seamlessly with the remote Colab backend.

🛠️ Tech Stack

Frontend

Framework: React (Vite)
Language: TypeScript
Styling: Tailwind CSS, CSS Modules
UI Components: shadcn/ui, Radix UI
Animations: CSS Keyframes, Framer Motion (planned)
State Management: React Hooks

Backend (Colab/Python)

Server: FastAPI, Uvicorn
AI Models:
- Stable Diffusion: For generating keyframes from text prompts.
- RIFE: For interpolating frames to create smooth motion (60fps-like smoothness).
Libraries: PyTorch, Diffusers, OpenCV, Pillow, NumPy.
Infrastructure: Google Colab (T4 GPU), Ngrok.

💡 Key Features

Text-to-Video: Transform text prompts into animated sequences.
Cinematic UI: Dark mode, glassmorphism, and immersive video backgrounds.
Smart Interpolation: Uses RIFE to fill in gaps between generated frames, resulting in fluid motion rather than a slideshow effect.
Cloud-Local Bridge: Seamlessly connects a local web app to a remote cloud GPU.
Downloadable Assets: Save generated videos as GIFs or MP4s (planned).

🚧 Challenges & Solutions

1. The "Cold Start" & Connectivity Problem

Challenge: Connecting a frontend running on localhost to a Google Colab instance whose public Ngrok address changes every session.
Solution: Implemented a dynamic connection tab where users paste their unique Ngrok URL. The frontend stores this in localStorage so it survives page reloads until the user replaces it with the next session's URL.
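A pasted URL often arrives with stray whitespace, a trailing slash, or no scheme, and any of these can produce malformed request URLs. The frontend itself is TypeScript; the check below is sketched in Python only to keep the examples in this README in one language. The function names and the /generate path are hypothetical and are not taken from the repository.

```python
from urllib.parse import urlsplit


def normalize_backend_url(raw: str) -> str:
    """Return a clean base URL (scheme + host), or raise ValueError."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("backend URL is empty")
    # Assume https when the user pastes a bare host name.
    if "://" not in candidate:
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable backend URL: {raw!r}")
    # Keep scheme and host[:port]; drop any path, query, or trailing slash.
    return f"{parts.scheme}://{parts.netloc}"


def endpoint(base_url: str, path: str) -> str:
    """Join a base URL and a path without producing a double slash."""
    return normalize_backend_url(base_url) + "/" + path.lstrip("/")


# Hypothetical usage: "/generate" is a placeholder route name.
print(endpoint(" https://xxxx.ngrok-free.app/ ", "/generate"))
```

Normalizing once, at the moment the URL is saved, means every later request can append a path without re-checking the stored value.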

2. CORS (Cross-Origin Resource Sharing) Hell

Challenge: Under the same-origin policy, browsers block requests from localhost to a remote Ngrok domain unless the server responds with the appropriate CORS headers.
Solution: Configured CORSMiddleware in FastAPI to allow all origins (*) and, crucially, added an explicit OPTIONS handler to satisfy browser preflight checks, resolving 405 Method Not Allowed errors.

3. Video Smoothness

Challenge: Stable Diffusion generates static images. Simply sequencing them creates a jerky, flickering video.
Solution: Integrated RIFE (Real-Time Intermediate Flow Estimation). We generate "key" frames with Stable Diffusion and then use RIFE to synthesize the intermediate frames, which smooths the transitions significantly.
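The keyframe-then-interpolate idea can be sketched without any model. The code below inserts one synthesized frame between every adjacent pair and repeats that for a number of passes, so n keyframes become (n - 1) * 2^passes + 1 frames. It does not call RIFE: average_blend is a stand-in that averages pixel values, and both function names are assumptions for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def interpolate_sequence(
    keyframes: Sequence[Frame],
    midpoint: Callable[[Frame, Frame], Frame],
    passes: int = 1,
) -> List[Frame]:
    """Insert one synthesized frame between each adjacent pair, `passes` times."""
    frames = list(keyframes)
    for _ in range(passes):
        if len(frames) < 2:
            break  # nothing to interpolate between
        doubled = [frames[0]]
        for prev, nxt in zip(frames, frames[1:]):
            doubled.append(midpoint(prev, nxt))
            doubled.append(nxt)
        frames = doubled
    return frames


def average_blend(a: List[float], b: List[float]) -> List[float]:
    """Stand-in for a RIFE call: a plain per-pixel average, not optical flow."""
    return [(x + y) / 2 for x, y in zip(a, b)]


# Three tiny two-pixel "keyframes", two interpolation passes.
keyframes = [[0.0, 0.0], [4.0, 8.0], [8.0, 0.0]]
smooth = interpolate_sequence(keyframes, average_blend, passes=2)
print(len(smooth))  # (3 - 1) * 2**2 + 1 = 9
```

A plain average produces cross-fades rather than motion, which is the reason for using a flow-based interpolator; only the midpoint function would need to change to plug in a real model.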

📦 Installation & Setup

Prerequisites

Node.js & npm
Python 3.10+ (for local backend only)
Google Account (for Colab backend)

1. Frontend Setup

cd frontend
npm install
npm run dev

The app will open at http://localhost:8080/.

2. Backend Setup (Google Colab)

Open the provided VisionFlux_Backend.ipynb (or create a new notebook).
Paste the server script (found in backend/colab_server_optimized.py).
Run the cell.
Copy the Ngrok Public URL (e.g., https://xxxx.ngrok-free.app).

3. Connecting

Open the VisionFlux frontend.
Go to the Create page.
Paste the Ngrok URL in the Connection tab.
Start generating!

📂 Project Structure

VISIONFlux/
├── frontend/                 # React Application
│   ├── src/
│   │   ├── components/       # UI Components (Showcase, Footer, etc.)
│   │   ├── pages/            # Page Views (Create, Index)
│   │   └── App.tsx           # Main Router
│   └── tailwind.config.js    # Styling Config
│
├── backend/                  # Python Server Logic
│   ├── colab_server_optimized.py # The script to run in Colab
│   ├── app.py                # Local development server
│   └── LOCAL_SETUP.md        # Guide for local GPU setup
│
└── README.md                 # This file

🔮 Future Roadmap

User Accounts: Save generation history.
Advanced Settings: Control guidance scale, seed, and negative prompts.
Upscaling: Integrate Real-ESRGAN for 4K output.
Audio: Generate background music based on the prompt.

Swathi-88/VISIONFLUX
