A twin-stream deepfake detection system using Vision Transformers (ViT-B/16 / MobileNetV3) to analyze both facial and background inconsistencies.
- Twin-Stream Analysis: Separately analyzes the face and background of an image.
- Transformer Models: Uses ViT-B/16 (Vision Transformer) for high-accuracy feature extraction.
- Smart Thresholding:
  - Face Priority: Strict check for facial artifacts (threshold: 0.4).
  - Background Check: Low-sensitivity check for obvious background anomalies (threshold: 0.05).
- Real-Time UI: Modern Next.js interface with instant feedback.
The system employs a Twin-Stream Network approach:
```mermaid
graph TD
    A[Input Image] --> B{Segmentation}
    B -->|Extract| C[Face Region]
    B -->|Extract| D[Background Region]
    C --> E[Face Transformer ViT]
    D --> F[Background Transformer ViT]
    E --> G[Face Score]
    F --> H[Background Score]
    G --> I{Decision Logic}
    H --> I
    I -->|Face < 0.4| J[DEEPFAKE DETECTED]
    I -->|BG < 0.05| J
    I -->|Else| K[AUTHENTIC MEDIA]
    style J fill:#ff4d4d,stroke:#333,stroke-width:2px,color:white
    style K fill:#00cc66,stroke:#333,stroke-width:2px,color:white
```
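The decision logic in the diagram can be sketched in Python. This is a minimal illustration using the two thresholds stated above; the function name and signature are assumptions, not the backend's actual API:

```python
def classify(face_score: float, bg_score: float,
             face_threshold: float = 0.4, bg_threshold: float = 0.05) -> str:
    """Combine the two stream scores into a verdict.

    Either stream can flag the image on its own: the face check is
    strict (threshold 0.4), while the background check only catches
    obvious anomalies (threshold 0.05).
    """
    if face_score < face_threshold or bg_score < bg_threshold:
        return "Deepfake Detected"
    return "Authentic Media"
```

A low score from either stream is enough to flag the image, which is why the face threshold dominates in practice while the background check acts as a coarse backstop.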
| Authentic Media | Deepfake Detected |
|---|---|
| ![]() | ![]() |
- `Frontend/`: Next.js web application (the user interface).
- `Backend/`: Flask API (the AI engine).
- `Backend/Model Files/`: Contains the trained Keras models (`face_transformer.keras`, `bg_transformer.keras`).
- Python 3.8+
- Node.js 18+
```bash
cd Backend
pip install -r requirements.txt
```

```bash
cd Frontend/be_fr-master
npm install
# OR
pnpm install
```

Simply double-click `run_project.bat` in the main folder.
- It automatically starts the Backend server.
- It starts the Frontend UI.
- It opens your default browser to the application.
1. Start Backend:

   ```bash
   cd Backend
   python backend.py
   ```

2. Start Frontend:

   ```bash
   cd Frontend/be_fr-master
   npm run dev
   ```

- Open the application (http://localhost:3000).
- Upload an image (drag & drop or click to select).
- The system will automatically:
- Segment the Face and Background.
- Analyze both using the Transformer models.
- Display the result ("Authentic Media" or "Deepfake Detected").
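The segmentation step can be illustrated with a small NumPy sketch. The function name, the bounding-box input, and the zero-masking strategy are assumptions for illustration; the actual backend may use a dedicated face detector and a different masking scheme:

```python
import numpy as np

def split_face_background(image: np.ndarray, box: tuple) -> tuple:
    """Split an image into a face crop and a background copy with the
    face region blanked out.

    `box` = (top, left, bottom, right) is a hypothetical bounding box,
    e.g. produced by any off-the-shelf face detector.
    """
    top, left, bottom, right = box
    face = image[top:bottom, left:right].copy()   # crop for the face stream
    background = image.copy()
    background[top:bottom, left:right] = 0        # mask the face out
    return face, background
```

The two outputs would then be resized and fed to the face and background transformer models, respectively.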
The system has been evaluated on the DeepfakeTIMIT and Celeb-DF datasets.
| Metric | Score |
|---|---|
| Face Model Accuracy | 67.0% |
| Background Model Accuracy | 83.0% |
| Combined System Accuracy | 81.0% |
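Given per-image stream scores and ground-truth labels, the combined system accuracy could be recomputed as follows. This is a sketch only: the label encoding (1 = deepfake, 0 = authentic) and function name are assumptions:

```python
import numpy as np

def combined_accuracy(face_scores, bg_scores, labels,
                      face_threshold=0.4, bg_threshold=0.05) -> float:
    """Fraction of samples where the combined threshold decision
    matches the ground-truth label (1 = deepfake, 0 = authentic)."""
    face_scores = np.asarray(face_scores)
    bg_scores = np.asarray(bg_scores)
    # An image is flagged when either stream falls below its threshold.
    preds = (face_scores < face_threshold) | (bg_scores < bg_threshold)
    return float(np.mean(preds == np.asarray(labels).astype(bool)))
```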
| Accuracy Comparison | Confusion Matrix |
|---|---|
| ![]() | ![]() |
Since this project is a proof of concept, here is how it could be scaled and improved in the future:
- Cloud Deployment: Deploying the Flask backend to AWS Lambda/GCP for auto-scaling.
- Video Support: Extending the pipeline to process video files frame-by-frame.
- Browser Extension: Building a Chrome extension to automatically scan images on social media.
- Mobile App: Porting the UI to React Native for mobile deepfake detection.
- API Gateway: Creating a public API for other developers to use our detection engine.
This project is licensed under the MIT License - see the LICENSE file for details.