TikTok_Responsible_AI_Filtration

Overview

This application provides an API service that downloads a video from a URL, converts it to MP3, transcribes the audio with OpenAI's Whisper model, and uses Anthropic's Claude model to generate JSON from the transcription. Users can also download the transcription as a text file.
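The flow described above can be sketched as a chain of four steps. This is a minimal illustration under stated assumptions, not the code in app.py: `run_pipeline` and its four step parameters are hypothetical names, and the real steps (download, MP3 conversion, Whisper, Claude) are passed in as plain functions so the sketch runs without those dependencies.

```python
from typing import Callable, Dict


def run_pipeline(
    url: str,
    download: Callable[[str], str],    # URL -> path of the downloaded video
    to_mp3: Callable[[str], str],      # video path -> MP3 path
    transcribe: Callable[[str], str],  # MP3 path -> transcription text
    to_json: Callable[[str], Dict],    # transcription text -> JSON-like dict
) -> Dict:
    """Run the four stages in order and return both outputs."""
    video_path = download(url)
    audio_path = to_mp3(video_path)
    text = transcribe(audio_path)
    return {"transcription": text, "json": to_json(text)}


if __name__ == "__main__":
    # Stand-in steps for illustration only; the real ones call external tools.
    result = run_pipeline(
        "https://example.com/video",
        download=lambda u: "uploads/video.mp4",
        to_mp3=lambda p: p.rsplit(".", 1)[0] + ".mp3",
        transcribe=lambda p: f"(transcription of {p})",
        to_json=lambda t: {"length": len(t)},
    )
    print(result)
```

Passing the steps in as functions keeps each stage replaceable and lets the ordering be tested without network access or model calls.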

Link to Devpost: https://devpost.com/software/safestream-responsibleai-for-personalized-content-filtering

Features

- Download video from a given URL.
- Convert video to MP3 format.
- Transcribe audio using OpenAI's Whisper model.
- Generate JSON transcription using Anthropic's Claude model.
- Return transcription as a JSON response.
- Download transcription as a text file.
- Backend support from: https://github.com/harjyotbagga/TikTokMask

Installation

Prerequisites

- Python 3.8 or higher
- pip (Python package installer)
- FFmpeg (for video to audio conversion)

Steps

Clone the repository:

```
git clone git@github.com:sarthakg04/TikTok_Responsible_AI_Filtration.git
cd TikTok_Responsible_AI_Filtration
```

Create and activate a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install required packages:
```
pip install -r requirements.txt
```
Set up environment variables by creating a .env file in the root directory of the project:
```
API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
```
Ensure FFmpeg is installed and available in your PATH.
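The setup above can be checked with a short script before starting the app. This is a sketch and not part of the repository: `preflight` is a hypothetical helper, and it assumes the variables from .env have already been loaded into the process environment (for example by a dotenv loader).

```python
import os
import shutil
from typing import List, Mapping, Optional

REQUIRED_KEYS = ("API_KEY", "ANTHROPIC_API_KEY")  # names from the .env example above


def preflight(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means setup looks fine."""
    env = os.environ if env is None else env
    problems = [
        f"missing environment variable: {key}"
        for key in REQUIRED_KEYS
        if not env.get(key)
    ]
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```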

Usage

Running the Application

```
python app.py
```

API Endpoints

Upload Video for Transcription

- Endpoint: /upload
- Method: POST
- Request Body: JSON with the following structure:
```
{
  "url": "video_url"
}
```
- Response: JSON with the transcription text or an error message.

Download Transcription

- Endpoint: /transcription/<filename>
- Method: GET
- Description: Returns the transcription text file for the given filename.

Get JSON Transcription

- Endpoint: /getjson/<filename>
- Method: GET
- Description: Returns the transcription in JSON format for the given filename.
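A client for the three endpoints above might look like the following sketch, which uses only the Python standard library. The base URL is an assumption (this README does not state the host or port), and `build_upload_request`, `upload`, `transcription_url`, and `getjson_url` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # assumption: host and port are not stated in this README


def build_upload_request(video_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /upload request with the JSON body shown above."""
    body = json.dumps({"url": video_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload(video_url: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response; requires the app to be running."""
    with urllib.request.urlopen(build_upload_request(video_url, base_url)) as resp:
        return json.load(resp)


def transcription_url(filename: str, base_url: str = BASE_URL) -> str:
    """URL for GET /transcription/<filename>, with the filename percent-encoded."""
    return f"{base_url}/transcription/{quote(filename)}"


def getjson_url(filename: str, base_url: str = BASE_URL) -> str:
    """URL for GET /getjson/<filename>, with the filename percent-encoded."""
    return f"{base_url}/getjson/{quote(filename)}"
```

With the server running, `upload("https://example.com/video")` would return the decoded response, and the two URL helpers can be passed to `urllib.request.urlopen` to fetch the text or JSON output.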

File Structure

.
├── app.py                  # Main application file
├── compute_engine.py       # Custom module for generating JSON from transcription
├── requirements.txt        # Required Python packages
├── .env                    # Environment variables
├── uploads/                # Directory to store downloaded videos and transcriptions
└── data/
    └── prompt.txt          # Prompt file for generating JSON

Demo

Warning: These images are intended for sample purposes only.

Here are some demo images showing the application in action:

OUTPUT JSON FOR MASKING

{
  "contentPreferences": {
    "hateSpeech": {
      "enabled": true,
      "subCategories": {
        "homophobia": false,
        "otherHateSpeech": true,
        "racism": false,
        "sexism": false,
        "transphobia": false,
        "xenophobia": true
      }
    },
    "offensiveContent": {
      "enabled": true,
      "subCategories": {
        "disturbingImages": false,
        "graphicContent": false,
        "hateSpeech": true,
        "insensitiveComments": true,
        "offensiveJokes": false,
        "profanity": false
      }
    },
    "politicalContent": {
      "enabled": true,
      "subCategories": {
        "centrism": false,
        "electionCoverage": false,
        "extremism": true,
        "leftWing": false,
        "politicalDebates": false,
        "rightWing": true
      }
    },
    "racialContent": {
      "enabled": true,
      "subCategories": {
        "culturalAppropriation": false,
        "racialDiscrimination": true,
        "racialEquality": false,
        "racialHistory": true,
        "racistRemarks": false
      }
    },
    "religiousContent": {
      "enabled": true,
      "subCategories": {
        "atheism": false,
        "buddhism": false,
        "christianity": false,
        "hinduism": true,
        "interfaithDialogues": false,
        "islam": true
      }
    },
    "sexualityAndGenderIssues": {
      "enabled": true,
      "subCategories": {
        "feminism": false,
        "genderIdentity": false,
        "lgbtq+": false,
        "relationshipAdvice": false,
        "sexEducation": false,
        "sexualHealth": false
      }
    }
  }
}
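A consumer of this JSON typically needs the list of subcategories to act on. The sketch below assumes that `true` marks a subcategory as flagged and that a category with `"enabled": false` is skipped entirely; this README does not spell out those semantics, and `flagged_subcategories` is a hypothetical helper.

```python
import json
from typing import Dict, List


def flagged_subcategories(payload: Dict) -> Dict[str, List[str]]:
    """Map each enabled category to the sorted subcategories whose value is true."""
    result: Dict[str, List[str]] = {}
    for category, spec in payload.get("contentPreferences", {}).items():
        if not spec.get("enabled", False):
            continue  # assumption: disabled categories are skipped
        result[category] = sorted(
            name for name, on in spec.get("subCategories", {}).items() if on
        )
    return result


if __name__ == "__main__":
    sample = json.loads("""
    {"contentPreferences": {"hateSpeech": {"enabled": true,
      "subCategories": {"homophobia": false, "otherHateSpeech": true, "xenophobia": true}}}}
    """)
    print(flagged_subcategories(sample))
    # {'hateSpeech': ['otherHateSpeech', 'xenophobia']}
```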

Acknowledgements

- OpenAI Whisper model for speech recognition.
- Anthropic Claude model for natural language understanding and generation.
- MoviePy for video processing.
