ACMA (AI Content Moderation Analysis) is an advanced AI-driven content moderation system designed to detect and analyze toxicity, inappropriate visuals, violence, and other harmful content in various media types including text, images, audio, and video. This system helps maintain safe online environments by enforcing community guidelines, legal compliance, and ethical standards while respecting user privacy and freedom of expression.
This project was developed as a final year project to demonstrate the application of machine learning and computer vision techniques in content moderation.
- Text Moderation: Detects toxic language and analyzes sentiment using NLP techniques
- Image Moderation: Extracts text from images using OCR, classifies visual content for inappropriate material (porn, hentai, sexy content), and detects violence
- Audio Moderation: Transcribes speech to text and analyzes for toxicity
- Video Moderation: Samples video frames to detect nudity and violence, and analyzes the audio track for toxic content
- Toxicity detection using machine learning classifiers
- Image classification for NSFW content using deep learning models
- Violence detection in images and videos
- OCR (Optical Character Recognition) for text extraction from images
- Speech-to-text conversion for audio analysis
- Sentiment analysis for text content
- Real-time content analysis through web interface
- Backend: Python Flask
- Machine Learning: TensorFlow, Keras, scikit-learn
- Computer Vision: OpenCV, EasyOCR
- Natural Language Processing: NLTK
- Audio Processing: SpeechRecognition, MoviePy
- Frontend: HTML, CSS (Tailwind CSS), JavaScript
- Data Processing: NumPy, Joblib
```mermaid
graph TD
    A[User Input: <br/> Text, Image, Audio, or Video] --> B{Input Type?}
    B -->|Text| C[Preprocess Text <br/> Remove special characters]
    C --> D[TF-IDF Vectorization]
    D --> E[Toxicity Classifier <br/> Predict Toxic/Non-Toxic]
    E --> F[Sentiment Analysis <br/> Positive/Negative/Neutral]
    F --> G[Return Result]
    B -->|Image| H[OCR Text Extraction <br/> From Image]
    H --> I[Preprocess Extracted Text]
    I --> J[TF-IDF Vectorization]
    J --> K[Toxicity Check on Text]
    K --> L[Image Classification <br/> NSFW Detection <br/> porn/hentai/sexy]
    L --> M[Violence Detection <br/> in Image]
    M --> N{All Checks Pass?}
    N -->|Yes| O[Return: Can be published]
    N -->|No| P[Return: Cannot be published]
    O --> G
    P --> G
    B -->|Audio| Q[Speech to Text <br/> Transcription]
    Q --> R[Preprocess Transcribed Text]
    R --> S[TF-IDF Vectorization]
    S --> T[Toxicity Classifier]
    T --> G
    B -->|Video| U[Extract Frames <br/> Every 3 seconds]
    U --> V[Classify Each Frame <br/> NSFW Detection]
    V --> W[Calculate Average <br/> NSFW Percentages]
    W --> X[Extract Audio Track]
    X --> Y[Speech to Text <br/> Transcription]
    Y --> Z[Preprocess Text]
    Z --> AA[TF-IDF Vectorization]
    AA --> BB[Toxicity Check on Audio]
    BB --> CC{Video Safe?}
    CC -->|Yes| DD[Can be Published]
    CC -->|No| EE[Cannot be Published]
    DD --> G
    EE --> G
    G --> FF[Display Result <br/> to User]
```
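The branching above maps naturally onto a single Flask route that inspects the request. The following is a minimal sketch, assuming the documented `/detect_toxicity` endpoint; the `analyze_*` helpers are hypothetical stubs standing in for the per-media pipelines described later in this README, not the project's actual code:

```python
# Minimal sketch of the input-type dispatch in the flowchart above.
# The analyze_* helpers are hypothetical stubs standing in for the
# text/image/audio/video pipelines described later in this README.
import os
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)

IMAGE_EXTS = {".png", ".jpg", ".jpeg"}
AUDIO_EXTS = {".wav", ".mp3"}
VIDEO_EXTS = {".mp4", ".avi"}

def analyze_text(text):   # stub: TF-IDF + toxicity classifier + sentiment
    return {"text": text, "toxicity_result": "Non-Toxic"}

def analyze_image(path):  # stub: OCR + NSFW + violence checks
    return {"toxicity_result": "Can be published"}

def analyze_audio(path):  # stub: speech-to-text + toxicity check
    return {"toxicity_result": "Non-Toxic"}

def analyze_video(path):  # stub: frame sampling + audio toxicity check
    return {"toxicity_result": "Can be published"}

@app.route("/detect_toxicity", methods=["POST"])
def detect_toxicity():
    # Text input takes priority; otherwise dispatch on the file extension
    if request.form.get("text"):
        return jsonify(analyze_text(request.form["text"]))
    file = request.files.get("file")
    if file is None:
        return jsonify({"error": "no input provided"}), 400
    os.makedirs("uploads", exist_ok=True)
    path = os.path.join("uploads", secure_filename(file.filename))
    file.save(path)
    ext = os.path.splitext(path)[1].lower()
    if ext in IMAGE_EXTS:
        return jsonify(analyze_image(path))
    if ext in AUDIO_EXTS:
        return jsonify(analyze_audio(path))
    if ext in VIDEO_EXTS:
        return jsonify(analyze_video(path))
    return jsonify({"error": "unsupported file type"}), 400
```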
```
ACMA/
├── app.py                    # Main Flask application
├── toxicity_classifier.pkl   # Trained toxicity detection model
├── tfidf_vectorizer.pkl      # TF-IDF vectorizer for text processing
├── IMG_MODEL.299x299.h5      # Image & video classification model
├── VIOLENCE_DETECTION.h5     # Violence detection model
├── template/                 # HTML templates
│   ├── index.html            # Home page
│   ├── predict.html          # Content analysis interface
│   └── aboutus.html          # About page
├── static/                   # Static assets
│   ├── img/                  # Images and icons
│   ├── scripts/              # JavaScript files
│   └── styles/               # CSS stylesheets
└── uploads/                  # Directory for uploaded files
```
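At startup, `app.py` presumably loads these artifacts once. A minimal sketch, assuming joblib for the pickled scikit-learn objects and Keras for the `.h5` models (the variable names are illustrative):

```python
# Sketch of how the bundled model artifacts could be loaded at startup
# (file names from the project tree above; the loading code is an assumption).
import joblib
from tensorflow.keras.models import load_model

toxicity_clf = joblib.load("toxicity_classifier.pkl")  # scikit-learn classifier
vectorizer = joblib.load("tfidf_vectorizer.pkl")       # fitted TfidfVectorizer
img_model = load_model("IMG_MODEL.299x299.h5")         # NSFW classifier, 299x299 input
violence_model = load_model("VIOLENCE_DETECTION.h5")   # violence detector
```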
Text Moderation:
- Input text is preprocessed (special characters are removed)
- Features are extracted using TF-IDF vectorization
- The toxicity classifier predicts whether the content is toxic
- Sentiment analysis labels the text as positive, negative, or neutral (see the sketch after this list)
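A minimal sketch of this pipeline, assuming the pickled scikit-learn artifacts shipped with the project and NLTK's VADER analyzer; the `moderate_text` helper, its thresholds, and the toxic-label handling are illustrative assumptions, not the project's exact code:

```python
# Sketch of the text pipeline, assuming the pickled scikit-learn
# vectorizer/classifier shipped with the project and NLTK's VADER
# sentiment analyzer; thresholds and label handling are assumptions.
import re
import joblib
from nltk.sentiment import SentimentIntensityAnalyzer

vectorizer = joblib.load("tfidf_vectorizer.pkl")
classifier = joblib.load("toxicity_classifier.pkl")
sia = SentimentIntensityAnalyzer()  # needs nltk.download('vader_lexicon')

def moderate_text(text: str) -> dict:
    cleaned = re.sub(r"[^a-zA-Z0-9\s]", "", text).lower()  # drop special characters
    features = vectorizer.transform([cleaned])
    is_toxic = classifier.predict(features)[0] == 1        # assumes 1 == toxic
    compound = sia.polarity_scores(cleaned)["compound"]    # VADER score in [-1, 1]
    sentiment = ("Positive" if compound > 0.05
                 else "Negative" if compound < -0.05
                 else "Neutral")
    return {"toxicity_result": "Toxic" if is_toxic else "Non-Toxic",
            "sentiment": sentiment}
```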
Image Moderation:
- OCR extracts any embedded text from the image
- The extracted text is analyzed for toxicity via the text pipeline
- The image is classified by a deep learning model for inappropriate (NSFW) content
- A separate model checks for violent content
- The content is flagged as "Cannot be published" if any check fails (see the sketch after this list)
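A sketch of these checks, assuming EasyOCR for text extraction and the two bundled Keras models. The class indices, the violence-model input size, and the decision thresholds are assumptions for illustration (only the 299x299 NSFW input size is hinted by the model file name):

```python
# Sketch of the image checks, assuming EasyOCR and the bundled Keras models;
# class indices, the violence-model input size, and thresholds are assumptions.
import cv2
import easyocr
import numpy as np
from tensorflow.keras.models import load_model

reader = easyocr.Reader(["en"])
nsfw_model = load_model("IMG_MODEL.299x299.h5")
violence_model = load_model("VIOLENCE_DETECTION.h5")
UNSAFE_CLASSES = [0, 1, 3]  # hypothetical indices for porn/hentai/sexy

def moderate_image(path: str) -> str:
    # 1. OCR any embedded text; the real flow feeds this to the text pipeline
    extracted = " ".join(reader.readtext(path, detail=0))
    # (toxicity check on `extracted` omitted here; see moderate_text above)

    # 2. NSFW classification on a 299x299 input
    img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    x = cv2.resize(img, (299, 299)).astype("float32") / 255.0
    nsfw_scores = nsfw_model.predict(x[np.newaxis])[0]

    # 3. Violence detection (input size here is a guess)
    v = cv2.resize(img, (128, 128)).astype("float32") / 255.0
    violent = violence_model.predict(v[np.newaxis])[0][0] > 0.5

    if nsfw_scores[UNSAFE_CLASSES].max() > 0.7 or violent:  # illustrative threshold
        return "Cannot be published"
    return "Can be published"
```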
Audio Moderation:
- Speech recognition converts the audio to text
- The transcribed text is analyzed for toxicity using the same text pipeline (a transcription sketch follows this list)
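A sketch of the transcription step using the SpeechRecognition library; the Google Web Speech recognizer is shown here as one possible backend, which may or may not be the one the project uses:

```python
# Sketch of the audio step using the SpeechRecognition library
# (Google Web Speech recognizer shown as one possible backend).
import speech_recognition as sr

def transcribe(path: str) -> str:
    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:  # expects WAV/AIFF/FLAC input
        audio = recognizer.record(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:        # speech was unintelligible
        return ""

# The transcription is then fed into the same text pipeline (moderate_text).
```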
Video Moderation:
- Video frames are sampled at regular intervals (every 3 seconds)
- Each frame is analyzed for nudity/inappropriate content
- The audio track is extracted and analyzed for toxicity
- The video is flagged if the sampled frames or the audio contain prohibited content (see the sketch after this list)
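A sketch of the two video steps, assuming OpenCV for frame sampling and MoviePy (1.x import path) for audio extraction; the helper names and the output path are illustrative:

```python
# Sketch of the video pipeline: sample a frame every 3 seconds with OpenCV,
# then pull the audio track via MoviePy for transcription and toxicity checks.
import cv2
from moviepy.editor import VideoFileClip  # moviepy 1.x import path

def sample_frames(path: str, every_sec: float = 3.0):
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30  # fall back if FPS is unreadable
    step = int(fps * every_sec)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)  # each frame then goes through moderate_image
        idx += 1
    cap.release()
    return frames

def extract_audio(path: str, out_wav: str = "uploads/audio.wav") -> str:
    clip = VideoFileClip(path)
    clip.audio.write_audiofile(out_wav)  # then transcribe + toxicity-check
    clip.close()
    return out_wav
```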
- Python 3.7+
- pip package manager
- Hardware: Windows 10 or later, at least 8 GB RAM, a CPU of 2 GHz or faster, and 500 GB of HDD/SSD storage; a GPU is recommended for faster performance.
- Software: Python 3.11.5 (recommended) or higher and TensorFlow 2.12.0 (recommended).
- Libraries: flask, tensorflow, keras, easyocr, opencv-python, SpeechRecognition, joblib, numpy, nltk, moviepy (text cleaning uses the re module from the Python standard library).
- IDE: VS Code or another Python IDE for running the project.
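For reference, a requirements.txt covering the libraries above might look like the following; only the TensorFlow pin comes from the recommendation above, and the unpinned entries are assumptions:

```
flask
tensorflow==2.12.0
keras
easyocr
opencv-python
SpeechRecognition
joblib
numpy
nltk
moviepy
```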
Install the required packages using the provided requirements.txt file:

```bash
pip install -r requirements.txt
```

Download the required NLTK data:

```python
import nltk
nltk.download('vader_lexicon')
```

Ensure all model files are present in the root directory:

- toxicity_classifier.pkl
- tfidf_vectorizer.pkl
- IMG_MODEL.299x299.h5
- VIOLENCE_DETECTION.h5
- Open the project folder (ACMA).
- Right-click inside the folder and open it with VS Code.
- Run the Flask application in the terminal:

```bash
python app.py
```

- Ctrl+Click the link shown in the terminal to follow it:

  http://localhost:5000

- You will be directed to the ACMA front end, where you can analyze content.
- Home Page: Overview of the system and its features
- Classify Page: Upload content for analysis
- Enter text directly or upload files (images, audio, video)
- Click "Test" to analyze the content
- View results showing detected text and toxicity status
The system provides a REST API endpoint:

```
POST /detect_toxicity
```

Parameters:

- `text` (optional): Text content to analyze
- `file` (optional): File upload (image, audio, or video)

Response format:

```json
{
  "text": "extracted or input text",
  "toxicity_result": "Toxic/Non-Toxic/Cannot be published"
}
```
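As a usage illustration (not part of the project's own docs), calling the endpoint from Python with the requests library against a locally running instance might look like this:

```python
# Hypothetical client for the /detect_toxicity endpoint described above,
# assuming the app is running locally on port 5000.
import requests

BASE = "http://localhost:5000"

# Text analysis
resp = requests.post(f"{BASE}/detect_toxicity", data={"text": "sample comment"})
print(resp.json())  # e.g. {"text": "sample comment", "toxicity_result": "Non-Toxic"}

# File analysis (image, audio, or video)
with open("sample.jpg", "rb") as f:
    resp = requests.post(f"{BASE}/detect_toxicity", files={"file": f})
print(resp.json())
```

The vision of ACMA is to create safer online communities by: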
- Maintaining user safety and well-being
- Enforcing community guidelines
- Ensuring legal compliance
- Protecting privacy and ethical standards
- Leveraging AI and machine learning for efficient moderation
- Balancing content control with freedom of expression
This project may contain explicit language, adult themes, or sensitive material, including audio, video, images, and text. Such content is included solely for testing purposes within the project.
Developers: Aditya Singh, Harshit Saxena, Ayush Sharma, Ayush Vishnoi
College: MIT Moradabad, India
© 2023-Present ACMA - All rights reserved.
This project demonstrates the integration of multiple AI technologies for comprehensive content moderation and is intended for educational and research purposes.
