A deployable, AI-powered classroom monitoring solution using Computer Vision and Natural Language Processing. It captures attendance, tracks student attentiveness, and auto-generates lecture notes from classroom sessions.
✅ Face Detection and Recognition for Attendance
✅ Attentiveness Tracking
✅ Audio Transcription & Summarization
✅ Deployable Server-Based Pipeline
✅ Streamlit-Based Demo UI
- Computer Vision: YuNet (OpenCV), YOLOv11 for face detection
- Deep Learning: Custom facial points-based model (planned)
- Transcription: Vosk Small EN (vosk-model-small-en-us-0.15)
- Summarization: LLM via Groq (llama3-8b-8192)
- Database: Cloud-based CSV (PostgreSQL planned)
- Web UI: Streamlit
- Deployment: Lightning AI (UI), Flask + ngrok (server)
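For orientation, here is a minimal sketch of YuNet face detection through OpenCV's `FaceDetectorYN` API; the ONNX model path and the sample frame are illustrative assumptions, not files from this repo.

```python
import cv2

# Load the YuNet detector (model file name is an assumption).
detector = cv2.FaceDetectorYN.create(
    "face_detection_yunet_2023mar.onnx", "", (320, 320)
)

frame = cv2.imread("classroom_frame.jpg")   # hypothetical sample frame
h, w = frame.shape[:2]
detector.setInputSize((w, h))               # detector input must match the frame size

_, faces = detector.detect(frame)           # faces: N x 15 (box, landmarks, score)
for face in (faces if faces is not None else []):
    x, y, bw, bh = face[:4].astype(int)
    cv2.rectangle(frame, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
```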
```mermaid
graph TD;
    Camera-->Face_Detection;
    Face_Detection-->Attendance;
    Face_Detection-->Attentiveness;
    Microphone-->Transcription;
    Transcription-->Summarization;
    Attendance-->Cloud_Database;
    Attentiveness-->Cloud_Database;
    Summarization-->Cloud_Database;
    Cloud_Database-->Dashboard;
    Dashboard-->User;
```
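As a concrete reference for the Transcription node above, a minimal sketch using the Vosk model named in the tech stack; the WAV file name is an assumption, and the repo's actual pipeline code may differ.

```python
import json
import wave
from vosk import Model, KaldiRecognizer

# Load the small English model listed in the tech stack.
model = Model("vosk-model-small-en-us-0.15")

wf = wave.open("lecture_audio.wav", "rb")   # hypothetical 16 kHz mono WAV
rec = KaldiRecognizer(model, wf.getframerate())

text = []
while True:
    data = wf.readframes(4000)
    if not data:
        break
    if rec.AcceptWaveform(data):            # a complete utterance was decoded
        text.append(json.loads(rec.Result())["text"])
text.append(json.loads(rec.FinalResult())["text"])
print(" ".join(text))
```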
- Clone the repository:

  ```bash
  git clone https://github.com/manodeepray/minor_project
  cd minor_project
  ```

- (Recommended) Create and activate a virtual environment:

  ```bash
  python3 -m venv venv
  source venv/bin/activate   # For Linux/macOS
  venv\Scripts\activate      # For Windows
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Install system packages:
  - Linux:

    ```bash
    sudo apt install ffmpeg -y
    ```

  - Windows (PowerShell): download and install from [FFmpeg.org](https://ffmpeg.org)

- (Optional, Linux) Install `tmux` for background terminal session management:

  ```bash
  sudo apt install tmux
  ```
- Create a `.env` file and add your Groq API key:

  ```
  GROQ_API_KEY="YOUR_API_KEY"
  ```
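To check that the key is picked up, a quick sketch assuming `python-dotenv`; whether the repo loads the `.env` this way is an assumption.

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv (assumed)

load_dotenv()                          # reads .env from the working directory
api_key = os.environ["GROQ_API_KEY"]   # raises KeyError if the key is missing
```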
- Add your dataset (one folder per person, named after them and containing their face images).
- Update the dataset path in `src/training/train_face_rec.py`.
- Generate a YOLO-compatible dataset:

  ```bash
  python src/training/get_training_data.py
  ```
- Train the model:

  ```bash
  bash scripts/train.sh
  ```
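`scripts/train.sh` is the supported entry point. For orientation only, a YOLOv11 fine-tuning call through the Ultralytics API might look like the sketch below; the checkpoint name, `data.yaml` path, and hyperparameters are assumptions.

```python
from ultralytics import YOLO  # assumes Ultralytics as the YOLOv11 backend

# Start from a pretrained YOLOv11 nano checkpoint (name is an assumption).
model = YOLO("yolo11n.pt")
model.train(data="data.yaml", epochs=50, imgsz=640)

metrics = model.val()  # evaluate on the validation split defined in data.yaml
```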
- Ensure `tmux` is installed (optional).
- Start the servers:

  ```bash
  bash scripts/run_servers.sh
  ```

  Or manually, in separate terminals:

  ```bash
  python processor.py
  python server.py
  ```
- Expose your local Flask server to edge devices:

  ```bash
  ngrok http 5000
  ```
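For context, a minimal Flask sketch of the kind of endpoint `server.py` could expose on port 5000; the route, payload, and handler here are hypothetical, not the repo's actual API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical endpoint; the repo's real routes live in server.py.
@app.route("/attendance", methods=["POST"])
def attendance():
    frame = request.files.get("frame")  # edge device uploads a camera frame
    # ... run face detection / recognition on the frame ...
    return jsonify({"status": "received"})

if __name__ == "__main__":
    app.run(port=5000)  # the port ngrok exposes above
```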
To change the summarization LLM or prompt behavior, edit `src/pipelines/core/llm_integration.py`:

```python
# src/pipelines/core/llm_integration.py
GROQ_LLM_MODEL_ID = "llama3-8b-8192"
```

Modify the prompt inside:

```python
# src/pipelines/core/llm_integration.py
def generate_notes(...):
    prompt = f"""
    Please summarize the main points and create structured class notes, including key topics, subpoints, and any important details in an .md format
    1. I want to store them in a .md file
    2. If no context, write: "No transcription found"
    context:
    {transcription}
    """
```
- Full system demo:

  ```bash
  streamlit run apps/demo_app.py
  ```

- Explore saved data:

  ```bash
  streamlit run apps/files_ui.py
  ```
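As a hint of what a viewer like `apps/files_ui.py` might do, a minimal Streamlit sketch over a saved CSV; the file name is an assumption.

```python
import pandas as pd
import streamlit as st

st.title("Saved session data")

# Hypothetical attendance export; the repo's actual file layout may differ.
df = pd.read_csv("attendance.csv")
st.dataframe(df)
```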
- Face Recognition Accuracy: 91.67% (Top-1)
- Attentiveness Tracking (see the sketch after this list):

  Attentiveness = (frames student is visible) / (total frames in session)

- Lecture Note Quality: subjective human evaluations based on prompt quality and LLM output
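The attentiveness formula translates directly into code; the frame counts in the example are made up.

```python
def attentiveness(visible_frames: int, total_frames: int) -> float:
    """Fraction of session frames in which the student's face is detected."""
    return visible_frames / total_frames if total_frames else 0.0

# Example: visible in 1,350 of 1,800 frames
print(attentiveness(1350, 1800))  # 0.75
```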
🔹 Integrate custom facial points-based model for better recognition
🔹 Switch to more robust LLMs and refine summarization prompts
🔹 Enable real-time alerts for low attentiveness scenarios
🔹 Add PostgreSQL support for real-time dashboards
- Manodeep Ray (Project Lead)
This project is licensed under the Creative Commons (CC) License.