No. | Student ID | Full Name | Role |
---|---|---|---|
1 | 22520775 | Nguyễn Xuân Linh | Team Leader |
2 | 22521015 | Huỳnh Văn Nhật | Member |
3 | 22521581 | Nguyễn Thanh Trường | Member |
- Course Name: Information Retrieval.
- Course Code: CS336.
- Class Code: CS336.P11.
- Academic Year: Semester 1 (2024 - 2025).
- Lecturer: M.Sc. Đỗ Văn Tiến.
This repository contains a multimodal video retrieval system. The backend searches text data in Elasticsearch and falls back to backup files when Elasticsearch is unavailable. The system supports queries over OCR and ASR data, and it uses CLIP and FAISS for efficient image and text retrieval.
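To illustrate the CLIP and FAISS retrieval idea, here is a minimal sketch, not the project's actual code. It ranks frames by cosine similarity between a query embedding and frame embeddings, which is the ranking an exact inner-product index (for example FAISS `IndexFlatIP` over L2-normalized vectors) would be expected to return. Pure Python stands in for FAISS so the example runs without dependencies, and the 3-dimensional vectors, frame IDs, and the `top_k` helper are hypothetical placeholders for real CLIP features.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so the inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, frame_embeddings, k=2):
    """Return the k (frame_id, score) pairs with the highest cosine similarity."""
    q = normalize(query)
    scored = [
        (frame_id, sum(a * b for a, b in zip(q, normalize(emb))))
        for frame_id, emb in frame_embeddings.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy vectors standing in for CLIP image embeddings (hypothetical data).
frame_embeddings = {
    "video1/frame_0010": [0.9, 0.1, 0.0],
    "video1/frame_0250": [0.0, 1.0, 0.0],
    "video2/frame_0042": [0.7, 0.7, 0.1],
}
# Toy vector standing in for the CLIP embedding of a text query.
query_embedding = [1.0, 0.2, 0.0]

results = top_k(query_embedding, frame_embeddings, k=2)
for frame_id, score in results:
    print(f"{frame_id}: {score:.3f}")
```

With real data, the brute-force loop would be replaced by an index built once over all frame embeddings and queried per request.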
- Search text data in Elasticsearch.
- Fallback to backup files if Elasticsearch is unavailable.
- Fuzzy search support in backup data.
- Handles video metadata including video paths and frame ranges.
- Utilizes CLIP for image and text feature extraction.
- Uses FAISS for efficient similarity search and retrieval.
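The fallback and fuzzy-search behaviour listed above can be sketched as follows. This is an illustrative example under stated assumptions, not the backend's implementation: `es_search` is a hypothetical callable wrapping the Elasticsearch query, the record fields (`text`, `video_path`, `frame_range`) are assumed names, and fuzzy matching uses the standard library's `difflib` rather than whatever the project actually uses. The stub raises Python's built-in `ConnectionError`; a real client raises its own exception types, which would need to be caught instead.

```python
from difflib import SequenceMatcher


def fuzzy_search(query, records, threshold=0.6):
    """Return backup records that approximately match the query, best match first."""
    query = query.lower()
    hits = []
    for record in records:
        words = record["text"].lower().split()
        # Score against each word so a typo in a single query term still matches.
        score = max(
            (SequenceMatcher(None, query, w).ratio() for w in words),
            default=0.0,
        )
        if score >= threshold:
            hits.append((score, record))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [record for _, record in hits]


def search_text(query, es_search, backup_records):
    """Try the primary search first; fall back to backup records if it is unreachable."""
    try:
        return es_search(query)
    except ConnectionError:
        return fuzzy_search(query, backup_records)


# Hypothetical backup data: recognized text plus video path and frame range.
backup_records = [
    {"text": "breaking news weather", "video_path": "videos/a.mp4", "frame_range": [120, 180]},
    {"text": "football match highlights", "video_path": "videos/b.mp4", "frame_range": [0, 90]},
]


def es_unavailable(query):
    """Stub that simulates Elasticsearch being down (assumption for this sketch)."""
    raise ConnectionError("Elasticsearch is not reachable")


# The misspelled query still finds the weather clip through the fuzzy fallback.
results = search_text("wether", es_unavailable, backup_records)
print(results)
```

Passing the search function in as a parameter keeps the fallback logic testable without a running Elasticsearch instance.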
Below is a brief description of the project's folder structure:
IR
├── App
│ ├── AIC_Backend
│ ├── AIC_Frontend
├── ASR
│ ├── asr-feature-extract.ipynb
├── Extract_frame
│ ├── extract_frame.py
│ ├── extract_representative_frames.py
├── Load_to_elastic
│ ├── load-elastic-asr.ipynb
│ ├── load-elastic-ocr.ipynb
├── Scene_text
│ ├── Detection
│ ├── Recognition
├── .gitignore
├── requirements.txt
- IR: Root folder of the project.
  - App: Main folder containing the core components of the application.
    - AIC_Backend: Folder containing backend source code.
    - AIC_Frontend: Folder containing frontend source code.
  - ASR: Folder related to Automatic Speech Recognition (ASR) processing.
    - asr-feature-extract.ipynb: Notebook for extracting ASR features.
  - Extract_frame: Folder containing scripts for frame extraction.
    - extract_frame.py: Script to extract frames from videos.
    - extract_representative_frames.py: Script to extract representative frames.
  - Load_to_elastic: Folder containing notebooks for loading data into Elasticsearch.
    - load-elastic-asr.ipynb: Notebook for loading ASR data into Elasticsearch.
    - load-elastic-ocr.ipynb: Notebook for loading OCR data into Elasticsearch.
  - Scene_text: Folder related to text processing in images.
    - Detection: Folder containing text detection source code.
    - Recognition: Folder containing text recognition source code.
  - .gitignore: File listing files and folders that Git should ignore.
  - requirements.txt: File listing the libraries required to run the project.
- Python 3.12
- Conda for environment management
- FAISS with GPU support (faiss-gpu), if needed
git clone https://github.com/xlinh2301/IR_2024_CS336.P11.git
cd IR_2024_CS336.P11/App/AIC_Backend
link: https://drive.google.com/file/d/1kJmzaSRtawGoxAGuw5lQjzHJGO6fdcFr/view?usp=drive_link
Create a Conda environment with the required dependencies:
conda create --name video_search_backend python=3.12
conda activate video_search_backend
pip install -r requirements.txt
conda install -c conda-forge -c nvidia faiss-gpu
uvicorn main:app --reload
cd App/AIC_Frontend
npm install