Information Retrieval

No.	Student ID	Full Name	Role
1	22520775	Nguyễn Xuân Linh	Team Leader
2	22521015	Huỳnh Văn Nhật	Member
3	22521581	Nguyễn Thanh Trường	Member

COURSE INTRODUCTION

Course Name: Information Retrieval.
Course Code: CS336.
Class Code: CS336.P11.
Academic Year: Semester 1 (2024 - 2025).
Lecturer: Đỗ Văn Tiến, M.Sc.

Overview

This repository contains the system for multimodal video retrieval. The backend searches text data with Elasticsearch and falls back to backup files when Elasticsearch is unavailable. The system supports queries over OCR and ASR data. It also uses CLIP for feature extraction and FAISS for efficient image and text retrieval.
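To illustrate the retrieval step, the sketch below ranks frame embeddings against a query embedding by cosine similarity. It is a brute-force, pure-Python stand-in and not the project's code: the real system uses CLIP to produce the vectors and FAISS to search them, and the README does not say which FAISS index type is used. The frame identifiers and the three-dimensional vectors are made up for the example (CLIP embeddings have hundreds of dimensions).

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, index, k=3):
    """Return the k (frame_id, score) pairs most similar to the query."""
    q = normalize(query)
    scored = []
    for frame_id, vec in index.items():
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, frame_id))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(frame_id, score) for score, frame_id in scored[:k]]


# Hypothetical frame embeddings; in the real pipeline these would come from CLIP.
frame_embeddings = {
    "video01/frame_0010": [0.9, 0.1, 0.0],
    "video01/frame_0250": [0.1, 0.9, 0.1],
    "video02/frame_0042": [0.7, 0.3, 0.1],
}

# Hypothetical query embedding (for example, an encoded text prompt).
results = top_k([1.0, 0.0, 0.0], frame_embeddings, k=2)
for frame_id, score in results:
    print(frame_id, round(score, 4))
```

An exact inner-product search over L2-normalized vectors produces the same ranking as this loop, which is why normalizing embeddings before indexing is a common choice. A dedicated index matters once the number of frames makes a linear scan too slow.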

Features

Search text data in Elasticsearch.
Fallback to backup files if Elasticsearch is unavailable.
Fuzzy search support in backup data.
Handles video metadata including video paths and frame ranges.
Utilizes CLIP for image and text feature extraction.
Uses FAISS for efficient similarity search and retrieval.
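The fallback and fuzzy-search features above can be sketched as follows. This is a minimal illustration under stated assumptions, not the backend's implementation: the record fields (text, video_path, frame_start, frame_end), the function names, the 0.6 threshold, and the use of difflib for fuzzy matching are all hypothetical. The primary search is passed in as a callable so the example runs without an Elasticsearch server.

```python
import difflib

# Hypothetical backup records; field names and paths are assumptions.
BACKUP_RECORDS = [
    {"text": "breaking news weather forecast",
     "video_path": "videos/news_clip.mp4", "frame_start": 120, "frame_end": 180},
    {"text": "football match final score",
     "video_path": "videos/sports_clip.mp4", "frame_start": 0, "frame_end": 90},
    {"text": "cooking show pasta recipe",
     "video_path": "videos/cooking_clip.mp4", "frame_start": 300, "frame_end": 420},
]


def fuzzy_search_backup(query, records, threshold=0.6):
    """Return records containing a word that approximately matches the query."""
    q = query.lower()
    hits = []
    for record in records:
        # Best similarity between the query and any single word of the text.
        best = max(
            (difflib.SequenceMatcher(None, q, word).ratio()
             for word in record["text"].lower().split()),
            default=0.0,
        )
        if best >= threshold:
            hits.append((best, record))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [record for _, record in hits]


def search_text(query, primary_search, records):
    """Try the primary search first; fall back to backup data on connection failure."""
    try:
        return primary_search(query)
    except ConnectionError:
        # Assumption: with a real Elasticsearch client, catch that client's
        # own connection error class here instead of the built-in one.
        return fuzzy_search_backup(query, records)


def unavailable_elasticsearch(query):
    """Stand-in for a search call made while Elasticsearch is down."""
    raise ConnectionError("Elasticsearch is not reachable")


# The misspelled query still finds the weather record through the fallback.
results = search_text("wether", unavailable_elasticsearch, BACKUP_RECORDS)
for record in results:
    print(record["video_path"], record["frame_start"], record["frame_end"])
```

Each hit carries its video path and frame range, so the caller can jump to the matching segment regardless of which search path produced the result.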

Project Folder Structure

Below is a brief description of the project's folder structure:

IR
├── App
│   ├── AIC_Backend
│   └── AIC_Frontend
├── ASR
│   └── asr-feature-extract.ipynb
├── Extract_frame
│   ├── extract_frame.py
│   └── extract_representative_frames.py
├── Load_to_elastic
│   ├── load-elastic-asr.ipynb
│   └── load-elastic-ocr.ipynb
├── Scene_text
│   ├── Detection
│   └── Recognition
├── .gitignore
└── requirements.txt

Description of Key Folders and Files

IR: Root folder of the project.
- App: Main folder containing the core components of the application.
  - AIC_Backend: Folder containing backend source code.
  - AIC_Frontend: Folder containing frontend source code.
- ASR: Folder related to Automatic Speech Recognition (ASR) processing.
  - asr-feature-extract.ipynb: Notebook for extracting ASR features.
- Extract_frame: Folder containing scripts for frame extraction.
  - extract_frame.py: Script to extract frames from videos.
  - extract_representative_frames.py: Script to extract representative frames.
- Load_to_elastic: Folder containing notebooks for loading data into Elasticsearch.
  - load-elastic-asr.ipynb: Notebook for loading ASR data into Elasticsearch.
  - load-elastic-ocr.ipynb: Notebook for loading OCR data into Elasticsearch.
- Scene_text: Folder related to text processing in images.
  - Detection: Folder containing text detection source code.
  - Recognition: Folder containing text recognition source code.
- .gitignore: File listing the files and folders that Git should ignore.
- requirements.txt: File listing the libraries needed to run the project.
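As an illustration of what loading OCR or ASR data into Elasticsearch involves, the sketch below builds a request body in the newline-delimited JSON format that the Elasticsearch Bulk API expects: one action line followed by one document line per record, with a trailing newline. The notebooks' actual code is not shown in this README, so the index name "ocr", the document fields, and the helper name build_bulk_body are assumptions.

```python
import json


def build_bulk_body(index_name, documents):
    """Serialize documents as a Bulk API body (action line + source line each)."""
    lines = []
    for doc_id, doc in enumerate(documents):
        # Action line: tells Elasticsearch to index the next line as a document.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        # Source line: the document itself; keep non-ASCII text readable.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical OCR records: one recognized text string per extracted frame.
ocr_documents = [
    {"video": "video01", "frame": 120, "text": "xin chào"},
    {"video": "video01", "frame": 250, "text": "thời tiết hôm nay"},
]

bulk_body = build_bulk_body("ocr", ocr_documents)
print(bulk_body)
```

The resulting string can be sent to the `_bulk` endpoint with the `application/x-ndjson` content type. Batching documents this way avoids one HTTP round trip per frame.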

Requirements

Python 3.12
Conda for environment management
FAISS GPU build (faiss-gpu), only if GPU-accelerated search is needed

Setup for Backend

1. Clone the Repository

git clone https://github.com/xlinh2301/IR_2024_CS336.P11.git
cd IR_2024_CS336.P11/App/AIC_Backend

2. Download the data and move it into app/data

Link: https://drive.google.com/file/d/1kJmzaSRtawGoxAGuw5lQjzHJGO6fdcFr/view?usp=drive_link

3. Setup with Conda Environment

Create a Conda environment with the required dependencies:

conda create --name video_search_backend python=3.12
conda activate video_search_backend

Install Dependencies

pip install -r requirements.txt
conda install -c conda-forge -c nvidia faiss-gpu
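After installation, a quick check like the one below can confirm that the environment matches the requirements before starting the server. It is a convenience sketch, not part of the repository: the module list is an assumption based on the tools this README mentions (faiss, uvicorn) plus likely dependencies (elasticsearch, fastapi), so adjust it to match requirements.txt.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 12)
# Assumed import names; edit to match the packages in requirements.txt.
MODULES_TO_CHECK = ("faiss", "elasticsearch", "fastapi", "uvicorn")


def check_environment(modules=MODULES_TO_CHECK, required=REQUIRED_PYTHON):
    """Report whether the Python version and each module are available."""
    report = {"python_ok": sys.version_info[:2] >= required}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'OK' if ok else 'MISSING'}")
```

Run it inside the activated Conda environment. Any MISSING line points to a package that still needs to be installed.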

4. Run the Application

uvicorn main:app --reload

Setup for Frontend

cd App/AIC_Frontend
npm install
