Scene Understanding with YouTube 8M Dataset

Overview

The YouTube-8M Segments dataset, released in June 2019, provides human-verified segment-level labels on approximately 237,000 segments across 1,000 classes. The segments are drawn from the validation set of the YouTube-8M dataset.

Dataset Statistics

Frame Level Data Size: 1.71 TB
Number of Shards: 3,844
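
As a rough planning aid, the two figures above imply an average shard size of about 445 MB. The sketch below assumes that TB means decimal terabytes (10^12 bytes) and that the 1.71 TB total is spread over exactly the 3,844 shards listed; the source confirms neither assumption.

```python
# Back-of-the-envelope shard size from the statistics above.
# Assumptions (not stated in the source): 1 TB = 10**12 bytes, and the
# 1.71 TB total covers exactly the 3,844 shards listed.
TOTAL_BYTES = 1.71 * 10**12
NUM_SHARDS = 3_844

avg_shard_bytes = TOTAL_BYTES / NUM_SHARDS
avg_shard_mb = avg_shard_bytes / 10**6

print(f"average shard size: about {avg_shard_mb:.0f} MB")
```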

Data Schema

The data is organized with the following schema:

"video-id": Unique identifier for each video.
"labels": A list of labels associated with that video.

Each frame in the dataset includes the following features:

"rgb": Float array of length 1,024.
"audio": Float array of length 128.
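
The schema above can be sketched as a pair of plain Python data classes that check the feature lengths on construction. This is an illustration only: the class names, the `frames` key, and the use of integer class indices for labels are assumptions, not the project's actual loader.

```python
from dataclasses import dataclass, field
from typing import List

RGB_DIM = 1024   # length of the "rgb" feature, per the schema above
AUDIO_DIM = 128  # length of the "audio" feature, per the schema above


@dataclass
class Frame:
    """Features for a single frame."""
    rgb: List[float]
    audio: List[float]

    def __post_init__(self):
        # Reject frames whose feature vectors do not match the schema.
        if len(self.rgb) != RGB_DIM:
            raise ValueError(f"rgb must have length {RGB_DIM}, got {len(self.rgb)}")
        if len(self.audio) != AUDIO_DIM:
            raise ValueError(f"audio must have length {AUDIO_DIM}, got {len(self.audio)}")


@dataclass
class VideoRecord:
    """One video: its identifier, its labels, and its per-frame features."""
    video_id: str      # stored under the key "video-id"
    labels: List[int]  # assumption: labels are integer class indices
    frames: List[Frame] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: dict) -> "VideoRecord":
        # "frames" is a hypothetical key used only for this illustration.
        frames = [Frame(rgb=f["rgb"], audio=f["audio"]) for f in raw.get("frames", [])]
        return cls(video_id=raw["video-id"], labels=list(raw["labels"]), frames=frames)


record = VideoRecord.from_dict({
    "video-id": "demo",
    "labels": [3, 17],
    "frames": [{"rgb": [0.0] * RGB_DIM, "audio": [0.0] * AUDIO_DIM}],
})
print(record.video_id, record.labels, len(record.frames))
```

Validating lengths at construction time surfaces malformed shards early instead of deep inside the model code.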

Implementation Details

We have provided images to illustrate the architecture and visual aspects of our implementation.

Architecture Overview

The diagram illustrates the architecture of our implementation: the components used to process and analyze the YouTube-8M dataset and the flow of data between them.

Context-Gated DBoF Model
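
As a minimal sketch of the two ideas in the model's name: Deep Bag of Frames (DBoF) projects each frame's features into a higher-dimensional space, applies a non-linearity, and pools over frames; Context Gating (Miech et al., listed in the references) then reweights the pooled vector as y = sigmoid(Wx + b) * x, element-wise. The toy dimensions, random weights, and choice of max pooling below are assumptions for illustration and do not reproduce this project's trained model.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dbof_pool(frames, W_proj):
    """Project every frame, apply ReLU, then max-pool over the frame axis."""
    projected = [[max(0.0, z) for z in matvec(W_proj, f)] for f in frames]
    return [max(column) for column in zip(*projected)]


def context_gate(x, W, b):
    """Context Gating: y = sigmoid(W x + b) * x, element-wise."""
    gates = [sigmoid(z + bias) for z, bias in zip(matvec(W, x), b)]
    return [g * v for g, v in zip(gates, x)]


# Toy sizes so the example runs instantly; the real per-frame inputs would be
# the 1,024-d rgb and 128-d audio features.
random.seed(0)
IN_DIM, CLUSTER_DIM, NUM_FRAMES = 8, 16, 5
frames = [[random.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(NUM_FRAMES)]
W_proj = [[random.gauss(0, 0.1) for _ in range(IN_DIM)] for _ in range(CLUSTER_DIM)]
W_gate = [[random.gauss(0, 0.1) for _ in range(CLUSTER_DIM)] for _ in range(CLUSTER_DIM)]
b_gate = [0.0] * CLUSTER_DIM

pooled = dbof_pool(frames, W_proj)            # one vector per video
gated = context_gate(pooled, W_gate, b_gate)  # same shape, reweighted
print(len(pooled), len(gated))
```

Because each gate lies strictly between 0 and 1, gating can only shrink an activation; it lets the model suppress features that are unlikely given the rest of the pooled vector.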

Visualising the results

We use ipywidgets for real-time playback of our predictions.
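
One way such playback could be wired up is a `Play` control linked to a slider, with the top predictions printed for the current time step. The prediction format (one list of class scores per second of video), the helper names, the class names, and the one-second interval are assumptions for illustration; the project's notebook may differ.

```python
try:
    import ipywidgets as widgets
except ImportError:  # lets the helpers below run outside Jupyter
    widgets = None


def top_k_labels(scores, label_names, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(zip(label_names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def render_step(step, scores_per_step, label_names, k=3):
    """Format the top-k predictions for one time step as a line of text."""
    top = top_k_labels(scores_per_step[step], label_names, k)
    body = ", ".join(f"{name} ({score:.2f})" for name, score in top)
    return f"t={step}s: {body}"


def build_player(scores_per_step, label_names, interval_ms=1000):
    """Build a Play button linked to a slider that prints predictions per step."""
    if widgets is None:
        raise RuntimeError("ipywidgets is required for playback")
    last = len(scores_per_step) - 1
    play = widgets.Play(value=0, min=0, max=last, step=1, interval=interval_ms)
    slider = widgets.IntSlider(value=0, min=0, max=last, description="second")
    widgets.jslink((play, "value"), (slider, "value"))  # keep both in sync
    output = widgets.interactive_output(
        lambda step: print(render_step(step, scores_per_step, label_names)),
        {"step": slider},
    )
    return widgets.VBox([widgets.HBox([play, slider]), output])


# Hypothetical scores for a 3-second clip over 4 made-up classes.
label_names = ["Game", "Vehicle", "Concert", "Food"]
scores_per_step = [
    [0.10, 0.70, 0.15, 0.05],
    [0.05, 0.60, 0.30, 0.05],
    [0.05, 0.20, 0.70, 0.05],
]
print(render_step(0, scores_per_step, label_names, k=2))
```

In a notebook, passing the result of `build_player(scores_per_step, label_names)` to `IPython.display.display` would show the controls; pressing play advances the slider once per interval and refreshes the printed predictions.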

References

Dataset: YouTube-8M Dataset
YouTube-8M: A Large-Scale Video Classification Benchmark. Paper
Learnable pooling with Context Gating for video classification. Antoine Miech, Ivan Laptev, and Josef Sivic. Paper
Context-Gated DBoF Models for YouTube-8M. Paul Natsev. 2018. PDF
LinkedIn spark-tfrecord. GitHub Repository
Kafka in Action: Building a Distributed Multi-Video Processing Pipeline with Python and Confluent. Article

yashwanth-alapati/Youtube-8M-Video-Understanding
