dzanchetta/yt8m-tfm-project

TFM Project for UPC

About the project

In a nutshell, this project attempts to improve the YouTube search experience by taking into account signals such as author authority and sentiment analysis of the comments, among others. To do so, we propose processing the transcripts of the selected videos, their comments, and other video metadata such as tags, like count, and dislike count. We use the YouTube-8M dataset, which contains over 6.1 million video IDs.

Datasets

The following data has been obtained using the youtube-dl project:

  • Files with transcript data (English only): 14,366
  • Files with video content: approximately 620,000
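
As a rough illustration of how files like these can be collected, the sketch below builds a youtube-dl command line that skips the video download and saves only the metadata JSON and English subtitles. It is not the project's actual script: the helper names `build_command` and `fetch`, the `data` output directory, and the choice to also request auto-generated subtitles are assumptions made for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(video_id: str, out_dir: str) -> List[str]:
    """Build a youtube-dl command that saves metadata and English subtitles only.

    Hypothetical helper for illustration; the flags are standard youtube-dl options.
    """
    url = f"https://www.youtube.com/watch?v={video_id}"
    return [
        "youtube-dl",
        "--skip-download",       # do not download the video stream itself
        "--write-info-json",     # tags, like count, dislike count, etc.
        "--write-sub",           # uploader-provided subtitles
        "--write-auto-sub",      # auto-generated subtitles (assumption)
        "--sub-lang", "en",      # English only
        "-o", str(Path(out_dir) / "%(id)s.%(ext)s"),
        url,
    ]


def fetch(video_id: str, out_dir: str = "data") -> bool:
    """Run youtube-dl for one video; return True if the command succeeded."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    result = subprocess.run(build_command(video_id, out_dir), check=False)
    return result.returncode == 0
```

Calling `fetch("VIDEO_ID")` with a real YouTube video ID requires youtube-dl to be installed and on the PATH. Not every video has English subtitles, which is one plausible reason the transcript count is much smaller than the video content count.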

All these files are available on Google Drive (only group members have access).

[Under construction]
