Youtube Summarizer with Claude-2, OpenAI & Whisper

Overview

What does it do?

The YouGPTube Summarizer is a Python-based application that utilizes advanced machine learning models from OpenAI and Anthropics to summarize YouTube videos. Given a YouTube URL, it downloads the video, extracts the audio, transcribes it, and then summarizes the content. The summarization can be done using either OpenAI's GPT-3.5 or Anthropics' Claude model.

Why is it useful?

Ever felt overwhelmed by the amount of content in a lengthy YouTube video and wished for a concise summary? The YouGPTube Summarizer can help you get the essence of a video in a fraction of the time it takes to watch it. Moreover, by using Whisper API for transcription, it can transcribe videos in multiple languages and generate summaries, thus breaking the language barrier.

Video Demo

ezgif.com-gif-maker.mp4

The Generations here are 3x fasten up! it took around 3-4 minutes for a video of 16 minutes, but its pretty accurate.

Tech Implementation

Under the hood, the application uses several Python libraries such as streamlit for the web interface, librosa for audio processing, openai for transcription and summarization, and yt_dlp for YouTube video downloading. It has different functions to handle tasks like audio downloading, chunking, transcribing, and summarizing.

Prerequisites

Install ffmpeg

The program uses ffmpeg for audio processing. Make sure to install it in your system. You can install it using the package manager for your system.

For Ubuntu:

sudo apt-get install ffmpeg

For macOS:

brew install ffmpeg

Python Dependencies

All Python dependencies are listed in requirements.txt. You can install them using pip:

pip install -r requirements.txt

API Keys

You'll need to obtain API keys for OpenAI and Anthropics (Claude). Store these keys in .env.example and rename the file to .env.

You can also automatically rename the .env.example file by running the following command:

mv .env.example .env

Running the Application

To run the app, navigate to the directory where the code is located and run:

streamlit run <filename>.py

Code Documentation

Importing Libraries

streamlit: For creating the web interface
os, shutil: For file and directory operations
librosa: For audio processing
openai: For OpenAI API calls
soundfile as sf: For audio file processing
yt_dlp: For downloading YouTube videos
anthropic: For Anthropics (Claude) API
dotenv: For loading environment variables

Functions

`find_audio_files(path, extension=".mp3")`

Finds all audio files in the given path with the specified extension.

`youtube_to_mp3(youtube_url: str, output_dir: str) -> str`

Downloads the YouTube video from the given URL and saves it as an mp3 file in the specified directory.

`chunk_audio(filename, segment_length: int, output_dir)`

Chunks the given audio file into segments of specified length (in seconds) and saves them in the specified directory.

`transcribe_audio(audio_files: list, output_file=None, model="whisper-1") -> list`

Transcribes the given audio files using OpenAI's Whisper model.

`summarize_openai(chunks: list[str], system_prompt: str, model="gpt-3.5-turbo", output_file=None)`

Summarizes the given list of text chunks using OpenAI's GPT-3.5 model.

`summarize_claude(chunks: list[str], system_prompt: str, model="claude-2", output_file=None)`

Summarizes the given list of text chunks using Anthropics' Claude model.

`summarize_youtube_video(youtube_url, outputs_dir, progress_bar, progress_text, summarization_function)`

Main function that orchestrates the summarization process.

Streamlit UI (`main()`)

Streamlit interface for user inputs and displaying summaries.

Customization

You can customize the summarization by changing the system_prompt. This allows you to tailor the summary to your specific needs.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app.py		app.py
packages.txt		packages.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Youtube Summarizer with Claude-2, OpenAI & Whisper

Overview

What does it do?

Why is it useful?

Video Demo

Tech Implementation

Prerequisites

Install ffmpeg

Python Dependencies

API Keys

Running the Application

Code Documentation

Importing Libraries

Functions

`find_audio_files(path, extension=".mp3")`

`youtube_to_mp3(youtube_url: str, output_dir: str) -> str`

`chunk_audio(filename, segment_length: int, output_dir)`

`transcribe_audio(audio_files: list, output_file=None, model="whisper-1") -> list`

`summarize_openai(chunks: list[str], system_prompt: str, model="gpt-3.5-turbo", output_file=None)`

`summarize_claude(chunks: list[str], system_prompt: str, model="claude-2", output_file=None)`

`summarize_youtube_video(youtube_url, outputs_dir, progress_bar, progress_text, summarization_function)`

Streamlit UI (`main()`)

Customization

About

Releases

Packages

Languages

License

agniiva/YoutubeGPTClaude

Folders and files

Latest commit

History

Repository files navigation

Youtube Summarizer with Claude-2, OpenAI & Whisper

Overview

What does it do?

Why is it useful?

Video Demo

Tech Implementation

Prerequisites

Install ffmpeg

Python Dependencies

API Keys

Running the Application

Code Documentation

Importing Libraries

Functions

find_audio_files(path, extension=".mp3")

youtube_to_mp3(youtube_url: str, output_dir: str) -> str

chunk_audio(filename, segment_length: int, output_dir)

transcribe_audio(audio_files: list, output_file=None, model="whisper-1") -> list

summarize_openai(chunks: list[str], system_prompt: str, model="gpt-3.5-turbo", output_file=None)

summarize_claude(chunks: list[str], system_prompt: str, model="claude-2", output_file=None)

summarize_youtube_video(youtube_url, outputs_dir, progress_bar, progress_text, summarization_function)

Streamlit UI (main())

Customization

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

`find_audio_files(path, extension=".mp3")`

`youtube_to_mp3(youtube_url: str, output_dir: str) -> str`

`chunk_audio(filename, segment_length: int, output_dir)`

`transcribe_audio(audio_files: list, output_file=None, model="whisper-1") -> list`

`summarize_openai(chunks: list[str], system_prompt: str, model="gpt-3.5-turbo", output_file=None)`

`summarize_claude(chunks: list[str], system_prompt: str, model="claude-2", output_file=None)`

`summarize_youtube_video(youtube_url, outputs_dir, progress_bar, progress_text, summarization_function)`

Streamlit UI (`main()`)

Packages