Skip to content

MSU-AI/transcript-editor

Repository files navigation

Video Editing Web Application

This web application is designed to make video editing easy and intuitive. It uses a combination of front-end and back-end technologies to provide users with a powerful tool that allows them to edit their videos quickly and efficiently.

Subsystems

The application is made up of several subsystems, each of which is responsible for a different aspect of the editing process. These subsystems work together seamlessly to provide a complete video editing solution.

User Interface

The user interface of the application is built using HTML, CSS, and JavaScript. It provides an intuitive and easy-to-use interface that allows users to upload their videos and begin editing them immediately. The user interface also features an interactive transcript that displays the video's transcribed audio. Users can click on any word in the transcript to jump to that specific location in the video.

Transcription

We initially wanted to use Amazon Transcribe for the transcription, but Amazon charged for usage of the service. The transcription subsystem is powered by Open AI's Whisper model. Specifically, a fork of Whisper that tracks the time of each word in the video. When a user uploads a video, the system automatically transcribes the audio and displays it on the website. The interactive transcript makes it easy for users to navigate through their video and find specific parts of the audio.

Video Timeline

Originally, we planned to use a JavaScript library called amalia.js for the video timeline. However, due to the library's newness, we ultimately decided to build our own timeline using CSS and JavaScript. The timeline allows users to edit their videos quickly and easily, without having to worry about the technical details of the editing process.

Video Editing

The video editing process is handled by the backend using Python's MoviePy library. Users can delete specific words in the transcript to remove portions of the video they don't want to include. The process recombines the video parts that weren't deleting, creating a new video. This process is simple and intuitive, allowing users to make edits quickly and easily.

Technologies Used

HTML
CSS
JavaScript
Python
Django
Open AI Whisper Model
MoviePy Library

We chose these technologies for their flexibility, reliability, and ease of use. HTML, CSS, and JavaScript are widely used and well-documented, making it easy to build a robust and responsive user interface. Python and Django are powerful backend technologies that make it easy to handle complex video editing processes. The Open AI Whisper Model is state-of-the-art technology that provides accurate transcriptions of audio, while the MoviePy Library is a fast and efficient tool for video editing. Conclusion

This web application provides users with a powerful and easy-to-use video editing solution. Its combination of front-end and back-end technologies allows users to edit their videos quickly and easily, without having to worry about the technical details of the editing process. We believe that this application has the potential to revolutionize the way people edit videos, and we look forward to seeing how it will be used in the future.