ConvoCraft: AI-Powered Dialogue Generator with GPT-2 Language Model

This project analyzes dialogue from the Game of Thrones (GoT) series using natural language processing (NLP) techniques. The main objective is to explore how dialogue relates to the characters who speak it and to derive insights from those relationships. The project also investigates the feasibility of using neural networks to generate dialogue scripts.

Required Libraries to Install

Ensure you have the following Python libraries installed:

  • numpy
  • pandas
  • matplotlib
  • scikit-learn
  • seaborn
  • fastai
  • networkx
  • nltk
  • gensim
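
A quick way to confirm the libraries listed above are available is to probe for each one without importing it. The snippet below is a minimal sketch, not part of the project; the `REQUIRED` mapping and the `missing_packages` helper are illustrative names. The one detail worth noting is that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Map each package name from the list above to its import name.
# Only scikit-learn differs: it is imported as "sklearn".
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
    "fastai": "fastai",
    "networkx": "networkx",
    "nltk": "nltk",
    "gensim": "gensim",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```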

Dataset

The raw dialogue was obtained from Genius.com. The dataset consists of CSV files that record each character's dialogue along with its season and chapter. For the detailed data cleaning and preprocessing steps, see the Final_Pipeline.ipynb notebook.
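
As a rough illustration of working with data of this shape, the sketch below counts dialogue rows per character. The column names (`season`, `chapter`, `character`, `dialogue`) and the sample rows are placeholders assumed for the example, not the project's actual schema or data. It uses only the standard library so that it runs without the dependencies above; the notebooks themselves may well use pandas.

```python
import csv
import io
from collections import Counter

# Placeholder rows standing in for one cleaned CSV file. The column names
# (season, chapter, character, dialogue) are assumptions, not the real schema.
SAMPLE_CSV = """season,chapter,character,dialogue
1,1,CHARACTER_A,placeholder line one
1,1,CHARACTER_B,placeholder line two
1,2,CHARACTER_A,placeholder line three
"""


def lines_per_character(csv_file):
    """Count how many dialogue rows each character has."""
    reader = csv.DictReader(csv_file)
    return Counter(row["character"] for row in reader)


counts = lines_per_character(io.StringIO(SAMPLE_CSV))
print(counts.most_common())
```

With a real file, the same function accepts an open file handle, for example one returned by `open(path, newline="")`, where `path` points to one of the cleaned CSV files.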

File Description

  • Final_Pipeline.ipynb: End-to-end data preparation, analysis, modeling, and scoring.
  • GoT dialogue generator.ipynb: Uses a pre-trained model to generate random GoT dialogues.
  • GoT: Raw chapter scripts obtained from Genius.com.
  • CSV: Cleaned CSV files containing character dialogues.

Network Graph

Character interactions were visualized using a network graph generated from the dialogue data.
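
The README does not say how an interaction is defined. One plausible approach, shown here purely as an assumption, is to link two characters whenever they speak back to back within the same chapter and to use the number of such exchanges as the edge weight. The rows and the `interaction_edges` helper are hypothetical.

```python
from collections import Counter
from itertools import groupby

# Placeholder (season, chapter, character) rows, assumed to be in script order.
ROWS = [
    (1, 1, "A"), (1, 1, "B"), (1, 1, "A"), (1, 1, "C"),
    (1, 2, "B"), (1, 2, "C"),
]


def interaction_edges(rows):
    """Weight each unordered character pair by how often they speak back to back."""
    edges = Counter()
    # Group by (season, chapter) so pairs never span a chapter boundary.
    for _, chapter_rows in groupby(rows, key=lambda r: (r[0], r[1])):
        speakers = [r[2] for r in chapter_rows]
        for first, second in zip(speakers, speakers[1:]):
            if first != second:  # skip a character speaking twice in a row
                edges[tuple(sorted((first, second)))] += 1
    return edges


edges = interaction_edges(ROWS)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The resulting weights can be loaded into networkx with `Graph.add_weighted_edges_from`, passing `(a, b, weight)` triples, and then drawn with the library's plotting helpers.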

[Figure: network graph of character interactions]

Dialogue Generation

Dialogue generation with the GPT-2 language model is demonstrated in the GoT dialogue generator.ipynb notebook.

Model Training

To train the model, follow the instructions in the runmeforallwork.ipynb notebook.

GPT-2 Language Model Dialogue Generation Performance

Perplexity

[Figure: Perplexity]

Model Loss

[Figure: Model Loss]
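
Perplexity and loss are closely related: for a language model trained with cross-entropy loss, perplexity is the exponential of the mean per-token loss, provided the loss is measured in nats. The sketch below illustrates that relationship with made-up loss values; it is not the project's evaluation code, and the `perplexity` helper is a hypothetical name.

```python
import math


def perplexity(token_losses):
    """Perplexity = exp(mean per-token cross-entropy), with losses in nats."""
    if not token_losses:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_losses) / len(token_losses))


# Sanity check: a model that assigns probability 1/4 to every correct token
# has loss ln(4) per token, so its perplexity is 4 (up to rounding).
uniform_losses = [math.log(4)] * 10
print(perplexity(uniform_losses))
```

Read this way, a falling loss curve and a falling perplexity curve carry the same information on different scales.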

Generated Dialogue Samples

Dialogue Sample 1

Dialogue Sample 2

Project Contributors