MBTI_Prediction_Model utilizes sentiment analysis and deep learning techniques to predict MBTI personality types from social media text, optimizing model performance through Hyperband tuning.

lesleyzhao/MBTI_Prediction_Model

 
 


SentimentAnalysis_MBTIPrediction

Project Description

Many individuals and companies use the Myers-Briggs Type Indicator (MBTI) to determine their own personality type or that of current and prospective employees. Few state-of-the-art machine learning tools and algorithms have been applied to MBTI personality type prediction, and research on predicting MBTI types from textual data such as social media posts is sparse. This motivates the project, whose goal is to identify the best-performing recurrent neural network for MBTI personality type prediction.

The project focuses on the following:

  • Sentiment analysis of the last 50 social media posts of each sampled individual, with a weighting metric applied to the sentiment score to account for each of the following categories:
    • E (Extrovert) vs. I (Introvert)
    • S (Sensing) vs. N (Intuition)
    • T (Thinking) vs. F (Feeling)
    • J (Judging) vs. P (Perceiving)
  • Development and application of multiple RNN variants to predict MBTI personality types from the results of the sentiment analysis.
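
The four categories above can be treated as four independent binary axes. The README does not say how the notebook represents labels, so the following is only a minimal sketch under that assumption; the names `AXES`, `encode_type`, and `decode_type` are hypothetical.

```python
# The four MBTI dichotomies, in the order the letters appear in a type string.
AXES = (("E", "I"), ("S", "N"), ("T", "F"), ("J", "P"))


def encode_type(mbti):
    """Map a type string such as 'INTJ' to four binary labels.

    A 1 means the first letter of the axis (E, S, T, J),
    a 0 means the second letter (I, N, F, P).
    """
    mbti = mbti.strip().upper()
    if len(mbti) != len(AXES):
        raise ValueError(f"expected a 4-letter MBTI type, got {mbti!r}")
    labels = []
    for letter, (first, second) in zip(mbti, AXES):
        if letter == first:
            labels.append(1)
        elif letter == second:
            labels.append(0)
        else:
            raise ValueError(f"unexpected letter {letter!r} in {mbti!r}")
    return tuple(labels)


def decode_type(labels):
    """Inverse of encode_type: turn four binary labels back into a type string."""
    return "".join(first if bit else second
                   for bit, (first, second) in zip(labels, AXES))


print(encode_type("INTJ"))           # (0, 0, 1, 1)
print(decode_type((1, 1, 0, 0)))     # ESFP
```

Encoding each axis separately would let a per-axis sentiment weight or a per-axis classifier be evaluated on its own; a single 16-class label is the other obvious choice.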

Content of repository:

A single file consisting of:

  • DataPreprocessing:
    • Data Cleaning: Removing unnecessary words and symbols
    • Word Lemmatization: Tokenize words and lemmatize them to their base form
    • Distribution Identification: Plot the data distribution for each MBTI personality class
    • Train-Validation-Test Split on Dataset: Training-75%, Validation-15%, Test-15%
    • Bag of Words Techniques
    • Kaggle Dataset: https://www.kaggle.com/datasets/datasnaek/mbti-type
  • PieChart
    • Determine the distribution of word types (nouns, verbs, etc.)
  • WordCloud
    • Determine the words with the highest weights for each MBTI personality class
  • Hyperparameter Optimization with Hyperband
    • Determine the configuration of hidden units, batch size, dropout, and learning rate that gives the best accuracy for the RNN, LSTM, GRU, and BiLSTM models
    • HyperbandScheduler()
  • RNN vs. RNN with Attention Layer
    • Build RNN and RNN with Attention Layer using the best-accuracy configuration found by Hyperband
    • Compare the performance of both models and determine whether adding the Attention Layer improves it
  • LSTM vs. LSTM with Attention Layer
    • Build LSTM and LSTM with Attention Layer using the best-accuracy configuration found by Hyperband
    • Compare the performance of both models and determine whether adding the Attention Layer improves it
  • GRU vs. GRU with Attention Layer
    • Build GRU and GRU with Attention Layer using the best-accuracy configuration found by Hyperband
    • Compare the performance of both models and determine whether adding the Attention Layer improves it
  • BiLSTM vs. BiLSTM with Attention Layer
    • Build BiLSTM and BiLSTM with Attention Layer using the best-accuracy configuration found by Hyperband
    • Compare the performance of both models and determine whether adding the Attention Layer improves it
  • Comparison of models and model variants with Attention Layer
    • Model with Best Accuracy
    • Performance improvement with Attention Layer
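
The core idea behind Hyperband is successive halving: give many configurations a small training budget, keep the best fraction, and retrain the survivors with a larger budget. The sketch below shows a single successive-halving bracket in plain Python. It does not use the notebook's `HyperbandScheduler()`, and the search-space values, the `toy_validation_accuracy` stand-in, and all function names are illustrative assumptions; only the four hyperparameter names come from the list above.

```python
import math
import random

# Hypothetical search space: the value ranges are assumptions for illustration.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256],
    "batch_size": [16, 32, 64],
    "dropout": [0.1, 0.3, 0.5],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}


def sample_config(rng):
    """Draw one random configuration from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_validation_accuracy(config, epochs):
    """Stand-in for 'train for `epochs` epochs, return validation accuracy'.

    A real run would train an RNN, LSTM, GRU, or BiLSTM here. This toy
    function simply rewards dropout near 0.3 and a learning rate near 1e-3,
    and saturates as the epoch budget grows.
    """
    quality = (
        0.6
        - 0.1 * abs(config["dropout"] - 0.3)
        - 0.02 * abs(math.log10(config["learning_rate"]) + 3)
    )
    return quality * (1.0 - math.exp(-epochs / 5.0))


def successive_halving(configs, min_epochs=1, eta=3):
    """Run one bracket: keep the top 1/eta of configs, multiply the budget by eta."""
    survivors = list(configs)
    epochs = min_epochs
    history = []  # (epoch budget, number of configs evaluated) per rung
    while len(survivors) > 1:
        history.append((epochs, len(survivors)))
        ranked = sorted(
            survivors,
            key=lambda cfg: toy_validation_accuracy(cfg, epochs),
            reverse=True,
        )
        survivors = ranked[: max(1, len(survivors) // eta)]
        epochs *= eta
    return survivors[0], history


rng = random.Random(0)
candidates = [sample_config(rng) for _ in range(27)]
best, history = successive_halving(candidates)
print(history)  # [(1, 27), (3, 9), (9, 3)]
print(best)
```

Full Hyperband runs several such brackets with different trade-offs between the number of configurations and the starting budget, which hedges against discarding configurations that only look good after longer training.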

Result

  • WordCloud

    • One word cloud image per MBTI type: ENTJ, ENTP, ESFJ, ENFJ, ENFP, ESFP, ESTJ, ESTP, INFJ, INFP, INTJ, INTP, ISFJ, ISFP, ISTJ, ISTP
  • Pie Chart

    • One pie chart image per MBTI type: ESFJ, ENTP, ENFJ, ENFP, ENTJ, ESFP, ESTJ, ESTP, INFJ, INFP, INTJ, INTP, ISFJ, ISFP, ISTJ, ISTP
  • Accuracy Comparison

    • RNN: 59.4%
    • RNN with Attention: 59.7%
    • LSTM: 58%
    • LSTM with Attention: 60.5%
    • GRU: 59.2%
    • GRU with Attention: 60.3%
    • BiLSTM: 59.8%
    • BiLSTM with Attention: 61.3%
  • Best Model in terms of accuracy

    • Accuracy on training: BiLSTM
    • Accuracy on test: BiLSTM with Attention
  • Improvement of performance with Attention Layer
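
The improvement from the Attention Layer can be read directly off the Accuracy Comparison figures above. The short script below does only that arithmetic; the dictionary layout and the name `attention_gain` are hypothetical, and the numbers are copied from the list without re-running any model.

```python
# (accuracy without attention, accuracy with attention), in percent,
# copied from the Accuracy Comparison list.
ACCURACY = {
    "RNN": (59.4, 59.7),
    "LSTM": (58.0, 60.5),
    "GRU": (59.2, 60.3),
    "BiLSTM": (59.8, 61.3),
}


def attention_gain(model):
    """Return the accuracy gain from attention, in percentage points."""
    base, with_attention = ACCURACY[model]
    # Round to one decimal to hide floating-point noise in the subtraction.
    return round(with_attention - base, 1)


gains = {model: attention_gain(model) for model in ACCURACY}
for model, gain in gains.items():
    print(f"{model}: +{gain} percentage points")
# RNN: +0.3 percentage points
# LSTM: +2.5 percentage points
# GRU: +1.1 percentage points
# BiLSTM: +1.5 percentage points
```

On these figures every model gains from attention, with the LSTM gaining the most and the plain RNN the least.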
