In this notebook I combine Spotify audio features and BERT word embeddings to predict track sentiment. I use a Hugging Face pre-trained BERT transformer as an embedding layer, and train an additional bidirectional GRU layer for the sentiment-regression task (point prediction in the range [0, 1]). To train the fine-tuning layer of the model I use Spotify's valence attribute, which I added to a lyrics dataset.
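The architecture described above can be sketched roughly as follows. This is a minimal PyTorch illustration, not the notebook's actual code: a plain frozen `nn.Embedding` stands in for the Hugging Face BERT encoder, and all dimensions and the class name `LyricsValenceRegressor` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LyricsValenceRegressor(nn.Module):
    """Sketch: a frozen pre-trained embedding layer feeding a trainable
    bidirectional GRU head that regresses valence into [0, 1].
    NOTE: nn.Embedding is a stand-in here for the real BERT encoder."""

    def __init__(self, vocab_size=30522, embed_dim=768, hidden_dim=128):
        super().__init__()
        # Frozen "pre-trained" embedding layer (BERT stand-in).
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.embedding.weight.requires_grad = False
        # Trainable fine-tuning head: bidirectional GRU + linear + sigmoid.
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 1)

    def forward(self, token_ids):
        x = self.embedding(token_ids)        # (batch, seq, embed_dim)
        _, h = self.gru(x)                   # h: (2, batch, hidden_dim)
        h = torch.cat([h[0], h[1]], dim=-1)  # concatenate both directions
        return torch.sigmoid(self.head(h)).squeeze(-1)  # valence in [0, 1]

model = LyricsValenceRegressor()
dummy_batch = torch.randint(0, 30522, (4, 16))  # 4 "lyrics" of 16 tokens
preds = model(dummy_batch)
print(preds.shape)  # torch.Size([4])
```

The sigmoid output keeps predictions inside [0, 1], matching the range of Spotify's valence label.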
The examples below use the NLTK demo and Spotify valence to measure a track's positiveness. They demonstrate that relying on audio alone or lyrics alone can be inaccurate.
- Positive Sentiment Example: Baz Luhrmann - Everybody's Free To Wear Sunscreen.
- NLTK sentiment classification: Negative.
- Spotify Valence: 0.8.
- Negative Sentiment Example: Otis Redding - Mr. Pitiful.
- NLTK sentiment classification: Negative.
- Spotify Valence: 0.9.
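To compare the two signals directly, it helps to put them on the same scale: NLTK's VADER analyzer returns a compound score in [-1, 1], while Spotify valence lives in [0, 1]. The helper `compound_to_valence_scale` below is a hypothetical name of mine, and the scores are illustrative numbers in the spirit of the examples above, not actual NLTK output.

```python
def compound_to_valence_scale(compound):
    """Map a VADER-style compound score from [-1, 1] onto Spotify's
    valence range [0, 1] so the two measures are directly comparable."""
    return (compound + 1.0) / 2.0

# Illustrative numbers: lyrics read as negative (compound ~ -0.6)
# while the audio is upbeat (Spotify valence 0.9) -- the signals disagree.
lyrics_score = compound_to_valence_scale(-0.6)   # 0.2
audio_valence = 0.9
print(round(abs(audio_valence - lyrics_score), 2))  # 0.7
```

A large gap between the two rescaled scores is exactly the disagreement the Mr. Pitiful example illustrates.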
- Database: gathering song lyrics, adding the Spotify valence attribute, and pre-processing. I uploaded the final result to Kaggle as the 150K Lyrics Labeled with Spotify Valence dataset.
- Model Design: Iteratively improved model capacity.
- Evaluation: loss and accuracy metrics across three buckets - negative, neutral, and positive sentiment.
- Interpretation: Understanding what the model is learning using word clouds.
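The labeling step in the Database stage amounts to joining the scraped lyrics with Spotify's audio features per track. The sketch below uses toy pandas frames; the column names and example rows are illustrative, not the dataset's actual schema.

```python
import pandas as pd

# Toy stand-ins for the real tables: scraped lyrics and Spotify features.
lyrics = pd.DataFrame({
    "artist": ["Otis Redding", "Armin van Buuren"],
    "track":  ["Mr. Pitiful", "Blah Blah Blah"],
    "lyrics": ["...", "..."],
})
audio = pd.DataFrame({
    "artist":  ["Otis Redding", "Armin van Buuren"],
    "track":   ["Mr. Pitiful", "Blah Blah Blah"],
    "valence": [0.9, 0.18],
})

# Attach the valence label to each lyric; an inner join drops tracks
# that could not be matched on the Spotify side.
labeled = lyrics.merge(audio, on=["artist", "track"], how="inner")
print(labeled[["track", "valence"]])
```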
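The bucketed evaluation mentioned above can be sketched as follows. The 1/3 and 2/3 thresholds and the sample scores are my own illustrative assumptions, not the notebook's actual cutoffs or results.

```python
def bucket(v, low=1/3, high=2/3):
    """Map a valence score in [0, 1] to a coarse sentiment bucket.
    The thresholds are illustrative, not the notebook's."""
    return "negative" if v < low else ("positive" if v > high else "neutral")

def bucket_accuracy(y_true, y_pred):
    """Fraction of predictions landing in the same bucket as the label."""
    hits = sum(bucket(t) == bucket(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Illustrative labels and predictions.
y_true = [0.9, 0.18, 0.5, 0.8]
y_pred = [0.76, 0.2, 0.55, 0.4]
print(bucket_accuracy(y_true, y_pred))  # 0.75
```

Reporting accuracy per bucket (rather than raw regression loss alone) shows whether the model confuses, say, neutral and positive tracks more often than negative and positive ones.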
Words in the word cloud are sized by the magnitude of their effect on the model's prediction and colored by whether their influence is positive (green) or negative (red).
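One common way to obtain such per-word influences is leave-one-word-out occlusion: re-score the lyric with each word removed and record the change in the prediction. The sketch below assumes this technique; `word_influences` and the toy lexicon scorer are hypothetical stand-ins for the trained model.

```python
def word_influences(words, predict):
    """Leave-one-word-out influence: for each word, the change in the
    model's prediction when that word is removed.  Positive deltas
    (word pushes valence up) would be drawn green, negative ones red,
    with font size proportional to |delta|."""
    base = predict(words)
    return {w: round(base - predict(words[:i] + words[i + 1:]), 2)
            for i, w in enumerate(words)}

# Toy scorer: a tiny lexicon standing in for the trained model.
LEX = {"free": 0.3, "sunscreen": 0.1, "pitiful": -0.4}
def toy_predict(words):
    return 0.5 + sum(LEX.get(w, 0.0) for w in words)

infl = word_influences(["everybody", "free", "pitiful"], toy_predict)
print(infl)  # {'everybody': 0.0, 'free': 0.3, 'pitiful': -0.4}
```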
- Positive Sentiment Example: Armin van Buuren - Blah Blah Blah.
- NLTK sentiment classification: Negative.
- Spotify Valence: 0.18.
- LyricsAudioBoost Model: 0.76.