The proposed methodology can be applied to characterize playlists in terms of popularity and semantic diversity, allowing the comparative analysis of human-generated and algorithm-generated playlists in different contexts such as historical periods, platforms and musical genres. We find it extremely valuable to compare different playlist datasets, as doing so helps us understand how changes in the listening experience are affecting playlist creation strategies.
This repository contains the code to reproduce the results of our paper:
Lorenzo Porcaro, Emilia Gómez (2019). 20 Years of Playlists: A Statistical Analysis on Popularity and Diversity. Paper to be presented at the 20th Conference of the International Society for Music Information Retrieval (ISMIR 2019), Delft, The Netherlands, 4-8 November.
lorenzo.porcaro at gmail.com
git clone https://github.com/MTG/playlists-stat-analysis.git
cd playlists-stat-analysis/src/
git clone https://github.com/oliviaguest/gini
Create a virtual environment (tested on Python 3.5), then run the following command to install the dependencies (make sure you are in the src folder):
pip install -r requirements.txt
The next step takes between 5 and 10 minutes and needs around 2 GB of free disk space:
mkdir ../data
./download_datasets.sh
Check data/README.md
For the Last.fm tags and tag embeddings, write to
lorenzo.porcaro at gmail.com
For instance, to analyze the AOTM dataset, run the following commands:
python playlist_popularity.py -d AOTM
python playlist_diversity.py -d AOTM
python playlist_qualia.py -d AOTM
Plot the tag embeddings using the t-SNE algorithm:
python plot_embeddings_tsne.py -d AOTM
For the CORN dataset, playlist_diversity.py takes about 20 minutes because its playlists are long on average.
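To run all of the scripts for more than one dataset, the commands above could be wrapped in a small driver. The following is only a sketch and is not part of the repository: the script names and the -d flag are taken from the commands above, while the helper names build_commands and run_all are hypothetical. It assumes it is run from the src folder with the virtual environment active.

```python
import subprocess
import sys

# Script names taken from the commands shown above.
SCRIPTS = [
    "playlist_popularity.py",
    "playlist_diversity.py",
    "playlist_qualia.py",
    "plot_embeddings_tsne.py",
]


def build_commands(dataset, python=sys.executable):
    """Return one argument list per script for the given dataset name.

    Hypothetical helper: it only assembles the command lines and does
    not check that the dataset has been downloaded.
    """
    return [[python, script, "-d", dataset] for script in SCRIPTS]


def run_all(datasets):
    """Run every script for every dataset, stopping at the first failure."""
    for dataset in datasets:
        for cmd in build_commands(dataset):
            print("Running:", " ".join(cmd))
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(cmd, check=True)


# Example call (commented out so that nothing runs on import):
# run_all(["AOTM", "CORN"])
```

Running the scripts sequentially keeps the output easy to follow. With CORN in the list, expect the diversity step to dominate the total run time.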