Simple Spotify listening history comparison
This is a 3-step process:
- Individual users retrieve their Spotify listening history (see "takeify")
- User histories are consolidated, and track features are accessed from the Spotify API (see "apiify")
- Features undergo PCA dimensionality reduction, then are visualized based on user tag (see "clusterify")
- I have not yet implemented a user interface for the data retrieval step. Eventually, the whole process will take just a few clicks; for now, please follow the instructions in the "takeify" section below to retrieve your data.
- I intend to modify my dimensionality reduction process so that each component is influenced by no more than three features, and no two components are influenced by the same feature. My goal is to be able to visualize this data in three dimensions, with each axis conveying clear and interpretable information (as opposed to the current visualization, in which each axis represents a weighted combination of every feature).
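For readers curious about what the PCA step does, here is a minimal sketch using NumPy. It is an illustration, not the code in "clusterify": the function name `pca_project`, the standardization step, the feature columns, the values, and the user names other than "Leah" are all assumptions made up for this example.

```python
import numpy as np

def pca_project(features, n_components=3):
    """Project rows of `features` onto their top principal components."""
    X = np.asarray(features, dtype=float)
    # Center and scale each column so features on different scales
    # (e.g. tempo vs. a 0-1 score) contribute comparably. This
    # standardization is an assumption, not something the README states.
    X = X - X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    X = X / std
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, singular_values, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:n_components]
    explained = singular_values**2 / np.sum(singular_values**2)
    return X @ components.T, components, explained[:n_components]

# Made-up feature rows; the four columns are placeholders for track features.
tracks = [
    [0.80, 0.70, 0.90, 120.0],
    [0.30, 0.20, 0.10, 80.0],
    [0.60, 0.90, 0.50, 140.0],
    [0.40, 0.30, 0.30, 95.0],
    [0.75, 0.65, 0.80, 118.0],
]
users = ["Leah", "Leah", "Sam", "Sam", "Alex"]  # hypothetical user tags

coords, components, explained = pca_project(tracks, n_components=3)
for user, point in zip(users, coords):
    print(user, np.round(point, 2))
```

Each row of `components` holds one weight per feature, which is why every axis in the current visualization is a weighted combination of all features.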
If you are retrieving your data to be analyzed but not analyzing it yourself, takeify.ipynb is the only file you will need to use. Open takeify.ipynb, then select "Open In Colab" at the top of the code preview. This will take you to Google Colab, where you will run the code.
- Setup
Run the code in the "Setup" section. You will be asked to confirm that you want to connect to Google Drive. Please connect to your personal Drive - this is necessary so that your output can be saved and easily shared with whoever is executing the later steps.
- Taking Your Data
In the "Give me your name" section, please put your name in the quotes. For example, that line of code would read as follows after I modify it:
user = "Leah"
Next, run the "Let me into your account" section. You should see a brief prompt, a URL, and an empty text box. Click on the URL.
If you are already logged into Spotify: You should be taken to a page that says "Example Domain". Copy the URL of this page from your browser's address bar, paste it into the empty text box, and press enter.
If you are not yet logged into Spotify: You will see a login page. Log in, and then you should be taken to a page that says "Example Domain". Copy the URL of this page from your browser's address bar, paste it into the empty text box, and press enter.
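If you are wondering why the pasted URL matters: in Spotify's authorization code flow, the redirect URL carries a one-time `code` query parameter that the notebook can exchange for access to your listening data. The sketch below shows roughly how such a URL could be parsed. It is an assumption about what happens behind the text box, not the notebook's actual code, and `extract_auth_code` and the placeholder URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qs

def extract_auth_code(pasted_url):
    """Return the `code` query parameter from a pasted redirect URL."""
    query = parse_qs(urlparse(pasted_url.strip()).query)
    if "error" in query:
        # Spotify reports a refused login through an `error` parameter.
        raise ValueError(f"Authorization was refused: {query['error'][0]}")
    if "code" not in query:
        raise ValueError(
            "No authorization code found; copy the full URL of the "
            "'Example Domain' page."
        )
    return query["code"][0]

# Hypothetical pasted URL; the code value is a placeholder, not a real token.
pasted = "https://example.com/?code=PLACEHOLDER_CODE&state=xyz"
print(extract_auth_code(pasted))
```

This is also why you need to copy the whole URL rather than just "example.com": the part after the question mark is what the notebook needs.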
- This part saves your top Spotify tracks to a file that I will be playing with
Run the code in this section.
If you are retrieving your data to be analyzed but not analyzing it yourself: locate the newly created .csv file in your Google Drive, and share it with whoever will be executing the next two steps.
Simply place the user .csv files in the correct directory, then run the code.
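As a rough picture of what consolidation involves, the sketch below merges every .csv in a directory into one list of rows and tags each row with its user. This is a minimal illustration under stated assumptions, not the code in "apiify": the function name, the column names, and the idea that each file is named after its user are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def consolidate_histories(directory):
    """Merge every .csv in `directory` into one list of rows tagged by user."""
    rows = []
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Assumption: the file name (minus .csv) is the user tag.
                row["user"] = path.stem
                rows.append(row)
    return rows

# Demo with two made-up user files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Leah.csv").write_text(
        "track_id,track_name\nid1,Song A\n", encoding="utf-8"
    )
    Path(tmp, "Sam.csv").write_text(
        "track_id,track_name\nid2,Song B\n", encoding="utf-8"
    )
    combined = consolidate_histories(tmp)

for row in combined:
    print(row)
```

Keeping the user tag on every row is what lets the later visualization step separate points by user.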
Just run it woooooo
Feature Distribution Visualization
Plotting based on manually selected features
After considering the feature impacts on each PCA component, I tried plotting the data using only 3 variables (with one variable per axis), rather than using all variables on all axes as PCA does. I found that using only 3 variables, I was still able to produce a very similar result.
In both the PCA-based example and the 3-variable example, the data is represented in a U-shape; the three users are distributed similarly throughout the plot, and similar musical artists and songs are close together.
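The 3-variable approach can be sketched as follows: instead of projecting onto PCA components, each track is mapped directly to a point whose coordinates are three raw feature values. The feature names below (`danceability`, `energy`, `valence`) are standard Spotify audio features, but choosing these three is an assumption for illustration; the rows, values, helper names, and the user "Sam" are made up.

```python
def to_xyz(rows, axes):
    """Return one (x, y, z) tuple per row, using one raw feature per axis."""
    if len(axes) != 3:
        raise ValueError("exactly three feature names are required")
    return [tuple(float(row[name]) for name in axes) for row in rows]

def group_by_user(rows, axes):
    """Group the (x, y, z) points by each row's user tag."""
    groups = {}
    for row, point in zip(rows, to_xyz(rows, axes)):
        groups.setdefault(row["user"], []).append(point)
    return groups

# Hypothetical consolidated rows with made-up feature values.
rows = [
    {"user": "Leah", "danceability": 0.8, "energy": 0.7, "valence": 0.9},
    {"user": "Sam", "danceability": 0.3, "energy": 0.2, "valence": 0.1},
    {"user": "Leah", "danceability": 0.6, "energy": 0.9, "valence": 0.5},
]
axes = ("danceability", "energy", "valence")  # hypothetical axis choice

groups = group_by_user(rows, axes)
for user, points in groups.items():
    print(user, points)
```

Each user's list of points could then be drawn as its own series in a 3D scatter plot, so that every axis reads as a single feature rather than a weighted mix.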