How To Analyse Your YouTube and Spotify Data

Your favourite music streaming services provide year-end summaries of your listening habits. These summaries typically include your top playlists, artists, songs, genres, listening time, and number of songs streamed. Some services even analyze your mood based on your listening history.

This article explores how to go beyond these basic summaries and gain deeper insights from your Spotify and YouTube data. We will do this through data collection, cleaning, transformation, and visualization using Power BI, although any analytical tool of your choice will also work.

[Figures: YouTube analysis dashboard; Spotify analysis dashboard]

YouTube Music and Video Analysis

[Figures: YouTube analysis summary stats; YouTube daily listening patterns]

Spotify Music and Podcast Analysis

Visuals and cards show the total number of song plays, unique artists, songs, total minutes, top genres, active listening period, and top artists.

Separate visuals cover Spotify podcasts. The visuals also offer more customization: you can choose to see your top 10, 15, 20, or N artists, switch between the most popular genres and all genres, and see the months in which you listened to the most songs.

[Figures: Spotify music analysis stats; Spotify top songs and artists; Spotify daily listening patterns]

View Live Dashboard Here

Data Collection

YouTube Music and Video Data

We will use Google Takeout to access your YouTube data. Google Takeout lets you download data from all your Google products, including your YouTube history. Select JSON as the history format when creating the export, since Takeout otherwise delivers the history as HTML.

[Figures: YouTube history in Takeout; Select YouTube history]

Here is what the raw JSON data looks like:

{
  "header": "YouTube Music",
  "title": "Watched Kele",
  "titleUrl": "https://music.youtube.com/watch?v\u003dr49K9xwf3UM",
  "subtitles": [{
    "name": "Show Dem Camp – Topic",
    "url": "https://www.youtube.com/channel/UCunYZtqaebaiFoML6L6wCA…}

A JSON file stores data as key-value pairs. For example:

Key = header  Value = “YouTube Music”

Key = title       Value = “Watched Kele”
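
To make the key-value idea concrete, here is a minimal Python sketch that reads one record shaped like the sample above and pulls out the song and artist. It assumes the song title always follows the word "Watched" and that the artist is the channel name without its "Topic" suffix, which may not hold for every record. The channel URL is a placeholder because the sample above is truncated, and the helper names are only illustrative.

```python
import json

# One record shaped like the sample above. The channel URL is a placeholder
# because the original sample is truncated.
raw = r'''
[{
  "header": "YouTube Music",
  "title": "Watched Kele",
  "titleUrl": "https://music.youtube.com/watch?v\u003dr49K9xwf3UM",
  "subtitles": [{"name": "Show Dem Camp – Topic", "url": "https://example.com/channel"}]
}]
'''

def strip_prefix(text, prefix):
    """Return text without the prefix if it starts with it."""
    return text[len(prefix):] if text.startswith(prefix) else text

def strip_suffix(text, suffix):
    """Return text without the suffix if it ends with it."""
    return text[:-len(suffix)] if text.endswith(suffix) else text

def parse_record(record):
    """Flatten one watch-history record into source, song, artist, and URL."""
    subtitles = record.get("subtitles") or [{}]
    channel = subtitles[0].get("name", "")
    # Accept either a hyphen or an en dash before "Topic" (assumption).
    for suffix in (" - Topic", " – Topic"):
        channel = strip_suffix(channel, suffix)
    return {
        "source": record.get("header"),
        "song": strip_prefix(record.get("title", ""), "Watched "),
        "artist": channel,
        "url": record.get("titleUrl"),
    }

rows = [parse_record(r) for r in json.loads(raw)]
print(rows[0])
```

Note that the JSON parser decodes the `\u003d` escape in the URL to a plain `=` sign, so no extra cleanup is needed for it.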

Spotify Music and Podcast Data

Download your Spotify data from the Account Privacy page. Spotify will send you a compressed file containing your entire listening history. The data is provided in JSON format with accompanying documentation to explain the data columns.

[Figure: Download Spotify data]

Spotify is expected to process your request within five days, after which you will receive a zip file containing your complete streaming history.

Because we requested the extended streaming history, Spotify provides the history for the account's lifetime, i.e., everything since you joined Spotify.

Here’s what the data looks like.

{
  "ts": "2024-09-10T17:59:36Z",
  "platform": "windows",
  "ms_played": 171291,
  "conn_country": "NG",
  "master_metadata_track_name": "A Bar Song (Tipsy)",
  "master_metadata_album_artist_name": "Shaboozey",
  "master_metadata_album_album_name": "Where I've Been, Isn't Where I'm Going",
  …
}

[Figure: Spotify data columns]

Spotify also provides documentation in your zip file to help you understand the columns in the data, which is very impressive.
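
As a rough illustration of working with these columns outside Power BI, the Python sketch below loads a record containing only the fields shown above, gives the long column names shorter labels, and converts `ms_played` from milliseconds to minutes. The short labels and the `tidy` helper are our own choices for this example, not names that Spotify uses.

```python
import json

# A sample record using only the fields shown above; a real export file
# holds a list with many such records.
raw = '''[{
  "ts": "2024-09-10T17:59:36Z",
  "platform": "windows",
  "ms_played": 171291,
  "conn_country": "NG",
  "master_metadata_track_name": "A Bar Song (Tipsy)",
  "master_metadata_album_artist_name": "Shaboozey",
  "master_metadata_album_album_name": "Where I've Been, Isn't Where I'm Going"
}]'''

# Hypothetical short labels for the long Spotify column names.
RENAME = {
    "ts": "timestamp",
    "platform": "platform",
    "master_metadata_track_name": "track",
    "master_metadata_album_artist_name": "artist",
    "master_metadata_album_album_name": "album",
}

def tidy(record):
    """Keep the renamed columns and add the listening time in minutes."""
    row = {new: record.get(old) for old, new in RENAME.items()}
    row["minutes_played"] = round(record.get("ms_played", 0) / 60000, 2)
    return row

rows = [tidy(r) for r in json.loads(raw)]
print(rows[0]["track"], rows[0]["minutes_played"])
```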

Data Cleaning and Transformation using Power BI

Note: You can use any analytical tool for this step.

We will use Power BI to import and clean the data from both YouTube and Spotify. This involves:

  • Importing the JSON data into Power BI and transforming it further in Power Query
  • Letting Power Query apply its automatic steps: converting the data into a table, expanding and splitting columns, and changing the data types
  • Removing irrelevant data points
  • Adding separate columns for time, hour, and date
  • Splitting columns to get separate artist and song names
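
If you prefer to script these steps rather than use Power Query, they could look roughly like the Python sketch below. It is not the Power Query logic itself. It assumes timestamps in the format of the Spotify sample and titles of the form "Artist - Song". The combined titles are made up for illustration, and treating rows without a title as irrelevant is our own assumption.

```python
from datetime import datetime, timezone

# Hypothetical rows: the timestamp format matches the Spotify sample, while the
# combined "Artist - Song" titles are made up for illustration.
rows = [
    {"ts": "2024-09-10T17:59:36Z", "title": "Shaboozey - A Bar Song (Tipsy)"},
    {"ts": "2024-09-11T08:05:00Z", "title": None},  # no title: treated as irrelevant
]

def clean(row):
    """Add date, time, hour, and weekday columns and split the title."""
    ts = datetime.strptime(row["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    artist, sep, song = row["title"].partition(" - ")
    if not sep:  # no separator found, so keep the whole title as the song
        artist, song = None, row["title"]
    return {
        "date": ts.date().isoformat(),
        "time": ts.strftime("%H:%M:%S"),
        "hour": ts.hour,
        "weekday": ts.strftime("%A"),
        "artist": artist,
        "song": song,
    }

# Drop rows without a title, then derive the new columns for the rest.
cleaned = [clean(r) for r in rows if r.get("title")]
print(cleaned)
```

The trailing `Z` in the timestamp means the time is in UTC, so you may want to convert it to your local time zone before analysing listening hours.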

Data Augmentation

The data obtained from Google Takeout lacks information such as song duration and genre, and the Spotify export does not include a genre column either. To address this, we can use external APIs such as the MusicBrainz or Spotify APIs to retrieve the missing information.

Using the Spotify API, we retrieved the duration and genre for each song row. Link to Python Code.

This data was then merged with our primary dataset.
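
The merge step can be sketched in Python as a simple lookup join. This is not the linked code and it makes no API calls. It assumes the API results have already been saved into a dictionary keyed by track and artist, and the duration and genre values below are placeholders rather than real API output.

```python
# Primary dataset: one row per play (values taken from the samples above).
plays = [
    {"track": "A Bar Song (Tipsy)", "artist": "Shaboozey"},
    {"track": "Kele", "artist": "Show Dem Camp"},
]

def make_key(track, artist):
    """Normalise case and spacing so small differences do not block a match."""
    return (track.strip().casefold(), artist.strip().casefold())

# Hypothetical lookup table built from earlier API responses.
# The duration and genre are placeholder values, not real API output.
metadata = {
    make_key("A Bar Song (Tipsy)", "Shaboozey"): {"duration_ms": 170000, "genre": "country"},
}

def augment(plays, metadata):
    """Left-join the metadata onto every play; unmatched rows get None."""
    missing = {"duration_ms": None, "genre": None}
    return [{**p, **metadata.get(make_key(p["track"], p["artist"]), missing)} for p in plays]

augmented = augment(plays, metadata)
for row in augmented:
    print(row)
```

Keeping unmatched rows with empty values, rather than dropping them, means the play counts stay correct even when a song has no genre.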

[Figure: Spotify augmented data]

Data Analysis

Common Analysis for Both Platforms

Measures:

Create calculated measures in Power BI to determine:

  • Total number of songs
  • Unique songs
  • Unique artists
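
In Power BI these would be written as measures, but the logic behind them is easy to see in a few lines of Python. The sketch below uses a made-up play list and assumes that a "unique song" means a distinct combination of track and artist.

```python
# Each tuple is one play: (track, artist). The list is made up for illustration.
plays = [
    ("A Bar Song (Tipsy)", "Shaboozey"),
    ("Kele", "Show Dem Camp"),
    ("A Bar Song (Tipsy)", "Shaboozey"),
]

total_plays = len(plays)                               # every row counts once
unique_songs = len(set(plays))                         # distinct (track, artist) pairs
unique_artists = len({artist for _, artist in plays})  # distinct artist names

print(total_plays, unique_songs, unique_artists)
```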

Visualizations:

Create the relevant charts and graphs to display:

  • Songs by number of plays
  • Artists by number of plays
  • Number of plays by hour/day
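
The numbers behind the hour and day charts are plain counts. As a minimal sketch, assuming timestamps in the format of the Spotify sample (only the first one below comes from the sample, the others are made up), they can be computed like this:

```python
from collections import Counter
from datetime import datetime

# The first timestamp is from the Spotify sample; the rest are hypothetical.
timestamps = [
    "2024-09-10T17:59:36Z",
    "2024-09-10T17:10:00Z",
    "2024-09-11T08:05:00Z",
]

parsed = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps]

# Count plays per hour of the day and per day of the week.
plays_by_hour = Counter(p.hour for p in parsed)
plays_by_day = Counter(p.strftime("%A") for p in parsed)

print(sorted(plays_by_hour.items()))
print(plays_by_day.most_common())
```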

Spotify Specific Analysis:

  • Total number of podcasts listened to
  • Top podcasts
  • Total minutes listened to by day and artist
  • Listening patterns by hour/day
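
The Spotify-specific figures can be sketched in the same way. The example below assumes podcast rows carry the show name in a field called `episode_show_name`, which does not appear in the sample above, so check the documentation in your zip file for the exact name. The second and third records and the show name are made up.

```python
from collections import defaultdict

# The first record is from the sample above; the others are hypothetical.
# "episode_show_name" is an assumed field name for podcast rows.
records = [
    {"ts": "2024-09-10T17:59:36Z", "ms_played": 171291,
     "master_metadata_album_artist_name": "Shaboozey", "episode_show_name": None},
    {"ts": "2024-09-10T18:05:00Z", "ms_played": 120000,
     "master_metadata_album_artist_name": "Shaboozey", "episode_show_name": None},
    {"ts": "2024-09-11T07:00:00Z", "ms_played": 1800000,
     "master_metadata_album_artist_name": None, "episode_show_name": "Example Show"},
]

minutes_by_day_artist = defaultdict(float)
podcast_plays = defaultdict(int)

for r in records:
    day = r["ts"][:10]  # the first ten characters are the YYYY-MM-DD date
    show = r.get("episode_show_name")
    artist = r.get("master_metadata_album_artist_name")
    if show:
        podcast_plays[show] += 1  # count one play per podcast row
    elif artist:
        minutes_by_day_artist[(day, artist)] += r["ms_played"] / 60000

print(dict(minutes_by_day_artist))
print(dict(podcast_plays))
```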

Customization

This process allows you to go beyond the basic insights provided by streaming services. You can customize the analysis to view:

  • Your top 10, 15, 20 or all favorite artists (instead of just the top 5)
  • Listening habits across different devices
  • Listening patterns by day, hour, and month
  • Most popular genres or all genres listened to
  • Months with the most listening activity
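
The "top N" option boils down to one parameter. Here is a small Python sketch of the idea, using a made-up list of plays and a hypothetical `top_artists` helper. Passing `n=None` returns every artist instead of a fixed number.

```python
from collections import Counter

def top_artists(artists, n=5):
    """Return the n most-played artists; n=None returns every artist."""
    return Counter(artists).most_common(n)

# One entry per play; "Artist C" is a made-up name.
artists = [
    "Shaboozey", "Show Dem Camp", "Shaboozey",
    "Shaboozey", "Show Dem Camp", "Artist C",
]

print(top_artists(artists, n=2))
print(top_artists(artists, n=None))
```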

Conclusion

By following these steps, you can gain deeper insights into your music preferences and listening habits on YouTube and Spotify. This goes beyond the basic summaries provided by the streaming services and allows for a more personalized music exploration experience.

You can access the Live Dashboard Here

Interested in a video walkthrough? Subscribe to the YouTube Channel and you'll get a notification when the video is published.