Activate virtual environment

This is the virtual environment I use for dbt. dbt allows us to create SQL files and materialize tables into a Duckdb database

source ~/duckdb-env/bin/activate

Create or open a persistent Duckdb database

spotify_db.db is where we will materialize our dbt models into.

duckdb spotify_db.db

Unnesting JSON files

If your json file looks something like this, and you want to create tracks and artists tables, then we can use the UNNEST function.

The UNNEST function takes an array and returns a table with a row for each element in the array.

{
"tracks": [
	{
	"artist": "Elevation Worship",
	"album": "Acoustic Sessions",
	"track": "Do it Again (Acoustic)",
	"uri": "spotify:track:1to0nLR2su8W61hL1Z3dJr"
	},
	
	{
	"artist": "TWS",
	"album": "TWS 2nd Mini Album 'SUMMER BEAT!'",
	"track": "If I’m S, Can You Be My N?",
	"uri": "spotify:track:2y8vRHEoz15sYDmxyrDePO"
	}
],
"bannedTracks": [
],
"artists": [
	{
	"name": "2NE1",
	"uri": "spotify:artist:1l0mKo96Jh9HVYONcRl3Yp"
	},
	
	{
	"name": "AKMU",
	"uri": "spotify:artist:6OwKE9Ez6ALxpTaKcT5ayv"
	}
]
}

To unnest tracks

WITH raw_json AS (
	SELECT * FROM read_json_auto('../Spotify-Account-Data/YourLibrary.json') 
),
 
unnested_tracks AS (
	SELECT UNNEST(tracks) AS tracks
),
 
SELECT UNNEST(tracks)
FROM unnested_tracks

Alternatively, we can unnest tracks recursively

WITH raw_json AS (
	SELECT * FROM read_json_auto('../Spotify-Account-Data/YourLibrary.json') 
)
 
SELECT UNNEST(tracks, recursive:=true) FROM raw_json;

Manually Create Tables Using duckdb

Create a table from a JSON file

-- create a table named streaming_history_music
CREATE TABLE streaming_history_music (endTime TIMESTAMP, artistName VARCHAR, trackName VARCHAR, msPlayed BIGINT);
 
-- copy data from the json file into the table
COPY streaming_history_music FROM 'streaming_history_music.json' (AUTO_DETECT true);

Create a table without specifying the schema manually:

CREATE TABLE streaming_history_music AS
	SELECT * FROM 'streaming_history_music.json';