src package

Submodules

src.db module

class src.db.Song(**kwargs)[source]

Bases: sqlalchemy.ext.declarative.api.Base

album_image
album_name
artist_name
created_at
duration_ms
explicit
playlist_id
popularity
scraper_name
song_name
spotify_uri
updated_at

src.playlist_updater module

class src.playlist_updater.Updater[source]

Bases: object

add_songs_to_playlist(spotify_songs, playlist_id)[source]

Add Spotify songs to a playlist, using the songs' URIs.

Parameters
  • spotify_songs (list(dict)) – List of Spotify songs

  • playlist_id (str) – Playlist ID

Returns

JSON response from the Spotify API

Return type

json

filter_and_save_songs_to_db(spotify_songs, scraper_name, playlist_id)[source]

Filter out songs that have already been added, save the remaining songs to the database, and return them so they can be added to the playlist.

Parameters
  • spotify_songs (list(dict)) – List of Spotify songs as dicts

  • scraper_name (str) – Scraper class name

  • playlist_id (str) – Playlist ID

Returns

List of Spotify songs that are not in the playlist yet

Return type

list(dict)
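The filtering step can be illustrated with a minimal sketch. It assumes each song dict carries its Spotify URI under a `uri` key and that the already-known URIs are available as a set; the helper name `filter_new_songs` is hypothetical, and the database write is left out.

```python
def filter_new_songs(spotify_songs, known_uris):
    """Return the songs whose URI is not known yet, keeping their order.

    Hypothetical helper: the "uri" key and the known_uris set are assumptions,
    not the actual implementation of filter_and_save_songs_to_db.
    """
    seen = set(known_uris)
    new_songs = []
    for song in spotify_songs:
        uri = song["uri"]
        if uri in seen:
            continue  # already in the playlist, or a duplicate within this batch
        seen.add(uri)
        new_songs.append(song)
    return new_songs


songs = [
    {"uri": "spotify:track:aaa", "name": "First"},
    {"uri": "spotify:track:bbb", "name": "Second"},
    {"uri": "spotify:track:aaa", "name": "First"},
]
new_songs = filter_new_songs(songs, known_uris={"spotify:track:bbb"})
```

Tracking the URIs in a set also removes duplicates inside a single scrape, which is useful when a station replays a song within the scraped history.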

scrap_and_update()[source]

Run the whole pipeline for every scraper:

  • Scrape the website concerned and get its song history

  • Search for the songs in Spotify

  • Filter out the songs already in the playlist and save the remaining ones to the DB

  • Add the filtered songs to the playlist

Returns

Inserted songs

Return type

list(dict)
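The four steps above can be sketched as a loop over the scrapers. This is only an illustration: `run_pipeline`, `FakeScraper`, and the injected callables are hypothetical stand-ins for the Updater methods, so that the example runs without a browser or Spotify credentials.

```python
def run_pipeline(scrapers, search, filter_and_save, add_to_playlist):
    """Run scrape -> search -> filter/save -> add for every scraper."""
    inserted = []
    for scraper in scrapers:
        history = scraper.get_song_history()           # 1. scrape
        found = search(history)                        # 2. search in Spotify
        new_songs = filter_and_save(                   # 3. filter and save to DB
            found, type(scraper).__name__, scraper.playlist_id
        )
        if new_songs:
            add_to_playlist(new_songs, scraper.playlist_id)  # 4. add to playlist
        inserted.extend(new_songs)
    return inserted


class FakeScraper:
    """Hypothetical scraper returning a fixed history."""
    playlist_id = "playlist-1"

    def get_song_history(self):
        return [{"title": "Song A", "artist": "Artist A", "timestamp": None}]


added = []
result = run_pipeline(
    scrapers=[FakeScraper()],
    search=lambda history: [
        {"uri": "spotify:track:" + h["title"].replace(" ", "")} for h in history
    ],
    filter_and_save=lambda songs, scraper_name, playlist_id: songs,
    add_to_playlist=lambda songs, playlist_id: added.append((playlist_id, len(songs))),
)
```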

search_songs_in_spotify(radio_history)[source]

Retrieve song information from title and artist using the Spotify Search API.

Parameters

radio_history (list(dict)) – List of dicts with title and artist as keys

Returns

List of Spotify songs as dicts

Return type

list(dict)

single_scraper_pipeline(scraper)[source]

spotify_auth()[source]

Authenticates using Authorization Code Flow.

Returns

URL to redirect to

Return type

str
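For orientation, the redirect URL in the Authorization Code Flow is Spotify's authorize endpoint plus a few query parameters. The sketch below shows how such a URL can be assembled; the helper name, the client ID, the redirect URI, and the scope are placeholder assumptions and not values taken from this project.

```python
from urllib.parse import urlencode

AUTHORIZE_URL = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id, redirect_uri, scope):
    """Build the URL the user is redirected to in order to grant access."""
    params = {
        "client_id": client_id,
        "response_type": "code",  # ask for an authorization code
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


# Placeholder values for illustration only.
url = build_authorize_url(
    client_id="my-client-id",
    redirect_uri="http://localhost:5000/callback",
    scope="playlist-modify-public",
)
```

After the user approves, Spotify redirects to `redirect_uri` with a `code` query parameter, which is the value passed to spotify_callback.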

spotify_callback(authorization_code)[source]

Callback invoked by Spotify with the authorization code in the request parameters.

Parameters

authorization_code (str) – Authorization code

sync_db_with_existing_songs(playlist_id)[source]

If the playlist already exists, look for the songs in it and store them in the local database so that no duplicates are added.

Parameters

playlist_id (str) – Playlist ID

src.scraping module

Add new scrapers here. Please follow these steps to do so:

  • Create a class whose name ends with Scraper, e.g. YourScraper (the name should make clear which website it crawls).

  • Make that class inherit from Scraper

  • Call super() in its constructor, passing it the URL of the webpage to crawl and the playlist_id to upload the songs to, e.g.:

    player_url = 'https://radio.com/awesome-song-history'
    playlist_id = '3BCcE8T945z1MnfPWkFsfX'
    super(YourScraper, self).__init__(player_url, playlist_id)
    
  • Override the get_song_history method; its first line should be:

    soup, driver = self.scrap_webpage()
    
  • Add a test for your scraper in [tests](./tests/test_scraping.py):

    class TestYourScraper(GenericScraperTest):
        scraper = scraping.YourScraper()
    
  • Add your scraper to the scrapers list of the [src.playlist_updater.Updater](./src/playlist_updater.py) class:

    self.scrapers = [
        scraping.KSHEScraper(),
        scraping.EagleScraper(),
        scraping.YourScraper()  # New scraper!
    ]
    
  • You’re all set!
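
Put together, the steps above might look like the following sketch. To keep it runnable on its own, it defines a simplified stand-in for the Scraper base class whose scrap_webpage returns a plain list of rows instead of a real soup and driver; an actual scraper would inherit from src.scraping.Scraper and parse the page returned by the real method.

```python
from abc import ABC, abstractmethod


class Scraper(ABC):
    """Simplified stand-in for src.scraping.Scraper (assumption, for illustration)."""

    def __init__(self, player_url, playlist_id):
        self.player_url = player_url
        self.playlist_id = playlist_id

    @abstractmethod
    def get_song_history(self):
        """Return a list of dicts with title, artist and timestamp keys."""

    def scrap_webpage(self):
        # The real method returns (soup, driver). Here the "soup" is a list of
        # (title, artist) rows and there is no browser driver.
        rows = [("Song A", "Artist A"), ("Song B", "Artist B")]
        return rows, None


class YourScraper(Scraper):
    def __init__(self):
        player_url = 'https://radio.com/awesome-song-history'
        playlist_id = '3BCcE8T945z1MnfPWkFsfX'
        super().__init__(player_url, playlist_id)

    def get_song_history(self):
        # scrap_webpage must be called first.
        soup, driver = self.scrap_webpage()
        return [
            {"title": title, "artist": artist, "timestamp": None}
            for title, artist in soup
        ]


history = YourScraper().get_song_history()
```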

class src.scraping.EagleScraper[source]

Bases: src.scraping.Scraper

get_song_history()[source]

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

  • title

  • artist

  • timestamp (can be null, it’s not used so far)

class src.scraping.KLOScrapper[source]

Bases: src.scraping.Scraper

get_song_history()[source]

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

  • title

  • artist

  • timestamp (can be null, it’s not used so far)

class src.scraping.KSHEScraper[source]

Bases: src.scraping.Scraper

get_song_history()[source]

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

  • title

  • artist

  • timestamp (can be null, it’s not used so far)

class src.scraping.Q1043Scrapper[source]

Bases: src.scraping.Scraper

get_song_history()[source]

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

  • title

  • artist

  • timestamp (can be null, it’s not used so far)

class src.scraping.Scraper(player_url, playlist_id)[source]

Bases: abc.ABC

abstract get_song_history()[source]

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

  • title

  • artist

  • timestamp (can be null, it’s not used so far)
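
The expected return shape can be checked with a small helper such as the one sketched below. The function `validate_song_history` is hypothetical and not part of the package; it assumes that title and artist must be non-empty strings and that a missing timestamp may be filled in with None.

```python
def validate_song_history(history):
    """Check that history is a list of dicts with usable title and artist values.

    Hypothetical helper: fills in a missing "timestamp" key with None.
    """
    if not isinstance(history, list):
        raise TypeError("history must be a list")
    for index, entry in enumerate(history):
        for key in ("title", "artist"):
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"entry {index} has no usable {key!r}")
        entry.setdefault("timestamp", None)  # timestamp may be null
    return history


checked = validate_song_history([{"title": "Song A", "artist": "Artist A"}])
```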

scrap_webpage()[source]

Scrape the webpage. This method must be called first in the get_song_history implementation.

Returns

soup and driver

Return type

tuple

class src.scraping.WMGKScrapper[source]

Bases: src.scraping.Scraper

get_song_history()[source]

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

  • title

  • artist

  • timestamp (can be null, it’s not used so far)

src.spotify module

class src.spotify.SpotifyApi[source]

Bases: object

add_tracks_to_playlist(track_uris, playlist_id)[source]

Add Spotify songs to the playlist, using their URIs.

Parameters
  • track_uris (list) – List of song URIs.

  • playlist_id (str) – Playlist ID

Returns

Response from the Spotify API

Return type

json
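The Spotify Web API accepts at most 100 items per request when adding to a playlist, so longer lists have to be sent in batches. Whether SpotifyApi does this internally is not stated here; the sketch below, with the hypothetical helper `chunk_track_uris`, shows one way a caller could split the URIs beforehand.

```python
def chunk_track_uris(track_uris, chunk_size=100):
    """Split track URIs into lists of at most chunk_size items."""
    uris = list(track_uris)
    return [uris[i:i + chunk_size] for i in range(0, len(uris), chunk_size)]


# 250 made-up URIs, for illustration only.
batches = chunk_track_uris([f"spotify:track:{n}" for n in range(250)])
batch_sizes = [len(batch) for batch in batches]
```

Each batch could then be passed to add_tracks_to_playlist in turn.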

check_playlist_exists(playlist_id)[source]

get_track_uris_from_playlist(playlist_id)[source]

Return the track URIs from the playlist

Returns

The song URIs

Return type

set
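As a rough sketch of how a set of URIs can be built, the example below walks over already-fetched pages of playlist items. It assumes each page has the shape returned by Spotify's playlist items endpoint (an `items` list whose entries hold a `track` object with a `uri`); the helper name `collect_track_uris` and the sample data are hypothetical, and fetching the pages is left out.

```python
def collect_track_uris(pages):
    """Collect the track URIs found in a sequence of playlist item pages."""
    uris = set()
    for page in pages:
        for item in page.get("items", []):
            track = item.get("track")
            # A track can be null, for example when it is no longer available.
            if track and track.get("uri"):
                uris.add(track["uri"])
    return uris


pages = [
    {"items": [{"track": {"uri": "spotify:track:aaa"}}, {"track": None}]},
    {"items": [{"track": {"uri": "spotify:track:bbb"}},
               {"track": {"uri": "spotify:track:aaa"}}]},
]
uris = collect_track_uris(pages)
```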

search_track(track_name, artist_name)[source]

Search for a track using the Spotify Search API.

Parameters
  • track_name (str) – Track name

  • artist_name (str) – Artist name

Returns

Dict containing the song attributes

Return type

dict
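
A request to the Spotify Search API can be sketched as follows. The example only builds the request URL, using the `track:` and `artist:` field filters of the search query; the helper name `build_search_url` and the `limit` default are assumptions, and the actual query built by search_track may differ.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.spotify.com/v1/search"


def build_search_url(track_name, artist_name, limit=1):
    """Build a Search API URL restricted to tracks by the given artist."""
    query = f"track:{track_name} artist:{artist_name}"
    params = {"q": query, "type": "track", "limit": limit}
    return SEARCH_URL + "?" + urlencode(params)


url = build_search_url("Back in Black", "AC/DC")
```

The request itself must carry an `Authorization: Bearer <access token>` header, which is omitted here.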