src package¶

Subpackages¶

src.application package

Submodules¶

src.db module¶

class src.db.Song(**kwargs)[source]¶

Bases: sqlalchemy.ext.declarative.api.Base

album_image¶

album_name¶

artist_name¶

created_at¶

duration_ms¶

explicit¶

playlist_id¶

popularity¶

scraper_name¶

song_name¶

spotify_uri¶

updated_at¶

src.playlist_updater module¶

class src.playlist_updater.Updater[source]¶

Bases: object

add_songs_to_playlist(spotify_songs, playlist_id)[source]¶

Add Spotify songs to a playlist, using the songs' URIs.

Parameters: spotify_songs (list(dict)) – List of Spotify songs
Returns: JSON response from the Spotify API
Return type: json
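
A minimal sketch of how the URIs might be collected and batched before they are sent. It assumes each song dict carries a `uri` key, as Spotify track objects do, and that requests are limited to 100 items each. `chunk_uris` is a hypothetical helper, not part of `src.playlist_updater`.

```python
def chunk_uris(spotify_songs, batch_size=100):
    """Collect the URI of every song and split the URIs into batches.

    Assumption: each song dict has a "uri" key. batch_size=100 mirrors the
    per-request limit of Spotify's "add items to playlist" endpoint.
    """
    uris = [song["uri"] for song in spotify_songs]
    return [uris[i:i + batch_size] for i in range(0, len(uris), batch_size)]


# 250 made-up songs end up in three batches.
songs = [{"uri": f"spotify:track:{n}"} for n in range(250)]
batches = chunk_uris(songs)
```

Each batch could then be passed to `SpotifyApi.add_tracks_to_playlist` together with the playlist ID.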

filter_and_save_songs_to_db(spotify_songs, scraper_name, playlist_id)[source]¶

Filter out songs that have already been added and save the remaining songs to the database.

Parameters

spotify_songs (list(dict)) – List of Spotify songs, each as a dict
scraper_name (str) – Scraper class name

Returns

List of Spotify songs that are not in the playlist yet

Return type

list(dict)
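
The filtering step can be sketched as a set lookup on song URIs. This is an illustration only: `filter_new_songs` and `known_uris` are hypothetical names, and the assumption is that already-added songs are identified by the `spotify_uri` values stored in the database.

```python
def filter_new_songs(spotify_songs, known_uris):
    """Return the songs whose URI is not known yet.

    Repeats inside the incoming batch are dropped as well, so the same
    song is never returned twice.
    """
    seen = set(known_uris)
    new_songs = []
    for song in spotify_songs:
        uri = song["uri"]  # assumption: every song dict has a "uri" key
        if uri not in seen:
            seen.add(uri)
            new_songs.append(song)
    return new_songs


already_saved = {"spotify:track:1"}
incoming = [
    {"uri": "spotify:track:1"},
    {"uri": "spotify:track:2"},
    {"uri": "spotify:track:2"},
]
fresh = filter_new_songs(incoming, already_saved)
```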

scrap_and_update()[source]¶

Run the whole pipeline for every scraper:

Scrape the target website and get its song history
Search for the songs in Spotify
Filter out the songs already in the playlist and save the remaining ones to the DB
Add the filtered songs to the playlist

Returns: Inserted songs
Return type: list(dict)
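
The four steps above can be sketched as one loop over the scrapers. The callables stand in for the corresponding `Updater` methods, and the `playlist_id` attribute on the scraper is an assumption based on the `Scraper` constructor arguments; none of this is the actual implementation.

```python
def run_pipeline(scrapers, search, filter_new, add_to_playlist):
    """Run scrape -> search -> filter -> add for every scraper."""
    inserted = []
    for scraper in scrapers:
        history = scraper.get_song_history()            # step 1
        found = search(history)                         # step 2
        new_songs = filter_new(                         # step 3
            found, type(scraper).__name__, scraper.playlist_id
        )
        if new_songs:                                   # step 4
            add_to_playlist(new_songs, scraper.playlist_id)
        inserted.extend(new_songs)
    return inserted


class FakeScraper:
    """Made-up scraper used only to exercise the loop."""
    playlist_id = "demo-playlist"

    def get_song_history(self):
        return [{"title": "Song A", "artist": "Artist A", "timestamp": None}]


added = []
inserted = run_pipeline(
    [FakeScraper()],
    search=lambda history: [{"uri": "spotify:track:" + h["title"]} for h in history],
    filter_new=lambda songs, scraper_name, playlist_id: songs,
    add_to_playlist=lambda songs, playlist_id: added.append((playlist_id, len(songs))),
)
```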

search_songs_in_spotify(radio_history)[source]¶

Retrieve song information from title and artist using the Spotify Search API.

Parameters: radio_history (list(dict)) – list of dicts with title and artist as keys
Returns: list of dicts of Spotify songs
Return type: list(dict)
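
A minimal sketch of the lookup loop, assuming the per-track search returns `None` when nothing matches (the source does not say what happens in that case). `search_all` and `fake_search` are hypothetical names used for illustration.

```python
def search_all(radio_history, search_track):
    """Look up every history entry and keep only the entries that matched."""
    results = []
    for entry in radio_history:
        song = search_track(entry["title"], entry["artist"])
        if song is not None:  # assumption: None means "no match"
            results.append(song)
    return results


# Stand-in for SpotifyApi.search_track backed by a tiny in-memory catalog.
catalog = {("Song A", "Artist A"): {"uri": "spotify:track:a"}}


def fake_search(track_name, artist_name):
    return catalog.get((track_name, artist_name))


history = [
    {"title": "Song A", "artist": "Artist A", "timestamp": None},
    {"title": "Unknown", "artist": "Nobody", "timestamp": None},
]
matches = search_all(history, fake_search)
```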

single_scraper_pipeline(scraper)[source]¶

spotify_auth()[source]¶

Authenticates using Authorization Code Flow.

Returns: URL to redirect to
Return type: str
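
For illustration, the redirect URL of the Authorization Code Flow can be assembled as follows. The client ID, redirect URI and scope below are placeholders, and `build_authorize_url` is a hypothetical helper rather than the method's actual code.

```python
from urllib.parse import urlencode

AUTHORIZE_URL = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id, redirect_uri, scopes):
    """Build the URL the user is redirected to in order to grant access."""
    params = {
        "client_id": client_id,
        "response_type": "code",     # ask for an authorization code
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),   # scopes are space-separated
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


url = build_authorize_url(
    "my-client-id",                    # placeholder
    "http://localhost:5000/callback",  # placeholder
    ["playlist-modify-public"],
)
```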

spotify_callback(authorization_code)[source]¶

Callback invoked by Spotify with the authorization code in the request parameters.

Parameters: authorization_code (str) – Authorization code
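
In the Authorization Code Flow, the code received by the callback is exchanged for an access token. The sketch below only builds that token request without sending it; `build_token_request` and the credentials are hypothetical, and how the module performs the exchange is not shown in the source.

```python
import base64

TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(authorization_code, client_id, client_secret, redirect_uri):
    """Return (url, headers, form data) for the code-for-token exchange."""
    credentials = f"{client_id}:{client_secret}".encode("ascii")
    headers = {
        # HTTP Basic auth with the application's client ID and secret
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    data = {
        "grant_type": "authorization_code",
        "code": authorization_code,
        "redirect_uri": redirect_uri,  # must match the one used for the redirect
    }
    return TOKEN_URL, headers, data


token_url, headers, data = build_token_request(
    "code-from-callback", "id", "secret", "http://localhost:5000/callback"
)
```

The three values could be handed to any HTTP client as a POST request.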

sync_db_with_existing_songs(playlist_id)[source]¶

If the playlist already exists, look up the songs in it and store them in the local database so that duplicates are not added.

Parameters: playlist_id (str) – Playlist ID

src.scraping module¶

Add new scrapers here. Please follow these steps to do so:

Create a class whose name ends with Scraper, e.g. YourScraper (the name should make clear which website it crawls).
Make that class inherit from Scraper.

Call super() in its constructor, passing it the URL of the webpage to crawl and the playlist_id to upload the songs to, e.g.:

player_url = 'https://radio.com/awesome-song-history'
playlist_id = '3BCcE8T945z1MnfPWkFsfX'
super(YourScraper, self).__init__(player_url, playlist_id)

Override the get_song_history method; its first line should be:
soup, driver = self.scrap_webpage()

Add your scraper to the [tests](./tests/test_scraping.py) folder:

class TestYourScraper(GenericScraperTest):
    scraper = scraping.YourScraper()

Register your scraper in the [src.playlist_updater.Updater](./src/playlist_updater.py) class:

self.scrapers = [
    scraping.KSHEScraper(),
    scraping.EagleScraper(),
    scraping.YourScraper()  # New scraper!
]

You’re all set!
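
Put together, the steps might look like the sketch below. The `Scraper` class here is a simplified stand-in so that the example runs on its own: the real base class lives in `src.scraping`, and its `scrap_webpage` returns a soup and a driver rather than the canned rows used here.

```python
from abc import ABC, abstractmethod


class Scraper(ABC):
    """Stand-in for src.scraping.Scraper (assumption, not the real class)."""

    def __init__(self, player_url, playlist_id):
        self.player_url = player_url
        self.playlist_id = playlist_id

    def scrap_webpage(self):
        # The real method loads the page; this stub returns canned rows
        # in place of the soup, and no driver.
        rows = [("Back in Black", "AC/DC")]
        return rows, None

    @abstractmethod
    def get_song_history(self):
        """Return a list of dicts with title, artist and timestamp keys."""


class YourScraper(Scraper):
    def __init__(self):
        player_url = 'https://radio.com/awesome-song-history'
        playlist_id = '3BCcE8T945z1MnfPWkFsfX'
        super().__init__(player_url, playlist_id)

    def get_song_history(self):
        soup, driver = self.scrap_webpage()  # must be the first line
        return [
            {"title": title, "artist": artist, "timestamp": None}
            for title, artist in soup
        ]


scraper = YourScraper()
history = scraper.get_song_history()
```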

class src.scraping.EagleScraper[source]¶

Bases: src.scraping.Scraper

get_song_history()[source]¶

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

title
artist
timestamp (can be null; it is not used so far)

class src.scraping.KLOScrapper[source]¶

Bases: src.scraping.Scraper

get_song_history()[source]¶

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

title
artist
timestamp (can be null; it is not used so far)

class src.scraping.KSHEScraper[source]¶

Bases: src.scraping.Scraper

get_song_history()[source]¶

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

title
artist
timestamp (can be null; it is not used so far)

class src.scraping.Q1043Scrapper[source]¶

Bases: src.scraping.Scraper

get_song_history()[source]¶

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

title
artist
timestamp (can be null; it is not used so far)

class src.scraping.Scraper(player_url, playlist_id)[source]¶

Bases: abc.ABC

abstract get_song_history()[source]¶

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

title
artist
timestamp (can be null; it is not used so far)
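
As an illustration of the expected return format, the check below verifies that every entry has the three keys. `validate_history` is a hypothetical helper that could be used in a test; it is not part of `src.scraping`.

```python
REQUIRED_KEYS = {"title", "artist", "timestamp"}


def validate_history(history):
    """Raise ValueError if an entry lacks one of the required keys."""
    for index, entry in enumerate(history):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"entry {index} is missing keys: {sorted(missing)}")
    return history


# A timestamp of None is accepted because the value is not used so far.
valid = [{"title": "Song A", "artist": "Artist A", "timestamp": None}]
validate_history(valid)
```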

scrap_webpage()[source]¶

Scrape the webpage. This method must be called first in the get_song_history implementation.

Returns: soup and driver
Return type: tuple

class src.scraping.WMGKScrapper[source]¶

Bases: src.scraping.Scraper

get_song_history()[source]¶

Scrape the website and get its song history. This method must be overridden. Its implementation must return a list of dicts with the following keys:

title
artist
timestamp (can be null; it is not used so far)

src.spotify module¶

class src.spotify.SpotifyApi[source]¶

Bases: object

add_tracks_to_playlist(track_uris, playlist_id)[source]¶

Add Spotify songs to a playlist, using their URIs.

Parameters: track_uris (list) – List of song URIs.
Returns: Response from the Spotify API
Return type: json

check_playlist_exists(playlist_id)[source]¶

get_track_uris_from_playlist(playlist_id)[source]¶

Return the track URIs from the playlist.

Returns: the song URIs
Return type: set
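
Spotify returns playlist items in pages, each with an `items` list and a `next` URL. A sketch of gathering all URIs into a set, under the assumption that a caller-supplied `fetch_page` function (hypothetical) returns the decoded JSON of one page:

```python
def collect_track_uris(fetch_page, first_url):
    """Follow the "next" links and collect every track URI into a set."""
    uris = set()
    url = first_url
    while url:
        page = fetch_page(url)
        for item in page["items"]:
            track = item.get("track")  # can be null for unavailable tracks
            if track and track.get("uri"):
                uris.add(track["uri"])
        url = page.get("next")         # None on the last page
    return uris


# Two made-up pages keyed by URL, standing in for real API responses.
pages = {
    "page-1": {"items": [{"track": {"uri": "spotify:track:a"}}], "next": "page-2"},
    "page-2": {
        "items": [{"track": None}, {"track": {"uri": "spotify:track:b"}}],
        "next": None,
    },
}
uris = collect_track_uris(pages.__getitem__, "page-1")
```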

search_track(track_name, artist_name)[source]¶

Search for a track using the Spotify Search API.

Parameters

track_name (str) – Track name
artist_name (str) – Artist name

Returns

Dict containing the song attributes

Return type

dict
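
A sketch of how a search could be phrased and its first hit flattened into a dict. The `track:` and `artist:` field filters and the response fields belong to the Spotify Search API; the output key names mirror the `Song` attributes and are an assumption about what the returned dict contains, and both helpers are hypothetical.

```python
def build_search_params(track_name, artist_name):
    """Query parameters for a search restricted to one track and artist."""
    return {
        "q": f"track:{track_name} artist:{artist_name}",
        "type": "track",
        "limit": 1,
    }


def first_track(search_response):
    """Flatten the first track of a search response, or return None."""
    items = search_response.get("tracks", {}).get("items", [])
    if not items:
        return None
    track = items[0]
    images = track["album"]["images"]
    return {
        "song_name": track["name"],
        "artist_name": track["artists"][0]["name"],
        "album_name": track["album"]["name"],
        "album_image": images[0]["url"] if images else None,
        "spotify_uri": track["uri"],
        "duration_ms": track["duration_ms"],
        "explicit": track["explicit"],
        "popularity": track["popularity"],
    }


params = build_search_params("Back in Black", "AC/DC")
```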