This thesis addresses the problem of generating jazz chord sequences using recurrent neural networks (RNNs) and numerical vector representations, i.e. embeddings. To create the embeddings I use techniques known from the field of natural language processing (NLP); this is possible because of certain structural similarities between spoken language and music. I study the relationships between chords in the hidden space produced by three representation algorithms: Word2Vec, FastText, and multi-hot encoding. Visualizing the chords in a reduced vector representation space using the t-SNE and PCA algorithms reveals many relationships between chords that follow from the principles of jazz harmony. The test results confirm that chord generation performs best when using the Word2Vec model in the Skip-Gram variant.
To generate chords I use their vector representations. I analyze the performance of sixteen models based on recurrent neural networks in terms of their hyperparameters and architecture. I also study the impact of the algorithm used to create the numerical representation on the results of the recurrent models. To train the neural networks I use a set of jazz standards obtained from publicly available Internet sources in a format that allows the extraction of chord sequences.
The experiments show the highest performance for deep models with recurrent layers (GRU and LSTM); the model with the shortest runtime consists of two GRU layers, a Dropout regularization layer, and a Dense layer. I use the trained models to build a simple program capable of generating continuous chord sequences, either fully unsupervised or with the generation partially controlled by the user. For the implementation of all algorithms, visualization, and data processing I use the Python language and its libraries:
tensorflow, keras, gensim, scikit-learn, numpy, pychord, music21, etc.
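The two-GRU architecture mentioned above could be sketched in Keras roughly as follows; the layer widths, dropout rate, sequence length, and vocabulary size are illustrative assumptions, not the thesis's actual hyperparameters:

```python
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import GRU, Dropout, Dense

# Assumed illustrative dimensions: sequences of 8 chords, 16-dim embeddings,
# and a vocabulary of 100 chord symbols to predict over.
seq_len, embed_dim, vocab_size = 8, 16, 100

model = Sequential([
    Input(shape=(seq_len, embed_dim)),
    GRU(64, return_sequences=True),           # first recurrent layer
    GRU(64),                                  # second recurrent layer
    Dropout(0.3),                             # regularization layer
    Dense(vocab_size, activation="softmax"),  # next-chord distribution
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

# Predicting from an embedded chord sequence yields a probability
# distribution over the chord vocabulary for the next chord.
probs = model.predict(np.random.rand(1, seq_len, embed_dim), verbose=0)
print(probs.shape)  # (1, 100)
```

Sampling repeatedly from such a distribution, and optionally letting the user fix some chords before resampling, is one way the continuous and partially controlled generation modes described above could be realized.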
19 Jun 2020 - Mateusz Dorobek