Introduction
The present tutorials are designed to put into practice the core notions covered in the machine learning lessons. Most techniques can be applied to any type of data from which sets of features can be computed. The exercises here introduce the basic mechanisms behind these techniques, before targeting specific applications to musical and audio data.
Get the baseline code for your language of choice:
Get the audio datasets from this link.
Unzip the files and place the 00_Datasets folder along with the other folders.
Reference slides
Download the slides
- Introduction to artificial intelligence
- Properties of machine learning
- Nearest-neighbors
Tutorial
In this introduction, we will cover basic Music Information Retrieval (MIR) interactions, in which we process a dataset of sound files and try to observe the properties of their various temporal and spectral features. Along the way, we will quickly review the basic computations required to perform further machine learning tasks. This tutorial is also intended to review basic MATLAB coding and plotting operations.
0.0 - Reference code
Along the tutorials, we provide reference code for each section. This code contains helper functions that spare you the burden of data import and other peripheral implementations. You will find designated spaces in each file to develop your solutions. The code is in Python (notebooks forthcoming) and relies heavily on the concept of code sections, which allows you to evaluate only part of the code (to avoid running long import tasks multiple times and to concentrate on the question at hand).
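As an illustration, here is a minimal sketch of the code-section convention (this assumes an editor that supports it, such as Spyder or the VS Code Python extension; the section names are placeholders):

```python
# %% Imports and data loading (run once)
import numpy as np

data = np.random.randn(100, 4)  # stands in for a long import task

# %% Current question (re-run as often as needed)
print(data.mean(axis=0))
```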
Get the baseline MATLAB code for all tutorials from this zip file
Get the baseline Python code for all tutorials from this zip file
Get the baseline Jupyter notebooks code for all tutorials from this zip file
Dependencies
Python installation
In order to get the baseline scripts to work, you need a working distribution of Python 3.5 at a minimum (we also recommend updating your version to Python 3.7). We will also be using the following libraries.
We highly recommend that you install Pip or Anaconda, which will manage the automatic installation of these Python libraries (along with their dependencies). If you are using Pip, you can use the following commands:
pip install matplotlib
pip install numpy
pip install scipy
pip install scikit-learn
pip install music21
pip install librosa
pip install torch torchvision
For those of you who have never coded in Python, here are a few interesting resources to get started.
Jupyter notebooks and lab
In order to ease following the exercises along with the course, we will rely on Jupyter notebooks. If you have never used a notebook before, we recommend that you look at their website to understand the concept. We also provide instructions to install JupyterLab, which is a more integrated version of notebooks. You can install it on your computer as follows (if you use pip):
pip install jupyterlab
Once it is installed, go to the folder where you cloned this repository and type
jupyter lab
0.1 - Datasets
In order to test our algorithms on audio and music data, we will work with several datasets, which should first be downloaded to your local computer from this link:
Type | Origin |
---|---|
Classification | MuscleFish dataset |
Music-speech | MIREX Recognition set |
Source separation | SMC Mirum dataset |
Speech recognition | CMU Arctic dataset |
Unzip the file and place the 00_Datasets folder along with the other code folders.
For the first parts of the tutorial, we will rely solely on the classification dataset. In order to facilitate the interactions, we provide the importDataset function, which allows you to import all the audio datasets used throughout the tutorials.
Exercise
- Launch the import procedure and check the corresponding structure
- Code a count function that prints the name and number of examples for each class (a minimal sketch is given after the expected output below)
Expected output
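As a starting point, here is a hedged sketch of such a count function. It assumes that the structure returned by importDataset exposes one class label per example (the actual field name may differ in your version of the baseline code), so treat it as a template rather than the reference solution.

```python
import collections

def count_per_class(labels):
    """Print the name and number of examples for each class.

    labels: iterable of class names, one per audio example.
    """
    counts = collections.Counter(labels)
    for class_name, n_examples in sorted(counts.items()):
        print(f"{class_name:20s} {n_examples:4d} examples")

# Hypothetical usage, assuming a dict-like structure with a "labels" field:
# count_per_class(data_struct["labels"])
```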
0.2 - Preprocessing
We will rely on a set of spectral transforms that provide a more descriptive view of the audio information. As most of these are beyond the scope of the machine learning course, we redirect you to the signal processing course proposed by Julius O. Smith.
The following functions, which compute various types of transforms, are provided as part of the basic package in the 00_Preprocessing folder:
File | Transform |
---|---|
stft.m | Short-term Fourier transform |
fft2barkmx.m | Bark scale transform |
fft2melmx.m | Mel scale transform |
fft2chromamx | Chroma vectors |
spec2cep.m | Cepstrum transform |
cqt.m | Constant-Q transform |
To avoid computing each of these by hand, we provide the following function, which applies the different transforms to a complete dataset.
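If you want to reproduce some of these transforms directly in Python, the librosa library (already listed in the dependencies) exposes equivalent computations. The snippet below is only an illustrative sketch for a single file; the file path is a placeholder, and the provided dataset-wide function remains the reference for the exercises.

```python
import librosa
import numpy as np

# Placeholder path: replace with any file from the classification dataset
y, sr = librosa.load("example.wav", sr=None)

# Short-term Fourier transform (magnitude)
stft_mag = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))

# Mel scale transform
mel_spec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048, hop_length=512)

# Chroma vectors
chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_fft=2048, hop_length=512)

# Constant-Q transform (magnitude)
cqt_mag = np.abs(librosa.cqt(y, sr=sr, hop_length=512))

# Cepstral representation (MFCC)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
```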
Exercise
- Launch the transform computation procedure and check the corresponding structure
- For each class, select a random element and plot its various transforms on a single figure (a plotting sketch is given after the expected output below). You should obtain plots similar to those shown afterwards.
- For each transform, try to identify the major pros and cons of its representation.
Expected output
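For the plotting question, one possible pattern is to stack the different representations as subplots of a single figure with librosa.display.specshow. This is a sketch only, reusing the variables computed in the previous snippet (stft_mag, mel_spec, chroma, cqt_mag, sr):

```python
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Convert magnitude/power spectrograms to decibels for readability
transforms = {
    "STFT (dB)": librosa.amplitude_to_db(stft_mag, ref=np.max),
    "Mel (dB)": librosa.power_to_db(mel_spec, ref=np.max),
    "Chroma": chroma,
    "CQT (dB)": librosa.amplitude_to_db(cqt_mag, ref=np.max),
}

fig, axes = plt.subplots(len(transforms), 1, figsize=(8, 10))
for ax, (name, matrix) in zip(axes, transforms.items()):
    librosa.display.specshow(matrix, sr=sr, hop_length=512, x_axis="time", ax=ax)
    ax.set_title(name)
plt.tight_layout()
plt.show()
```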
0.3 - Features
As you might have noted in the previous exercise, most spectral transforms have a very high dimensionality and may not be well suited to exhibit the relevant structure of the different classes. To that end, we provide a set of functions for computing several spectral features in the 00_Features folder; we redirect interested readers to this exhaustive article on spectral feature computation.
File | Feature |
---|---|
featureSpectralCentroid.m | Spectral centroid |
featureSpectralCrest.m | Spectral crest |
featureSpectralDecrease.m | Spectral decrease |
featureSpectralFlatness.m | Spectral flatness |
featureSpectralKurtosis.m | Spectral kurtosis |
featureSpectralRolloff.m | Spectral rolloff |
featureSpectralSkewness.m | Spectral skewness |
featureSpectralSlope.m | Spectral slope |
featureSpectralSpread.m | Spectral spread |
featureMFCC.m | Mel-Frequency Cepstral Coefficients (MFCC) |
Once again, we provide a function that computes the different features on a complete set. Note that, for each feature, we compute its temporal evolution in a vector along with its mean and standard deviation. We only detail the resulting data structure for a single feature (SpectralCentroid).
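As an illustration of what such a feature function computes, here is a hedged sketch of the spectral centroid (the frequency-weighted average of the magnitude spectrum in each frame), together with its mean and standard deviation. The actual featureSpectralCentroid implementation in the package may differ in windowing and normalization details.

```python
import numpy as np

def spectral_centroid(mag_spectrogram, sr, n_fft):
    """Frame-wise spectral centroid of a magnitude spectrogram.

    mag_spectrogram: array of shape (n_fft // 2 + 1, n_frames).
    Returns one centroid value (in Hz) per frame.
    """
    freqs = np.arange(mag_spectrogram.shape[0]) * sr / n_fft  # bin center frequencies
    norm = mag_spectrogram.sum(axis=0) + 1e-10                # avoid division by zero
    return (freqs[:, None] * mag_spectrogram).sum(axis=0) / norm

# Temporal evolution, then mean and standard deviation (as stored in the structure)
# centroid = spectral_centroid(stft_mag, sr, n_fft=2048)
# centroid_mean, centroid_std = centroid.mean(), centroid.std()
```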
Exercise
- Launch the feature computation procedure and check the corresponding structure
- This time, for each class, superimpose the plots of the various features on a single figure, along with a boxplot of their means and standard deviations. You should obtain plots similar to those shown afterwards.
- What conclusions can you draw on the discriminative power of each feature?
- Perform scatter plots of the mean features across the whole dataset, coloring the points by class (a minimal sketch is given after the expected output below).
- What conclusions can you draw on the discriminative power of the mean features?
Expected output
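For the scatter-plot question, a minimal matplotlib sketch could look like the following. It assumes you have gathered the per-file feature means into arrays; the variable names in the usage comment are placeholders, not part of the provided code.

```python
import matplotlib.pyplot as plt
import numpy as np

def scatter_mean_features(mean_x, mean_y, labels, x_name, y_name):
    """Scatter two mean features against each other, one color per class.

    mean_x, mean_y: arrays with one mean feature value per audio file.
    labels: array of class names, one per audio file.
    """
    mean_x, mean_y = np.asarray(mean_x), np.asarray(mean_y)
    labels = np.asarray(labels)
    for class_name in np.unique(labels):
        mask = labels == class_name
        plt.scatter(mean_x[mask], mean_y[mask], label=class_name, alpha=0.7)
    plt.xlabel(x_name)
    plt.ylabel(y_name)
    plt.legend()
    plt.show()

# Hypothetical usage with placeholder arrays:
# scatter_mean_features(centroid_means, flatness_means, labels,
#                       "Spectral centroid (mean)", "Spectral flatness (mean)")
```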
Expected output figures: Question 0.3.2 and Question 0.3.4.