This page introduces the 2017-18 projects for the ATIAM machine learning lessons. Each subject targets various notions seen in class. We will re-introduce the basic mechanisms behind each approach, along with the fundamental papers needed to understand them, and then target specific applications to musical or audio data.
We detail here the global instructions that are common to all projects.
- Groups of 4 to 5 students per project
- Projects are coded in Python (cf. Coding style section underneath)
- All projects should be accompanied by a short report in English
- The report should be at most 8 pages long and follow the format of scientific papers
- Reports must be written in LaTeX with a given format style
- Each project has a referent PhD along with myself
All projects will be evaluated by the referent PhD, myself and another randomly picked PhD to ensure equity across different projects. The project should be delivered as an archive containing 3 folders:
code/ : Should contain your well-documented code (cf. Coding style section) along with simple scripts that demonstrate the use of the developed methodologies. We recommend that you organize your code in modules.
report/ : Should contain your report in PDF format along with the LaTeX source and any figures.
toy/ : Should contain a well-documented toy dataset, along with the procedural scripts to generate it. You can create another PDF document describing the set if you feel the need, otherwise detail it in your report.
All your files should be packed in a zip file unfolding to a folder named [ATIAM][ML2017] (LastName of all students).zip
Deadline : 25/12/2017 - 23h59
Submission : esling [at] ircam (dot) fr
Formatting : mail with subject : [ATIAM][ML2017] (Last names of all students involved)
Evaluation grid: This generic grid will be applied and sub-grids will be modulated for each subject.
(6 pts) - Report: content and style
(6 pts) - Toy dataset: quality and completeness of the dataset
(8 pts) - Code: accuracy, evaluation and coding style
Code and style
We will provide small reference codes for each project if needed. This code will contain helper functions that relieve you of the burden of data import and other side implementations. The code is in Python and relies heavily on the concept of code sections, which allow you to evaluate only part of the code (to avoid running long import tasks multiple times and to concentrate on the question at hand).
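As an illustration of code sections, a script can be split with `# %%` cell markers, which several editors (for instance Spyder or VS Code) recognize and can run one cell at a time. Whether the reference code uses this exact marker is an assumption, and `load_data` is a hypothetical stand-in for a slow import step.

```python
# %% Section 1: slow data import (run once)
import numpy as np


def load_data(n_examples=100, n_features=8, seed=0):
    """Hypothetical stand-in for a long import task."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_examples, n_features))


data = load_data()

# %% Section 2: the question at hand (re-run as often as needed)
feature_means = data.mean(axis=0)
print(feature_means.shape)
```

Run as a plain script, the markers are ordinary comments and both sections execute in order.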
We highly recommend that you install pip, which will manage the automatic installation of the required Python libraries (along with their dependencies).
Coding style
Your code must follow the PEP 8 coding style recommendations.
Each folder represents a module, so you should make sure that everything related to module definition is in place:
- Write an __init__.py file
- Check that the documentation inside is valid
- Always document any new functionality
- Implement examples in a root-based script
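The module checklist above can be sketched as follows. This is a minimal, self-contained illustration that builds a throwaway package in a temporary directory; the package name `toytools` and the function `double` are hypothetical and not part of the course material.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package named `toytools` (hypothetical name).
root = Path(tempfile.mkdtemp())
package = root / "toytools"
package.mkdir()
(package / "synthesis.py").write_text(
    'def double(x):\n'
    '    """Return twice the input."""\n'
    '    return 2 * x\n'
)
# __init__.py marks the folder as a package and exposes its public API.
(package / "__init__.py").write_text(
    '"""Toy package used to illustrate module definition."""\n'
    'from .synthesis import double\n'
    '__all__ = ["double"]\n'
)

# A root-level example script only needs to import the package.
sys.path.insert(0, str(root))
toytools = importlib.import_module("toytools")
result = toytools.double(21)
print(result)
```

In a real project the two files would of course be written by hand; the point is only that the root script imports from the package rather than from individual files.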
Code documentation
All code should be thoroughly documented at all levels. To facilitate a common documentation, you are required to follow the NumPy documentation style (numpydoc).
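For reference, a docstring in the NumPy style could look like the sketch below. The function `normalize` is a hypothetical example chosen for illustration; only the section layout (Parameters, Returns, Examples) reflects the documentation style itself.

```python
import numpy as np


def normalize(signal):
    """Scale a signal so that its maximum absolute value is one.

    Parameters
    ----------
    signal : array_like
        Non-empty input signal of any shape.

    Returns
    -------
    numpy.ndarray
        Array with the same shape as `signal` and values in [-1, 1].
        An all-zero input is returned unchanged.

    Examples
    --------
    >>> normalize(np.array([1.0, -2.0])).tolist()
    [0.5, -1.0]
    """
    signal = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(signal))
    if peak == 0:
        return signal.copy()
    return signal / peak
```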
Unit testing is optional for the project but highly recommended (for this and for your future projects in any case). Every time you add a new independent functionality to the toolbox, you should develop a set of unit tests to ensure that all the functions work correctly and that future modifications will not break previous developments.
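A minimal sketch of what such tests might look like, using the standard `unittest` module. The function under test, `midi_to_hz`, is a hypothetical helper introduced only for this example.

```python
import unittest


def midi_to_hz(midi_note):
    """Convert a MIDI note number to a frequency in Hz (note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


class TestMidiToHz(unittest.TestCase):
    def test_reference_pitch(self):
        self.assertAlmostEqual(midi_to_hz(69), 440.0)

    def test_octave_doubles_frequency(self):
        self.assertAlmostEqual(midi_to_hz(81), 2 * midi_to_hz(69))


# Run the tests programmatically so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMidiToHz)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each new function gets its own small test case, so a later modification that breaks an earlier behavior is caught as soon as the suite is re-run.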
If you do not know the principle of unit testing, you can read
We detail here the various subjects (organized alphabetically by the last name of the referent PhD). For each, you can find a detailed PDF version in the following list, and we summarize the abstracts underneath.
Disentangling variation factors in audio samples
Observations of complex data in the real world might be caused by several underlying generative factors that account for specific perceptual features. However, these factors are usually entangled and cannot be identified directly from the data. Modeling such factors could generalize the learning of complex concepts through compositions of simpler abstractions. This enables us to understand the inner structure of the data, to process it efficiently and to control meaningful generative processes, which may eventually open the way to artificial creativity. An extensive body of research has been carried out in the field of computer vision through Variational Auto-Encoders. The goal of this project is to extend these recent approaches to sound and music data, by defining a procedural toy dataset of sound synthesis and then applying the recent $\beta$-SCAN approaches to these data.
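To make the idea of a procedural toy dataset of sound synthesis concrete, here is one possible sketch, assuming simple additive synthesis with three known generative factors (fundamental frequency, number of harmonics, amplitude decay). The factor choices, ranges and function names are assumptions for illustration, not the dataset expected by the project.

```python
import numpy as np


def synthesize(f0, n_harmonics, decay, sr=16000, duration=0.25):
    """Additive synthesis: harmonics of f0 with geometrically decaying amplitudes."""
    t = np.arange(int(sr * duration)) / sr
    harmonics = np.arange(1, n_harmonics + 1)
    amplitudes = decay ** (harmonics - 1)
    partials = amplitudes[:, None] * np.sin(2 * np.pi * f0 * harmonics[:, None] * t)
    signal = partials.sum(axis=0)
    return signal / np.max(np.abs(signal))  # peak-normalize


def make_toy_dataset(n_samples=32, seed=0):
    """Return (audio, factors); each row of factors is (f0, n_harmonics, decay)."""
    rng = np.random.default_rng(seed)
    factors = np.column_stack([
        rng.uniform(110.0, 440.0, n_samples),  # fundamental frequency in Hz
        rng.integers(1, 9, n_samples),         # number of harmonics (1 to 8)
        rng.uniform(0.3, 0.9, n_samples),      # amplitude decay per harmonic
    ])
    audio = np.stack([synthesize(f0, int(k), d) for f0, k, d in factors])
    return audio, factors


audio, factors = make_toy_dataset()
print(audio.shape, factors.shape)
```

Because the factors that generated each sound are stored alongside the audio, a learned representation can later be checked against them.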
Generation of chord progressions and inference in jazz
Symbolic music generation is a field that has been widely explored through Hidden Markov Models (HMM). For instance, Paiement, Bengio and Eck published a probabilistic model for melodic prediction using chord instantiation. Nevertheless, HMMs are bounded by their order of complexity. In contrast, deep learning techniques are capable of computing highly non-linear functions and can extract information from complex data. Therefore, most recent works in symbolic music generation are now based on Neural Networks (NN), and particularly on Recurrent Neural Networks (RNN).
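To illustrate the limitation of fixed-order models, the sketch below samples a chord progression from a plain first-order Markov chain, which is a simplification of an HMM (there are no hidden states here). The chord vocabulary and transition probabilities are invented for the example. Each chord depends only on the previous one, so longer-range structure cannot be captured.

```python
import numpy as np

chords = ["Dm7", "G7", "Cmaj7", "A7"]
# Hypothetical transition probabilities (each row sums to 1), loosely ii-V-I.
transitions = np.array([
    [0.0, 0.9, 0.0, 0.1],  # from Dm7
    [0.0, 0.0, 0.9, 0.1],  # from G7
    [0.3, 0.0, 0.3, 0.4],  # from Cmaj7
    [0.9, 0.1, 0.0, 0.0],  # from A7
])


def sample_progression(length, start="Dm7", seed=0):
    """Sample `length` chords; each depends only on the previous chord."""
    rng = np.random.default_rng(seed)
    state = chords.index(start)
    progression = [chords[state]]
    for _ in range(length - 1):
        state = rng.choice(len(chords), p=transitions[state])
        progression.append(chords[state])
    return progression


progression = sample_progression(8)
print(progression)
```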
Latent representations for real-time synthesis space exploration
Generative systems are machine-learning models whose training is based on two simultaneous optimization tasks. The first is to build a latent space that provides a low-dimensional representation of the data, possibly subject to various regularizations and constraints. The second is the reconstruction of the original data through the sampling of this latent space. These systems are very promising because their latent space is a high-level, “over-compressed” representation that can be used as an intermediate space for several tasks, such as visualization, measurements, or classification. The main goal of this project is to develop variational models to find a generative sound synthesis space, where each point of this space corresponds to new data content that comes from the high-level understanding of the input data.
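The two optimization tasks can be written down compactly for one common variational model. The sketch below assumes a Gaussian latent space with a standard normal prior and a squared-error reconstruction term; the function names and the `beta` weight on the regularization are choices made for this illustration, and the encoder and decoder networks are left out.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), one value per example."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def vae_loss(x, x_reconstructed, mu, log_var, beta=1.0):
    """Mean over the batch of reconstruction error plus weighted regularization."""
    reconstruction = np.sum((x - x_reconstructed) ** 2, axis=-1)
    return np.mean(reconstruction + beta * kl_to_standard_normal(mu, log_var))


rng = np.random.default_rng(0)
mu, log_var = np.zeros((4, 2)), np.zeros((4, 2))
z = reparameterize(mu, log_var, rng)
print(z.shape)
```

The regularization term shapes the latent space, while the reconstruction term ensures that sampled points decode back to plausible data.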
Embedding music for automatic composition spaces
This project aims to develop new representations for symbolic music generation and automatic composition. Whereas the previous approaches are based on known mathematical rules, you will try to develop a more empirical model through machine learning. Your goal is to represent musical symbols in a space that carries semantic relationships between them, called an embedding space. This approach makes it possible to extract new descriptive dimensions that may be relevant for music analysis and generation. To that end, you will use Convolutional Neural Networks in order to capture features of the piano-roll representation of musical pieces. First, you will propose a toy dataset that follows some semantic rules in order to evaluate your embedding spaces. To that end, you will need to define which contextual rules could mimic semantic relationships in music. Then, you will extend these approaches by using Recurrent Neural Networks to train your models with a symbolic prediction task.
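As a reminder of what a piano-roll representation looks like, the sketch below converts a short list of notes into a binary pitch-by-time matrix, the kind of 2D input a convolutional network can process. The note format `(pitch, onset_step, duration_steps)` and the function name are assumptions made for this example.

```python
import numpy as np


def notes_to_piano_roll(notes, n_steps, n_pitches=128):
    """Build a binary (n_pitches, n_steps) matrix from (pitch, onset, duration) notes."""
    roll = np.zeros((n_pitches, n_steps), dtype=np.uint8)
    for pitch, onset, duration in notes:
        roll[pitch, onset:onset + duration] = 1
    return roll


# C major arpeggio (C4, E4, G4) followed by the full chord, in MIDI pitches.
notes = [(60, 0, 2), (64, 2, 2), (67, 4, 2), (60, 6, 2), (64, 6, 2), (67, 6, 2)]
roll = notes_to_piano_roll(notes, n_steps=8)
print(roll.shape)
```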