April 7, 2017

Using Machine Learning To Characterize Singing Styles

Before written language was invented, different cultures used music to communicate with each other. Investigating how these cultures exchanged musical styles is a major research area in the field of comparative musicology, and its scholars are becoming increasingly reliant on data science and machine learning techniques to help answer their research questions.

For example, in “Towards The Characterization of Singing Styles in World Music”, Juan Pablo Bello, an affiliated faculty member at CDS and Professor of Music Technology at NYU Steinhardt, and Dr. Simon Dixon from Queen Mary University (UK), together with their doctoral students Rachel Bittner and Maria Panteli, used machine learning to investigate the similarities and differences between singing styles in folk and traditional music around the world.

For this investigation, Bello and his team decided to focus on a single musical element: vocal pitch, which comprises smaller markers like vibrato (a regular, pulsating variation in pitch), melisma (singing one syllable across a range of notes), and slow or fast syllabic singing.

They began by training a Random Forest Classifier (a popular machine learning algorithm) to identify pitch contours of the singing voice in different audio recordings, and to separate these from non-vocal contours. They used 62 tracks from MedleyDB containing lead vocals as a training set, and also worked with a larger dataset of 2,808 audio clips from the Smithsonian Folkways Recordings collection, which contains music from 50 different countries and 28 different languages.
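
To give a feel for how a random forest separates vocal from non-vocal contours, here is a minimal sketch in plain Python. It is not the team's implementation: the two features (pitch deviation in cents and contour duration in seconds), the synthetic and deliberately well-separated data, and the helper names train_stump, train_forest and predict are all assumptions made for illustration. Each "tree" is only one split deep, a drastic simplification; in practice a library class such as scikit-learn's RandomForestClassifier would be used instead.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(X, y, feature_ids):
    """Fit a depth-one tree: the single threshold split with the fewest errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue
            left_label, right_label = majority(left), majority(right)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: always predict the majority label
        label = majority(y)
        return (feature_ids[0], float("inf"), label, label)
    return best[1:]


def train_forest(X, y, n_trees=25, max_features=1, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(n), k=n)                    # sample with replacement
        feats = rng.sample(range(n_features), max_features)  # random feature subset
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Classify one contour by majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Synthetic contours (assumed values): [pitch deviation in cents, duration in seconds].
data_rng = random.Random(1)
X = ([[data_rng.uniform(40, 80), data_rng.uniform(0.8, 2.0)] for _ in range(20)]
     + [[data_rng.uniform(0, 15), data_rng.uniform(0.1, 0.5)] for _ in range(20)])
y = ["vocal"] * 20 + ["non-vocal"] * 20

forest = train_forest(X, y)
print(predict(forest, [60.0, 1.5]))
print(predict(forest, [2.0, 0.12]))
```

The two ingredients that make this a random forest rather than a single decision rule are the bootstrap resampling of the training contours and the random choice of features for each tree; the final label is whatever most trees vote for.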

Then, they created a dictionary of singing style descriptors. With the dictionary in place, they could classify the audio clips and use k-means clustering (an unsupervised method) to identify which clips were the most similar in terms of vocal pitch.
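
The clustering step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the study's code: each clip is assumed to have already been summarised as the share of its contours matching three hypothetical dictionary entries (vibrato, melisma, syllabic), the six clips and their values are invented, and the centroids are seeded naively from the first k clips, whereas real implementations typically use random or k-means++ seeding.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain k-means: alternate between assigning points and updating centroids."""
    centroids = [list(p) for p in points[:k]]  # naive seeding (assumption)
    assignments = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:  # nothing moved, so we have converged
            break
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Invented clips: share of contours matching [vibrato, melisma, syllabic].
clips = [
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.90, 0.05, 0.05],
    [0.10, 0.20, 0.70],
    [0.20, 0.10, 0.70],
    [0.05, 0.15, 0.80],
]

labels, centroids = kmeans(clips, k=2)
print(labels)
```

Clips that end up with the same label have similar descriptor profiles, which is the sense in which a cluster groups recordings with similar singing styles. No labels are supplied during training, which is what makes the method unsupervised.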

Their fascinating study found that the majority of clusters represent recordings from neighboring countries, or from similar cultures or languages. For example, cluster seven contained music from predominantly European cultures, while clusters two and five contained music from African and Caribbean cultures. In the future, the team hopes to evaluate the singing style clusters in more detail with some help from musicology experts.

by Cherrie Kwok