K. Vijaya Kumar, P. Kalyanchakravarthi, D. Suresh
Gaussian Processes (GPs) are Bayesian nonparametric models that are increasingly popular for their ability to capture highly nonlinear data relationships in tasks such as dimensionality reduction, time series analysis, and novelty detection, as well as classical regression and classification. In this paper, we investigate the feasibility and applicability of GP models for music genre classification and music emotion estimation, two of the main tasks in the music information retrieval (MIR) field. So far, the support vector machine (SVM) has been the dominant model used in MIR systems. Like SVMs, GP models are based on kernel functions and Gram matrices; in contrast, they produce truly probabilistic outputs with an explicit degree of prediction uncertainty. In addition, there exist algorithms for GP hyperparameter learning, something the SVM framework lacks. We built two systems, one for music genre classification and another for music emotion estimation, using both SVM and GP models, and compared their performance on two databases of similar size. In all cases, the music audio signal was processed in the same way, and the effects of different feature extraction methods and their combinations were also investigated. The evaluation experiments showed that, in both the music genre classification and music emotion estimation tasks, the GP performed consistently better than the SVM. The GP achieved a 13.6% relative reduction in genre classification error and up to an 11% absolute increase in the coefficient of determination in the emotion estimation task.
Music Genre, Emotion Recognition, Gaussian Processes
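
The abstract's contrast between GP and SVM outputs can be made concrete with a toy example. The following is a minimal, dependency-free sketch of GP regression on one-dimensional data, not the system described in the paper. The squared-exponential kernel, the fixed hyperparameters (`length_scale`, `noise_var`), the toy sine data, and all function names are assumptions chosen for illustration. It shows that a GP returns a predictive variance alongside each predicted value, and that the variance grows for inputs far from the training data.

```python
import math

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_lower_transposed(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    """Return (mean, variance) of the zero-mean GP posterior at one test input."""
    n = len(x_train)
    # Gram matrix of the training inputs, with observation noise on the diagonal.
    K = [[rbf_kernel(x_train[i], x_train[j], length_scale)
          + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    # alpha = K^{-1} y, computed through the Cholesky factor.
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))
    k_star = [rbf_kernel(x, x_test, length_scale) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    # Predictive variance of the latent function: k(x*, x*) - k*^T K^{-1} k*.
    v = solve_lower(L, k_star)
    var = rbf_kernel(x_test, x_test, length_scale) - sum(vi * vi for vi in v)
    return mean, var

# Toy data (hypothetical): four noiseless samples of a sine curve.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [math.sin(x) for x in x_train]

near = gp_predict(x_train, y_train, 1.5)   # test input inside the training range
far = gp_predict(x_train, y_train, 10.0)   # test input far from any training point
```

Here `near[1]` is smaller than `far[1]`: the model is more certain close to the observed data and falls back to the prior variance far from it. A standard SVM regressor would return only the point estimate. In practice the hyperparameters would be learned, for example by maximizing the marginal likelihood, rather than fixed as in this sketch.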