Ensemble audio segmentation for radio and television programmes

Title: Ensemble audio segmentation for radio and television programmes
Publication Type: Journal Article
Year of Publication: 2016
Authors: López Otero, P, Docío Fernández, L, García Mateo, C
Journal: Multimedia Tools and Applications
Volume: 76
Number: 5
Pagination: 7421-7444
Date Published: 03/2016
ISSN: 1380-7501
Abstract: State-of-the-art audio segmentation strategies obtain good results on simple tasks, but their performance degrades when segmenting real-world material such as radio and television programmes; this issue can be partially solved by fusing different audio segmentation strategies. Hence, this paper presents a framework for decision-level fusion in the audio segmentation task. First, the class-conditional probabilities of each audio segmentation strategy are estimated from a confusion matrix obtained by performing audio segmentation on a training dataset. Performance measures are extracted from these class-conditional probabilities and used to compute different estimates of each classifier's reliability; specifically, reliability estimates based on precision, recall, accuracy, F-score and mutual information are proposed. These reliability estimates are used as weights in a weighted majority voting fusion strategy. The validity of the proposed fusion scheme and reliability estimates was assessed in the framework of the Albayzin 2010, 2012 and 2014 audio segmentation evaluations, which consisted of segmenting collections of radio and television programmes. The experimental results showed that this simple fusion strategy improves on the performance achieved by the individual audio segmentation strategies and by other well-known decision-level fusion strategies.
DOI: 10.1007/s11042-016-3386-2
Citation Key: 585
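
The fusion scheme summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the reliability of a classifier for a given output class is its precision for that class, computed from a training confusion matrix indexed as confusion[true][predicted]. The function names precision_weights and weighted_majority_vote, and the toy confusion matrices, are hypothetical.

```python
from collections import defaultdict


def precision_weights(confusion):
    """Per-class precision from a square confusion matrix.

    Assumption: confusion[i][j] counts training frames whose true class
    is i and whose predicted class is j. Precision for class j is the
    diagonal count divided by the column total.
    """
    n = len(confusion)
    weights = []
    for j in range(n):
        column_total = sum(confusion[i][j] for i in range(n))
        weights.append(confusion[j][j] / column_total if column_total else 0.0)
    return weights


def weighted_majority_vote(decisions, weights):
    """Fuse one decision per classifier into a single class label.

    decisions[k] is the class chosen by classifier k for one frame;
    weights[k][c] is the reliability of classifier k when it outputs c.
    The class with the largest summed reliability wins.
    """
    scores = defaultdict(float)
    for k, predicted in enumerate(decisions):
        scores[predicted] += weights[k][predicted]
    return max(scores, key=scores.get)


# Toy two-class example (made-up counts, for illustration only).
confusions = [
    [[90, 10], [30, 70]],  # a fairly reliable classifier
    [[40, 60], [60, 40]],  # a weak classifier
    [[45, 55], [55, 45]],  # another weak classifier
]
weights = [precision_weights(c) for c in confusions]

# The reliable classifier says class 1, the two weak ones say class 0.
fused = weighted_majority_vote([1, 0, 0], weights)
print(fused)
```

In this toy case an unweighted majority vote would pick class 0, whereas the precision-weighted vote follows the single more reliable classifier. The other reliability estimates named in the abstract (recall, accuracy, F-score, mutual information) would replace precision_weights under the same voting rule.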