Uncertainty Calibration for Deep Audio Classifiers

Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao

September 2022

Reliability diagrams of the ResNet-50 architecture and four calibration methods

Abstract

Although deep neural networks (DNNs) have achieved tremendous success in audio classification tasks, their uncertainty calibration is still under-explored. A well-calibrated model should be accurate when it is certain about its prediction and indicate high uncertainty when it is likely to be inaccurate. In this work, we investigate uncertainty calibration for deep audio classifiers. In particular, we empirically study the performance of four popular calibration methods on audio classification datasets: (i) Monte Carlo dropout, (ii) ensemble, (iii) focal loss, and (iv) spectral-normalized Gaussian process (SNGP). To this end, we evaluate (i–iv) on the tasks of environmental sound classification and music genre classification. Results indicate that uncalibrated deep audio classifiers may be over-confident, and that SNGP performs best and is very efficient on the two datasets used in this paper.
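To make the notion of calibration above concrete, the following is a minimal sketch of the expected calibration error (ECE), a common summary of the gap that a reliability diagram visualizes. The abstract does not say which metric the authors report, so the choice of ECE, the function name `expected_calibration_error`, and the equal-width binning with `n_bins=10` are assumptions for illustration only, not the paper's evaluation code.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Hypothetical helper: ECE with equal-width confidence bins.

    confidences: top-class probability of each prediction, in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Interior bin edges; bin b covers (b / n_bins, (b + 1) / n_bins].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        accuracy = correct[mask].mean()
        avg_confidence = confidences[mask].mean()
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += mask.mean() * abs(accuracy - avg_confidence)
    return ece


# An over-confident toy classifier: 90% confidence, 50% accuracy.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
```

A well-calibrated model yields a value near zero, while the over-confident toy case above yields a gap of roughly 0.4, matching the difference between its stated confidence and its actual accuracy.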

Publication

In 23rd Annual Conference of the International Speech Communication Association


Speech Audio

Shijing Si

Researcher

Jianzong Wang

Honorary Director