Articles

ViSQOL: an Objective Speech Quality Model

Andrew Hines, Technological University Dublin
J. Skoglund, Google, Inc
A. C. Kokaram, Google, Inc
N. Harte, University of Dublin, Trinity College

Document Type

Article

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Disciplines

1.2 COMPUTER AND INFORMATION SCIENCE, Computer Sciences

Publication Details

EURASIP Journal on Audio, Speech, and Music Processing, (2015) 2015:13

DOI 10.1186/s13636-015-0054-9

Abstract

This paper presents an objective speech quality model, ViSQOL, the Virtual Speech Quality Objective Listener. It is a signal-based, full-reference, intrusive metric that models human speech quality perception using a spectro-temporal measure of similarity between a reference and a test speech signal. The metric has been designed specifically to be robust to quality issues associated with Voice over IP (VoIP) transmission. This paper describes the algorithm and compares its quality predictions with those of the ITU-T standard metrics PESQ and POLQA for common problems in VoIP: clock drift, associated time warping, and playout delays. The results indicate that ViSQOL and POLQA significantly outperform PESQ, with ViSQOL competing well with POLQA. Extensive benchmarking against PESQ, POLQA, and simpler distance metrics using three speech corpora (NOIZEUS, E4, and the ITU-T P.Sup. 23 database) is also presented. These experiments benchmark performance across a wide range of quality impairments, including VoIP degradations, a variety of background noise types, speech enhancement methods, and SNR levels. The results and subsequent analysis show that both ViSQOL and POLQA have some performance weaknesses and under-predict perceived quality in certain VoIP conditions. Both are more widely applicable and more robust across conditions than PESQ or more trivial distance metrics. ViSQOL is shown to offer a useful alternative to POLQA in predicting speech quality in VoIP scenarios.

DOI

https://doi.org/10.1186/s13636-015-0054-9

Recommended Citation

Hines et al. (2015) ViSQOL: an Objective Speech Quality Model, EURASIP Journal on Audio, Speech, and Music Processing 2015:13 doi:10.1186/s13636-015-0054-9

Included in

Computer Engineering Commons
