Martin Cooke
No theory of speech perception can be considered complete without an explanation of how listeners are able to extract meaning from severely degraded forms of speech. I will begin with a brief overview of a century of research that has seen the development of many types of distorted speech, followed by some anecdotal evidence that automatic speech recognisers still have some way to go to match listeners' performance in this area. I will then describe the outcome of one recent study [1] and several ongoing studies into the detailed time course of a listener's response to distorted speech. These studies variously consider the rapidity of adaptation, whether adaptation can proceed only if words are recognised, the degree to which the response to one form of distortion is conditioned on prior experience with other forms, and the nature of adaptation in a language other than one's native tongue. Taken together, findings from these experiments suggest that listeners are capable of continuous and extremely rapid adaptation to novel forms of speech that differ greatly from the input that makes up the vast bulk of their listening experience. Whether big-data-based automatic speech recognition can offer a similar degree of flexibility remains an open question.
[1] Cooke, M., Scharenborg, O. and Meyer, B. (2022). The time course of adaptation to distorted speech. J. Acoust. Soc. Am. 151, 2636-2646. doi:10.1121/10.0010235
Martin Cooke is Ikerbasque Research Professor. After starting his career in the UK National Physical Laboratory, he worked at the University of Sheffield for 26 years before taking up his current position. His research has focused on analysing the computational auditory scene, devising algorithms for robust automatic speech recognition and investigating human speech perception. His interests also include the effects of noise on talkers as well as listeners, and second language listening in noise.