Professor Jon Barker
PhD
School of Computer Science
Personal Chair
School Ethics Lead
Member of the Speech and Hearing (SpandH) research group
+44 114 222 1824
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
- Profile
Professor Jon Barker is a member of the Speech and Hearing Research Group. He has a first degree in Electrical and Information Sciences from Cambridge University, UK. After receiving a PhD from the University of Sheffield in 1999, he worked for some time at GIPSA-lab, Grenoble and IDIAP research institute in Switzerland before returning to Sheffield where he has had a permanent post since 2002.
His research interests lie in noise-robust speech processing. Key application areas include distant-microphone speech recognition, speech intelligibility prediction and improved speech processing for hearing-aid users.
- Research interests
Professor Barker’s research centres on machine listening and the computational modelling of human hearing. A recent focus has been modelling speech intelligibility, i.e. predicting whether a speech signal will be intelligible to a given listener.
This understanding will help us produce better signal processing for applications such as hearing aids and cochlear implants. Another strand of his work takes insights gained from human auditory perception and uses them to engineer robust automatic speech processing systems.
- Publications
Journal articles
- . Computer Speech & Language, 89, 101685-101685.
- . Data in Brief, 111199-111199.
- . The Journal of the Acoustical Society of America, 155(3_Supplement), A277-A277.
- . Frontiers in Psychology, 15, 1310176.
- . The Journal of the Acoustical Society of America, 153(3_supplement), A332-A332.
- . The Journal of the Acoustical Society of America, 153(3_supplement), A48-A48.
- . Data in Brief, 41.
- . IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 2968-2980.
- . The Journal of the Acoustical Society of America, 148(4), 2711-2711.
- . The Journal of the Acoustical Society of America, 145(2), EL136-EL141.
- . Journal of the Acoustical Society of America, 143(6), 523-529.
- . Speech Communication, 100, 58-68.
- . Speech Communication, 95, 127-136.
- . Computer Speech & Language, 46, 535-557.
- . Computer Speech and Language, 46, 605-626.
- . The Journal of the Acoustical Society of America, 141(5), 3693-3693.
- . Computer Speech & Language.
- . Circuits, Systems, and Signal Processing.
- . Journal of the Acoustical Society of America, 140(5), EL458-EL463.
- , 137-172.
- . 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings, 162-167.
- . IEEE Transactions on Audio, Speech and Language Processing, 21(3), 624-635.
- . IEEE Signal Processing Letters, 20(6), 563-566.
- . Computer Speech and Language.
- . Computer Speech and Language.
- . Computer Speech and Language.
- Combining speech fragment decoding and adaptive noise floor modelling. IEEE Transactions on Audio, Speech and Language Processing, 20, 818-827.
- Crowdsourcing for word recognition in noise. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 3049-3052.
- . COMPUT SPEECH LANG, 24(1), 94-111.
- . IEEE T AUDIO SPEECH, 17(3), 446-458.
- . SPEECH COMMUN, 50(4), 337-353.
- Improving source localisation in multi-source, reverberant conditions: exploiting local spectro-temporal location cues. Abstract for Acoust. Soc. Am. mtg.
- . J Acoust Soc Am, 123(1), 414-427.
- . SPEECH COMMUN, 49(12), 874-891.
- . SPEECH COMMUN, 49(5), 402-417.
- . SPEECH COMMUN, 49(5), 384-401.
- . J Acoust Soc Am, 120(5 Pt 1), 2421-2424.
- . IEEE T AUDIO SPEECH, 14(1), 58-67.
- . SPEECH COMMUN, 45(1), 5-25.
- . SPEECH COMMUN, 43(1-2), 123-142.
- . The Journal of the Acoustical Society of America, 113(4), 2230-2230.
- . Speech Communication, 27, 159-174.
- Is the sine-wave speech cocktail party worth attending? SPEECH COMMUNICATION, 27(3-4), 159-174.
- Modelling the recognition of sine-wave sentences. BRITISH JOURNAL OF AUDIOLOGY, 31(2), 112-113.
- . Journal of the Acoustical Society of America, 100, 2682-2682.
- Clarity: Machine Learning Challenges to Revolutionise Hearing Device Processing.
Chapters
- , New Era for Robust Speech Recognition (pp. 327-344). Springer International Publishing
- , New Era for Robust Speech Recognition (pp. 51-77). Springer International Publishing
- Crowdsourcing in Speech Perception. In Eskenazi M, Levow G-A, Meng H, Parent G & Suendermann D (Eds.), Crowdsourcing for Speech Processing (pp. 137-169). John Wiley and Sons
- In Virtanen T, Singh R & Raj B (Eds.), Techniques for Noise Robustness in Automatic Speech Recognition (pp. 371-398). Wiley
- Robust automatic speech recognition. In Wang D-L & Brown GJ (Eds.), Computational Auditory Scene Analysis: Principles, Algorithms and Applications (pp. 297-350). Wiley/IEEE Press
Conference proceedings papers
- . ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 14 April 2024 - 19 April 2024.
- . 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW) (pp 93-94), 14 April 2024 - 19 April 2024.
- Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement. European Signal Processing Conference (pp 421-425)
- . ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4 June 2023 - 10 June 2023.
- . ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4 June 2023 - 10 June 2023.
- The First Cadenza Signal Processing Challenge: Improving Music for Those With a Hearing Loss. CEUR Workshop Proceedings, Vol. 3528
- . ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 23 May 2022 - 27 May 2022.
- . ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 23 May 2022 - 27 May 2022.
- . ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Vol. 2022-May (pp 7372-7376)
- SNuC: The Sheffield Numbers Spoken Language Corpus. 2022 Language Resources and Evaluation Conference, LREC 2022 (pp 1978-1984)
- Predicting Speech Intelligibility for People with Hearing Loss: The Clarity Challenges. Internoise 2022 - 51st International Congress and Exposition on Noise Control Engineering
- . Interspeech 2021 (pp 691-695). Brno, Czechia, 30 August 2021 - 3 September 2021.
- . ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6 June 2021 - 11 June 2021.
- , Vol. 00 (pp 296-300)
- . ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6 June 2021 - 11 June 2021.
- . SoutheastCon 2021, 10 March 2021 - 13 March 2021.
- . ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4 May 2020 - 8 May 2020.
- Deep learning of articulatory-based representations and applications for improving dysarthric speech recognition. Speech Communication - 13th ITG-Fachtagung Sprachkommunikation (pp 331-335)
- . ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4 May 2020 - 8 May 2020.
- . ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4 May 2020 - 8 May 2020.
- . ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 12 May 2019 - 17 May 2019.
- . 2018 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, 15 April 2018 - 20 April 2018.
- . Proceedings of the Annual Conference of the International Speech Communication Association
- . 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 5 March 2017 - 9 March 2017.
- (pp 331-342)
- Source-filter separation of speech signal in the phase domain. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2015-January (pp 598-602)
- . 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 13 December 2015 - 17 December 2015.
- . 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 18 October 2015 - 21 October 2015.
- Investigating the Impact of Artificial Enhancement of Lip Visibility on the Intelligibility of Spectrally-Distorted Speech. FAAVSP-2015 (pp 93-98), 11 September 2015 - 13 September 2015.
- . 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 13 December 2015 - 17 December 2015.
- A framework for the evaluation of microscopic intelligibility models. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2015-January (pp 2558-2562)
- The effect of cochlear implant processing on speaker intelligibility: A perceptual study and computer model. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2015-January (pp 1566-1570)
- (pp 173-184)
- Speech pre-enhancement using a discriminative microscopic intelligibility model. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp 2068-2072)
- A fragment-decoding plus missing-data imputation system evaluated on the 2nd CHiME challenge. Proceedings of the 2nd CHiME Workshop on Machine Listening in Multisource Environments (pp 53-58)
- The second ‘CHiME’ Speech Separation and Recognition Challenge: Datasets, tasks and baselines. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE
- Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition. 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol. 3 (pp 2637-2640)
- . ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp 4693-4696)
- Recent advances in fragment-based speech recognition in reverberant multisource environments. Proceedings of ISCA Workshop on Machine Listening in Multisource Environments (pp 68-73)
- Binaural cues for fragment-based speech recognition in reverberant multisource environments. Proceedings of INTERSPEECH 2011 (pp 1657-1660)
- Incorporating localisation cues in a fragment decoding framework for distant binaural speech recognition. IEEE Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA’11) (pp 207-212)
- Crowdsourcing for word recognition in noise. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 (pp 3056-+)
- Binaural cues for fragment-based speech recognition in reverberant multisource environments. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp 1657-1660)
- . ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp 4808-4811)
- Distant microphone speech recognition in a noisy indoor environment: combining soft missing data and speech fragment decoding. ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition
- Robust Formant Estimation: Increasing the Reliability by Comparison among three Methods. Proceedings of the International Conference on Circuits, Systems and Signals, (Recent Advances in Circuits, Systems and Signals) (pp 341-344)
- An Approach to Vocal Tract Length Normalization by Robust Formant Estimation. Proceedings of the International Conference on Circuits, Systems and Signals, (Recent Advances in Circuits, Systems and Signals) (pp 345-348)
- Speaker turn tracking with mobile microphones: Combining location and pitch information. European Signal Processing Conference (pp 954-958)
- . 2010 8th International Conference on Communications, COMM 2010 (pp 79-82)
- The CHiME corpus: A resource and a challenge for computational hearing in multisource environments. Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp 1918-1921)
- . Proceedings of SPIE - The International Society for Optical Engineering, Vol. 7745
- Using location cues to track speaker changes from mobile, binaural microphones. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 (pp 124-127)
- A SPEECH FRAGMENT APPROACH TO LOCALISING MULTIPLE SPEAKERS IN REVERBERANT ENVIRONMENTS. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS (pp 4593-4596)
- The CAVA corpus: synchronised stereoscopic and binaural datasets with head movements. ICMI (pp 109-116)
- Audio-visual speech fragment decoding. Proceedings of the International Conference on Auditory-Visual Speech Processing (AVSP 2007)
- Applying word duration constraints by using unrolled HMMs. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 (pp 353-356)
- Integrating pitch and localisation cues at a speech fragment level. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 (pp 2752-2755)
- Speech separation based on the statistics of binaural auditory features. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Vol. 5
- Recognition of reverberant speech using full cepstral features and spectral missing data. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Vol. 1
- Audio-Visual Speech Recognition in the Presence of a Competing Speaker. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 (pp 1292-1295)
- A Multipitch Tracker for Monaural Speech Segmentation. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 (pp 1678-1681)
- Recent advances in speech fragment decoding techniques. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 (pp 85-88)
- Recognition of reverberant speech using full cepstral features and spectral missing data. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13 (pp 289-292)
- Speech separation based on the statistics of binaural auditory features. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13 (pp 5807-5810)
- Recognition of reverberant speech using full cepstral features and spectral missing data. 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol I, Proceedings (pp 289-292). Toulouse, FRANCE, 14 May 2006 - 19 May 2006.
- Speech separation based on the statistics of binaural auditory features. 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol V, Proceedings (pp 949-952)
- Soft harmonic masks for recognising speech in the presence of a competing speaker. 9th European Conference on Speech Communication and Technology (pp 2641-2644)
- Binaural feature selection for missing data speech recognition. 9th European Conference on Speech Communication and Technology (pp 1269-1272)
- Recognising speech in the presence of a competing speaker using a 'speech fragment decoder'. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5 (pp 425-428)
- Tracking facial markers with an adaptive marker collocation model. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5 (pp 665-668)
- Mask estimation based on sound localisation for missing data speech recognition. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5 (pp 537-540)
- A Missing Data Approach for Robust Automatic Speech Recognition in the Presence of Reverberation. Proceedings of the 18th International Congress on Acoustics (ICA) (pp 449-452)
- Temporal integration as a consequence of multi-source decoding. Proceedings of the ISCA Workshop on the Temporal Integration in the Perception of Speech (TIPS)
- Missing data speech recognition in reverberant conditions. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS (pp 65-68)
- Combining bottom-up and top-down constraints for robust ASR: The multisource decoder. Proceedings of Workshop on consistent and reliable acoustic cues for sound analysis (CRAC-01)
- From Missing Data to Maybe Useful Data: Soft Data Modelling for Noise Robust ASR. Proceedings of the Workshop on Innovation in Speech Processing (WISP 2001)
- Handling Missing and Unreliable Information in Speech Recognition. Proceedings of the 8th International Workshop on Artificial Intelligence and Statistics (AISTATS-2001)
- Linking Auditory Scene Analysis and Robust ASR by Missing Data Techniques. Proceedings of the Workshop on Innovation in Speech Processing (WISP 2001)
- Robust ASR based on clean speech models: an evaluation of missing data techniques for connected digit recognition in noise. INTERSPEECH (pp 213-217)
- A neural oscillator sound separator for missing data speech recognition. Proceedings of the International Joint Conference on Neural Networks, Vol. 4 (pp 2907-2912)
- A neural oscillator sound separator for missing data speech recognition. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS (pp 2907-2912)
- Soft decisions in missing data techniques for robust automatic speech recognition. INTERSPEECH (pp 373-376)
- Decoding speech in the presence of other sound sources. INTERSPEECH (pp 270-273)
- Evidence of correlation between acoustic and visual features of speech. Proc. ICPhS ’99
- Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models. Proceedings of the ISCA Workshop on Auditory-Visual Speech Processing (AVSP) ’99
- Is primitive AV coherence an aid to segment the scene?. Proceedings of the ISCA Workshop on Auditory-Visual Speech Processing (AVSP) ’98
- Acoustic confidence measures for segmenting broadcast news. ICSLP
- Modelling the recognition of spectrally reduced speech. EUROSPEECH
- Non-intrusive speech intelligibility prediction for hearing-impaired users using intermediate ASR features and human memory models. 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2024). Seoul, Korea, 14 April 2024 - 14 April 2024.
- . Interspeech 2022
- . Interspeech 2022
- . Interspeech 2022
- . Interspeech 2022
- . Interspeech 2022
- . Interspeech 2022
- . Interspeech 2021
- . Interspeech 2021
- . Interspeech 2021
- . Interspeech 2020
- . Interspeech 2019
- . Interspeech 2018
- . Interspeech 2018
- . Interspeech 2018
- . Interspeech 2017
- . Interspeech 2017
- . Interspeech 2016
- . Interspeech 2016
- . Interspeech 2016
- . Interspeech 2016
- . Southcon/96 Conference Record
- Autoencoder bottleneck features with multi-task optimisation for improved continuous dysarthric speech recognition. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
- Towards Solving the Cocktail Party Problem through Primitive Grouping and Model Combination. Proceedings of Forum Acusticum
Posters
- A comparison of audiovisual and auditory-only training on the perception of spectrally-distorted speech. 18th International Congress of Phonetic Sciences.
Theses / Dissertations
- The relationship between auditory organisation and speech perception: Studies with spectrally reduced speech.
Other
- POPeye: Real-time, binaural sound source localisation on an audio-visual robot-head.
- Simultaneous Tracking of Perceiver Movements and Speaker Changes Using Head-Centered, Binaural Data.
Preprints
- The first Cadenza challenges: using machine learning competitions to improve music for listeners with a hearing loss, arXiv.
- , arXiv.
- , arXiv.
- , arXiv.
- Overview of the 2023 ICASSP SP Clarity Challenge: Speech Enhancement for Hearing Aids, arXiv.
- , arXiv.
- The Cadenza ICASSP 2024 Grand Challenge, arXiv.
- , arXiv.
- CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings, arXiv.
- DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation, arXiv.
- The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines, arXiv.
- Grants
Current grants
- EnhanceMusic: , EPSRC, 06/2022 to 11/2026, £377,568, as PI
- , EPSRC, 10/2019 to 10/2025, £480,416, as PI
- , EPSRC, 04/2019 to 09/2027, £5,508,850, as Co-PI
- TAPAS: , EC H2020, 11/2017 to 06/2022, £468,000, as Co-PI
Previous grants
- Deep learning of articulatory-based representations of dysarthric speech, Google, 02/2016 to 01/2017, £46,624, as Co-PI
- , EPSRC, 10/2015 to 09/2018, £125,493, as PI
- INSPIRE: Investigating Speech In Real Environments, EC FP7, 01/2012 to 12/2015, £308,473, as PI
- EPSRC, 07/2010 to 09/2010, £9,978, as PI
- CHIME: Computational Hearing in Multisource Environments, EPSRC, 06/2009 to 05/2012, £326,245, as PI
- Audio-Visual Speech Recognition in the Presence of Non-Stationary Noise, EPSRC, 02/2005 to 05/2007, £116,853, as PI
- Professional activities and memberships
- Member of the Speech and Hearing (SpandH) research group
- Co-founder of the CHiME series of International Workshops and Robust Speech Recognition Evaluations, 2011 onwards.
- EURASIP Best Paper Award, 2009; for best paper in Speech Communication during 2005.
- ISCA Best Paper Award, 2008; for best paper in Speech Communication 2005-2007.