Vlog: Do face coverings affect identifying voices? A small experiment using VOCALISE and PHONATE
In recent months of 2020, like many others around the world, we have been adjusting to the new normal of wearing masks in supermarkets and other public spaces. We were (mildly) annoyed that some biometric identification, like face recognition, doesn’t quite work when wearing masks. This made us wonder how well voice biometric solutions work when speakers are wearing masks, so we decided to run a small experiment to find out.
Over the last few weeks, we have been performing some small-scale tests of our VOCALISE and PHONATE software against speech spoken from behind a mask. We have found our systems to be quite robust to masked speech – they are able to recognise speakers across different mask-wearing conditions well.
The video below explains our experiment and discusses our findings. We hope that you find it interesting!
Exploring the relationship between voice similarity estimates by listeners and by an automatic speaker recognition system incorporating phonetic features
We are happy to announce that our latest paper has been accepted for publication in the prestigious ‘Speech Communication‘ journal. This represents joint work between Cambridge University’s ‘Faculty of Modern and Medieval Languages and Linguistics’ and Oxford Wave Research (OWR).
Finding similar-sounding voices is of interest in many areas, be it for voice parades in a forensic setting, voice casting for film dubbing, or voice banking to save one’s voice for future synthesis in case of a degenerative disease. However, it is a very time-consuming and expensive task. With the aim of finding an objective method that could speed up the process, we considered an automatic approach to rating voice similarity. We explored the relationship between voice similarity ratings made by a total of 106 human listeners – some of whom may have been you – and comparison scores produced by an i-vector-based automatic speaker recognition system that extracts perceptually relevant phonetic features. Our results showed a significant positive correlation between human and machine, motivating us to continue our developments in this space.
The main highlight of this work is that human judgements of voice similarity correlate with automatic speaker recognition assessments using auto-phonetic features. This trend was seen in both English and German speakers’ judgements of English voices. These automatic speaker recognition assessments therefore show potential for automatically selecting foil voices for voice parades.
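For readers curious about what comparing human ratings with system scores can look like in practice, here is a minimal sketch. It assumes a plain Pearson correlation, which may not be the statistic used in the paper, and the function name, variable names and numbers are all made up for illustration; none of them are data or code from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, one entry per voice pair (NOT data from the paper).
listener_ratings = [2.0, 3.5, 5.0, 6.5, 8.0]  # assumed mean listener similarity ratings
system_scores = [-1.2, -0.4, 0.1, 0.9, 1.5]   # assumed automatic comparison scores

r = pearson(listener_ratings, system_scores)
print(f"Pearson r = {r:.3f}")
```

A value of r close to +1 would mean that voice pairs rated as more similar by listeners also tend to receive higher comparison scores from the system. A rank-based measure such as Spearman's correlation would be a reasonable alternative when the ratings are ordinal.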
This paper is based on Linda Gerlach’s master’s thesis work (University of Marburg, Germany) at Oxford Wave Research last year and uses the phonetic mode of the VOCALISE speaker recognition software.
The full paper is available on the journal’s webpage. Please use the following link for the full abstract and paper, available for free download before 19th November 2020:
Oxford Wave Research Ltd. are pleased to announce our appointment as the exclusive distributor in the United Kingdom and the Republic of Ireland for Salient Sciences (legal name Digital Audio Corporation, known to many as “DAC”).
We are excited to have our colleagues at Oxford Wave Research now officially offering Salient Sciences’ products and services in the UK and Ireland. We have previously worked closely with them on several interesting projects; going forward, we anticipate an even closer collaboration to provide unique, innovative solutions to our shared base of audio and video forensics clients worldwide.
Donald Tunstall, General Manager, Salient Sciences:
We also have many years of experience working with the DAC hardware-based audio processing solutions, such as the MicroDAC, PCAP, and CARDINAL AudioLab systems.
OWR will now be taking over all sales and support in the UK and Ireland, with immediate effect, for the VideoFOCUS and CARDINAL MiniLab Suite products, including all maintenance contracts and support.
Watch this space for training course announcements from DAC in the UK in 2020.
Dr Ekrem Malkoç is joining Oxford Wave Research as our Technical Sales Manager. He will be spearheading expansion of Oxford Wave Research’s forensic and commercial speech and audio processing products into new regions and markets.
Ekrem is a well-known expert in the fields of forensic speech and audio processing, forensic image analysis and forensic linguistics. He has a PhD in forensic linguistics from Ankara University (Turkey), MSc and MA degrees in Criminalistics and European Criminology from Ankara University and the Katholieke University of Leuven (Belgium) respectively, and a bachelor’s degree in Electrical and Electronics Engineering. Ekrem served in the Turkish Gendarmerie until 2015, reaching the rank of Colonel, having managed two regional Gendarmerie Forensic Laboratories.
You can read more about him here https://oxfordwaveresearch.com/about-us/
Last week (14-17 July 2019) some of the OWR team had the pleasure of attending the annual IAFPA (International Association for Forensic Phonetics and Acoustics) conference which was hosted this year in Istanbul, Turkey.
It was a great opportunity for us to learn about the work of other members of the forensic phonetics and acoustics community from all around the world. One of the hot topics at IAFPA this year was cross-language speaker comparison (Croatian-Serbian, Czech-Persian and French-English, to name a few). We were delighted to see how much of this and other research from Switzerland and the Netherlands made use of the capabilities of our forensic automatic speaker recognition software VOCALISE.
We enjoyed every part of the conference but the highlight for us was undoubtedly our intern Linda’s poster winning the 2019 Best Student Poster award. As you can imagine, the team celebrated appropriately with Turkish beer.
We also showcased our advances in the use of Deep Neural Networks (DNNs) using x-vectors in automatic speaker comparison and speaker profiling, presented by Dr. Finnian Kelly, our Principal Research Scientist.
Abstracts of our papers:
1. From i-vectors to x-vectors – a generational change in speaker recognition illustrated on the NFI-FRIDA database, Finnian Kelly, Anil Alexander, Oscar Forth and David van der Vloed, 14-17 July 2019, International Association of Forensic Phonetics and Acoustics (IAFPA) Conference, Istanbul, Turkey [download here]
2. The effect of background selection on the strength of evidence, David van der Vloed, Finnian Kelly and Anil Alexander, 14-17 July 2019, International Association of Forensic Phonetics and Acoustics (IAFPA) Conference, Istanbul, Turkey [download here]
3. One out of many: A sliding window approach to automatic speaker recognition with multi-speaker files, Linda Gerlach, Finnian Kelly and Anil Alexander, 14-17 July 2019, International Association of Forensic Phonetics and Acoustics (IAFPA) Conference, Istanbul, Turkey [download here]
4. More than just identity: speaker recognition and speaker profiling using the GBR-ENG database, Linda Gerlach, Finnian Kelly and Anil Alexander, 14-17 July 2019, International Association of Forensic Phonetics and Acoustics (IAFPA) Conference, Istanbul, Turkey (Winner of 2019 Best Student Paper award) [download here]
Special thanks to Burcu Önder Gürpinar for 4 fantastic days of forensics and we look forward to showing you what we have in store for IAFPA 2020.
Linda Gerlach, who is interning with us from the Philipps-Universität Marburg in Germany, is working on a collaborative project between Oxford Wave Research (OWR) and the University of Cambridge. This work forms part of her MA thesis and seeks to better understand the relationship between voice similarity ratings made by humans and those produced by an automatic approach. Results from this work could potentially help develop forensically sound methods and solutions for voice lineups (where a witness has to pick out the voice of a perpetrator from a lineup of foils).
This test takes about 15 mins and you will be presented with pairs of voice recordings by male speakers and asked to judge the similarity of each pair.
Oxford Wave Research are pleased to announce the promotion of Dr Finnian Kelly to Principal Research Scientist.
Since joining Oxford Wave Research in 2016 as a Senior Research Scientist, Finnian has made significant contributions to the development of our speaker recognition, speaker diarization and speech & audio processing systems. He has successfully led the OWR team in two NIST speaker recognition evaluations. Finnian was with the Sigmedia Research Group at Trinity College, Dublin, where he completed his PhD in 2013, and is a Research Associate with the Center for Robust Speech Systems (CRSS) at The University of Texas at Dallas. Finnian has published in (and acts as a reviewer for) many top-tier international conferences and journals, and has been an invited speaker at research labs in Europe and the US. Finnian is a member of the research committee of the International Association for Forensic Phonetics and Acoustics (IAFPA), and an affiliate member of the NIST OSAC Speaker Recognition subcommittee.
We are delighted that he will now be heading OWR’s research and leading us into new and exciting areas of work.
The team at Oxford Wave Research congratulate Finnian on his new role and look forward to working closely with him on future developments.
Oxford Wave Research are proud to be Platinum Sponsors of the 2019 AES International Conference on Audio Forensics.
Taking place in Porto, Portugal on June 18-20th 2019, this conference is dedicated to exploring advances in the field of Audio Forensics by providing a platform for research related to the forensic application of speech/signal processing, acoustical analyses, audio authentication, and the examination of methodologies and best practices.
We will be presenting our paper titled ‘Deep neural network based forensic automatic speaker recognition in VOCALISE using x-vectors’ and will be giving a Platinum talk on ‘The who, the when and the what – challenges in the development of real-world solutions for forensic audio processing’.
This is the seventh AES conference devoted to the technical developments and practical approaches emerging in the field of Audio Forensics, and Oxford Wave Research will be demonstrating some of the latest (and coolest!) developments in speaker recognition and speech & audio processing.
The conference program will include paper presentations and discussions as well as Keynotes, Tutorials and Workshops on topics related to Forensic Audio.
Oxford Wave Research are delighted to be named partners of the recently opened Centre for Forensic Phonetics and Acoustics (CFPA) at the University of Zurich. Opened on the 6th March 2019, the CFPA brings together research from a range of fields to address all areas of voice recognition in relation to forensic investigation.
Led by Prof. Volker Dellwo, this centre will combine world-class research into forensic speaker recognition, voice disguise and voice line-ups with forensic services such as speaker profiling, speaker comparison, transcription, audio authentication and audio enhancement for both prosecution & defence.
As all great collaborations should, this one started with a few nice glasses in 2017 in Zurich.