Speaker anonymization by modifying fundamental frequency and x-vector singular value
Document Title
Speaker anonymization by modifying fundamental frequency and x-vector singular value
Author
Mawalim C.O., Galajit K., Karnjana J., Kidani S., Unoki M.
Affiliations
Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan; NECTEC, National Science and Technology Development Agency, Pathum Thani, Thailand
Type
Article
Source Title
Computer Speech and Language
ISSN
0885-2308
Year
2022
Volume
73
Open Access
All Open Access, Hybrid Gold
Publisher
Academic Press
DOI
10.1016/j.csl.2021.101326
Format
Abstract
Speaker anonymization is a method of protecting voice privacy by concealing individual speaker characteristics while preserving linguistic information. The VoicePrivacy Challenge 2020 was initiated to generalize the task of speaker anonymization. In the challenge, two frameworks for speaker anonymization were introduced; in this study, we propose a method for improving the primary framework by modifying the state-of-the-art speaker individuality feature (namely, the x-vector) in a neural waveform speech synthesis model. Our proposed method is based on x-vector singular value modification with a clustering model. We also propose a technique for modifying the fundamental frequency and speech duration to enhance the anonymization performance. To evaluate our method, we carried out objective and subjective tests. The overall objective test results show that our proposed method improves the anonymization performance in terms of speaker verifiability, whereas the subjective evaluation results show improvement in terms of speaker dissimilarity. The intelligibility and naturalness of the anonymized speech with speech prosody modification were slightly reduced (less than 5% of word error rate) compared to the results obtained by the baseline system. © 2021 The Authors
Funding Sponsor
Japan Society for the Promotion of Science; KDDI Foundation
License
N/A
Rights
N/A
Publication Source
Scopus