Research

Research Projects, Papers, Books, etc.

Privacy and fairness-aware speech dataset generation

Author: Junichi Yamagishi (Principal Researcher)

Period: June 2024 – March 2027
Grant category: Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research, Challenging Research (Pioneering)
Project number: 24K21324
URL: https://kaken.nii.ac.jp/ja/grant/KAKENHI-PROJECT-24K21324/

This research aims to realize a meta-machine learning method that generates a dataset similar to a real speech dataset. We explore and evaluate methods for generating large-scale speech datasets that (1) properly protect individual privacy, (2) reduce data bias and improve fairness, and (3) offer utility comparable to that of real speech datasets. Specifically, we study encoder-decoder models for speech datasets; privacy protection methods for sensitive attributes (speech content, accent, age, gender, etc.) in addition to speaker identity; and fairness in representative downstream tasks (speech recognition, speaker recognition, gender recognition, etc.) used to measure the utility of the generated datasets.