Ricoh Company Ltd.

29/08/2024 | Press release | Distributed by Public on 29/08/2024 01:07

Paper on Self-Supervised Learning and Data Augmentation Technologies for AI Speech Recognition Accepted at INTERSPEECH 2024

TOKYO, August 29, 2024 - Ricoh today announced that its paper on Self-Supervised Learning and Data Augmentation Technologies for Artificial Intelligence (AI) Speech Recognition will be presented at INTERSPEECH 2024, the international conference on spoken language processing. This is the first time a Ricoh paper has been accepted at INTERSPEECH.

The paper introduces an efficient and effective method for training AI speech recognition models using only speech data, without transcripts.

Traditionally, supervised learning methods for AI speech recognition require speech data paired with corresponding transcripts to teach the AI the relationship between the speech and the text. However, this approach demands a large volume of transcribed speech data, which is costly to acquire. Moreover, the quality of audio recorded in real-world environments varies with factors such as application and location, so a model needs greater tolerance to acoustic noise to be usable across different settings. Ricoh's newly developed self-supervised learning method, combined with data augmentation techniques that strengthen resistance to acoustic noise, achieves more accurate speech recognition at a lower cost than traditional methods.
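
For readers unfamiliar with data augmentation for noise robustness, one widely used general technique is to mix recorded noise into clean speech at a chosen signal-to-noise ratio (SNR) before training. The sketch below illustrates that general idea only. This release does not describe the specific augmentation used in Ricoh's paper, so the function names (`mean_power`, `mix_at_snr`), the synthetic signals, and the 10 dB target are all assumptions made for illustration.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the Ricoh paper.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise clip so it covers the full length of the speech.
    noise = [noise[i % len(noise)] for i in range(len(speech))]

    speech_power = mean_power(speech)
    noise_power = mean_power(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        # Nothing meaningful to scale; return the speech unchanged.
        return list(speech)

    # SNR(dB) = 10 * log10(speech_power / noise_power), solved for noise power.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(speech, noise)]


# Synthetic stand-ins: a 440 Hz tone as "speech" and Gaussian noise,
# both assumed to be sampled at 16 kHz.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

In practice, the SNR is often drawn at random for each training utterance so that the model is exposed to a range of noise conditions rather than a single fixed level.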