Silent pauses play a crucial role in text-to-speech synthesis, where they help make the synthesized reading of a text
sound more natural.
In this work, our goal is to predict these silent pauses from text in order to improve
automatic reading systems. As this task has not been extensively studied for French, training data
dedicated to pause prediction must first be built.
We propose a strategy for inferring pauses from the temporal information in transcribed speech, in order to obtain such a corpus. We then
show that, with a Transformer-based model and appropriate data, it is possible to obtain promising results in predicting the pauses a speaker produces when reading a text aloud.
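
As a rough illustration of how pauses might be inferred from the temporal information in transcribed speech, the sketch below labels a word as followed by a pause when the silence before the next word reaches a threshold. This is only a minimal sketch under stated assumptions, not the procedure used in the paper: the `TimedWord` structure, the `label_pauses` function, and the 0.25 s threshold are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    """A transcribed word with its start and end times in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


def label_pauses(words, min_gap=0.25):
    """Return (word, pause_after) pairs.

    A pause is assumed after a word when the silence before the next word
    is at least `min_gap` seconds. The 0.25 s default is an illustrative
    assumption, not a value taken from the paper.
    """
    labels = []
    for current, following in zip(words, words[1:]):
        gap = following.start - current.end
        labels.append((current.text, gap >= min_gap))
    if words:
        # The last word has no following word, so no pause is inferred after it.
        labels.append((words[-1].text, False))
    return labels


if __name__ == "__main__":
    sample = [
        TimedWord("bonjour", 0.0, 0.5),
        TimedWord("à", 1.0, 1.1),
        TimedWord("tous", 1.1, 1.5),
    ]
    for word, pause_after in label_pauses(sample):
        print(word, pause_after)
```

Labels of this kind could then serve as training targets for a text-only pause predictor, since each word is paired with a pause decision that does not require audio at prediction time.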
This paper has been accepted for publication in the proceedings of the TALN 2023 conference.
The full article is in French.