Post by account_disabled on Mar 11, 2024 9:13:09 GMT
What is Translatotron? How does Google's AI for automatic translation, which can also reproduce the speaker's voice, work? Let's find out together, with the help of a test video.

Alessio Pomaro, Aug 20, 2021

Google's Translatotron 2: Improves translation and blocks deepfakes

What is Translatotron?

Translatotron is an AI-based system from Google that automatically translates a person's speech while also reproducing the original voice.
It is therefore not a system made up of a Speech To Text (STT) stage that transforms speech into text, a text translator (e.g. Google Translate) and finally a Text To Speech (TTS) stage that transforms the text back into audio. With Translatotron, Google is trying to develop a Speech To Speech system, without textual intermediation. In the following video I have collected some examples of translations by Translatotron 2, which maintains the original voice.

An example of Translatotron 2 in action, translating while maintaining the voice.

How does it work

The way it works is truly astonishing. The system is based on neural networks that take the spectrogram of the speech to be translated as input and produce the spectrogram of the translated speech as output.
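The contrast between the classic cascade and the direct approach can be sketched as two ways of composing functions. This is only an illustration of the data flow, not Google's code: the names `cascade` and `direct`, the type aliases, and every stage passed in are hypothetical placeholders.

```python
from typing import Callable, List

# Hypothetical type aliases, used only to make the data flow visible.
Audio = List[float]              # raw waveform samples
Spectrogram = List[List[float]]  # time frames x frequency bins


def cascade(stt: Callable[[Audio], str],
            translate: Callable[[str], str],
            tts: Callable[[str], Audio]) -> Callable[[Audio], Audio]:
    """Classic pipeline: speech -> text -> translated text -> speech."""
    def run(audio: Audio) -> Audio:
        text = stt(audio)             # Speech To Text
        translated = translate(text)  # text translator
        return tts(translated)        # Text To Speech
    return run


def direct(to_spectrogram: Callable[[Audio], Spectrogram],
           model: Callable[[Spectrogram], Spectrogram],
           vocoder: Callable[[Spectrogram], Audio]) -> Callable[[Audio], Audio]:
    """Speech To Speech: spectrogram in, spectrogram out, no text in between."""
    def run(audio: Audio) -> Audio:
        source = to_spectrogram(audio)  # spectrogram of the source speech
        target = model(source)          # spectrogram of the translated speech
        return vocoder(target)          # back to an audio waveform
    return run
```

In the cascade, all information has to pass through a string, which cannot carry the character of the speaker's voice. In the direct version nothing is ever converted to text.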
The spectrogram is a visual representation of a signal, in this case speech audio: it shows how the energy at each frequency changes over time. To simplify, we can say that Google Translate, for example, must listen to what the user says, transcribe it, translate it and finally read it aloud. Translatotron, on the other hand, analyzes the spectrogram of the speech and produces another one that represents the translation. It then uses a "neural vocoder" to convert the translated spectrogram into audio waveforms, with an optional speaker encoder that keeps the character of the source voice intact.

Translatotron 2

Translatotron 2 is a significantly improved model, both in translation quality and in the quality of the voice reproduced.
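To make the idea of a spectrogram more concrete, here is a minimal sketch that computes one with the Python standard library only. It is a generic short-time Fourier transform, not Translatotron's actual front end. The function name `magnitude_spectrogram`, the frame sizes, and the 500 Hz test tone standing in for speech are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Short-time Fourier transform magnitudes (illustrative helper).

    Returns a list of frames; each frame holds frame_len // 2 + 1
    magnitudes, one per frequency bin from 0 Hz up to half the sample rate.
    """
    # A Hann window softens the frame edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT: fine for a demo, real systems would use an FFT.
        frames.append([
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ])
    return frames


# A 500 Hz tone sampled at 8 kHz stands in for a speech recording.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * n / sample_rate) for n in range(1024)]

spec = magnitude_spectrogram(tone)

# Strongest frequency bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), len(spec[0]), peak_hz)  # 31 33 500.0
```

Each row of `spec` is one slice of time and each column one frequency band, which is the grid a spectrogram image displays. A neural vocoder performs the reverse step, turning such a grid back into a waveform.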