Showing 1–3 of 3 results for author: Tambrahalli, V

Searching in archive cs.
  1. arXiv:2307.01233  [pdf, other]

    cs.SD cs.LG eess.AS

    RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations

    Authors: Neha Sahipjohn, Neil Shah, Vishal Tambrahalli, Vineet Gandhi

    Abstract: Significant progress has been made in speaker dependent Lip-to-Speech synthesis, which aims to generate speech from silent videos of talking faces. Current state-of-the-art approaches primarily employ non-autoregressive sequence-to-sequence architectures to directly predict mel-spectrograms or audio waveforms from lip representations. We hypothesize that the direct mel-prediction hampers training/…

    Submitted 3 July, 2023; originally announced July 2023.

  2. arXiv:2305.11926  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting

    Authors: Neil Shah, Vishal Tambrahalli, Saiteja Kosgi, Niranjan Pedanekar, Vineet Gandhi

    Abstract: We present MParrotTTS, a unified multilingual, multi-speaker text-to-speech (TTS) synthesis model that can produce high-quality speech. Benefiting from a modularized training paradigm exploiting self-supervised speech representations, MParrotTTS adapts to a new language with minimal supervised data and generalizes to languages not seen while training the self-supervised backbone. Moreover, without…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: 5 pages, 1 figure

  3. arXiv:2303.01261  [pdf, other]

    cs.CL cs.SD eess.AS

    ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations

    Authors: Neil Shah, Saiteja Kosgi, Vishal Tambrahalli, Neha Sahipjohn, Niranjan Pedanekar, Vineet Gandhi

    Abstract: We present ParrotTTS, a modularized text-to-speech synthesis model leveraging disentangled self-supervised speech representations. It can train a multi-speaker variant effectively using transcripts from a single speaker. ParrotTTS adapts to a new language in low resource setup and generalizes to languages not seen while training the self-supervised backbone. Moreover, without training on bilingual…

    Submitted 16 December, 2023; v1 submitted 1 March, 2023; originally announced March 2023.