The Wav2Vec2 model was proposed in wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations by Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli.

The abstract from the paper is the following:

We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler. wav2vec 2.0 masks the speech input in the latent space and solves a contrastive task defined over a quantization of the latent representations which are jointly learned. Experiments using all labeled data of Librispeech achieve 1.8/3.3 WER on the clean/other test sets. When lowering the amount of labeled data to one hour, wav2vec 2.0 outperforms the previous state of the art.
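To give an intuition for the contrastive task mentioned above, the following is a minimal toy sketch: at a masked time step, a context vector is compared against the true quantized latent and a set of distractors, and the loss rewards picking out the true one. All dimensions, variable names, and the temperature value here are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    # Cosine similarity between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical sizes and temperature, chosen only for illustration.
dim, n_distractors, kappa = 16, 5, 0.1

c_t = rng.normal(size=dim)                # context output at a masked step
q_t = c_t + 0.1 * rng.normal(size=dim)    # true quantized latent (made similar on purpose)
distractors = rng.normal(size=(n_distractors, dim))  # quantized latents from other steps

candidates = np.vstack([q_t, distractors])           # true target placed at index 0
sims = np.array([cosine(c_t, q) for q in candidates]) / kappa
log_probs = sims - np.log(np.sum(np.exp(sims)))      # log-softmax over candidates
contrastive_loss = -log_probs[0]                     # negative log-prob of true target
print(float(contrastive_loss))
```

Because the true quantized latent is constructed to be close to the context vector, it receives the highest similarity and the loss is small; in training, minimizing this loss is what pushes the context network toward the jointly learned quantized representations.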