Any publication?

by sappho192 - opened Aug 31, 2025

Aug 31, 2025

Hi, thank you for releasing this model to the public.

I'd like to study what changes were made in this 2.0 version compared to the previous model, but I couldn't find any papers related to this.
Is there any way I can find out in detail what has changed?

Thanks in advance.

naymaraq

NVIDIA org Sep 2, 2025

The biggest difference is in the training dataset, plus slightly different augmentations. The training data for version 2.0 includes non-speech audio samples (such as coughing, laughter, and breathing) to help the model distinguish between speech and non-speech sounds.

You can refer to the MarbleNet paper: https://arxiv.org/pdf/2010.13886

sappho192

Sep 3, 2025

@naymaraq
Thank you! That's a nice improvement :)

sappho192 changed discussion status to closed Sep 3, 2025
