As in, you train the vocoder on just one speaker, and at inference time, even with input from a different speaker, the output still sounds like the speaker you trained on?