Reference Encoder Padding #7

its-sandy · 2019-06-23T12:32:44Z

How do we ensure that the padding of the reference mel spectrogram is taken into account when the reference encoder is applied to a batch of mels?

hadaev8 · 2019-09-23T11:02:29Z

Did you come to any conclusion?
I ran into this problem too. Because the GST encoder also sees the zero padding, the network can infer the duration of the audio. On my dataset this meant that short lines were pronounced slowly and long lines quickly.

I tried using one-dimensional convolutions and masking the zero padding before the GRU layer, but that made the tokens work worse.
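For anyone who wants to experiment with masking, here is a minimal sketch of the bookkeeping it needs: how many frames of each utterance are still real after the strided convolutions, and a boolean mask over the downsampled time axis. The layer count, kernel size, stride and padding below are assumptions (a common reference encoder setup, not confirmed for this repo), and the function names are hypothetical. It only computes lengths and masks; it does not claim to fix the token quality problem described above.

```python
def conv_out_length(length, kernel_size=3, stride=2, padding=1):
    """Size of one axis after a strided convolution (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1


def reference_lengths(mel_lengths, num_layers=6, kernel_size=3, stride=2, padding=1):
    """Valid (unpadded) frame count per utterance after the conv stack.

    num_layers, kernel_size, stride and padding are assumed values;
    set them to match the actual reference encoder.
    """
    out = []
    for length in mel_lengths:
        for _ in range(num_layers):
            length = conv_out_length(length, kernel_size, stride, padding)
        out.append(max(length, 1))  # keep at least one step for the GRU
    return out


def length_mask(lengths, max_len=None):
    """Boolean mask: True for real frames, False for padding."""
    max_len = max(lengths) if max_len is None else max_len
    return [[t < n for t in range(max_len)] for n in lengths]


if __name__ == "__main__":
    mel_lengths = [100, 37]  # frames per utterance before padding
    lengths = reference_lengths(mel_lengths, num_layers=2)
    print(lengths)
    print(length_mask(lengths))
```

In PyTorch, the resulting lengths could be passed to `torch.nn.utils.rnn.pack_padded_sequence` so that the GRU's final state comes from the last real frame instead of from padding.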
