I am planning to apply mean-teacher to a token classification problem. Since adding different noise to the teacher and student inputs is really important for the approach, I am confused about how to calculate the consistency cost when the lengths of the active logits differ. For example, if I use synonym noise, the augmentation can increase the length of the sentence (some tokens may be replaced by a synonym of length 2) when the input is given to the teacher model, and the same augmentation may generate a different sentence (of a different length) when the input is given to the student model.
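To make the mismatch concrete, here is a minimal sketch of what I mean. Everything in it is hypothetical and only for illustration: the synonym table, `synonym_noise`, `fake_logits`, and `consistency_cost` are my own names, not part of the mean-teacher code, and the logits are dummy values rather than model outputs.

```python
# Hypothetical synonym table: index 0 is a one-token synonym,
# index 1 is a two-token synonym that lengthens the sentence.
SYNONYMS = {
    "big": ["large", "very large"],
    "car": ["automobile", "motor vehicle"],
}


def synonym_noise(tokens, choice):
    """Replace every token that has an entry in SYNONYMS with the chosen synonym."""
    noisy = []
    for token in tokens:
        if token in SYNONYMS:
            # A multi-word synonym contributes more than one token.
            noisy.extend(SYNONYMS[token][choice].split())
        else:
            noisy.append(token)
    return noisy


def fake_logits(tokens, num_labels=3):
    """Stand-in for a model forward pass: one row of dummy logits per token."""
    return [[0.0] * num_labels for _ in tokens]


def consistency_cost(student_logits, teacher_logits):
    """Token-wise mean squared error; only defined when both sides have equal length."""
    if len(student_logits) != len(teacher_logits):
        raise ValueError(
            f"length mismatch: student has {len(student_logits)} tokens, "
            f"teacher has {len(teacher_logits)}"
        )
    total, count = 0.0, 0
    for s_row, t_row in zip(student_logits, teacher_logits):
        for s, t in zip(s_row, t_row):
            total += (s - t) ** 2
            count += 1
    return total / count


sentence = ["the", "big", "car", "stopped"]
student_tokens = synonym_noise(sentence, choice=0)  # 4 tokens
teacher_tokens = synonym_noise(sentence, choice=1)  # 6 tokens

try:
    consistency_cost(fake_logits(student_tokens), fake_logits(teacher_tokens))
except ValueError as err:
    print(err)
```

With the same noise function applied independently to the two inputs, the per-token logits no longer line up one to one, so a plain element-wise cost like the one above cannot be computed.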