I want to transfer the MT framework to an NLP task, but I don't understand how to train it with unlabeled data. I understand the idea of the paper, but I'm confused about the implementation.
I notice that TwoStreamBatchSampler divides the dataset into a labeled part and an unlabeled part, but the code above seems to handle both labeled and unlabeled data in the same way. I think only the labeled part of model_out should be used to calculate class_loss. Did I get it wrong?
@tarvaina So sorry to bother you several years on, but I have the same question. During training, when some of the samples in each batch are unlabeled (-1), how is the class loss being calculated on both labeled and unlabeled samples?
In this line here: class_loss = class_criterion(class_logit, target_var) / minibatch_size
we have class_logit, which for a batch size of 8 should have dimension 8 x 1000. Then we have target_var, which is a tensor of length 8 holding the class labels (e.g. [-1, -1, -1, -1, 2, 4, 5, 5]).
Can you clarify how this is working? Why are the unlabeled samples being taken into account? Thank you so much!
Replying in case this is helpful for others: when class_criterion is created, it is told to ignore targets equal to NO_LABEL, so the unlabeled samples contribute nothing to the class loss. I believe this handles things correctly! The relevant line is: class_criterion = nn.CrossEntropyLoss(reduction='sum', ignore_index=NO_LABEL).cuda()
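To make the effect of ignore_index concrete, here is a minimal pure-Python sketch of a sum-reduced cross-entropy that skips ignored targets. It is a re-implementation for illustration only, not PyTorch's code. The helper name cross_entropy_sum and the toy logits are hypothetical, NO_LABEL = -1 is assumed from the -1 targets mentioned above, and dividing by the full batch length is an assumption about what minibatch_size holds.

```python
import math

NO_LABEL = -1  # assumed sentinel for unlabeled samples (the -1 targets above)


def cross_entropy_sum(logits, targets, ignore_index=NO_LABEL):
    """Sum of per-sample cross-entropy, skipping targets equal to ignore_index.

    Hypothetical helper that mimics the behaviour of a sum-reduced
    cross-entropy with an ignore_index; not the library implementation.
    """
    total = 0.0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # unlabeled sample: adds nothing to the loss
        # Numerically stable log-sum-exp of the logits for this sample.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]  # equals -log softmax(row)[target]
    return total


# Toy batch of 8 samples and 6 classes (arbitrary deterministic logits).
logits = [[((i * 7 + j * 3) % 5) / 2.0 for j in range(6)] for i in range(8)]
targets = [-1, -1, -1, -1, 2, 4, 5, 5]  # first four samples are unlabeled

loss_all = cross_entropy_sum(logits, targets)               # whole batch
loss_labeled = cross_entropy_sum(logits[4:], targets[4:])   # labeled half only

# Mirrors "class_criterion(...) / minibatch_size", assuming minibatch_size
# is the full batch length (labeled plus unlabeled samples).
class_loss = loss_all / len(targets)

print(loss_all == loss_labeled)
```

Passing the whole batch gives the same sum as passing only the labeled half, which is why the training code does not need to slice out the labeled samples before computing the class loss.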