Hello! I am studying your code and have a question about how the model handles color images, because I can't find the RGB channel dimension when the frame sequence is fed into the model.
In multi_head_attention.py, at the beginning of the call method (right after self.wq(q); I know self.wq is a conv layer), your comment says: #(batch_size, num_heads, seq_len_q, rows, cols, depth). Where is the channel dimension? My understanding of the six dimensions is: seq_len_q is the length of the frame sequence; num_heads × depth = d_model; rows is the image height H; cols is the image width W.
I sincerely hope you can clear up my doubts. If you don't mind, could I also ask you for some pointers on the field of video prediction? I am trying to do some research on predicting image sequences with a Transformer.
But I still have the same question as in the other issue: a Conv layer requires input of shape [batch_size, rows, cols, depth], so how can a tensor with the additional seq_len dimension be fed into the Conv layer?
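To make the question concrete, here is how I currently picture the shapes. This is only a sketch of one possible reading, not the repository's actual code: it assumes the frames are folded into the batch dimension before a "same"-padded convolution, that the RGB channels are consumed as the conv input channels, and that the d_model output channels are then split into heads. The function names and the example sizes (2 clips, 5 frames, 64×64 RGB, d_model = 128, 8 heads) are hypothetical.

```python
# Shape bookkeeping only: no tensors are created, just the shape tuples.
# All names and sizes here are hypothetical, chosen for illustration.

def fold_time_into_batch(shape):
    """(batch, seq_len, rows, cols, channels) -> (batch * seq_len, rows, cols, channels)."""
    batch, seq_len, rows, cols, channels = shape
    return (batch * seq_len, rows, cols, channels)


def conv_same_padding(shape, filters):
    """Output shape of a stride-1, 'same'-padded 2D conv: only the last dim changes."""
    n, rows, cols, _in_channels = shape  # input channels (e.g. RGB) are consumed here
    return (n, rows, cols, filters)


def unfold_and_split_heads(shape, batch, num_heads):
    """(batch * seq_len, rows, cols, d_model) -> (batch, num_heads, seq_len, rows, cols, depth)."""
    n, rows, cols, d_model = shape
    if n % batch != 0:
        raise ValueError("leading dimension must be a multiple of batch")
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    seq_len = n // batch
    depth = d_model // num_heads  # num_heads * depth == d_model
    return (batch, num_heads, seq_len, rows, cols, depth)


frames = (2, 5, 64, 64, 3)                        # batch, seq_len, rows, cols, RGB
folded = fold_time_into_batch(frames)             # (10, 64, 64, 3)
projected = conv_same_padding(folded, filters=128)  # (10, 64, 64, 128)
heads = unfold_and_split_heads(projected, batch=2, num_heads=8)
print(heads)  # (2, 8, 5, 64, 64, 16)
```

If this picture is right, the RGB dimension no longer appears after the projection because the three input channels have been mixed into the d_model feature channels, which would explain why the comment lists depth instead of a channel dimension. Please correct me if the code handles seq_len differently.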