the Channel dim of a color picture #5

Angry-Echo · 2023-06-28T09:32:18Z

Hello~ I am studying your code and have a question about how the model handles color images, because I can't find the RGB channel dimension when the frame sequence is fed into the model.
In multi_head_attention.py, at the beginning of the call method (after self.wq(q), and I know self.wq is a conv layer), your comment says: # (batch_size, num_heads, seq_len_q, rows, cols, depth). Where is the channel dimension? My understanding of the six dimensions is: seq_len_q is the length of the frame sequence; num_heads × depth = d_model; rows is the image height H; cols is the image width W.

I sincerely hope you can answer my question. If you do not mind, could I also ask you for some pointers on the field of video prediction? I am trying to do some research on predicting image sequences with a Transformer.


Angry-Echo · 2023-06-28T11:41:43Z

Oh! I guess depth is the channel dimension, is that right?

Angry-Echo · 2023-06-28T11:49:32Z

But I still have the same question as the other issue: the Conv layer requires [batch_size, rows, cols, depth], so how can a tensor with the additional seq_len dimension be fed into the Conv layer?
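To make the shape question concrete, here is a minimal sketch of one common way to push a frame sequence through a 2D convolution: fold the sequence axis into the batch axis, convolve, unfold, then split the output channels into heads. This is an assumption for illustration, not a confirmed description of what this repository does. The helper names (`fold_seq_into_batch`, `conv_same_shape`, `unfold_batch`, `split_heads`) and the example sizes are hypothetical, and the code only tracks shapes rather than running a real convolution.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def fold_seq_into_batch(shape: Shape) -> Shape:
    """(batch, seq_len, rows, cols, channels) -> (batch * seq_len, rows, cols, channels)."""
    if len(shape) != 5:
        raise ValueError("expected a 5-D shape (batch, seq_len, rows, cols, channels)")
    batch, seq_len, rows, cols, channels = shape
    return (batch * seq_len, rows, cols, channels)


def conv_same_shape(shape: Shape, filters: int) -> Shape:
    """Output shape of a stride-1, 'same'-padded 2D conv: only the channel axis changes."""
    n, rows, cols, _in_channels = shape
    return (n, rows, cols, filters)


def unfold_batch(shape: Shape, batch: int, seq_len: int) -> Shape:
    """(batch * seq_len, rows, cols, channels) -> (batch, seq_len, rows, cols, channels)."""
    n, rows, cols, channels = shape
    if n != batch * seq_len:
        raise ValueError("leading axis does not equal batch * seq_len")
    return (batch, seq_len, rows, cols, channels)


def split_heads(shape: Shape, num_heads: int) -> Shape:
    """Split the channel axis (d_model) into num_heads groups of size depth."""
    batch, seq_len, rows, cols, d_model = shape
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    depth = d_model // num_heads
    return (batch, num_heads, seq_len, rows, cols, depth)


# Hypothetical sizes: 2 clips of 10 RGB frames at 64x64, d_model = 128, 8 heads.
frames = (2, 10, 64, 64, 3)
folded = fold_seq_into_batch(frames)             # sequence axis merged into batch
projected = conv_same_shape(folded, filters=128) # RGB channels consumed by the conv
restored = unfold_batch(projected, batch=2, seq_len=10)
heads = split_heads(restored, num_heads=8)
print(heads)
```

Under these assumptions, the 3 RGB channels are the input channels of the convolution and never appear in the attention tensor; the last axis there is depth = d_model / num_heads, a slice of the learned feature channels rather than the raw color channels.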
