
Why is this permutation necessary here (classification task)? #205

Joao-L-S-Almeida opened this issue Oct 28, 2024 · 7 comments

@Joao-L-S-Almeida

x = x.reshape(x.shape[0], x.shape[1], -1).permute(0, 2, 1)

Any comments, @CarlosGomes98?
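
For reference, a minimal sketch (not from the repository) of what the quoted line does, assuming the head receives backbone features in NCHW layout; the sizes are arbitrary illustrations:

```python
import torch

x = torch.randn(2, 768, 14, 14)            # (N, C, H, W)
x = x.reshape(x.shape[0], x.shape[1], -1)  # (N, C, H*W)  -> torch.Size([2, 768, 196])
x = x.permute(0, 2, 1)                     # (N, H*W, C)  -> torch.Size([2, 196, 768])
```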

Joao-L-S-Almeida self-assigned this Oct 28, 2024
@romeokienzler

@paolo-fraccaro any idea?

Joao-L-S-Almeida changed the title from "Whys is this permutation necessary here (classification task) ?" to "Why is this permutation necessary here (classification task)?" Oct 28, 2024
@paolofraccaro

I cannot figure it out either, but I have a vague memory that @CarlosGomes98 told me some factory needed the data flipped. I cannot really remember the specifics, though.

@romeokienzler

@paolo-fraccaro from your findings it seems that both NHWC and NCHW are present in TT, depending on the model/factory, correct?

Should we set a project-wide standard? We could, since we control all the factories. And if we don't, what would happen?

I've read somewhere that NCHW is the standard and performs better on NVIDIA hardware.

If we don't set a standard, there is no risk of getting things wrong, since the shapes won't match and we will find out through an error, correct?

We could introduce this as a parameter (with NCHW as the default) to the factory (or factories).

@paolofraccaro

Yes, I think sticking to one convention would be helpful. I am happy with NCHW (NLC for transformers). Our classification head at the moment expects NHWC/NCL instead.

@romeokienzler

OK @paolo-fraccaro, what would you suggest? Change the CH (classification head), or add a transform neck and have the factories add the neck to the model by default?

@paolofraccaro

That is one way! We have the out_channels attribute on the backbones (which I believe is mandatory), so we should be able to permute to whatever default layout we want with a neck. Either way, this could be turned on/off with an attribute at the factory level and clearly stated in the documentation.
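
A rough sketch of what such a neck could look like (the class name is illustrative, not an existing terratorch neck); the factory could then append it or not based on an attribute:

```python
import torch
from torch import nn

class PermuteNeck(nn.Module):
    """Tiny neck that permutes backbone output to whatever layout the head expects."""

    def __init__(self, dims: tuple[int, ...]):
        super().__init__()
        self.dims = dims  # e.g. (0, 2, 3, 1) for NCHW -> NHWC

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x.permute(*self.dims)

neck = PermuteNeck((0, 2, 3, 1))                # NCHW -> NHWC
print(neck(torch.randn(2, 768, 14, 14)).shape)  # torch.Size([2, 14, 14, 768])
```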

@CarlosGomes98

This code in the head strikes me as probably transformer-specific... this transpose bit should be done by a neck instead. I think we can remove this line and try with a transform.

As for the convention, NCHW is the norm, but transformers use N T C (T for tokens) internally. This is needed so that they can efficiently perform attention (e.g. applying the same MLP to all tokens), and it is why we have the necks that perform permutation when needed (see the permute neck and the neck that goes from tokens to image).
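
A rough sketch of the tokens-to-image idea (the function name is illustrative, not terratorch's actual neck), assuming a ViT-style backbone that outputs (N, T, C) with T = H_patches * W_patches and no class token:

```python
import torch

def tokens_to_image(x: torch.Tensor, h: int, w: int) -> torch.Tensor:
    # (N, T, C) -> (N, C, T) -> (N, C, H, W)
    n, t, c = x.shape
    assert t == h * w, "token count must match the patch grid"
    return x.permute(0, 2, 1).reshape(n, c, h, w)

print(tokens_to_image(torch.randn(2, 196, 768), 14, 14).shape)  # torch.Size([2, 768, 14, 14])
```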

romeokienzler self-assigned this Oct 31, 2024