Try using a network pretrained using Resnet 50 #112

Closed
vikasmahato opened this issue Mar 10, 2018 · 20 comments

Comments

@vikasmahato
Contributor

vikasmahato commented Mar 10, 2018

References #7

This paper provides various benchmarks for pretrained models used for transfer learning.
https://openreview.net/pdf?id=Bygq-H9eg

It also suggests that ResNet should perform better than VGG.

@marco-c
Owner

marco-c commented Mar 11, 2018

There are already two PRs about this: #61 and #105.

@marco-c
Owner

marco-c commented Mar 11, 2018

N.B.: they are adding the architecture, but nothing regarding a pretrained network.

@vrishank97
Contributor

Great. Should I work on this in network.py?

@marco-c
Owner

marco-c commented Mar 11, 2018

Yes, the first step would be to figure out how to use an already pretrained network, since the size of our images is different from the defaults for VGG, ResNet, etc.
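
A minimal sketch of that first step, assuming Keras with the TensorFlow backend; the non-default input shape below is just a placeholder:

```python
# Sketch: loading a pretrained ResNet50 without its classification head,
# so it can accept an input size other than the 224x224 default.
# Assumes Keras with the TensorFlow backend; the input shape is a placeholder.
from keras.applications.resnet50 import ResNet50

base_model = ResNet50(
    weights='imagenet',        # weights pretrained on ImageNet
    include_top=False,         # drop the 1000-class fully connected head
    input_shape=(256, 384, 3)  # placeholder; must meet Keras' minimum size for ResNet50
)
base_model.summary()
```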

@vrishank97
Contributor

I'm considering squashing images to 224x224. As we are mainly concerned with UI/UX elements and not text, I don't think squashing will affect performance.
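
For reference, a tiny sketch of the squashing step, assuming Pillow; the file names are placeholders:

```python
# Sketch: squashing a screenshot to the 224x224 ImageNet input size,
# ignoring the aspect ratio. Assumes Pillow; file names are placeholders.
from PIL import Image

img = Image.open('screenshot.png').convert('RGB')
img_224 = img.resize((224, 224), Image.BILINEAR)
img_224.save('screenshot_224.png')
```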

@marco-c
Owner

marco-c commented Mar 11, 2018

Yeah, that's one of the options we should try.

@vikasmahato
Contributor Author

@vrishank97 I was working on this issue. However, if you want to take it, please let me know.

@vrishank97
Contributor

@vikasmahato Since we are just experimenting with different ways of getting the transfer learning to work, we'll get results faster if we work in parallel. I will try downscaling images to the ImageNet dimensions and re-training ResNet-50 on them. What approach are you currently working on?

@vikasmahato
Contributor Author

@vrishank97 I was also thinking the same. However, since you are already doing that, I'll try Inception V3 and see how it performs.

@vrishank97
Contributor

@vikasmahato Great. Can you try using Inception-ResNet-v2 instead? It has higher accuracy on ImageNet.
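
For reference, a hedged sketch of loading that model, assuming a Keras version that ships keras.applications.inception_resnet_v2:

```python
# Sketch: loading Inception-ResNet-v2, whose default input size is 299x299.
# Assumes a Keras version that ships this application model.
from keras.applications.inception_resnet_v2 import InceptionResNetV2

base_model = InceptionResNetV2(weights='imagenet', include_top=False,
                               input_shape=(299, 299, 3))
```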

https://research.googleblog.com/2016/08/improving-inception-and-image.html

@vikasmahato
Contributor Author

@vrishank97 Sure!

@Shashi456
Contributor

@vrishank97 @vikasmahato I wanted to ask you both: what are the steps for using a pretrained network? Are you fine-tuning just the last layer?

@vrishank97
Contributor

We freeze the convolutional layers and retrain the fully connected layers with a custom softmax output layer.
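
A minimal sketch of that setup, assuming Keras; NUM_CLASSES and the input shape are placeholders, and the ResNet-50 base is just one possible choice:

```python
# Sketch: freeze the convolutional base and train only a new fully connected
# head with a custom softmax output. NUM_CLASSES and the input shape are
# placeholders.
from keras.applications.resnet50 import ResNet50
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

NUM_CLASSES = 2  # placeholder

base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the convolutional layers

x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation='relu')(x)
outputs = Dense(NUM_CLASSES, activation='softmax')(x)  # custom softmax head

model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```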

@vrishank97
Contributor

vrishank97 commented Mar 12, 2018

Here are some resources
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
https://towardsdatascience.com/transfer-learning-using-keras-d804b2e04ef8

and a pre-existing tool from TensorFlow:
https://www.tensorflow.org/tutorials/image_retraining

Here they generate bottleneck features to train a model; doing so speeds up the process because we don't have to keep re-running the computationally expensive convolution operations.
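
A rough sketch of the bottleneck-feature idea in Keras; X_train (preprocessed images) and y_train (one-hot labels) are placeholder arrays:

```python
# Sketch: compute bottleneck features once with the frozen convolutional base,
# then train a small classifier on the cached features.
# X_train and y_train are placeholders.
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.models import Sequential
from keras.layers import Flatten, Dense

base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# The expensive convolutions run only once per image.
bottleneck_features = base.predict(preprocess_input(X_train.astype('float32')))

clf = Sequential([
    Flatten(input_shape=bottleneck_features.shape[1:]),
    Dense(256, activation='relu'),
    Dense(y_train.shape[1], activation='softmax'),
])
clf.compile(optimizer='adam', loss='categorical_crossentropy',
            metrics=['accuracy'])
clf.fit(bottleneck_features, y_train, epochs=10, batch_size=32)
```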

@marco-c
Owner

marco-c commented Mar 12, 2018

We should try both approaches: freezing all layers except the top ones, and also keeping everything trainable.
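
A hedged sketch of the second variant, fine-tuning everything with a small learning rate; it assumes a `model` built on a pretrained base as in the earlier sketches:

```python
# Sketch: keep every layer trainable and fine-tune the whole network with a
# small learning rate so the pretrained weights are not destroyed.
# Assumes `model` was built on a pretrained base as sketched above.
from keras.optimizers import SGD

for layer in model.layers:
    layer.trainable = True  # unfreeze everything

model.compile(optimizer=SGD(lr=1e-4, momentum=0.9),  # small LR for fine-tuning
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```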

@vrishank97
Contributor

vrishank97 commented Mar 12, 2018

@marco-c The dimensions shouldn't be an issue for transfer learning. All images are resized by prepare_images() in utils.py. We only need to load the ImageNet weights.

@marco-c
Owner

marco-c commented Mar 12, 2018

> @marco-c The dimensions shouldn't be an issue for transfer learning. All images are resized by prepare_images() in utils.py. We only need to load the ImageNet weights.

Yes, but by resizing we might be losing some information.

@vrishank97
Contributor

Agreed, but I think it's mainly the text areas where we lose information; major UI elements would still be recognisable, especially if we use Inception. It has an input of 299x299 instead of 224x224, so there's less information loss.

@marco-c
Owner

marco-c commented Mar 12, 2018

Yes, hopefully.

@marco-c
Owner

marco-c commented Jun 9, 2018

Closing in favor of #194.

@marco-c marco-c closed this as completed Jun 9, 2018