Information Sharing on the rf_inversion Technology #91

Open
raysers opened this issue Nov 8, 2024 · 2 comments

raysers commented Nov 8, 2024

Could I share a new piece of technology information? Recently, I discovered an interesting project called Fluxtapoz:

https://github.com/logtd/ComfyUI-Fluxtapoz

It was originally built for ComfyUI and currently works only with the standard (non-MLX) version of FLUX. However, some enthusiasts have ported it to run with Diffusers:

https://github.com/raven38/rf_inversion

Thus, it may still be possible to adapt it for mflux compatibility.

This technology appears to be distinct from ControlNet and IP-Adapter. Reportedly, it doesn’t require an additional model; it can achieve consistent style transfer, style imitation, and similar functions through prompt inputs alone.

However:

At the moment, I don’t have the necessary hardware to run standard flux, so I haven’t tested this technology personally. All my information comes from the official demonstrations and some blogger reviews.

Additionally, I’ve noticed that the current mflux development team seems stretched thin, with only @filipstrand and @anthonywu actively contributing on a regular basis at the moment. (There have been other active contributors in the past, but right now it’s just these two.) Since mlx is relatively new, there are very few who are truly proficient with it. This raises the bar for contributors and understandably limits the available manpower. Therefore, I don’t want to add to the team’s workload by suggesting they put this on their to-do list (which already seems to have an extensive backlog).

I’m merely sharing this brief overview of the technology as a piece of information. My reason for sharing: given how rapidly new technologies iterate, this could be a short-lived innovation, but it might also become a game-changing technology alongside ControlNet and IP-Adapter.

If it turns out to be the latter, then sharing this information will have been worthwhile. Who knows—maybe this mention might even have a bit of foresight? And if one day this technology sparks something remarkable when combined with mflux, wouldn’t that be exciting? Anything is possible, I believe.

raysers changed the title from "Information Sharing on the Fluxtapoz Technology" to "Information Sharing on the rf_inversion Technology" on Nov 8, 2024

filipstrand (Owner) commented Nov 9, 2024

@raysers Really cool suggestion, thanks for bringing it up here! A year ago I experimented a bit with null-text inversion in the Diffusers codebase, which was actually my first attempt at implementing new research with diffusion models (fun times!). This looks like something similar and probably even better. I'll put this on the TODO list as something to investigate further when there is time.

raysers (Author) commented Nov 10, 2024

Thank you, @filipstrand, the expert here, for your interest in this feature. I only intended to share it, so it was a delightful surprise to hear you might add it to the TODO list. As you mentioned, it still requires further investigation, and I fully agree. Perhaps, while the official team focuses on implementing higher-priority features, time will also naturally test the practicality of this technology. If it eventually proves suitable for MFLUX and is implemented, that would indeed be another pleasant surprise.
