🚀 The feature, motivation and pitch
This model from Upstage is extremely strong among models that fit on a single GPU for training and inference: https://huggingface.co/upstage/solar-pro-preview-instruct. However, it uses a custom architecture, `solar`, which is based on Llama/Mistral but modifies the forward pass to add long-range residual connections. It would be awesome to support this architecture natively out of the box!
Alternatives
No response
Additional context
Thank you so much for this awesome project :)
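To illustrate what "long-range residual connections" means in the pitch, here is a toy sketch in plain Python. It is not Upstage's implementation: the function name `forward_with_long_range_residuals`, the `skips` mapping, and the fixed mixing `weight` are all hypothetical, and the real `solar` forward pass may choose layers and combine activations differently.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def forward_with_long_range_residuals(
    x: Vector,
    layers: List[Layer],
    skips: Dict[int, int],
    weight: float = 0.5,
) -> Vector:
    """Run layers in order, mixing saved earlier activations into later ones.

    `skips` maps a destination layer index to an earlier source layer index
    (hypothetical format). The input of the source layer is saved and blended
    into the input of the destination layer before that layer runs.
    """
    sources = set(skips.values())
    saved: Dict[int, Vector] = {}
    h = x
    for i, layer in enumerate(layers):
        if i in sources:
            # Remember the hidden state entering this layer.
            saved[i] = list(h)
        if i in skips:
            # Assumed mixing rule: a fixed-weight blend with the saved state.
            earlier = saved[skips[i]]
            h = [(1.0 - weight) * a + weight * b for a, b in zip(h, earlier)]
        h = layer(h)
    return h


def add_one(h: Vector) -> Vector:
    # Stand-in for a transformer decoder layer.
    return [v + 1.0 for v in h]


toy_layers = [add_one, add_one, add_one, add_one]
plain = forward_with_long_range_residuals([0.0, 2.0], toy_layers, skips={})
skipped = forward_with_long_range_residuals([0.0, 2.0], toy_layers, skips={3: 0})
```

With an empty `skips` this reduces to an ordinary sequential decoder stack, which is why a stock Llama/Mistral forward pass cannot load the model as-is: the extra state carried between distant layers has to be handled inside the model's forward loop.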