Support input embedding size more than 8K token #87
Comments
Example: [terminal output screenshot] I think this is just the initial embedding of the whole project, because the TODO file itself is very small.
Thanks for reporting the issue. Currently, each file's embedding is sent to the AI in its own asynchronous request, and in your case one of the files needs more than 8K tokens. Based on your input "read the TODO file", the RAG lookup searches for similar files, and one of those files appears to exceed 8K tokens. Do you have any experience with chunking or batching in embeddings?
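The chunking approach discussed here can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it splits on whitespace tokens as a rough proxy for model tokens (a real implementation would count tokens with the model's tokenizer, e.g. tiktoken), and the `chunk_text` name and its parameters are hypothetical.

```python
def chunk_text(text, max_tokens=8000, overlap=200):
    """Split text into overlapping chunks that each fit an embedding
    model's context window. Whitespace tokens are used as a rough
    stand-in for model tokens; overlap preserves context across
    chunk boundaries."""
    words = text.split()
    if not words:
        return []
    chunks = []
    step = max_tokens - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

# A ~20,000-word file that would blow past an 8K-token limit
# becomes three chunks, each safely under the limit.
chunks = chunk_text("lorem " * 20000, max_tokens=8000, overlap=200)
print(len(chunks), all(len(c.split()) <= 8000 for c in chunks))
```

Each chunk can then be embedded in its own request (or batched), instead of sending the whole file at once.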
I do have experience in chunking and batch embeddings. Why do you ask?
If you have the experience and the time, feel free to add this enhancement; otherwise I will work on it and fix this issue.
@rodrigomeireles The problem is solved in version v1.7.4. Please check it out.
This issue is fixed in PR #88, so I am closing it. If there is any other issue related to this, please create a new issue.
Big files seem to exceed the 8K-token context limit of OpenAI's text-embedding-3-small. There should probably be a chunking/retrieval strategy implemented to handle this.
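Once an oversized file is embedded in pieces, retrieval also has to change: a natural strategy is to score a file by its best-matching chunk. The sketch below assumes this max-over-chunks approach with plain cosine similarity; the function names and the toy 3-dimensional vectors are illustrative only, not the extension's actual code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_chunk_score(query_vec, chunk_vecs):
    """Score a file by its best-matching chunk, so a large file that
    was embedded piece by piece still ranks fairly against small
    files embedded whole."""
    return max((cosine(query_vec, v) for v in chunk_vecs), default=0.0)

# Toy 3-dimensional "embeddings" standing in for real model output:
# the second chunk points almost the same way as the query.
query = [1.0, 0.0, 0.0]
file_chunks = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(round(best_chunk_score(query, file_chunks), 3))
```

Files are then ranked by this per-file score, and only the top-ranked chunks (rather than whole files) are fed back into the prompt.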