-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AOAI O1 Preview specific endpoint #31
Conversation
ilmarinen
commented
Sep 25, 2024
•
edited
Loading
edited
- Add a RestEndpointO1PreviewModelsAzure class that captures the specifics needed for hitting the o1-preview endpoint in Azure.
- Add rate limiting logic at the model level.
eureka_ml_insights/models/models.py
Outdated
def __post_init__(self): | ||
self.bearer_token_provider = get_bearer_token_provider(AzureCliCredential(), "https://cognitiveservices.azure.com/.default") | ||
|
||
@sleep_and_retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add this to the parent class so all endpoint models can benefit from it?
I would say it should be optional though, so if the user does not provide ratelimit related argument, default behavior should be without rate limiting.
eureka_ml_insights/models/models.py
Outdated
def check_call_limit(*args, **kwargs): | ||
return None | ||
|
||
self.check_call_limit = check_call_limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you setting this instead of making self the first arg of the check_call_limit method like any regular member method?
eureka_ml_insights/models/models.py
Outdated
|
||
@sleep_and_retry | ||
@limits(calls=self.calls, period=self.period) | ||
def check_call_limit(*args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add this extra method that doesn't do anything instead of directly decorating the method that makes the API call?
# Print the headers - they include the requert ID and the timestamp, which are useful for debugging. | ||
logging.info(e.info()) | ||
logging.info(e.read().decode("utf8", "ignore")) | ||
return None, False, False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use inheritance to avoid repeating existing code.
eureka_ml_insights/models/models.py
Outdated
if system_message: | ||
data["messages"]= [{"role": "system", "content": system_message}] + data["input_data"][ | ||
"input_string" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should throw a key error because data["input_data"] does not exist.
eureka_ml_insights/models/models.py
Outdated
@@ -9,6 +9,8 @@ | |||
import anthropic | |||
from azure.identity import AzureCliCredential, get_bearer_token_provider | |||
|
|||
from ratelimit import limits, sleep_and_retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make sure to run the linters according to the contribution instructions.
@@ -39,6 +39,7 @@ | |||
'bitsandbytes>=0.42.0', | |||
'accelerate>=0.21.0', | |||
'pycocotools>=2.0.8', | |||
'ratelimit>=2.2.1', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add this to the conda environment yml file as well.