Add Kubernetes Orchestration Library for Model Server Deployment and Benchmarking #22
Comments
My thoughts:
cc @smarterclayton @terrytangyuan @sjmonson @ahg-g for thoughts on this.
Currently, inference-perf provides libraries for client requests, dataset handling, load generation, and result reporting. However, there's a need to add Kubernetes orchestration capabilities to deploy and manage model servers for benchmarking purposes.
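To make the request concrete, here is a minimal sketch of what the smallest piece of such an orchestration library might look like: a helper that builds a Kubernetes `apps/v1` Deployment manifest for a model server as a plain dictionary. The function name `build_model_server_deployment`, its parameters, the default port, and the image reference are all hypothetical and not part of inference-perf today. The sketch only assumes the standard Deployment schema and the `nvidia.com/gpu` extended resource name.

```python
import json


def build_model_server_deployment(name, image, replicas=1, port=8000, gpus=0):
    """Return an apps/v1 Deployment manifest as a plain dict.

    Hypothetical helper for illustration only; not an existing inference-perf API.
    """
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
    }
    if gpus > 0:
        # Request GPUs through the NVIDIA device plugin's extended resource.
        container["resources"] = {"limits": {"nvidia.com/gpu": str(gpus)}}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # The image reference below is a placeholder, not a real model server image.
    manifest = build_model_server_deployment(
        "model-server", "example.com/model-server:latest", gpus=1
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be passed to `kubectl apply -f -` or submitted through a Kubernetes client. Whether the library should emit manifests like this or call the API directly is one of the design points this issue leaves open.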
Current Status:
Requirements:
Design Considerations:
Integration with Existing Structure:
Deployment Architecture:
Reference Implementation:
Next Steps:
Questions to Address:
Please share your thoughts and suggestions on the proposed approach.
/kind feature
/priority important-soon
/area orchestration