
[Profiling][Model][Doc] Support Llama3-8B and 70B on A100s #22

Merged
6 commits merged on Jul 24, 2024

Commits on Jul 24, 2024

  1. Merged PR 1873: Support Llama3 8B and 70B for 32k context length on a100_pairwise_nvlink

     # Changelog

     * Support Llama3 8B and 70B https://llama.meta.com/llama3/
     * Max supported context length is 32k, only on 4xA100.
     * Pipeline parallelism is not yet profiled beyond 4k context length.
     * Attention profiling enhancements:
       * Reduce the number of input combinations by removing batches that require more KV cache blocks than fit in available GPU memory.
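The pruning step in the last bullet can be sketched as follows. This is a minimal illustration, not code from the PR: the helper names, the block size of 16 tokens, and the block budget are assumptions; the idea is simply to drop any (batch size, context length) combination whose KV cache demand exceeds the blocks available on the GPU.

```python
import math

def kv_blocks_needed(batch_size, context_len, block_size=16):
    # Each sequence needs ceil(context_len / block_size) KV cache blocks;
    # a batch needs that many blocks per sequence.
    return batch_size * math.ceil(context_len / block_size)

def prune_batches(combos, available_blocks, block_size=16):
    # Keep only (batch_size, context_len) combinations that fit in the
    # GPU's KV cache block budget.
    return [(b, c) for b, c in combos
            if kv_blocks_needed(b, c, block_size) <= available_blocks]

combos = [(1, 4096), (8, 32768), (64, 32768)]
print(prune_batches(combos, available_blocks=20000))
# → [(1, 4096), (8, 32768)]  (the 64 x 32k batch needs 131072 blocks and is dropped)
```

Filtering out infeasible combinations up front keeps the profiler from launching runs that would only fail with an out-of-memory error.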
     nitinkedia7 committed Jul 24, 2024 · 1478250
  2. 41ab8ab
  3. 2f92e46
  4. 8350006
  5. format

     nitinkedia7 committed Jul 24, 2024 · 4867101
  6. minor

     nitinkedia7 committed Jul 24, 2024 · 2649f64