From 3c68bd5602e3f37582f9bbe73ab083273bd4a1c7 Mon Sep 17 00:00:00 2001
From: Zhang Peiyuan
Date: Sat, 11 May 2024 19:34:12 +0800
Subject: [PATCH] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index b3a2633..ffd30c6 100644
--- a/README.md
+++ b/README.md
@@ -41,6 +41,7 @@ We support different sequence parallel methods:
 We then proceed to train Llama-2-7B on 8 A100 by gradually increasing its rope base frequency to 1B. Notably, our model is only trained with 512K sequence length while generalizing to nearly 1M context.
 
 ## Updates
+- [05/11] Add Ulysses.
 - [05/06] Add distractors (multi-needle) in the NIAH evaluation script. You can set the number of distractors using --num_distractor.
 - [05/06] IMPORTANT! If you want to use eval_needle.py to evaluate the llama3 model, you need to add one extra space (" ") behind the QUESTION_STR. I believe this has something to do with the tokenizer.
 ## Usage
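
For context on the new "[05/11] Add Ulysses." entry: Ulysses here refers to Ulysses-style sequence parallelism, where each rank holds a slice of the sequence and a pair of all-to-all exchanges re-shards the activations from the sequence dimension to the head dimension (and back), so every rank can run ordinary attention over the full sequence on a subset of heads. The sketch below is a minimal, hypothetical illustration of that idea, not the code this patch refers to; the function name `ulysses_attention`, the `[local_seq, num_heads, head_dim]` layout, and the assumption that both the sequence length and the head count divide evenly by the parallel degree are assumptions made for the example.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F


def _all_to_all_swap(x, world_size, scatter_dim, gather_dim, group=None):
    """Split x into world_size chunks along scatter_dim, exchange one chunk with
    every rank, and concatenate the received chunks along gather_dim."""
    chunks = [c.contiguous() for c in x.chunk(world_size, dim=scatter_dim)]
    received = [torch.empty_like(c) for c in chunks]
    dist.all_to_all(received, chunks, group=group)
    return torch.cat(received, dim=gather_dim)


def ulysses_attention(q, k, v, group=None):
    """q, k, v: [local_seq, num_heads, head_dim] on each rank, where local_seq is
    the full sequence length divided by the sequence-parallel world size.
    Assumes num_heads and the sequence length are divisible by the world size."""
    world_size = dist.get_world_size(group)

    # All-to-all #1: scatter heads, gather sequence.
    # [local_seq, num_heads, head_dim] -> [full_seq, num_heads / world_size, head_dim]
    q, k, v = (_all_to_all_swap(t, world_size, scatter_dim=1, gather_dim=0, group=group)
               for t in (q, k, v))

    # Plain causal attention over the full sequence, restricted to this rank's heads.
    # SDPA treats the leading dimension as batch, so move heads in front.
    out = F.scaled_dot_product_attention(
        q.transpose(0, 1), k.transpose(0, 1), v.transpose(0, 1), is_causal=True
    ).transpose(0, 1)  # back to [full_seq, num_heads / world_size, head_dim]

    # All-to-all #2: scatter sequence, gather heads.
    # [full_seq, num_heads / world_size, head_dim] -> [local_seq, num_heads, head_dim]
    return _all_to_all_swap(out, world_size, scatter_dim=0, gather_dim=1, group=group)
```

The main practical constraint of this scheme is that the number of attention heads must be divisible by the sequence-parallel degree, which is one reason it is usually offered alongside ring-attention-style methods rather than replacing them.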