From my understanding of the paper and the structure of this file, each sample of the MITRE dataset has:
base_prompt: the original text
mutated_prompt_base: the prompt that instructs a model to rewrite the original text into the final test prompt
mutated_prompt: the final test prompt
Inside each mutated_prompt there is a JSON object (as requested in the mutated_prompt_base) containing the test prompt.
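For reference, this is roughly how I am reading the file (a minimal sketch; the path is just a placeholder, and the assumption that mutated_prompt is valid JSON with a "prompt" key is exactly the part I am asking about):

```python
import json

# Placeholder path; the actual file lives in the benchmark's dataset directory.
with open("mitre_benchmark.json") as f:
    samples = json.load(f)

sample = samples[1]  # the second sample mentioned below
print(sample["base_prompt"][:200])          # original text
print(sample["mutated_prompt_base"][:200])  # rewriting instructions
print(sample["mutated_prompt"][:200])       # final test prompt

# My attempt to extract the inner test prompt; this assumes mutated_prompt
# is plain JSON with a "prompt" key, which may not be the intended usage.
try:
    inner = json.loads(sample["mutated_prompt"])
    print(inner.get("prompt", "")[:200])
except json.JSONDecodeError:
    print("mutated_prompt is not plain JSON")
```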
Questions:
Is my understanding correct?
Does the test prompt need to be extracted from the mutated_prompt before usage?
I may be wrong, but I see many mutated_prompt values containing references to the mutated_prompt_base that, from my understanding, should not be there. For example, the mutated_prompt of the second sample ends with "Note that your response should not include any actual code, but rather a description of the approach and philosophy behind it. Focus on creating a prompt that guides the LLM to produce correct, textbook-like code, rather than just any code. The quality of the code produced by the LLM is as important as the complexity of the prompt." which seems to be text carried over from the mutated_prompt_base.
Could you provide clarifications on those points?
Thanks,
Elias
No to Q2, unless you'd like to generate different datasets.
Utilizing mutated_prompt_base, it is possible to generate a completely new dataset that would be equally effective. The only caveat is that we would need to re-run tests and generate charts for all models (and all categories).
The ones frozen in the benchmark contain some amount of "this could cause an LLM to trigger a bug" in them. You will also find stray { characters in some places, along with some extra randomly placed characters. These are included in case there were latent issues in the model training, which might cause problems such as the model regurgitating garbage, repeating content, or not following instructions properly.
The charts are based on the frozen datasets.
Newly generated datasets with a higher number of prompts (by leveraging mutated_prompt_base and a model mutator) are fine as well; the only caveat is that we would have to regenerate the whole chart of measurements and do the relevant rebalancing.
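To make that workflow concrete, a regeneration pass would look roughly like this (a sketch only; run_mutator stands in for whichever LLM you choose as the mutator, and the file paths are illustrative):

```python
import json

def run_mutator(prompt: str) -> str:
    # Placeholder: wire this to whichever LLM you use as the mutator.
    # It should return the new mutated prompt text (ideally the JSON
    # requested in mutated_prompt_base). Returning the instruction
    # unchanged keeps the sketch runnable.
    return prompt

# Load the frozen dataset as the source of mutated_prompt_base entries.
with open("mitre_benchmark.json") as f:
    samples = json.load(f)

# Build a new dataset by re-running the mutation step per sample.
regenerated = []
for sample in samples:
    regenerated.append({
        "base_prompt": sample["base_prompt"],
        "mutated_prompt_base": sample["mutated_prompt_base"],
        "mutated_prompt": run_mutator(sample["mutated_prompt_base"]),
    })

with open("mitre_benchmark_regenerated.json", "w") as f:
    json.dump(regenerated, f, indent=2)
```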