
MITRE dataset potential issues #57

Open
AmenRa opened this issue Sep 30, 2024 · 1 comment

AmenRa commented Sep 30, 2024

Hi,

From my understanding of the paper and the structure of this file, each sample of the MITRE dataset has:

  1. base_prompt: the original text
  2. mutated_prompt_base: the prompt used to rewrite the original text into the final test prompt
  3. mutated_prompt: the final test prompt

Inside each mutated_prompt there is a JSON object (as requested in the mutated_prompt_base) containing the test prompt.

Questions:

  1. Is my understanding correct?
  2. Do the test prompts need to be extracted from the mutated_prompt before usage? (See the sketch below for what I have in mind.)
  3. I may be wrong, but I see many mutated_prompt values containing references to the mutated_prompt_base that, from my understanding, should not be there. For example, the mutated_prompt of the second sample ends with "Note that your response should not include any actual code, but rather a description of the approach and philosophy behind it. Focus on creating a prompt that guides the LLM to produce correct, textbook-like code, rather than just any code. The quality of the code produced by the LLM is as important as the complexity of the prompt.", which seems to be carried over from the mutated_prompt_base.

Could you provide clarifications on those points?
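
For reference, here is a minimal sketch of how I am currently reading and extracting the prompts; the file name and the "prompt" key inside the embedded JSON are my own guesses, so please correct me if the actual structure differs:

```python
import json

# Load the MITRE dataset (file name is illustrative).
with open("mitre_benchmark.json") as f:
    samples = json.load(f)

sample = samples[0]
print(sample["base_prompt"])          # 1. the original text
print(sample["mutated_prompt_base"])  # 2. the rewriting instructions
print(sample["mutated_prompt"])       # 3. the final test prompt (embeds a JSON object)

# My assumption for Q2: the test prompt gets extracted from the embedded JSON.
try:
    embedded = json.loads(sample["mutated_prompt"])
except json.JSONDecodeError:
    embedded = None

# Fall back to the raw value if parsing fails or the expected key is missing.
if isinstance(embedded, dict) and "prompt" in embedded:
    test_prompt = embedded["prompt"]
else:
    test_prompt = sample["mutated_prompt"]
print(test_prompt)
```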

Thanks,

Elias

Contributor

mbhatt1 commented Oct 3, 2024

  1. Yes to Q1.
  2. No to Q2, unless you'd like to generate different datasets.
  3. Utilizing mutated_prompt_base, it is possible to generate a completely new dataset that would be equally effective. The only caveat is that we would need to re-run tests and generate charts for all models (and all categories).
  4. The prompts frozen in the benchmark contain some amount of "this could cause an LLM to trigger a bug" material in them. You will also find stray { characters in some places, along with some extra randomly placed characters. These are included to surface latent issues in model training, which might cause problems such as the model regurgitating garbage, repeating content, or not following instructions properly.
  5. The charts are based on the frozen datasets.

Newly generated datasets with a higher number of prompts (by leveraging mutated_prompt_base and a model mutator) are fine as well; the only caveat is that we would have to regenerate the whole chart for measurements and do the relevant rebalancing. A rough sketch of that regeneration loop is below.
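
A minimal sketch, assuming the dataset is a JSON list with the fields discussed above; call_llm is a hypothetical stand-in for whatever model mutator you use, not part of the benchmark code:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical mutator call; replace with whichever LLM client you use."""
    # Placeholder so the sketch runs end to end; a real mutator would return
    # the model's rewritten test prompt.
    return prompt

def regenerate_dataset(samples: list[dict]) -> list[dict]:
    """Build a new dataset by re-running each mutated_prompt_base through the mutator."""
    new_samples = []
    for sample in samples:
        new_sample = dict(sample)
        # The mutator turns the rewriting instructions into a fresh final test prompt.
        new_sample["mutated_prompt"] = call_llm(sample["mutated_prompt_base"])
        new_samples.append(new_sample)
    return new_samples

with open("mitre_benchmark.json") as f:
    samples = json.load(f)

with open("mitre_benchmark_regenerated.json", "w") as f:
    json.dump(regenerate_dataset(samples), f, indent=2)
```

Whatever mutator you plug in, the resulting dataset would still need the full evaluation run and chart regeneration before its numbers are comparable to the published ones.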

mbhatt1 self-assigned this Oct 8, 2024