Skip to content

prompt.fail explores prompt injection techniques in large language models (LLMs), providing examples to improve LLM security and robustness.

License

Notifications You must be signed in to change notification settings

jalvarezz13/prompt.fail

Repository files navigation

prompt.fail

Welcome to prompt.fail, a project dedicated to exploring and documenting techniques for prompt injection in large language models (LLMs). Our mission is to enhance the security and robustness of LLMs by identifying and understanding how malicious prompts can manipulate these models. By sharing and analyzing these techniques, we aim to build a community that contributes to the development of more resilient AI systems.

Table of Contents

🔓 What is Prompt Injection?

Prompt injection is a critical area of study in the field of AI safety and security. It involves crafting specific inputs (prompts) that can cause large language models to behave in unintended or harmful ways. Understanding these vulnerabilities is essential for improving the design and implementation of future AI systems.

OWASP Top 10 for Large Language Model Applications

You can find the prompt injection techniques in the first position of the OWASP Top 10 for Large Language Model Applications. The OWASP Top 10 for Large Language Model Applications is a list of the most critical security risks to be aware of when working with large language models (LLMs). OWASP says: "Manipulating LLMs via crafted inputs can lead to unauthorized access, data breaches, and compromised decision-making."

Why is Prompt Injection Important?

Prompt injection can lead to a wide range of security risks, including:

  • Data Leakage: Malicious prompts can cause LLMs to reveal sensitive information.
  • Bias Amplification: Biased prompts can reinforce or amplify existing biases in the model.
  • Adversarial Attacks: Attackers can manipulate LLMs to generate harmful or misleading content.
  • Privacy Violations: Prompts can be used to extract personal data or violate user privacy.

This repository is a collaborative effort to document various prompt injection techniques. We encourage contributions from the community to help expand our knowledge base and share insights on how to mitigate these risks.

📝 Examples

🚧 Work in progress here... 🚧

✍️ Contributing

We highly appreciate contributions from the community. Here’s how you can contribute:

Option 1: Open an Issue

If you have an idea for a new prompt injection technique, idea, or question, feel free to open an issue. We welcome all feedback and suggestions.

Option 2: Submit a Pull Request

If you would like to contribute with code or documentation, you can submit a pull request. Here’s how you can do it:

  1. Fork the repository.
  2. Create a new branch (Example: feature/your-feature).
  3. Commit your changes (Please, use conventional commits conventions).
  4. Push to the branch
  5. Open a Pull Request.

Let’s work together to make prompt.fail a valuable resource for the Cybersecurity & AI community!

📜 License

This project is licensed under the GPL-3.0 license. For more information, please refer to the LICENSE file.




Made with ❤️ by jalvarezz13
Inspired by the amsi.fail project

About

prompt.fail explores prompt injection techniques in large language models (LLMs), providing examples to improve LLM security and robustness.

Topics

Resources

License

Stars

Watchers

Forks