Context-Aware Sanitization #550
Comments
I think this is a pretty powerful request and we might want to take it step by step (i.e. split it into multiple steps). Without thinking too much, the first two things that came to my mind are:
More thinking tbd :-)
@jhrozek I agree, and I was thinking the prompt route would be the first to investigate, but as you say, sometimes these things can be brittle and vary from model to model. I don't think we need a policy engine per se; starting with regexes that can be set by the user should be sufficient. Are you thinking of being able to plug in evaluation engines? I agree that is a lot more extensive (but very interesting too, and I had not even thought of that). To reiterate though: yes, simple and light should be the first approach, but not of a type that would make it difficult to extend and refine over time.
To address these use cases, CodeGate will need to use a combination of contextual (a) rule-based inspection/filtering/sanitization of the request (user->llm) and response (llm->user), and (b) system prompts sent with the request. Another idea is to integrate CodeGate with Bandit to identify and report any CWEs in the LLM-generated code.
Overview
Context-Aware Sanitization would provide a way for CodeGate to selectively filter, transform, or prevent AI-generated code suggestions from overwriting declared code snippets based on contextual rules. Our existing “secret scanning on the fly” feature already redacts and encrypts tokens (such as passwords, API keys, or other sensitive data) before passing them to the LLM, then restores the original values on the return path. Context-Aware Sanitization would extend this a step further by letting users define custom constraints and exceptions for certain code snippets.
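The redact-and-restore round trip described above can be sketched roughly as follows. This is a minimal illustration, not CodeGate's actual implementation: the `SECRET_PATTERN` regex, the placeholder format, and the `redact`/`restore` function names are assumptions, and a plain in-memory mapping stands in for the encryption step.

```python
import re

# Hypothetical pattern: matches values assigned to a few well-known secret names.
SECRET_PATTERN = re.compile(
    r"(?P<name>\b(?:API_KEY|PASSWORD|TOKEN)\b\s*=\s*)(?P<value>\S+)"
)


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each secret value with a placeholder and remember the original."""
    vault: dict[str, str] = {}

    def _swap(match: re.Match) -> str:
        placeholder = f"<REDACTED:{len(vault)}>"
        vault[placeholder] = match.group("value")
        return match.group("name") + placeholder

    return SECRET_PATTERN.sub(_swap, text), vault


def restore(text: str, vault: dict[str, str]) -> str:
    """Put the original values back on the return path."""
    for placeholder, value in vault.items():
        text = text.replace(placeholder, value)
    return text
```

The text sent to the LLM is the first element returned by `redact`; the mapping stays local and is only used by `restore` on the response.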
For example, a user might declare that:
DB_HOST must never be overwritten or read by the LLM.
The Users and Payments tables must never be dropped or altered, etc.
/User/**LukeHinds**/folder must never be exposed or modified.
This ensures that no AI recommendation made via an assistant, agent, or MCP violates the user's rules by changing the content of a declared code snippet (e.g., the examples above).
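One way such declarations could be expressed is as user-supplied regular expressions, as suggested in the comments. The `ProtectedItem` class, its fields, and the patterns below are hypothetical and only illustrate the idea; the path is the one used in the example use cases.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectedItem:
    """A user-declared snippet that the LLM must not touch (hypothetical model)."""
    name: str
    pattern: re.Pattern


# Illustrative declarations mirroring the examples in this issue.
PROTECTED = [
    ProtectedItem("db_host", re.compile(r"\bDB_HOST\b")),
    ProtectedItem(
        "core_tables",
        re.compile(r"\b(?:DROP|ALTER)\s+TABLE\s+(?:Users|Payments)\b", re.IGNORECASE),
    ),
    ProtectedItem("private_folder", re.compile(r"/User/lukehinds/mycode")),
]


def violations(text: str) -> list[str]:
    """Return the names of all protected items that the text touches."""
    return [item.name for item in PROTECTED if item.pattern.search(text)]
```

A request or response with a non-empty `violations` result would then be handed to whatever action the rule specifies.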
Key Objectives
Users can define item-specific or file-specific rules that precisely dictate what the AI assistant can modify.
Leverages an intercept-and-filter pipeline approach similar to the one used for “secret scanning on the fly”.
A simple UI for creating, editing, and deleting these sanitization rules.
Protects not just secrets but also critical code sections, databases, or environment configurations from unintentional or even malicious changes recommended by the LLM via an agent, assistant, or MCP instance.
Possible Functionality
Rule actions such as block, sanitize, or warn (e.g., do not allow changes, automatically mask references, or show a warning to the user).
Relationship to Other Work
This feature depends on Issue #454, Codegate Workspaces (repos) (the forthcoming “CodeGate Projects” feature), which will provide the underlying data model and UI scaffolding for storing and managing sanitization rules for specific code bases (i.e., repos).
We may be able to reuse the existing pipeline where tokens are redacted and encrypted prior to LLM submission, ensuring a uniform approach. Context-Aware Sanitization would simply plug additional scanning rules into that pipeline.
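A rough sketch of how such rules might plug into the pipeline as one more step, using the block, sanitize, and warn actions. The `Action`, `Rule`, and `StepResult` types and the `apply_rules` function are assumptions for illustration and do not reflect CodeGate's real pipeline interfaces.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    SANITIZE = "sanitize"
    WARN = "warn"


@dataclass(frozen=True)
class Rule:
    pattern: re.Pattern
    action: Action


@dataclass
class StepResult:
    text: str                # text passed on to the next pipeline step
    blocked: bool = False    # True if a block rule matched
    warnings: list[str] = field(default_factory=list)


def apply_rules(text: str, rules: list[Rule], placeholder: str = "[SANITIZED]") -> StepResult:
    """Run every rule over the text and apply its action."""
    result = StepResult(text=text)
    for rule in rules:
        if not rule.pattern.search(result.text):
            continue
        if rule.action is Action.BLOCK:
            # Drop the whole suggestion; nothing is forwarded.
            return StepResult(text="", blocked=True, warnings=result.warnings)
        if rule.action is Action.SANITIZE:
            # Mask only the matching references and keep the rest.
            result.text = rule.pattern.sub(placeholder, result.text)
        else:  # Action.WARN
            result.warnings.append(f"matched rule: {rule.pattern.pattern}")
    return result
```

Keeping the step as a pure function over text means the same rules could run on both the request and the response path.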
Technical Considerations
Example Use Cases
block modifications to the Users or Payments table. A suggestion that contains DROP TABLE Users; or modifies a “Payments” schema is automatically replaced with a sanitized placeholder (or flagged to the developer).
warn for any attempt to reassign environment variables in .env files.
sanitize any /User/lukehinds/mycode references from code suggestions or prompts, to avoid leaking unwanted information.
Feasibility
User request reference from @aj47 that was the inspiration for this idea.