Use .gitignore as part of the excluded file list #1090

Open

ericwb
Member

@ericwb ericwb commented Jan 9, 2024

When using Bandit to scan projects based on Git source control,
it would be beneficial to ignore files based on the patterns
in the .gitignore file.

Today, Bandit has some default excludes that get overridden if
a user passes in other excludes. This is a bit confusing to the
end user. But it also serves a purpose similar to .gitignore in
that the paths excluded by default are typically included in a
.gitignore.
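
The override behavior described above can be pictured with a minimal sketch. The names `DEFAULT_EXCLUDES` and `effective_excludes` and the listed values are hypothetical, chosen for illustration; they are not taken from Bandit's source.

```python
# Hypothetical default excludes; Bandit's real defaults may differ.
DEFAULT_EXCLUDES = [".git", ".tox", "__pycache__"]


def effective_excludes(user_excludes=None):
    """Return the exclude list, mimicking the override behavior described above."""
    # Any user-supplied value replaces the defaults outright; nothing is merged.
    if user_excludes:
        return list(user_excludes)
    return list(DEFAULT_EXCLUDES)


print(effective_excludes())           # the defaults apply
print(effective_excludes(["build"]))  # the defaults are dropped entirely
```

In this sketch, passing a single extra exclude silently brings `.tox` back into the scan, which is the confusing part for end users.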

Note, it will only check for .gitignore files in top-level directories
specified on the Bandit command line as targets. It does not recursively
look for .gitignore files. This is done because recursive searching
for .gitignore files would be complex to add to Bandit's existing
file discovery.
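
As a rough illustration of the top-level-only lookup described above, here is a standard-library sketch. It is not the code in this change: the helper names `load_top_level_patterns` and `is_ignored` are hypothetical, and `fnmatch` on individual path components is a simplified stand-in for real .gitignore matching (no negation, anchoring, or patterns containing slashes).

```python
import fnmatch
import os


def load_top_level_patterns(targets):
    """Collect patterns from the .gitignore directly inside each target directory.

    Nested .gitignore files are deliberately not searched for.
    """
    patterns = {}
    for target in targets:
        ignore_file = os.path.join(target, ".gitignore")
        if not os.path.isfile(ignore_file):
            continue
        with open(ignore_file, encoding="utf-8") as handle:
            lines = [line.strip() for line in handle]
        # Skip blank lines and comments; negation ("!") is not handled here.
        patterns[target] = [
            line.rstrip("/")
            for line in lines
            if line and not line.startswith(("#", "!"))
        ]
    return patterns


def is_ignored(path, target, patterns):
    """Simplified check: does any path component under target match a pattern?"""
    relative = os.path.relpath(path, target)
    parts = relative.split(os.sep)
    return any(
        fnmatch.fnmatch(part, pattern)
        for part in parts
        for pattern in patterns
    )
```

With a top-level .gitignore containing `.tox/`, a file under `<target>/.tox/` would be reported as ignored, while patterns in `<target>/sub/.gitignore` would have no effect because that file is never read.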

This change adds a new dependency, ignorelib, which is licensed under Apache 2.0.

Fixes #826

@ericwb
Member Author

ericwb commented Jan 9, 2024

FYI, in my testing, this change significantly sped up scans of the 4 test repos I used. It skipped many files that the default excludes didn't catch, most likely because the default excludes expect you to run Bandit from within the repo's working directory in order to properly exclude those file patterns.

In my case, the scan went from 4 minutes to just 8 seconds.

@sigmavirus24
Member

Should probably support hgignore too

@ericwb
Member Author

ericwb commented Jan 9, 2024

> Should probably support hgignore too

Is there much use of Mercurial anymore? Also, we'd have to find and pull in a new dependency to parse .hgignore files since GitPython only works with .gitignore.

https://gitpython.readthedocs.io/en/stable/reference.html#git.repo.base.Repo.ignored

@ericwb
Member Author

ericwb commented Jan 9, 2024

FYI, I also opened an issue on GitPython to support ignored file lists whose size exceeds ARG_MAX: gitpython-developers/GitPython#1790
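
For context, ARG_MAX is the operating system's cap on the combined size of the arguments (and environment) passed to a new process, so a very long path list cannot be handed to one external command. One common workaround is to split the list into batches, sketched below. This is an illustration only, not GitPython's code or the fix proposed in the linked issue; `batch_by_arg_size` and `max_bytes` are hypothetical names.

```python
def batch_by_arg_size(paths, max_bytes):
    """Yield lists of paths whose combined encoded size stays within max_bytes.

    Each path is counted as its UTF-8 length plus one byte for the
    terminating NUL, which approximates how argument size is counted.
    A single path larger than max_bytes is still yielded on its own.
    """
    batch, size = [], 0
    for path in paths:
        cost = len(path.encode("utf-8")) + 1
        # Start a new batch when adding this path would exceed the budget.
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(path)
        size += cost
    if batch:
        yield batch
```

Each batch could then be checked in a separate call, for example by passing it to `Repo.ignored(*batch)`, with `max_bytes` set comfortably below the platform limit.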

@sigmavirus24
Member

> > Should probably support hgignore too
>
> Is there much use of Mercurial anymore? Also, we'd have to find and pull in a new dependency to parse .hgignore files since GitPython only works with .gitignore.
>
> https://gitpython.readthedocs.io/en/stable/reference.html#git.repo.base.Repo.ignored

Yeah, there is. Lots of companies still use it. We could fundraise around it if necessary to pay someone to implement it.

@lukehinds
Member

> > > Should probably support hgignore too
> >
> > Is there much use of Mercurial anymore? Also, we'd have to find and pull in a new dependency to parse .hgignore files since GitPython only works with .gitignore.
> > https://gitpython.readthedocs.io/en/stable/reference.html#git.repo.base.Repo.ignored
>
> Yeah, there is. Lots of companies still use it. We could fundraise around it if necessary to pay someone to implement it.

Have we had many folks requesting this?

@sigmavirus24
Member

gitignore? No.

hgignore? Also no

@ericwb
Member Author

ericwb commented Jan 19, 2024

I'd argue this is a good improvement to the usability of Bandit. Many projects that use tox already have .tox in their .gitignore, so this addition helps them avoid having to pass many directories to the exclude list in the CLI or Bandit config.

@ericwb
Member Author

ericwb commented Jan 19, 2024

We can also use ignorelib (https://pypi.org/project/ignorelib/) instead of GitPython. That could make the code cleaner and would remove the dependency on GitPython, which shells out to run a command.

When using Bandit to scan projects based on Git source control,
it would be beneficial to ignore files based on the patterns
in the .gitignore file.

Today, Bandit has some default excludes that get overridden if
a user passes in other excludes. This is a bit confusing to the
end user. But it also serves a purpose similar to .gitignore in
that the paths excluded by default are typically included in a
.gitignore.

Note, it will only check for .gitignore files in top-level directories
specified on the Bandit command line as targets. It does not recursively
look for .gitignore files. This is done because recursive searching
for .gitignore files would be complex to add to Bandit's existing
file discovery.

This change adds a new dependency, ignorelib, which is licensed under Apache 2.0.

Fixes PyCQA#826

Signed-off-by: Eric Brown <[email protected]>
Development

Successfully merging this pull request may close these issues.

Use .gitignore as basis of default excludes
3 participants