How will PyPI filtering of SBOM data work? #1

Open
miketheman opened this issue Nov 1, 2024 · 5 comments
Labels
question Further information is requested

Comments

@miketheman
Member

Set of filtering requirements to add to popular package indexes like PyPI to ensure other tools are adhering to standards.

Today this is basically accomplished with trove classifiers, which are self-advertised. For example, projects that say they are typed: https://pypi.org/search/?q=&o=&c=Typing+%3A%3A+Typed
This can lead to a disconnect from when the functionality is added to when the classifier is added - a la glyph/automat#161

Do you envision a different mechanism that is more verifiable, i.e. from the metadata itself?

If so, there are existing challenges with metadata storage that likely need to be addressed first (and we should fix that!), but it's not super simple.

@miketheman added the "question" (Further information is requested) label on Nov 1, 2024
@sethmlarson
Member

I was imagining less intense filtering for "absolute correctness" and more:

  • Are the SBOM documents valid JSON?
  • If a document claims to be SPDX/CycloneDX, does it actually have the bare minimum to be recognized by other tools? (Usually the format version and a few required fields.)
  • Is the primary component missing from the SBOM, or does any of its basic information not match the Python dist metadata?

This will help filter out cases where tools are generating SBOM data incorrectly and prevent silent "no linkage" scenarios. There might be more conditions to add as we discover more!
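The three checks listed above could be sketched roughly as follows. This is only an illustration, not anything PyPI implements: the function names (`normalize`, `primary_component`, `check_sbom`) are hypothetical, the "bare minimum" field lists are an assumption about what counts as recognizable, and the primary component is assumed to be `metadata.component` for CycloneDX and the package named in `documentDescribes` for SPDX 2.x JSON.

```python
import json
import re


def normalize(name):
    """Normalize a project name the way PEP 503 does."""
    return re.sub(r"[-_.]+", "-", name).lower()


def primary_component(doc):
    """Return (name, version) of the primary component, or None if absent."""
    if doc.get("bomFormat") == "CycloneDX":
        metadata = doc.get("metadata")
        comp = metadata.get("component") if isinstance(metadata, dict) else None
        if isinstance(comp, dict):
            return comp.get("name"), comp.get("version")
        return None
    # Assumption: SPDX 2.x JSON, primary package is listed in documentDescribes.
    described = doc.get("documentDescribes")
    packages = doc.get("packages")
    if isinstance(described, list) and isinstance(packages, list):
        for pkg in packages:
            if isinstance(pkg, dict) and pkg.get("SPDXID") in described:
                return pkg.get("name"), pkg.get("versionInfo")
    return None


def check_sbom(raw, dist_name, dist_version):
    """Return a list of problems; an empty list means every check passed."""
    # Check 1: is the document valid JSON?
    try:
        doc = json.loads(raw)
    except ValueError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top-level JSON value is not an object"]

    # Check 2: does it carry the bare minimum for the format it claims?
    # (These field lists are illustrative assumptions, not a full schema.)
    if doc.get("bomFormat") == "CycloneDX":
        required = ("specVersion",)
    elif "spdxVersion" in doc:
        required = ("SPDXID", "name", "dataLicense")
    else:
        return ["not recognizable as SPDX or CycloneDX"]
    problems = [f"missing field: {key}" for key in required if key not in doc]

    # Check 3: does the primary component exist and match the dist metadata?
    primary = primary_component(doc)
    if primary is None:
        problems.append("no primary component")
        return problems
    name, version = primary
    if normalize(name or "") != normalize(dist_name):
        problems.append(f"name mismatch: {name!r} != {dist_name!r}")
    if version != dist_version:
        problems.append(f"version mismatch: {version!r} != {dist_version!r}")
    return problems
```

Returning a list of problems rather than raising on the first failure means the same function could report everything at once, whether it runs at upload time or in a client-side check.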

@sethmlarson changed the title from "PyPI filtering" to "How will PyPI filtering of SBOM data work?" on Nov 1, 2024
@miketheman
Member Author

Those kinds of filters make sense. Do you envision them being part of the upload phase, a post-upload verification, or a user-side search?

@sethmlarson
Member

@miketheman I was imagining having them as a part of the upload phase if that's possible. I don't know how "post-upload" verification surfaces to the user. Is that documented somewhere?

@miketheman
Member Author

> I don't know how "post-upload" verification surfaces to the user, is that documented somewhere?

Largely because it doesn't exist yet! 😆 But since we kick off tasks in response to a completed upload, we could perform analyses post-upload and persist results, and then surface them to the package managers or the world.

The upload step is a bit "heavy" now, so I'd suggest that this step include some refactoring to make it flow a little better.

And if any of these checks could be performed prior to upload (a la twine check) and save the user from an attempt/failure/fix/attempt/... cycle, that'd be good UX in my mind.
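As a rough idea of what a local, pre-upload check might look like, here is a minimal sketch that scans a built wheel for SBOM files and reports any that fail to parse as JSON. It is not part of twine; `sbom_members` and `precheck_wheel` are hypothetical names, and the location of SBOM files inside the wheel (`<name>.dist-info/sboms/`) is an assumption that this thread does not specify.

```python
import json
import zipfile


def sbom_members(names):
    """Pick archive members under '<name>.dist-info/sboms/' (assumed location)."""
    picked = []
    for name in names:
        parts = name.split("/")
        in_sboms_dir = (
            len(parts) >= 3
            and parts[0].endswith(".dist-info")
            and parts[1] == "sboms"
        )
        # parts[-1] is empty for directory entries, which are skipped.
        if in_sboms_dir and parts[-1]:
            picked.append(name)
    return picked


def precheck_wheel(wheel):
    """Return {member: error} for SBOM files in a wheel that are not valid JSON.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    errors = {}
    with zipfile.ZipFile(wheel) as archive:
        for member in sbom_members(archive.namelist()):
            try:
                json.loads(archive.read(member))
            except ValueError as exc:
                errors[member] = str(exc)
    return errors
```

Running something like `precheck_wheel("dist/example-1.0-py3-none-any.whl")` (an illustrative path) before uploading would catch a malformed SBOM locally instead of after a rejected upload.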

@sethmlarson
Member

Got it. twine check is a great idea, I'll look at that too.

2 participants