Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SBOM schema validation #17987

Merged
merged 2 commits into from
Aug 8, 2024
Merged

Fix SBOM schema validation #17987

merged 2 commits into from
Aug 8, 2024

Conversation

Bo98
Copy link
Member

@Bo98 Bo98 commented Aug 8, 2024

So our schema validation has never worked. It's always been using the wrong format - some JSON LD from SPDX 3.0. We're using SPDX 2.3 and we want a JSON Schema - not JSON LD.

This PR fixes it and did detect some issues, e.g. the created date was in the incorrect format (should have been ISO 8601).

There's an outstanding problem here: our stripping of created for bottles is producing an invalid SBOM. The created field is mandatory.

I'm not sure what the correct fix for that would be. There's also a number of other outstanding issues of reproducibility - a bottle built one week is currently producing a different SBOM to one build the next week when brew tags are made. Seems like we're currently undoing all of the benefits of --only-json-tab, where we made sure the tab is stored in the package manifest rather than the bottle tarball itself.

@Bo98 Bo98 marked this pull request as draft August 8, 2024 01:24
@Bo98 Bo98 force-pushed the sbom-schema-fix branch 2 times, most recently from e51c2d7 to e8809d6 Compare August 8, 2024 01:30
@carlocab
Copy link
Member

carlocab commented Aug 8, 2024

There's an outstanding problem here: our stripping of created for bottles is producing an invalid SBOM. The created field is mandatory.

Can't we add this back at pour time so that the on-disk SBOM is valid?

In any case: the SBOMs in the bottle are incomplete anyway, and shouldn't be used. We should maybe rename them while they're still in the bottle so it's not misleading.

@Bo98
Copy link
Member Author

Bo98 commented Aug 8, 2024

Can't we add this back at pour time so that the on-disk SBOM is valid?

In any case: the SBOMs in the bottle are incomplete anyway, and shouldn't be used. We should maybe rename them while they're still in the bottle so it's not misleading.

We can add it at pour time and I think we do. Though at which point: why ship an incomplete SBOM in the bottle at all when we can generate it entirely on install like we do for tabs?

@carlocab
Copy link
Member

carlocab commented Aug 8, 2024

why ship an incomplete SBOM in the bottle at all when we can generate it entirely on install like we do for tabs?

Not sure; I actually thought that we stopped shipping SBOMs in bottles entirely because they broke bottle reproducibility.

@MikeMcQuaid
Copy link
Member

The created field is mandatory.

We could set it to the source_modified_time so it's consistent between bottles. Alternatively, we set it to some arbitrary fixed time.

Seems like we're currently undoing all of the benefits of --only-json-tab

I agree, with hindsight, though that putting the SBOM as part of the OCI manifest/metadata seems like it would be a smarter move (although that comes with downsides, too, noted below).

There's also a number of other outstanding issues of reproducibility - a bottle built one week is currently producing a different SBOM to one build the next week when brew tags are made.

Yes, this is bad and something else that should be fixed.

In any case: the SBOMs in the bottle are incomplete anyway, and shouldn't be used.

This is binary thinking; being incomplete does not make these SBOMs useless. I think there's still some value in having something inside the bottle. It's easier to download and consume a bottle than our tab metadata. It's also more trustworthy to have said SBOM be part of the checksummed object rather than in the OCI manifest/metadata that can be overwritten at any time with a Homebrew/core change.

Though at which point: why ship an incomplete SBOM in the bottle at all when we can generate it entirely on install like we do for tabs?

Because:

  • some data is more useful than no data
  • providing an SBOM for the bottle rather than just the keg is useful

I think the short-term best solutions are:

  • stripping out more bad/reproducibility-harming metadata from the bottle's SBOM
  • fixing the schema validation errors in the bottle and poured SBOM

And, in the longer term:

  • add the SBOM into the OCI manifest/metadata
  • after this is done: consider if it should/could be removed from the bottle

@MikeMcQuaid
Copy link
Member

And, in the longer term:

  • add the SBOM into the OCI manifest/metadata
  • after this is done: consider if it should/could be removed from the bottle

Another thing: we should really have some regression tests for ensuring that we can create reproducible bottles. A lot of issues in SBOM and elsewhere would have been caught by this.

- Remove/change data from bottle SBOM to avoid harming reproduciblity
- Add `schema_validation_errors` method to provide nicer test failures
- Add tests more tests for SBOM when bottling
- Cleanup SBOM tests to use more typical RSpec form and be DRYer
@MikeMcQuaid MikeMcQuaid marked this pull request as ready for review August 8, 2024 08:39
@MikeMcQuaid
Copy link
Member

Should have fixed the short-term issues mentioned here as well as adding some more tests.

@MikeMcQuaid MikeMcQuaid merged commit eaa5204 into master Aug 8, 2024
25 checks passed
@MikeMcQuaid MikeMcQuaid deleted the sbom-schema-fix branch August 8, 2024 08:49
@MikeMcQuaid
Copy link
Member

Thanks for the PR @Bo98! I'm definitely game to make bigger changes but felt like it was important to resolve the issues you flagged here ASAP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants