gamlerhart
Contributor

@gamlerhart gamlerhart commented Mar 20, 2025

Motivation: In some companies, the development team has to produce a Software Bill of Materials (SBOM)
for their project for compliance reasons, to track dependencies and licenses across their organisation.
This PR provides a module that produces SBOMs in JSON format.

Changes in the core: Extended .getArtifact to return the coursier.Resolution as well.
This is then used to get the license information.

Outside the core: Add an SBOM contrib module

  • Generate the most basic CycloneDX SBOM files, supporting Java modules for a start
  • Provide a basic upload to the Dependency Track server
@gamlerhart
Contributor Author

gamlerhart commented Mar 21, 2025

I have these high-level questions:

  • I implemented the SBOM JSON from scratch, so that we can use uPickle etc. and do not add libraries.
    The alternative is to use the CycloneDX 'model' library, which mostly implements the JSON and some hashing.
    But that adds an extra library: more stuff to download, more stuff on the classpath, etc.
    So the question is: what is preferred in general? Avoiding external libraries when possible, or going for maximum compatibility and including more external libraries?

  • Is adding this to the 'contrib' section the right place? The alternative seems to be a completely external library. However, that adds extra maintenance burden: pushing it to Maven repos, compiling it, versioning it. So having it in sync seems better.

  • I've extended the data returned from the Coursier .artifacts method. I think that is OK, because that method wasn't yet published in 0.12.9?

(update: Android tests are fixed)

@gamlerhart gamlerhart marked this pull request as ready for review March 21, 2025 11:08
Member

@lefou lefou left a comment

I have these high-level questions:

  • I implemented the SBOM JSON from scratch, so that we can use uPickle etc. and do not add libraries.
    The alternative is to use the CycloneDX 'model' library, which mostly implements the JSON and some hashing.
    But that adds an extra library: more stuff to download, more stuff on the classpath, etc.
    So the question is: what is preferred in general? Avoiding external libraries when possible, or going for maximum compatibility and including more external libraries?

I'd say it depends on how stable the CycloneDX format is. We don't have an issue with downloading additional dependencies, as long as we encapsulate them in an isolated classloader and properly manage their use (e.g. share the same classloader for multiple modules via a worker module).

But the model added here seems small enough and the benefit of being directly cacheable by Mill as a task result is something I find useful.

  • Is adding this to the 'contrib' section the right place? The alternative seems to be a completely external library. However, that adds extra maintenance burden: pushing it to Maven repos, compiling it, versioning it. So having it in sync seems better.

Contrib is exactly for when you don't want to host it yourself.

  • I've extended the data returned from the Coursier .artifacts method. I think that is OK, because that method wasn't yet published in 0.12.9?

I'd like to have @alexarchambault's thoughts on that. I think we don't want to return tuples from Resolver.artifacts (until we can use named tuples). Maybe we want to add a new method with a better name instead.

I added some comments below.

@gamlerhart gamlerhart force-pushed the sbom-for-java-deps branch from b740cdc to bdb28c2 Compare April 4, 2025 19:56
@gamlerhart
Contributor Author

  • I'll stick with the uPickle JSON for a start. That integrates well with the Mill ecosystem. I'll consider the CycloneDX library only if it gets too painful to write the JSON directly.

- case Right(res) =>
-   Result.Success(res)
+ case Right(artifacts) =>
+   Result.Success(ArtifactResolution(resolution, artifacts))
Collaborator

Maybe coursier.Fetch.Result could be used for that, like

Suggested change
- Result.Success(ArtifactResolution(resolution, artifacts))
+ Result.Success(Fetch.Result(resolution, artifacts.fullDetailedArtifacts0, artifacts.fullExtraArtifacts))

Collaborator

And we could rename CoursierModule#artifacts to CoursierModule#fetch then

Contributor Author

Used Fetch.Result and renamed.

Contributor Author

@alexarchambault Ping. Updated to Fetch.Result

@gamlerhart
Contributor Author

Rebased & Tests passing =)

@gamlerhart gamlerhart requested a review from lefou May 21, 2025 06:47
@lihaoyi lihaoyi force-pushed the main branch 2 times, most recently from 1d3b959 to 3ba698a Compare July 10, 2025 04:00
@lihaoyi
Member

lihaoyi commented Aug 26, 2025

@gamlerhart sorry we've overlooked this. Could you rebase this one more time, and flesh out the PR description with a more verbose explanation of what this is all about? I'm not familiar with SBOMs myself and need help understanding what's going on here

@gamlerhart
Contributor Author

Hi everyone. I'll try to get to it and rebase it.

@lihaoyi At this stage the PR is the most minimal possible SBOM. The idea is to get a 'foot in the door' for SBOM support, and then see what people actually need and extend it.

E.g. missing at the moment:

  • Different hash flavors (MD5, SHA-1, SHA-2 etc.) for included libraries. I added SHA-2 only.
  • A report for the build tool itself
  • Support for non-Maven dependencies.

Anyway, that is something to add if there is demand =).
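
For the first missing item, one possible shape for supporting several hash flavors is sketched below, using only the JDK's `MessageDigest`. `HashSketch` and its members are hypothetical names and not part of this PR; the algorithm labels are the ones CycloneDX uses in its `hashes` entries, which happen to coincide with the JCA algorithm names here.

```scala
import java.security.MessageDigest

// Sketch only: hypothetical helper, not the hashing code from this PR.
object HashSketch {
  // Labels valid both as CycloneDX `alg` values and as JCA algorithm names.
  val algorithms: Seq[String] = Seq("MD5", "SHA-1", "SHA-256")

  // Lower-case hex encoding of a digest.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => "%02x".format(b & 0xff)).mkString

  // Computes one (algorithm, hex digest) pair per configured algorithm.
  def digests(content: Array[Byte]): Seq[(String, String)] =
    algorithms.map { alg =>
      alg -> hex(MessageDigest.getInstance(alg).digest(content))
    }
}
```

In a real module the `content` would presumably be the bytes of each resolved jar, e.g. read via `java.nio.file.Files.readAllBytes`, and each pair would become one `{"alg": ..., "content": ...}` entry of the component.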

@gamlerhart gamlerhart marked this pull request as draft August 27, 2025 20:08
@lefou
Member

lefou commented Aug 27, 2025

@gamlerhart To get an idea of what such an SBOM looks like, could you attach an SBOM generated with this PR for Mill itself?

@gamlerhart gamlerhart force-pushed the sbom-for-java-deps branch 4 times, most recently from e68e4df to 006f36c Compare September 2, 2025 19:46
@gamlerhart
Contributor Author

Ok... I'm not sure how to create an SBOM for Mill itself with a dev build.

What I tried:

  1. I have Mill checked out in a second location
  2. I added //| mvnDeps: ["com.lihaoyi::mill-contrib-sbom:$MILL_VERSION"] to the build.mill in that directory, to add the contrib library
  3. I then ran, from this branch: ./mill dist.run /home/roman/mill-test-checkout resolve dist._

Then I get this error:

============================== resolve dist._ ==============================
[build.mill-60/65] compile
[build.mill-60] [info] compiling 22 Scala sources to /home/roman/dev-temp/mill/out/mill-build/compile.dest/classes ...
[build.mill-60] [error] -- [E046] Cyclic Error: /home/roman/dev-temp/mill/mill-build/src/millbuild/MillScalaModule.scala:12:67 
[build.mill-60] [error] 12 |trait MillScalaModule extends ScalaModule with MillJavaModule with ScalafixModule { outer =>
[build.mill-60] [error]    |                                                                   ^
[build.mill-60] [error]    |                             Cyclic reference involving val <import>
[build.mill-60] [error]    |
[build.mill-60] [error]    |                              Run with -explain-cyclic for more details.
[build.mill-60] [error]    |
[build.mill-60] [error]    | longer explanation available when compiling with `-explain`
[build.mill-60] [error] -- [E46] /home/roman/dev-temp/mill/libs/javalib/package.mill:123:53
[build.mill-60] [error] 123 │  object worker extends MillPublishScalaModule with BuildInfo {
[build.mill-60] [error]     │                                                    ^
[build.mill-60] [error]     │Cyclic reference involving val <import>
[build.mill-60] [error]     │
[build.mill-60] [error]     │ Run with -explain-cyclic for more details.
[build.mill-60] [error] -- [E8] /home/roman/dev-temp/mill/website/package.mill:3:12
[build.mill-60] [error] 3 │import org.jsoup._
[build.mill-60] [error]   │           ^^^^^
[build.mill-60] [error]   │value jsoup is not a member of org
[build.mill-60] [error] -- [E8] /home/roman/dev-temp/mill/runner/meta/package.mill:4:21
[build.mill-60] [error] 4 │import mill.contrib.buildinfo.BuildInfo
[build.mill-60] [error]   │                    ^^^^^^^^^

So, I think my plain dist.run doesn't work with the mill build itself.

@gamlerhart gamlerhart marked this pull request as ready for review September 2, 2025 20:50
@lihaoyi
Member

lihaoyi commented Sep 3, 2025

@gamlerhart since Mill has a meta-build in mill-build/build.mill, you need to add your dependency there instead of in the build header

@lefou
Member

lefou commented Sep 3, 2025

  1. Make a local release of your development snapshot:
> MILL_STABLE_VERSION=1 mill dist.installLocalCache
...
[7152] /home/lefou/.cache/mill/download/1.0.4-37-a03078
  2. Add the plugin to the Mill build
diff --git a/mill-build/build.mill b/mill-build/build.mill
@@ -15,6 +15,7 @@
     // TODO: implement empty version for ivy deps as we do in import parser
     mvn"com.lihaoyi::mill-contrib-buildinfo:${mill.api.BuildInfo.millVersion}",
     mvn"com.goyeau::mill-scalafix_mill1:0.6.0",
-    mvn"org.jsoup:jsoup:1.21.2"
+    mvn"org.jsoup:jsoup:1.21.2",
+    mvn"com.lihaoyi::mill-contrib-sbom:${mill.api.BuildInfo.millVersion}"
   )
 }
  3. Use the just-released Mill version via MILL_VERSION:
> MILL_VERSION="1.0.4-37-a03078" mill ...

or edit the build.mill:

diff --git a/build.mill b/build.mill
@@ -1,4 +1,4 @@
-//| mill-version: 1.0.4-26-a6e4c1
+//| mill-version: 1.0.4-37-a03078
 //| mill-jvm-opts: ["-XX:NonProfiledCodeHeapSize=250m", "-XX:ReservedCodeCacheSize=500m"]
 //| mill-opts: ["--jobs=0.5C"]
 

@gamlerhart
Contributor Author

Thanks for the help =). That worked.

The SBOM for Mill itself looks like this:
sbom.json

Or, for example, when viewing it in a tool like Dependency Track:
[image]

Reminder: this is a foot-in-the-door state:

  • There are tons and tons of data fields that could potentially be filled. What gets added depends on feedback about what people actually need.
  • Plus: AFAIK there are ways to represent sub-modules etc., but the generated SBOM is a flat list. Same here: we will see if people actually need that.

From my side: I'm probably more interested in getting the JavaScript/npm dependencies next, as having a JVM backend plus npm in the frontend is so common.
