bentsherman
Member

@bentsherman bentsherman commented Aug 27, 2025

This PR introduces a new syntax for processes that uses typed inputs and outputs. The existing syntax is still supported.

This PR refactors several large classes -- namely ProcessConfig and TaskProcessor -- to better separate concerns and enable a v1 / v2 model for process inputs/outputs. When moving existing code to new files, I tried to change it as little as possible to avoid breaking anything.

ProcessConfig refactor

The following new classes were spun out of ProcessConfig:

  • ProcessConfigV1 / ProcessConfigV2 extend ProcessConfig with the declared inputs / outputs based on legacy (v1) or typed (v2) semantics

  • ProcessDslV1 / ProcessDslV2 are builder DSLs for legacy / typed process definitions

  • ProcessConfigBuilder is an adapter for applying process configuration to a process definition

  • ProcessBuilder is the base builder class used by the above builders

TaskProcessor refactor

The following new classes were spun out of TaskProcessor:

  • TaskInputResolver implements the input file resolution from makeTaskContextStage2()

  • TaskOutputResolver implements the task output resolution logic for typed processes

  • TaskEnvCollector implements the output env/eval resolution from collectOutEnvMap()

  • TaskFileCollector implements the output file resolution from collectOutFiles()

Typed inputs / outputs

The following new classes implement the new behavior for typed inputs / outputs:

  • ProcessInputs and ProcessOutputs replace InputsList and OutputsList from the v1 model

  • ProcessInput and ProcessOutput replace all InParam and OutParam classes from the v1 model

  • ProcessFileInput and ProcessFileOutput replace FileInParam and FileOutParam in the v1 model

Backwards compatibility

The runtime supports both legacy (v1) and typed (v2) processes by creating the ProcessDef with either a ProcessConfigV1 or ProcessConfigV2.

ProcessDef, TaskProcessor, and TaskRun check this type to determine whether to use v1 or v2 semantics. An instanceof check is performed at these decision points:

if( config instanceof ProcessConfigV1 )
    // use legacy inputs/outputs
if( config instanceof ProcessConfigV2 )
    // use typed inputs/outputs
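
As a rough illustration of the dispatch described above, here is a minimal, self-contained sketch in Java. The nested classes are hypothetical stand-ins that only borrow the names ProcessConfig, ProcessConfigV1, and ProcessConfigV2; they are not the real Nextflow classes, and the `describeSemantics` helper and the process names are invented for the example. It assumes a config is always exactly one of the two subtypes and fails loudly otherwise.

```java
// Hypothetical stand-ins for illustration only; these are NOT the real Nextflow classes.
public class ProcessDispatchSketch {

    // Common base type, standing in for ProcessConfig.
    static abstract class ProcessConfig {
        final String processName;

        ProcessConfig(String processName) {
            this.processName = processName;
        }
    }

    // Stand-in for a config that carries legacy (v1) inputs/outputs.
    static final class ProcessConfigV1 extends ProcessConfig {
        ProcessConfigV1(String processName) {
            super(processName);
        }
    }

    // Stand-in for a config that carries typed (v2) inputs/outputs.
    static final class ProcessConfigV2 extends ProcessConfig {
        ProcessConfigV2(String processName) {
            super(processName);
        }
    }

    // Hypothetical helper: picks the semantics from the concrete config type.
    static String describeSemantics(ProcessConfig config) {
        if (config instanceof ProcessConfigV1) {
            return "legacy";
        }
        if (config instanceof ProcessConfigV2) {
            return "typed";
        }
        // Reject any config type that is neither v1 nor v2.
        throw new IllegalArgumentException(
            "Unknown process config type: " + config.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(describeSemantics(new ProcessConfigV1("FASTQC"))); // prints "legacy"
        System.out.println(describeSemantics(new ProcessConfigV2("QUANT")));  // prints "typed"
    }
}
```

Throwing on an unrecognized type is a choice made for this sketch; the PR description does not say how the runtime handles that case.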

Based on initial work in #4553

TODO:

  • update docs
  • update tests
  • add e2e tests

netlify bot commented Aug 27, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 814d40c
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/68e7243c5030b200087d9f12
😎 Deploy Preview https://deploy-preview-6368--nextflow-docs-staging.netlify.app

@bentsherman bentsherman changed the title from "Typed processe" to "Typed processes" Aug 27, 2025
@bentsherman bentsherman marked this pull request as ready for review September 1, 2025 17:53
@bentsherman bentsherman requested review from a team as code owners September 1, 2025 17:53
@bentsherman bentsherman added this to the 25.10 milestone Sep 1, 2025
Collaborator

@christopher-hakkaart christopher-hakkaart left a comment

Looking really good. I like the tutorial in particular. I think it's clear and includes the right amount of detail. Also, the order makes sense and it's a good length.

I will take a second pass and nitpick the language. In the meantime, I've added two high-level comments. They are very minor.

Member

@pditommaso pditommaso left a comment

This is too big a change compared to the current syntax. I do not support this approach.

@bentsherman bentsherman force-pushed the typed-processes branch 3 times, most recently from 25a80b1 to 0e7be56 September 5, 2025 20:44
@bentsherman
Member Author

Updated to use "phase 1" syntax, i.e. support for multiple input channels and tuple inputs

Collaborator

@christopher-hakkaart christopher-hakkaart left a comment

@bentsherman - I went through the tutorial in detail focusing on the language. I split everything into separate comments to hopefully make it easier to accept/reject.

I found some of the code blocks confusing: when I went into the example repo, the code blocks didn't match what was in the master branch. I'm fine with using the rnaseq-nf example, and a little bit of difference is okay, but anyone who does what I tried to do will find it hard to follow. Can we better align this? Alternatively, can we peel off this example from rnaseq-nf and start building a repo full of examples specifically for the docs? It might give a little more latitude for v1, v2, v3 of tutorials like this and allow better synergy between what is written and what's in the repo. Happy to hear your thoughts.

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@bentsherman bentsherman linked an issue Oct 9, 2025 that may be closed by this pull request
Successfully merging this pull request may close these issues.

DSL2 - emit tuples with optional values
4 participants