Wrap executorlib executors #678

Merged
merged 36 commits into main on Jul 18, 2025

Conversation

liamhuber
Member

This PR exploits the new caching interface provided in executorlib-1.5.0 so that nodes can rely on their running state and lexical path to access previously executed results.

Locally with the SingleNodeExecutor everything is looking good, but that executor doesn't natively support terminating the process that submits the job. I'd like to play around with this using the SlurmClusterExecutor on the cluster before making further changes.

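The caching idea described above can be pictured with a minimal stand-alone sketch. This is a hypothetical helper, not the executorlib or pyiron_workflow API: a node's lexical path names a cache file on disk, so a restarted process finds the previously executed result instead of recomputing it.

```python
import pickle
from pathlib import Path


def run_with_cache(lexical_path, fn, *args, cache_root=Path("cache")):
    """Hypothetical sketch of path-keyed result caching.

    The node's lexical path names its cache file, so a restarted
    process can pick up a previously executed result.
    """
    cache_file = cache_root / (lexical_path.replace("/", "_") + ".pkl")
    if cache_file.exists():
        # Cache hit: the path resolves to an already-saved result.
        return pickle.loads(cache_file.read_bytes())
    result = fn(*args)
    cache_root.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(pickle.dumps(result))
    return result
```

On a second invocation with the same lexical path, the function body is never executed; only the pickled result is read back.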
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>


codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 88.88889% with 9 lines in your changes missing coverage. Please review.

Project coverage is 92.05%. Comparing base (4841e34) to head (f579159).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pyiron_workflow/node.py | 70.00% | 6 Missing ⚠️ |
| pyiron_workflow/executors/wrapped_executorlib.py | 96.29% | 1 Missing ⚠️ |
| pyiron_workflow/mixin/run.py | 96.42% | 1 Missing ⚠️ |
| pyiron_workflow/nodes/composite.py | 66.66% | 1 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (88.88%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #678      +/-   ##
==========================================
- Coverage   92.11%   92.05%   -0.07%     
==========================================
  Files          33       34       +1     
  Lines        3665     3725      +60     
==========================================
+ Hits         3376     3429      +53     
- Misses        289      296       +7     


Since we now depend explicitly on a new feature

It causes a weird hang that blocks observability.

And make the expected file independently accessible

And hide it behind its own boolean flag for testing

And make the file name fixed and accessible at the class level

@liamhuber liamhuber dismissed a stale review July 10, 2025 18:16

bot spam

The slurm executor populates this with a submission script, etc.

From @jan-janssen in [this comment](pyiron/executorlib#708 (comment))

Co-authored-by: Jan Janssen
@liamhuber
Member Author

This currently delivers all of the attempted behaviour when run together with pyiron/executorlib#712

The local file executor got directly included in executorlib as a testing tool.

And always with-execute tuples, since there is only ever one instance of this executor. If we have already been assigned an executor _instance_, we trust the user to be managing its state and submit directly rather than wrapping in a with-clause

Recent changes threw off the balance of times in the first vs second run, so instead compare against what you actually care about: that the second run bypasses the sleep call.

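That testing style can be sketched in isolation. Here `slow_double` and its module-level `_cache` are stand-ins for node caching, not pyiron_workflow code; the point is asserting the cache-bypass property itself rather than comparing the two run times to each other.

```python
import time

SLEEP = 0.2
_cache = {}


def slow_double(x):
    """Sleep-backed computation with a result cache (illustrative only)."""
    if x not in _cache:
        time.sleep(SLEEP)  # the expensive part
        _cache[x] = x * 2
    return _cache[x]


slow_double(21)  # first run pays the sleep

start = time.perf_counter()
second = slow_double(21)  # second run should hit the cache
elapsed = time.perf_counter() - start

# Assert what you actually care about: the sleep was bypassed.
assert second == 42 and elapsed < SLEEP
```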
Instead of a with-clause. This way the executor is still permitted to release the thread before the job is done, but we still guarantee that executors created by bespoke instructions get shut down at the end of their one-future lifetime.

There was necessarily only the one future, so don't wait at shutdown. This removes the need for accepting the runtime error and prevents the wrapped executorlib executors from hanging indefinitely.

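Taken together, the last few commits amount to a dispatch like the following sketch (hypothetical helper name, shown with stdlib `concurrent.futures` rather than executorlib): a pre-built executor instance is trusted and submitted to directly, while a `(class, args, kwargs)` tuple builds a one-shot executor whose shutdown is guaranteed by a done-callback with `wait=False`.

```python
from concurrent.futures import Executor, Future, ThreadPoolExecutor


def submit_via_spec(executor_spec, fn, *args, **kwargs) -> Future:
    """Hypothetical dispatch helper for the two executor spec forms."""
    if isinstance(executor_spec, Executor):
        # A pre-built instance: the user is managing its state and
        # lifetime, so submit directly with no wrapping.
        return executor_spec.submit(fn, *args, **kwargs)
    # A (class, args, kwargs) tuple: build a fresh one-shot executor.
    cls, exe_args, exe_kwargs = executor_spec
    exe = cls(*exe_args, **exe_kwargs)
    future = exe.submit(fn, *args, **kwargs)
    # Guarantee shutdown at the end of the one-future lifetime via a
    # callback instead of a with-clause, so the submitting thread is
    # released before the job is done.  Only this one future was ever
    # submitted, so don't wait at shutdown; this avoids hanging.
    future.add_done_callback(lambda _: exe.shutdown(wait=False))
    return future


# Usage with stdlib executors:
fut_tuple = submit_via_spec((ThreadPoolExecutor, (), {"max_workers": 1}), pow, 2, 5)
pool = ThreadPoolExecutor(max_workers=1)
fut_inst = submit_via_spec(pool, pow, 2, 10)
```

Calling `shutdown(wait=False)` from the done-callback is safe even though the callback may run on a worker thread, because it does not join the pool's threads.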
@liamhuber
Member Author

ImportError: cannot import name 'TestClusterExecutor' from 'executorlib.api' (/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.13/site-packages/executorlib/api.py). Did you mean: 'FluxClusterExecutor'?

Waiting on executorlib-1.5.3

@liamhuber liamhuber changed the title [WIP] Wrap executorlib executors Wrap executorlib executors Jul 15, 2025
@liamhuber
Member Author

liamhuber commented Jul 15, 2025

Together with pyiron/executorlib#732 this is working very nicely on the cluster now. I can

  • Let the workflow run
  • Interrupt the workflow after the slurm job has started
    • Restart the workflow after the slurm job is complete
    • Restart the workflow while the slurm job is still going

and in all cases everything runs perfectly smoothly. I.e. I can start with this:

import pyiron_workflow as pwf
from pyiron_workflow.executors.wrapped_executorlib import CacheSlurmClusterExecutor


wf = pwf.Workflow("executor_test")
wf.n1 = pwf.std.UserInput(20)
wf.n2 = pwf.std.Sleep(wf.n1)
wf.n3 = pwf.std.UserInput(wf.n2)

wf.n2.executor = (CacheSlurmClusterExecutor, (), {"resource_dict": {"partition": "s.cmfe"}})

wf.run()

And then either let it run, or restart the kernel and follow up with this after the appropriate delay for the case I'm interested in:

import pyiron_workflow as pwf
from pyiron_workflow.executors.wrapped_executorlib import CacheSlurmClusterExecutor

wf = pwf.Workflow("executor_test")
wf.load(filename=wf.label + "/recovery.pckl")

wf.failed = False
wf.use_cache = False
wf.run()

Outside the scope of this PR but on the TODO list is:

Including the lower bound

And debug the error message

Since that's the way users will typically interact with this field. I also had to change the inheritance order to make sure we were dealing with the user-facing executor and not the task scheduler, but this doesn't impact the submit loop.

So we pass through the Runnable._shutdown_executor_callback process

@liamhuber
Member Author

Codecov complains that Node._clean_wrapped_executorlib_executor_cache is not being tested, but it's just flat out wrong. It is absolutely being invoked in tests.integration.test_wrapped_executorlib.TestWrappedExecutorlib.test_automatic_cleaning.

@liamhuber
Member Author

  • CI tests on Slurm

Actually, I'd be more comfortable with this PR if it included these. I'll still leave exposure in the API for later, but let's take a crack at robust testing right here.

* Test slurm submission
* Don't apply callbacks to cached returns
* Only validate submission-time resources: otherwise we run into trouble where it loads saved executor instructions (that already have what it would use anyhow)
* Mark module
* Test cached result branch
@liamhuber liamhuber marked this pull request as ready for review July 18, 2025 17:54
@liamhuber liamhuber merged commit 8e1acd8 into main Jul 18, 2025
37 of 39 checks passed
@liamhuber liamhuber deleted the executor branch July 18, 2025 18:56