Skip to content

Conversation

sam-maloney
Copy link
Contributor

Problem: changes to RFC14 now allow a resource count to be a string containing either an RFC22 idset or RFC45 range

Update the Python Jobspec class to properly validate string counts

@sam-maloney sam-maloney force-pushed the validate-resource-count-strings branch 5 times, most recently from 5a17689 to 85af1d2 Compare September 15, 2025 14:57
@sam-maloney
Copy link
Contributor Author

I think this is ready for review (see comment above about failing CodeQL check) @grondo or @garlick

@grondo
Copy link
Contributor

grondo commented Sep 15, 2025

My only suggestion/question was if we could use try/except blocks to remove the requirement for regex. E.g. (psuedocode)

 try:
    idset.decode()
 except:
    try:
        range_decode()
     except:
         raise ValueError("count must be a valid number, idset, or range")

I do realize after writing the above that this would suppress the specific error raised by idset.decode() so perhaps that was why the match is attempted first?

Not a big deal, but thought I'd bring up the question first.

@sam-maloney sam-maloney force-pushed the validate-resource-count-strings branch from 85af1d2 to ba62438 Compare September 16, 2025 13:15
@sam-maloney
Copy link
Contributor Author

@grondo I must admit my main reason for using regex was because count_decode was not exposed in the Python bindings (or at least I didn't know how to access it directly) 😅

So here's an alternative solution which directly models the _idset or _rlist submodules of the _flux bindings to expose a new _count submodule, and which then makes the validation code very simple...

(old solution backed up here: https://github.com/sam-maloney/flux-core/tree/validate-resource-count-strings-backup)

@grondo
Copy link
Contributor

grondo commented Sep 16, 2025

I must admit my main reason for using regex was because count_decode was not exposed in the Python bindings (or at least I didn't know how to access it directly) 😅

Oh yeah, sorry. I meant to either do something exactly like what you did (or write a decode method in python that raises an error on failure). Sorry I should have been more clear!

@grondo
Copy link
Contributor

grondo commented Sep 16, 2025

This is great! The only issue is that libjob/count.h is not public in libjob, and therefore won't be able to be built in the external bindings generator: flux-python.

Let me see how easy it would be to export the count API from libjob so we can keep this simpler approach. If not, we could fall back to your first proposal for now.

@grondo
Copy link
Contributor

grondo commented Sep 16, 2025

Ok, doing this properly would probably require moving libjob/count.[ch] to a new libflux-count.so to match what is done with libflux-idset.so, otherwise the interface could be renamed with a flux_* prefix so they will be exported from libflux-core.so. Exporting the simpler count_* interface names does make me worry a bit about name collisions in other projects linking with libflux-core.so (since I assume it would need to link with libflux-count.so)

So there are a couple ways forward:

  • For now, just use your previous approach. Though I wonder if we could drop the idset regex and just use idset decode from Python since we do have that functionality in Python
  • One of us could make a PR that does the shuffle of the count interface to make it part of the public API, then your current approach would be workable

Not sure if @garlick has an opinion about the export of the count API.

@sam-maloney sam-maloney force-pushed the validate-resource-count-strings branch from ba62438 to afd592d Compare September 22, 2025 11:39
@sam-maloney sam-maloney force-pushed the validate-resource-count-strings branch 3 times, most recently from 40e05dd to a228b0d Compare September 22, 2025 13:00
Problem: changes to RFC14 now allow a resource count to be a string
containing either an RFC22 idset or RFC45 range

Update the Python Jobspec class to properly validate string counts
Problem: none of the current valid jobspecs use a resource count string

Add use_case_1.8.yaml from RFC14, resource_count_string_min_only.yaml,
and valid/resource_count_string_range.yaml to the testsuite valid jobspecs
@sam-maloney sam-maloney force-pushed the validate-resource-count-strings branch from a228b0d to cab62f4 Compare September 22, 2025 13:28
@sam-maloney
Copy link
Contributor Author

For now, just use your previous approach. Though I wonder if we could drop the idset regex and just use idset decode from Python since we do have that functionality in Python

Alright, I'm a lot less comfortable with how exactly the python bindings should be "properly" exposed, so for now I'll stick to modifying the "simpler" previous approach. (2nd attempt in branch at https://github.com/sam-maloney/flux-core/tree/validate-resource-count-strings-bindings)

This is regex free, uses the idset.decode function to check if the string decodes successfully as an idset and otherwise attempts to parse it into a range_dict which is then passed to the standard _validate_complex_range() function. (I would also like to improve the validation in that function, but I'll do that separately to keep this PR focused on just the handling of the strings themselves.)

As an aside, is the main reason for avoiding regex because of performance or because they are impenetrable to read later?

Copy link

codecov bot commented Sep 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.08%. Comparing base (4fa08fb) to head (cab62f4).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7052      +/-   ##
==========================================
- Coverage   84.10%   84.08%   -0.02%     
==========================================
  Files         549      549              
  Lines       92490    92513      +23     
==========================================
+ Hits        77785    77787       +2     
- Misses      14705    14726      +21     
Files with missing lines Coverage Δ
src/bindings/python/flux/job/Jobspec.py 91.76% <100.00%> (+0.33%) ⬆️

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@grondo
Copy link
Contributor

grondo commented Sep 22, 2025

As an aside, is the main reason for avoiding regex because of performance or because they are impenetrable to read later?

I was thinking the latter :-). Also, the approach we've mostly taken elsewhere in the Python code is use try except blocks to attempt to decode something that could one of multiple things. However, at the time I made that comment I was thinking it would be trivial to expose the count_* interface, so it no longer holds as much weight.

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sam-maloney: this version looks good to me if you're happy with it. Thanks!

@sam-maloney
Copy link
Contributor Author

@sam-maloney: this version looks good to me if you're happy with it. Thanks!

@grondo I'm happy if you're happy!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants