Skip to content

Conversation

spirali
Copy link
Collaborator

@spirali spirali commented Aug 29, 2025

This PR introduces advance coupling that allows to set individual weights between groups.

Important notes:

  • New dependency on HiGHS (MLP solver), that is written in C++. It needs cmake & gcc to be installed for build (however everything is still buildable via cargo, no manual installation of HiGHS is needed). The final library is then statically linked to HQ.
    • In case of build problems, it it quite easy to switch to minilp (pure Rust solver) but the performance differences for more complex scenarios is much worse, so for now, the implementation has only HiGHS support.
  • All computations related to resource groups are now solved by the MLP solver so even if the user is not using coupling, this PR may have different performance then the previous code. As we automatically detect CPU groups, probably every run on a bigger cluster (with NUMA compute nodes) will be hit by this change. So it is a good idea to benchmark this change on real machines.
  • In case of build problems on small machines like Raspberry PI, we migrate all "group" support behind a feature flag. It should be ok as all Raspberry PIs is probably without NUMA.
  • This is PR not planned for v0.24.0.
  • Cargo fix moves some codes to the newest Rust ...

@spirali spirali force-pushed the advanced-coupling branch 6 times, most recently from c2d3e34 to d556f1f Compare September 18, 2025 14:46
@spirali spirali force-pushed the advanced-coupling branch from d556f1f to c2348c2 Compare October 2, 2025 13:17
@spirali spirali force-pushed the advanced-coupling branch from ddd0db4 to a059aa0 Compare October 3, 2025 08:49
@spirali
Copy link
Collaborator Author

spirali commented Oct 3, 2025

I am still not sure if the price for HiGHS is worth, but if it compiles it seems to me more safe default for performance reasons. On the other hand, it is not hard to remove HiGHS in case of problems.

This PR is blocking some other changes in the scheduler. So I am now asking for review request. But there is no need to hurry because of my vacation. (However it is also blocking #959.)

@spirali spirali requested a review from Kobzol October 3, 2025 10:33
Copy link
Member

@Kobzol Kobzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a change to documentation to talk about the new build dependencies.

/// - `--coupling cpus,gpus"
#[arg(long, value_parser = passthrough_parser(parse_resource_coupling))]
pub coupling: Option<PassThroughArgument<ResourceDescriptorCoupling>>,
/// - `--coupling TODO"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO is intended to be here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks

input: &str,
resources: &[ResourceDescriptorItem],
) -> anyhow::Result<(u8, ResourceGroupIdx)> {
let input = input.trim();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You gave up on the parser combinators? 😆 It's easier to have a high quality error message when parsing "manually", that's for sure...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not giving up completely, but if I remember correctly, in this case, it was quite ugly code and manual parser was easy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants