@ritvikrao ritvikrao commented Sep 25, 2025

Incorporates the reconverse communication layer in Charm++, a future replacement for Converse that is more sustainable, more lightweight (fewer lines of code), and incorporates LCI (github.com/uiuc-hpc/lci).

Installing and running on Delta

$ git clone charm && git checkout reconverse-suport
$ module load libfabric

In the charm top-level directory,

./build charm++ multicore-linux-x86_64 --with-production -j8

In your program's Makefile, point CHARMC at this charm build, for example

CHARMC=/path/to/charm/bin/charmc

When submitting jobs (sbatch or salloc), export the following

export LD_LIBRARY_PATH=/path/to/charm/lib:$LD_LIBRARY_PATH
export LCI_ATTR_BACKEND=ofi
export FI_CXI_RX_MATCH_MODE=hybrid   # or "software"
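
As a rough sketch, the exports above could be wrapped in a small shell function and sourced from a job script. The function name `reconverse_env` and its arguments are hypothetical, not part of charm or reconverse. The sketch uses the spelling `FI_CXI_RX_MATCH_MODE`, which is how the libfabric CXI provider names this variable, and it rejects any value other than the two suggested here.

```shell
# Hypothetical helper (not part of charm): set the runtime environment
# described above. $1 is the charm install path, $2 the match mode.
reconverse_env() {
  local charm_home=$1 rx_mode=${2:-hybrid}
  case "$rx_mode" in
    hybrid|software) ;;
    *) echo "unsupported match mode: $rx_mode" >&2; return 1 ;;
  esac
  # Avoid a trailing ':' when LD_LIBRARY_PATH was previously empty or unset.
  export LD_LIBRARY_PATH="$charm_home/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  export LCI_ATTR_BACKEND=ofi
  export FI_CXI_RX_MATCH_MODE="$rx_mode"
}
```

Inside an sbatch script or salloc session this would be called as `reconverse_env /path/to/charm hybrid` before the srun line.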

Use a larger process width than you would with old Converse. I find that 2 or 4 processes per socket gives the best times on reconverse, versus 8 with old Converse.

Make sure to pass +pemap if you are using all or almost all cores. Otherwise, if two PEs are mapped to the same PU, your job will abort without explanation (unless you run srun with --unbuffered, as noted below).

+setcpuaffinity isn't implemented yet in reconverse, so you need to provide a manual +pemap.
Run srun jobs with --unbuffered to force prints to appear before the abort (this helps with debugging).
The Delta documentation shows the layout of cores with respect to NUMA domains: https://docs.ncsa.illinois.edu/systems/delta/en/latest/user_guide/architecture.html
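
Since two PEs on the same PU abort the job, it may help to sanity-check a manual pemap before submitting. The following is a minimal sketch with hypothetical helper names (`expand_pemap`, `check_pemap`). It assumes the map is a comma-separated list of plain core numbers and `a-b` ranges, and does not handle any other +pemap syntax.

```shell
# Hypothetical helper: expand a simple pemap string such as "0-3,8,10-11"
# into one PU number per line. Only integers and a-b ranges are handled.
expand_pemap() {
  local entry lo hi
  local IFS=','
  for entry in $1; do
    case "$entry" in
      *-*) lo=${entry%-*}; hi=${entry#*-}; seq "$lo" "$hi" ;;
      *)   echo "$entry" ;;
    esac
  done
}

# Succeeds only if the map names exactly $2 PUs and none is repeated,
# i.e. every PE gets its own PU.
check_pemap() {
  local total unique
  total=$(( $(expand_pemap "$1" | wc -l) ))
  unique=$(( $(expand_pemap "$1" | sort -n -u | wc -l) ))
  [ "$total" -eq "$2" ] && [ "$unique" -eq "$total" ]
}

# Example use (my_app is a placeholder program name):
#   check_pemap "0-15" 16 && srun --unbuffered ./my_app +pemap 0-15
```

The check only catches duplicates and miscounts within one map. Whether the chosen cores match the NUMA layout still has to be verified against the Delta architecture page.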

Note 0:
You can let charm use your local copy of reconverse by

./build --with-fetch-reconverse-dir=/path/to/reconverse <other args>

Note 1:
Instead of specifying export LCI_ATTR_BACKEND=ofi every time you run the program, you could also do

./build --with-cmake-args="-DLCI_NETWORK_BACKENDS=ofi" <other args>

ritvikrao and others added 30 commits July 23, 2025 09:18
Clean up inconsistent whitespace in ckcheckpoint.C, cklocation.C, ckrdma.h,
init.C, partitioning_strategies.C, spanningTree.C, and TopoManager.C for
better code consistency.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>