
Conversation

coreyjadams
Collaborator

** NOT FOR RELEASE **

This is an overhaul of the FSDP tutorial. Let's bring it in after the release goes out.

PhysicsNeMo Pull Request

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.

Dependencies

ktangsali and others added 28 commits May 21, 2025 18:58
…buted applications (NVIDIA#906)

* Wrap DeviceMesh in quotes for typing hint, to protect older torch versions from compatibility issues (NVIDIA#905)
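
A minimal sketch of the quoted ("forward reference") annotation pattern this refers to; the function name and import guard are illustrative, not the actual physicsnemo code:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers, so older torch builds that lack
    # torch.distributed.device_mesh can still import this module.
    from torch.distributed.device_mesh import DeviceMesh


def build_sharded_model(mesh: "DeviceMesh") -> None:
    # The quoted hint is never resolved at runtime.
    ...
```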

* Bumps torch version to >=2.4.0 to minimize support surface for distributed applications.

* Adds changelog note

* Merge SongUNetPosLtEmb with SongUNetPosEmb and add support for batch>1 (NVIDIA#901)

* multi-GPU training support for CorrDiff optimization

* enable mixed precision for val
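
For context, a generic mixed-precision validation loop of the kind this enables; `model`, `val_loader`, and the dtype/device choices are placeholders, not the example's actual code:

```python
import torch


def validate(model: torch.nn.Module, val_loader, device: str = "cuda") -> None:
    # Autocast runs matmuls/convolutions in reduced precision during evaluation;
    # no GradScaler is needed because there is no backward pass.
    model.eval()
    with torch.no_grad(), torch.autocast(device_type=device, dtype=torch.float16):
        for batch in val_loader:
            _ = model(batch.to(device))
```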

* clean codebase for opt

* add amp_mode aware model architecture

* add None checking for params

* revise datatype casting schema

* Add test cases for corrdiff optimizations

Signed-off-by: Neal Pan <nuochengp@nvidia.com>

* revised from_checkpoint, update tests and CHANGELOG

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Lint and format code properly

Signed-off-by: Neal Pan <nuochengp@nvidia.com>

* add multi-gpu optimization

* rebase changes and update tests and configs

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* merge ResidualLoss and refactored layer and Unet init based on PR review

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Update layers.py with robust apex import
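
The "robust import" refers to tolerating a missing Apex install. A generic sketch of that pattern; the exact Apex module path used in layers.py is an assumption here:

```python
import torch.nn as nn

try:
    from apex.contrib.group_norm import GroupNorm as ApexGroupNorm
    APEX_AVAILABLE = True
except ImportError:
    ApexGroupNorm = None
    APEX_AVAILABLE = False


def make_group_norm(num_groups: int, num_channels: int) -> nn.Module:
    # Use the fused Apex kernel when available, otherwise fall back to PyTorch.
    if APEX_AVAILABLE:
        return ApexGroupNorm(num_groups, num_channels)
    return nn.GroupNorm(num_groups, num_channels)
```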

* address incompatibility between dynamo and patching, retain same optimization perf with torch.compile

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update tests

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update changelog

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* initialize global_index directly on device

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>
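
In other words, allocate the index tensor on the GPU directly instead of building it on the host and copying it over; a small illustrative snippet (tensor name and size are placeholders):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
n = 1024

# Before: allocated on CPU, then copied to the device.
# global_index = torch.arange(n).to(device)

# After: created on the device in one step, avoiding the host round-trip.
global_index = torch.arange(n, device=device)
```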

* formatting

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* fix loss arguments in train.py

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* merge SongUNetPosEmbd with SongUNetPosLtEmbd using index slicing (recompile issue persists)

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* fix small errors in songunet

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* revise positional_embedding_indexing to avoid recompile/graph break, with faster bw compared to the old version

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update changelog

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* add back SongUNetPosLtEmbd class for better ckp loading

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* add forward in SongUNetPosLtEmbd and update train.py

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update test for lt model

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update comments for embedding_selector test for lt model

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update doctest

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Added tiny detail in corrdiff readme

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* minor update to arguments and docstring

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

---------

Signed-off-by: Neal Pan <nuochengp@nvidia.com>
Signed-off-by: jialusui1102 <jialusui1102@gmail.com>
Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
Co-authored-by: Alicia Sui <asui@cw-pdx-cs-001-vscode-01.cm.cluster>
Co-authored-by: Neal Pan <nuochengp@nvidia.com>
Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
Co-authored-by: Charlelie Laurent <claurent@nvidia.com>

* Update CHANGELOG.md

Fix lint error

---------

Signed-off-by: Neal Pan <nuochengp@nvidia.com>
Signed-off-by: jialusui1102 <jialusui1102@gmail.com>
Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
Co-authored-by: Corey adams <coreyjadams@gmail.com>
Co-authored-by: Jialu (Alicia) Sui <125910753+jialusui1102@users.noreply.github.com>
Co-authored-by: Alicia Sui <asui@cw-pdx-cs-001-vscode-01.cm.cluster>
Co-authored-by: Neal Pan <nuochengp@nvidia.com>
Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
Co-authored-by: Charlelie Laurent <claurent@nvidia.com>
* fixing model.py to make compatible with NIM

* adding freq buffer to ParameterModel

* formatting

---------

Co-authored-by: Rishi Ranade <rranade@login-eos01.eos.clusters.nvidia.com>
Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
* Make sure that gpu processing and output settings are configurable. Set sensible defaults in the example config
* make dali optional

* update Changelog
* update to make it compatible with Windows

* update darcy fno to minimize the dependencies to make it very light-weight and hello-worldy

* use pathlib

* lint

* updates to checkpoint loading
* updating readme

* Adding prerequisites section

* fixing ci issues

* linting

---------

Co-authored-by: Kaustubh Tangsali <71059996+ktangsali@users.noreply.github.com>
Co-authored-by: Kaustubh Tangsali <ktangsali@nvidia.com>
Fix broken ShardTensor link.
… samples (NVIDIA#949)

* add requirements.txt for bloodflow and deforming plate

* move diffusion example (NVIDIA#930)

* move diffusion example

* update broken links

* add requirements for flow reconstruction
* Add datapipes docs.

* Fix class names.
… curation steps (NVIDIA#953)

Co-authored-by: Kaustubh Tangsali <71059996+ktangsali@users.noreply.github.com>
* update logging, launch, utils api docs with added descriptions and examples

* update introductory tutorial for typos and added clarity
* Adding first half of torch compile tutorial.

* fixes to formatting and syntax

* Add second half of torch.compile tutorial.

* Clean up organization of performance docs.

* Minor clean up on perf table teasers

* remove all but IO section

* Fix typos in torch compile tutorial
* add tutorial on physics informing

* add geometry stuff

* fix typos

* add some opening text to index.rst

* add summary

* typos

* address feedback

* address feedback

* add Ram's changes

---------

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>
@coreyjadams coreyjadams changed the base branch from 1.1.0-rc to main August 1, 2025 12:27
* update lr_decay_rate to be configurable

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update lr_decay_rate comment

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

---------

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>
CharlelieLrt and others added 27 commits August 1, 2025 08:19
* Massive refactor on domino utils.py to improve code quality

* Adds missing tensorboard requirement

* Fixes missing cuml requirement

* Begins process of fixing inference_on_stl.py

* Fixes outdated type definition

* black formatting pass

* Fixes import order

* black formatting

* Reshape accepts a shape, not a splatted iterable

* Fixes lost array axis

* Enhances docstrings in utils.py with examples and improved clarity; removes outdated examples.

* Enhances area_weighted_shuffle_array function by adding area_factor parameter for adjustable sampling bias; updates docstring with detailed explanation and examples.

* Updates docstrings in utils.py for accuracy and clarity; modifies examples in calculate_center_of_mass, standardize, nd_interpolator, pad, and pad_inp functions; adjusts k-nearest neighbors parameter in nd_interpolator for flexibility; corrects boolean checks in pad and pad_inp examples.

* black format

* Add test suite for domino utils module

This commit introduces a new test file `test_domino_utils.py` that includes comprehensive unit tests for various functions in the domino utils module. Each test verifies the functionality of the corresponding utility function using examples from the documentation, ensuring correctness and reliability.

* Refactor array_type function to handle CuPy import gracefully and optimize area_weighted_shuffle_array for consistent array handling. Remove redundant test for array_type.
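
A sketch of the "graceful CuPy import" pattern described here, not the exact physicsnemo code:

```python
import numpy as np

try:
    import cupy as cp
except ImportError:
    cp = None  # CuPy is optional; fall back to NumPy-only behaviour


def array_type(arr):
    """Return the array module (cupy or numpy) matching the input array."""
    if cp is not None and isinstance(arr, cp.ndarray):
        return cp
    return np
```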

* Import PyVista conditionally in extract_surface_triangles function to avoid unnecessary dependency loading.

* black formatting

* Remove unused import
…de fixes (NVIDIA#973)

* clarifies I/O in domino train.py

* Gives paths in config.yaml user-agnostic pathnames

* Switches from relu -> gelu to allow smooth gradients

* Adds initial commit for design sensitivities study

* Corrects outdated type hint

* Refactors parameters in signed_distance_field calls for clarity

* Refactors directory handling in create_directory and get_filenames functions to use pathlib for improved readability and functionality. Updates type hints to support both str and Path types.
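
A hedged sketch of what pathlib-based versions of these helpers might look like; the signatures and behaviour are assumptions based on the commit message:

```python
from pathlib import Path
from typing import List, Union


def create_directory(path: Union[str, Path]) -> Path:
    # Accept either str or Path, create parent directories, ignore existing ones.
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def get_filenames(directory: Union[str, Path], pattern: str = "*") -> List[str]:
    # Sorted names of files in the directory matching the glob pattern.
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())
```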

* Deletes merge(); this function is (a) not used anywhere, (b) can be replaced simply by the built-in sum(lists), and (c) as-written will always raise an error, since `newlist` is a tuple and hence does not have a .extend() method.
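
Points (b) and (c) in a nutshell, as a small illustration:

```python
# (b) the built-in sum() already concatenates lists when given a list start value
lists = [[1, 2], [3], [4, 5]]
flat = sum(lists, [])
assert flat == [1, 2, 3, 4, 5]

# (c) tuples have no .extend(), so the removed implementation could never succeed
newlist = ()
try:
    newlist.extend([1, 2])
except AttributeError as err:
    print(err)  # 'tuple' object has no attribute 'extend'
```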

* black formatting

* Code quality improvements

* Replaces 'axis' with 'dim' in torch.cat calls for consistency with the PyTorch documentation in GeoProcessor, GeometryRep, NNBasisFunctions, ParameterModel, and DoMINO classes.
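
A quick illustration of the keyword change (`axis` is accepted as a NumPy-compatibility alias, but `dim` is the documented keyword):

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

x = torch.cat([a, b], dim=0)  # shape (4, 3)
y = torch.cat([a, b], dim=1)  # shape (2, 6)
```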

* Adds initial changes for DoMINO sensitivity

* Refactors DesignDatapipe and DoMINOInference for improved readability and performance; updates type hints and formatting, and modifies input handling for mesh data.

* Refactors DesignDatapipe to directly use STL centers for geometry coordinates; updates DoMINOInference to improve memory management and adds detailed docstrings for clarity.

* Enhances DesignDatapipe by updating bounding box type hints, improving random sampling, and adding detailed docstrings for initialization and item retrieval methods.

* Implements Laplacian smoothing for mesh data in a new utility function; updates DoMINOInference to utilize the new smoothing function and modifies sensitivity calculations accordingly. Enhances type hints and formatting for clarity.
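
A reference-style sketch of vertex Laplacian smoothing; the function name, arguments, and neighbor representation here are illustrative (the actual utility is presumably the accelerated version that motivates the numba requirement below):

```python
import numpy as np


def laplacian_smooth(vertices: np.ndarray, neighbors: list, iterations: int = 10,
                     lam: float = 0.5) -> np.ndarray:
    # Each vertex is pulled toward the mean of its neighbors by a factor `lam`
    # per iteration, which damps high-frequency noise in the surface mesh.
    smoothed = vertices.astype(float).copy()
    for _ in range(iterations):
        means = np.stack([smoothed[idx].mean(axis=0) for idx in neighbors])
        smoothed = smoothed + lam * (means - smoothed)
    return smoothed
```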

* Adds numba to requirements for improved performance in sensitivity analysis

* Adds sbatch_logs/ to .gitignore to exclude SLURM batch log files from version control.

* Adds compute-optimized mesh_postprocessing utilities

* Working `main.py` with abstracted postprocessing step

* formatting

* Refactors main.py to remove duplicate STL combining function and streamline input handling. Updates input file processing and enhances results storage for mesh data.

* Commits configuration files for sensitivity studies

* Adds requirements.txt

* Adds raw and smooth drag gradient data files, and implements a plotting script for gradient checking.

* Refactors import statements in main.py for consistency and clarity. Streamlines input file path construction.

* Creates main_gradient_checking.py for drag gradient checking using DoMINOInference, including sensitivity analysis and output to text files.

* Updates file paths in main_gradient_checking.py and plot_gradient_checking.py to save output data in a dedicated gradient_checking_results directory. Adds new raw and smooth drag gradient data files.

* Adds a new aerodynamics example using DoMINO to compute design sensitivities (e.g., drag adjoint) with respect to underlying input geometry in CHANGELOG.md.

* Add README.md for DoMINO sensitivity analysis pipeline, detailing usage, features, and configuration for aerodynamic design optimization.

* black formatting fixes

* Add SPDX license headers to plot_gradient_checking.py

* Fixes markdownlint

* Removes unused import

* Updates license year

* Fixes license year

* Removes unused main block sections

* Removes erroneous uv.lock commit

* Removes some optimization language

* Remove unnecessary cached yaml

* Refactors to not require separate config (instead pulling it from DoMINO), as well as eliminating relative paths

* Add warning for loading model without checkpoint in DoMINOInference

* Add verbose option to DoMINOInference for memory usage logging

* Refactor imports in design_datapipe.py for clarity and efficiency; remove unused imports and reorganize necessary ones.

* Refactor DesignDatapipe to use NearestNeighbors from cuML for neighbor finding; update input handling in DoMINOInference for improved tensor management and type consistency.

* Enhance DesignDatapipe to accept a device parameter for tensor management; update tensor creation in DoMINOInference for improved efficiency and consistency.

* Readme cleanup

* Replace GELU activation with a configurable activation function in GeoProcessor.

* formatting

* remove duplicate section

* Makes activations configurable

* formatting

* add license
* Add PyG version of VortexShedding example and VortexSheddingDataset

* Replace Union type hints with an alias. Add MeshNodeBlock tests.
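
For illustration, the kind of alias this refers to; the concrete member types below are assumptions:

```python
from typing import Union

import numpy as np
import torch

# One alias instead of repeating the same Union hint throughout the module.
TensorOrArray = Union[torch.Tensor, np.ndarray]


def to_numpy(x: TensorOrArray) -> np.ndarray:
    return x.detach().cpu().numpy() if isinstance(x, torch.Tensor) else np.asarray(x)
```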

* Add distributed sampler to the example. Add MeshEdgeBlock test. Fix DGL inference script.

* Fix VortexShedding PyG inference script

* Add MGN DGL2PYG tests.

* Update inference notebooks

* Make linter happy.

* Fix test.

* Update req.txt. Clean up TODO

* Address review feedback.

* Update README

* Add proper epoch loss reporting

* Address review feedback.

* Require DGL or PyG only when necessary
* Add correctness test for deterministic sampler

* lint

* drop np dep
…#1012)

* Removed unnecessary check in args overriding

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Replaced exception with warning in argument overriding

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

---------

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
* Use e2grid healpixpad when possible

* Drop unused imports

* changelog

* formatting
* address vdr comments

* fix lint

* fix lint

---------

Co-authored-by: root <root@eos0014.eos.clusters.nvidia.com>
* Migrate Vortex Shedding Reduced Mesh example to PyG

* Update CHANGELOG
…lobal parameters input (NVIDIA#903)

* changes based on updated main branch

* update to model.py and end to end testing

* changes to sharded parts of the code

* Update README

* Update inference_on_stl.py to comply with new method

* minor refactor

* update

* Tested training

* remove hardcoded stuff from inference_on_stl.py

* Removed comments from model.py

* Remove air_density and stream_velocity from domino_sharded

* Remove comments from domino_datapipe

* Removed names and make paths generic

* make encode_parameters false

* Update and remove comments

* Update README

* Update README, remove redundant text

* Update model.py to remove air_density and stream_velocity

* Update inference_on_stl.py to be consistent with main

* Update README.md to be compliant with main

* Update tests

* changes based on CI

* small cleaning config.yaml

* Update changelog

* fixing doctest issue

---------

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>
Co-authored-by: Rishikesh Ranade <dr.rranade@gmail.com>
Co-authored-by: RishikeshRanade <65631160+RishikeshRanade@users.noreply.github.com>
…A#1000)

* make dimensions consistent for checkpointing

* add use_reentrant=False to checkpoint in songunet for torch.compile support
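
A minimal example of the non-reentrant activation checkpointing this enables; the checkpointed function here is a stand-in:

```python
import torch
from torch.utils.checkpoint import checkpoint


def block(x: torch.Tensor) -> torch.Tensor:
    return torch.nn.functional.silu(x) * 2.0


x = torch.randn(4, 8, requires_grad=True)
# use_reentrant=False selects the non-reentrant path, which is the one compatible
# with torch.compile and the recommended setting in recent PyTorch releases.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```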

* removed use_patch_grad_acc from loss_valid_kwargs in corrdiff train.py script as the regression loss does not support it

* set graph static for corrdiff training to enable checkpoint

* change the checkpoint reference dimension from x to y as it is the same dimension used to name the layers

* correct positional embedding in song unet

* correct embedding for gridtype==test and N_grid_channels==2

* Change single dimension shape with geometric mean to use checkpointing

* reformatted

---------

Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
…g qkv and added inference optimization and fixes (NVIDIA#954)

* restructured attention into separate class and fixed errors in reshaping qkv

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update CHANGELOG

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* revert earlier changes in train.py

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* add multiple inference optimization for CorrDiff

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* minor update

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* add attention ckp conversion and restructure use_fp16 logic

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* update unit tests for fp16

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Minor formatting to the Attention docstring

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Removed private attribute _use_fp16 initialization in UNet and EDMPrecondSuperResolution

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Made overlap_count a private argument in patching and the method _get_overlap_count a private method

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added non-regression test for GridPatching2D and get_overlap_count method

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added API doc for use_fp16 method in UNet wrapper

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added docs for overlap_count argument in image_fuse

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Removed utils subdirectory in tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Fixed some pytest package confusion in utils testing

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* restructure get_overlap_count() as a static method and update related unit tests

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Minor formatting in docstring for get_overlap_count

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Minor detail in docstring for image_fuse

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Changed expected path for non-regression reference data used in test_patching

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* only do attn ckp conversion for UNet based models

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* add comment to move attn ckp conversion to classes later

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Consistently set stochastic sampler precision to float32

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Moved attention module conversion to UNetBlock load_state_dict method

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Minor renaming in UNetBlock

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Simplified warning logic for attention module's keys mapping

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated corrdiff train and generate recipes with overridable args

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added validation to make sure amp_mode is disabled when torch.autocast is disabled

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Implemented automated channels_last layout in SongUNet when using use_apex_gn

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
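
Illustrative pattern for the automated layout switch: both parameters and activations are converted to channels-last memory format, which the fused Apex GroupNorm kernels favor. The model, shapes, and CUDA device below are placeholders:

```python
import torch

model = torch.nn.Conv2d(16, 16, kernel_size=3, padding=1).to("cuda")
model = model.to(memory_format=torch.channels_last)

x = torch.randn(8, 16, 64, 64, device="cuda").to(memory_format=torch.channels_last)
out = model(x)  # output stays in channels_last layout
```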

* Fix CI: added attribute use_apex_gn to SongUNet

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Refactored amp_mode and profile_mode properties for SongUNets and their wrappers

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added two distinct shape-specific apply_wrapper in stochastic sampler

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated tests to be compatible with the modified amp_mode API

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Fix pytorch deprecation warning for is_autocast_enabled

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Implemented property factory for amp_mode and profile_mode in model wrappers + added them to StormCastUNet to pass CI tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated CI tests for diffusion models

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* resolve conflicts between cpu and apex and update related CI

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* resolve recompile errors for stochastic sampler in CICD

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>

* Updated CHANGELOG.md

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated CHANGELOG.md

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated CHANGELOG.md

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Some comments in SongUNets

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated docs with amp_mode and profile_mode APIs

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

---------

Signed-off-by: jialusui1102 <jialusui1102@gmail.com>
Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
Co-authored-by: Charlelie Laurent <claurent@nvidia.com>
Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
* fixed grid effect

* added data filter

* added data filter

* updated comment

---------

Co-authored-by: Oliver Hennigh <ohennigh@login-eos01.eos.clusters.nvidia.com>
…IA#982)

* Fix regression output shape

* Only use act if fused_act is True

* Avoid dtype change of buffer/param and fix softmax dtype

* Added unit tests for song unet models with learnable positional embedding, lead time aware, with compile, apex_gn, etc...

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated tests for SongUNetPosLtEmbd with AMP, Apex GN and compile

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Renamed variable in SongUNetPosEmbd

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Revert bug introduced in SongUNetPosEmbd positional_embedding_selector

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Reverted test script to its original state

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Fixed some new CI tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added missing parameter in new tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added dtype casting in SongUNetPosEmbd forward

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Fixed number of channels in new tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added random seed in new tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added more missing random seeds to new tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Removed some random seeds added by mistake in new tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

---------

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
Co-authored-by: Julius Berner <jberner@nvidia.com>
Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
Co-authored-by: Charlelie Laurent <claurent@nvidia.com>
* first commit

* add README.md

* add README.md

* add README.md

* revise for 2nd round code review

* revise for 2nd round code review

* CHANGELOG update for TopoDiff

* code review for merge

* code review

* add command to run the model

* add command to run the model

* add command to run the model

* add command to run the model

* avoid floating material in generation

* avoid floating material in generation

* topodiff merge

* topodiff merge

* topodiff merge

* topodiff merge

* Topodiff merge

* Topodiff merge

* Topodiff merge

* Topodiff merge

* formatting

* formatting, name change

* fix bugs, cleanup

* fix pydantic

---------

Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
Co-authored-by: root <root@eos0543.eos.clusters.nvidia.com>
Co-authored-by: root <root@eos0307.eos.clusters.nvidia.com>
Co-authored-by: root <root@eos0175.eos.clusters.nvidia.com>
* adding moe

* address review comments, update readme

* Small bug fix for preprocessor

* address review comments

---------

Co-authored-by: root <root@eos0287.eos.clusters.nvidia.com>
Co-authored-by: root <root@eos0247.eos.clusters.nvidia.com>
* fixed grid effect

* uv fix

* blaa

* removed nemo build

* added unmanaged

---------

Co-authored-by: Oliver Hennigh <ohennigh@login-eos01.eos.clusters.nvidia.com>
* Refactor signed_distance_field function in sdf.py for improved clarity and performance. Update parameter types to use np.ndarray and cp.ndarray, enhance docstring with detailed descriptions and examples, and streamline array conversion logic.

* Optimize memory allocation in signed_distance_field function by using wp.empty instead of wp.zeros. Update array dimensions for kernel launch and streamline return logic.

* Enhance docstring in signed_distance_field function to clarify parameters and return types, including GPU acceleration details and usage of sign winding number method. Remove unnecessary blank line.

* Enhance docstring in signed_distance_field function to provide clearer explanation of the 'include_hit_points' parameter, specifying its role in defining the SDF.

* formatting

* Fix formatting inconsistencies in docstring of signed_distance_field function in sdf.py.

* Adds fix for back-compatibility with input_points arrays with incorrect shape
* Added experimental tEDMPrecondSuperRes

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Some refactors in diffusion ResidualLoss to accommodate t-EDM subclass

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added experimental t-EDM loss

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added warning message when importing from physicsnemo.experimental

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Some fixes in docstrings

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added student-t distribution in StackedRandomGenerator

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
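
For context, sampling heavy-tailed latents from a Student-t distribution looks roughly like this; the degrees-of-freedom value is a placeholder, and large `df` recovers the Gaussian case:

```python
import torch

df = 10.0  # degrees of freedom; smaller values give heavier tails
dist = torch.distributions.StudentT(df=df)
latents = dist.sample((4, 3, 64, 64))  # same shape a torch.randn call would give
```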

* Added t-student option in corrdiff diffusion_step

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added t-student distribution option in corrdiff generate.py

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated warning message for student-t distribution

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Corrected wrong import in experimental diffusion metrics

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added t-student distribution option in CorrDiff train.py

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Minor string modification

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Some minor renaming and reformatting

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated CHANGELOG.md

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added another safety check to CorrDiff generate.py

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added tests for t-EDM models, metrics and utils

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Moved t-EDM tests to existing directories

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Some fixes in t-edm tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Fixed missing device in diffusion_step

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Added a few missing docstrings for StackedRandomGenerator

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Changed default value of P_mean to 0 in t-EDM loss

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Made P_mean and P_std configurable in CorrDiff train.py and generate.py

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Updated CHANGELOG.md to document configurable P_mean and P_std

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* A few fixes in CorrDiff

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

---------

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
…1035)

* Bumps ruff from 0.0.290 to 0.12.5. Removes black, which is superseded by ruff-format.

* Refactor ruff configuration in pyproject.toml to use non-deprecated settings

* Migrates pre-commit settings to repo-wide settings

* Replaces black with ruff-format in Makefile and updates linting commands to use ruff-check.

* Adds Ruff note to Changelog

* Update CONTRIBUTING.md to reflect changes in CI checks, replacing black with ruff for formatting and linting instructions.

* Avoids acronyms

* Adds docs about Ruff

* Markdownlint fixes

* Implements Ruff safe fixes

* Adds hand-written fixes for lint errors

* Refactors _check_checkpoint to remove duplicate code

* Addresses Ruff lint issues with tarfile.extractall(), with appropriate modifications for back-compatibility with Python < 3.12.
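
A sketch of the compatibility pattern implied here: Python 3.12 documents the `filter` argument to `TarFile.extractall()`, which addresses the lint warning about unsafe extraction, while older interpreters fall back to the plain call. The function name and paths are illustrative:

```python
import sys
import tarfile


def safe_extract(archive_path: str, dest: str) -> None:
    with tarfile.open(archive_path) as tar:
        if sys.version_info >= (3, 12):
            # The "data" filter rejects absolute paths, parent traversal, etc.
            tar.extractall(path=dest, filter="data")
        else:
            tar.extractall(path=dest)
```
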
* add patching support for deterministic sampler

* code cleanup and unit test update

* use patching wrapper and fix pytest functions

* change utils.generative to utils.diffusion

* set default to torch.float64

* do compilation in deterministic sampler

* update

* Identified and fixed critical bug in stochastic_sampler and deterministic_sampler

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Format CHANGELOG.md

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

* Implements wrapper selector to fix compile issues in tests

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>

---------

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
Co-authored-by: root <root@cw-dfw-h100-004-251-012.cm.cluster>
Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
Co-authored-by: root <root@cw-dfw-h100-004-211-033.cm.cluster>
Co-authored-by: root <root@cw-dfw-h100-004-270-026.cm.cluster>
Co-authored-by: Charlelie Laurent <claurent@nvidia.com>
* resolving merge conflicts with main

* fixing bugs

* fixing CI errors

* fixing merge conflicts in config

* modifying Changelog

* Update config.yaml

* cpu processing in area_weighted_sampling

* fixing naming issue in domino_datapipe.py

* Update physicsnemo/models/domino/model.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update physicsnemo/models/domino/model.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update physicsnemo/models/domino/model.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update physicsnemo/models/domino/model.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update physicsnemo/models/domino/model.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update examples/cfd/external_aerodynamics/domino/src/conf/config.yaml

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update physicsnemo/models/domino/model.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Update examples/cfd/external_aerodynamics/domino/src/train.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* fixing PR comments

* addressing PR comments

* fixing CI issues

* fixing pytest issues in utils

---------

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>
* Add generic neighbor finding function that is suitable to use in FigConvNet, DoMINO, and mesh graph data pipes.

* Fix an illegal device access when using multiple GPUs.

* Performance tuning of neighbor query

* Add warp-enabled radius search.

Also add testing.
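
For reference, the semantics of a fixed-radius neighbor search can be written as a brute-force PyTorch sketch; the Warp version is the optimized equivalent, and the name, padding convention, and signature below are assumptions:

```python
import torch


def radius_search_bruteforce(points: torch.Tensor, queries: torch.Tensor,
                             radius: float, max_neighbors: int) -> torch.Tensor:
    # For every query point, return up to `max_neighbors` indices of points
    # within `radius`; unused slots stay at the pad value 0.
    dist = torch.cdist(queries, points)          # (num_queries, num_points)
    hits = dist <= radius
    out = torch.zeros(queries.shape[0], max_neighbors, dtype=torch.long)
    for q in range(queries.shape[0]):
        idx = hits[q].nonzero(as_tuple=False).flatten()[:max_neighbors]
        out[q, : idx.numel()] = idx
    return out
```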

* Update neighbor search tools to ensure we use 0 as the null index instead of -1

* Switch domino to use the new radius search function instead of ball query.

This is functionally the same, though shows a performance enhancement.

* Remove neighborlist function.  Replaced with radius_search.

* Using typing for annotations for CI

* Update examples/minimal/neighbor_list/warp_neighbor_list.py

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>

* Address nits and minor comments from PR review.

* Relocate radius search code.

* Remove old folders; goes with previous commit.

* Update test import.

* The CI container does not accept list[int] as an acceptable type for PyTorch.

* Make sure radius search is exported as a function, not a module.

* Fixing formatting, since the linter appears to have changed ....

* Remove cuda opcheck test temporarily

---------

Co-authored-by: Peter Sharpe <peterdsharpe@gmail.com>
@coreyjadams
Collaborator Author

Moved to unified docs; closing.

@coreyjadams coreyjadams deleted the fsdp-tutorial-update branch August 13, 2025 18:36