From NVIDIA Megatron-LM for visibility #18

RaymondLi0 · 2023-01-24T20:01:13Z

No description provided.

…i-in-multi-out) Co-authored-by: Huy Vu2 <huvu@login-eos01.eos.clusters.nvidia.com> Co-authored-by: ykarnati <ykarnati@nvidia.com> Co-authored-by: Yashaswi Karnati <ykarnati@login-eos01.eos.clusters.nvidia.com> Co-authored-by: Yashaswi Karnati <ykarnati@cw-dfw-cs-001-vscode-01.cm.cluster>

RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12

RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12

ko3n1g and others added 28 commits July 17, 2025 21:26

Merge branch 'pmannan/model_interface_pg' into 'main'

26adc2d

ModelCommProcessGroup integration into model interface See merge request ADLR/megatron-lm!3391

ADLR/megatron-lm!2800 - fix(mla): use mscale_all_dim for softmax_factor calculation

62f0a97

Merge branch 'fix_softmax_factor_cal' into 'main'

e96a358

fix(mla): use mscale_all_dim for softmax_factor calculation See merge request ADLR/megatron-lm!2800

ADLR/megatron-lm!3627 - Use ruff linter

565d9ad

Merge branch 'maanug/use-ruff-lint' into 'main'

f36e170

Use ruff linter See merge request ADLR/megatron-lm!3627

Re-apply m4_remove_encoder_pp_fixed !3439

b600e38

Signed-off-by: oliver könig <okoenig@nvidia.com>

ADLR/megatron-lm!3568 - Multi-prompt inference test

2c4382f

Co-authored-by: root <root@batch-block7-00988.cm.cluster>

Merge branch 'jstjohn/inference-license-test' into 'main'

0fae489

Multi-prompt inference test See merge request ADLR/megatron-lm!3568

ADLR/megatron-lm!3608 - build: Add build-backend

b24cdc8

Merge branch 'ko3n1g/build/add-build-backend' into 'main'

f314a9b

build: Add build-backend See merge request ADLR/megatron-lm!3608

ADLR/megatron-lm!3596 - fix: Guard TE import

93b314e

Merge branch 'ko3n1g/fix/te-guard' into 'main'

1bf946b

fix: Guard TE import See merge request ADLR/megatron-lm!3596

ADLR/megatron-lm!3651 - [Fix] Set packed_seq_params default to None for get_rotary_seq_len for Mamba/BERT RoPE

93a8aa6

Merge branch 'kmorabia/mamba-rope-fix' into 'main'

510d58c

[Fix] Set packed_seq_params default to None for get_rotary_seq_len for Mamba/BERT RoPE See merge request ADLR/megatron-lm!3651

ADLR/megatron-lm!3653 - fix dynamic example script

b97d509

Merge branch 'dynamic-script-fix' into 'main'

72e290b

fix dynamic example script See merge request ADLR/megatron-lm!3653

ADLR/megatron-lm!3419 - Update MoE functional tests

6295b45

Merge branch 'denliu/update_moe_functional_tests' into 'main'

10852d7

Update MoE functional tests See merge request ADLR/megatron-lm!3419

Merge branch 'huvu/mimo_avlm' into 'main'

db41707

Example of AVLM (audio-vision) for MiMo (multi-in-multi-out) See merge request ADLR/megatron-lm!3624

chore: Version bump

41cd4e9

ADLR/megatron-lm!3559 - fix: return int instead of tensor from `get_rotary_seq_len`

3a2f235

Co-authored-by: Jason Chiu <jason.chiu@codeium.com>

Merge branch 'xiny/fix_get_rotary_seq_len' into 'main'

1fa6bc8

fix: return int instead of tensor from `get_rotary_seq_len` See merge request ADLR/megatron-lm!3559

ADLR/megatron-lm!3669 - chore: Add local folders to gitignore

5c00418

Merge branch 'ko3n1g/chore/update-gitignore' into 'main'

67bc80b

chore: Add local folders to gitignore See merge request ADLR/megatron-lm!3669

ADLR/megatron-lm!3660 - build: Lift modelopt and upgrade lockfile

73e7614

Merge branch 'ko3n1g/build/lift-modelopt-pin' into 'main'

653b20a

build: Lift modelopt and upgrade lockfile See merge request ADLR/megatron-lm!3660

ADLR/megatron-lm!3402 - Multi batch size cuda graphs.

3ccf7d4

Co-authored-by: Mcore Bot <mcore-bot@nvidia.com> Co-authored-by: Siddharth Singh <sidsingh@nvidia.com>

buptzyb and others added 30 commits August 26, 2025 12:46

ADLR/megatron-lm!3782 - Move cuda graph capture to core

799cee0

Merge branch 'robinz/cudagraph_core' into 'main'

b7a6f90

Move cuda graph capture to core See merge request ADLR/megatron-lm!3782

ADLR/megatron-lm!3764 - tests: Auto-validate weekly tests

37ee3d1

Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>

Merge branch 'ko3n1g/tests/thresholds-weekly' into 'main'

d6301fb

tests: Auto-validate weekly tests See merge request ADLR/megatron-lm!3764

ci: No integration tests on merge-trains

5b2cb28

Signed-off-by: oliver könig <okoenig@nvidia.com>

ci: Allow interrupt on main

7b8bbf2

Signed-off-by: oliver könig <okoenig@nvidia.com>

ci(hotfix): Non-determinism only on EXIT_CODE=0

8efa2a0

Signed-off-by: oliver könig <okoenig@nvidia.com>

ADLR/megatron-lm!3885 - Add proper teardowns for cudagraphs tests to futureproof against exit hangs

6740f5e

Merge branch 'helenn-patch-legacy-cudagraph-tests' into 'main'

028f079

Add proper teardowns for cudagraphs tests to futureproof against exit hangs See merge request ADLR/megatron-lm!3885

ADLR/megatron-lm!3897 - build: Loosen transformers pin

4f6ab63

Merge branch 'ko3n1g/build/transformers-pin' into 'main'

7aad147

build: Loosen transformers pin See merge request ADLR/megatron-lm!3897

ADLR/megatron-lm!3899 - fix: correct way to pass pipeline_rank in test_get_layer_offset_parametrized

7ceafd9

Merge branch 'zhiyul/fix_vpp_ci_test' into 'main'

bdad881

fix: correct way to pass pipeline_rank in test_get_layer_offset_parametrized See merge request ADLR/megatron-lm!3899

ADLR/megatron-lm!3900 - Fix providers refactoring.

fcbde8a

Co-authored-by: Jon Barker <jbarker@nvidia.com>

Merge branch 'vitalyk/fix-textgen' into 'main'

d1e4fc6

Fix providers refactoring. See merge request ADLR/megatron-lm!3900

ADLR/megatron-lm!3877 - Fix FSDP distributed parameter weight shapes for parameters > 2-D.

b74396f

Merge branch 'cye/fix-fsdp-dist-shape' into 'main'

1d0995d

Fix FSDP distributed parameter weight shapes for parameters > 2-D. See merge request ADLR/megatron-lm!3877

ADLR/megatron-lm!3917 - chore: Add RL review group

500333c

Merge branch 'ko3n1g/chore/add-rl-group' into 'main'

17cb145

chore: Add RL review group See merge request ADLR/megatron-lm!3917

ADLR/megatron-lm!3843 - build: Upgrade TE to 2.7

3ec579a

Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>

Merge branch 'ko3n1g/build/te-2.7' into 'main'

1deafac

build: Upgrade TE to 2.7 See merge request ADLR/megatron-lm!3843

ci(hotfix): Restart attempts

0e3d8ec

Signed-off-by: oliver könig <okoenig@nvidia.com>

ci(hotfix): Golden values

c2527ba

Signed-off-by: oliver könig <okoenig@nvidia.com>

ci(hotfix): yq path

fa4d12c

Signed-off-by: oliver könig <okoenig@nvidia.com>

ADLR/megatron-lm!3688 - Non-decode CUDA graphs for the dynamic inference backend.

256c855

Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>

Merge branch 'siddharth/non-decode-cg' into 'main'

d0d8a5c

Non-decode CUDA graphs for the dynamic inference backend. See merge request ADLR/megatron-lm!3688

ADLR/megatron-lm!3914 - Update README - Latest News

180ebf0

Co-authored-by: Santosh Bhavani <santosh.bhavani@live.com>

Merge branch 'main' into 'main'

d7ed78d

Update README - Latest News See merge request ADLR/megatron-lm!3914

ADLR/megatron-lm!3928 - ci: Py312 wheels

28925b8

Merge branch 'ko3n1g/ci/py312-support' into 'main'

a6237d0

ci: Py312 wheels See merge request ADLR/megatron-lm!3928
