Conversation

RaymondLi0
Collaborator

No description provided.

@RaymondLi0 RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12
@RaymondLi0 RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12
ko3n1g and others added 28 commits July 17, 2025 21:26
ModelCommProcessGroup integration into model interface

See merge request ADLR/megatron-lm!3391
fix(mla): use mscale_all_dim for softmax_factor calculation

See merge request ADLR/megatron-lm!2800
Use ruff linter

See merge request ADLR/megatron-lm!3627
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: root <root@batch-block7-00988.cm.cluster>
Multi-prompt inference test

See merge request ADLR/megatron-lm!3568
build: Add build-backend

See merge request ADLR/megatron-lm!3608
fix: Guard TE import

See merge request ADLR/megatron-lm!3596
[Fix] Set packed_seq_params default to None for get_rotary_seq_len for Mamba/BERT RoPE

See merge request ADLR/megatron-lm!3651
fix dynamic example script

See merge request ADLR/megatron-lm!3653
Update MoE functional tests

See merge request ADLR/megatron-lm!3419
…i-in-multi-out)

Co-authored-by: Huy Vu2 <huvu@login-eos01.eos.clusters.nvidia.com>
Co-authored-by: ykarnati <ykarnati@nvidia.com>
Co-authored-by: Yashaswi Karnati <ykarnati@login-eos01.eos.clusters.nvidia.com>
Co-authored-by: Yashaswi Karnati <ykarnati@cw-dfw-cs-001-vscode-01.cm.cluster>
Example of AVLM (audio-vision) for MiMo (multi-in-multi-out)

See merge request ADLR/megatron-lm!3624
…otary_seq_len`

Co-authored-by: Jason Chiu <jason.chiu@codeium.com>
fix: return int instead of tensor from `get_rotary_seq_len`

See merge request ADLR/megatron-lm!3559
chore: Add local folders to gitignore

See merge request ADLR/megatron-lm!3669
build: Lift modelopt and upgrade lockfile

See merge request ADLR/megatron-lm!3660
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Co-authored-by: Siddharth Singh <sidsingh@nvidia.com>
buptzyb and others added 30 commits August 26, 2025 12:46
Move cuda graph capture to core

See merge request ADLR/megatron-lm!3782
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
tests: Auto-validate weekly tests

See merge request ADLR/megatron-lm!3764
Signed-off-by: oliver könig <okoenig@nvidia.com>
Add proper teardowns for cudagraphs tests to futureproof against exit hangs

See merge request ADLR/megatron-lm!3885
build: Loosen transformers pin

See merge request ADLR/megatron-lm!3897
fix: correct way to pass pipeline_rank in test_get_layer_offset_parametrized

See merge request ADLR/megatron-lm!3899
Co-authored-by: Jon Barker <jbarker@nvidia.com>
Fix providers refactoring.

See merge request ADLR/megatron-lm!3900
Fix FSDP distributed parameter weight shapes for parameters > 2-D.

See merge request ADLR/megatron-lm!3877
chore: Add RL review group

See merge request ADLR/megatron-lm!3917
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
build: Upgrade TE to 2.7

See merge request ADLR/megatron-lm!3843
Signed-off-by: oliver könig <okoenig@nvidia.com>
…nce backend.

Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Non-decode CUDA graphs for the dynamic inference backend.

See merge request ADLR/megatron-lm!3688
Co-authored-by: Santosh Bhavani <santosh.bhavani@live.com>
Update README - Latest News

See merge request ADLR/megatron-lm!3914
ci: Py312 wheels

See merge request ADLR/megatron-lm!3928