forked from NVIDIA/Megatron-LM
From NVIDIA Megatron-LM for visibility #18
Open: RaymondLi0 wants to merge 5,201 commits into bigcode-project:multi-query-attention from NVIDIA:main
ModelCommProcessGroup integration into model interface See merge request ADLR/megatron-lm!3391
fix(mla): use mscale_all_dim for softmax_factor calculation See merge request ADLR/megatron-lm!2800
Use ruff linter See merge request ADLR/megatron-lm!3627
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: root <root@batch-block7-00988.cm.cluster>
Multi-prompt inference test See merge request ADLR/megatron-lm!3568
build: Add build-backend See merge request ADLR/megatron-lm!3608
fix: Guard TE import See merge request ADLR/megatron-lm!3596
[Fix] Set packed_seq_params default to None for get_rotary_seq_len for Mamba/BERT RoPE See merge request ADLR/megatron-lm!3651
fix dynamic example script See merge request ADLR/megatron-lm!3653
Update MoE functional tests See merge request ADLR/megatron-lm!3419
…i-in-multi-out) Co-authored-by: Huy Vu2 <huvu@login-eos01.eos.clusters.nvidia.com> Co-authored-by: ykarnati <ykarnati@nvidia.com> Co-authored-by: Yashaswi Karnati <ykarnati@login-eos01.eos.clusters.nvidia.com> Co-authored-by: Yashaswi Karnati <ykarnati@cw-dfw-cs-001-vscode-01.cm.cluster>
Example of AVLM (audio-vision) for MiMo (multi-in-multi-out) See merge request ADLR/megatron-lm!3624
…otary_seq_len` Co-authored-by: Jason Chiu <jason.chiu@codeium.com>
fix: return int instead of tensor from `get_rotary_seq_len` See merge request ADLR/megatron-lm!3559
chore: Add local folders to gitignore See merge request ADLR/megatron-lm!3669
build: Lift modelopt and upgrade lockfile See merge request ADLR/megatron-lm!3660
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com> Co-authored-by: Siddharth Singh <sidsingh@nvidia.com>
Move cuda graph capture to core See merge request ADLR/megatron-lm!3782
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
tests: Auto-validate weekly tests See merge request ADLR/megatron-lm!3764
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Add proper teardowns for cudagraphs tests to futureproof against exit hangs See merge request ADLR/megatron-lm!3885
build: Loosen transformers pin See merge request ADLR/megatron-lm!3897
fix: correct way to pass pipeline_rank in test_get_layer_offset_parametrized See merge request ADLR/megatron-lm!3899
Co-authored-by: Jon Barker <jbarker@nvidia.com>
Fix providers refactoring. See merge request ADLR/megatron-lm!3900
Fix FSDP distributed parameter weight shapes for paramaters > 2-D. See merge request ADLR/megatron-lm!3877
chore: Add RL review group See merge request ADLR/megatron-lm!3917
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
build: Upgrade TE to 2.7 See merge request ADLR/megatron-lm!3843
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
…nce backend. Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Non-decode CUDA graphs for the dynamic inference backend. See merge request ADLR/megatron-lm!3688
Co-authored-by: Santosh Bhavani <santosh.bhavani@live.com>
Update README - Latest News See merge request ADLR/megatron-lm!3914
ci: Py312 wheels See merge request ADLR/megatron-lm!3928
No description provided.