RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness (NeurIPS 2025 Spotlight 🔥)
This repo is the official implementation of the paper: RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness [NeurIPS 2025 (Spotlight, acceptance rate: 3.1%)]
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
Fanhu Zeng, Haiyang Guo, Fei Zhu, Li Shen, Hao Tang
Keywords: Multimodal large language model, Model merging, Multi-task learning, Parameter-efficient tuning, Robust fine-tuning.
TL;DR: An effective parameter-efficient model merging method for multimodal large language models from the perspective of direction robustness in low-rank space
- [2025.09.18] RobustMerge is accepted by NeurIPS 2025 and selected as Spotlight !!!! 🎉
- [2025.08.03] We release fine-tuned models on eight seen datasets for a quick start with the benchmark! 🎨
- [2025.05.12] We release instructions for MM-Merging-Bench on Huggingface, feel free to try it! 🔥
- [2025.04.11] We release the evaluation script for RobustMerge. Try it now! 🎆
- [2025.02.24] RobustMerge is available on arXiv. 🍬
In parameter-efficient model merging, for a single matrix, the direction associated with each singular value can be viewed as task-specific knowledge in low-rank space, and the magnitude of the singular value is the extent to which that knowledge is utilized in the current task. Left: Starkly different singular values exist within a task, leading to instability when merging across tasks. Right: As directions with large singular values are naturally robust, direction instability is more likely to occur for small singular values when merging a specific singular vector.
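For intuition, here is a minimal numerical illustration (not code from this repo) of this SVD view: a LoRA update ΔW = BA decomposes into rank-1 components, each pairing a direction (a singular vector pair, i.e., task-specific knowledge) with a magnitude (the singular value, i.e., how strongly that knowledge is used).

```python
# Toy illustration of the SVD view of a LoRA update (values are random stand-ins).
import torch

torch.manual_seed(0)
d, r = 64, 8                      # hidden size and LoRA rank (toy values)
B = torch.randn(d, r) * 0.1       # stand-ins for fine-tuned LoRA factors
A = torch.randn(r, d) * 0.1
delta_W = B @ A                   # low-rank task update

U, S, Vh = torch.linalg.svd(delta_W, full_matrices=False)
print("singular values:", S[:r])  # typically span a stark range within one task

# The update is a sum of rank-1 "direction x magnitude" components.
recon = sum(S[i] * torch.outer(U[:, i], Vh[i]) for i in range(r))
print("reconstruction error:", torch.norm(recon - delta_W).item())
```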
Mitigating the gap between singular values is effective for a high-performance merged model: we prune ineffective parameters and construct scaling coefficients from inter-parameter relations directly on LoRA components, mitigating inter-task interference that arises from stark differences between singular values. Additionally, we perform cross-task normalization to balance tasks of different data scales and to enhance generalization to unseen tasks.
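The sketch below is a simplified illustration of this recipe under our own assumptions (element-wise magnitude pruning, a toy column-norm rescaling, and plain averaging across tasks); it is not the repo's actual implementation, whose exact coefficients follow the paper.

```python
# Simplified sketch of merging several LoRA experts: prune low-magnitude
# entries of each LoRA factor, rescale columns to reduce stark magnitude gaps,
# then normalize across tasks before summing. Illustrative only.
import torch

def prune_by_magnitude(M: torch.Tensor, keep_ratio: float = 0.7) -> torch.Tensor:
    """Keep only the largest-magnitude entries of M (element-wise pruning)."""
    k = max(1, int(M.numel() * keep_ratio))
    threshold = M.abs().flatten().kthvalue(M.numel() - k + 1).values
    return M * (M.abs() >= threshold)

def column_scaling(M: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Illustrative inter-parameter scaling (one possible choice): shrink
    dominant columns and boost small ones so magnitude gaps narrow."""
    norms = M.norm(dim=0) + eps
    scale = norms.mean() / norms
    return M * scale

def merge_loras(loras, keep_ratio: float = 0.7):
    """loras: list of (A, B) pairs, one per task, with delta_W = B @ A."""
    merged_A, merged_B = 0, 0
    for A, B in loras:
        A_p = prune_by_magnitude(A, keep_ratio)
        B_p = prune_by_magnitude(B, keep_ratio)
        merged_A = merged_A + column_scaling(A_p)
        merged_B = merged_B + column_scaling(B_p)
    # Cross-task normalization: average so tasks of different scales stay balanced.
    n = len(loras)
    return merged_A / n, merged_B / n

torch.manual_seed(0)
tasks = [(torch.randn(8, 64) * 0.1, torch.randn(64, 8) * 0.1) for _ in range(3)]
A_m, B_m = merge_loras(tasks)
print("merged update shape:", (B_m @ A_m).shape)
```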
As in LLaVA, install the packages following the steps below:
- Clone this repository
git clone https://github.com/AuroraZengfh/RobustMerge.git
cd RobustMerge
- Install Package
conda create -n robustmerge python=3.10 -y
conda activate robustmerge
pip install --upgrade pip
pip install -e .
- Install additional packages for training cases
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
- Create a `models` folder, download the base model LLaVA, and put the checkpoint in the folder.
- Create a `datasets` folder and download all datasets needed for merging.
- Create an `instructions` folder and download all the instructions needed for merging.
For the constructed MLLM merging benchmark, including both datasets and instructions, see MM-Merging-Bench. Details of the image sources for the datasets are listed below:
Seen datasets for merging
Dataset | Image Source | Download Path |
---|---|---|
ScienceQA | ScienceQA | images |
VizWiz | VizWiz | images |
ImageNet | ImageNet | images |
VQAv2, REC | COCO2014 | images |
IconQA | IconQA | images |
Flickr30k | Flickr30k | images |
OCRVQA | OCRVQA | images |
Unseen datasets for merging
Dataset | Image Source | Download Path |
---|---|---|
AOKVQA | COCO2014 | images |
ImageNet-R | ImageNet-R | images |
Screen2words | Screen2words | images |
TabMWP | TabMWP | images |
You can also format your custom data and place it in the corresponding folders.
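If you build custom data, a single entry in the LLaVA-style visual instruction format typically looks like the sketch below; the file name, image path, and conversation contents are placeholders (assumptions, not part of this repo), so follow the schema in MM-Merging-Bench for the exact fields.

```python
# Hypothetical example of one custom instruction entry in LLaVA-style format.
import json
import os

entry = {
    "id": "custom_000001",
    "image": "custom_dataset/000001.jpg",   # relative to your image folder
    "conversations": [
        {"from": "human", "value": "<image>\nWhat is shown in the picture?"},
        {"from": "gpt", "value": "A short ground-truth answer."},
    ],
}

os.makedirs("instructions", exist_ok=True)
with open("instructions/custom_dataset_train.json", "w") as f:
    json.dump([entry], f, indent=2)
```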
Follow the standard parameter-efficient fine-tuning procedure in LLaVA to obtain an individual checkpoint for each dataset.
You can switch the foundation model according to your needs.
Take llava-v1.5-7b as an example:
- Evaluate direct fine-tuned model
sh scripts/eval_merge/Eval_direct.sh
- Merge direct fine-tuned model
sh scripts/merge/merge_lora.sh
- Evaluate merged model
sh scripts/eval_merge/Eval_merge.sh
Note:
- '/path/to/your-fined-model' in `Eval_direct.sh` and `merge_lora.sh` is the root folder of the directly fine-tuned checkpoint.
- '/path/to/your/merged/checkpoint' in `merge_lora.sh` and `Eval_merge.sh` is the folder of the merged checkpoint.
We provide LoRA model weights fine-tuned for 1 epoch on each of these eight datasets for a quick start.
Dataset | Fine-tuned Model Weights |
---|---|
ScienceQA | model-path |
VizWiz | model-path |
ImageNet | model-path |
VQAv2 | model-path |
REC | model-path |
IconQA | model-path |
Flickr30k | model-path |
OCRVQA | model-path |
If you find this work useful, consider giving this repository a star ⭐ and citing 📑 our paper as follows:
@article{zeng2025parameter,
title={RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness},
author={Zeng, Fanhu and Guo, Haiyang and Zhu, Fei and Shen, Li and Tang, Hao},
journal={arXiv preprint arXiv:2502.17159},
year={2025}
}
The code is based on LLaVA and TIES-Merging. Thanks for these great works and for open-sourcing them!
If you find them helpful, please consider citing them as well.