Description
Hello Amazon Science Team,
First, thank you for your work on LC-PLM and for sharing it with the community. I've run into what appears to be a bug affecting the model's sensitivity to its input.
When processing two protein sequences that are nearly identical (differing by only one or a few amino acids), LcPlmForMaskedLM produces bit-for-bit identical embeddings. Only for two very different sequences does it produce (slightly) different embeddings.
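For reference, here is a minimal sketch of how I trigger this. It assumes the model is loaded from a local ./LC-PLM checkpoint with trust_remote_code=True and that per-residue representations can be read via output_hidden_states; the two sequences and the mean-pooling step are illustrative, not part of any official API:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Illustrative reproduction; "./LC-PLM" is my local checkpoint directory.
tokenizer = AutoTokenizer.from_pretrained("./LC-PLM", trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained("./LC-PLM", trust_remote_code=True)
model.eval()

# Two example sequences differing only in the final residue.
seq_a = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
seq_b = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVA"

def embed(seq):
    inputs = tokenizer(seq, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # Mean-pool the last hidden layer into a single embedding vector.
    return out.hidden_states[-1].mean(dim=1)

# On my machine this prints True: the embeddings are bit-for-bit identical
# even though the input sequences differ.
print(torch.equal(embed(seq_a), embed(seq_b)))
```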
Furthermore, the following warning was displayed when loading the model. It indicates that a significant portion of the weights (the mamba_rev in_proj/out_proj matrices in every bimamba.backbone layer) were not found in the checkpoint and were randomly initialized instead, which would explain the observed lack of sensitivity:
Some weights of LcPlmForMaskedLM were not initialized from the model checkpoint at ./LC-PLM and are newly initialized: ['bimamba.backbone.layers.0.mixer.mamba_rev.in_proj.weight', 'bimamba.backbone.layers.0.mixer.mamba_rev.out_proj.weight', 'bimamba.backbone.layers.1.mixer.mamba_rev.in_proj.weight', 'bimamba.backbone.layers.1.mixer.mamba_rev.out_proj.weight',
....................................
'bimamba.backbone.layers.47.mixer.mamba_rev.in_proj.weight', 'bimamba.backbone.layers.47.mixer.mamba_rev.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
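To narrow this down, I compared the tensor names stored in the checkpoint file against the keys the model expects. Below is a sketch of that check, assuming the weights live in model.safetensors under ./LC-PLM (adjust for sharded or .bin checkpoints) and reusing the model object from the snippet above:

```python
from safetensors.torch import load_file

# Load the checkpoint's tensors to inspect which keys it actually contains.
ckpt = load_file("./LC-PLM/model.safetensors")

expected = set(model.state_dict())
stored = set(ckpt)

missing = sorted(expected - stored)  # expected by the model, absent from the file
unused = sorted(stored - expected)   # present in the file, never loaded

print(f"{len(missing)} missing keys, e.g. {missing[:2]}")
print(f"{len(unused)} unused keys, e.g. {unused[:2]}")
```

If the mamba_rev weights appear under differently named keys in the unused list, that would suggest a key-name mismatch in the loading code (e.g. a prefix difference) rather than a genuinely incomplete checkpoint.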
My Environment
PyTorch Version: 2.4.1+cu118
Transformers Version: 4.56.2
PyTorch CUDA Version: 11.8