Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and vision-language capabilities
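For context, the cross-architecture knowledge distillation named above typically blends a temperature-softened KL term against the teacher's logits with the usual hard-label loss. A minimal PyTorch-style sketch; the function name, default temperature, and mixing weight are illustrative, not taken from the repository:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL loss (teacher -> student) with hard-label CE.

    Assumes logits of shape (N, vocab) and integer labels of shape (N,).
    """
    # Soften both distributions with temperature T; scale by T^2 so the
    # gradient magnitude stays comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```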
Time-series prediction using a decoder-only Transformer, including SwiGLU and RoPE (Rotary Positional Embedding)
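SwiGLU and RoPE, mentioned above, are standard building blocks. A minimal PyTorch sketch of both, assuming the rotate-half RoPE convention and a (batch, seq, dim) tensor layout; this is not code from the repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def apply_rope(x):
    """Rotary positional embedding on a (batch, seq, dim) tensor; dim must be even."""
    _, seq_len, dim = x.shape
    half = dim // 2
    # Frequencies as in the RoPE paper: theta_i = 10000^(-2i/dim)
    freqs = 1.0 / (10000 ** (torch.arange(half, dtype=torch.float32) / half))
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()      # (seq, half), broadcast over batch
    x1, x2 = x[..., :half], x[..., half:]
    # Rotate each (x1, x2) channel pair by a position-dependent angle.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

class SwiGLU(nn.Module):
    """Gated feed-forward block: down(silu(gate(x)) * up(x)), LLaMA-style."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))
```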
🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
Code for paper "Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI"
This repository contains the implementation and experiments for comparing gradual-growth methods, specifically the G_stack approach, against naive models trained from scratch. The project focuses on mitigating catastrophic forgetting and improving model performance in continual learning scenarios.
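A hedged sketch of the general depth-stacking idea behind such growth operators, assuming G_stack grows a trained model by duplicating its layer stack before continued training; the actual method and its details are specified in the repository and its paper:

```python
import copy
import torch.nn as nn

def grow_by_stacking(layers: nn.ModuleList, growth_factor: int = 2) -> nn.ModuleList:
    """Grow a transformer depth-wise by duplicating its trained layers.

    The grown model starts from copies of the small model's layers rather
    than random initialization, then training continues on the larger model.
    """
    grown = []
    for _ in range(growth_factor):
        for layer in layers:
            grown.append(copy.deepcopy(layer))
    return nn.ModuleList(grown)
```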
A decoder-only transformer with the simplest character-level tokenization, plus training and text-generation code.
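The simplest character-level tokenizer maps each unique character to an integer id. A self-contained Python sketch; the helper names are illustrative:

```python
def build_char_tokenizer(text: str):
    """Character-level tokenizer: one integer id per unique character."""
    vocab = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(vocab)}
    itos = {i: ch for ch, i in stoi.items()}
    encode = lambda s: [stoi[c] for c in s]
    decode = lambda ids: "".join(itos[i] for i in ids)
    return encode, decode, len(vocab)

encode, decode, vocab_size = build_char_tokenizer("hello world")
assert decode(encode("hello")) == "hello"
```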
Autoregressive text-generation application using a decoder-only transformer.
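Autoregressive generation samples one token at a time and feeds each prediction back into the model. A PyTorch-style sketch assuming a model that returns (batch, seq, vocab) logits; not the application's actual code:

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens, temperature=1.0):
    """Sample tokens one at a time, appending each to the context."""
    for _ in range(max_new_tokens):
        logits = model(idx)                      # (batch, seq, vocab)
        logits = logits[:, -1, :] / temperature  # only the last position matters
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)   # append and repeat
    return idx
```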
A decoder-only approach to image reconstruction, inspired by adversarial machine learning, implemented in Keras/TensorFlow 2.
Decoder-only transformer model for answering short questions using causal self-attention.
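Causal self-attention masks out future positions so each token attends only to itself and earlier tokens. A minimal PyTorch sketch of the masking, not the repository's code:

```python
import math
import torch

def causal_self_attention(q, k, v):
    """Scaled dot-product attention with a causal (lower-triangular) mask."""
    seq_len, d = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    # Strictly-upper-triangular entries correspond to future positions.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```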