SmolVLA Quickstart (Internal)

SmolVLA is easy to use for fine-tuning or integration into robotics workflows.

Prerequisites

  • Install uv:
    curl -LsSf https://astral.sh/uv/install.sh | sh
  • Make sure you have a working GPU and CUDA drivers (tested: Ubuntu 24.04, RTX 5080, CUDA 12.8).
  • Build FFmpeg 7.1.1 from source (guide).
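
The prerequisites above can be sanity-checked before installing anything. The sketch below only tests whether each tool is on PATH. It does not verify versions or that the GPU driver actually works, and `check_tool` is a hypothetical helper name, not part of this repo.

```shell
# Hypothetical helper: print whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Tools named in the prerequisites list; keep going even if one is missing.
for tool in uv nvidia-smi ffmpeg; do
  check_tool "$tool" || true
done
```

If all three report `ok`, running `nvidia-smi` and `ffmpeg -version` by hand shows the driver and FFmpeg versions to compare against the tested configuration.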

Note

Tested System Configuration

  • OS: Ubuntu 24.04.3 LTS (x86_64)
  • Kernel: 6.16.0-061600-generic
  • GPU: NVIDIA GeForce RTX 5080
  • NVIDIA Driver: 580.65.06
  • CUDA: 12.8
  • Python: 3.10
  • uv: 0.8.4
  • FFmpeg: 7.1.1

Clone and Install

Clone the repo and install SmolVLA dependencies:

git clone --recurse-submodules https://github.com/weijieyong/smolvla-ws.git
# If you already cloned without submodules:
# git submodule update --init --recursive
cd smolvla-ws/lerobot
uv venv --python 3.10
uv pip install -e ".[smolvla]"

(Optional) For RTX 5080 GPUs

Use nightly builds for torch/torchcodec for compatibility:

uv pip uninstall torch torchvision torchcodec
uv pip install --pre torch torchvision torchcodec --index-url https://download.pytorch.org/whl/nightly/cu128

Authenticate for Model/Dataset Access

Log in to Hugging Face and Weights & Biases:

hf auth login
wandb login

(Optional) Suppress Tokenizer Warnings

export TOKENIZERS_PARALLELISM=false

Fine-tune Example

Fine-tune the base model on a Hugging Face dataset. Adjust the batch_size value to fit your GPU's VRAM and avoid out-of-memory (OOM) errors:

# from the lerobot dir
uv run src/lerobot/scripts/train.py \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=aractingi/il_gym0 \
  --batch_size=48 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true \
  --policy.push_to_hub=false
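
One rough way to pick a starting `--batch_size` for a different GPU is to scale it with VRAM. The sketch below assumes linear scaling anchored at the value used above (48) on a 16 GiB card such as the RTX 5080. That is an untested heuristic, and `suggest_batch_size` is a hypothetical helper, so treat the result as a first guess and lower it if training still runs out of memory.

```shell
# Hypothetical heuristic (assumption, not a measured rule): scale the batch
# size linearly with total VRAM in MiB, anchored at 48 for 16384 MiB.
suggest_batch_size() {
  vram_mib=$1
  size=$(( vram_mib * 48 / 16384 ))
  # Never suggest less than one sample per batch.
  if [ "$size" -lt 1 ]; then
    size=1
  fi
  echo "$size"
}

# Optional: read the first GPU's total memory (MiB) when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  vram=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1)
  if [ -n "$vram" ]; then
    echo "suggested --batch_size=$(suggest_batch_size "$vram")"
  fi
fi
```

For example, `suggest_batch_size 8192` prints `24` for an 8 GiB card.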

More about the training script here.

Data Recording

Visualizing a Dataset

  • Visualize with rerun, using the lerobot/aloha_static_coffee_new dataset:
# from the lerobot dir
uv run src/lerobot/scripts/visualize_dataset.py \
  --repo-id lerobot/aloha_static_coffee_new \
  --episode-index 0

Eval

Resources
