Production-ready Mamba 1 & Mamba 2 implementation optimized for Apple Silicon with official pre-trained models
- Mamba 1 & 2 Support - Inference of both architectures with pretrained models from Hugging Face
- Text Generation - Coherent, contextual text generation
- Apple Silicon Support - MPS acceleration for M1/M2/M3/M4
- Dependency Management - Works without CUDA/Triton requirements
- Error Handling - Robust error handling and fallbacks for both architectures
- Multiple Interfaces - CLI, Python API, interactive demos

```bash
# 1. Clone and install
git clone https://github.com/purohit10saurabh/mamba-ssm-macos.git
cd mamba-ssm-macos
pip install -r requirements.txt
# 2. Download models
python -m scripts.download_models mamba1 # Mamba 1 (493MB)
python -m scripts.download_models mamba2 # Mamba 2 (493MB)
# 3. Generate text immediately
make run-mamba1 # Quick Mamba 1 demo
make run-mamba2 # Quick Mamba 2 demo
python -m examples.01_demo    # Interactive showcase
```

- Architecture Comparison
- Installation
- Usage Examples
- Performance
- Generated Examples
- Repository Structure
- Advanced Usage
- Troubleshooting
- References
- Contributing
| Feature | Mamba 1 | Mamba 2 |
|---|---|---|
| Architecture | SSM (Selective State Space) | SSD (State Space Dual) |
| Training Speed | Standard | ~2x faster |
| State Dimension | 16 | 128 (8x larger) |
| Multi-head | No | Yes (via ngroups) |
| Memory Efficiency | Good | Better |
| Generation Quality | High | Higher |
| Model Size | 129M params | 129M params |
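
To make the comparison concrete, here is a minimal side-by-side sketch that generates from both models with the Python helpers documented under Usage Examples below; the prompt and sampling values are illustrative only:

```python
# Illustrative comparison of the two architectures on the same prompt,
# using the package helpers shown later in Usage Examples.
from mamba_macos import get_device, load_and_prepare_model, generate_text_with_model

device = get_device()
for name in ("mamba1", "mamba2"):
    ok, model, tokenizer = load_and_prepare_model(name, "./models", device)
    if ok:
        text = generate_text_with_model(
            model, tokenizer, "The future of AI", device, max_length=30, temperature=0.7
        )
        print(f"{name}: {text}")
```
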
- macOS 12.3+ with Apple Silicon (M1/M2/M3/M4)
- Python 3.8+
- 8GB+ RAM recommended

```bash
# Clone repository
git clone https://github.com/purohit10saurabh/mamba-ssm-macos.git
cd mamba-ssm-macos
# Install dependencies (includes PyTorch with MPS support)
pip install -r requirements.txt
# Verify MPS support
python -c "import torch; print('MPS Available:', torch.backends.mps.is_available())"make download-models # Downloads both Mamba 1 & 2python -m scripts.download_models mamba1 # Mamba 1 (original)
python -m scripts.download_models mamba2 # Mamba 2 (latest)python -m examples.01_demo --interactive # Try both models
python -m examples.01_demo --show-structure # See organizationmake run-mamba1 # Quick Mamba 1 demo
make run-mamba2 # Quick Mamba 2 demo
make test-quick # Fast integration test
make show-structure    # Show directory layout
```

```bash
# Basic generation
python -m scripts.run_models mamba1 --prompt "The future of AI" --max-length 50
python -m scripts.run_models mamba2 --prompt "The future of AI" --max-length 30
# Custom parameters
python -m scripts.run_models mamba1 --prompt "Once upon a time" --temperature 0.8# New organized import structure
from mamba_macos import get_device, load_and_prepare_model, generate_text_with_model
# Load any model
device = get_device() # Automatically detects MPS/CPU
success, model, tokenizer = load_and_prepare_model("mamba1", "./models", device)
if success:
    text = generate_text_with_model(
        model, tokenizer, "The future of AI", device, max_length=50, temperature=0.7
    )
    print(text)
```

```bash
python -m examples.01_demo    # Interactive demo
python -m examples.02_basic   # Basic API usage
```

| Model | Loading | Generation | Memory | Quality |
|---|---|---|---|---|
| Mamba 1 | ~1.0s | 3-8 tok/s | ~2GB | Good |
| Mamba 2 | ~1.0s | 3-6 tok/s | ~2GB | Better |

```bash
# Test performance
make test-quick
```

Mamba 2 Advantages:
- Similar loading speed
- Better context understanding (d_state=128 vs 16)
- Higher quality output (SSD architecture)
- More efficient training (~2x faster during training)
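
For a rough tokens-per-second figure on your own hardware, a small timing sketch like the following can be used; it reuses the package helpers shown above, and the prompt, length, and temperature are arbitrary:

```python
import time
from mamba_macos import get_device, load_and_prepare_model, generate_text_with_model

device = get_device()
ok, model, tokenizer = load_and_prepare_model("mamba2", "./models", device)
if ok:
    start = time.perf_counter()
    text = generate_text_with_model(
        model, tokenizer, "The future of AI", device, max_length=50, temperature=0.7
    )
    elapsed = time.perf_counter() - start
    # Approximate throughput: tokens in the full output divided by wall-clock time
    n_tokens = len(tokenizer(text).input_ids)
    print(f"{n_tokens / elapsed:.1f} tok/s")
```
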
"The future of artificial intelligence is a big topic in the field of artificial intelligence."
"Once upon a time, there was a man named John."
"Python is a programming language that is used to create and manipulate objects."
"The capital of France is a city of the French, and the"
"The future of AI is not in limited solipsistic computing, but in densely-connected
and much richer data. In the next decade, we may be able to take advantage..."
"Once upon a time, in a land far away, there lived one lonely woman, who was
much respected among wolves. She resided at a rendezvous called Buguqrach..."

```
mamba-ssm-macos/
├── 📦 src/mamba_macos/ # 🆕 Core library (clean imports)
│ ├── __init__.py # Package exports & version
│ ├── utils.py # Device, tokenizer, generation
│ └── models.py # Model loading & preparation
│
├── 🔧 scripts/ # 🆕 Utility scripts
│ ├── download_models.py # Download both models
│ └── run_models.py # Run models with arguments
│
├── 🧪 tests/ # 🆕 Organized test suite
│ ├── unit/ # Component-level tests
│ │ ├── test_mamba_macos.py # Mamba 1 unit tests
│ │ ├── test_mamba2_macos.py # Mamba 2 unit tests
│ │ └── test_generation_macos.py # Generation tests
│ └── integration/ # End-to-end tests
│ └── test_unified_system.py # Complete workflow tests
│
├── 📚 examples/ # 🆕 Curated examples
│ ├── 01_demo.py # 🎯 START HERE - Production demo
│ └── 02_basic.py # Basic forward pass
│ └── README.md # Examples guide
│
├── ⚙️ config/ # 🆕 Configuration files
│ ├── pyproject.toml # Python project config
│ └── setup.py # Package setup
│
├── 🛠️ tools/ # 🆕 Development tools
│ └── run_all_tests.py # Test runner
│
├── 🤖 models/ # Downloaded models
│ ├── mamba1/ # Mamba 1 files
│ └── mamba2/ # Mamba 2 files
│
├── mamba_ssm/ # Core implementation
│ ├── models/ & modules/ # Model architectures
│ └── ... # (Unchanged)
│
├── 📋 Makefile # 🆕 Development commands
├── 📋 requirements.txt # 🆕 Dependencies
├── 📋 PROJECT_STRUCTURE.md # 🆕 Structure documentation
└── 📖 README.md # This file
```

```python
# Mamba 2 custom config
config = MambaConfig(
    d_model=768,
    n_layer=24,
    d_state=128,      # Larger state space
    headdim=64,       # Head dimension
    expand=2,         # Expansion factor
    ssm_cfg={"layer": "Mamba2", "d_state": 128},
    vocab_size=50288
)

# Mamba 1 custom config
config = MambaConfig(
    d_model=768,
    n_layer=24,
    d_state=16,       # Smaller state space
    ssm_cfg={"layer": "Mamba1"},
    vocab_size=50280
)
```
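
To instantiate a model from such a config, here is a sketch assuming the repository keeps the upstream `MambaLMHeadModel` entry point (the repository tree lists `mamba_ssm/` as unchanged); this creates randomly initialized weights, so use `load_and_prepare_model`, shown earlier, when you want the downloaded checkpoints:

```python
import torch
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "mps" if torch.backends.mps.is_available() else "cpu"
model = MambaLMHeadModel(config, device=device)  # `config` from the block above; random weights
model.eval()
```
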
```python
prompts = ["Prompt 1", "Prompt 2", "Prompt 3"]
for prompt in prompts:
    # Process each prompt
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs.input_ids.to(device)          # move inputs to the model's device
    outputs = model.generate(input_ids, max_length=50)
    print(tokenizer.decode(outputs[0]))
```

```python
# Prepare for fine-tuning
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Your training loop here
for batch in dataloader:
    optimizer.zero_grad()
    outputs = model(batch['input_ids'], labels=batch['labels'])
    loss = outputs.loss
    loss.backward()
    optimizer.step()
```
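
The loop above assumes a `dataloader` yielding dicts with `input_ids` and `labels`. A toy sketch of what that could look like is below; for causal LM fine-tuning, labels are typically a copy of the input ids, and whether the model's forward pass accepts a `labels` argument depends on the wrapper you load, so treat this as illustrative:

```python
from torch.utils.data import DataLoader

# Hypothetical toy dataset: one short text per example
texts = ["Mamba models run on Apple Silicon.", "State space models scale linearly."]

def make_example(text):
    ids = tokenizer(text, return_tensors="pt").input_ids.squeeze(0)
    return {"input_ids": ids, "labels": ids.clone()}  # labels mirror input_ids

dataloader = DataLoader([make_example(t) for t in texts], batch_size=1)
```
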

```bash
# Download models using new structure
make download-models                        # Both models
python -m scripts.download_models mamba1 # Mamba 1 only
python -m scripts.download_models mamba2    # Mamba 2 only
```

```bash
# Check MPS support
python -c "import torch; print(torch.backends.mps.is_available())"
# If false, the model will automatically use CPU
```
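
The package's `get_device` helper wraps this fallback; a minimal sketch of the same logic (the actual implementation in `src/mamba_macos/utils.py` may differ) looks like:

```python
import torch

def pick_device() -> torch.device:
    """Prefer Apple's Metal backend (MPS) when available, otherwise fall back to CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

print(pick_device())  # mps on supported Apple Silicon Macs, cpu otherwise
```
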
```bash
# Use new module structure
python -m examples.01_demo
```

```python
# Or run with clean imports
from mamba_macos import get_device, load_and_prepare_model
```

- ✅ First run is slower (model loading + compilation)
- ✅ Use shorter prompts for testing
- ✅ Close other apps to free memory
- ✅ Check Activity Monitor for memory usage

```
UserWarning: selective_scan_cuda module is not available
UserWarning: Triton is not available
```

These are expected - we use optimized PyTorch fallbacks.
- 📖 Read the docs: Check `PROJECT_STRUCTURE.md` for organization details
- 🧪 Run tests: `make test-quick` or `make test`
- 🔍 Check examples: `python -m examples.01_demo --show-structure`
- 🐛 Report issues: Create a GitHub issue with error details

```bash
# 1. Download models
make download-models
# 2. Test basic functionality
make run-mamba1
# 3. Explore interactively
python -m examples.01_demo
```

```bash
# Use Python API
python -m examples.02_basic
# Custom generation
python -m scripts.run_models mamba1 --prompt "Your text"- State Space Dual (SSD) architecture from official state-spaces/mamba
- Stable cumulative scan for numerical stability
- Multi-head processing via the ngroups design with headdim=64 (head count sketched after this list)
- Larger state space (d_state=128) for better memory
- Einsum operations for efficient tensor computations
- MPS optimization for Apple Silicon acceleration
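
As a back-of-the-envelope check of the multi-head sizing, assuming the standard upstream Mamba 2 relations d_inner = expand * d_model and nheads = d_inner / headdim:

```python
d_model, expand, headdim = 768, 2, 64  # values from the Mamba 2 config above
d_inner = expand * d_model             # 1536 channels inside each block
nheads = d_inner // headdim            # 24 heads, grouped via the ngroups setting
print(d_inner, nheads)                 # 1536 24
```
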
Mamba 1 implementation:
- Selective State Space Model (SSM) architecture
- Triton-free operation with PyTorch fallbacks
- Graceful degradation when optimizations unavailable
- Memory-efficient selective scan implementation (see the reference sketch after this list)
- Compatible with original mamba-130m weights
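
For intuition about what the selective scan computes, here is a deliberately naive sequential reference in plain PyTorch. The tensor shapes are assumptions for illustration, and the repository's actual fallback is vectorized and far more memory efficient than this loop:

```python
import torch

def selective_scan_reference(u, delta, A, B, C, D):
    """Naive sequential selective scan, for illustration only.
    Assumed shapes: u, delta: (batch, length, d_inner); A: (d_inner, d_state);
    B, C: (batch, length, d_state); D: (d_inner,)."""
    batch, length, d_inner = u.shape
    x = u.new_zeros(batch, d_inner, A.shape[-1])              # hidden SSM state
    outputs = []
    for t in range(length):
        dt = delta[:, t].unsqueeze(-1)                        # (batch, d_inner, 1)
        dA = torch.exp(dt * A)                                # discretized state transition
        dBu = dt * B[:, t].unsqueeze(1) * u[:, t].unsqueeze(-1)  # discretized input term
        x = dA * x + dBu                                      # selective recurrence
        y = (x * C[:, t].unsqueeze(1)).sum(dim=-1)            # project state to output
        outputs.append(y + D * u[:, t])                       # skip connection
    return torch.stack(outputs, dim=1)                        # (batch, length, d_inner)
```

The per-timestep delta is what makes the scan selective: both the state transition and the input contribution depend on the current input.
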
Next steps:
- Download a model: Choose Mamba 1 or 2
- Test functionality: Run example scripts
- Try your prompts: Experiment with generation
- Read examples: Learn from provided demos
- Fine-tune models: Train on your data
- Build applications: Use as text generation backend
- Contribute: Improve implementation or docs
- Research: Experiment with architectures

```bibtex
@article{gu2023mamba,
title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
author={Gu, Albert and Dao, Tri},
journal={arXiv preprint arXiv:2312.00752},
year={2023},
url={https://arxiv.org/abs/2312.00752}
}
```

```bibtex
@article{dao2024transformers,
title={Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
author={Dao, Tri and Gu, Albert},
journal={arXiv preprint arXiv:2405.21060},
year={2024},
url={https://arxiv.org/abs/2405.21060}
}
```

- 🔗 Mamba 1 & 2: state-spaces/mamba - Original PyTorch implementation
- 🤗 Mamba 2 Model: state-spaces/mamba2-130m - Pre-trained weights
- 🔬 State Space Models: state-spaces/s4 - Foundational SSM research
- Selective State Spaces: Gu et al., 2022 - S4 foundation
- Hungry Hungry Hippos: Fu et al., 2023 - H3 architecture
- Apple Silicon: PyTorch MPS Guide - Metal Performance Shaders
We welcome contributions! Areas for improvement:
- 🐛 Bug fixes: Report and fix issues
- 📚 Documentation: Improve guides and examples
- ⚡ Performance: Optimize for specific hardware
- 🆕 Features: Add new capabilities
- 🧪 Testing: Expand test coverage

```bash
git clone https://github.com/purohit10saurabh/mamba-ssm-macos.git
cd mamba-ssm-macos
pip install -e ".[dev]"
pytest tests/
```

Apache 2.0 License - see LICENSE file.
Optimized for Apple Silicon • Pure Python • Production Ready
Start with `python -m examples.01_demo` and explore from there!