🧠 Building LLMs from Scratch – A 30-Day Journey

This repository guides you through the process of building a GPT-style Large Language Model (LLM) from scratch using PyTorch. The structure and approach are inspired by the book Build a Large Language Model (From Scratch) by Sebastian Raschka.


📘 Reference Book

Build a Large Language Model (From Scratch) by Sebastian Raschka, published by Manning: https://www.manning.com/books/build-a-large-language-model-from-scratch

🗓️ Weekly Curriculum Overview

🔹 Week 1: Foundations of Language Models

  • Set up the environment and tools.
  • Learn about tokenization, embeddings, and the idea of a "language model" (a minimal tokenizer-and-embedding sketch follows this list).
  • Encode input/output sequences and build basic forward models.
  • Understand unidirectional processing and causal language modeling.
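
To make the Week 1 ideas concrete, here is a minimal character-level tokenizer with an embedding lookup in PyTorch. This is an illustrative sketch, not the repo's code; the notebooks may use a different tokenization scheme, and the vocabulary and dimensions here are made up.

```python
# Minimal character-level tokenizer plus embedding lookup (illustrative sketch).
import torch
import torch.nn as nn

text = "hello language models"
vocab = sorted(set(text))                      # character-level vocabulary
stoi = {ch: i for i, ch in enumerate(vocab)}   # string -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> string

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = torch.tensor(encode("hello"), dtype=torch.long)

# Each token id is mapped to a dense vector the model can learn from.
emb = nn.Embedding(num_embeddings=len(vocab), embedding_dim=16)
vectors = emb(ids)                             # shape: (5, 16)
print(decode(ids.tolist()), vectors.shape)
```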

🔹 Week 2: Building the Transformer Decoder

  • Explore Transformer components: attention, multi-head attention, and positional encoding (see the attention sketch after this list).
  • Implement residual connections, normalization, and feedforward layers.
  • Build a GPT-style decoder-only transformer architecture.
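
The sketch below shows single-head causal self-attention, the mechanism at the heart of a decoder-only block. It is a simplified illustration under assumed hyperparameters, not the repo's exact implementation; a full GPT block would add multi-head splitting, dropout, layer normalization, and a feedforward sublayer.

```python
# Sketch of single-head causal self-attention (hyperparameters are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model, context_len):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)
        # Upper-triangular mask blocks attention to future positions.
        mask = torch.triu(torch.ones(context_len, context_len), diagonal=1).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):                       # x: (batch, seq, d_model)
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        att = (q @ k.transpose(-2, -1)) / C ** 0.5   # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        return self.proj(att @ v)

x = torch.randn(2, 8, 32)
print(CausalSelfAttention(d_model=32, context_len=8)(x).shape)  # (2, 8, 32)
```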

🔹 Week 3: Training and Dataset Handling

  • Load and preprocess datasets like TinyShakespeare.
  • Implement batch creation, context windows, and training routines (see the training sketch after this list).
  • Use cross-entropy loss, optimizers, and learning rate schedulers.
  • Monitor perplexity and improve generalization.
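
As a rough outline of the training loop, the sketch below draws random context windows from a token stream and runs one optimization step. The model is a trivial stand-in, and the data, hyperparameters, and sizes are placeholders rather than the repo's settings; note that perplexity is just the exponential of the cross-entropy loss.

```python
# Sketch of context-window batching and a single training step.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, block_size, batch_size = 100, 64, 16
data = torch.randint(0, vocab_size, (10_000,))  # stand-in for tokenized TinyShakespeare

def get_batch():
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])          # inputs
    y = torch.stack([data[i + 1 : i + 1 + block_size] for i in ix])  # next-token targets
    return x, y

# Trivial stand-in model: (batch, seq) token ids -> (batch, seq, vocab_size) logits.
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

x, y = get_batch()
logits = model(x)
loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss {loss.item():.3f}, perplexity {loss.exp().item():.1f}")
```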

🔹 Week 4: Text Generation and Deployment

  • Generate text using greedy, top-k, top-p, and temperature sampling (see the sampling sketch after this list).
  • Evaluate generated text and tune the decoding parameters.
  • Export and convert the model for Hugging Face compatibility.
  • Deploy via Hugging Face Hub and Gradio Space.
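
The following sketch shows temperature scaling and top-k filtering applied to next-token logits; top-p works similarly but cuts on cumulative probability, and greedy decoding simply takes the argmax. The logits here are faked for demonstration; in practice you would plug in your trained model's output.

```python
# Sketch of temperature and top-k sampling from next-token logits.
import torch
import torch.nn.functional as F

def sample_next(logits, temperature=1.0, top_k=None):
    logits = logits / temperature               # <1 sharpens, >1 flattens the distribution
    if top_k is not None:
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[..., -1, None]] = float("-inf")  # drop all but the top-k tokens
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)

logits = torch.randn(1, 100)                    # fake next-token logits for a 100-token vocab
print(sample_next(logits, temperature=0.8, top_k=10))
```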

🛠️ Getting Started

Prerequisites

  • Python 3.8+
  • PyTorch
  • NumPy
  • Matplotlib
  • JupyterLab or Notebooks
  • Hugging Face libraries: transformers, datasets, huggingface_hub
  • gradio for deployment

Installation

git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
cd Building-LLMs-from-scratch
pip install -r requirements.txt

📁 Project Structure

Building-LLMs-from-scratch/
├── notebooks/            # Weekly learning notebooks
├── models/               # Model architectures & checkpoints
├── data/                 # Preprocessing and datasets
├── hf_deploy/            # Hugging Face config & deployment scripts
├── theoretical/          # Podcast & theoretical discussions
├── utils/                # Helper scripts
├── requirements.txt
└── README.md

🚀 Hugging Face Deployment

This project includes:

  • Scripts to convert the model for 🤗 Transformers compatibility
  • Scripts to upload the model to the Hugging Face Hub
  • An interactive demo on Hugging Face Spaces built with Gradio

You’ll find detailed instructions inside the hf_deploy/ folder.
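
For orientation, here is a minimal upload sketch using the official huggingface_hub client. The repo id and local folder path are placeholders; the scripts in hf_deploy/ are the source of truth.

```python
# Sketch of uploading a converted model folder to the Hugging Face Hub.
from huggingface_hub import HfApi

api = HfApi()  # uses the token cached by `huggingface-cli login`
api.create_repo("your-username/your-llm", exist_ok=True)      # placeholder repo id
api.upload_folder(
    folder_path="hf_deploy/converted_model",                  # placeholder local path
    repo_id="your-username/your-llm",
    repo_type="model",
)
```

And a bare-bones Gradio app of the kind a Space would run; the generate function is a placeholder to be replaced with real model inference.

```python
# Minimal Gradio demo for a Hugging Face Space.
import gradio as gr

def generate(prompt: str) -> str:
    return prompt + " ..."   # placeholder: call your model's generation loop here

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```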


📄 License

MIT License — see the LICENSE file for details.