This repository contains a minimal character-level recurrent neural network (RNN) language model trained on the "Tiny Shakespeare" corpus popularized by Andrej Karpathy. The code was originally authored in Google Colab and kept here both as a Jupyter notebook and as the exported Python script that mirrors the notebook cells.
- `RNN Language Model/RNN_Language_Model.ipynb` – the original Colab notebook. Open this if you want the exact interactive environment the model was built in.
- `rnn_language_model.py` – the notebook exported to a Python script. The script still contains notebook-style commands (for example `!wget`), so it is best treated as reference code or executed inside an interactive environment that understands shell magics (such as Jupyter).
- `LICENSE` – licensing information for the project.
The notebook/script expects the following software stack:
- Python 3.8+
- PyTorch (tested with version 2.x)
- Jupyter (optional, but recommended for running the notebook)
If you plan to run the code locally, install the dependencies inside a virtual environment:
```bash
python -m venv .venv
source .venv/bin/activate
pip install torch jupyter
```

Training uses the Tiny Shakespeare dataset downloaded from Karpathy's char-rnn repository. The notebook/script automatically fetches the data with:
```bash
!wget https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt -O tiny_shakespeare.txt
```

You can also download the file manually and place it in the project directory if you prefer not to use the shell command inside Jupyter.
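If you would rather fetch the file from plain Python (outside Jupyter), a minimal sketch using only the standard library might look like this; the target filename `tiny_shakespeare.txt` matches what the notebook expects:

```python
# Download Tiny Shakespeare without relying on the !wget shell magic.
from pathlib import Path
from urllib.request import urlretrieve

URL = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
target = Path("tiny_shakespeare.txt")

if not target.exists():  # skip the download on repeated runs
    urlretrieve(URL, target)
print(f"{target} ({target.stat().st_size} bytes)")
```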
- Launch Jupyter (recommended):

  ```bash
  jupyter notebook
  ```

  Then open `RNN Language Model/RNN_Language_Model.ipynb` and run the cells sequentially.
- Or execute the exported script inside an environment that supports notebook magics (e.g., `ipython`):

  ```bash
  ipython RNN\ Language\ Model/rnn_language_model.py
  ```
During training, the script samples random contiguous 64-character chunks from the corpus and trains for 2,000 steps using an `nn.RNN` layer with a hidden size of 128. Progress is printed every 200 steps. After training, the model generates ~300 characters of text conditioned on the prompt `"KING: "`.
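For readers who want a sense of the moving parts without opening the notebook, here is a condensed, self-contained sketch of the same recipe. It is not the notebook's exact code: the class name `RNNLanguageModel` and the names `block_size`, `batch_size`, and `hidden_size` follow the ones referenced in this README, but the batch size, embedding layer, optimizer, learning rate, and sampling strategy are assumptions.

```python
# Condensed character-level RNN language model (a sketch, not the notebook's exact code).
import torch
import torch.nn as nn

block_size = 64    # context length per training chunk (as described above)
batch_size = 32    # assumed; the notebook may use a different value
hidden_size = 128  # hidden size mentioned above

text = open("tiny_shakespeare.txt", encoding="utf-8").read()
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
data = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

def get_batch():
    # Sample batch_size random contiguous chunks of block_size characters,
    # with targets shifted one position to the right (next-character prediction).
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])
    return x, y

class RNNLanguageModel(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, idx, h=None):
        out, h = self.rnn(self.embed(idx), h)
        return self.head(out), h

model = RNNLanguageModel(len(chars), hidden_size)
opt = torch.optim.Adam(model.parameters(), lr=3e-3)  # assumed optimizer/learning rate

for step in range(2000):
    x, y = get_batch()
    logits, _ = model(x)
    loss = nn.functional.cross_entropy(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 200 == 0:
        print(f"step {step}: loss {loss.item():.3f}")

# Generate ~300 characters conditioned on the prompt "KING: ".
idx = torch.tensor([[stoi[ch] for ch in "KING: "]])
out = "KING: "
logits, h = model(idx)
for _ in range(300):
    probs = torch.softmax(logits[0, -1], dim=-1)
    nxt = torch.multinomial(probs, 1)          # sample the next character
    out += itos[nxt.item()]
    logits, h = model(nxt.view(1, 1), h)       # feed it back in with the hidden state
print(out)
```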
- Adjust `block_size`, `batch_size`, and the training loop (`range(2000)`) to change the context length, batch size, or the number of optimization steps.
- Modify the `hidden_size` parameter in `RNNLanguageModel` to increase or decrease the capacity of the network.
- Replace the dataset download URL with your own text corpus to experiment with different domains (see the sketch below).
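For example, to train on a different corpus, point the data-loading step at a local file instead of the download; a hypothetical swap might look like:

```python
# Hypothetical example: train on a local corpus instead of Tiny Shakespeare.
# "my_corpus.txt" is a placeholder path, not a file shipped with this repo.
text = open("my_corpus.txt", encoding="utf-8").read()
chars = sorted(set(text))  # the vocabulary is rebuilt from the new text
```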
This project is distributed under the terms of the MIT License. See the LICENSE file for full details.
