RAGgedy is for curious minds who want to chat with their documents without sending data to the cloud. It's a fully local RAG (Retrieval-Augmented Generation) system that runs entirely on your machine.
Created by Juan Santos at MS2-Lab
- 🔒 Completely Local - Your documents never leave your computer
- 🧸 Friendly Interface - Clean, intuitive web interface built with Streamlit
- 🎯 Customizable Personality - Edit the AI's system prompt to match your style
- 📄 Smart Document Processing - Handles PDFs with robust error handling
- 📊 Real-time Progress - See exactly what's happening during indexing
- 🔍 Powerful Search - Retrieve up to 50 document chunks for comprehensive answers
- RAM: 8GB minimum (4GB available for the AI model)
- Storage: 10GB free space
- OS: macOS, Linux, or Windows
- Python: 3.8 or higher
RAGgedy is designed to run on typical laptops and desktops - no GPU required!
# Install Ollama (your local LLM server)
brew install ollama # macOS
# or visit https://ollama.com for other platforms
# Download a language model (this will take a few minutes)
ollama pull phi3:mini
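Once the pull finishes, you can confirm Ollama is serving and the model is available. The snippet below is a minimal sanity check against Ollama's REST API on its default port (equivalent to running `ollama list`; make sure `ollama serve` is running first):

```python
# Quick sanity check: list the models Ollama has available locally.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = [m["name"] for m in json.loads(resp.read())["models"]]

print(models)                 # e.g. ['phi3:mini']
print("phi3:mini" in models)  # True once the pull has finished
```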
Choose one of these installation methods:
# Clone the repository
git clone https://github.com/juan-ms2lab/RAGgedy.git
cd RAGgedy
# Install dependencies
pip install -r requirements.txt
# Or install as a package
pip install -e .
# Or install from PyPI
pip install raggedy
# Add your documents to the docs folder
mkdir -p docs
# Copy your PDF or TXT files to docs/
# Build the search index
python build_index.py rebuild docs/
# Launch RAGgedy
python start_raggedy.py
# Alternative manual launch:
streamlit run app.py
# If Streamlit asks for email, use headless mode:
echo "" | streamlit run app.py --server.headless true
Open your browser to http://localhost:8501 and start asking questions about your documents!
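Curious what `python build_index.py rebuild docs/` actually does? Roughly: it extracts text from your documents, splits it into chunks, embeds each chunk, and stores the vectors in ChromaDB. Here's an illustrative sketch; the embedding model and collection name are assumptions, and the real logic lives in build_index.py:

```python
# Illustrative sketch of indexing: embed chunks and store them in Chroma.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")    # assumed embedding model
client = chromadb.PersistentClient(path="db")         # matches the db/ folder
collection = client.get_or_create_collection("docs")  # assumed collection name

chunks = ["first chunk of text...", "second chunk of text..."]  # from the chunker
collection.add(
    ids=[f"chunk-{i}" for i in range(len(chunks))],
    documents=chunks,
    embeddings=embedder.encode(chunks).tolist(),
)
print(collection.count())  # total chunks in the index
```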
Ask questions about your documents and get answers with source citations. RAGgedy will find relevant passages and use them to provide informed responses.
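Under the hood, each question goes through a retrieve-then-generate loop. A minimal sketch of that loop follows; the embedding model, collection name, and prompt template are assumptions here, and the real implementation is in app.py:

```python
# Illustrative sketch of answering a question with retrieved context.
import json
import urllib.request

import chromadb
from sentence_transformers import SentenceTransformer

question = "What is machine learning?"

# 1. Embed the question with the same model used at index time.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
query_vec = embedder.encode(question).tolist()

# 2. Pull the most relevant chunks from the local index.
client = chromadb.PersistentClient(path="db")
hits = client.get_collection("docs").query(query_embeddings=[query_vec], n_results=5)
context = "\n\n".join(hits["documents"][0])

# 3. Ask the local model, grounded in those passages.
payload = {
    "model": "phi3:mini",
    "prompt": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```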
Don't like how RAGgedy responds? Click "🎯 Customize System Prompt" and change how it talks (an example prompt follows this list):
- Make it more formal or casual
- Add domain expertise
- Change the response style
- Set specific constraints
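As a starting point, a domain-expert prompt might look something like this (purely an example, not the shipped default):

```
You are a meticulous research assistant. Answer only from the provided
document excerpts, cite the source file for every claim, and say
"I don't know" when the excerpts don't cover the question. Keep answers
under three paragraphs.
```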
Watch your documents get processed in real-time with progress bars and status updates. No more wondering if it's working!
Choose from different AI models based on your needs:
- phi3:mini - Best balance of performance & size (default, ~4GB)
- tinyllama - Ultra-fast, minimal resources (~2GB)
- gemma2:2b - Google's efficient model (~3GB)
- llama2:7b - More capable but larger (~4.8GB)
- gpt-oss:20b - Most capable but requires 16GB+ RAM
RAGgedy/
├── 🧸 app.py                 # Main web interface
├── 📄 extract_and_chunk.py   # Document processing magic
├── 🗄️ build_index.py         # Vector database management
├── 📁 docs/                  # Put your documents here
├── 💾 db/                    # Search index (auto-created)
└── 📖 README.md              # You are here
If you installed RAGgedy as a package, you can use these convenient commands:
# Launch RAGgedy with smart startup (recommended)
raggedy
# Process documents and build index
raggedy-build-index rebuild docs/
# Query from command line
raggedy-build-index query "What is machine learning?"
# Check system stats
raggedy-build-index stats
# Clear the index
raggedy-build-index clear
# Extract and process documents
raggedy-extract docs/
Or use the Python scripts directly:
python start_raggedy.py # Smart launcher
python build_index.py rebuild docs/
python build_index.py query "What is machine learning?"
python build_index.py stats
python build_index.py clear
python extract_and_chunk.py docs/
Want to tweak how documents are processed? Edit the settings in the Python files:
# Chunk size (characters per section)
chunk_size = 800
# Overlap between chunks
chunk_overlap = 150
# Number of results to retrieve
top_k = 5
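For a feel of how the first two settings interact, here is a minimal sliding-window chunker, assuming the simplest character-based strategy (the real implementation in extract_and_chunk.py may differ):

```python
# Sliding-window chunking: each chunk is chunk_size characters long and
# overlaps its neighbor by chunk_overlap characters.
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 150) -> list[str]:
    step = chunk_size - chunk_overlap  # with the defaults: 800 - 150 = 650
    return [
        text[start:start + chunk_size]
        for start in range(0, len(text), step)
        if text[start:start + chunk_size].strip()
    ]

# A 2,000-character document yields chunks starting at 0, 650, 1300, and 1950;
# the 150-character overlap keeps sentences from being cut off from all context.
```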
RAGgedy was built with privacy in mind:
- ✅ Everything runs locally on your machine
- ✅ No data sent to external services
- ✅ No API keys required
- ✅ Your documents stay private
# Model not found? Pull it again:
ollama pull phi3:mini
# Empty or stale answers? Rebuild the index:
python build_index.py rebuild docs/
# Use headless mode to bypass email prompt:
echo "" | streamlit run app.py --server.headless true
# Or set Streamlit config to skip:
mkdir -p ~/.streamlit
echo "[browser]" > ~/.streamlit/config.toml
echo "gatherUsageStats = false" >> ~/.streamlit/config.toml
- Try an even smaller model: tinyllama or gemma2:2b
- Reduce chunk retrieval count in the interface
- Make sure you have at least 8GB RAM available
RAGgedy is for the makers and tinkerers. If you've got ideas, improvements, or just want to make it better, contributions are welcome!
RAGgedy is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.
You are free to:
- ✅ Share and redistribute for non-commercial purposes
- ✅ Adapt, remix, and build upon the material for non-commercial purposes
- ✅ Use for personal projects, education, and research
Attribution required: Please credit Juan Santos (Imagiro) and link to this repository.
For commercial licensing, please contact juan@imagiro.com.
Built with love using:
- Ollama - Local LLM inference
- ChromaDB - Vector database
- Streamlit - Web interface
- SentenceTransformers - Text embeddings
Created with assistance from Claude AI by Anthropic
RAGgedy: Because sometimes you need an AI that's as curious about your documents as you are. 🧸