🧸 RAGgedy - Your Local AI Assistant

RAGgedy is for curious minds who want to chat with their documents without sending data to the cloud. It's a fully local RAG (Retrieval-Augmented Generation) system that runs entirely on your machine.

Created by Juan Santos at MS2-Lab

✨ What Makes RAGgedy Special

  • 🏠 Completely Local - Your documents never leave your computer
  • 🧸 Friendly Interface - Clean, intuitive web interface built with Streamlit
  • 🎯 Customizable Personality - Edit the AI's system prompt to match your style
  • 📚 Smart Document Processing - Handles PDFs with robust error handling
  • 🚀 Real-time Progress - See exactly what's happening during indexing
  • 🔍 Powerful Search - Retrieve up to 50 document chunks for comprehensive answers

💻 System Requirements

  • RAM: 8GB minimum (4GB available for the AI model)
  • Storage: 10GB free space
  • OS: macOS, Linux, or Windows
  • Python: 3.8 or higher

RAGgedy is designed to run on typical laptops and desktops - no GPU required!

🚀 Quick Start

1. Install the Basics

# Install Ollama (your local LLM server)
brew install ollama  # macOS
# or visit https://ollama.com for other platforms

# Download a language model (this will take a few minutes)
ollama pull phi3:mini

2. Install RAGgedy

Choose one of these installation methods:

Option A: Install from GitHub (Recommended)

# Clone the repository
git clone https://github.com/juan-ms2lab/RAGgedy.git
cd RAGgedy

# Install dependencies
pip install -r requirements.txt

# Or install as a package
pip install -e .

Option B: Install with pip (when available)

pip install raggedy

3. Set Up Your Documents

# Add your documents to the docs folder  
mkdir -p docs
# Copy your PDF or TXT files to docs/

# Build the search index
python build_index.py rebuild docs/

# Launch RAGgedy
python start_raggedy.py

# Alternative manual launch:
streamlit run app.py

# If Streamlit asks for email, use headless mode:
echo "" | streamlit run app.py --server.headless true

4. Start Chatting!

Open your browser to http://localhost:8501 and start asking questions about your documents!

🎯 Features

Smart Document Chat

Ask questions about your documents and get answers with source citations. RAGgedy will find relevant passages and use them to provide informed responses.
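Under the hood, a response comes from two steps: retrieve the most relevant chunks, then hand them to the local model together with the question. The sketch below illustrates that loop with a toy word-overlap scorer standing in for the real vector search; the function names are illustrative, not RAGgedy's actual API.

```python
import re

def words(text):
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, chunks, top_k=5):
    """Rank chunks by word overlap with the question (stand-in for vector search)."""
    q = words(question)
    scored = sorted(chunks, key=lambda c: len(q & words(c)), reverse=True)
    return scored[:top_k]

def build_prompt(question, context):
    """Assemble the prompt for the local model, numbering sources for citation."""
    sources = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return f"Answer using only these sources:\n\n{sources}\n\nQuestion: {question}"

chunks = [
    "Machine learning is a field of AI that learns patterns from data.",
    "The recipe calls for two cups of flour.",
]
context = retrieve("What is machine learning?", chunks, top_k=1)
prompt = build_prompt("What is machine learning?", context)
```

The numbered `[1]`, `[2]` source markers are what lets the model cite the passages it drew from.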

Customizable AI Personality

Don't like how RAGgedy responds? Click "🎯 Customize System Prompt" and change how it talks:

  • Make it more formal or casual
  • Add domain expertise
  • Change the response style
  • Set specific constraints
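Mechanically, the system prompt is just the first message in the conversation sent to the model, so customizing it means swapping one string. The message format below follows the common Ollama/OpenAI chat convention; the names are illustrative, not app.py's actual code.

```python
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant. Answer only from the provided sources."

def make_messages(question, system_prompt=DEFAULT_SYSTEM_PROMPT):
    # The system message sets persona and constraints;
    # the user message carries the actual question.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]

# Changing the personality is just a different system string:
messages = make_messages(
    "Summarize chapter 2.",
    system_prompt="You are a formal, precise research assistant.",
)
```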

Real-time Indexing

Watch your documents get processed in real-time with progress bars and status updates. No more wondering if it's working!

Multiple Models

Choose from different AI models based on your needs:

  • phi3:mini - Best balance of performance & size (default, ~4GB)
  • tinyllama - Ultra-fast, minimal resources (~2GB)
  • gemma2:2b - Google's efficient model (~3GB)
  • llama2:7b - More capable but larger (~4.8GB)
  • gpt-oss:20b - Most capable but requires 16GB+ RAM

πŸ“ Project Structure

RAGgedy/
├── 🧸 app.py                 # Main web interface
├── 📄 extract_and_chunk.py   # Document processing magic
├── 🗃️ build_index.py         # Vector database management
├── 📚 docs/                  # Put your documents here
├── 🔍 db/                    # Search index (auto-created)
└── 📖 README.md              # You are here

πŸ› οΈ Advanced Usage

Command Line Tools

If you installed RAGgedy as a package, you can use these convenient commands:

# Launch RAGgedy with smart startup (recommended)
raggedy

# Process documents and build index
raggedy-build-index rebuild docs/

# Query from command line  
raggedy-build-index query "What is machine learning?"

# Check system stats
raggedy-build-index stats

# Clear the index
raggedy-build-index clear

# Extract and process documents
raggedy-extract docs/

Or use the Python scripts directly:

python start_raggedy.py            # Smart launcher
python build_index.py rebuild docs/
python build_index.py query "What is machine learning?"
python build_index.py stats
python build_index.py clear
python extract_and_chunk.py docs/

Customization

Want to tweak how documents are processed? Edit the settings in the Python files:

# Chunk size (characters per section)
chunk_size = 800

# Overlap between chunks
chunk_overlap = 150

# Number of results to retrieve
top_k = 5
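With those settings, chunking is a sliding window: each 800-character chunk starts 650 characters (chunk_size - chunk_overlap) after the previous one, so consecutive chunks share 150 characters of context. A minimal sketch of that behavior (the real extract_and_chunk.py may also split on sentence or page boundaries):

```python
def chunk_text(text, chunk_size=800, chunk_overlap=150):
    # Slide a chunk_size window over the text, advancing by
    # chunk_size - chunk_overlap each step so neighbors overlap.
    step = chunk_size - chunk_overlap
    return [
        text[i:i + chunk_size]
        for i in range(0, max(len(text) - chunk_overlap, 1), step)
    ]

text = "abcdefghij" * 200    # 2,000 characters of sample text
chunks = chunk_text(text)    # windows starting at 0, 650, 1300
```

The overlap is what keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.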

🔒 Privacy First

RAGgedy was built with privacy in mind:

  • ✅ Everything runs locally on your machine
  • ✅ No data sent to external services
  • ✅ No API keys required
  • ✅ Your documents stay private

πŸ› Troubleshooting

"Model not found" error

ollama pull phi3:mini

"No vector database" error

python build_index.py rebuild docs/

Streamlit asks for email on startup

# Use headless mode to bypass email prompt:
echo "" | streamlit run app.py --server.headless true

# Or set Streamlit config to skip:
mkdir -p ~/.streamlit
echo "[browser]" > ~/.streamlit/config.toml
echo "gatherUsageStats = false" >> ~/.streamlit/config.toml

System running slow?

  • Try an even smaller model: tinyllama or gemma2:2b
  • Reduce chunk retrieval count in the interface
  • Make sure you have at least 8GB RAM available

🤝 Contributing

RAGgedy is for makers and tinkerers. If you have ideas or improvements, contributions are welcome!

📄 License

RAGgedy is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.

You are free to:

  • ✅ Share and redistribute for non-commercial purposes
  • ✅ Adapt, remix, and build upon the material for non-commercial purposes
  • ✅ Use for personal projects, education, and research

Attribution required: Please credit Juan Santos (Imagiro) and link to this repository.

For commercial licensing, please contact juan@imagiro.com.

πŸ’ Acknowledgments

Built with love using:

Created with assistance from Claude AI by Anthropic


RAGgedy: Because sometimes you need an AI that's as curious about your documents as you are. 🧸
