A Retrieval-Augmented Generation (RAG) system built with LangChain that combines local document processing with both local and cloud-based LLM support.
```mermaid
graph TD
    %% Input Documents
    A[PDF Documents] --> B[Document Loader]
    %% Document Processing
    B --> C[Text Splitter]
    C --> D[Vector Store<br>ChromaDB]
    %% Query Flow
    E[User Query] --> F[Query Processing]
    D --> F
    F --> G[Embedding Model<br>nomic-embed-text]
    G --> H[Similar Documents]
    %% LLM Processing
    H --> I[LLM Processing]
    I --> J[Generated Response]
    %% LLM Options with styling
    subgraph LLM_Models[Available LLM Models]
        direction LR
        K[Local Model:<br>deepseek-r1:1.5b]
        L[Cloud Model:<br>Gemini-1.5-flash]
    end
    I -.-> LLM_Models
    %% Styling
    classDef primary fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef secondary fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    classDef storage fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    class A,E primary
    class B,C,F,G,H,I,J secondary
    class D storage
    class K,L primary
```
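The query path in the diagram (embed the query, find similar documents, pass them to the LLM) can be illustrated with a toy cosine-similarity retriever. This is plain Python with hard-coded 3-dimensional vectors, not the project's actual code: in the real system, the embeddings come from nomic-embed-text and the store is ChromaDB.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "vector store": text chunk -> pre-computed embedding
# (real embeddings would come from nomic-embed-text via Ollama)
store = {
    "Chroma persists embeddings on disk.": [0.9, 0.1, 0.0],
    "Ollama serves local models.":         [0.1, 0.9, 0.1],
    "Gemini is a cloud LLM.":              [0.0, 0.2, 0.9],
}

def retrieve(query_embedding, k=2):
    # Rank stored chunks by similarity to the query embedding,
    # return the top-k as context for the LLM
    ranked = sorted(store, key=lambda t: cosine(store[t], query_embedding),
                    reverse=True)
    return ranked[:k]

# The query embedding below is a stand-in for an embedded user question
docs = retrieve([0.85, 0.15, 0.05])
```

The top-k chunks returned here are what the diagram labels "Similar Documents"; they are then injected into the LLM prompt.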
- `chroma/` - Vector database storage
- `data/` - PDF documents directory
- `create_database.py` - Database creation script
- `query_data.py` - Handles RAG queries and LLM interactions
- `get_embedding_functions.py` - Embedding model configuration
- `constants.py` - Imports `GEMINI_API_KEY` from `.env` file
- `requirements.txt` - Project dependencies
- Python 3.8+
- LangChain
- LLMs
  - Local Model: deepseek-r1:1.5b
    ```shell
    ollama pull deepseek-r1:1.5b
    ```
  - Cloud Model: Google's Gemini-1.5-flash
- Embedding
  - Local Embedding Model: nomic-embed-text
    ```shell
    ollama pull nomic-embed-text
    ```
- ChromaDB
- Clone the git repo and change into the project directory using the commands below:
  ```shell
  git clone https://github.com/usman619/RAG_langchain_v2.git
  cd RAG_langchain_v2
  code .
  ```
- Create and activate a Python virtual environment using the commands below:
  ```shell
  python -m venv venv
  source venv/bin/activate
  ```
- Install the required packages using the command below:
  ```shell
  pip install -r requirements.txt
  ```
- If you are using the Gemini API, set up your environment variable:
  ```shell
  echo "GEMINI_API_KEY=your_api_key_here" > .env
  ```
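`constants.py` reads `GEMINI_API_KEY` from this `.env` file. A minimal stdlib sketch of that behavior is shown below; the actual project may well use a helper such as python-dotenv instead, so treat the parsing code as an assumption.

```python
import os

def load_env(path=".env"):
    # Parse simple KEY=value lines from a .env file into os.environ.
    # Silently does nothing if the file is missing.
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                # Don't clobber variables already set in the environment
                os.environ.setdefault(key.strip(), value.strip().strip('"'))

load_env()
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
```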
- Create the vector database:
  ```shell
  python create_database.py
  # To reset your Chroma database:
  python create_database.py --reset
  ```
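Internally, `create_database.py` splits the loaded PDFs into overlapping chunks before embedding them into ChromaDB. The simplified fixed-size splitter below illustrates the idea; the project itself presumably uses a LangChain splitter (e.g. `RecursiveCharacterTextSplitter`), so the exact chunking logic here is an illustrative assumption.

```python
def split_text(text, chunk_size=100, overlap=20):
    # Slide a fixed-size window over the text with some overlap,
    # so context at chunk boundaries is shared between adjacent chunks.
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# Demo on a 250-character repeating alphabet string
demo = "".join(chr(65 + i % 26) for i in range(250))
chunks = split_text(demo, chunk_size=100, overlap=20)
```

Each chunk is then embedded with nomic-embed-text and stored in `chroma/`.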
- Run a query using the command below:
  ```shell
  python query_data.py "Your question here?"
  ```
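Under the hood, the query script stuffs the retrieved chunks into a prompt before calling the chosen LLM (deepseek-r1:1.5b locally or Gemini-1.5-flash in the cloud). A minimal sketch of that assembly step follows; the template wording and function name are illustrative, not the project's actual code.

```python
# Hypothetical prompt template in the usual RAG style:
# retrieved context first, then the user's question.
PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
"""

def build_prompt(question, retrieved_chunks):
    # Join the retrieved chunks into a single context block,
    # then fill in the template
    context = "\n---\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt("What stores the vectors?",
                      ["ChromaDB stores embeddings.",
                       "Ollama serves local models."])
```

The resulting string is what gets sent to the LLM, which generates the final response shown to the user.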