
🔎🤖 Natural Language to SQL Converter

This repository provides a system for retrieving data from a database using queries written in natural language.

It offers an open-source environment for experimenting with LLM-powered database search. It includes a server setup for running different models, sample code for integrating them into applications, and a Postgres service as the database.

For demonstration purposes, the Pokémon dataset was used.

Key features of the application include:

  • Querying the database using natural language input.
  • LLM-selected visualizations drawn from a set of predefined plot types.
  • Automatic fallback logic that picks a suitable plot when the LLM's choice fails (see the sketch after this list).
  • LLM-generated tooltips explaining different parts of the SQL query.
  • Natural language description of the database generated by the LLM.
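
The fallback behaviour could look roughly like the sketch below. This is a minimal illustration, not the project's actual implementation: the function name fallback_plot_type, the plot-type names, and the shape-based rules are assumptions made for this example.

import pandas as pd

def fallback_plot_type(df: pd.DataFrame) -> str:
    """Pick a sensible plot type from the shape of a query result.

    Hypothetical helper: the real project may use different rules
    and different plot-type names.
    """
    numeric = df.select_dtypes(include="number").columns
    categorical = df.select_dtypes(exclude="number").columns

    if len(categorical) == 1 and len(numeric) == 1:
        return "bar"        # one label column + one value column
    if len(numeric) >= 2:
        return "scatter"    # compare two numeric columns
    if len(numeric) == 1:
        return "histogram"  # distribution of a single numeric column
    return "table"          # nothing obviously plottable

# Example: one text column and one numeric column fall back to a bar chart.
result = pd.DataFrame({"type": ["fire", "water"], "count": [64, 112]})
print(fallback_plot_type(result))  # -> "bar"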

Screenshot: a generated query, its results, and the explanatory tooltip.

Screenshot: a generated plot.

🗂️ Project Structure

Diagram: communication between the project's services.

📚 Key Libraries

LLM Host:

  • flask

    Lightweight web framework to handle HTTP requests and route interactions with the LLM.

  • gunicorn

    WSGI HTTP server used to serve the Flask app in a production environment.

Prompt Execution:

  • llama-cpp-python

    Python bindings for running LLaMA models locally for prompt execution (see the combined usage sketch after this list).

Plot Rendering:

  • bokeh

    Interactive plotting library for rendering visualizations in the browser.

  • pandas

    Data manipulation library used for preparing datasets before visualization.

  • scipy

    Provides advanced math operations to support plot-related calculations.

  • squarify

    Used for generating treemap visualizations based on hierarchical data.

  • colorcet

    Color palettes used to enhance plot aesthetics and clarity.

Database Communication:

  • psycopg2-binary

    PostgreSQL adapter for Python used to send and retrieve data via SQL queries.
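
To make the roles of llama-cpp-python and psycopg2-binary concrete, here is a minimal, hypothetical sketch of the core loop: the model turns a question into SQL, and the result is fetched over a read-only connection. The prompt wording, the DB_NAME variable, and the connection settings are illustrative assumptions, not the project's actual code.

import os

import psycopg2
from llama_cpp import Llama

# Load the locally downloaded GGUF model (path taken from the installation steps below).
llm = Llama(model_path="app/backend/models/ggml-model-Q4_K.gguf", n_ctx=2048)

def question_to_sql(question: str, schema: str) -> str:
    """Ask the model for a single SQL statement answering the question."""
    prompt = (
        "You are a PostgreSQL expert. Given this schema:\n"
        f"{schema}\n"
        f"Write one SELECT statement answering: {question}\nSQL:"
    )
    completion = llm(prompt, max_tokens=256, stop=[";"], echo=False)
    return completion["choices"][0]["text"].strip() + ";"

def run_readonly(sql: str):
    """Execute the generated SQL over a read-only session."""
    conn = psycopg2.connect(
        dbname=os.environ.get("DB_NAME", "postgres"),  # DB_NAME is a hypothetical variable
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        host="localhost",
    )
    conn.set_session(readonly=True)  # mirrors the read-only advice in the Notes section
    with conn, conn.cursor() as cur:
        cur.execute(sql)
        return cur.fetchall()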

📦 Requirements

See the dependency files in the repository (the Pipfile used by pipenv and the Docker Compose configuration).

⚙️🔨 Installation and Usage

  1. Clone the repository

    git clone https://github.com/Luk-kar/Natural-Language-to-SQL-Converter.git
    cd Natural-Language-to-SQL-Converter
  2. Download the LLM model:

    wget -O app/backend/models/ggml-model-Q4_K.gguf https://huggingface.co/NousResearch/Nous-Capybara-7B-V1-GGUF/resolve/e6263e5fabbdcd2d682364c66ecf54b65f25aa39/ggml-model-Q4_K.gguf?download=true

    Or use any other llama.cpp-compatible GGUF model, such as a Mistral or DeepSeek variant.

  3. Configure environment variables
    Copy the example file and edit it with your credentials and model settings:

    cp .env.example .env
    # Open .env in your editor and adjust DB_USER, DB_PASSWORD, LLM_MODEL_NAME, etc.
  4. Start services with Docker Compose

    docker-compose up --build -d
    • PostgreSQL will initialize the pokemon table and load the CSV dataset.
    • The Flask-Gunicorn backend will be built and started on 0.0.0.0:${FLASK_RUN_PORT}.
  5. Verify everything is running

    # List running containers
    docker ps
    # You should see:
    #   postgres_nl2sql   postgres:15     Up ...   5432/tcp
    #   backend_nl2sql    your-backend    Up ...   5000/tcp
  6. Access the application
    Open your browser and navigate to:

    http://localhost:${FLASK_RUN_PORT}
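
If you prefer the terminal, a quick reachability check works as well (this assumes curl is installed and FLASK_RUN_PORT is exported in your shell, matching the .env value):

curl -I http://localhost:${FLASK_RUN_PORT}
# A successful response (e.g. HTTP 200) indicates the backend is up.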
    

🔧 Configuration

All settings are loaded from the .env file. Be sure to set FLASK_SECRET_KEY before deploying to production.
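
A minimal .env might look like the following. Only variables mentioned elsewhere in this README are shown; your .env.example may define more, and every value here is a placeholder:

DB_USER=postgres
DB_PASSWORD=change-me
LLM_MODEL_NAME=ggml-model-Q4_K.gguf
FLASK_RUN_PORT=5000

# Required for production deployments:
FLASK_SECRET_KEY=replace-with-a-long-random-string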

✅ Testing

To run the test suite, activate the pipenv environment and use unittest discovery:

pipenv shell
python -m unittest discover
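
For more verbose output, or to run a single module, unittest accepts the usual flags (the module path below is hypothetical; substitute one from the repository's test directory):

python -m unittest discover -v
python -m unittest tests.test_example -v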

💡 Notes

  • LLMs demand significant compute resources; make sure your hardware can handle the inference workload.
  • Treat this tool as an exploratory aid: models may hallucinate or lack up‑to‑date context.
  • Do not grant LLM agents permissions beyond read-only database access (a sample read-only role setup follows).
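
One way to enforce that last point on the PostgreSQL side is a dedicated read-only role, sketched below. The role and database names are placeholders, not values used by this project:

-- Hypothetical read-only role for the application:
CREATE ROLE nl2sql_readonly LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE pokemon_db TO nl2sql_readonly;
GRANT USAGE ON SCHEMA public TO nl2sql_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO nl2sql_readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO nl2sql_readonly;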

🧩 Contributing

This project is a starting point—many improvements can be implemented:

  • Testing: Add end-to-end and unit tests for the native JavaScript frontend.
  • Frontend Frameworks: Explore integrating React, Vue, or similar for better maintainability.
  • Backend Optimization: Profile and reduce CPU/memory usage during LLM inference and database queries.
  • Production Hardening: Improve security, error handling, and user experience for real-world deployments.
  • Streamlined Setup: Automate and simplify installation, configuration, and deployment.

Acknowledgments 👍

  • Special thanks to Larry Greski for curating and sharing the Pokémon dataset, which was used for demonstration and development purposes in this project.

📜 License

This project is licensed under the MIT License. See the LICENSE file for more details.
