added sql based chatbot using groq #83

Open
wants to merge 1 commit into
base: main
300 changes: 300 additions & 0 deletions tutorials/sql_chatbot_using_groq.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,300 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"# 📘 SQL-Based Chatbot using Groq API\n",
"## This notebook demonstrates how to build a SQL-based answering chatbot using the Groq API.\n",
"## The chatbot interprets natural language questions, converts them into SQL queries, executes them against a local SQLite database and returns conversational answers. It leverages the power of Groq's LLaMA-3.1 models for fast and reliable responses.\n",
"## This is useful for turning structured database data into accessible, natural language insights."
],
"metadata": {
"id": "sR99e30qUuGN"
}
},
{
"cell_type": "markdown",
"source": [
"# 🔧 Installing Required Libraries\n",
"## We begin by installing the required Python libraries: `groq` and `sqlite3` (built-in in Python)."
],
"metadata": {
"id": "qT9EN_-HVeqi"
}
},
{
"cell_type": "code",
"source": [
"!pip install -q groq\n"
],
"metadata": {
"id": "0c8KKGecUylx"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 🔐 Set Groq API Key\n",
"## Directly paste your Groq API key below.\n",
"## If you don't have one, get it from the [Groq Console](https://console.groq.com/keys).\n",
"## If you don't already have an account with GroqCloud, you can create one for free.\n",
"## ⚠️ NOTE: Never share this notebook publicly with your API key still in it.\n"
],
"metadata": {
"id": "YPvCs3d3VlTv"
}
},
{
"cell_type": "code",
"source": [
"GROQ_API_KEY = \"your-groq-api-key-here\" # 🔁 Replace this with your actual API key\n",
"\n",
"from groq import Groq\n",
"\n",
"# Initialize the Groq client\n",
"client = Groq(api_key=GROQ_API_KEY)\n"
],
"metadata": {
"id": "RGPKkQQ3VVsv"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 🗃️ Create a Sample SQLite Database\n",
"## We'll create an in-memory SQLite database with a `students` table."
],
"metadata": {
"id": "9na4o1h6Wfrb"
}
},
{
"cell_type": "code",
"source": [
"import sqlite3\n",
"\n",
"conn = sqlite3.connect(\":memory:\") # Temporary in-memory database\n",
"cursor = conn.cursor()\n",
"\n",
"# Create table\n",
"cursor.execute(\"\"\"\n",
"CREATE TABLE students (\n",
" id INTEGER PRIMARY KEY,\n",
" name TEXT NOT NULL,\n",
" age INTEGER,\n",
" grade TEXT\n",
")\n",
"\"\"\")\n",
"\n",
"# Insert some sample rows\n",
"students = [\n",
" (1, \"Alice\", 20, \"A\"),\n",
" (2, \"Bob\", 21, \"B\"),\n",
" (3, \"Charlie\", 19, \"A\"),\n",
" (4, \"David\", 22, \"C\"),\n",
" (5, \"Eva\", 20, \"B\")\n",
"]\n",
"cursor.executemany(\"INSERT INTO students VALUES (?, ?, ?, ?)\", students)\n",
"conn.commit()\n"
],
"metadata": {
"id": "5OZGVG3-WdRW"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 📑 Define a Function to Convert Natural Language to SQL\n",
"## This function calls the Groq LLM to turn user questions into valid SQL queries.\n"
],
"metadata": {
"id": "mwTANnJCWqWe"
}
},
{
"cell_type": "code",
"source": [
"def generate_sql_query(question: str, table_schema: str) -> str:\n",
" system_prompt = f\"\"\"\n",
"You are a data analyst assistant.\n",
"Convert the user's natural language question into a syntactically correct SQLite query using the schema below.\n",
"Respond ONLY with the SQL query (no comments, no explanation).\n",
"\n",
"Schema:\n",
"{table_schema}\n",
"\"\"\"\n",
"\n",
" messages = [\n",
" {\"role\": \"system\", \"content\": system_prompt},\n",
" {\"role\": \"user\", \"content\": question}\n",
" ]\n",
"\n",
" response = client.chat.completions.create(\n",
" model=\"llama-3-8b-instruct\",\n",
" messages=messages,\n",
" temperature=0.2,\n",
" max_tokens=150,\n",
" )\n",
"\n",
" return response.choices[0].message.content.strip()\n"
],
"metadata": {
"id": "QiQcYAowWnh6"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 🔍 Extract Schema from the Database\n",
"## This helper function dynamically fetches table column information to inform the model.\n"
],
"metadata": {
"id": "33PXfYsDW8Fv"
}
},
{
"cell_type": "code",
"source": [
"def get_table_schema(connection, table_name=\"students\"):\n",
" cursor = connection.cursor()\n",
" cursor.execute(f\"PRAGMA table_info({table_name});\")\n",
" schema = cursor.fetchall()\n",
" formatted_schema = f\"Table: {table_name}\\nColumns:\\n\"\n",
" for col in schema:\n",
" formatted_schema += f\" - {col[1]} ({col[2]})\\n\"\n",
" return formatted_schema\n",
"\n",
"schema = get_table_schema(conn)\n",
"print(schema)"
],
"metadata": {
"id": "Ah0IT2ELW6ml"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 🧪 Try the SQL Generator\n",
"## We test our `generate_sql_query` function with a simple question.\n"
],
"metadata": {
"id": "tG1cJ9x4XK_1"
}
},
{
"cell_type": "code",
"source": [
"question = \"Which students have grade A?\"\n",
"sql_query = generate_sql_query(question, schema)\n",
"print(\"Generated SQL:\\n\", sql_query)"
],
"metadata": {
"id": "kvOwtv0tXNdU"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# ▶️ Execute SQL on SQLite\n",
"## This function runs the generated SQL query and returns the results.\n"
],
"metadata": {
"id": "3dt0-JcpXRwv"
}
},
{
"cell_type": "code",
"source": [
"def execute_sql_query(connection, query: str):\n",
" try:\n",
" cursor = connection.cursor()\n",
" cursor.execute(query)\n",
" return cursor.fetchall()\n",
" except sqlite3.Error as e:\n",
" return f\"SQL error: {e}\"\n",
"\n",
"results = execute_sql_query(conn, sql_query)\n",
"print(\"Results:\\n\", results)"
],
"metadata": {
"id": "N7PpZuRUXVSR"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 🤖 Full Chatbot Pipeline\n",
"## This wrapper puts everything together: schema → prompt → SQL → result."
],
"metadata": {
"id": "Yu-gxs0BXXvC"
}
},
{
"cell_type": "code",
"source": [
"def chatbot_answer(question: str) -> str:\n",
" schema = get_table_schema(conn)\n",
" sql = generate_sql_query(question, schema)\n",
" result = execute_sql_query(conn, sql)\n",
" return f\"Generated SQL:\\n{sql}\\n\\nAnswer:\\n{result}\"\n",
"\n",
"# Try it out\n",
"print(chatbot_answer(\"List the names of students who are 20 years old.\"))"
],
"metadata": {
"id": "wLwr-SSuXaQE"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# 📙 Conclusion\n",
"## In this notebook, we built a simple yet powerful SQL-aware chatbot using Groq's LLaMA models and SQLite.\n",
"\n",
"## ✅ Highlights:\n",
"## - We used LLMs to translate natural language into SQL.\n",
"## - Extracted schema dynamically and executed generated queries.\n",
"## - Used Groq's blazing-fast inference for near-instant interaction.\n",
"\n",
"## 🚀 Next Steps:\n",
"## - Connect this to real-world databases (PostgreSQL, MySQL).\n",
"## - Extend support to multiple tables and JOINs.\n",
"## - Add a front-end using Streamlit or Gradio."
],
"metadata": {
"id": "kq9lYQ9ZXge4"
}
}
]
}
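
A possible hardening for the execution step, since `execute_sql_query` runs whatever text the model returns: chat models sometimes wrap SQL in Markdown code fences, and a misread question could yield a statement that modifies data. The sketch below is not part of the notebook. `clean_sql`, `is_read_only`, and `safe_execute` are hypothetical helper names, and the SELECT-only rule is an assumed policy that is deliberately conservative (it also rejects `WITH` queries and any query with a semicolon inside a string literal).

```python
import re
import sqlite3

# Built at runtime so this example contains no literal Markdown fence.
FENCE = "`" * 3


def clean_sql(raw: str) -> str:
    """Strip an optional Markdown code fence and surrounding whitespace."""
    text = raw.strip()
    # Drop an opening fence line such as the fence followed by "sql".
    text = re.sub(r"^" + FENCE + r"[a-zA-Z]*[ \t]*\n", "", text)
    # Drop a closing fence at the very end of the text.
    text = re.sub(r"\s*" + FENCE + r"$", "", text)
    return text.strip()


def is_read_only(query: str) -> bool:
    """Return True only for a single statement that starts with SELECT.

    Assumed policy (hypothetical): anything else, including multiple
    statements separated by semicolons, is rejected.
    """
    body = query.strip().rstrip(";").strip()
    return ";" not in body and body.lower().startswith("select")


def safe_execute(connection: sqlite3.Connection, raw_query: str):
    """Clean the model output, then run it only if it passes the check."""
    query = clean_sql(raw_query)
    if not is_read_only(query):
        raise ValueError(f"Refusing to run non-SELECT statement: {query!r}")
    return connection.execute(query).fetchall()
```

In the notebook, `chatbot_answer` could call `safe_execute(conn, sql)` in place of `execute_sql_query(conn, sql)` and catch `ValueError` alongside `sqlite3.Error`. A string check like this is only a first line of defense; for a real database, a read-only connection or a restricted database user would be the stronger control.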