NL-to-SQL / Data Intelligence Live

QueryNLQ

Ask your database questions in plain English.

Python · Pydantic-AI · Pydantic-Graph · Django · Milvus · Azure OpenAI · Celery · Redis · PostgreSQL
QueryNLQ interface showing natural language query converted to SQL

A 14-agent AI pipeline that turns natural language into production-grade SQL — against databases with 500+ tables. Not a ChatGPT wrapper. A self-correcting, type-safe agentic system with vector-based schema discovery, multi-stage verification, and real-time progress tracking.

See It Work

User asks:

“What were our top 10 customers by revenue last quarter?”

QueryNLQ processes through 14 specialized agents:

  1. Classifies intent → complex SQL query
  2. Plans approach → join orders + customers, aggregate revenue, filter by date range
  3. Searches 500+ table schema via Milvus → finds orders, customers, invoices tables
  4. Generates SQL with pruned schema context
  5. Verifies by executing in read-only transaction → 10 rows, no errors
  6. Reviews for logical correctness and performance
  7. Returns results + the SQL for full transparency

SELECT c.name, SUM(i.amount) AS total_revenue
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN invoices i ON o.id = i.order_id
WHERE o.order_date >= '2025-10-01' AND o.order_date < '2026-01-01'
GROUP BY c.name
ORDER BY total_revenue DESC
LIMIT 10;

Time: under 3 seconds for simple queries. 15-30 seconds for complex multi-table analysis.

Follow-up: “Now filter that by the Asia region” → QueryNLQ maintains conversation context and modifies the query.

Complex analytical queries that work:

  • “Run a cohort analysis — group customers by signup month and show retention over 6 months” → multi-CTE query with date bucketing, self-joins, and pivot logic
  • “Analyze sentiment trends across support tickets by product category over the last quarter” → aggregation across multiple tables with text-based classification and time-series grouping

These aren’t cherry-picked demos. They represent the kind of multi-step analytical queries that typically require a data analyst writing custom SQL — handled end-to-end by the agent pipeline.

Results

  • 90%+ query accuracy on production schemas with 500+ tables
  • Sub-3s response for simple queries via the quick-pass path
  • Self-correcting — verification + review loops catch and fix SQL errors before execution
  • Zero unauthorized data access — read-only transactions, column-level permissions at the validation layer
  • Real-time progress — users see which agent is working via WebSocket updates

The Problem

Business users drown in data they can’t access. At one MSME finance company, every ad-hoc question — “What’s our receivables aging by region?” or “Which customers haven’t ordered in 90 days?” — required a developer, a Slack thread, and a 2-day turnaround. Finance teams needed answers in minutes, not days.

The problem gets harder at enterprise scale. Databases with 500+ tables, cryptic column names like cust_acct_xref_id, and complex relationships across schemas. No single LLM prompt can hold 500 table DDLs. Single-prompt NL-to-SQL tools break at 20-30 tables.

Architecture: 14-Agent Query Pipeline

The query pipeline is orchestrated by Pydantic-Graph — a stateful directed graph where each node is a specialized Pydantic-AI agent with its own system prompt, output schema, and validation rules.
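The node-and-state pattern can be sketched with plain dataclasses. This is a minimal stand-in, not Pydantic-Graph's actual API (which uses BaseNode subclasses, a typed GraphRunContext, and an End sentinel); the class and field names here are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PipelineState:
    """Shared state threaded through every node (illustrative fields)."""
    query: str
    intent: Optional[str] = None
    sql: Optional[str] = None
    log: List[str] = field(default_factory=list)

class ClassifyIntent:
    def run(self, state: PipelineState):
        state.log.append("classify")
        # A real agent call decides this; a keyword check stands in here.
        state.intent = "complex_sql" if "join" in state.query else "simple_sql"
        return GenerateSQL()

class GenerateSQL:
    def run(self, state: PipelineState):
        state.log.append("generate")
        state.sql = "SELECT 1;"  # a Pydantic-AI agent call would go here
        return None  # end of graph

def run_graph(start, state: PipelineState) -> PipelineState:
    node = start
    while node is not None:  # each node decides the next hop
        node = node.run(state)
    return state
```

The essential idea carries over: every node receives typed shared state and returns the next node, so routing decisions live in the nodes themselves rather than in a central dispatcher.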

Entry: Intent Classification

The UserInteraction agent classifies the query into six categories: general knowledge, simple SQL, complex SQL, data analysis, schema metadata, or needs clarification. A “quick pass” handles simple queries inline in under 3 seconds; complex queries are dispatched to Celery workers for async processing.
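The routing decision can be sketched as follows. Which of the six categories take the quick pass is an assumption here; the source only states that simple queries run inline.

```python
from enum import Enum

class Intent(Enum):
    GENERAL = "general_knowledge"
    SIMPLE_SQL = "simple_sql"
    COMPLEX_SQL = "complex_sql"
    DATA_ANALYSIS = "data_analysis"
    SCHEMA_METADATA = "schema_metadata"
    NEEDS_CLARIFICATION = "needs_clarification"

# Assumption: cheap intents run inline; heavy ones go to Celery workers.
QUICK_PASS = {Intent.GENERAL, Intent.SIMPLE_SQL,
              Intent.SCHEMA_METADATA, Intent.NEEDS_CLARIFICATION}

def dispatch(intent: Intent) -> str:
    """Return where the query is processed: inline quick pass or async worker."""
    return "inline" if intent in QUICK_PASS else "celery"
```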

Planning: Approach → Execution → Tasks

Three planning agents decompose complex queries. The ApproachPlanner generates a numbered strategy. The ExecutionPlanner converts it into concrete SQL steps. The TaskPlanner breaks it into discrete, executable tasks — each becomes an independent SQL generation cycle.

Discovery: Semantic Search → Schema Retrieval → Table Selection

For each task, the SemanticSearch agent queries Milvus with an embedding of the enriched query. It returns the top-k most relevant tables from the 500+ table index. The SchemaRetrieval agent fetches their full JSON schemas. If there’s ambiguity, TableSelection asks the user to confirm.

Generation: SQL → Verification → Review (self-correcting loop)

The SQLGeneration agent produces SQL using the pruned schema, task instructions, and conversation history. SQLVerification executes it in a read-only transaction to check for errors and anomalies. SQLReviewer validates logical correctness and performance. If either finds issues, the query loops back to SQLGeneration with specific feedback — a self-correcting cycle that runs until the SQL passes all checks.
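The control flow of that self-correcting cycle can be sketched as below. The function names and the feedback shape are hypothetical; the real agents pass structured Pydantic outputs rather than tuples.

```python
def generate_until_valid(generate, verify, review, max_rounds=3):
    """Generate -> verify -> review loop. Feedback from a failing check is
    fed back into the next generation round (hypothetical signatures:
    each check returns (ok, feedback))."""
    feedback = None
    for _ in range(max_rounds):
        sql = generate(feedback)
        for check in (verify, review):
            ok, feedback = check(sql)
            if not ok:
                break  # re-generate with this check's feedback
        else:
            return sql  # both checks passed
    raise RuntimeError("SQL did not pass checks; ask the user to rephrase")
```

A bounded round count matters: combined with loop detection (below in the source), it guarantees the pipeline terminates even when the model keeps producing flawed SQL.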

Safety: Error Handling & Loop Detection

An ErrorHandler agent analyzes failures and routes to recovery. A LoopDetectionManager tracks per-agent visit counts — if an agent is visited too many times, the system breaks the loop and asks the user to rephrase. No infinite cycles.
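The visit-count mechanism is simple to state precisely. This is a simplified sketch; the threshold value and method names are assumptions.

```python
from collections import Counter

class LoopDetectionManager:
    """Tracks how many times each agent has been visited in one query run."""

    def __init__(self, max_visits: int = 5):
        self.max_visits = max_visits
        self.visits = Counter()

    def visit(self, agent: str) -> bool:
        """Record a visit; return False once the threshold is exceeded,
        signalling the graph to break the loop and ask the user to rephrase."""
        self.visits[agent] += 1
        return self.visits[agent] <= self.max_visits
```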

Architecture: Schema Embedding Pipeline

Before a single query can be answered, the database must be indexed. This is the pipeline that makes QueryNLQ work at enterprise scale.

1. Extract — SQLAlchemy’s inspector extracts every table, column, type, constraint, index, and foreign key. A ThreadPoolExecutor (10 workers) processes tables in parallel. Row counts and sample values are collected for context.
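The parallel-extraction step can be sketched with the standard library; in QueryNLQ the worker body would call SQLAlchemy's inspector (get_columns, get_foreign_keys, and so on), which is stubbed out here.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_table(name: str) -> dict:
    # Stand-in for SQLAlchemy inspector calls plus row-count and
    # sample-value collection.
    return {"table": name, "columns": [], "row_count": 0}

def extract_schema(table_names, workers: int = 10):
    """Extract all table metadata with a fixed-size thread pool.
    pool.map preserves input order, so results line up with table_names."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_table, table_names))
```

Threads (rather than processes) fit here because the work is I/O-bound: each worker spends most of its time waiting on database round-trips.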

2. Describe — A Pydantic-AI agent generates natural-language descriptions of every table and column — using column names, types, sample values, and relationships as context. Runs as Celery batch tasks with dynamic sizing: the system monitors active tasks and adjusts batch size to stay within concurrent LLM call limits. Memory is monitored per batch to prevent OOM.

3. Enrich — Each table and column gets an “enriched embedding text” — a structured summary including description, expanded types, sample values, relationships, indexes, purpose, and synonyms. This enriched text is what gets embedded, not just the raw column name.
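A sketch of what that enriched text might look like for a single column; the field names and exact layout are assumptions, but the principle is the source's: embed a rich summary, not the raw column name.

```python
def enrich_column(meta: dict) -> str:
    """Build the text that gets embedded for one column
    (field names here are illustrative)."""
    parts = [
        "Column {}.{} ({})".format(meta["table"], meta["name"], meta["type"]),
        "Description: " + meta["description"],
    ]
    if meta.get("samples"):
        parts.append("Sample values: " + ", ".join(map(str, meta["samples"])))
    if meta.get("synonyms"):
        parts.append("Also known as: " + ", ".join(meta["synonyms"]))
    return "\n".join(parts)
```

This is what lets a query like “customer account” match a cryptically named column such as cust_acct_xref_id: the synonym and description text carry the semantics the raw name lacks.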

4. Embed — Azure OpenAI’s text-embedding-3-large (3072 dimensions) generates vectors with token-aware batching. Embeddings are unit-normalized for inner product similarity.
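Unit normalization is what makes the inner-product metric behave as cosine similarity; the math is small enough to show in full:

```python
import math

def normalize(vec):
    """Scale a vector to unit length, so that the inner product of two
    normalized vectors equals their cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))
```

Normalizing at index time (rather than per query) lets Milvus use the cheaper inner-product distance while still ranking by angle, not magnitude.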

5. Store — Two Milvus collections (tables + columns) with HNSW indexing. When the schema changes, the system diffs live vs. stored, deletes orphans, and upserts new entries — no full re-index needed.
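The incremental-sync diff reduces to comparing two maps of table name to schema fingerprint. A minimal sketch (the hash-per-table representation is an assumption):

```python
def diff_schema(live: dict, stored: dict):
    """live and stored map table name -> schema fingerprint (e.g. a hash
    of the DDL). Returns (to_delete, to_upsert): orphans to remove from
    Milvus, and new or changed tables to re-embed and upsert."""
    to_delete = sorted(set(stored) - set(live))
    to_upsert = sorted(
        name for name, fingerprint in live.items()
        if stored.get(name) != fingerprint
    )
    return to_delete, to_upsert
```

Only the tables in to_upsert need fresh descriptions and embeddings, which is why a schema change doesn't trigger a full re-index.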

Key Technical Decisions

  • Pydantic-Graph, not LangGraph — Type-safe node definitions with Pydantic output schemas enforced at every stage. Each agent has explicit input/output contracts, not free-form string passing.

  • 14 specialized agents, not one mega-prompt — A single prompt with the full schema and all instructions hits context limits and produces unreliable SQL. Specialized agents with focused prompts and constrained outputs are more reliable and auditable.

  • Milvus vector search, not prompt stuffing — Stuffing 500 table DDLs into a prompt is impossible. Semantic search over embedded schema descriptions returns the 10 most relevant tables per query, keeping the SQL generation prompt focused.

  • sqlglot + SQLAlchemy for validation, not LLM self-check — SQL is validated by parsing (sqlglot), executing in a read-only transaction (SQLAlchemy), and checking results — not by asking the LLM “is this correct?”

  • Celery + Redis for async — Complex queries take 15-30 seconds across multiple agent hops. Celery workers process asynchronously while WebSocket pushes real-time progress to the UI.

  • Quick pass for simple queries — UserInteraction runs inline with a 3-second timeout. Simple classification and general questions are handled without Celery overhead.
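The execute-then-rollback verification described above can be sketched with the standard library's sqlite3 standing in for the SQLAlchemy read-only transaction (the return shape is an assumption):

```python
import sqlite3

def verify_readonly(db_path: str, sql: str):
    """Execute candidate SQL inside a transaction that is always rolled
    back, so verification can never mutate data. Returns
    (ok, sample_rows, error_message)."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(sql)
        rows = cur.fetchmany(10)    # sample a few rows for anomaly checks
        return True, rows, None
    except sqlite3.Error as exc:
        return False, [], str(exc)  # feedback for the generation loop
    finally:
        conn.rollback()             # discard any side effects
        conn.close()
```

The error string is exactly the kind of concrete feedback the self-correcting loop routes back into SQL generation, instead of asking the model to judge its own output.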

Tech Stack

  • Agent Framework: Pydantic-AI 1.60 — structured output schemas, output validators, multi-provider support
  • Orchestration: Pydantic-Graph 1.60 — stateful directed graph with typed state and node routing
  • LLM Providers: Azure OpenAI (GPT-5.4), Anthropic Claude, Google Gemini, Groq
  • Vector Database: Milvus 2.5 — HNSW indexing, 3072-dim embeddings, two collections (tables + columns)
  • Embeddings: Azure OpenAI text-embedding-3-large
  • SQL Parsing: sqlglot 26 + SQLAlchemy 2.0
  • Backend: Django 5.2 + django-ninja (FastAPI-style routing)
  • Async: Celery 5.4 + Redis + Django Channels (WebSocket)
  • Database: PostgreSQL

Visual Architecture

How the 14 agents collaborate — and how databases are indexed for semantic search.

Query Pipeline Flow

  • 💬 User Query: Natural language input
  • 🧠 User Interaction Agent: Intent classification · Clarification · Quick pass (3s)
  • 📋 Approach Planner: Strategy generation
  • ⚙️ Execution Planner: Concrete SQL steps
  • 📦 Task Planner: Discrete task queue
  • 🔍 Semantic Search: Milvus vector lookup
  • 📊 Schema Retrieval: JSON schema pruning
  • Table Selection: Disambiguation
  • SQL Generation: Pydantic-AI agent
  • 🧪 SQL Verification: Execute in transaction
  • 👁️ SQL Reviewer: Logic + performance
  • 🎯 SQL Execution: Results returned · Max 1000 rows · WebSocket push
  • 🛡️ Error Handler + Loop Detection: Recovery routing · Graceful degradation · Per-agent visit thresholds

Schema Embedding Pipeline

How QueryNLQ indexes a 500+ table database for semantic search — before a single query is answered.

  1. Extract: SQLAlchemy inspector extracts tables, columns, types, keys, indexes, relationships, sample values (10 parallel workers)
  2. Describe: Pydantic-AI agent generates natural-language descriptions for every table and column (Celery batch tasks · dynamic sizing)
  3. Enrich: Build structured embedding text from description + types + sample values + relationships + purpose (table + column level)
  4. Embed: Azure OpenAI text-embedding-3-large generates 3072-dim vectors with token-aware batching (unit-normalized · async)
  5. Store: Milvus vector DB with HNSW indexing; two collections (tables + columns); incremental sync on schema changes (semantic search ready)

500+ tables supported · Memory-monitored · Partial failure recovery

Want to build something like this?

Let's talk about your project. No commitment, no slides — just a conversation about what's possible.

Get in Touch