Code Techniques

SQL Generation

Guiding AI models to translate natural language into accurate, efficient SQL queries that respect schema constraints, handle complex joins, and optimize for performance across database systems.

Technique Context: 2020–2024

Introduced: Text-to-SQL research has been an active area of natural language processing since the late 2010s, but the field accelerated dramatically between 2020 and 2024. The Spider benchmark, introduced in 2018 by Yale researchers, established the first large-scale, cross-database evaluation standard for text-to-SQL systems, testing a model’s ability to generalize across unfamiliar database schemas. By 2023, the BIRD benchmark raised the bar further by incorporating real-world database values, external knowledge requirements, and execution-based evaluation — reflecting the messiness of production databases far more accurately than earlier academic datasets. The arrival of large language models fundamentally changed the approach: instead of training specialized semantic parsers, practitioners could now prompt general-purpose models like GPT-4, Claude, and Gemini with schema definitions and natural language questions to produce SQL directly. DIN-SQL, DAIL-SQL, and other prompt-based methods achieved state-of-the-art results on Spider by decomposing complex queries into sub-problems and providing schema context strategically.

Modern LLM Status: SQL generation is one of the strongest practical capabilities of modern language models: frontier models exceed 85% execution accuracy on the cross-database Spider benchmark when given well-structured prompts, though accuracy on the messier, knowledge-intensive BIRD benchmark remains substantially lower. The key insight from research is that model performance depends far more on how the schema is presented and how the question is decomposed than on model size alone. Models still struggle with ambiguous column names, implicit join paths, database-specific functions, and queries requiring multi-step reasoning over unfamiliar schemas. The techniques covered here — providing complete schema context, specifying the SQL dialect, decomposing complex questions, and validating against expected output — are essential because even the most capable models produce incorrect or inefficient SQL when prompted without adequate schema and constraint information.

The Core Insight

Schema Is the Prompt’s Foundation

SQL generation prompting combines a natural language question with a database schema definition to enable AI models to produce syntactically correct, semantically accurate SQL queries. Unlike general code generation where the model draws on broad programming knowledge, SQL generation demands precise alignment between three elements — the database schema that defines what data exists and how tables relate, the natural language question that expresses what the user wants to know, and the constraints and dialect requirements that govern how the query must be structured for a specific database engine.

The core insight is that effective SQL generation requires providing the DATABASE SCHEMA, the NATURAL LANGUAGE QUESTION, and any CONSTRAINTS or dialect requirements. A prompt that simply asks “write a SQL query for sales data” without schema context forces the model to hallucinate table names, column types, and join relationships. But when you provide the actual CREATE TABLE statements, specify which database engine the query will run on, and clarify any ambiguous terms in the question, the model can produce queries that execute correctly on the first attempt.

Think of it like asking a new database analyst to write a report query. Without access to the database documentation, they will guess at table structures and write queries that fail. Hand them the entity-relationship diagram, explain the naming conventions, clarify which “date” field you mean, and tell them whether to optimize for readability or performance — and they produce production-ready SQL. The same principle applies to language models: the quality of the generated SQL is directly proportional to the quality of the schema context in the prompt.

Why Schema Context Transforms SQL Accuracy

When a model receives a natural language question without schema context, it must invent table names, guess column types, and assume relationships — producing queries that look plausible but fail on execution. Structured SQL prompts eliminate this guesswork by providing the complete data context the model needs: table definitions with column names and types, primary and foreign key relationships, sample data values for disambiguation, the target SQL dialect (PostgreSQL, MySQL, SQL Server, SQLite), any business rules or naming conventions, and whether the output should prioritize readability, performance, or both. The difference between a hallucinated query that references nonexistent tables and a production-ready query that handles edge cases, uses appropriate indexes, and follows the organization’s SQL style comes down entirely to the schema and constraint information in the prompt.

The SQL Generation Process

Four steps from natural language to executable SQL

1

Provide the Schema

Include the complete database schema relevant to the question. This means CREATE TABLE statements with column names, data types, primary keys, foreign keys, and any constraints such as NOT NULL, UNIQUE, or CHECK. If the database has dozens of tables, include only those relevant to the question to avoid overwhelming the context window, but always include enough for the model to trace join paths. Adding brief comments describing what each table and ambiguous column represents significantly improves accuracy, especially when column names are abbreviated or use internal naming conventions that the model cannot infer from the name alone.

Example

Provide CREATE TABLE statements for orders, customers, and order_items, including foreign key relationships and column comments explaining that “cust_tier” refers to the customer loyalty tier (bronze, silver, gold, platinum) and “ord_dt” is the order placement date.
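The example schema above can be written out as DDL and sanity-checked before it ever goes into a prompt. A minimal sketch using Python's built-in SQLite: table and column names follow the example, but the exact types are assumptions for illustration.

```python
import sqlite3

# Hypothetical DDL matching the example above; the comments document the
# ambiguous columns ("cust_tier", "ord_dt") so the model need not guess.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    cust_tier   TEXT NOT NULL  -- loyalty tier: bronze, silver, gold, platinum
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    ord_dt      DATE NOT NULL,  -- order placement date
    status      TEXT NOT NULL
);
CREATE TABLE order_items (
    item_id     INTEGER PRIMARY KEY,
    order_id    INTEGER NOT NULL REFERENCES orders(order_id),
    qty         INTEGER NOT NULL,
    unit_price  REAL NOT NULL
);
"""

# Sanity-check that the DDL parses before pasting it into a prompt.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
print(sorted(tables))  # ['customers', 'order_items', 'orders']
```

Pasting DDL that your own database engine has accepted, rather than a hand-typed summary, eliminates one common source of prompt errors.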

2

State the Question in Natural Language

Express the data retrieval or manipulation need in clear, unambiguous natural language. Avoid vague terms that could map to multiple columns — instead of “show me the date,” specify “show the order placement date.” If the question involves aggregation, make the grouping explicit. If it involves filtering, specify the exact conditions. The more precisely the natural language question maps to specific columns and operations, the more accurate the generated SQL will be. Complex analytical questions benefit from being broken into sub-questions that can be addressed with subqueries or CTEs.

Example

“For each customer loyalty tier, show the total revenue from orders placed in 2024, the average order value, and the number of distinct customers. Only include tiers with more than 100 orders. Sort by total revenue descending.”

3

Specify Constraints and Dialect

Define the target SQL dialect and any constraints that affect query construction. PostgreSQL, MySQL, SQL Server, Oracle, and SQLite each have different syntax for date functions, string operations, window functions, CTEs, and pagination. Specify whether the query should use specific optimization patterns (such as EXISTS instead of IN for correlated subqueries), follow organizational SQL style guidelines (uppercase keywords, aliasing conventions), or adhere to performance constraints (avoid full table scans, use specific indexes). If the query will run on a large dataset, mention the approximate table sizes so the model can make informed optimization decisions.

Example

“Generate PostgreSQL-compatible SQL. Use CTEs for readability. The orders table has approximately 50 million rows, so avoid correlated subqueries. Use uppercase for SQL keywords. Alias all tables with meaningful abbreviations.”
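One simple way to make the dialect and constraints impossible to miss is to template them into every prompt. A minimal sketch: the helper name and field layout are my own, not from any particular library.

```python
# Minimal prompt-builder sketch: the helper name and layout are
# illustrative, not from any particular library.
def build_sql_prompt(schema: str, question: str, dialect: str,
                     constraints: list[str]) -> str:
    rules = "\n".join(f"- {c}" for c in constraints)
    return (f"Target dialect: {dialect}\n"
            f"Schema:\n{schema}\n"
            f"Question: {question}\n"
            f"Constraints:\n{rules}")

prompt = build_sql_prompt(
    schema="orders(order_id PK, ord_dt DATE, status VARCHAR)",
    question="Total completed orders per month in 2024.",
    dialect="PostgreSQL 16",
    constraints=[
        "Use CTEs for readability",
        "Uppercase SQL keywords",
        "orders has ~50 million rows; avoid correlated subqueries",
    ],
)
print(prompt)
```

Centralizing the template also keeps constraint wording consistent across a team's prompts.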

4

Validate and Optimize

Review the generated SQL for correctness and request refinements. Ask the model to explain its join logic, verify that all WHERE clauses correctly filter the intended data, check that GROUP BY includes all non-aggregated columns, and confirm that the query handles NULL values appropriately. For complex queries, ask the model to walk through the execution plan conceptually, identifying potential bottlenecks. If the query produces unexpected results, provide sample data and expected output so the model can debug and correct the logic. Iterative refinement is normal — even expert SQL developers rarely write complex analytical queries correctly on the first attempt.

Example

“Verify that the LEFT JOIN between orders and order_items correctly handles orders with no line items. Explain why you chose COALESCE for the revenue calculation. Would this query benefit from a covering index on orders(cust_tier, ord_dt)?”
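The validation step can be partly automated: run the generated SQL against a tiny fixture database and compare to a hand-computed expected result. A sketch using Python's built-in SQLite; the function name and fixture data are illustrative.

```python
import sqlite3

# Run model-generated SQL against a small fixture and compare results.
def validate_sql(ddl, fixtures, sql, expected):
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    for stmt in fixtures:
        conn.execute(stmt)
    try:
        return conn.execute(sql).fetchall() == expected
    except sqlite3.Error as exc:
        print(f"query failed: {exc}")
        return False

ddl = "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT);"
fixtures = [
    "INSERT INTO orders VALUES (1, 'completed')",
    "INSERT INTO orders VALUES (2, 'cancelled')",
]
generated = "SELECT COUNT(*) FROM orders WHERE status = 'completed'"
print(validate_sql(ddl, fixtures, generated, [(1,)]))  # True
```

When the check fails, the fixture rows and expected output double as debugging context to feed back to the model.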

See the Difference

Why structured SQL prompts produce dramatically better queries

Vague Prompt

Prompt

Write a SQL query for sales data.

Response

SELECT * FROM sales WHERE date > '2024-01-01' ORDER BY amount DESC;

The model invents a “sales” table, guesses at column names like “date” and “amount,” uses SELECT * which is inefficient, and has no join logic, aggregation, or filtering beyond a hardcoded date.

Hallucinated schema, no joins, no aggregation, will not execute
VS

Structured SQL Prompt

Prompt

Given the schema: orders(order_id PK, customer_id FK, ord_dt DATE, status VARCHAR), order_items(item_id PK, order_id FK, product_id FK, qty INT, unit_price DECIMAL), customers(customer_id PK, cust_tier VARCHAR). Write a PostgreSQL query: total revenue by customer tier for completed orders in 2024, with order count. Use CTEs and uppercase keywords.

Response

WITH completed_2024 AS (
    SELECT o.order_id, c.cust_tier, oi.qty * oi.unit_price AS line_total
    FROM orders o
    INNER JOIN customers c ON o.customer_id = c.customer_id
    INNER JOIN order_items oi ON o.order_id = oi.order_id
    WHERE o.status = 'completed'
      AND o.ord_dt BETWEEN '2024-01-01' AND '2024-12-31'
)
SELECT cust_tier,
       SUM(line_total) AS total_revenue,
       COUNT(DISTINCT order_id) AS order_count
FROM completed_2024
GROUP BY cust_tier
ORDER BY total_revenue DESC;

Correct joins, proper aggregation, CTE structure, dialect-aware, executable
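The structured response above is genuinely executable: dropping it into Python's built-in SQLite with a few invented sample rows confirms the joins, date filter, and aggregation behave as claimed.

```python
import sqlite3

# Tiny fixture matching the structured prompt's schema; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, cust_tier TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     ord_dt DATE, status TEXT);
CREATE TABLE order_items (item_id INTEGER PRIMARY KEY, order_id INTEGER,
                          qty INTEGER, unit_price REAL);
INSERT INTO customers VALUES (1, 'gold'), (2, 'silver');
INSERT INTO orders VALUES (10, 1, '2024-03-01', 'completed'),
                          (11, 2, '2024-05-10', 'completed'),
                          (12, 2, '2023-07-01', 'completed');
INSERT INTO order_items VALUES (100, 10, 2, 25.0),
                               (101, 11, 1, 40.0),
                               (102, 12, 3, 10.0);
""")
rows = conn.execute("""
WITH completed_2024 AS (
    SELECT o.order_id, c.cust_tier, oi.qty * oi.unit_price AS line_total
    FROM orders o
    INNER JOIN customers c ON o.customer_id = c.customer_id
    INNER JOIN order_items oi ON o.order_id = oi.order_id
    WHERE o.status = 'completed'
      AND o.ord_dt BETWEEN '2024-01-01' AND '2024-12-31'
)
SELECT cust_tier, SUM(line_total) AS total_revenue,
       COUNT(DISTINCT order_id) AS order_count
FROM completed_2024
GROUP BY cust_tier
ORDER BY total_revenue DESC
""").fetchall()
print(rows)  # [('gold', 50.0, 1), ('silver', 40.0, 1)]
```

The 2023 order is correctly excluded by the date filter, and revenue sorts descending as requested.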


SQL Generation in Action

See how structured prompts produce production-ready queries

Prompt

“Given the following schema: employees(emp_id PK, name VARCHAR, dept_id FK, hire_date DATE, salary DECIMAL, manager_id FK REFERENCES employees), departments(dept_id PK, dept_name VARCHAR, budget DECIMAL, location VARCHAR), projects(proj_id PK, proj_name VARCHAR, start_date DATE, end_date DATE, dept_id FK), assignments(assign_id PK, emp_id FK, proj_id FK, role VARCHAR, hours_allocated INT). Write a PostgreSQL query that finds all departments where the total salary of employees exceeds 80% of the department budget, along with the number of active projects (where end_date is NULL or in the future) and the average tenure of employees in years. Include department name, location, total salary, budget, budget utilization percentage, active project count, and average tenure. Sort by budget utilization descending.”

Why This Works

The prompt provides the complete schema with all table relationships, specifies a multi-table analytical query that requires joining four tables with different aggregation levels, and defines exact output columns with clear calculations. By specifying PostgreSQL as the dialect, the model can use appropriate date functions (EXTRACT, AGE) rather than guessing. The budget utilization threshold (80%) gives the model a clear WHERE or HAVING condition. Without this schema context, the model would have to invent table structures and would almost certainly get the self-referencing manager relationship, the budget utilization calculation, and the active project filter logic wrong.
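The budget-utilization core of this query can be sketched and checked in SQLite (the prompt targets PostgreSQL, where tenure would use AGE/EXTRACT; the tenure and active-project pieces are omitted here, and the sample rows are invented).

```python
import sqlite3

# HAVING against a percentage-of-budget threshold, the heart of the
# example prompt. Schema trimmed to the two tables this piece needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT,
                          budget REAL);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, dept_id INTEGER,
                        salary REAL);
INSERT INTO departments VALUES (1, 'Engineering', 100000.0),
                               (2, 'Marketing', 100000.0);
INSERT INTO employees VALUES (1, 1, 60000.0), (2, 1, 30000.0),
                             (3, 2, 50000.0);
""")
rows = conn.execute("""
SELECT d.dept_name, SUM(e.salary) AS total_salary, d.budget,
       ROUND(SUM(e.salary) * 100.0 / d.budget, 1) AS utilization_pct
FROM departments d
JOIN employees e ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name, d.budget
HAVING SUM(e.salary) > 0.8 * d.budget
""").fetchall()
print(rows)  # [('Engineering', 90000.0, 100000.0, 90.0)]
```

Marketing (50% utilization) is filtered out by the HAVING clause, which is exactly the behavior the 80% threshold in the prompt pins down.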

Prompt

“Schema: transactions(txn_id PK, account_id FK, txn_date TIMESTAMP, amount DECIMAL, txn_type VARCHAR, category VARCHAR, merchant VARCHAR), accounts(account_id PK, user_id FK, account_type VARCHAR, opened_date DATE), users(user_id PK, signup_date DATE, region VARCHAR, plan_tier VARCHAR). Write a SQL Server query for a monthly spending report: for each user region and plan tier combination, calculate the month-over-month percentage change in total transaction amount for the last 12 months. Use window functions for the MoM calculation. Include only regions with at least 50 active users. Format the percentage change to two decimal places. Use CTEs for each logical step.”

Why This Works

This prompt combines schema provision with a sophisticated analytical requirement — month-over-month percentage change using window functions. By specifying SQL Server as the dialect, the model will use FORMAT or CAST for decimal formatting and SQL Server-specific date functions like DATEADD and DATEDIFF rather than PostgreSQL equivalents. The CTE requirement ensures the query is readable and maintainable. The active user threshold adds a non-trivial filtering step that requires counting distinct users per region before joining. This level of specificity transforms what would be a vague “show me spending trends” into a precisely defined analytical query that a data team could deploy directly into a reporting dashboard.
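The core month-over-month pattern the prompt asks for can be sketched with the LAG window function; this runs in SQLite for testability (SQL Server would wrap it with DATEADD/FORMAT), and the table and values are illustrative rather than from the prompt's schema.

```python
import sqlite3

# LAG-based month-over-month percentage change on pre-aggregated totals.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_totals (month TEXT, total REAL);
INSERT INTO monthly_totals VALUES
    ('2024-01', 100.0), ('2024-02', 120.0), ('2024-03', 90.0);
""")
rows = conn.execute("""
SELECT month, total,
       ROUND((total - LAG(total) OVER (ORDER BY month))
             * 100.0 / LAG(total) OVER (ORDER BY month), 2) AS mom_pct
FROM monthly_totals
""").fetchall()
print(rows)
# [('2024-01', 100.0, None), ('2024-02', 120.0, 20.0), ('2024-03', 90.0, -25.0)]
```

Note the first month's change is NULL because there is no prior row, a detail worth spelling out in the prompt so the model handles it deliberately.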

Prompt

“Source schema (MySQL): legacy_products(prod_code VARCHAR PK, description TEXT, price FLOAT, cat_code VARCHAR, active TINYINT, last_modified DATETIME). Target schema (PostgreSQL): products(product_id UUID PK DEFAULT gen_random_uuid(), name VARCHAR(255), description TEXT, price_cents INTEGER, category_id INTEGER FK REFERENCES categories, is_active BOOLEAN, created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ). Categories mapping: cat_code 'EL' maps to category_id 1, 'CL' to 2, 'FD' to 3, 'HW' to 4. Write the PostgreSQL INSERT...SELECT migration query that transforms the data: convert price from dollars (FLOAT) to cents (INTEGER), map category codes to IDs, convert TINYINT active flag to BOOLEAN, set created_at to the source last_modified value, and set updated_at to NOW(). Handle NULL prices by defaulting to 0 cents. Skip inactive products older than 2020.”

Why This Works

Data migration queries are among the most error-prone SQL tasks because they involve cross-dialect type conversions, business logic transformations, and data quality handling simultaneously. This prompt succeeds because it provides both source and target schemas explicitly, defines every transformation rule (dollars to cents, category mapping, type conversions), specifies NULL handling behavior, and includes a filtering condition for stale data. The explicit category mapping table prevents the model from guessing at code-to-ID relationships. Without these details, a migration query would almost certainly miscalculate the price conversion (forgetting to round after multiplying by 100), miss the NULL handling, or produce syntax that works in MySQL but fails in PostgreSQL.
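The price-conversion pitfall is concrete: in floating point, 4.35 * 100 evaluates to just under 435, so a bare CAST truncates. A sketch in SQLite (the migration targets PostgreSQL, where the ROUND is equally necessary); the single-column table is illustrative.

```python
import sqlite3

# FLOAT dollars * 100 can land just below the intended integer, so a
# bare CAST truncates; ROUND first. Price value chosen to show the bug.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE legacy_products (price REAL)")
conn.execute("INSERT INTO legacy_products VALUES (4.35)")
truncated, rounded = conn.execute("""
    SELECT CAST(price * 100 AS INTEGER),
           CAST(ROUND(price * 100) AS INTEGER)
    FROM legacy_products
""").fetchone()
print(truncated, rounded)  # 434 435
```

Stating the rounding rule explicitly in the prompt, as the example does, is what prevents an off-by-one-cent error across the whole migrated table.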

When to Use SQL Generation

Best for translating data questions into precise, executable queries

Perfect For

Ad Hoc Data Analysis

Translating business questions into SQL on the fly without requiring analysts to memorize complex schema structures, join paths, or dialect-specific syntax for every query they need to run.

Complex Analytical Queries

Building multi-join queries with window functions, CTEs, subqueries, and advanced aggregations that would take significant time to write manually and are prone to subtle logical errors.

Cross-Dialect Translation

Converting queries between database engines — translating MySQL-specific syntax to PostgreSQL, adapting Oracle PL/SQL to SQL Server T-SQL, or migrating SQLite queries to a production-grade system.

Query Optimization and Refactoring

Taking an existing slow or poorly structured query and generating an optimized version that uses better join strategies, appropriate indexing hints, or more efficient aggregation patterns for the target database engine.

Skip It When

Security-Critical Production Queries

If the generated SQL will be executed directly in a production environment without human review, especially with user-supplied input parameters, the risk of SQL injection or unintended data modification requires hand-crafted, parameterized queries with thorough security auditing.

Simple CRUD Operations

For basic INSERT, UPDATE, DELETE, or simple SELECT statements that an ORM or query builder handles well, AI-generated SQL adds unnecessary complexity without meaningful benefit over standard application frameworks.

Schema Design and DDL

When the primary task is designing the database schema itself rather than querying it, database modeling tools and migration frameworks provide better guardrails for maintaining referential integrity and managing schema evolution over time.

Highly Proprietary Database Functions

When queries rely heavily on vendor-specific stored procedures, custom extensions, or proprietary syntax that the model has limited training data for, domain-specific expertise will outperform generated SQL for these niche cases.

Use Cases

Where SQL generation delivers the most value

Business Intelligence Queries

Translating executive and stakeholder questions into multi-dimensional analytical queries with grouping sets, running totals, year-over-year comparisons, and cohort analyses that power dashboards and strategic decision-making.

Database Migration Scripts

Generating INSERT...SELECT statements, data transformation queries, and cross-dialect conversions for migrating data between different database systems while handling type mismatches, encoding differences, and schema restructuring.

Data Analysis Automation

Enabling data scientists and analysts to explore datasets by describing what they want to find in plain language, rapidly iterating through different query approaches without manually writing each variation from scratch.

Report Generation

Building the SQL backbone for recurring reports — monthly financial summaries, customer retention analyses, inventory turnover calculations, and compliance audits — with properly structured aggregations and formatting.

Schema Exploration

Generating queries to understand unfamiliar databases — discovering table relationships, identifying data patterns, profiling column distributions, and mapping out foreign key chains across large schemas with hundreds of tables.

Performance Tuning

Analyzing slow queries and generating optimized alternatives that use better join orders, replace correlated subqueries with joins, add appropriate index hints, and restructure aggregations to reduce execution time on large datasets.

Where SQL Generation Fits

SQL generation bridges natural language understanding and database interaction in AI-assisted development

Raw SQL: manual query writing. Hand-crafted queries requiring full syntax knowledge.
Query builders: programmatic abstraction. ORMs and builder patterns generating SQL from code.
AI SQL generation: natural language to SQL. LLMs translating human questions into executable queries.
Natural language databases: conversational data access. Direct question-answering over structured data.
Combine Schema Context with Decomposition for Complex Queries

SQL generation works best when you pair schema context with prompting techniques from other frameworks. For multi-step analytical queries, apply chain-of-thought reasoning to break the question into sub-queries that build on each other. Use few-shot learning by providing example question-SQL pairs from your specific database to teach the model your naming conventions and preferred query patterns. Apply self-verification by asking the model to mentally execute the query against sample data and confirm the output matches expectations. For particularly complex joins or business logic, use least-to-most decomposition — start with the simplest query that retrieves the base data, then incrementally add joins, filters, aggregations, and window functions until the full analytical query is assembled correctly.
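Least-to-most decomposition can be sketched as a staged sequence of prompts for the tier-revenue example, each building on the previously verified query. The stage wording, schema string, and helper are illustrative.

```python
# Least-to-most decomposition: each stage's prompt extends the query
# verified at the previous stage. Everything here is illustrative.
SCHEMA = "orders(...), order_items(...), customers(...)"

stages = [
    "Select order_id, ord_dt, and status for orders placed in 2024.",
    "Join order_items and compute line_total (qty * unit_price).",
    "Join customers and sum revenue grouped by cust_tier.",
    "Keep only tiers with more than 100 orders; sort by revenue descending.",
]

def staged_prompts(schema, steps):
    prompts = [f"Schema: {schema}\nWrite SQL: {steps[0]}"]
    for step in steps[1:]:
        # In practice, splice in the previous model output once verified.
        prompts.append(f"Extend the previous verified query: {step}")
    return prompts

prompts = staged_prompts(SCHEMA, stages)
for p in prompts:
    print(p.splitlines()[-1])
```

Verifying each intermediate query against sample data before extending it localizes errors to a single stage instead of a forty-line final query.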

Explore SQL Generation

Apply structured SQL generation techniques to your own databases or build data-driven prompts with our tools.