Chain-of-Thought Prompting
When AI shows its work, everything changes. Chain-of-Thought prompting unlocks step-by-step reasoning — transforming opaque leaps into transparent, verifiable logic that dramatically reduces errors on complex problems.
Introduced: Chain-of-Thought (CoT) prompting was published in 2022 by Wei et al. at Google Brain. The landmark paper demonstrated that including intermediate reasoning steps in few-shot examples — rather than just input-output pairs — dramatically improved LLM performance on arithmetic, commonsense, and symbolic reasoning benchmarks. The technique was deceptively simple: show the model how to think, not just what to answer, and it would follow suit. This single insight launched an entire family of reasoning-enhancement techniques.
Modern LLM Status: Chain-of-Thought reasoning has been deeply integrated into modern LLM architectures. Claude, GPT-4, and Gemini all employ internal reasoning processes inspired by CoT. Many models now “think” step-by-step by default for complex queries. However, explicitly prompting for Chain-of-Thought remains valuable when you need visible, auditable reasoning trails, when working with smaller models that do not reason automatically, or when tackling problems where the default reasoning depth is insufficient. CoT is the foundation technique from which nearly all modern prompt-based reasoning methods descend.
Make the Thinking Visible
When you ask a language model a complex question, it normally generates an answer in a single leap — internally compressing all reasoning into a final token prediction. This works for simple tasks, but for multi-step problems the model can silently skip steps, conflate variables, or lose track of intermediate results. The error hides inside the black box.
Chain-of-Thought changes the game by externalizing the reasoning process. By providing few-shot examples that include intermediate steps — not just questions and answers, but questions, step-by-step reasoning, and then answers — you teach the model to generate its own reasoning chain before committing to a conclusion. Each generated token in the chain becomes context for the next, creating a scaffold that guides the model toward the correct answer.
Think of it like showing your work on a math exam. The answer alone might be right or wrong, but the work reveals exactly where the logic holds or breaks. Chain-of-Thought gives LLMs the same “scratch paper” that humans rely on for complex problem-solving.
When a model generates intermediate reasoning tokens, each step constrains the probability distribution for the next step. A correct “Step 1” makes a correct “Step 2” far more likely — the reasoning chain creates a self-reinforcing path toward accuracy. Without these intermediate tokens, the model must compress all reasoning into a single forward pass, which is where complex problems overwhelm its capacity and errors emerge.
The Chain-of-Thought Process
Three stages from problem to reasoned answer
Provide Reasoning Demonstrations
Include few-shot examples in your prompt that demonstrate not just the final answer, but every intermediate reasoning step. Each example should walk through the problem methodically — identifying relevant information, performing intermediate calculations or inferences, and arriving at the answer through visible logic.
“Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many does he have now? A: Roger started with 5 balls. 2 cans of 3 balls each means 2 × 3 = 6 new balls. 5 + 6 = 11. The answer is 11.”
Present the Target Problem
After one or more reasoning demonstrations, present your actual question in the same format. The model recognizes the pattern from your examples and generates its own chain of reasoning steps before producing a final answer. The demonstrations serve as a template that the model follows structurally.
“Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many do they have?” — The model now knows to show its work before answering.
Model Generates Reasoning Chain and Answer
The model produces a step-by-step reasoning chain mirroring the structure of your demonstrations, then derives its final answer from the chain. Each intermediate step is visible, auditable, and debuggable. If the answer is wrong, you can trace exactly where the reasoning went off track and refine your approach accordingly.
“A: The cafeteria started with 23 apples. They used 20, leaving 23 − 20 = 3 apples. They bought 6 more, so 3 + 6 = 9. The answer is 9.”
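The three stages above can be sketched in a few lines of code. This is an illustrative sketch only: `call_llm` and `build_cot_prompt` are hypothetical names, standing in for whatever model client and helper you actually use.

```python
# Minimal sketch of the three-stage CoT process. `call_llm` is a
# hypothetical placeholder for whatever model client you actually use.

# Stage 1: a few-shot demonstration that shows the reasoning, not just the answer.
DEMONSTRATION = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls "
    "each. How many does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each means "
    "2 x 3 = 6 new balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Stage 2: present the target problem in the same Q/A format."""
    return f"{DEMONSTRATION}\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "The cafeteria had 23 apples. If they used 20 to make lunch "
    "and bought 6 more, how many do they have?"
)
print(prompt)

# Stage 3: the model mirrors the demonstration and emits its own
# reasoning chain before the final answer.
# response = call_llm(prompt)
```

Ending the prompt with "A:" invites the model to continue in the demonstrated format, producing reasoning before the answer.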
See the Difference
Why showing reasoning steps produces dramatically better answers
Standard Prompting
A store sells notebooks for $4 each. Sarah buys 3 notebooks and pays with a $20 bill. She then uses her change to buy as many pencils as possible at $1.50 each. How many pencils can she buy?
Sarah can buy 6 pencils.
Chain-of-Thought
Step 1: Calculate the cost of notebooks.
3 notebooks × $4 each = $12.
Step 2: Calculate Sarah’s change.
$20 − $12 = $8 remaining.
Step 3: Determine how many pencils she can buy.
$8 ÷ $1.50 per pencil ≈ 5.33 pencils.
Since she cannot buy a fraction of a pencil, she can buy 5 pencils.
Sarah can buy 5 pencils. She spends $12 on notebooks, receives $8 in change, and each pencil costs $1.50 — so 5 pencils cost $7.50 with $0.50 left over.
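One benefit of an explicit chain is that it can be audited mechanically. A quick Python sketch reproducing each step of the reasoning above:

```python
import math

# Step 1: cost of the notebooks
notebook_cost = 3 * 4                 # $12

# Step 2: Sarah's change from a $20 bill
change = 20 - notebook_cost           # $8

# Step 3: whole pencils only, so round down
pencils = math.floor(change / 1.50)   # 5 pencils
leftover = change - pencils * 1.50    # $0.50 left over

print(pencils, leftover)
```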
Practice Responsible AI
Always verify AI-generated content before use. AI systems can produce confident but incorrect responses. When using AI professionally, transparent disclosure is both best practice and increasingly a legal requirement.
A growing number of US states now require AI transparency in key areas. Critical thinking remains your strongest tool against misinformation.
Chain-of-Thought in Action
See how step-by-step reasoning improves different problem types
“A farmer has 3 fields. The first field produces 240 bushels of wheat, the second produces 180, and the third produces 310. He needs to reserve 15% of his total harvest for next year’s seeds. How many bushels can he sell?”
Step 1: Calculate total harvest.
240 + 180 + 310 = 730 bushels total.
Step 2: Calculate seed reserve (15% of total).
730 × 0.15 = 109.5 bushels reserved for seeds.
Step 3: Calculate sellable bushels.
730 − 109.5 = 620.5 bushels available to sell.
Answer: The farmer can sell 620.5 bushels of wheat (or 620 whole bushels if partial bushels cannot be sold).
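As with any arithmetic chain, each step can be checked directly. A small Python sketch mirroring the three steps:

```python
fields = [240, 180, 310]

total = sum(fields)               # Step 1: 730 bushels harvested
seed_reserve = total * 0.15       # Step 2: 15% reserved for seeds
sellable = total - seed_reserve   # Step 3: bushels available to sell

print(total, seed_reserve, sellable)
```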
“Three friends — Alice, Bob, and Carol — each ordered a different drink: coffee, tea, and juice. Alice did not order coffee. The person who ordered tea is sitting between the other two. Bob is sitting on the far left. What did each person order?”
Step 1: Establish the seating arrangement.
Bob is on the far left. The tea drinker sits between the other two, so the tea drinker is in the middle seat.
Step 2: Determine who sits in the middle.
Since Bob is on the far left, either Alice or Carol occupies the middle seat, and the remaining friend sits on the far right.
Step 3: Apply the constraint about Alice.
Alice did not order coffee. If Alice were in the middle, she would drink tea, which is allowed. If Alice were on the right, the tea drinker would be in the middle, so Alice would drink juice, which is also allowed. The constraint does not pin down her seat.
Step 4: Test Alice in the middle seat.
If Alice is in the middle, she drinks tea. Bob (left) and Carol (right) split coffee and juice, and no constraint decides which gets which, so this seating yields two valid assignments.
Step 5: Test Carol in the middle seat.
If Carol is in the middle, she drinks tea. Alice is on the right and cannot order coffee, so Alice drinks juice and Bob drinks coffee. This yields a third valid assignment.
Answer: The puzzle is underconstrained. Three assignments satisfy every stated condition: (1) Bob has coffee, Alice has tea (middle), Carol has juice; (2) Bob has juice, Alice has tea (middle), Carol has coffee; (3) Bob has coffee, Carol has tea (middle), Alice has juice. A visible reasoning chain surfaces this ambiguity instead of hiding it behind a single confident guess.
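Because the puzzle has a tiny search space, a brute-force enumeration is an easy way to audit a reasoning chain like this one. A Python sketch that tries every seating and drink assignment and keeps those satisfying the stated constraints:

```python
from itertools import permutations

solutions = []
# Seats are (left, middle, right); try every seating and drink assignment.
for seating in permutations(("Alice", "Bob", "Carol")):
    if seating[0] != "Bob":                  # Bob sits on the far left
        continue
    for drinks in permutations(("coffee", "tea", "juice")):
        if drinks[1] != "tea":               # the tea drinker sits in the middle
            continue
        orders = dict(zip(seating, drinks))  # drinks[i] goes to seating[i]
        if orders["Alice"] == "coffee":      # Alice did not order coffee
            continue
        solutions.append(orders)

for s in solutions:
    print(s)
```

Running this prints every assignment that satisfies the constraints, making it immediately clear whether the puzzle has a unique solution.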
“A company’s revenue grew 20% in Q1, declined 10% in Q2, grew 15% in Q3, and declined 5% in Q4. If they started the year at $1 million in quarterly revenue, what was their Q4 revenue, and did they end the year higher or lower than they started?”
Step 1: Calculate Q1 revenue.
Starting at $1,000,000 with 20% growth: $1,000,000 × 1.20 = $1,200,000.
Step 2: Calculate Q2 revenue.
Q1 revenue with 10% decline: $1,200,000 × 0.90 = $1,080,000.
Step 3: Calculate Q3 revenue.
Q2 revenue with 15% growth: $1,080,000 × 1.15 = $1,242,000.
Step 4: Calculate Q4 revenue.
Q3 revenue with 5% decline: $1,242,000 × 0.95 = $1,179,900.
Step 5: Compare to starting revenue.
$1,179,900 vs. $1,000,000 starting revenue — that is $179,900 higher, or about 18% above where they started.
Answer: Q4 revenue was $1,179,900. Despite two quarters of decline, the company ended the year approximately 18% higher than where they started.
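The quarter-by-quarter chain maps naturally onto a loop. A Python sketch applying each percentage change in sequence:

```python
revenue = 1_000_000.0
changes = [("Q1", 0.20), ("Q2", -0.10), ("Q3", 0.15), ("Q4", -0.05)]

# Steps 1-4: apply each quarter's growth or decline in sequence.
for quarter, change in changes:
    revenue *= 1 + change
    print(f"{quarter}: ${revenue:,.0f}")

# Step 5: compare the year-end figure to the starting revenue.
print(f"Change vs. start: {revenue / 1_000_000 - 1:.1%}")
```

Note that percentage changes compound multiplicatively, which is exactly why the two down quarters do not cancel the two up quarters.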
When to Use Chain-of-Thought
Best for problems that require multi-step reasoning
Perfect For
Multi-step calculations, percentages, unit conversions, and word problems that require combining several operations in sequence.
Questions that require chaining everyday knowledge together — understanding cause and effect, physical relationships, or social dynamics across multiple steps.
Constraint satisfaction, deductive logic, pattern recognition, and formal reasoning tasks where each inference builds on the previous one.
Tracing through code logic, identifying where a process breaks, or systematically evaluating potential root causes of a failure.
Skip It When
Questions with single-step answers — “What is the capital of France?” does not benefit from step-by-step reasoning.
Writing stories, poetry, or brainstorming sessions where freeform generation is the goal — structured reasoning steps can constrain creative output.
When speed matters more than accuracy — CoT generates significantly more tokens, increasing both response time and API costs for each query.
Use Cases
Where Chain-of-Thought delivers the most value
Financial Calculations
Walk through compound interest, tax computations, investment returns, and budgeting problems where each calculation feeds into the next.
Code Debugging
Trace execution paths step by step, identify where variable states diverge from expectations, and systematically isolate the root cause of bugs.
Scientific Analysis
Work through experimental data interpretation, hypothesis testing, and multi-variable analysis where each conclusion depends on prior findings.
Educational Tutoring
Demonstrate problem-solving methods to students by making every reasoning step explicit and instructive, turning answers into learning opportunities.
Decision Analysis
Evaluate complex decisions by reasoning through criteria, trade-offs, and consequences step by step, producing transparent and well-supported recommendations.
Legal and Policy Reasoning
Trace through regulatory requirements, case law, and policy implications step by step to build well-reasoned compliance assessments or legal arguments.
Where Chain-of-Thought Fits
CoT is the foundation technique that launched modern prompt-based reasoning
Chain-of-Thought was the breakthrough that proved prompting alone could unlock reasoning capabilities in large language models. Before Wei et al.’s 2022 paper, the prevailing assumption was that LLMs needed fine-tuning or architectural changes to reason well. CoT showed that simply demonstrating reasoning in the prompt was enough — and that insight spawned an entire research ecosystem of reasoning-enhancement techniques: Zero-Shot CoT, Self-Consistency, Tree of Thought, Self-Ask, and dozens more.
Related Techniques
Explore techniques that extend Chain-of-Thought reasoning
Make Your Reasoning Visible
Try Chain-of-Thought reasoning on your own complex problems or build step-by-step prompts with our interactive tools.