Selection-Inference (SI)
Logical reasoning fails when models try to do everything at once — selecting relevant information AND drawing conclusions in a single step. Selection-Inference separates these into two alternating modules: one that identifies the right premises, and one that draws valid inferences from them.
Introduced: Selection-Inference was published at ICLR 2023 by Creswell et al. The technique addresses a core weakness in LLM reasoning: given a set of facts and asked to draw a conclusion, models often select irrelevant premises or make invalid inferential leaps. SI introduces a two-module architecture that alternates between a Selection module (which identifies the most relevant premises for the next reasoning step) and an Inference module (which draws a single valid conclusion from the selected premises). In the paper, this disciplined alternation yielded more than a 100% improvement over a vanilla few-shot baseline on a suite of logical reasoning tasks.
Modern LLM Status: The Selection-Inference pattern has become increasingly relevant as models are used for complex logical reasoning in legal, scientific, and financial domains. While modern frontier models have improved at reasoning, they still benefit from the explicit separation of premise-selection from inference. The technique is particularly valuable when working with large knowledge bases where selecting the right information is as important as reasoning correctly from it. In production systems, the two-module pattern maps naturally to retrieval-augmented generation (RAG) architectures.
Separate Selection from Inference
Standard prompting gives a model all available information and asks for a conclusion — a cognitive overload that leads to errors. Selection-Inference breaks reasoning into two strictly separated operations.
The Selection module looks at all available facts and picks only the ones relevant to the current reasoning step. The Inference module takes those selected facts and draws exactly one valid conclusion. Then the cycle repeats: the new conclusion becomes part of the available facts, and Selection picks the next relevant set. This alternation continues until the final answer is reached.
Think of it like a legal team where one person gathers evidence and another person argues the case — each expert focuses on what they do best, and the quality of the overall argument improves because neither is distracted by the other’s job.
When humans solve logic puzzles, they don’t try to use all information simultaneously. They identify relevant clues, draw a small inference, then look for more relevant clues. Selection-Inference formalizes this natural pattern. By forcing the model to explicitly state which premises it’s using before drawing any conclusion, errors become visible: either the wrong premises were selected, or the inference from correct premises was invalid. This decomposition turns opaque reasoning into a debuggable pipeline.
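The alternating loop just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `select` and `infer` stand in for the two prompted LLM modules, and the halting rule (Selection returning nothing when the chain is complete) is an assumption for the sketch.

```python
def selection_inference(facts, select, infer, max_steps=5):
    """Alternate a Selection module and an Inference module.

    select(facts) -> indices of the premises for the next step, or an
    empty list when reasoning is complete.
    infer(premises) -> exactly one conclusion drawn from those premises.
    """
    facts = list(facts)               # copy: the knowledge base will grow
    trace = []
    for _ in range(max_steps):
        chosen = select(facts)
        if not chosen:                # Selection signals the chain is done
            break
        premises = [facts[i] for i in chosen]
        conclusion = infer(premises)  # one valid inference per step
        trace.append((premises, conclusion))
        facts.append(conclusion)      # the conclusion becomes a new fact
    return trace
```

In a real system, `select` and `infer` would each wrap an LLM call with its own narrow prompt; the returned trace is exactly the auditable chain of (premises, conclusion) pairs described above.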
The Selection-Inference Process
Five stages from knowledge base to auditable conclusion
Present the Knowledge Base
Provide the model with all available facts, rules, and premises. These could come from a document, database, or prior reasoning steps.
“Facts: (1) All mammals breathe air. (2) Whales are mammals. (3) Fish breathe through gills. (4) Whales live in water. (5) Animals that live in water and breathe air must surface periodically.”
Selection Step
The Selection module examines all available information and identifies the minimal set of premises relevant to the current reasoning goal. It explicitly states which facts it’s choosing and why.
Selected premises: Fact (1) “All mammals breathe air” and Fact (2) “Whales are mammals” — these are relevant because we need to determine how whales breathe.
Inference Step
The Inference module takes ONLY the selected premises and draws a single, valid conclusion. No additional information is used — just the explicitly selected facts.
Inference: From facts (1) and (2), we conclude: “Whales breathe air.”
Update and Iterate
The new conclusion is added to the available knowledge base. The process returns to the Selection step, now with an enriched set of facts. This alternation continues until the reasoning goal is achieved.
Updated knowledge base now includes Fact (6): “Whales breathe air.” Selection module now picks Facts (4), (5), and (6) for the next inference cycle.
Chain Completion
When the final conclusion answers the original question, the full chain of selection-inference pairs forms a complete, auditable reasoning trace.
Final inference: From Facts (4), (5), and (6) — “Whales must surface periodically.” The complete chain shows every premise used and every conclusion drawn.
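The five stages above can be scripted end to end. The "modules" here are hand-coded stand-ins for LLM calls; the point is the data flow, with each conclusion fed back into the knowledge base before the next Selection step.

```python
facts = [
    "All mammals breathe air",                                             # (1)
    "Whales are mammals",                                                  # (2)
    "Fish breathe through gills",                                          # (3)
    "Whales live in water",                                                # (4)
    "Animals that live in water and breathe air must surface periodically",# (5)
]
trace = []

# Cycle 1: Selection picks (1) and (2); Inference draws one conclusion.
premises = [facts[0], facts[1]]
conclusion = "Whales breathe air"
trace.append((premises, conclusion))
facts.append(conclusion)              # knowledge base now contains fact (6)

# Cycle 2: Selection picks (4), (5), and the new fact (6).
premises = [facts[3], facts[4], facts[5]]
conclusion = "Whales must surface periodically"
trace.append((premises, conclusion))

# The trace is the auditable chain: every premise used, every conclusion drawn.
for prems, concl in trace:
    print(len(prems), "premises ->", concl)
```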
See the Difference
Why separating selection from inference produces more reliable reasoning
Direct Reasoning
Given: (1) All mammals breathe air. (2) Whales are mammals. (3) Fish breathe through gills. (4) Whales live in water. (5) Animals that live in water and breathe air must surface periodically. Conclusion: Do whales need to surface?
Yes, whales need to surface because they live in water but are mammals that breathe air.
Selection-Inference
Selection 1: Picks Facts (1) and (2) — relevant to determining how whales breathe.
Inference 1: “Whales breathe air.”
Selection 2: Picks new conclusion + Fact (4) + Fact (5) — relevant to surfacing behavior.
Inference 2: “Whales must surface periodically.”
Yes, whales must surface periodically. Reasoning trace: Facts (1)+(2) → “Whales breathe air” → combined with Facts (4)+(5) → “Whales must surface periodically.”
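In practice, the Selection-Inference trace comes from two separate prompts, one per module, so neither is distracted by the other's job. The templates below are illustrative wording, not the paper's exact few-shot prompts:

```python
# Each module gets its own narrow prompt (illustrative wording).
SELECTION_PROMPT = """Facts:
{facts}

Question: {question}
List the numbers of the facts needed for the next single reasoning step."""

INFERENCE_PROMPT = """Premises:
{premises}

State exactly one conclusion that follows from these premises alone."""

facts = "\n".join([
    "(1) All mammals breathe air.",
    "(2) Whales are mammals.",
    "(3) Fish breathe through gills.",
    "(4) Whales live in water.",
    "(5) Animals that live in water and breathe air must surface periodically.",
])
selection_input = SELECTION_PROMPT.format(
    facts=facts, question="Do whales need to surface?"
)
inference_input = INFERENCE_PROMPT.format(
    premises="(1) All mammals breathe air.\n(2) Whales are mammals."
)
```

Note that the Inference prompt never sees the full fact list, only the selected premises, which is what makes an invalid inference easy to spot.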
Selection-Inference in Action
See how alternating selection and inference improves logical reasoning
A client claims breach of contract. The knowledge base includes the contract terms, communications between parties, state contract law statutes, and relevant case precedents. The question: “Does the client have grounds for a breach of contract claim?”
Selection 1: Contract Section 4.2 (delivery timeline) + Email from vendor dated March 15 (acknowledging delay).
Inference 1: The vendor acknowledged failing to meet the contractual delivery deadline.
Selection 2: Inference 1 + Contract Section 7.1 (remedies for late delivery) + State UCC §2-711 (buyer’s remedies).
Inference 2: Late delivery constitutes a material breach under both the contract terms and applicable state law.
Selection 3: Inference 2 + Client communications (no waiver of delivery timeline) + Precedent: Smith v. Allied Corp (similar facts, breach upheld).
Inference 3: The client did not waive the deadline and has grounds consistent with relevant precedent.
Final conclusion: The client has strong grounds for a breach of contract claim based on documented late delivery, applicable contract remedies, and supporting case law. Note: Always verify AI-generated legal analysis with qualified legal counsel.
A patient presents with fatigue, joint pain, and a butterfly-shaped facial rash. The knowledge base includes symptom descriptions, lab results (elevated ANA, low complement levels), family history, and diagnostic criteria for several autoimmune conditions.
Selection 1: Symptom: butterfly-shaped facial rash + Diagnostic criteria for systemic lupus erythematosus (SLE).
Inference 1: The malar (butterfly) rash is one of the 11 ACR classification criteria for SLE.
Selection 2: Inference 1 + Lab result: elevated ANA titer + Lab result: low complement C3/C4.
Inference 2: Positive ANA and low complement levels are two additional ACR criteria, bringing the total to 3 of 11.
Selection 3: Inference 2 + Symptom: joint pain (non-erosive arthritis) + ACR threshold (4 of 11 criteria needed for classification).
Inference 3: With 4 criteria met (malar rash, positive ANA, low complement, arthritis), the patient meets the ACR classification threshold for SLE.
Final conclusion: The evidence supports a preliminary classification of SLE based on 4 of 11 ACR criteria. Each criterion was independently selected and verified. Note: AI-assisted analysis should always be reviewed and confirmed by qualified medical professionals.
Prove that the sum of two even numbers is always even. The knowledge base includes the definition of even numbers, properties of integer addition, and basic algebraic axioms.
Selection 1: Definition: An even number can be expressed as 2k where k is an integer.
Inference 1: Let the two even numbers be 2a and 2b, where a and b are integers.
Selection 2: Inference 1 + Property: Integer addition is closed (sum of integers is an integer) + Distributive property of multiplication over addition.
Inference 2: 2a + 2b = 2(a + b), and since a + b is an integer (closure), 2(a + b) fits the form 2k.
Selection 3: Inference 2 + Definition of even numbers (2k form).
Inference 3: Since 2(a + b) is of the form 2k where k = (a + b) is an integer, the sum is even by definition.
Final conclusion: The sum of two even numbers is always even. QED. Every step explicitly cites the axiom or definition used. Note: Verify mathematical proofs independently before relying on AI-generated reasoning.
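The three SI cycles above compress into a conventional written proof, with each cycle supplying one line:

```latex
\textbf{Claim.} If $m$ and $n$ are even integers, then $m + n$ is even.

\textbf{Proof.} By the definition of evenness, $m = 2a$ and $n = 2b$ for some
integers $a, b$. By distributivity,
\[
  m + n = 2a + 2b = 2(a + b).
\]
Since the integers are closed under addition, $k = a + b$ is an integer,
so $m + n = 2k$ is even by definition. $\blacksquare$
```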
When to Use Selection-Inference
Best for logical reasoning tasks requiring explicit evidence trails
Perfect For
Tasks with many available premises where selecting the right evidence is as important as reasoning correctly from it.
Working with large knowledge bases, document collections, or databases where precise evidence selection is critical.
Building formal arguments, legal cases, or mathematical proofs where each step must cite specific supporting evidence.
Situations where you need to verify which evidence was used for each conclusion — compliance, safety-critical systems, or regulated industries.
Skip It When
When the relevant information is obvious and doesn’t need explicit selection — the overhead of alternating modules adds no value.
Writing, brainstorming, or open-ended generation where there are no premises to select from — SI is designed for logical reasoning.
When latency matters more than auditability — alternating between selection and inference modules adds processing time at each step.
Use Cases
Where Selection-Inference delivers the most value
Legal Document Analysis
Select relevant clauses, statutes, and precedents from large legal corpora, then draw precise legal conclusions grounded in explicitly cited evidence.
Scientific Literature Review
Systematically select relevant findings from research papers and draw evidence-based conclusions with clear citation trails.
Medical Diagnostic Reasoning
Alternate between selecting relevant symptoms and test results and inferring possible diagnoses through structured differential analysis.
Financial Audit Trails
Select relevant financial records and regulations at each step, building auditable inference chains for compliance verification.
Compliance Verification
Map regulatory requirements to specific organizational practices by selecting applicable rules and inferring compliance status at each checkpoint.
Academic Research Synthesis
Build systematic literature reviews by selecting relevant findings from multiple sources and drawing synthesized conclusions with traceable evidence.
Where Selection-Inference Fits
SI separates what was mixed in earlier reasoning approaches
Selection-Inference maps cleanly onto retrieval-augmented generation (RAG) architectures. The Selection module corresponds to the retrieval step (finding relevant documents), while the Inference module corresponds to the generation step (reasoning from retrieved context). Making this mapping explicit can improve RAG quality by ensuring each inference is grounded in specifically cited retrieved passages.
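Under that mapping, an SI-style RAG loop might look like the sketch below. `retrieve` and `generate` are placeholders, not a real library API (a production system would use a vector-store query and an LLM call); the structural point is that each generated conclusion cites exactly the passages retrieval selected, and each conclusion is indexed back into the corpus.

```python
def rag_selection_inference(corpus, question, retrieve, generate, steps=3):
    """SI mapped onto RAG: retrieval = Selection, generation = Inference."""
    corpus = list(corpus)
    trace = []
    for _ in range(steps):
        passages = retrieve(corpus, question)      # Selection: cited evidence
        if not passages:                           # nothing relevant remains
            break
        conclusion = generate(passages, question)  # Inference: grounded step
        trace.append({"cited": passages, "conclusion": conclusion})
        corpus.append(conclusion)                  # conclusions become retrievable
    return trace
```

The trace entries double as an audit log: every conclusion records the exact passages it was grounded in, which is the compliance property the use cases above rely on.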
Separate to Strengthen
Apply the selection-inference pattern to your reasoning tasks or explore other structured techniques.