Self-Generated ICL
What if the model could write its own few-shot examples? Self-Generated In-Context Learning has the model produce demonstration examples before tackling the actual task — bootstrapping its own context to deliver more consistent, well-calibrated outputs without any manually curated data.
Introduced: Self-Generated ICL was proposed in 2022 by Kim et al. as a method to overcome one of the most persistent bottlenecks in few-shot prompting: the need for high-quality, manually curated demonstration examples. The key observation was that large language models already possess sufficient knowledge to generate their own task demonstrations — and that these self-created examples can serve as effective in-context learning signals when prepended to the actual query.
Modern LLM Status: Self-Generated ICL remains an active and practical technique. With modern LLMs being strong generators, self-generated examples are often high quality and well-suited to the task at hand. The technique is particularly valuable when labeled data is scarce or unavailable, when working in specialized domains where example creation requires expertise, or when you want the model to calibrate its own understanding of the task before answering. Published evaluations have found that self-generated examples can match or exceed the performance of human-written examples on many NLP tasks.
Let the Model Teach Itself by Example
Few-shot prompting works remarkably well — but it has an uncomfortable dependency: someone has to write the examples. Crafting high-quality demonstrations takes time, domain expertise, and careful formatting. For specialized tasks like medical coding, legal extraction, or niche classification, creating even three good examples can require a subject-matter expert and hours of iteration.
Self-Generated ICL removes that bottleneck entirely. Instead of asking a human to write demonstrations, you ask the model itself: “Generate three examples of this task being performed correctly.” The model draws on its training data to produce plausible input-output pairs, and those self-created examples then serve as the few-shot context for the real query. The model becomes both the teacher and the student.
Think of it like a musician warming up before a performance. Rather than jumping straight into the concert piece, the musician plays scales and short passages in the same key and tempo — calibrating their hands, ears, and focus. Self-Generated ICL gives the model the same warm-up: it rehearses the task pattern before the real performance begins.
Large language models have internalized millions of task patterns from their training data. When you ask a model to generate examples of a task, it draws on that vast implicit knowledge to produce demonstrations that are naturally well-formatted, contextually appropriate, and stylistically consistent. These self-generated examples often capture subtle task requirements — like output length, tone, and structure — that are difficult to specify in instructions alone. The model essentially externalizes its own understanding of the task, then uses that externalized knowledge as a reference point.
The Self-Generated ICL Process
Four stages from task description to bootstrapped solution
Define the Task Format
Describe the task clearly and specify the desired input-output format. The model needs to understand what a good example looks like before it can generate one. Include any constraints on structure, length, or style that the final output should follow.
“Task: Given a customer support email, extract the following fields in JSON format: issue_type, product_name, urgency_level (low/medium/high), and a one-sentence summary.”
Prompt for Example Generation
Ask the model to generate several examples of the task being performed correctly. Specify diversity requirements to prevent repetitive or narrow examples. Request that the generated demonstrations cover different scenarios, edge cases, and input variations to give the model a well-rounded calibration set.
“Generate 3 diverse examples of this extraction task. Each example should include a realistic customer email and the correct JSON output. Vary the issue types, urgency levels, and writing styles across examples.”
Assemble the Self-Generated Demonstrations
Collect the model-generated examples and format them as a standard few-shot prompt, treating them as if they were human-curated demonstrations. This step can be done automatically in a pipeline or manually reviewed for quality. The assembled prompt now contains task instructions followed by the self-generated example pairs.
The three generated email-to-JSON pairs are formatted as: “Email: [generated input] → Output: [generated JSON]” repeated three times, creating a standard few-shot block ready for use.
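The assembly step can be sketched in a few lines. This is a minimal illustration, assuming the generated examples have already been parsed into (input, output) string pairs; the function name and the "Email: … / Output: …" delimiters are illustrative conventions, not a fixed format.

```python
def assemble_few_shot_prompt(instructions, pairs, query):
    """Format self-generated (input, output) pairs into a few-shot prompt.

    `pairs` is a list of (example_input, example_output) strings. The
    layout mirrors the "Email: ... / Output: ..." pattern described above,
    ending with the real query and an open "Output:" slot for the model.
    """
    blocks = [instructions]
    for example_input, example_output in pairs:
        blocks.append(f"Email: {example_input}\nOutput: {example_output}")
    # The real query goes last, in the same shape as the demonstrations.
    blocks.append(f"Email: {query}\nOutput:")
    return "\n\n".join(blocks)
```

Because the query is rendered in exactly the same shape as the demonstrations, the model's completion naturally continues the established pattern.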
Solve the Target Task
Present the actual query after the self-generated demonstrations. The model now has context from its own examples to guide its response on the real input. The self-created demonstrations prime the model’s output format, reasoning patterns, and task-specific conventions — producing a response that is more consistent and better calibrated than a zero-shot attempt.
The real customer email is appended after the three self-generated examples. The model extracts the JSON fields using the same format and granularity it established in its own demonstrations, producing a consistent and well-structured output.
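The full four-stage flow reduces to two model calls: one to generate demonstrations, one to solve the task with those demonstrations prepended. The sketch below assumes a generic `complete(prompt) -> str` callable wrapping whatever LLM API you use; the prompt wording is illustrative.

```python
def self_generated_icl(complete, task_description, query, n_examples=3):
    """Two-pass Self-Generated ICL: generate demonstrations, then solve.

    `complete` is any callable that sends a prompt string to an LLM and
    returns the completion text (e.g. a thin wrapper around your
    provider's API). Parsing of the generated examples is left as raw
    text here for simplicity.
    """
    # Pass 1: ask the model to write its own demonstrations.
    gen_prompt = (
        f"{task_description}\n\n"
        f"Generate {n_examples} diverse examples of this task performed "
        "correctly. Vary the inputs across examples."
    )
    demonstrations = complete(gen_prompt)

    # Pass 2: prepend the self-generated demonstrations to the real query.
    solve_prompt = (
        f"{task_description}\n\n"
        f"Examples:\n{demonstrations}\n\n"
        f"Now solve this input:\n{query}"
    )
    return complete(solve_prompt)
```

In a production pipeline you would typically parse and optionally filter the demonstrations between the two calls rather than pasting them in verbatim.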
See the Difference
How self-generated examples transform cold-start performance
No Examples (Cold Start)
Classify this product review as positive, negative, or neutral and explain your reasoning: “The battery lasts longer than expected but the screen scratches too easily.”
This review is mixed. It has both positive and negative aspects. I would classify it as neutral.
Self-Generated ICL
The model first generates three classified reviews with structured reasoning, establishing a consistent format: sentiment label, key positive signals, key negative signals, and a weighted final judgment. Each self-generated example demonstrates the expected output depth.
Classification: Mixed-Negative
Positive signals: Battery life exceeded expectations (durability praise).
Negative signals: Screen scratches too easily (build quality criticism affecting daily use).
Judgment: While the battery praise is genuine, the screen durability issue affects the core user experience more directly. Weighted classification leans negative.
Self-Generated ICL in Action
See how bootstrapped examples improve task performance across domains
“Generate 3 examples of summarizing a scientific abstract into 3 plain-language bullet points. Each bullet should be one sentence, avoid jargon, and highlight the key finding, method, and implication respectively.”
The model generates three abstract-to-bullets examples covering topics like gene therapy, climate modeling, and materials science. Each example demonstrates the exact three-bullet format with consistent tone and complexity level.
When given a real abstract about a novel protein folding algorithm, the model follows the pattern it established: Bullet 1 (Finding): Researchers developed an algorithm that predicts protein shapes with 95% accuracy in under a minute. Bullet 2 (Method): The approach combines graph neural networks with evolutionary sequence data to model atomic-level interactions. Bullet 3 (Implication): Faster protein structure prediction could accelerate drug discovery timelines from years to months.
“Generate 3 examples of converting an undocumented Python function into a fully documented version. Include a Google-style docstring with a summary line, parameter descriptions with types, return type, and one usage example. Vary the function complexity across examples.”
The model creates three before/after pairs: a simple utility function, a data processing function with multiple parameters, and an async function with error handling. Each demonstrates consistent docstring formatting, type annotations, and usage examples.
When given a real undocumented function that calculates weighted moving averages, the model applies the same documentation style: a clear summary line, each parameter described with its type and purpose, the return value documented with its shape and meaning, and a concise usage example showing typical invocation. The output mirrors the formatting conventions established in its self-generated demonstrations.
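For illustration, here is the kind of "after" version such a demonstration might contain. The function itself is hypothetical, written here only to show the Google-style docstring shape the self-generated examples would establish.

```python
def weighted_moving_average(values, weights):
    """Compute weighted moving averages over a sequence of values.

    Args:
        values (list[float]): Observations ordered oldest to newest.
        weights (list[float]): One weight per window position; the window
            width equals ``len(weights)``.

    Returns:
        list[float]: One weighted average per full window of
        ``len(weights)`` consecutive values.

    Example:
        >>> weighted_moving_average([1.0, 2.0, 3.0], [0.5, 0.5])
        [1.5, 2.5]
    """
    window = len(weights)
    total = sum(weights)
    return [
        sum(v * w for v, w in zip(values[i:i + window], weights)) / total
        for i in range(len(values) - window + 1)
    ]
```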
“Generate 3 examples of extracting structured fields from invoice text. For each example, create a realistic invoice paragraph and extract: vendor_name, invoice_date, total_amount, currency, and line_items (as an array). Use JSON output format. Vary invoice styles and complexity.”
The model generates three invoice-to-JSON pairs: a simple single-item invoice, a multi-line invoice with tax and shipping, and an international invoice with currency conversion. Each output uses identical JSON field names and consistent formatting conventions.
When presented with a real invoice from a cloud hosting provider with multiple service tiers and prorated charges, the model extracts all fields using the exact schema it defined in its examples. The line_items array captures each service tier separately, the currency field matches the format from demonstrations, and edge cases like prorated amounts are handled consistently because the self-generated examples established how to represent partial charges.
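In a pipeline, it pays to check that each self-generated demonstration actually conforms to the declared schema before using it as context. A minimal sketch, with a hypothetical generated pair and a validator; the invoice text and field values are invented for illustration.

```python
import json

# One hypothetical self-generated demonstration pair for the invoice task.
example_invoice_text = (
    "Invoice #1042 from Acme Hosting, dated 2024-03-01. "
    "Web hosting (basic tier): $20.00. Total due: $20.00 USD."
)
example_output = {
    "vendor_name": "Acme Hosting",
    "invoice_date": "2024-03-01",
    "total_amount": 20.00,
    "currency": "USD",
    "line_items": [{"description": "Web hosting (basic tier)", "amount": 20.00}],
}

# The fields named in the task description above.
REQUIRED_FIELDS = {"vendor_name", "invoice_date", "total_amount",
                   "currency", "line_items"}

def validate_demonstration(output_json):
    """Parse a generated output and check it matches the declared schema."""
    parsed = json.loads(output_json)
    missing = REQUIRED_FIELDS - parsed.keys()
    if missing:
        raise ValueError(f"generated example missing fields: {sorted(missing)}")
    return parsed
```

Demonstrations that fail validation can simply be regenerated; malformed examples in the context tend to propagate into malformed final outputs.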
When to Use Self-Generated ICL
Best for bootstrapping demonstrations when curated examples are unavailable
Perfect For
When creating task examples requires significant time, cost, or domain expertise — the model can bootstrap its own demonstrations in seconds.
Tasks in medicine, law, finance, or engineering where manual example creation requires expert knowledge that may not be readily available.
When you need a working few-shot solution quickly without the overhead of curating a demonstration set — ideal for proof-of-concept testing.
When the primary goal is ensuring the model understands the expected output structure — self-generated examples anchor format consistency across responses.
Skip It When
When you already have well-curated, human-verified demonstrations — these are generally more reliable than self-generated ones and should be preferred.
Self-generated demonstrations may contain plausible but incorrect facts. If the example content itself must be factually verified, human curation is safer.
Tasks where an incorrect self-generated example could bias the model’s final answer — if a wrong pattern is established in the examples, it may propagate into the real output.
Use Cases
Where Self-Generated ICL delivers the most value
Cold-Start Prototyping
Launch a new classification or extraction task without any pre-existing labeled data by having the model generate its own training-like examples on the fly.
Documentation Generation
Generate example documentation patterns first, then apply that consistent style to new code, APIs, or processes — ensuring uniformity without a style guide.
Data Formatting
Establish output schemas by generating sample transformations first, then processing real data through the self-taught format for consistent structured output.
Style Transfer
Have the model generate examples of writing in a target voice or tone, then apply that calibrated style to new content with higher consistency than instruction-only approaches.
Report Standardization
Bootstrap standardized report formats by generating example reports first, then transforming raw data or notes into the established template with consistent sections and detail levels.
Template Creation
Generate diverse template examples for emails, proposals, or tickets, then use those self-created templates to produce new documents that follow a coherent structural pattern.
Where Self-Generated ICL Fits
Self-Generated ICL bridges zero-shot simplicity and curated few-shot quality
Self-generated examples are not guaranteed to be perfect. For higher-stakes applications, consider filtering strategies: generate more examples than you need and select the best subset, use the model to self-evaluate its own examples for correctness, or combine self-generated demonstrations with even one or two verified human examples as anchors. This hybrid approach captures the convenience of automation while maintaining a quality floor set by trusted references.
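The overgenerate-and-filter strategy above can be sketched in a few lines. This assumes a `complete` callable that returns one candidate demonstration per call and a `score` function standing in for whatever quality heuristic you choose (a self-evaluation call, a format check, agreement with an anchor example); both names are illustrative.

```python
def filtered_demonstrations(complete, score, gen_prompt, n_candidates=8, k=3):
    """Overgenerate candidate demonstrations, keep the k highest-scoring.

    `complete` returns one candidate example per call; `score` maps a
    candidate to a numeric quality estimate. Generating more candidates
    than needed and keeping the best subset sets a quality floor without
    requiring any human-written examples.
    """
    candidates = [complete(gen_prompt) for _ in range(n_candidates)]
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:k]
```

A hybrid variant would append one or two verified human examples to the returned list so the trusted anchors always survive filtering.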