Improving Code Generation by Training with Natural Language Feedback

Status: ✓ complete
Domain: arxiv.org
Archived: 2026-04-10 23:13:39

Plaintext Content (1.2 KB)
The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF). ILF requires only a small amount of human-written feedback during training and does not require the same feedback at test time, making it both user-friendly and sample-efficient. We further show that ILF can be seen as a form of minimizing the KL divergence to the ground truth distribution and demonstrate a proof-of-concept on a neural program synthesis task. We use ILF to improve a CodeGen-Mono 6.1B model's pass@1 rate by 38% relative (and 10% absolute) on the Mostly Basic Python Problems (MBPP) benchmark, outperforming both fine-tuning on MBPP and fine-tuning on repaired programs written by humans. Overall, our results suggest that learning from human-written natural language feedback is both more effective and sample-efficient than training exclusively on demonstrations for improving an LLM's performance on code generation tasks.
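The abstract describes ILF only at a high level. A minimal, hypothetical sketch of the data pipeline it implies is below: sample a program from the current model, collect human feedback on incorrect samples, have the model refine the sample given that feedback, keep only refinements that pass the task's unit tests, and fine-tune on the surviving (prompt, refinement) pairs. All names here (`ilf_round`, `passes_tests`, the toy stub callables) are illustrative assumptions, not the paper's actual code.

```python
def passes_tests(program: str, tests) -> bool:
    """Run a task's unit-test expressions against a candidate program string."""
    env = {}
    try:
        exec(program, env)                 # define the candidate function
        return all(eval(t, env) for t in tests)
    except Exception:
        return False


def ilf_round(tasks, generate, refine_with_feedback, get_feedback):
    """One hypothetical round of Imitation learning from Language Feedback:
    1. sample a program from the current model,
    2. collect feedback on incorrect samples,
    3. refine each sample conditioned on the feedback,
    4. keep only refinements that pass the task's unit tests."""
    finetune_data = []
    for task in tasks:
        draft = generate(task["prompt"])
        if passes_tests(draft, task["tests"]):
            continue  # already correct; no feedback needed
        feedback = get_feedback(task["prompt"], draft)
        refinement = refine_with_feedback(task["prompt"], draft, feedback)
        if passes_tests(refinement, task["tests"]):
            finetune_data.append((task["prompt"], refinement))
    # The model would then be fine-tuned on these (prompt, refinement) pairs.
    return finetune_data


# Toy demo: the "model" emits a buggy draft, then a correct refinement.
task = {"prompt": "def add(a, b):", "tests": ["add(1, 2) == 3"]}
data = ilf_round(
    [task],
    generate=lambda p: "def add(a, b):\n    return a - b",   # buggy draft
    get_feedback=lambda p, d: "you subtracted instead of adding",
    refine_with_feedback=lambda p, d, f: "def add(a, b):\n    return a + b",
)
```

In this toy run, the buggy draft fails its unit test, so the refinement (which passes) is kept as a training pair. Filtering on unit tests is what keeps the fine-tuning set clean without requiring feedback at test time.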

Archived Page Preview (Full Archive)

Page Captures

Screenshot: 133.7 KB
PDF Document: 200.0 KB
MHTML Archive: 139.5 KB

Archived Files

Type                  File              Size      Dedup
HTML (Original)       raw.html          46.3 KB   —
HTML (With Banner)    view.html         49.6 KB   —
HTML (Full Archive)   complete.html     55.5 KB   —
Screenshot            screenshot.webp   133.7 KB  —
PDF                   page.pdf          200.0 KB  —
MHTML Archive         complete.mhtml    139.5 KB  —

Total Size: 624.6 KB

Archive Jobs (5)

Job          Status       Started              Completed            Duration  Details
Fetch HTML   ✓ completed  2026-04-10 23:13:33  2026-04-10 23:13:33  0.000s    —
Monolith     ✓ completed  2026-04-10 23:13:33  2026-04-10 23:13:33  0.000s    56883 bytes
Screenshot   ✓ completed  2026-04-10 23:13:33  2026-04-10 23:13:36  3.0s      136916 bytes
PDF          ✓ completed  2026-04-10 23:13:36  2026-04-10 23:13:37  1.000s    204818 bytes
MHTML        ✓ completed  2026-04-10 23:13:37  2026-04-10 23:13:39  2.0s      142804 bytes

Archive Metadata

Archive ID: 616
Link ID: 616
Created At: 2026-04-10 23:13:32
Status: complete
Retry Count: 0
Is NSFW: false
Content Type: text

Link Info

Original URL: https://arxiv.org/abs/2303.16749
Normalized URL: https://arxiv.org/abs/2303.16749
Domain: arxiv.org
Last Archived At: 2026-04-10 23:13:39
