Juhyeon's Blog

Decomposing LLM Self-Correction - The Accuracy-Correction Paradox and Error Depth Hypothesis

February 11, 2026 · 2 min read

Introduction

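Research finding that intrinsic self-correction in LLMs (revising an answer without external feedback) is ineffective in practice
Decomposes self-correction into three sub-abilities: error detection, error localization, and error correction
Cross-model experiments on three models: GPT-3.5, DeepSeek, and Claude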

Related Papers

Research on self-refinement and self-correction
Metacognition in LLMs

Methods

Experiments on GSM8K-Complex (n=500 per model, 346 total errors)
Error detection, localization, and correction are each measured separately
Analysis of the effect of providing an error location hint
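
To make the three measurements concrete, here is a minimal sketch of how per-model rates could be computed from logged trials. It is not the paper's evaluation code: the `Trial` fields, the choice to use all initial errors as the denominator, and the toy records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """One problem attempt. Field names are hypothetical, not taken from the paper."""
    initially_correct: bool          # was the first answer right?
    flagged_error: bool              # did the model say its answer contains an error?
    predicted_step: Optional[int]    # step the model blamed, if any
    true_error_step: Optional[int]   # annotated step of the first real error
    correct_after_revision: bool     # is the revised answer right?


def sub_ability_rates(trials):
    """Detection, localization, and correction rates over initially wrong answers.

    Assumption: every rate uses the same denominator (all initial errors).
    """
    errors = [t for t in trials if not t.initially_correct]
    if not errors:
        return {"detection": None, "localization": None, "correction": None}
    n = len(errors)
    detected = sum(t.flagged_error for t in errors)
    localized = sum(
        t.predicted_step is not None and t.predicted_step == t.true_error_step
        for t in errors
    )
    corrected = sum(t.correct_after_revision for t in errors)
    return {
        "detection": detected / n,
        "localization": localized / n,
        "correction": corrected / n,
    }


# Toy records (made up): one correct answer and three errors.
toy_trials = [
    Trial(True, False, None, None, True),
    Trial(False, True, 2, 2, True),     # detected, localized, corrected
    Trial(False, True, 1, 3, False),    # detected, wrong step, not corrected
    Trial(False, False, None, 4, True), # not detected, yet corrected on retry
]
rates = sub_ability_rates(toy_trials)
print(rates)
```

Keeping the three rates separate is what allows a case like the last toy record, where an error is fixed without ever being flagged, to show up in the numbers.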

Results

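Accuracy-Correction Paradox: the weaker model (GPT-3.5, 66% accuracy) achieves a 1.6x higher correction rate than the stronger model (DeepSeek, 94% accuracy)
Error Depth Hypothesis: the stronger model's errors are deeper, which makes them harder to self-correct
Error detection rates vary widely across models (10% to 82%), but detection does not predict correction
Providing an error location hint actually degrades performance on every model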
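
A correction rate is conditional on the model having made an error, so the two models in the paradox are compared over very different error pools. The sketch below estimates those pools from the reported accuracies and n=500 per model. It assumes the rounded accuracy figures are exact, so the counts are approximate, and the function and variable names are hypothetical.

```python
def estimated_errors(n_problems: int, accuracy: float) -> int:
    """Approximate number of initially wrong answers, given a rounded accuracy."""
    return round(n_problems * (1.0 - accuracy))


N_PER_MODEL = 500  # from the Methods section
reported_accuracy = {"GPT-3.5": 0.66, "DeepSeek": 0.94}  # from the Results section

# Size of the pool on which each model's correction rate is measured.
error_pool = {
    model: estimated_errors(N_PER_MODEL, acc)
    for model, acc in reported_accuracy.items()
}

for model, n_err in error_pool.items():
    print(f"{model}: about {n_err} initial errors out of {N_PER_MODEL}")
```

Under these assumptions the weaker model's rate is computed over a pool several times larger than the stronger model's, which is worth keeping in mind when reading the 1.6x ratio.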

Discussion

Challenges the assumption of a linear relationship between model capability and self-improvement
Important implications for the design of self-refinement pipelines

Properties

Author: Yin Li
Comment: Identifies the Accuracy-Correction Paradox, in which a weaker model shows a higher self-correction rate than a stronger model
IsTargetPaper: true
Journal/Conference: arXiv
Published Year: 2025
Reading Status: Not Started
Review Date: 2026-02-01
Topic: LLM self-correction, metacognition, error detection
URL: https://arxiv.org/abs/2601.00828
