Juhyeon's Blog

CoRE - Enhancing Metacognition with Label-free Self-evaluation in LRMs

February 11, 2026 · 2 min read

Introduction

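Large Reasoning Models (LRMs) have demonstrated impressive capabilities in mathematics, program synthesis, and similar tasks.
However, they suffer from overthinking: excessive and redundant reasoning steps make inference inefficient.
The LRM self-evaluation question: how can a model assess the correctness of its own reasoning trajectory without external labels?
The paper proposes Chain-of-Reasoning Embedding (CoRE): a series of hidden states in latent space that enables label-free self-evaluation.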

Related Papers

Research on Large Reasoning Models
Research on overthinking and reasoning efficiency
Research on self-evaluation and calibration

Methods

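CoRE: a series of hidden states that enables label-free self-evaluation of an LRM's intermediate reasoning steps
Analysis of the geometric properties of CoRE trajectories
Finding: redundant reasoning shows up as cyclical fluctuations, corresponding to repetitive, unconscious reflection/exploration
CoRE-Eval: a training-free, label-free self-evaluation framework that detects these patterns and dynamically decides whether to exit early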
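
To make the idea of detecting cyclical fluctuations in a hidden-state trajectory more concrete, here is a minimal sketch. It is not the paper's implementation: these notes do not say which geometric quantities or detection rule CoRE-Eval uses, so the step magnitude, the turning angle, the autocorrelation score, and every name and parameter below (`step_geometry`, `cyclicality_score`, `should_exit_early`, `window`, `threshold`) are assumptions chosen for illustration. The input is assumed to be an array with one hidden-state vector per reasoning step, already extracted from the model.

```python
import numpy as np


def step_geometry(hidden_states):
    """Return per-step magnitudes and turning angles of a trajectory.

    hidden_states: array of shape (T, d), one vector per reasoning step
    (assumed to be extracted from the model beforehand).
    """
    h = np.asarray(hidden_states, dtype=float)
    deltas = np.diff(h, axis=0)                  # displacement between steps, (T-1, d)
    magnitudes = np.linalg.norm(deltas, axis=1)  # how far each step moves
    a, b = deltas[:-1], deltas[1:]
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    cos = np.clip(np.sum(a * b, axis=1) / denom, -1.0, 1.0)
    angles = np.arccos(cos)                      # turning angle between consecutive steps
    return magnitudes, angles


def cyclicality_score(signal, min_lag=2):
    """Largest normalized autocorrelation at any lag >= min_lag.

    A hypothetical stand-in for a cyclical-fluctuation detector. Note that a
    slow trend also raises this score; a real detector would detrend first.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    energy = float(np.dot(x, x))
    if len(x) < 2 * min_lag or energy < 1e-12:
        return 0.0                               # too short or flat: no cycle evidence
    max_lag = len(x) // 2
    scores = [np.dot(x[:-lag], x[lag:]) / energy
              for lag in range(min_lag, max_lag + 1)]
    return float(max(scores))


def should_exit_early(hidden_states, window=16, threshold=0.5):
    """Decide whether the most recent `window` steps look cyclical."""
    h = np.asarray(hidden_states, dtype=float)
    if len(h) < window + 1:
        return False                             # not enough steps observed yet
    magnitudes, _ = step_geometry(h[-(window + 1):])
    return cyclicality_score(magnitudes) >= threshold
```

In a decoding loop, a check like `should_exit_early` would be called after each new reasoning step, and generation would stop and move to the final answer once it returns `True`. The window size and threshold are untuned placeholders.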

Results

Evaluated on the GSM8K, MATH-500, and AIME benchmarks
Experiments span model sizes from 7B to 32B
CoT length reduced by 13.7% to 33.2%
Answer accuracy improved by about 10%
70.0% accuracy on AIME with the 32B model

Discussion

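Reasoning efficiency improves through stronger metacognitive ability
Self-evaluation is possible without external labels
Redundant reasoning patterns have identifiable geometric characteristics
Future work: extending the approach to other domains and models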


Properties

Author: Haoxi Li et al.
Comment: Improving LRM metacognition without labels
IsTargetPaper: true
Journal/Conference: arXiv
Linked Bases: [[templates.base]]
Published Year: 2025
Reading Status: Not Started
Review Date: 2026-02-03
Topic: LLM Metacognition, Self-Evaluation
URL: https://arxiv.org/abs/2507.06087

