Juhyeon's Blog

Quantifying Self-Awareness of Knowledge in Large Language Models

February 11, 2026 · 2 min read

Introduction

Hallucination prediction is often interpreted as evidence of LLM self-awareness, but it may actually be driven by question-side shortcuts
The paper proposes the Approximate Question-side Effect (AQE) to separate question-awareness from model-side introspection
It also proposes SCAO (Semantic Compression by Answering in One word) to strengthen genuine self-awareness

Related Papers

Research on LLM hallucination detection
Research on self-knowledge probing

Methods

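AQE: a metric that quantifies how much question-side shortcuts contribute to hallucination prediction performance
SCAO: has the model answer in one word, which reduces question-side cues and strengthens the model-side signal
Self-awareness is analyzed across multiple datasets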
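
One way to picture what a question-side metric such as AQE is meant to capture is to compare a predictor that sees only the question against one that uses signals from the model itself. The sketch below is illustrative only and is not the paper's AQE definition: the function names `auroc` and `question_side_gap`, the AUROC-difference proxy, and the toy scores are all assumptions made for this example.

```python
from typing import Dict, Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def question_side_gap(
    question_only_scores: Sequence[float],
    model_scores: Sequence[float],
    labels: Sequence[int],
) -> Dict[str, float]:
    """Hypothetical proxy, NOT the paper's formula.

    question_only_scores: correctness predictions from a probe that sees only
        the question text (assumed to come from some external classifier).
    model_scores: correctness predictions that also use model-side signals.
    labels: 1 if the LLM answered correctly, 0 if it hallucinated.
    """
    q_auc = auroc(question_only_scores, labels)
    m_auc = auroc(model_scores, labels)
    # The gap is the part of the performance not explained by the question alone.
    return {"question_only_auroc": q_auc, "model_auroc": m_auc, "gap": m_auc - q_auc}


# Toy data, invented for illustration.
labels = [1, 1, 0, 0, 1, 0]
question_only = [0.9, 0.4, 0.3, 0.6, 0.7, 0.2]
model_based = [0.9, 0.8, 0.1, 0.3, 0.7, 0.2]
print(question_side_gap(question_only, model_based, labels))
```

If the question-only probe already scores close to the model-based predictor, the gap is small, which would suggest that most of the apparent self-awareness can be read off the question. A large gap would point toward a model-side signal.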

Results

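A substantial portion of existing hallucination prediction performance depends on surface patterns in the question
SCAO achieves consistent performance in settings where question-side cues are reduced
It is effective at promoting genuine self-awareness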

Discussion

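When measuring self-awareness, it is important to distinguish shortcuts from actual introspection
An important methodological contribution that exposes the limitations of existing benchmarks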

Properties

Author: Yeongbin Seo et al.
Comment: Separates and quantifies question-side shortcuts vs. genuine self-awareness in hallucination prediction
IsTargetPaper: true
Journal/Conference: arXiv
Published Year: 2025
Reading Status: Not Started
Review Date: 2026-02-01
Topic: LLM self-awareness, hallucination prediction, introspection
URL: https://arxiv.org/abs/2509.15339
