Juhyeon's Blog
Tag: alignment
6 items
April 13, 2026
How Far Are We From AGI - Are LLMs All We Need
paper
AGI
LLM
survey
capabilities
reasoning
perception
memory
metacognition
alignment
embodied-AI
roadmap
April 13, 2026
LLM_as_Judge_GenToJudgment_2025_LLM_Evaluation
paper
LLM_Evaluation
LLM_as_Judge
taxonomy
EMNLP
alignment
reasoning
bias
survey
April 13, 2026
Training language models to follow instructions with human feedback - InstructGPT
instructgpt
rlhf
alignment
openai
baseline-selection
hyperparameters
April 13, 2026
The Consciousness Cluster - Preferences of Models that Claim to be Conscious
self-consciousness
alignment
fine-tuning
consciousness-cluster
AI-safety
paper
downstream-preferences
emergent-misalignment
April 13, 2026
Taken out of context - On measuring situational awareness in LLMs
paper
situational_awareness
OOC_reasoning
AI_safety
LLM_evaluation
emergent_capabilities
alignment
FSPM_prerequisite
April 13, 2026
The Alignment Problem from a Deep Learning Perspective
paper
alignment
instrumental_convergence
deceptive_alignment
reward_hacking
power_seeking
situational_awareness
RLHF
AI_safety
FSPM
ICLR2024