Juhyeon's Blog

Tag: alignment

6 items

  • April 13, 2026

    How Far Are We From AGI - Are LLMs All We Need

    • paper
    • AGI
    • LLM
    • survey
    • capabilities
    • reasoning
    • perception
    • memory
    • metacognition
    • alignment
    • embodied-AI
    • roadmap
  • April 13, 2026

    LLM_as_Judge_GenToJudgment_2025_LLM_Evaluation

    • paper
    • LLM_Evaluation
    • LLM_as_Judge
    • taxonomy
    • EMNLP
    • alignment
    • reasoning
    • bias
    • survey
  • April 13, 2026

    Training language models to follow instructions with human feedback - InstructGPT

    • instructgpt
    • rlhf
    • alignment
    • openai
    • baseline-selection
    • hyperparameters
  • April 13, 2026

    The Consciousness Cluster - Preferences of Models that Claim to be Conscious

    • self-consciousness
    • alignment
    • fine-tuning
    • consciousness-cluster
    • AI-safety
    • paper
    • downstream-preferences
    • emergent-misalignment
  • April 13, 2026

    Taken out of context - On measuring situational awareness in LLMs

    • paper
    • situational_awareness
    • OOC_reasoning
    • AI_safety
    • LLM_evaluation
    • emergent_capabilities
    • alignment
    • FSPM_prerequisite
  • April 13, 2026

    The Alignment Problem from a Deep Learning Perspective

    • paper
    • alignment
    • instrumental_convergence
    • deceptive_alignment
    • reward_hacking
    • power_seeking
    • situational_awareness
    • RLHF
    • AI_safety
    • FSPM
    • ICLR2024


Created with Quartz v4.5.2 © 2026