Juhyeon's Blog
Tag: fine-tuning
4 items
June 4, 2026
Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behavior
paper-review
LLM-safety
emergent-misalignment
parameter-subspace
linear-mode-connectivity
fine-tuning
interpretability
self-knowledge
weight-geometry
theory
June 4, 2026
Simulating lexical decision times with large language models to supplement megastudies and crowdsourcing
LLM-application
psycholinguistics
lexical-decision
reaction-time
megastudy
fine-tuning
GPT-4o
behavioral-data-simulation
English-Lexicon-Project
critical-review
construct-validity
regression-oracle
pre-training-contamination
June 4, 2026
The Consciousness Cluster - Preferences of Models that Claim to be Conscious
paper
self-consciousness
alignment
fine-tuning
consciousness-cluster
AI-safety
downstream-preferences
emergent-misalignment
June 4, 2026
Training language models to follow instructions with human feedback - InstructGPT
paper
RLHF
alignment
LLM
InstructGPT
PPO
reward-model
OpenAI
NeurIPS2022
human-feedback
fine-tuning