Juhyeon's Blog
Tag: Training
7 items
June 4, 2026
AgentTuning - Enabling Generalized Agent Abilities for LLMs
Agent
InstructionTuning
LLM
AgentLM
Llama2
SFT
Generalization
Training
June 4, 2026
Annotation-Efficient Universal Honesty Alignment for LLMs
Paper
LLM
HonestyAlignment
Calibration
SelfConsistency
AnnotationEfficiency
Training
ICLR2026
Safety
Hallucination
June 4, 2026
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
Training
LLM
Reliability
Calibration
KnowledgeBoundary
SelfAwareness
Hallucination
DST
Alignment
June 4, 2026
Odds-Ratio Preference Optimization (ORPO)
Paper
RL
Alignment
PreferenceOptimization
ORPO
RLHF-Alternative
ReferenceFree
EMNLP2024
Training
June 4, 2026
Playing Atari with Deep Reinforcement Learning
Deep-RL
DQN
Q-Learning
Experience-Replay
Atari
CNN
Value-Based
Training
DeepMind
Foundational
June 4, 2026
Proximal Policy Optimization Algorithms
RL
PolicyGradient
PPO
ActorCritic
TRPO
OpenAI
Training
RLHF
June 4, 2026
Towards Ontology-Enhanced Representation Learning for Large Language Models
paper
LLM
Ontology
RepresentationLearning
ContrastiveLearning
KnowledgeInjection
Biomedical
Training