Juhyeon's Blog

Tag: LLM

43 items

June 4, 2026
A Comprehensive Survey of Self-Evolving AI Agents - A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
June 4, 2026
AgentFold - Long-Horizon Web Agents with Proactive Context Management
June 4, 2026
AgentTuning - Enabling Generalized Agent Abilities for LLMs
June 4, 2026
Annotation-Efficient Universal Honesty Alignment for LLMs
June 4, 2026
Automatic Prompt Optimization with Gradient Descent and Beam Search
June 4, 2026
Belief in the Machine - Investigating Epistemological Blind Spots of Language Models
June 4, 2026
Benchmark Self-Evolving - A Multi-Agent Framework for Dynamic LLM Evaluation
June 4, 2026
Berkeley Function Calling Leaderboard (BFCL)
June 4, 2026
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
June 4, 2026
Can LLMs Lie - Investigation beyond Hallucination
June 4, 2026
Cognitive Dissonance - Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness
June 4, 2026
Concept Incongruence - An Exploration of Time and Death in Role Playing
June 4, 2026
Do Retrieval Augmented Language Models Know When They Don't Know
June 4, 2026
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
June 4, 2026
GraphReader - Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models
June 4, 2026
How Far Are We From AGI - Are LLMs All We Need
June 4, 2026
If an LLM Were a Character Would It Know Its Own Story - Evaluating Lifelong Learning in LLMs
June 4, 2026
Incomplete Tasks Induce Shutdown Resistance in Some Frontier LLMs
June 4, 2026
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
June 4, 2026
Know Your Limits - A Survey of Abstention in Large Language Models
June 4, 2026
Knowing What LLMs DO NOT Know - A Simple Yet Effective Self-Detection Method
June 4, 2026
LACIE - Listener-Aware Finetuning for Confidence Calibration in Large Language Models
June 4, 2026
LLM Theory of Mind and Alignment - Opportunities and Risks
June 4, 2026
Logic-RL - Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
June 4, 2026
LongBench - A Bilingual, Multitask Benchmark for Long Context Understanding
June 4, 2026
MEM1 - Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents
June 4, 2026
MemAgent - Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
June 4, 2026
Motivation in Large Language Models
June 4, 2026
Reasoning Models Struggle to Control their Chains of Thought
June 4, 2026
Social-R1 - Towards Human-like Social Reasoning in LLMs
June 4, 2026
Surgical, Cheap, and Flexible - Mitigating False Refusal in Language Models via Single Vector Ablation
June 4, 2026
The Geometry of Truth - Emergent Linear Structure in LLM Representations of True and False Statements
June 4, 2026
Thinking Faithful and Stable - Mitigating Hallucinations in LLMs via Internal Consistency
June 4, 2026
Towards Ontology-Enhanced Representation Learning for Large Language Models
June 4, 2026
Training Compute-Optimal Large Language Models
June 4, 2026
Training language models to follow instructions with human feedback - InstructGPT
June 4, 2026
Uncertainty-Based Abstention in LLMs Improves Safety
June 4, 2026
Weak-to-Strong Generalization - Eliciting Strong Capabilities With Weak Supervision
June 4, 2026
Harb et al. (2025) - Evaluating GPT-4o and Gemini on NimStim Facial Emotion Recognition
June 4, 2026
Using large language models to estimate features of multi-word expressions: Concreteness, valence, arousal
June 4, 2026
GPT-4 Emulates Average-Human Emotional Cognition from a Third-Person Perspective
June 4, 2026
Affective Computing in the Era of Large Language Models: A Survey from the NLP Perspective
June 4, 2026
LLMs Do Not Simulate Human Psychology (2025)
