Juhyeon's Blog

Tag: ICLR

10 items

  • April 13, 2026

    ReAct - Synergizing Reasoning and Acting in Language Models

    • paper
    • Reasoning
    • Acting
    • LLM_Agent
    • Prompting
    • CoT
    • Tool_Use
    • ICLR
  • April 13, 2026

    ALFWorld - Aligning Text and Embodied Environments for Interactive Learning

    • paper
    • benchmark
    • embodied_agent
    • ALFWorld
    • BUTLER
    • text_transfer
    • ICLR
    • UW
    • MSR
  • April 13, 2026

    AgentBench - Evaluating LLMs as Agents

    • paper
    • benchmark
    • agent
    • AgentBench
    • multi_environment
    • Tsinghua
    • ICLR
  • April 13, 2026

    Aligning AI With Shared Human Values

    • paper
    • benchmark
    • ethics
    • moral_judgment
    • AI_alignment
    • safety
    • ICLR
  • April 13, 2026

    GAIA - A Benchmark for General AI Assistants

    • paper
    • benchmark
    • general_AI
    • GAIA
    • tool_use
    • assistant
    • Meta_FAIR
    • ICLR
  • April 13, 2026

    GLUE - A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

    • paper
    • GLUE
    • benchmark
    • multi-task
    • NLU
    • QNLI
    • RTE
    • transfer-learning
    • ICLR
  • April 13, 2026

    GPQA - A Graduate-Level Google-Proof Q&A Benchmark

    • paper
    • benchmark
    • expert_level
    • GPQA
    • science
    • graduate
    • Google_proof
    • ICLR
  • April 13, 2026

    MathVista - Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

    • paper
    • benchmark
    • mathematics
    • multimodal
    • visual_reasoning
    • MathVista
    • ICLR
  • April 13, 2026

    Measuring Massive Multitask Language Understanding

    • paper
    • benchmark
    • MMLU
    • multitask
    • knowledge
    • language_understanding
    • ICLR
  • April 13, 2026

    WebArena - A Realistic Web Environment for Building Autonomous Agents

    • paper
    • benchmark
    • web_agent
    • WebArena
    • autonomous_agent
    • CMU
    • ICLR


Created with Quartz v4.5.2 © 2026
