Juhyeon's Blog

Tag: LLM-Safety

2 posts

June 4, 2026
HarmBench - A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
June 4, 2026
JULI - Jailbreak Large Language Models by Self-Introspection