by Moonlight

  1. 🧠 ReAct proposes a new prompt-based paradigm for large language models (LLMs) that interleaves reasoning and acting so that each strengthens the other.
  2. 💡 The approach uses reasoning traces to induce, track, and update action plans, and uses actions to interact with an external environment. This overcomes the hallucination and error-propagation problems of Chain-of-Thought (CoT) reasoning and improves interpretability.
  3. 🚀 On a range of language and decision-making benchmarks, including HotpotQA, ALFWorld, and WebShop, ReAct substantially outperforms state-of-the-art baselines while being prompted with only one or two in-context examples.
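
The interleaving of reasoning and acting described above can be sketched as a simple loop: the model emits a thought and an action, the environment returns an observation, and the observation is appended to the prompt for the next step. The following is a minimal illustration and not the authors' implementation. The names `TOY_KB`, `search`, `scripted_llm`, and `react_loop`, as well as the `Action: name[argument]` text format parsed here, are assumptions made for this example, and `scripted_llm` is a fixed stand-in for a real LLM call.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical toy knowledge base standing in for an external environment.
TOY_KB: Dict[str, str] = {
    "ReAct": "ReAct interleaves reasoning traces with actions.",
}


def search(query: str) -> str:
    """Toy environment action: look up a query in TOY_KB."""
    return TOY_KB.get(query, "No result found.")


# Actions the agent may call (an assumption for this sketch).
TOOLS: Dict[str, Callable[[str], str]] = {"search": search}

# Parses lines such as "Action: search[ReAct]".
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def scripted_llm(prompt: str) -> str:
    """Stand-in for a real LLM: returns a fixed thought/action pair
    depending on whether the prompt already contains an observation."""
    if "Observation:" not in prompt:
        return "Thought: I should look up ReAct.\nAction: search[ReAct]"
    return (
        "Thought: The observation answers the question.\n"
        "Action: finish[It interleaves reasoning and acting.]"
    )


def react_loop(
    question: str, llm: Callable[[str], str], max_steps: int = 5
) -> Tuple[str, List[str]]:
    """Alternate model steps (thought + action) with environment observations."""
    trajectory = [f"Question: {question}"]
    for _ in range(max_steps):
        step = llm("\n".join(trajectory))  # reasoning trace plus chosen action
        trajectory.append(step)
        match = ACTION_RE.search(step)
        if match is None:
            break  # no parsable action, stop early
        name, arg = match.group(1), match.group(2)
        if name == "finish":
            return arg, trajectory  # the model has committed to an answer
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"Unknown action: {name}"
        trajectory.append(f"Observation: {observation}")  # feed result back
    return "", trajectory


answer, trajectory = react_loop("What does ReAct do?", scripted_llm)
print(answer)
print("\n".join(trajectory))
```

Because every thought, action, and observation is kept in `trajectory`, the full decision path can be inspected afterwards, which is the interpretability benefit the summary mentions. Replacing `scripted_llm` with a call to an actual model, with one or two worked trajectories placed in the prompt, would be the natural next step.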