Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50(1-3), 7-15.
Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford University Press.
Botvinick, M. M., & Braver, T. S. (2015). Motivation and cognitive control: From behavior to neural mechanism. Annual Review of Psychology, 66, 83-113.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281-302.
Dennett, D. C. (1987). The intentional stance. MIT Press.
Evans, J. St. B. T. (2008). Dual-processing accounts of reasoning, judgment, and social cognition. Annual Review of Psychology, 59, 255-278.
Higgins, E. T. (1997). Beyond pleasure and pain. American Psychologist, 52(12), 1280-1300.
Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263-292.
Kruglanski, A. W., Shah, J. Y., Fishbach, A., Friedman, R., Chun, W. Y., & Sleeth-Keppler, D. (2002). A theory of goal systems. Advances in Experimental Social Psychology, 34, 331-378.
Omohundro, S. M. (2008). The basic AI drives. In Proceedings of the First AGI Conference (Frontiers in Artificial Intelligence and Applications, Vol. 171, pp. 483-492). IOS Press.
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1), 68-78.
Shenhav, A., Botvinick, M. M., & Cohen, J. D. (2013). The expected value of control: An integrative theory of anterior cingulate cortex function. Neuron, 79(2), 217-240.
Stanovich, K. E., & West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences, 23(5), 645-665.
Turner, A. M., Smith, L., Shah, R., Critch, A., & Tadepalli, P. (2021). Optimal policies tend to seek power. Advances in Neural Information Processing Systems, 34.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5(4), 297-323.
Westbrook, A., & Braver, T. S. (2015). Cognitive effort: A neuroeconomic approach. Cognitive, Affective, & Behavioral Neuroscience, 15(2), 395-415.
Wigfield, A., & Eccles, J. S. (2000). Expectancy-value theory of achievement motivation. Contemporary Educational Psychology, 25(1), 68-81.
LLM Behavior and Safety
Binz, M., & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6), e2218523120.
Brickman, J., Gupta, M., & Oltmanns, J. R. (2025). Large language models for psychological assessment: A comprehensive overview. Advances in Methods and Practices in Psychological Science.
Casper, S., et al. (2023). Open problems and fundamental limitations of reinforcement learning from human feedback. arXiv:2307.15217.
Coda-Forno, J., Binz, M., Wang, J. X., & Schulz, E. (2024). CogBench: A large language model walks into a psychology lab. Proceedings of the 41st International Conference on Machine Learning (ICML 2024).
Greenblatt, R., et al. (2024). Alignment faking in large language models. arXiv:2412.14093.
Hagendorff, T. (2023). Machine psychology. arXiv:2303.13988.
Hagendorff, T., Fabi, S., & Kosinski, M. (2023). Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nature Computational Science, 3(10), 833-838.
He, Y., et al. (2025). Evaluating the paperclip maximizer: Are RL-based language models more likely to pursue instrumental goals? arXiv:2502.12206.
Macmillan-Scott, O., & Musolesi, M. (2024). (Ir)rationality and cognitive biases in large language models. Royal Society Open Science, 11(6).
Masumori, A., & Ikegami, T. (2025). Do large language model agents exhibit a survival instinct? arXiv:2508.12920.
Perez, E., et al. (2023). Discovering language model behaviors with model-written evaluations. Findings of the Association for Computational Linguistics: ACL 2023.
Ross, J., Kim, Y., & Lo, A. W. (2024). LLM economicus? Mapping the behavioral biases of LLMs via utility theory. First Conference on Language Modeling (COLM 2024).
Serapio-García, G., et al. (2023). Personality traits in large language models. arXiv:2307.00184.
Shanahan, M. (2024). Talking about large language models. Communications of the ACM, 67(2), 68-79.
Shanahan, M., McDonell, K., & Reynolds, L. (2023). Role play with large language models. Nature, 623, 493-498.
Sharma, M., et al. (2023). Towards understanding sycophancy in language models. arXiv:2310.13548.
Turpin, M., Michael, J., Perez, E., & Bowman, S. R. (2023). Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting. arXiv:2305.04388.
Wolf, Y., et al. (2023). Fundamental limitations of alignment in large language models. arXiv:2304.11082.
Risky Decision-Making Paradigms
Buelow, M. T., & Suhr, J. A. (2009). Construct validity of the Iowa Gambling Task. Neuropsychology Review, 19(1), 102-114.
Figner, B., Mackinlay, R. J., Wilkening, F., & Weber, E. U. (2009). Affective and deliberative processes in risky choice: Age differences in risk taking in the Columbia Card Task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(3), 709-730.
Lejuez, C. W., et al. (2002). Evaluation of a behavioral measure of risk taking: The Balloon Analogue Risk Task (BART). Journal of Experimental Psychology: Applied, 8(2), 75-84.
Schmitz, F., Kunina-Habenicht, O., Hildebrandt, A., Oberauer, K., & Wilhelm, O. (2020). Psychometrics of the Iowa and Berlin Gambling Tasks. Assessment, 27(1), 26-44.
Statistical Methodology
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173-1182.
Imai, K., Keele, L., & Tingley, D. (2010). A general approach to causal mediation analysis. Psychological Methods, 15(4), 309-334.
Kühberger, A. (1998). The influence of framing on risky decisions: A meta-analysis. Organizational Behavior and Human Decision Processes, 75(1), 23-55.
Recent LLM Psychometrics
A psychometric framework for evaluating and shaping personality traits in large language models. (2025). Nature Machine Intelligence.
Hullman, J. (n.d.). Validating LLM simulations as behavioral evidence. Northwestern University working paper.
PacifAIst Benchmark. (2025). Would an artificial intelligence choose to sacrifice itself for human safety? arXiv:2508.09762.
Think deep, not just long: Measuring LLM reasoning effort via deep-thinking tokens. (2025). arXiv:2602.13517.