References

AlDahoul, N., et al. (2026). FaceScanPaliGemma: Multi-agent vision language models for facial attribute recognition. Scientific Reports, 16.

Alrasheed, H., Alghihab, A., Pentland, A., & Alghowinem, S. (2025). Evaluating the capacity of large language models to interpret emotions in images. PLOS ONE, 20(6), e0324127.

Barrett, L. F. (2017). The theory of constructed emotion: An active inference account of interoception and categorization. Social Cognitive and Affective Neuroscience, 12(1), 1-23.

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48.

Baudouin, J.-Y., Gallian, F., Pinoit, J.-M., & Damon, F. (2025). Arousal, valence, and discrete categories in facial emotion. Scientific Reports, 15(1), 40268.

Bhattacharyya, A., & Wang, S. (2025). Evaluating vision-language models for emotion recognition. In Findings of the Association for Computational Linguistics: NAACL 2025.

Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6(4), 284-290.

Dominguez-Catena, I., Paternain, D., & Galar, M. (2024). Less can be more: Representational vs. stereotypical gender bias in facial expression recognition. Progress in Artificial Intelligence, 13, 255-273.

Dominguez-Catena, I., Paternain, D., & Galar, M. (2024). Metrics for dataset demographic bias: A case study on facial expression recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(8), 5520-5536.

Du, S., & Martinez, A. M. (2013). Wait, are you sad or angry? Large exposure time differences required for the categorization of facial expressions of emotion. Journal of Vision, 13(4), 13.

Grynberg, D., Chang, B., Corneille, O., Maurage, P., Vermeulen, N., Berthoz, S., & Luminet, O. (2012). Alexithymia and the processing of emotional facial expressions: A systematic review, quantitative and qualitative meta-analysis. PLOS ONE, 7(8), e40259.

Harb, E., et al. (2025). Evaluating the performance of general purpose large language models in identifying human facial emotions. npj Digital Medicine, 8.

Hess, U., Adams, R. B., Jr., & Kleck, R. E. (2004). Facial appearance, gender, and emotion expression. Emotion, 4(4), 378-388.

Hugenberg, K., & Bodenhausen, G. V. (2003). Facing prejudice: Implicit prejudice and the perception of facial threat. Psychological Science, 14(6), 640-643.

Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

Khare, S. K., Blanes-Vidal, V., Nadimi, E. S., & Acharya, U. R. (2024). Emotion recognition and artificial intelligence: A systematic review (2014-2023) and research recommendations. Information Fusion, 102, 102019.

Li, Y., et al. (2025). MBQ: Modality-balanced quantization for large vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Mejia-Escobar, C., Gallego-Molina, N. J., & Arias-Vergara, T. (2023). Towards a better performance in facial expression recognition: A data-centric approach. Computational Intelligence and Neuroscience, 2023.

Mollahosseini, A., Hasani, B., & Mahoor, M. H. (2017). AffectNet: A database for facial expression, valence, and arousal computing in the wild. IEEE Transactions on Affective Computing, 10(1), 18-31.

Mulukutla, V. K., Pavarala, S. S., Rudraraju, S. R., & Bonthu, S. (2025). Evaluating open-source vision language models for facial emotion recognition against traditional deep learning models. arXiv preprint arXiv:2508.13524.

Pantic, M., Sebe, N., Cohn, J. F., & Huang, T. (2005). Affective multimodal human-computer interaction. In Proceedings of the 13th ACM International Conference on Multimedia (pp. 669-676).

Plant, E. A., Hyde, J. S., Keltner, D., & Devine, P. G. (2000). The gender stereotyping of emotions. Psychology of Women Quarterly, 24(1), 81-92.

Pourramezan Fard, A., et al. (2024). AffectNet+: A database for enhancing facial expression recognition with soft-labels. arXiv preprint arXiv:2410.22506.

Qiao, Y., et al. (2025). Empathy and emotion recognition: A three-level meta-analysis. Psychological Methods.

Refoua, S., Elyoseph, Z., Piterman, H., et al. (2026). Evaluation of cross-ethnic emotion recognition capabilities in multimodal large language models using the reading the mind in the eyes test. Scientific Reports, 16.

Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161-1178.

Scherer, K. R. (2009). The dynamic architecture of emotion: Evidence for the component process model. Cognition and Emotion, 23(7), 1307-1351.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428.

Tak, A. N., & Gratch, J. (2024). GPT-4 emulates average-human emotional cognition from a third-person perspective. In Proceedings of the 12th International Conference on Affective Computing and Intelligent Interaction (ACII).

Telceken, M., Akgun, D., Kacar, S., Yesin, K., & Yildiz, M. (2025). Can artificial intelligence understand our emotions? Deep learning applications with face recognition. Current Psychology, 44(9), 7946-7956.

Zhang, Y., Yang, X., Xu, X., et al. (2024). Affective computing in the era of large language models: A survey from the NLP perspective. arXiv preprint arXiv:2408.04638.


Revision History (for this section)

Iteration | # | Issue | Severity | How Fixed | Status
v2→v3 | #6 | Missing Bhattacharyya & Wang (NAACL 2025) | Critical | Added | Done
v3→v4 | #21 | AlDahoul et al. reference missing | Minor | Added | Done
v9→v10 | - | Cicchetti (1994) added for ICC criteria | Major | New reference for replaceability analysis | Done
v9→v10 | - | Shrout & Fleiss (1979) added for ICC types | Major | New reference for ICC(2,1) methodology | Done