Summary

An extended version of ANOVA/OLS.

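Where OLS treats every factor as a fixed constant (a mean) in the population,
an LMM treats some factors as stochastic (random effects, drawn from a probability distribution) that reflect variation between groups or subjects.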

Implementation

import statsmodels.formula.api as smf

# df: long-format DataFrame with columns RT, condition, group, subject
model = smf.mixedlm(
    "RT ~ C(condition) * C(group)",  # group = between-subject factor
    data=df,
    groups=df["subject"],  # one random-effects cluster per subject
    re_formula="~C(condition)",  # random intercept + random slope by condition
)
result = model.fit()
print(result.summary())

Expected Output

Mixed Linear Model Regression Results
====================================================
Model: MixedLM    Dependent Variable: RT
Group variable: subject
No. Observations: 20
----------------------------------------------------
                    Coef.   Std.Err.   z      P>|z|
Intercept           503.2   9.1        55.3   <0.001
C(condition)[T.B]   -40.9   6.5        -6.3   <0.001
Group Var           870.4
Residual            104.3

====================================================
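The `mixedlm` call above assumes a long-format `df` that is not shown. The following is a minimal sketch that builds a synthetic one so the call can be run end to end. The design (20 subjects, two groups, two conditions, 10 trials per cell), the group labels `ctrl`/`exp`, and every numeric value are illustrative assumptions, so the fitted numbers will not match the output shown above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical design: 20 subjects, group is between-subject, condition is within-subject.
rows = []
for s in range(20):
    group = "ctrl" if s < 10 else "exp"
    b0 = rng.normal(0.0, 30.0)  # subject-specific intercept deviation (assumed SD)
    b1 = rng.normal(0.0, 10.0)  # subject-specific deviation of the condition effect (assumed SD)
    for cond in ("A", "B"):
        for _ in range(10):
            mu = 500.0 + b0 + (-40.0 + b1) * (cond == "B")
            rows.append({"subject": s, "group": group, "condition": cond,
                         "RT": mu + rng.normal(0.0, 20.0)})
df = pd.DataFrame(rows)

model = smf.mixedlm(
    "RT ~ C(condition) * C(group)",
    data=df,
    groups=df["subject"],
    re_formula="~C(condition)",  # random intercept + random slope by condition
)
result = model.fit()

print(result.fe_params)  # fixed effects: intercept, condition, group, interaction
print(result.cov_re)     # 2x2 covariance of the random intercept and random slope
```

With this formula the fixed-effects table has four rows (intercept, condition, group, and their interaction), and `cov_re` holds the variance of the random intercept, the variance of the random slope, and their covariance.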

Random Effect

NOTE

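“Random effect”: in the end, it corresponds to the unit that is measured repeatedly.
Put simply, it models the non-independence (dependence) of the data.

Within-subject factor (each condition is repeated within the same subject)
→ subject must be included as a random effect.

To model condition-specific variance (the condition effect differing from subject to subject),
→ add a random slope (`re_formula` in statsmodels).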
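To make the non-independence point concrete, here is a minimal NumPy sketch. It assumes a random-intercept structure with made-up standard deviations (`sigma_b` and `sigma_e` are illustrative, not taken from any dataset) and checks that two measurements from the same subject are correlated by roughly $σ_b^2 / (σ_b^2 + σ_e^2)$, the intraclass correlation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects = 5000
sigma_b = 30.0  # between-subject SD (assumed)
sigma_e = 20.0  # within-subject noise SD (assumed)

# Each subject has one offset that is shared by both of their measurements.
subject_offset = rng.normal(0.0, sigma_b, size=n_subjects)
y1 = 500.0 + subject_offset + rng.normal(0.0, sigma_e, size=n_subjects)
y2 = 500.0 + subject_offset + rng.normal(0.0, sigma_e, size=n_subjects)

empirical_r = np.corrcoef(y1, y2)[0, 1]
theoretical_icc = sigma_b**2 / (sigma_b**2 + sigma_e**2)

print(f"empirical correlation: {empirical_r:.3f}")
print(f"theoretical ICC:       {theoretical_icc:.3f}")
```

The shared offset is what makes repeated measurements dependent. A model that ignores it treats the two measurements as independent draws, which is the assumption the random effect for subject is there to relax.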

OLS vs Linear Mixed Effects Model

Summary

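The difference is whether the effect of a feature is treated as fixed or not.
OLS:

The effect of a group (as defined by a categorical feature) is treated as a single value (fixed effect).
→ The goal is the average effect ( $β$ ) over the whole population of interest.

Individual differences within a group are not taken into account.

$y_{ij} = β_{0} + β_{1} x_{ij} + ε_{ij}, \quad ε_{ij} \sim N (0, σ^{2})$

Here $β_{0}$ and $β_{1}$ are the same for every i (e.g., subject) and j (measurement).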

LMM:

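Proposed to account for individual differences within groups on top of what OLS models.
$y_{ij} = (β_{0} + b_{0 i}) + (β_{1} + b_{1 i}) x_{ij} + ε_{ij}$

$β_{0}$ , $β_{1}$ : population averages (fixed effects)

$b_{0 i}$ , $b_{1 i}$ : deviations of subject i (random effects)

$\begin{bmatrix} b_{0 i} \\ b_{1 i} \end{bmatrix} \sim N \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} σ_{b 0}^{2} & ρ σ_{b 0} σ_{b 1} \\ ρ σ_{b 0} σ_{b 1} & σ_{b 1}^{2} \end{bmatrix} \right)$

→ Each subject has their own intercept and slope.
→ However, these values are not “entirely free constants”;
they are regarded as “random draws from some distribution (usually a normal distribution)”.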
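The generative story behind the LMM equation can be sketched in NumPy: build the covariance matrix of $(b_{0 i}, b_{1 i})$, draw one pair per subject, and plug them into the equation for $y_{ij}$. All parameter values below are illustrative assumptions, and the two-point design (x = 0 and x = 1) is chosen only to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fixed effects and variance parameters (all values are illustrative assumptions).
beta0, beta1 = 500.0, -40.0
sigma_b0, sigma_b1, rho = 30.0, 10.0, -0.3
sigma_e = 20.0

# Covariance matrix of (b_0i, b_1i).
cov_b = np.array([
    [sigma_b0**2,               rho * sigma_b0 * sigma_b1],
    [rho * sigma_b0 * sigma_b1, sigma_b1**2],
])

n_subjects = 20000
b = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_b, size=n_subjects)  # shape (n_subjects, 2)

# Two measurements per subject, at x = 0 and x = 1.
x = np.tile([0.0, 1.0], (n_subjects, 1))
eps = rng.normal(0.0, sigma_e, size=x.shape)
y = (beta0 + b[:, [0]]) + (beta1 + b[:, [1]]) * x + eps

subject_slopes = y[:, 1] - y[:, 0]  # noisy per-subject estimate of beta1 + b_1i
print("mean slope across subjects:", subject_slopes.mean())
print("empirical cov of (b0, b1):\n", np.cov(b, rowvar=False))
```

The per-subject slopes scatter around `beta1` rather than being equal to it, which is the sense in which the subject-level values are draws from a distribution rather than free constants.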

Parameters

NOTE

Taking the basic regression formula to be the following,
$Y_{ij} = β_{0} + β_{1} X_{ij} + ε_{ij}$

random intercept

Summary

$Y_{ij} = β_{0} + β_{1} X_{ij} + b_{0 i} + ε_{ij}$

$b_{0 i}$ = random intercept of participant i
→ One intercept per participant, i.e., one per distinct value passed to MixedLM's `groups` parameter.
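A minimal sketch of the random-intercept model in statsmodels follows. The data are simulated, and the column names (`participant`, `X`, `Y`), the labels `p00`...`p14`, and all numeric values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

n_participants, n_obs = 15, 12
rows = []
for i in range(n_participants):
    b0_i = rng.normal(0.0, 2.0)  # participant i's intercept deviation (assumed SD)
    for _ in range(n_obs):
        x = rng.uniform(0.0, 5.0)
        y = 2.0 + 1.5 * x + b0_i + rng.normal(0.0, 1.0)
        rows.append({"participant": f"p{i:02d}", "X": x, "Y": y})
df = pd.DataFrame(rows)

# No re_formula: MixedLM defaults to a random intercept per level of `groups`.
result = smf.mixedlm("Y ~ X", data=df, groups="participant").fit()

print(result.fe_params)              # estimates of beta_0 and beta_1
print(len(result.random_effects))    # one entry per participant
print(result.random_effects["p00"])  # predicted b_0i for the first participant
```

`result.random_effects` is a dict keyed by the `groups` labels, so its length equals the number of participants.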

random slope

Summary

$Y_{ij} = β_{0} + (β_{1} + b_{1 i}) X_{ij} + b_{0 i} + ε_{ij}$

$b_{1 i}$ = random slope of participant i with respect to X

→ Given the same X value, different participants show a different effect (slope).
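A corresponding sketch for the random-slope model, again on simulated data with assumed column names and values. Passing `re_formula="~X"` asks statsmodels for a random intercept plus a random slope on `X` for each participant.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

n_participants, n_obs = 15, 12
rows = []
for i in range(n_participants):
    b0_i = rng.normal(0.0, 2.0)  # intercept deviation (assumed SD)
    b1_i = rng.normal(0.0, 0.5)  # slope deviation (assumed SD)
    for _ in range(n_obs):
        x = rng.uniform(0.0, 5.0)
        y = 2.0 + (1.5 + b1_i) * x + b0_i + rng.normal(0.0, 1.0)
        rows.append({"participant": f"p{i:02d}", "X": x, "Y": y})
df = pd.DataFrame(rows)

result = smf.mixedlm("Y ~ X", data=df, groups="participant", re_formula="~X").fit()

print(result.cov_re)  # 2x2 covariance of the (intercept, slope) random effects

# Participant-specific slope = fixed slope + predicted random slope.
slopes = {p: result.fe_params["X"] + re["X"] for p, re in result.random_effects.items()}
print(slopes["p00"])
```

Each entry of `slopes` is the estimate of $β_{1} + b_{1 i}$ for one participant, so the dict shows directly how the effect of X differs between participants.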

variance component formula

Summary

$Y_{ijk} = β_{0} + β_{1} X_{ijk} + b_{0 i} + u_{0 j}^{(\text{vowel})} + w_{0 k}^{(\text{word})} + ε_{ijk}$

$u_{0 j}^{(\text{vowel})}$ = random intercept of vowel j

$w_{0 k}^{(\text{word})}$ = random intercept of word k

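Role

A structure that gives each level of a particular categorical factor its own random effect, drawn from a variance shared by that factor.

Each of these effects is still an additive shift on the observations at that level, as $u_{0 j}$ and $w_{0 k}$ are in the equation above. What differs from a `re_formula` term is that the effects within one component are independent draws with a single common variance, and no covariance with the other random effects is estimated.

They therefore contribute additional variance components that are independent of the group-level random intercept.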
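A hedged sketch of how such variance components can be specified with the `vc_formula` argument of statsmodels' `mixedlm`. The data are simulated, and the column names, level labels, and numeric values are assumptions. One caveat matters here: statsmodels realizes variance components independently within each `groups` level, so this sketch draws the vowel and word effects separately for each participant (nested), which is not the same as a fully crossed design.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)

participants = [f"p{i:02d}" for i in range(12)]
vowels = ["a", "i", "u"]
words = ["w1", "w2", "w3", "w4"]

rows = []
for p in participants:
    b0 = rng.normal(0.0, 2.0)                      # participant intercept (assumed SD)
    u = {v: rng.normal(0.0, 1.0) for v in vowels}  # vowel effects, drawn per participant
    w = {k: rng.normal(0.0, 1.5) for k in words}   # word effects, drawn per participant
    for v in vowels:
        for k in words:
            for _ in range(2):
                x = rng.uniform(0.0, 5.0)
                y = 2.0 + 1.5 * x + b0 + u[v] + w[k] + rng.normal(0.0, 1.0)
                rows.append({"participant": p, "vowel": v, "word": k, "X": x, "Y": y})
df = pd.DataFrame(rows)

# One variance component per factor; "0 + C(...)" creates one indicator column per level.
vc = {"vowel": "0 + C(vowel)", "word": "0 + C(word)"}
model = smf.mixedlm("Y ~ X", data=df, groups="participant",
                    re_formula="1", vc_formula=vc)
result = model.fit()

print(result.fe_params)  # beta_0 and beta_1
print(result.cov_re)     # variance of the participant random intercept
print(result.vcomp)      # one estimated variance per component
```

`result.vcomp` contains one variance per key of `vc`, which is the "single common variance per component" structure: every vowel effect shares one variance and every word effect shares another.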

Juhyeon's Blog
