Latent Profile Analysis

Person-Centered Quantitative Methods · 01

A person-centered approach to identifying hidden subgroups


New to Person-Centered Methods?

Most statistical methods you've learned (regression, ANOVA, SEM) ask: "How do variables relate to each other?" But what if different people follow entirely different patterns? Latent Profile Analysis (LPA) flips the question — instead of studying variables, it studies people, looking for hidden subgroups who share similar response patterns across multiple indicators. Prerequisites: basic understanding of means, variances, and the concept of probability distributions. Software: R (tidyLPA) or Mplus.


Variable-Centered vs. Person-Centered


Variable-Centered

Focus: Relationships among variables

Assumption: Single homogeneous population

Examples: Regression, SEM, ANOVA, CFA

"Does higher intrinsic motivation predict better achievement?"

Person-Centered

Focus: Identifying subgroups of individuals

Assumption: Heterogeneous — distinct latent subgroups exist

Examples: LPA, LCA, Cluster Analysis

"What types of motivation patterns exist among students?"


Person-Centered Analytical Methods


Cluster Analysis


  • Distance-based grouping (e.g., K-means)
  • No underlying statistical model
  • No formal test for optimal clusters
  • Hard assignment only
Today's Focus

LPA

  • Continuous indicators
  • Model-based (Finite Mixture Model)
  • Statistical fit indices for class enumeration
  • Probabilistic assignment

LCA

  • Categorical indicators
  • Same framework as LPA
  • Item response probabilities instead of means
  • Probabilistic assignment

How LPA Works


LPA is built on a finite mixture model: the observed data are treated as a mixture of K normal distributions, each representing a hidden subgroup. The EM algorithm estimates each subgroup's mean and variance on every indicator, along with each person's probability of belonging to each subgroup; the number of subgroups K itself is chosen by comparing models fitted with different values of K.

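As a rough sketch of that machinery, scikit-learn's GaussianMixture fits the same kind of finite mixture via EM. The data below are simulated; the two-profile setup, indicator values, and seeds are illustrative rather than taken from any study:

```python
# Finite mixture of normals fitted by EM, the model underlying LPA.
# Simulated data: two hidden subgroups measured on two continuous indicators.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
low = rng.normal(loc=[2.0, 2.5], scale=0.5, size=(150, 2))   # subgroup 1
high = rng.normal(loc=[5.5, 6.0], scale=0.5, size=(150, 2))  # subgroup 2
X = np.vstack([low, high])

# covariance_type="diag" mirrors a common LPA default: variances are
# estimated per indicator, within-class covariances are fixed to zero.
model = GaussianMixture(n_components=2, covariance_type="diag",
                        n_init=10, random_state=1).fit(X)

print(model.means_)    # estimated profile means on each indicator
print(model.weights_)  # estimated class proportions
```

With real data, X would be the n × p matrix of indicator scores; the estimated means per class are what the profile plot later visualizes.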

01

Probabilistic Classification


Each individual has a probability of belonging to each class — not hard assignment. Classification uses the highest posterior probability.

e.g., A student may have posterior probabilities of .82, .13, and .05 for Profiles 1, 2, and 3 — they would be assigned to Profile 1.

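That assignment rule is easy to express in code. A minimal sketch with scikit-learn's GaussianMixture as a stand-in for the LPA model (the two-profile data are simulated for illustration):

```python
# Posterior class probabilities and modal (highest-posterior) assignment.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(2.0, 0.6, size=(200, 3)),   # simulated subgroup A
               rng.normal(5.0, 0.6, size=(200, 3))])  # simulated subgroup B

model = GaussianMixture(n_components=2, covariance_type="diag",
                        random_state=0).fit(X)

posteriors = model.predict_proba(X)   # one row per person, one column per profile
assigned = posteriors.argmax(axis=1)  # modal assignment: highest posterior wins

# A row like [0.82, 0.18] yields assignment to profile 0, while the 0.18
# of uncertainty is retained for entropy and three-step follow-up analyses.
print(posteriors[0], assigned[0])
```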
02

Statistical Fit Indices


BIC (Bayesian Information Criterion) and BLRT (Bootstrapped Likelihood Ratio Test) allow systematic model comparison. Entropy measures classification precision (0–1); values > .80 indicate good quality (Nylund et al., 2007).

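Entropy is computed from the matrix of posterior probabilities. A minimal sketch of the standard relative-entropy formula, E = 1 − Σ(−p ln p) / (N ln K), using made-up posterior matrices:

```python
# Relative entropy of a mixture solution: 1 = crisp classification, 0 = none.
import numpy as np

def relative_entropy(posteriors):
    """E = 1 - sum(-p * ln p) / (N * ln K), summed over persons and classes."""
    n, k = posteriors.shape
    p = np.clip(posteriors, 1e-12, 1.0)  # guard against log(0)
    return 1.0 - (-(p * np.log(p)).sum()) / (n * np.log(k))

crisp = np.array([[0.99, 0.01], [0.01, 0.99], [0.98, 0.02]])  # clean separation
fuzzy = np.array([[0.55, 0.45], [0.48, 0.52], [0.50, 0.50]])  # ambiguous

print(relative_entropy(crisp))  # close to 1: good classification quality
print(relative_entropy(fuzzy))  # close to 0: classes barely distinguishable
```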

LPA Workflow


1. Theory & indicators: select indicators from a theoretical framework.
2. Fit 1 to K models: incrementally increase the number of classes.
3. Compare fit indices: BIC, BLRT, entropy.
4. Interpret profiles: read the profile plot and name each class.
5. Validate: relate profiles to covariates and outcomes (three-step approach).

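Steps 2 and 3 of the workflow can be sketched as a loop over candidate class numbers. scikit-learn's GaussianMixture serves as a stand-in here (it reports BIC but has no BLRT or aLMR, which Mplus and tidyLPA provide); the three-group data are simulated:

```python
# Class enumeration: fit models with 1..K profiles and compare BIC.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Simulated sample with three true subgroups on four indicators.
X = np.vstack([rng.normal(m, 0.5, size=(150, 4)) for m in (2.0, 4.0, 6.0)])

bics = {}
for k in range(1, 7):
    gm = GaussianMixture(n_components=k, covariance_type="diag",
                         n_init=20, random_state=0).fit(X)
    bics[k] = gm.bic(X)  # lower BIC = better penalized fit

best_k = min(bics, key=bics.get)
print({k: round(v, 1) for k, v in bics.items()})
print("BIC-preferred number of profiles:", best_k)
```

In practice the BIC elbow is weighed together with the BLRT, class sizes, and theoretical interpretability rather than taking the minimum mechanically.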
Practical Tips

  • Sample size: N ≥ 300–500 (Nylund-Gibson & Choi, 2018).
  • Random starts: ≥ 500 to avoid local solutions.
  • Smallest class: ≥ 5–8% of the sample.
  • Entropy: > .80 indicates good classification, but do NOT use entropy to select K.



Article Example


Wang, C. K. J., Liu, W. C., Nie, Y., Chye, S., Lim, B. S. C., Liem, G. A., Tay, E. G., Hong, Y.-Y., & Chiu, C.-Y. (2017). Latent profile analysis of students' motivation and outcomes in mathematics: An organismic integration theory perspective. Heliyon, 3(6), e00308. https://doi.org/10.1016/j.heliyon.2017.e00308
Organismic Integration Theory (SDT) × Latent Profile Analysis

Research Design & Indicator Selection


Organismic Integration Theory (OIT), a sub-theory of Self-Determination Theory, posits that academic motivation is not a simple "high vs. low" dimension but a continuum of self-determination — from external regulation to intrinsic motivation. Variable-centered methods would ask: "For every one-unit increase in intrinsic motivation, does achievement increase by 0.3 points?" But in reality, every student has a simultaneous score on all four motivational types — creating a unique combination pattern.

The person-centered question: Are there distinct subgroups of students characterized by unique combinations of these motivations?

The study hypothesized that (H1) at least 4 distinct profiles would emerge based on OIT motivation types, and (H2) more autonomous profiles would show higher effort, value, competence, and extra time on math. Four motivation types from the Self-Regulation Questionnaire–Academic (SRQ-A; Ryan & Connell, 1989) served as LPA indicators — external regulation (4 items), introjected regulation (4 items), identified regulation (3 items), and intrinsic motivation (3 items) — measured on a 7-point Likert scale. The sample comprised N = 1,151 secondary school students (679 males, 444 females, 28 unreported; age 13–17, M = 14.69, SD = .58) from 5 schools in Singapore. Models were estimated in Mplus 7.2 using the MLR estimator with 10,000 random starts (500 best solutions retained).


Step 1: Model Comparison & Class Enumeration


Table 2. Latent profile fit statistics for models with 1–8 profiles based on the four motivational types.
Decision Rules (Nylund et al., 2007; Nylund-Gibson & Choi, 2018)

BIC: lower is better; look for the "elbow" where the decline slows. BLRT: the most accurate test across simulation conditions; a significant p means the K-profile model fits better than the (K−1)-profile model. aLMR: the adjusted Lo-Mendell-Rubin test; a non-significant p suggests the current K is sufficient. When indices disagree, prioritize BIC and the BLRT, combined with theoretical interpretability and class size. Here, the 4-profile solution was selected: the aLMR became non-significant beyond 4 profiles, fit improvements were marginal, and each profile was theoretically interpretable.


Step 2: Interpreting the Profile Plot


Figure 1. Four motivation profiles across four SDT indicators (Extreg = External Regulation, Intro = Introjected Regulation, Ident = Identified Regulation, Intmot = Intrinsic Motivation).
  • Low Motivation (5.8%, n = 67): near-average external regulation but very low introjected, identified, and intrinsic motivation. This class sits near the 5% threshold and may be unstable in smaller samples.
  • Externally Driven (10.2%, n = 118): high external and identified regulation but very low intrinsic motivation; these students are regulated by external demands.
  • Autonomous (50.7%, n = 584): high identified regulation and intrinsic motivation; the most self-determined and the largest group.
  • Moderate (33.2%, n = 382): low identified regulation and intrinsic motivation with moderate external and introjected regulation.

Reading Profile Plots

Focus on the shape of the line (the pattern across indicators), not just absolute levels. Name each profile based on its most distinctive features.


Step 3: Outcome Validation


Do the profiles differ on meaningful academic outcomes?


Figure 2. Outcome differences across four profiles (Hrs = Math Study Time, Effort = Self-Reported Effort, Value = Task Value, Comp = Perceived Competence).

Autonomous Advantage


The Autonomous profile (P3) consistently outperformed all other groups across every outcome: effort (3 > 2 > 4 > 1), task value (3 > 2 = 4 > 1), perceived competence (3 > 4 > 2 = 1), and math study hours (3 > 4 = 2 = 1). High autonomous motivation led to the most adaptive outcomes.


Effort Is Graded by Self-Determination


Effort showed a clear gradient across profiles: P3 > P2 > P4 > P1. Notably, the Externally Driven group (P2) reported higher effort than the Moderate group (P4), suggesting external pressure can sustain effort — but the Autonomous group's effort still surpassed all others.


Competence Requires Intrinsic Interest


The Externally Driven profile (P2) showed no advantage in perceived competence over the Low Motivation group (P1), despite P2's higher external and identified regulation (2 = 1). In contrast, even the Moderate group (P4) outperformed P2 in competence, suggesting that intrinsic interest — not external pressure — is essential for building academic confidence.



Extensions


Cross-Group Comparison

Morin et al. (2016) Six-Step Framework:

1. Configural similarity (same # of profiles?)
2. Structural similarity (same means?)
3. Dispersion similarity (same variances?)
4. Distributional similarity (same proportions?)
5. Predictive similarity (same predictors?)
6. Explanatory similarity (same outcomes?)

Example: N. America vs. France — 5 profiles found in both groups; structural similarity supported, but distributional differences detected.

Longitudinal Extensions

Latent Transition Analysis (LTA) — Tracks how individuals transition between profiles over time. Estimates transition probabilities.

Growth Mixture Modeling (GMM) — Identifies distinct developmental trajectory classes (e.g., increasing, stable, declining).

Other: Multilevel LCA/LPA, factor mixture models, Bayesian estimation for small samples.


References & Software


Key References


  • Beginner: Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440–461. https://doi.org/10.1037/tps0000176
  • Fit indices: Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535–569. https://doi.org/10.1080/10705510701575396
  • Multi-group: Morin, A. J. S., Meyer, J. P., Creusier, J., & Biétry, F. (2016). Multiple-group analysis of similarity in latent profile solutions. Organizational Research Methods, 19(2), 231–254. https://doi.org/10.1177/1094428115621148
  • Applied: Wang, C. K. J., Liu, W. C., Nie, Y., et al. (2017). Latent profile analysis of students' motivation and outcomes in mathematics. Heliyon, 3(6), e00308. https://doi.org/10.1016/j.heliyon.2017.e00308

Software


Mplus

Gold standard. Most flexible, best support for LPA/LCA.


R — tidyLPA

Free, user-friendly. Good for learning and basic analyses.


Python — sklearn

Gaussian Mixture. Lacks BLRT/entropy but adequate for exploration.
