The effects of interventions with physical activity components on adolescent mental health: Systematic review and meta-analysis
- Authors: Ruth D. Neill, Katrina Lloyd, Paul Best, Mark A. Tully
- Journal: Mental Health and Physical Activity
- Year: 2020
- Citations: 27
TL;DR
A rigorous meta-analysis of 13 RCTs found that physical activity interventions produced no statistically significant improvement in anxiety or depression in adolescents aged 10–19, meaning the evidence is currently too weak and inconsistent to confidently prescribe a specific exercise dose for adolescent mental health.
---
What they tested
The review asked whether interventions containing a physical activity component — either physical activity alone or as part of a multi-component programme (e.g., exercise plus education, exercise plus counselling) — could reduce anxiety, depression, or stress in adolescents compared to a control condition (usually no intervention, treatment as usual, or health advice only).
Physical activity types included:
- Yoga (3 studies)
- Aerobic exercise, such as walking, badminton, climbing, and Frisbee (4 studies)
- Strength, endurance, and resistance training (2 studies)
- Fitness circuits (2 studies)
- Dance movement therapy (1 study)
- Football (1 study)
---
Who was studied
**Total participants across all included studies:** 1,928 adolescents (1,726 contributing data to the depression meta-analysis; 1,233 to the anxiety meta-analysis)
**Age range:** 10–19 years
**Settings:** Predominantly school-based (10 studies), plus hospital/clinical (4 studies) and community (2 studies)
**Countries:** Majority in the USA (7 studies); single studies from England, Portugal, China, Korea, Colombia, Germany, and Uganda
**Notable subpopulations:** Adolescents with obesity (3 studies), low-income or poverty backgrounds (3 studies), high depression scores or receiving treatment for depression (2 studies), youth offenders (1 study), Hispanic background (1 study), chronic neck pain (1 study), rural community (1 study)
**Sex:** 11 studies included both males and females; 1 female-only; 1 male-only
**Publication years:** 1982–2018
---
How they measured it
Mental health outcomes were measured using a wide variety of self-reported questionnaires, which is a key source of methodological heterogeneity. Outcome domains and instruments reported across studies included:
**Depression:** Multiple instruments used (not standardised across studies — hence results were pooled as Standardised Mean Differences, SMDs)
**General anxiety:** Multiple questionnaires (7 studies)
**State and trait anxiety:** Measured separately in 2 studies (e.g., using instruments like the State-Trait Anxiety Inventory)
**Test anxiety:** 1 study (Khalsa et al., 2012)
**Stress subdomains:** 2 studies measured different components of stress, including emotional regulation, problem solving, positive thinking, cognitive restructuring, secondary engagement, emotional expression, and acceptance (Frank et al., 2017); and social stress (Khalsa et al., 2012)
Because studies used different scales, all pooled effect sizes are **Standardised Mean Differences (SMDs)**, which express effects in standard deviation units rather than on any single scale.
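To make the standardisation concrete, here is a minimal Python sketch of how an SMD can be computed from group summary statistics. It uses Hedges' g, one common SMD variant; this summary does not say which variant the review used, so that choice is an assumption. The function name `hedges_g` and all input numbers are hypothetical, chosen only to show that two questionnaires with different score ranges can yield the same SMD.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, standard error) for treatment vs. control summary statistics."""
    # Pooled standard deviation across the two arms.
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Small-sample bias correction (Hedges' approximation).
    j = 1 - 3 / (4 * (n_t + n_c) - 9)
    g = j * d
    # Common large-sample approximation to the variance of g.
    var_g = j**2 * ((n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c)))
    return g, math.sqrt(var_g)


# Hypothetical trial A: a depression questionnaire scored 0-63.
g_a, se_a = hedges_g(mean_t=12.0, sd_t=8.0, n_t=50, mean_c=14.0, sd_c=8.0, n_c=50)
# Hypothetical trial B: a different questionnaire scored 0-27.
g_b, se_b = hedges_g(mean_t=6.0, sd_t=4.0, n_t=50, mean_c=7.0, sd_c=4.0, n_c=50)

print(f"Trial A: g = {g_a:.3f} (SE {se_a:.3f})")
print(f"Trial B: g = {g_b:.3f} (SE {se_b:.3f})")
```

In both hypothetical trials the raw difference is a quarter of the pooled standard deviation, so the two give the same g despite using different scales. In this sketch a negative value means lower symptom scores in the intervention arm.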
---
Methodology
**Study design:** Systematic review and meta-analysis, following PRISMA guidelines. Only randomised controlled trials (RCTs) and cluster-RCTs were eligible for inclusion — this is the strongest study design for establishing causation.
**Search:** Nine electronic databases searched in November 2019 (including Cochrane CENTRAL, Medline, Embase, PsycINFO, CINAHL, ERIC), plus grey literature via Google Scholar and Open Grey, and manual journal searches for 2016–2019. No publication date limit was imposed.
**Screening:** Two independent reviewers screened titles, abstracts, and full texts, resolving disagreements by consensus. A third reviewer was available as a tiebreaker.
**Statistical approach:** Random-effects meta-analyses were conducted, which is appropriate when included studies vary in participants, interventions, and settings (as they do here). Effect sizes are SMDs. Heterogeneity was quantified using I² (0% = no heterogeneity; 25% = low; 50% = moderate; 75% = high). Subgroup analyses compared purely physical activity interventions vs. multi-component interventions. Funnel plots were used to check for publication bias.
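The pooling step can be sketched in a few lines of Python. This is an illustration only: it assumes the DerSimonian-Laird estimator, a widely used default for random-effects models, because this summary does not state which estimator the authors used. The function name `random_effects_pool` and the five input effect sizes are hypothetical and are not data from the review.

```python
import math


def random_effects_pool(effects, std_errors):
    """Pool study effects with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are needed to estimate heterogeneity")
    w = [1 / se**2 for se in std_errors]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's sampling variance.
    w_re = [1 / (se**2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"smd": pooled, "ci": ci, "tau2": tau2, "i2": i2}


# Hypothetical SMDs and standard errors for five trials (not from the review).
smds = [-0.40, 0.10, 0.25, -0.15, 0.30]
ses = [0.20, 0.15, 0.25, 0.18, 0.22]
result = random_effects_pool(smds, ses)
print(result)
```

When the studies disagree more than sampling error alone would predict, `tau2` becomes positive. That widens the pooled confidence interval and pulls the study weights closer together, which is why a random-effects model suits a set of trials as varied as this one.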
**Why the design matters:** All included studies were RCTs, so randomisation should (in principle) balance confounders across groups. However, participants in physical activity trials cannot be blinded to whether they exercised, so performance bias is structurally unavoidable. The random-effects model allows for genuine variation in effects between interventions.
**What this design can prove:** Whether, on average across diverse adolescent populations and exercise types, physical activity interventions produce better mental health outcomes than control conditions.
**What this design cannot prove:**
- Which specific type, dose, or intensity of physical activity works best
- Whether effects differ meaningfully by age, sex, socioeconomic status, or baseline mental health severity
- Whether effects persist beyond the end of the intervention (long-term follow-up was uncommon)
- Causal mechanisms
**Major methodological weaknesses:**
- **High risk of bias in most included studies:** 8 of 13 studies were rated high risk of bias; 3 were unclear; only 2 were rated low risk. The main problems were lack of blinding, inadequate reporting of attrition, and absent study protocols.
- **Outcome heterogeneity:** Different measurement instruments across studies make direct comparisons imprecise even after SMD standardisation.
- **Intervention heterogeneity:** Wildly different exercise types, durations (4 weeks to 6 months), and settings make it hard to draw specific conclusions.
- **Publication bias:** Funnel plot asymmetry suggests that small studies with null results may be underrepresented; if anything, this would inflate the apparent effect.
- **Self-report only:** All mental health outcomes were assessed by self-reported questionnaire, introducing potential response bias.
- **Only 13 studies:** This is a small base for drawing firm conclusions about an entire population.
---
Key findings
**Anxiety (7 studies, n = 1,233):**
- Mean anxiety at post-intervention follow-up: SMD = 0.04, 95% CI −0.20 to 0.28, I² = 55%; **not statistically significant**
- Change in anxiety from baseline: SMD = −0.33, 95% CI −0.68 to 0.03, I² = 0%; **not statistically significant** (though the upper bound of the confidence interval only narrowly exceeds zero, so a small beneficial effect cannot be ruled out and is worth investigating)
- Subgroup analysis (PA-only vs. multi-component): no significant difference between types
**Depression (12 studies, n = 1,726):**
- Mean depression at post-intervention follow-up: SMD = 0.09, 95% CI −0.20 to 0.40, I² = 72%; **not statistically significant**, with heterogeneity approaching the 75% benchmark for high
- Change in depression from baseline: SMD = −0.11, 95% CI −0.29 to 0.07, I² = 0%; **not statistically significant**
- Subgroup analysis: PA-only (p = 0.37) and multi-component (p = 0.25); neither significant
**Stress (2 studies, narrative synthesis only, could not be meta-analysed):**
- Frank et al. (2017), a school-based yoga RCT, found statistically significant improvements in emotional regulation (p = 0.05), positive thinking (p = 0.01), secondary engagement (p = 0.01), and cognitive restructuring (p = 0.01); no significant effects for problem solving, emotional expression, or acceptance
- Khalsa et al. (2012), a yoga RCT, found no significant difference in social stress (p = 0.15)

**Anxiety (narrative studies not included in meta-analysis):**
- Andias et al. (2018): no significant difference in state (p = 0.09) or trait anxiety (p = 0.11) after a 4-week intervention
- Hilyer et al. (1982): significant improvement in trait anxiety (p < 0.001) in a fitness-plus-counselling programme for youth offenders; state anxiety not significant (p = 0.06)
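The link between the pooled confidence intervals above and the "not statistically significant" labels can be checked with a short back-calculation. The Python sketch below recovers an approximate standard error, z statistic, and two-sided p-value from each pooled SMD and its 95% CI, assuming a normal approximation and a symmetric interval. The helper name `p_from_ci` is hypothetical, and the inputs are the rounded figures quoted in this summary, so the results are rough approximations rather than values reported by the authors.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p-value from an estimate and its 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a normal approximation
    return se, z, p


# Pooled estimates as quoted in this summary: (SMD, CI lower, CI upper).
pooled = {
    "anxiety, post-intervention": (0.04, -0.20, 0.28),
    "anxiety, change from baseline": (-0.33, -0.68, 0.03),
    "depression, post-intervention": (0.09, -0.20, 0.40),
    "depression, change from baseline": (-0.11, -0.29, 0.07),
}

for label, (smd, lo, hi) in pooled.items():
    se, z, p = p_from_ci(smd, lo, hi)
    print(f"{label}: SE ~ {se:.3f}, z = {z:.2f}, p ~ {p:.3f}")
```

Because every interval includes zero, each approximate p-value comes out above 0.05. The change-from-baseline anxiety estimate is the one closest to the threshold.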
---
Effect magnitude
The pooled SMD for anxiety at follow-up is 0.04 — essentially