Precision exercise medicine: understanding exercise response variability
- Authors
- Robert Ross, Bret H. Goodpaster, Lauren G. Koch, Mark A. Sarzynski, Wendy M. Kohrt, Neil M. Johannsen, James S. Skinner, Alex Castro, Brian A. Irving, Robert C. Noland, Lauren M. Sparks, Guillaume Spielmann, Andrew G. Day, Werner Pitsch, Will G. Hopkins, Claude Bouchard
- Journal
- British Journal of Sports Medicine
- Year
- 2019
- Citations
- 260
TL;DR
When people follow the same exercise programme, their improvement in cardiorespiratory fitness varies enormously, from essentially no gain to an increase of roughly 50%. About half of that variability is attributed to genetics, so someone else's results are a poor guide to your own response.
What they tested
This is a consensus statement and narrative review, not a single experiment. The authors synthesised evidence from:
**Human twin and family studies** (particularly the HERITAGE Family Study) that measured how much VO₂max (maximal oxygen uptake, a measure of cardiorespiratory fitness) changed after a standardised exercise programme.
**Animal selection experiments** where rats were bred over 15 generations to be either "high response trainers" (HRT) or "low response trainers" (LRT) to the same exercise dose.
**Inbred rodent strain comparisons** where genetically uniform mouse and rat strains were given identical exercise programmes and their fitness gains compared.
The primary outcome was **change in cardiorespiratory fitness (CRF)**, measured as:
In humans: change in VO₂max (mL/kg/min) after 20 weeks of supervised exercise.
In rats: change in maximal treadmill running distance (ΔDIST in metres) after 8 weeks of standardised training.
In mice: change in time to exhaustion (minutes) after 4 weeks of training.
No comparator group was tested in the traditional sense — this was a review of existing evidence on *variability* rather than a head-to-head intervention trial.
Who was studied
The review draws on multiple study populations:
**HERITAGE Family Study:** ~742 sedentary adults from ~200 nuclear families (parents and adult offspring), aged 17–65 years, from five clinical centres in the USA and Canada. Participants were healthy, non-smoking, sedentary (no regular exercise for ≥6 months), with no medications affecting cardiovascular or metabolic function.
**Twin studies:** Monozygotic (identical) and dizygotic (fraternal) twin pairs, ranging from ~50 to ~200 pairs depending on the specific study.
**Rat selection experiments:** 152 genetically heterogeneous N:NIH rats (both sexes) as the founding population, then selectively bred over 15 generations to create LRT and HRT lines.
**Inbred mouse studies:** 24 inbred mouse strains (genetically uniform within each strain), both sexes.
**Inbred rat studies:** 10 commonly used inbred rat strains.
How they measured it
**Humans:** VO₂max measured via graded exercise test on a cycle ergometer or treadmill with breath-by-breath gas analysis. Standard protocol: warm-up, then incremental increases in workload every 2–3 minutes until volitional exhaustion, with criteria including respiratory exchange ratio ≥1.1, heart rate within 10 bpm of age-predicted maximum, and plateau in oxygen consumption. Measured before and after 20 weeks of exercise training.
**Rats (selection experiment):** Maximal treadmill running distance was measured with a graded treadmill test to exhaustion before and after an 8-week standardised training programme; the training response (ΔDIST) is the post-training minus pre-training distance in metres. Treadmill speed and incline increased progressively until the rat could no longer maintain pace.
**Mice (inbred strains):** Time to exhaustion on a motorised treadmill, measured before and after a 4-week training programme.
**Heritability estimates:** Calculated using variance components analysis in family studies (comparing within-family vs between-family variance) and by comparing variance between inbred strains vs within inbred strains.
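As a rough illustration of the inbred-strain comparison, the sketch below estimates broad-sense heritability as the share of total variance attributable to strain, using one-way ANOVA variance components. The strain names and values are hypothetical, a balanced design (equal animals per strain) is assumed, and the original studies may have used a different estimator.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate H^2 = var_between / (var_between + var_within).

    groups: dict mapping strain name -> list of training responses.
    Assumes a balanced design (same number of animals in every strain).
    """
    values = list(groups.values())
    k = len(values)      # number of strains
    n = len(values[0])   # animals per strain
    if k < 2 or n < 2 or any(len(v) != n for v in values):
        raise ValueError("need at least 2 strains with equal group sizes of 2 or more")

    grand_mean = mean(x for v in values for x in v)
    strain_means = [mean(v) for v in values]

    # Mean squares from a one-way ANOVA
    ms_between = n * sum((m - grand_mean) ** 2 for m in strain_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for v, m in zip(values, strain_means) for x in v
    ) / (k * (n - 1))

    # Variance components; a negative between-strain estimate is truncated to 0
    var_between = max((ms_between - ms_within) / n, 0.0)
    total = var_between + ms_within
    if total == 0:
        raise ValueError("no variance in the data")
    return var_between / total


# Hypothetical responses (arbitrary units) for three made-up strains
example = {
    "strain_A": [1.0, 2.0, 3.0],
    "strain_B": [4.0, 5.0, 6.0],
    "strain_C": [7.0, 8.0, 9.0],
}
print(round(broad_sense_heritability(example), 3))
```

Because animals within an inbred strain are genetically near-identical, within-strain variance is treated here as non-genetic, which is why the between-strain share can be read as a heritability estimate.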
Methodology
**Study design:** This is a narrative review and consensus statement, not a primary study. However, the authors critically evaluate the methodological requirements for *properly* studying individual response variability, which is the core contribution.
**Key design considerations discussed:**
1. **Randomised controlled designs with multiple pretests and post-tests:** The gold standard for quantifying individual response. A single pre-test and post-test cannot distinguish true response from measurement error and day-to-day fluctuation. The authors recommend at least 2–3 pre-training measurements and 2–3 post-training measurements to estimate the "true" change for each individual.
2. **Crossover designs:** Each participant serves as their own control, receiving both exercise and no-exercise (or different exercise doses) in random order with a washout period. This controls for between-person confounding but requires that the effect of the first period does not carry over.
3. **Repeated measures designs:** Multiple measurements across time within individuals, allowing estimation of within-person variability and detection of true responders vs non-responders.
**Why design matters:** The central problem is that if you only measure VO₂max once before and once after training, the observed change includes:
The true training effect
Measurement error (biological and technical)
Day-to-day biological fluctuation (which can be 3–5% for VO₂max)
Without multiple baseline and post-training measurements, you cannot determine whether someone who shows no improvement is a "true non-responder" or just had a bad measurement day. The authors estimate that the "typical error" (within-subject standard deviation) for VO₂max is about 2–3% in well-controlled lab settings. This means an individual needs to show a change of roughly 4–6% (about 2× the typical error) to be considered a "true responder" with reasonable confidence.
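The 2× typical-error rule can be sketched in a few lines. The function name, the default multiplier, and the example numbers are illustrative assumptions rather than values taken from the paper.

```python
def classify_response(pre, post, typical_error_pct, multiplier=2.0):
    """Label an observed VO2max change relative to measurement noise.

    pre, post: VO2max in the same units (e.g. mL/kg/min).
    typical_error_pct: within-subject SD expressed as a percentage.
    multiplier: how many typical errors a change must exceed (assumed 2).
    """
    if pre <= 0:
        raise ValueError("pre must be positive")
    change_pct = (post - pre) / pre * 100.0
    threshold = multiplier * typical_error_pct
    if change_pct >= threshold:
        label = "likely true improvement"
    elif change_pct <= -threshold:
        label = "likely true decline"
    else:
        label = "within measurement noise"
    return change_pct, label


# Hypothetical tests with a 2.5% typical error (threshold = 5%)
print(classify_response(30.0, 31.0, 2.5))  # small gain, inside the noise band
print(classify_response(30.0, 33.0, 2.5))  # larger gain, outside the noise band
```

A change that falls inside the noise band is not evidence of non-response; it only means a single pre/post pair cannot settle the question.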
**What this design can prove:**
That interindividual variability in exercise response exists beyond measurement error.
That a substantial portion of this variability (about 50%) is genetic.
That non-response to exercise is real for some individuals at standard doses.
**What it cannot prove:**
That non-responders cannot improve with a different dose, intensity, or type of exercise.
The specific genes responsible (though candidate gene studies exist, they are not the focus here).
How large or how heritable the variability is for other outcomes (e.g., blood pressure, insulin sensitivity, body composition). The review notes that such variability exists but focuses on CRF.
**Major methodological weaknesses acknowledged by the authors:**
Most existing studies used only single pre- and post-training measurements, making it impossible to distinguish true non-responders from measurement error.
Many studies lacked a no-exercise control group, so regression to the mean and time effects cannot be ruled out.
The HERITAGE study, while rigorous, used a fixed exercise dose (55–75% of VO₂max, 3 days/week, 20 weeks) — results may not generalise to other doses or modalities.
Animal studies used forced treadmill running, which may involve stress responses not present in voluntary human exercise.
Key findings
**Primary finding: Exercise response variability is large and real**
In the HERITAGE Family Study, after 20 weeks of identical exercise training, the change in VO₂max ranged from **−2% to +58%** across individuals. About **10–15% of participants showed no measurable improvement** (change less than the typical error of measurement).
In rat selection experiments, the founding heterogeneous population showed ΔDIST ranging from **−339 metres to +627 metres** after 8 weeks of training — a nearly 1000-metre spread.
After 15 generations of selective breeding, LRT rats actually **lost** an average of 65 metres of running capacity with training, while HRT rats gained an average of 223 metres.
**Secondary finding: Heritability of CRF trainability is ~50%**
In the HERITAGE study, after adjusting for baseline VO₂max, age, sex, and body mass, **heritability accounted for approximately 50% of the variance** in VO₂max response to training.
In inbred mouse strains, broad-sense heritability was **0.58 for change in running time** and **0.54 for gain in total work performed**.
In inbred rat strains, strain (genotype) explained the majority of variance in ΔDIST, with sex and initial body weight having no significant influence.
**Secondary finding: Non-response is not due to lack of effort**
In the HERITAGE study, all exercise sessions were supervised, heart rate was monitored to ensure target intensity, and compliance was >95%. Non-response was not attributable to poor adherence.
In animal studies, training was forced (treadmill), eliminating motivation as a confound.
**Secondary finding: Baseline fitness does not predict trainability**
In the rat selection experiments, LRT and HRT lines had similar baseline running capacity before training. The difference only emerged in response to training.
In human studies, baseline VO₂max was a weak predictor of VO₂max change (r² typically <0.05).
**Secondary finding: Response variability is not limited to CRF**
The authors note that similar variability exists for other cardiometabolic traits (blood pressure, insulin sensitivity, HDL cholesterol), though the review focuses on CRF.
Effect magnitude
**In plain English:**
If 100 people follow the same exercise programme (e.g., jogging 30 minutes, 3 times per week for 5 months), their fitness gains will range from **zero to about 50% improvement**.
About **1 in 7 people will see no measurable improvement** in their VO₂max — their fitness will stay the same despite doing the exercise.
About **1 in 7 people will improve by 30% or more** — a massive gain that would take them from "poor" to "excellent" fitness.
The average person improves by about **15–20%**, but this average hides enormous individual differences.
**Genetics explains about half** of why some people improve a lot and others improve little. The other half is a mix of diet, sleep, stress, hormones, and other factors not yet identified.
**Being a "non-responder" to one type of exercise does not mean you cannot respond to another type.** The review notes that some non-responders to endurance training may respond to high-intensity interval training or resistance training, though this has not been systematically tested.
**Comparison for context:**
A 15% increase in VO₂max (the average response) is roughly equivalent to the difference between a 40-year-old sedentary person and a 40-year-old who exercises regularly — about 3–4 mL/kg/min.
A 50% increase (top responders) is like going from "very poor" fitness to "excellent" fitness — equivalent to reversing 10–15 years of age-related decline.
Zero improvement means the exercise provided no cardiovascular benefit for that individual, at least for that specific outcome.
Limitations
**What the authors acknowledge:**
1. **Focus on CRF only:** The review explicitly limits itself to cardiorespiratory fitness. Variability in other exercise responses (muscle strength, blood pressure, glucose metabolism, body composition) may follow different patterns and have different heritability estimates.
2. **Single exercise dose:** Most evidence comes from moderate-intensity continuous training (55–75% VO₂max, 3–5 days/week). Results may not generalise to high-intensity interval training, resistance training, or different volumes.
3. **Lack of control groups in many studies:** Many of the foundational twin and family studies did not include a no-exercise control group, making it difficult to separate true training effects from regression to the mean or time effects.
4. **Measurement error challenges:** Most existing studies used single pre- and post-training measurements, which inflates apparent variability and makes it impossible to identify true non-responders with confidence.
5. **Animal model limitations:** Rodent studies used forced treadmill running, which involves stress and may not reflect voluntary human exercise behaviour. Also, rodent training durations (4–8 weeks) are proportionally longer relative to lifespan than typical human studies.
6. **Population limits:** The HERITAGE study was predominantly white (about 70% white, 30% black). Heritability estimates may differ in other ethnic groups.
**What a critical reader would add:**
7. **No mechanistic explanation:** The review documents *that* variability exists and *that* genetics plays a role, but does not identify specific genes, pathways, or mechanisms. This limits actionable insights.
8. **No dose-response data:** The review does not address whether non-responders at standard doses would respond at higher doses. Some evidence (not in this review) suggests that increasing exercise volume or intensity can convert some non-responders to responders.
9. **Publication bias:** Studies showing large variability are more interesting and more likely to be published than studies showing uniform responses. The true extent of variability may be smaller than reported.
10. **Funding and conflicts of interest:** The symposium was supported by the Pennington Biomedical Research Center and various NIH grants. No direct industry funding is declared, but some authors have consulted for exercise equipment or pharmaceutical companies.
11. **Consensus statement limitations:** This is expert opinion, not a systematic review or meta-analysis. The authors selected studies they considered most relevant, which introduces selection bias.
Practical takeaways
For someone running their own n=1 experiment:
### What to test
**Specific intervention:** A standardised endurance exercise programme modelled on the HERITAGE protocol:
**Frequency:** 3 days per week
**Intensity:** 55–75% of your heart rate reserve (or 65–85% of max heart rate) — this corresponds to "brisk" to "somewhat hard" effort
**Duration:** 30–50 minutes per session
**Type:** Treadmill walking/jogging, cycling, or elliptical — choose one modality and stick with it
**Total programme:** 20 weeks (the HERITAGE duration)
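To turn the heart rate reserve range above into beats per minute, one common approach is the Karvonen calculation: target = resting HR + fraction × (max HR − resting HR). The sketch below assumes that approach; the function names are hypothetical, and the 220 − age estimate of maximum heart rate is a crude population average that can be off by 10 bpm or more for an individual.

```python
def age_predicted_max_hr(age_years):
    """Crude population estimate of maximum heart rate (assumption: 220 - age)."""
    return 220 - age_years


def target_heart_rate_zone(resting_hr, max_hr, low=0.55, high=0.75):
    """Return (lower, upper) target heart rates in bpm from heart rate reserve."""
    if not 0 < resting_hr < max_hr:
        raise ValueError("expected 0 < resting_hr < max_hr")
    reserve = max_hr - resting_hr  # heart rate reserve (HRR)
    lower = round(resting_hr + low * reserve)
    upper = round(resting_hr + high * reserve)
    return lower, upper


# Hypothetical 40-year-old with a resting heart rate of 60 bpm
zone = target_heart_rate_zone(resting_hr=60, max_hr=age_predicted_max_hr(40))
print(zone)  # (126, 150)
```

A measured maximum heart rate from a graded test, if available, is a better input than the age-based estimate.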
**Alternative to test:** If you show no improvement after 20 weeks, try:
High-intensity interval training (4×4 minutes at 85–95% max heart rate, 3×/week)
Resistance training (3 sets of 8–12 reps, 3×/week)
Higher volume endurance training (5–6 days/week, 45–60 minutes)
### Minimum meaningful duration
**For detecting a change in VO₂max:** At least **8–12 weeks** of consistent training. The HERITAGE study used 20 weeks, but measurable changes typically appear by 8–12 weeks.
**For determining if you are a "non-responder":** You need **at least 2 baseline measurements** (taken on separate days, at the same time of day) and **2 post-training measurements** to distinguish true change from measurement noise.
**For a crossover test** (e.g., comparing endurance vs HIIT): Each phase should be **12–16 weeks** with a **4–6 week washout** (return to sedentary or very light activity) between phases.
### What to measure
**Primary metric:** Estimated VO₂max (or VO₂peak)
**Gold standard:** Lab-based graded exercise test with gas analysis (expensive, ~$200–$400)
**Good alternative:** Submaximal fitness test — e.g., the **YMCA submaximal cycle test** or **Rockport 1-mile walk test**. These estimate VO₂max from heart rate response to a standard workload. Accuracy is ±10–15% but sufficient for tracking change.
**DIY option:** Use a fitness watch with VO₂max estimation (Garmin, Apple Watch, Polar). These are less accurate (±15–20%) but can track trends if you wear them consistently.
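For the Rockport 1-mile walk test, the estimate is usually computed with the regression equation commonly attributed to Kline and colleagues (1987). The sketch below uses the coefficients as they are widely quoted; they are not taken from the paper summarised here, so check them against a primary source before relying on the output. The function name and example inputs are hypothetical.

```python
def rockport_vo2max(weight_kg, age_years, is_male, walk_time_min, end_hr_bpm):
    """Estimate VO2max (mL/kg/min) from a 1-mile walk.

    walk_time_min: time to walk 1 mile, in decimal minutes.
    end_hr_bpm: heart rate measured immediately at the finish.
    Coefficients are the widely quoted Rockport values (assumption).
    """
    weight_lb = weight_kg * 2.2046  # the equation expects pounds
    return (
        132.853
        - 0.0769 * weight_lb
        - 0.3877 * age_years
        + 6.315 * (1 if is_male else 0)
        - 3.2649 * walk_time_min
        - 0.1565 * end_hr_bpm
    )


# Hypothetical 40-year-old man, 70 kg, 15:00 walk, finishing heart rate 140 bpm
estimate = rockport_vo2max(70, 40, True, 15.0, 140)
print(round(estimate, 1))  # about 40.9
```

For tracking change, what matters most is repeating the test under the same conditions (same course, same pacing effort, same time of day), since the absolute estimate carries a large error.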
**Secondary metrics (to see if you are a "responder" in other ways):**
Resting heart rate (measure upon waking, before getting out of bed)
Heart rate during a standard submaximal effort (e.g., 5-minute jog at the same speed)
Blood pressure (resting, measured at the same time each day)
Body weight and waist circumference
Subjective energy and mood (daily 1–10 rating)
**Measurement schedule:**
**Baseline:** 2–3 measurements over 1–2 weeks (to establish typical error)
**During training:** Measure every 4 weeks
**Post-training:** 2 measurements in the final week
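The measurement schedule above yields repeated tests at baseline and after training, and those can be combined as in the sketch below. It uses the spread of the baseline tests as a personal typical error and compares the change in means against the noise expected for a difference of two means. The function name, the 2× multiplier, and the example values are assumptions; with only 2–3 baseline tests the typical-error estimate is itself very imprecise.

```python
from math import sqrt
from statistics import mean, stdev


def evaluate_change(baseline, post, multiplier=2.0):
    """Compare mean post-training VO2max with mean baseline VO2max.

    baseline, post: lists of repeated test results in the same units.
    Returns the change, the estimated typical error, the standard error
    of the change, and whether the change exceeds multiplier x that error.
    """
    if len(baseline) < 2 or len(post) < 1:
        raise ValueError("need at least 2 baseline tests and 1 post-training test")
    typical_error = stdev(baseline)  # within-person SD from repeat baseline tests
    change = mean(post) - mean(baseline)
    # Averaging more tests shrinks the noise in the difference of means
    se_change = typical_error * sqrt(1 / len(baseline) + 1 / len(post))
    return {
        "change": change,
        "typical_error": typical_error,
        "se_change": se_change,
        "exceeds_noise": abs(change) > multiplier * se_change,
    }


# Hypothetical results in mL/kg/min
result = evaluate_change(baseline=[30.0, 31.0, 30.5], post=[34.0, 35.0])
print(result["change"], result["exceeds_noise"])
```

Adding a third post-training test, or more baseline tests, lowers the standard error of the change and makes a borderline result easier to interpret.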
### Key confounds to control for
1. **Measurement conditions:** Always test at the same time of day, at least 24 hours after your last exercise session, and at least 2 hours after eating. Avoid caffeine and alcohol for 12 hours before testing.
2. **Sleep:** Poor sleep reduces VO₂max by 5–10%. Track sleep quality and duration. If you sleep poorly the night before a test, reschedule.
3. **Hydration and nutrition:** De