
Systematic Review of Gut Microbiota and Major Depression

Authors: Stephanie Cheung, Ariel Goldenthal, Anne‐Catrin Uhlemann, J. John Mann, Jeffrey M. Miller, M. Elizabeth Sublette
Journal: Frontiers in Psychiatry
Year: 2019
DOI: 10.3389/fpsyt.2019.00034
Citations: 593

TL;DR

This systematic review of six human case-control studies found that people with major depressive disorder (MDD) have different gut bacteria from healthy controls, but the specific bacteria that differ vary widely between studies. No single bacterial species or group consistently predicts depression, which suggests that focusing on what bacteria *do* (their metabolic functions) may be more useful than listing which ones are present.

What they tested

The researchers conducted a systematic review asking one core question: **Do people with major depressive disorder have a different composition of gut bacteria compared to healthy controls?**

They searched PubMed for human case-control studies that:

Compared gut microbiota from stool samples in people with MDD versus healthy controls

Used DNA sequencing or quantitative PCR to identify bacterial types (taxa)

Were published in English before February 28, 2018

They did not test any intervention themselves. Instead, they synthesised findings across six existing studies, looking at which bacterial groups (from broad phyla down to specific genera) were higher or lower in depressed versus non-depressed people.

The outcome measures were:

**Alpha diversity:** How many different types of bacteria live in one person's gut (within-sample diversity)

**Beta diversity:** How different the bacterial communities are between groups (between-sample diversity)

**Relative abundance of specific bacterial taxa:** What percentage of the total bacterial population each group represents
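As a minimal sketch of the relative-abundance measure described above, the snippet below converts raw read counts per taxon into fractions of the sample total. The function name `relative_abundance` and the counts are hypothetical, chosen only for illustration; they are not data from the review or from any included study.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical read counts for one stool sample (illustrative, not study data).
sample = {"Bacteroides": 500, "Faecalibacterium": 300, "Bifidobacterium": 200}
fractions = relative_abundance(sample)

for taxon, share in fractions.items():
    print(f"{taxon}: {share:.1%}")
```

Because every value is a share of the same total, a rise in one taxon's relative abundance necessarily lowers the shares of the others, even if their absolute numbers did not change.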

Who was studied

The six studies included a total of **392 participants**: 204 with MDD and 188 healthy controls.

**Important population details:**

Five of six studies were conducted in Asia (four in China, one in Japan); one was in Norway

All MDD participants were on psychiatric medications (33–100% on antidepressants across studies)

One study (Lin et al.) standardised treatment to exactly 10 mg escitalopram per day

Two studies included participants on antipsychotics

Only three studies documented smoking status, and smokers were not matched between groups

One study (Naseribafrouei et al.) used controls who had neurological symptoms but no diagnosed condition — not truly healthy controls

Depression severity ranged from moderate (HAM-17 score ≥23) to severe (HAM-24 score ≥20)

How they measured it

Each study collected **stool samples** and analysed bacterial DNA using one of two methods:

1. **16S rRNA gene sequencing** (five studies): This technique reads a specific "barcode" gene that all bacteria share but that varies enough to identify different types. It can detect hundreds of bacterial taxa at once, but it yields relative abundances (each group's share of the sequenced reads) rather than absolute counts of bacteria.

2. **Quantitative PCR (qPCR)** (one study, Aizawa et al.): This method counts specific bacterial groups you already know you're looking for. It's more precise for targeted bacteria but misses everything you didn't think to test for.

**Diversity metrics used:**

**Alpha diversity:** Shannon index, Simpson's index, Chao1 richness estimator, ACE, Evenness — these measure how many different types of bacteria and how evenly they're distributed within one person's sample

**Beta diversity:** Principal Coordinate Analysis (PCoA) of Unweighted UniFrac distances — this measures how different the whole bacterial community is between groups
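To make the alpha diversity metrics listed above more concrete, here is a rough sketch of the textbook formulas for the Shannon index, Simpson's index, Pielou's evenness, and the bias-corrected Chao1 estimator. It assumes a simple list of per-taxon read counts for one sample. Several variants of these indices exist (for example, Simpson's index is reported as D, 1 - D, or 1/D), and the included studies may have used different ones. The counts in the usage lines are hypothetical.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def simpson(counts):
    """Simpson's diversity in the 1 - sum(p_i^2) form (one of several variants)."""
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S), where S is the number of observed taxa."""
    observed = sum(1 for n in counts if n > 0)
    if observed <= 1:
        return 0.0  # evenness is not meaningful with fewer than two taxa
    return shannon(counts) / math.log(observed)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for n in counts if n > 0)
    singletons = sum(1 for n in counts if n == 1)  # taxa seen exactly once
    doubletons = sum(1 for n in counts if n == 2)  # taxa seen exactly twice
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical per-taxon read counts for a single sample.
sample_counts = [5, 1, 1, 2]
print(shannon(sample_counts), simpson(sample_counts), chao1(sample_counts))
```

Unweighted UniFrac, the beta diversity distance named above, additionally needs a phylogenetic tree relating the taxa, so it is left out of this sketch.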

**Depression assessment tools:**

Hamilton Depression Rating Scale (HAM-17, HAM-21, or HAM-24 — different versions with 17, 21, or 24 items; higher scores = more severe depression)

Montgomery-Åsberg Depression Rating Scale (MADRS; 10 items, higher = more severe)

Mini International Neuropsychiatric Interview (MINI) — structured diagnostic interview

Structured Clinical Interview for DSM-IV (SCID) — gold standard diagnostic tool

Methodology

**Study design:** This is a **systematic review** — not a new experiment, but a rigorous synthesis of existing studies. The authors followed a structured protocol: they searched PubMed with specific terms ("depression," "depressive disorder," "stool," "fecal," "gut," "microbiome"), had two independent raters review all results and reach consensus on inclusion, and extracted data on bacterial taxa that differed between groups.

**What the included studies did (and why it matters):**

All six studies were **case-control designs** — they compared people who already had depression to people who didn't. This is fundamentally different from an experiment where you change something and measure the effect.

**Why case-control design matters for interpretation:**

**Cannot prove causation:** If depressed people have different gut bacteria, we don't know if the bacteria caused the depression, the depression changed the bacteria, or something else (diet, medication, stress) changed both

**No randomisation:** Participants weren't randomly assigned to be depressed or not — they came to the study with their condition already established

**No blinding:** Researchers knew who was depressed and who wasn't when analysing the samples (though sequencing read-outs leave less room for rater judgement than clinical ratings do)

**Confounding is rampant:** Depressed people eat differently, sleep differently, take different medications, and have different stress levels — all of which affect gut bacteria

**Specific methodological features of the included studies:**

**Duration:** These were cross-sectional — they took a single stool sample from each person at one point in time. No one followed participants over time to see if bacterial changes preceded or followed depression onset.

**Medication confound:** This is the biggest problem. Between 33% and 100% of MDD participants were on antidepressants. Antidepressants have antimicrobial properties and directly alter gut bacteria. Two studies included people on antipsychotics, which also affect the microbiome. The one study that standardised medication (Lin et al., all on escitalopram) had only 10 people per group.

**Antibiotic exclusion:** All but one study excluded people who had taken antibiotics in the past month. One study (Naseribafrouei et al.) didn't report antibiotic use at all.

**Probiotic exclusion:** Three studies excluded recent probiotic users; one study (Aizawa et al.) had five participants on probiotics.

**Medical exclusions varied:** Some studies excluded people with diabetes, heart disease, IBS, or inflammatory bowel disease; others didn't. One study (Zheng et al.) only excluded medical illness in the control group, not the MDD group — a major design flaw.

**What this design can prove:**

That there are statistical associations between depression and gut bacterial composition

Which bacterial groups tend to differ between depressed and non-depressed populations

That the relationship is complex and inconsistent across populations

**What this design cannot prove:**

That changing gut bacteria would treat or prevent depression

Whether bacterial differences are a cause, consequence, or coincidence of depression

That any specific bacterial group is a reliable biomarker for depression

**Statistical approach:** The review didn't perform a meta-analysis (pooling the numbers statistically) because the studies were too different in methods, populations, and reporting. Instead, they did a narrative synthesis — listing which taxa differed and in which direction.
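The direction-of-effect tallying behind a narrative synthesis like this can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the authors' actual procedure. The function `classify_direction` and its labels are hypothetical. The example input mirrors the Lachnospiraceae pattern reported under Key findings (higher in MDD in two studies, lower in two).

```python
def classify_direction(directions):
    """Summarise per-study directions ('higher' or 'lower' in MDD) for one taxon.

    `directions` holds one entry per study that reported a significant
    difference for the taxon; studies with no difference are simply absent.
    """
    allowed = {"higher", "lower"}
    unknown = set(directions) - allowed
    if unknown:
        raise ValueError(f"unexpected direction labels: {sorted(unknown)}")
    found = set(directions)
    if not found:
        return "no significant difference reported"
    if found == {"higher"}:
        return "consistently higher in MDD"
    if found == {"lower"}:
        return "consistently lower in MDD"
    return "divergent"


# Lachnospiraceae as summarised in this review: two studies higher, two lower.
lachnospiraceae = ["higher", "higher", "lower", "lower"]
print(classify_direction(lachnospiraceae))  # prints "divergent"
```

A tally like this ignores sample size and effect magnitude, which is one reason vote counting is considered weaker than a pooled meta-analysis.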

Key findings

**Overall result:** No consistent pattern emerged across studies. Of the 50 bacterial taxa that showed statistically significant differences (p < 0.05) between MDD and controls in at least one study, **none showed the same direction of difference in all studies.**

**By phylum (broad bacterial groups):**

**Firmicutes:** Most differentiating taxa were in this phylum — 9 families and 12 genera differed between groups. But results were split: some genera were higher in MDD, some lower, and some went in opposite directions across studies.

**Bacteroidetes:** Divergent results across studies — some found higher levels in MDD, others found lower

**Actinobacteria:** Divergent results

**Fusobacteria:** Divergent results

**Proteobacteria:** Divergent results

**Family-level finding:** **Lachnospiraceae** differentiated MDD from controls in four of six studies — but in two studies it was higher in MDD, and in two studies it was lower. Even the most consistent finding was inconsistent.

**Genera consistently higher in MDD (across studies that found a difference):**

*Anaerostipes*

*Blautia*

*Clostridium*

*Klebsiella*

*Lachnospiraceae incertae sedis*

*Parabacteroides*

*Parasutterella*

*Phascolarctobacterium*

*Streptococcus*

**Genera consistently lower in MDD:**

*Bifidobacterium*

*Dialister*

*Escherichia/Shigella*

*Faecalibacterium*

*Ruminococcus*

**Genera with divergent findings (higher in some studies, lower in others):**

*Alistipes*

*Bacteroides*

*Megamonas*

*Oscillibacter*

*Prevotella*

*Roseburia*

**Diversity findings:** Results for alpha diversity (how many types of bacteria live in one person) were mixed — some studies found lower diversity in MDD, others found no difference. Beta diversity (how different the communities are between groups) was also inconsistent.

**No study reported effect sizes** (like Cohen's d or odds ratios) — they only reported p-values for individual bacterial comparisons. This means we cannot say *how much* the bacteria differed, only *that* they differed statistically.
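For readers unfamiliar with effect sizes, the following is a minimal sketch of how Cohen's d could be computed from two groups' values, using the pooled standard deviation. The numbers are hypothetical and chosen for easy arithmetic; they are not taken from any included study. Relative abundances are often skewed, so d would only be a rough summary for this kind of data.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical relative abundances (% of total bacteria) for one genus.
mdd = [2.0, 3.0, 4.0]
controls = [4.0, 5.0, 6.0]
d = cohens_d(mdd, controls)
print(f"Cohen's d = {d:.2f}")  # prints "Cohen's d = -2.00"
```

A p-value alone cannot distinguish a large difference in a small sample from a trivial difference in a large one; a standardised measure such as d is what makes results comparable across studies.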

Effect magnitude

Because the studies reported only p-values (not effect sizes) and the results were inconsistent, it's impossible to give a meaningful effect magnitude. However, here's what we can say:

In the studies that found differences, the relative abundance of specific bacterial genera typically differed by a few percentage points of the total bacterial population — not dramatic shifts like doubling or halving

For example, *Bifidobacterium* (a genus commonly found in probiotic supplements) was lower in MDD in some studies, but the absolute difference was typically in the range of 1–5% of total bacteria

The inconsistency across studies is itself a finding: the "signal" of depression-related bacterial changes is weak enough that it gets drowned out by differences in diet, medication, geography, and lab methods

**Translation for a self-experimenter:** If you were to measure your own gut bacteria before and after some intervention, the changes you'd see from normal day-to-day variation (what you ate yesterday, whether you're stressed, your sleep quality) would likely be larger than any "depression signature" these studies detected.

Limitations

**What the authors acknowledge:**

Small sample sizes (10–63 per group)

Geographic limitation (five of six studies in Asia)

Medication confounds (antidepressants, antipsychotics affect gut bacteria)

Lack of standardised methods for DNA sequencing and analysis

No meta-analysis possible due to heterogeneity

Cross-sectional design cannot establish causation

**What a critical reader would add:**

1. **No healthy control group in one study:** Naseribafrouei et al. used controls with neurological symptoms — this is not a true control group and contaminates the comparison

2. **Diet was not controlled:** Diet is the single biggest factor shaping gut microbiota, and depressed people eat differently. None of the studies controlled for or even measured diet systematically

3. **Stool samples are imperfect proxies:** Stool bacteria don't perfectly represent the bacteria living in your intestinal lining (mucosal bacteria), which may be more relevant to brain-gut signalling

4. **16S sequencing has limited resolution:** It can identify bacteria to genus level but often not to species or strain level — different species within the same genus can have opposite effects on health

5. **Multiple comparisons problem:** Testing hundreds of bacterial groups for differences inflates the chance of false positives. The studies reported p < 0.05 without correction for multiple testing in most cases

6. **Publication bias:** Studies that find no differences between groups are less likely to be published, so the literature may overrepresent positive findings

7. **Industry funding not reported:** The review doesn't disclose whether any studies had funding from probiotic or pharmaceutical companies

8. **No preregistration:** The systematic review protocol wasn't preregistered (e.g., on PROSPERO), so we can't verify they didn't change their analysis plan after seeing the results

9. **Only one database searched:** PubMed alone — they may have missed relevant studies in Embase, Web of Science, or other databases

10. **Language restriction:** English only, which may bias toward certain countries and populations
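Point 5 in the list above (multiple comparisons) can be illustrated with a short sketch of two standard corrections, Bonferroni and Benjamini-Hochberg. The p-values below are hypothetical, and nothing here reflects what the included studies actually did. The helper names are assumptions made for this example.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject only p-values at or below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five taxa compared between MDD and controls.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
uncorrected = [p < 0.05 for p in p_values]
print(sum(uncorrected), "taxa significant without correction")
print(sum(bonferroni(p_values)), "after Bonferroni")
print(sum(benjamini_hochberg(p_values)), "after Benjamini-Hochberg")
```

In this toy example four taxa pass the uncorrected p < 0.05 threshold but only two survive either correction. With hundreds of taxa tested, the gap between uncorrected and corrected results can be much larger.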

Practical takeaways

For someone running their own n=1 experiment:

### What to test

Given the inconsistency in the literature, the most actionable target is **increasing butyrate-producing bacteria** (like *Faecalibacterium*, *Roseburia*, and *Lachnospiraceae*), which were lower in some MDD studies and are known to produce short-chain fatty acids that support gut barrier health and reduce inflammation.

**Specific interventions to try:**

**Dietary fibre (prebiotics):** 25–35 g/day of total fibre from diverse plant sources (vegetables, fruits, legumes, whole grains). Resistant starch (from cooked-and-cooled potatoes, green bananas, oats) specifically feeds butyrate producers

**Probiotic supplement:** A multi-strain probiotic containing *Lactobacillus* and *Bifidobacterium* species (the latter was lower in MDD in some studies). Look for at least 10 billion CFU/day

**Fermented foods:** 1–2 servings/day of yogurt, kefir, kimchi, sauerkraut, or kombucha — these introduce live bacteria and may increase diversity

### Minimum meaningful duration

Gut bacteria can shift within **24–48 hours** of a dietary change, but stabilising a new community takes **2–4 weeks**. For mood effects, allow **4–8 weeks** minimum — the gut-brain axis involves multiple steps (bacterial metabolism → immune signalling → vagus nerve → brain), so changes in mood may lag behind changes in bacteria.

### What to measure

**Primary outcome:** Mood symptoms

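Use a validated mood rating: **PHQ-9** (Patient Health Questionnaire, 0–27 scale; its items ask about the past two weeks, so it suits weekly or fortnightly scoring better than daily scoring) or **MoodZoom** (free app) once daily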

Track for at least 2 weeks before starting the intervention to establish baseline
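One simple way to compare a baseline phase with an intervention phase is sketched below. It assumes you log one numeric mood score per day; the scores shown are hypothetical, and `phase_summary` is an illustrative helper rather than a validated analysis method. Expressing the change in units of baseline standard deviation gives a rough sense of whether it stands out from ordinary day-to-day variation.

```python
import statistics


def phase_summary(baseline, intervention):
    """Compare mean daily scores between a baseline and an intervention phase."""
    base_mean = statistics.mean(baseline)
    base_sd = statistics.stdev(baseline)  # day-to-day spread during baseline
    change = statistics.mean(intervention) - base_mean
    return {
        "baseline_mean": base_mean,
        "baseline_sd": base_sd,
        "change": change,
        # Change expressed in baseline standard deviations (NaN if no spread).
        "change_in_baseline_sds": change / base_sd if base_sd else float("nan"),
    }


# Hypothetical daily scores (higher = worse mood): 14 baseline days, 7 later days.
baseline = [10, 12, 11, 13, 9, 11, 12, 10, 11, 12, 10, 13, 11, 9]
intervention = [9, 10, 8, 9, 10, 8, 9]
summary = phase_summary(baseline, intervention)
print(summary)
```

A before-and-after comparison like this cannot rule out placebo effects, seasonal shifts, or regression to the mean, so treat any change as a prompt for further tracking rather than as proof.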

**Secondary outcomes:**

**Stool consistency:** Bristol Stool Scale (1–7, with 3–4 being ideal) — constipation and diarrhoea both affect bacterial communities

**Stool frequency:** Number of bowel movements per day

**Energy levels:** 0–10 scale daily

**Sleep quality:** 0–10 scale daily

**Optional (if you have access):**

**Stool microbiome testing:** Companies like uBiome (now defunct), Viome, or DayTwo offer or have offered consumer stool sequencing (methods vary by company and are not all 16S-based). But be warned: the results are noisy, and the clinical relevance of individual bacterial levels is unclear

**Inflammatory markers:** High
