What the Research Says About Journalling
A synthesis of 6 studies on journalling — what actually works, what doesn't, and how to test it yourself.
The School-Based Mindfulness Meta-Analysis Found a 0.80 Effect on Cognitive Performance — But You're Probably Doing It Wrong
Here’s the number that should stop you mid-sentence: a meta-analysis of 24 school-based mindfulness studies, pooling data from 2,224 students, found that mindfulness interventions produced a Hedges' g of 0.80 for cognitive performance (attention, concentration, executive function, working memory). That’s a large effect by conventional benchmarks. For comparison, the same analysis found a g of only 0.39 for stress reduction and non-significant effects on emotional problems and third-party ratings (like teacher reports of behaviour). So if you’re journalling to think more clearly, the data says you might be on to something. If you’re journalling to fix your mood, the evidence is thinner than your average wellness influencer suggests.
But here’s the catch: the interventions in those studies weren’t “write whatever comes to mind for 10 minutes.” They were structured, often 5–45 minutes of formal practice, 1–5 times per week, for 4–24 weeks. And the cognitive boost came from mindfulness meditation — not free-form journalling. The studies that actually test journalling as a specific practice? They’re mostly absent from this meta-analysis. So before you buy that leather-bound notebook, let’s look at what the research actually supports, what it doesn’t, and how to run your own n=1 experiment without wasting your time.
What the research actually shows
The evidence base for journalling is a patchwork, not a cathedral. The most credible finding comes from the mindfulness meta-analysis (Zenner et al., 2014), which pooled 24 studies (13 published, 11 unpublished) and found a moderate overall benefit of g = 0.40 for mindfulness-based interventions compared to control conditions. But that overall number hides the real story: the cognitive domain (g = 0.80) was the standout, while emotional problems showed no statistically significant improvement. The stress reduction effect (g = 0.39) was real but modest.
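To make those g values less abstract, here is a minimal sketch of how a Hedges' g is typically computed for two groups: a standardised mean difference with a small-sample correction. The function name and the scores are made up for illustration, and the correction uses the common approximation; this is not a reconstruction of how the meta-analysis pooled its studies.

```python
import math
from statistics import mean, variance


def hedges_g(treatment, control):
    """Standardised mean difference with Hedges' small-sample correction."""
    n1, n2 = len(treatment), len(control)
    # Pooled variance from the two sample variances (n - 1 denominators).
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    # Cohen's d: difference in means, in pooled standard-deviation units.
    d = (mean(treatment) - mean(control)) / math.sqrt(pooled_var)
    # Common approximation to the small-sample correction factor.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Hypothetical attention-test scores, for illustration only.
mindfulness_group = [14, 15, 13, 16, 15, 14, 17, 15]
control_group = [13, 14, 12, 15, 13, 14, 15, 13]
print(round(hedges_g(mindfulness_group, control_group), 2))
```

The point to take away is that g is measured in standard deviations, so the same raw difference in scores gives a larger g when scores vary little and a smaller g when they vary a lot.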
The scoping study framework paper (Arksey & O'Malley, 2005) is worth mentioning here because it explains why the journalling literature is so messy. Scoping studies prioritise breadth over depth — they map what exists without testing causal claims. Most journalling research is exactly this: descriptive, not experimental. You’ll find plenty of correlational studies linking journalling to well-being, but very few randomised controlled trials (RCTs) with active control groups. The business model innovation review (Foss & Saebi, 2017) makes a similar point about a different field: without clear definitions and causal tests, you’re left with case studies and anecdotes. Journalling research suffers from the same problem.
The COVID-19 mental health survey (Hao et al., 2020) provides a useful cautionary tale. Among 2,065 outpatients at a Chinese hospital, about one in four reported clinically significant anxiety, depression, or insomnia during lockdown. But the cutoff scores used were low — GAD-7 ≥5 and PHQ-9 ≥5, which capture mild symptoms, not clinical disorders. If you’re journalling because you feel mildly anxious, you’re in the majority. If you’re journalling because you have a diagnosed condition, the evidence for journalling as a standalone treatment is weak.
The depression genetics study (Power et al., 2017) adds another layer: by splitting 22,158 depression cases by age of onset, researchers found that early-onset depression (before age 27) shares more genetic overlap with schizophrenia and bipolar disorder than late-onset depression does. What does this mean for journalling? If your low mood started early in life, it may be biologically distinct — and less likely to respond to a generic journalling protocol. Your n=1 experiment needs to account for your baseline.
The nuance most people miss
The biggest gap in the journalling literature is the lack of active control conditions. Most studies compare journalling to “no intervention” or “waitlist.” That’s a low bar. When the mindfulness meta-analysis compared mindfulness to active controls (like relaxation training or social-emotional learning), the effects shrank. The cognitive boost (g = 0.80) came from studies that often used no-intervention controls — meaning some of that effect could be placebo, novelty, or simply paying attention to anything for 20 minutes a day.
Another nuance: the school-based mindfulness studies included both published and unpublished data. Unpublished studies tend to have smaller or null effects, so including them is good science: it guards against publication bias. Even so, a domain-level figure like the cognitive g of 0.80 rests only on the subset of studies that measured that domain, so treat it as an imprecise estimate that could shrink with more data. The emotional problems domain was not statistically significant at all. If you’re journalling to feel better emotionally, you’re betting on an effect that the best available evidence doesn’t confirm.
The consensus recommendations for automated insulin delivery (Boughton & Hovorka, 2022) offer a methodological lesson: when you have dozens of RCTs and real-world data from over 50,000 users, you can make confident recommendations. Journalling research doesn’t have that. The largest studies are observational, the definitions are fuzzy, and the outcomes are self-reported. If you’re running a personal experiment, you need to be your own RCT — with a clear intervention, a control condition, and a specific metric.
Practical implications
Test structured mindfulness, not free-form journalling. The cognitive effect (g = 0.80) came from formal mindfulness practice — breathing meditation, body scans, mindful movement. If you want to think more clearly, try 10 minutes of guided breath-counting, not “dear diary.” The meta-analysis found effects with sessions ranging from 5–45 minutes, 1–5 times per week. Start with 10 minutes daily for 4 weeks.
Measure cognitive performance, not mood. The emotional problems domain was not statistically significant. If you measure “how I feel” on a 1–10 scale, you’re likely to see noise, not signal. Instead, test something concrete: reaction time on a simple task, number of words recalled from a list, or time to complete a puzzle. The cognitive effect was large; the mood effect was not.
Watch for the placebo of novelty. The mindfulness studies that used active controls (like relaxation training) found smaller effects. If you start journalling and feel better immediately, that could be the Hawthorne effect — any structured attention feels good. To separate signal from noise, run a crossover design: 4 weeks of journalling, 4 weeks of an active control (like listening to music or doing puzzles), and compare the results.
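A crossover comparison like the one just described can be summarised with a few lines of arithmetic. The sketch below assumes a "higher is better" metric (such as words recalled) and weekly scores for a baseline week set, a journalling phase, and an active-control phase; the function name and all numbers are hypothetical.

```python
from statistics import mean


def crossover_summary(baseline, journalling, control):
    """Compare each phase against baseline and against the other phase.

    Assumes higher scores are better (e.g. words recalled).
    """
    base = mean(baseline)
    journalling_gain = mean(journalling) - base
    control_gain = mean(control) - base
    return {
        "journalling_gain": journalling_gain,
        "control_gain": control_gain,
        # What journalling adds beyond doing any structured activity.
        "net_effect": journalling_gain - control_gain,
    }


# Hypothetical weekly word-recall scores, for illustration only.
summary = crossover_summary(
    baseline=[10, 11, 10, 9],
    journalling=[12, 13, 12, 13],
    control=[11, 12, 11, 12],
)
print(summary)
```

If the control phase improves almost as much as the journalling phase, most of the gain is probably novelty or practice on the test rather than the journalling itself.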
Design your own experiment
What to test: Structured mindfulness journalling — specifically, 10 minutes of breath-counting meditation followed by 5 minutes of writing about what you noticed during the meditation. This combines the intervention that produced the cognitive effect (g = 0.80) with a written component that lets you track your progress.
How long to run it: Minimum 4 weeks. The meta-analysis included studies ranging from 4 to 24 weeks. Four weeks is enough to see a cognitive effect if one exists, but not so long that you’ll quit.
What to measure: Choose one cognitive metric that you can test daily or weekly. Options: (1) Time to complete a standardised puzzle (e.g., Sudoku or a word search), (2) Number of words recalled from a 20-word list after a 5-minute delay, (3) Reaction time on a simple online test (e.g., the Stroop test). Measure your baseline for 1 week before starting the intervention.
What confound to watch for: Sleep quality. The COVID-19 survey found that about one in four outpatients reported clinically significant insomnia during lockdown. Poor sleep crushes cognitive performance. If your sleep changes during the experiment, your cognitive metric will change too, and you won’t know if it’s the journalling or the sleep. Track your sleep hours and quality (1–10 scale) daily. If your average nightly sleep shifts by more than 1 hour between the baseline week and the intervention weeks, treat the cognitive data as unreliable.
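One way to apply the 1-hour rule is to compare average nightly sleep between the baseline week and the intervention period. That reading of the rule is an assumption, as are the function name and the sample sleep logs in this sketch.

```python
from statistics import mean


def sleep_confound(baseline_hours, intervention_hours, max_shift=1.0):
    """Return (flagged, shift) where shift is the change in mean nightly sleep.

    flagged is True when the average moved by more than max_shift hours,
    in either direction.
    """
    shift = mean(intervention_hours) - mean(baseline_hours)
    return abs(shift) > max_shift, shift


# Hypothetical sleep logs in hours per night, for illustration only.
baseline_sleep = [7.5, 7, 8, 7.5, 7, 7.5, 8]
intervention_sleep = [6, 6.5, 6, 5.5, 6.5, 6, 6]
flagged, shift = sleep_confound(baseline_sleep, intervention_sleep)
print(flagged, round(shift, 2))
```

A flagged result doesn't mean the experiment is wasted, only that any change in the cognitive metric can't be cleanly attributed to the intervention.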
What a positive result looks like: A consistent improvement of at least 10% on your chosen cognitive metric from baseline to week 4, with sleep variation less than 1 hour per night. A 15% improvement in word recall or a 20% faster puzzle time would be a meaningful change for you, but don’t read it as matching the g = 0.80 from the meta-analysis: g is expressed in standard-deviation units across groups of students, so a personal percentage change doesn’t translate into it directly. If you see no change or a decline, don’t force it. The evidence for emotional benefits is weak, and the cognitive benefit may not work for everyone. Your n=1 experiment is not a failure if it tells you “this doesn’t work for me.” That’s data.
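The 10% threshold is simple arithmetic, but the direction of "better" depends on the metric: more words recalled is an improvement, while a longer puzzle time is not. A minimal sketch, with a hypothetical function name and made-up measurements:

```python
from statistics import mean


def percent_improvement(baseline, final, higher_is_better=True):
    """Percentage change from baseline mean to final mean.

    The sign is flipped for 'lower is better' metrics such as puzzle time,
    so a positive return value always means improvement.
    """
    base, end = mean(baseline), mean(final)
    change = (end - base) / base * 100
    return change if higher_is_better else -change


# Hypothetical measurements, for illustration only.
recall_gain = percent_improvement([10, 11, 9, 10, 10], [12, 11, 12, 12, 13])
puzzle_gain = percent_improvement(
    [300, 310, 290], [255, 260, 250], higher_is_better=False
)

THRESHOLD = 10  # the 10% bar suggested above
print(recall_gain >= THRESHOLD, puzzle_gain >= THRESHOLD)
```

Using the mean of several measurements at each end, rather than a single day's score, keeps one unusually good or bad session from deciding the result.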