Trends in Research on Writing as a Learning Activity

Authors: Peter Klein, Pietro Boscolo
Journal: Journal of Writing Research
Year: 2016
DOI: 10.17239/jowr-2016.07.03.01
Citations: 212

TL;DR

Writing reliably improves learning across subjects, but the effect depends on *how* you write. Reflective, explanatory writing that forces you to reorganise knowledge works far better than simple note-taking or summarising, and the average gain is roughly 0.3–0.5 standard deviations on tests (a small-to-medium effect).

What they tested

This is a narrative review that draws on earlier meta-analyses, not a single experiment or a new meta-analysis. The authors synthesised findings from dozens of studies spanning 1980–2015 to answer: Does writing improve learning? If so, what kinds of writing work best, for whom, and under what conditions?

The "intervention" across studies was writing-to-learn (WTL) activities — anything from short in-class writing prompts to extended essays — compared to control conditions that involved no writing, or writing that was not designed to promote learning (e.g., copying notes verbatim). The outcome measures were typically standardised tests of content knowledge, conceptual understanding, or critical thinking in subjects like science, history, and mathematics.

The review identifies five major trends:

1. **From conflicting results to reliable effects** — early studies showed mixed findings; meta-analyses now confirm writing *does* improve learning, but the size of the effect depends on moderators.

2. **From medium-as-magic to self-regulated learning** — writing is not inherently beneficial; it works when writers actively set goals, monitor understanding, and revise.

3. **From domain-general to discipline-specific** — generic "write about it" prompts are less effective than writing tasks that mirror the reasoning patterns of a specific field (e.g., writing a lab report in science vs. a historical argument).

4. **From individual cognition to social processes** — writing-to-learn is not just a solo act; peer feedback, collaborative writing, and classroom discussion amplify effects.

5. **From school-only to lifelong learning** — writing-to-learn is now studied in professional contexts (e.g., reflective journals for nurses, engineers, managers).

Who was studied

The meta-analyses cited within this review draw on thousands of participants across dozens of studies. For example, one key meta-analysis (Bangert-Drowns et al., 2004) included 48 studies with a total of ~3,000 students. Another (Graham & Perin, 2007) covered 123 studies with ~14,000 students. Participants ranged from elementary school children (ages 8–12) to university undergraduates (ages 18–25) and professionals (e.g., teachers, nurses, engineers in continuing education). Settings were primarily classroom-based in North America and Western Europe, with some studies from Asia and Australia. Most participants were native speakers of the language of instruction (English, Dutch, German, or Italian). No clinical populations were studied; participants were generally healthy, typically developing learners.

How they measured it

The review does not use a single instrument; it aggregates across studies that used:

**Standardised content knowledge tests** (e.g., multiple-choice or short-answer tests on biology, history, physics concepts)

**Conceptual understanding measures** (e.g., open-ended questions scored for depth of explanation, use of evidence, or integration of ideas)

**Critical thinking assessments** (e.g., the Watson-Glaser Critical Thinking Appraisal, or researcher-designed rubrics for argument quality)

**Transfer tasks** (e.g., applying a learned concept to a novel problem)

**Writing quality rubrics** (e.g., holistic scores for organisation, clarity, evidence use)

**Self-report surveys** (e.g., measures of metacognitive awareness, self-efficacy for writing, or perceived learning)

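Effect sizes were reported as Cohen's *d* or Hedges' *g* (standardised mean differences: the gap between group means divided by the pooled standard deviation). By Cohen's conventional benchmarks, a *d* of 0.2 is small, 0.5 is medium, and 0.8 is large.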
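To make these two statistics concrete, here is a minimal Python sketch of how a standardised mean difference could be computed from two groups of scores. The function names and the quiz scores are invented for illustration and do not come from the review; Hedges' *g* is shown with the commonly used approximate small-sample correction.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Difference in group means divided by the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def hedges_g(treatment, control):
    """Cohen's d multiplied by the usual approximate small-sample correction."""
    n1, n2 = len(treatment), len(control)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d(treatment, control) * correction


# Hypothetical quiz scores (percent correct), invented for illustration only.
writing_group = [72, 75, 78, 80, 70]
control_group = [68, 74, 71, 77, 65]

d = cohens_d(writing_group, control_group)
g = hedges_g(writing_group, control_group)
print(f"Cohen's d = {d:.2f}, Hedges' g = {g:.2f}")
```

With samples this small the correction is noticeable, which is why meta-analyses of small classroom studies often report *g* rather than *d*.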

Methodology

**Design:** This is a narrative review with embedded meta-analytic findings. The authors did not conduct a new meta-analysis; they synthesised results from several prior meta-analyses and qualitative reviews. They searched databases (ERIC, PsycINFO, Web of Science) for studies published 1980–2015, using keywords like "writing to learn," "writing across the curriculum," "writing and thinking," and "reflective writing." They included both experimental and quasi-experimental studies, as well as qualitative case studies.

**Key methodological features of the underlying studies:**

**Randomisation:** Many (but not all) studies randomly assigned students to writing vs. non-writing conditions. Some used intact classrooms (quasi-experimental), which weakens causal claims.

**Blinding:** Rare. Teachers and students usually knew which condition they were in. Outcome assessors were sometimes blind to condition, but not always.

**Duration:** Interventions ranged from a single 15-minute writing session to a full semester (12–16 weeks). The review notes that longer interventions (≥4 weeks) tend to produce larger effects.

**Control conditions:** Varied widely — some studies compared writing to no writing (e.g., reading only, discussion only), others compared different types of writing (e.g., summary vs. explanation vs. freewriting).

**What this design can and cannot show:**

**Can show:** That writing, on average, produces a reliable improvement in learning across many contexts (because the meta-analyses aggregate enough studies to reduce sampling error). The review can also identify moderators: for example, explanatory writing works better than summary writing, and discipline-specific writing works better than generic prompts.

**Cannot show:** Causality for any single study's findings (because many studies lack random assignment or blinding); that writing is *always* better than other learning activities (e.g., discussion, drawing, or self-explanation aloud); or that the effects are due to writing *per se* rather than to extra time-on-task or cognitive effort (writing takes longer than reading, so the comparison is confounded by time).

**Major methodological weaknesses acknowledged by the authors:**

- Publication bias (studies with null results are less likely to be published)

- Heterogeneity in outcome measures (comparing apples to oranges)

- Lack of standardised writing tasks across studies

- Most studies are short-term (weeks, not months or years)

- Few studies measure long-term retention (e.g., 6 months later)

- Most studies are in school settings with captive participants, so generalisability to self-directed adult learners is unknown

Key findings

**Primary findings (from meta-analyses):**

**Writing improves learning reliably.** The overall effect size across meta-analyses is approximately *d* = 0.30–0.50 (small-to-medium). This means the average student in a writing condition scored about 0.3–0.5 standard deviations higher than the average student in a non-writing control condition. For context, if scores are roughly normally distributed, this is equivalent to moving from the 50th percentile to somewhere between the 62nd and 69th percentile.

**Explanatory writing > summary writing.** Studies that asked students to explain, argue, or apply concepts (e.g., "Explain why the heart pumps blood in one direction") produced larger effects (*d* ≈ 0.50–0.70) than studies that asked for summaries or notes (*d* ≈ 0.10–0.20). The difference was statistically significant (p < 0.01 in the Bangert-Drowns meta-analysis).

**Discipline-specific writing > generic writing.** Writing tasks that mirrored the reasoning of a field (e.g., writing a scientific hypothesis, a historical argument, a mathematical proof) produced larger effects (*d* ≈ 0.45–0.65) than generic prompts like "write what you learned today" (*d* ≈ 0.20–0.30). This supports the Writing in the Disciplines (WID) approach over Writing Across the Curriculum (WAC).

**Feedback amplifies effects.** Studies that included peer or instructor feedback on the writing produced larger effects (*d* ≈ 0.55) than studies with no feedback (*d* ≈ 0.25). The difference was significant (p < 0.05).

**Longer interventions produce larger effects.** Interventions lasting ≥4 weeks had a mean effect size of *d* ≈ 0.45, compared to *d* ≈ 0.20 for single-session interventions (p < 0.05).

**Secondary findings (from narrative review):**

**Writing promotes self-regulated learning.** Students who set goals, monitor their understanding, and revise their writing show larger learning gains than those who write without metacognitive awareness. This was supported by qualitative studies and correlational data.

**Social writing contexts matter.** Collaborative writing, peer review, and classroom discussion around writing all enhance learning beyond solitary writing. The effect is not purely cognitive — social accountability and exposure to others' reasoning appear to drive additional gains.

**Reflective writing in professional contexts.** Studies of nurses, teachers, and managers who kept reflective journals showed improvements in problem-solving and decision-making (measured by supervisor ratings and self-report), though effect sizes were smaller (*d* ≈ 0.20–0.30) and based on fewer studies.

**Transfer is limited but present.** A few studies tested whether writing about one topic improved learning of a related but untaught topic. Transfer effects were small (*d* ≈ 0.15) and not always statistically significant.

Effect magnitude

In plain English: Writing about what you're learning, especially if you explain it in your own words, argue a point, or apply it to a new situation, moves a typical learner up by roughly 12–19 percentile ranks on average (from the 50th percentile to about the 62nd–69th). How many exam points that represents depends on how widely scores are spread. Assuming a standard deviation of about 15 percentage points, it works out to roughly 5–8 points, or about half a letter grade on a typical exam. The effect is not huge, but it's consistent across subjects and age groups. It's in the same general range as the benefit of studying with practice tests instead of re-reading notes, or of spacing your study sessions instead of cramming.

For a self-experimenter: If you normally score 70% on a quiz after reading a chapter, adding a 15-minute explanatory writing task might raise your score to about 75–78% (under the same assumed 15-point spread). The effect is larger if you get feedback on your writing, if you write in a format specific to the subject (e.g., a lab report for science, a case analysis for business), and if you do it repeatedly over several weeks.
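The step from a standardised effect size to percentile ranks and raw exam points can be sketched in a few lines of Python. This is an illustration under two stated assumptions that are not taken from the review: scores are normally distributed, and the test has a standard deviation of 15 percentage points. The function names are hypothetical.

```python
from statistics import NormalDist


def percentile_after_shift(d):
    """Percentile rank, within the control distribution, of the average treated learner.

    Assumes normally distributed scores with equal spread in both groups.
    """
    return 100 * NormalDist().cdf(d)


def points_gain(d, score_sd):
    """Raw-score gain implied by d on a test whose scores have standard deviation score_sd."""
    return d * score_sd


ASSUMED_SD = 15  # assumption: percentage-point spread of scores on your test

for d in (0.3, 0.5):
    print(
        f"d = {d}: about the {percentile_after_shift(d):.0f}th percentile, "
        f"about {points_gain(d, ASSUMED_SD):.1f} points"
    )
```

For *d* = 0.3 and *d* = 0.5 this gives roughly the 62nd and 69th percentiles, and gains of 4.5 and 7.5 points under the assumed spread. A test with a narrower or wider spread of scores would give proportionally smaller or larger point gains.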

Limitations

**Acknowledged by the authors:**

- Publication bias: null results are less likely to be published, so the true effect may be smaller.

- Heterogeneity: studies used wildly different writing tasks, outcome measures, and populations, making it hard to pinpoint what drives the effect.

- Short duration: most studies lasted weeks, not months or years. Long-term retention is rarely measured.

- Lack of standardisation: there is no "standard dose" of writing; some studies used 5-minute writes, others used 2-hour essays.

- Confounding with time-on-task: writing takes longer than reading or listening, so the learning benefit may be partly due to extra time spent engaging with the material, not writing *per se*.

**Critical reader notes:**

**No blinding:** In most studies, teachers and students knew the condition. This introduces demand characteristics (students may try harder because they know they're in the "writing" group) and teacher expectancy effects.

**Self-report bias:** Some outcomes (e.g., perceived learning, self-efficacy) are based on self-report, which correlates only modestly with actual learning.

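**Population limits:** Almost all studies are in formal education settings (schools, universities). The few studies of professionals are based on smaller samples and rely on supervisor ratings and self-report, so generalisability to self-directed adult learners, working professionals, or older adults remains uncertain.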

**No dose-response data:** Beyond the finding that interventions of four weeks or more outperform single sessions, the review cannot tell you the optimal "dose" of writing (e.g., 10 minutes vs. 30 minutes, daily vs. weekly).

**No comparison to other active learning strategies:** Writing is compared to passive controls (reading, listening) but rarely to other active strategies like teaching someone else, drawing concept maps, or self-explanation aloud. So we don't know if writing is *uniquely* beneficial or just another form of active processing.

**Industry funding:** Not applicable; this is an academic review with no industry funding.

Practical takeaways

For someone running their own n=1 experiment:

### What to test

**Intervention:** Explanatory writing about a topic you're learning. Specifically: after reading a chapter or watching a lecture, spend 15–20 minutes writing an explanation of the key concept *as if teaching it to a beginner*. Do not just summarise — explain *why* it works, give an example, and anticipate a common misunderstanding.

**Dose:** 15–20 minutes per session, 3–5 times per week, for at least 4 weeks.

**Comparator:** Your usual study method (e.g., re-reading notes, highlighting, or watching videos). Alternate weeks or topics to control for difficulty.

### Minimum meaningful duration

**4 weeks minimum.** The review shows that single-session effects are small and may not persist. You need repeated practice to see a reliable change.

**Longer is better:** 8–12 weeks would give you more data points and a clearer signal.

### What to measure

**Primary metric:** Score on a consistent test of the material (e.g., a quiz you write before you start studying, or a published practice exam), kept at the same format and difficulty across conditions. Take a pre-test before starting, then a post-test after each week or after the full intervention.

**Secondary metrics:**

- Time spent studying (to control for the confound that writing takes longer)

- Self-rated understanding (1–10 scale, "How well do you feel you understand this topic?")

- Retention at 1 month and 3 months (re-test without review)

- Writing quality (if you want to track improvement in your explanations — use a rubric: clarity, accuracy, use of examples, anticipation of counterarguments)
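One way to keep these metrics together is a small per-session record. The sketch below is only a suggestion: the class name, field names, and example values are hypothetical, and scores are assumed to be percent correct. Dividing the gain by minutes studied is one simple way to account for the time-on-task confound.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudySession:
    topic: str
    condition: str                  # "writing" or "control"
    minutes_studied: float
    pre_score: float                # percent correct before studying
    post_score: float               # percent correct after studying
    self_rating: int                # 1-10 self-rated understanding
    retest_1_month: Optional[float] = None
    retest_3_month: Optional[float] = None

    def gain(self) -> float:
        """Improvement from pre-test to post-test, in percentage points."""
        return self.post_score - self.pre_score

    def gain_per_minute(self) -> float:
        """Gain normalised by study time, to compare conditions of unequal length."""
        return self.gain() / self.minutes_studied

    def retention_1_month(self) -> Optional[float]:
        """Share of the post-test score still present at the 1-month retest."""
        if self.retest_1_month is None or self.post_score == 0:
            return None
        return self.retest_1_month / self.post_score


# Example values are invented for illustration.
session = StudySession(
    topic="cell respiration",
    condition="writing",
    minutes_studied=20,
    pre_score=40.0,
    post_score=75.0,
    self_rating=7,
    retest_1_month=60.0,
)
print(session.gain(), session.gain_per_minute(), session.retention_1_month())
```

Writing-quality rubric scores could be added as further fields if you decide to track them.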

### Key confounds to control for

**Time-on-task:** Writing takes longer than re-reading. To isolate the effect of writing *per se*, you could match time: spend 15 minutes writing vs. 15 minutes re-reading (not 15 minutes writing vs. 5 minutes re-reading).

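**Topic difficulty:** Alternate topics across conditions (e.g., Week 1: Topic A with writing, Topic B with re-reading; Week 2: Topic C with re-reading, Topic D with writing). Use fresh topics each week, since a topic you have already studied is no longer a fair test. Varying which topic gets which method guards against the possibility that one topic is inherently easier.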

**Order effects:** If you always write first, you might be more motivated. Randomise the order of conditions across weeks.

**Prior knowledge:** Take a pre-test for each topic to ensure baseline equivalence.

**Feedback:** If you get feedback on your writing (e.g., from a peer or AI), that's a separate variable. Decide whether you want to test writing alone or writing + feedback.

**Sleep, stress, caffeine:** These affect learning. Keep them as constant as possible across conditions, or log them and check for confounding.
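The controls for time-on-task, topic difficulty, and order effects can be built into the plan before you start. The following Python sketch is one possible way to do that, not a procedure from the review: it pairs topics by week, randomly assigns one topic of each pair to each condition, randomises which condition comes first, and gives both conditions the same number of minutes. The function name and the dictionary keys are hypothetical.

```python
import random


def build_schedule(topics, minutes=15, seed=0):
    """Return a list of planned sessions, two per week, one per condition."""
    if len(topics) % 2 != 0:
        raise ValueError("Provide an even number of topics: one per condition each week.")
    rng = random.Random(seed)  # fixed seed so the plan can be regenerated
    schedule = []
    for week, start in enumerate(range(0, len(topics), 2), start=1):
        pair = [topics[start], topics[start + 1]]
        rng.shuffle(pair)              # random topic-to-condition assignment
        conditions = ["writing", "re-reading"]
        rng.shuffle(conditions)        # random order of conditions within the week
        for topic, condition in zip(pair, conditions):
            schedule.append(
                {"week": week, "topic": topic, "condition": condition, "minutes": minutes}
            )
    return schedule


# Placeholder topic names; substitute the chapters or lectures you plan to study.
plan = build_schedule(["A", "B", "C", "D", "E", "F", "G", "H"])
for row in plan:
    print(row)
```

Deciding the assignment in advance, rather than week by week, removes the temptation to give the writing condition the topics you feel most motivated about.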

### What a positive result would look like

- Your post-test scores are consistently 5–15 percentage points higher after writing sessions compared to control sessions.

- The effect is larger for topics where you wrote explanations vs. summaries.

- Your self-rated understanding increases more rapidly during writing weeks.

- You retain more information at 1-month follow-up (e.g., 70% retention after writing vs. 50% after re-reading).

- You notice that your explanations become clearer and more detailed over time (qualitative improvement).
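To check whether a score gap between conditions is more than noise, one simple option is a permutation test, sketched below in Python. This is an assumption-laden illustration rather than anything prescribed by the review: the scores are invented, the function names are hypothetical, and the test treats sessions as exchangeable, which is only approximately true when all the data come from one person over time.

```python
import random


def mean_difference(writing, control):
    """Mean score after writing sessions minus mean score after control sessions."""
    return sum(writing) / len(writing) - sum(control) / len(control)


def permutation_p_value(writing, control, n_shuffles=10_000, seed=0):
    """One-sided p-value: share of random relabellings with a gap at least as large as observed."""
    rng = random.Random(seed)
    observed = mean_difference(writing, control)
    pooled = list(writing) + list(control)
    n = len(writing)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if mean_difference(pooled[:n], pooled[n:]) >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical post-test scores (percent correct), invented for illustration only.
writing_scores = [78, 82, 75, 80, 85, 79]
control_scores = [70, 74, 72, 68, 77, 71]

diff = mean_difference(writing_scores, control_scores)
p = permutation_p_value(writing_scores, control_scores)
print(f"Mean difference: {diff:.1f} points, one-sided p ≈ {p:.3f}")
```

A small p-value here only says the gap is unlikely under random labelling of sessions; it does not rule out the confounds listed earlier, such as unequal study time or easier topics.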

**Caveat:** Your n=1 results may not generalise. But if you see a consistent pattern over 4+ weeks, with study time and topic difficulty balanced across conditions, it's reasonable evidence that explanatory writing works *for you*. If you see no effect, try increasing the dose (30 minutes), adding feedback, or switching to discipline-specific writing (e.g., write a scientific hypothesis instead of a general explanation). The review suggests that the "right" kind of writing matters more than just writing anything.
