Does Music Training Enhance Literacy Skills? A Meta-Analysis
- Authors
- Reyna L. Gordon, Hilda M. Fehd, Bruce D. McCandliss
- Journal
- Frontiers in Psychology
- Year
- 2015
- Citations
- 240
TL;DR
Music training produces a small but reliable improvement in phonological awareness (the ability to hear and manipulate sounds in words) in children, equivalent to about a 0.2 standard deviation gain, but does not reliably improve reading fluency; the effect on rhyming skills grows stronger with more hours of practice.
What they tested
This meta-analysis tested whether structured music training (instrument lessons, singing programs, or rhythm-based training) causes improvements in literacy-related language skills in children, compared to control groups that received no music training or alternative non-music interventions (e.g., art classes, extra reading instruction, or no intervention at all).
The key hypothesis was "direct transfer": that learning music—particularly its rhythmic and tonal patterns—directly strengthens the same brain mechanisms used for processing speech sounds and reading.
Two main outcome categories were analysed:
**Phonological awareness**: The ability to identify, segment, blend, and manipulate sounds in spoken words (e.g., recognising that "cat" and "bat" rhyme, or that "dog" has three sounds: /d/ /o/ /g/). This is a foundational skill for learning to read.
**Reading fluency**: The ability to read text accurately, quickly, and with appropriate expression (e.g., words read correctly per minute on timed word lists or passages).
Moderators examined included:
Total hours of music training
Age of children
Type of control intervention (e.g., no treatment vs. alternative enrichment like art or drama)
Who was studied
The meta-analysis pooled data from **13 studies** involving a total of **901 children**. The children ranged in age from approximately **4 to 11 years old** (preschool through elementary school). All studies were peer-reviewed and met strict inclusion criteria: they had to include a music training group versus a control group, pre- and post-test measures, and evidence that reading instruction was held constant across groups (so any difference could be attributed to music training, not to differences in classroom reading instruction).
The children came from diverse socioeconomic backgrounds, though the authors note that many studies did not report SES in sufficient detail to analyse its effects. Most studies were conducted in school settings, with music training delivered as part of the school day or as an after-school program. No studies included children with diagnosed learning disabilities or language impairments; all were typically developing children.
How they measured it
The meta-analysis did not use a single instrument but synthesised results across multiple standardised and researcher-developed tests. The key measures were:
**Phonological awareness**: Tests such as the Comprehensive Test of Phonological Processing (CTOPP), the Phonological Awareness Test (PAT), and researcher-designed rhyming tasks. These typically involve tasks like:
- Rhyme detection ("Do 'cat' and 'bat' rhyme?")
- Phoneme segmentation ("What sounds do you hear in 'dog'?")
- Blending ("What word is /c/ /a/ /t/?")
- Elision ("Say 'smile' without the /s/")
**Reading fluency**: Tests such as the Test of Word Reading Efficiency (TOWRE), the Woodcock Reading Mastery Tests, and curriculum-based measures of words read per minute. These assess speed and accuracy of reading words and connected text.
**Music training hours**: Reported by study authors, ranging from approximately 10 to 120 total hours of training across the studies.
**Age**: Treated as a continuous variable, with children ranging from 4 to 11 years.
**Control type**: Categorised as "no treatment" (children continued normal school activities), "alternative enrichment" (e.g., art, drama, sports), or "academic tutoring" (e.g., extra reading instruction).
Effect sizes were calculated using **Hedges' g** (a bias-corrected version of Cohen's d), which expresses the difference between music training and control groups in standard deviation units. A random-effects model was used to account for variability across studies.
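As a rough illustration of the calculation described above, the sketch below computes Hedges' g from two groups' post-test summary statistics, using the standard small-sample correction J = 1 - 3/(4·df - 1). The function name and the example scores are hypothetical and are not taken from the included studies. The paper's own computations may differ in detail (for example, by working from pre/post change scores).

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for a treatment vs. control comparison."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled      # Cohen's d
    j = 1 - 3 / (4 * df - 1)               # small-sample correction factor
    # Usual large-sample approximation to the variance of d
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d

# Hypothetical example: music group mean 105, control mean 100,
# SD 15 in both groups, 30 children per group.
g, var_g = hedges_g(105, 15, 30, 100, 15, 30)
print(f"g = {g:.3f}, SE = {math.sqrt(var_g):.3f}")
```

With samples this small the correction shrinks d only slightly, which is why g and d are often close in practice.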
Methodology
### Study design
This is a **meta-analysis**, meaning it statistically combines results from multiple independent studies to estimate an overall effect. The authors conducted a systematic literature review, searching databases (PsycINFO, ERIC, PubMed, Web of Science) and reference lists of relevant articles. They identified 13 studies that met all inclusion criteria.
### Inclusion criteria (why they matter)
The authors applied three strict criteria that are critical for testing the "direct transfer" hypothesis:
1. **Music training vs. control groups**: Without a control group, you cannot rule out that any improvement is due to maturation, practice effects, or placebo effects. This is the minimum requirement for a causal claim.
2. **Pre- vs. post-test measures**: Without pre-test data, you cannot know whether groups were equivalent at baseline. Some studies found that children who chose music training already had higher literacy scores before training began—a classic selection bias.
3. **Reading instruction held constant across groups**: This is the most important and most often violated criterion. If the music group also received extra reading tutoring, or if the control group received less reading instruction, any benefit could be due to reading practice, not music. The authors excluded studies where this was not controlled.
### Statistical approach
The authors used a **random-effects meta-analysis**, which assumes that the true effect size varies across studies (due to differences in populations, interventions, and measures). This is more conservative than a fixed-effects model and produces wider confidence intervals. They also tested for **publication bias** (the tendency for studies with null results to go unpublished) using funnel plots and Egger's test.
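To make the random-effects idea concrete, here is a minimal sketch of one common estimator, DerSimonian-Laird. Which estimator the authors actually used is not stated in this summary, so treat the choice as an assumption. The effect sizes and variances in the example are invented for illustration.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool study effects under a random-effects model (DerSimonian-Laird).

    Returns (pooled effect, standard error, tau^2), where tau^2 is the
    estimated between-study variance.
    """
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's own variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2

# Hypothetical study-level effect sizes (Hedges' g) and their variances
effects = [0.05, 0.45, 0.10, 0.30]
variances = [0.02, 0.03, 0.01, 0.04]
pooled, se, tau2 = dersimonian_laird(effects, variances)
print(f"pooled g = {pooled:.3f}, 95% CI = "
      f"[{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}], tau^2 = {tau2:.4f}")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and the two models give the same answer.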
### Moderator analyses
The authors examined whether effect sizes varied by:
**Hours of training** (continuous)
**Age** (continuous)
**Control type** (categorical)
These analyses help answer "for whom and under what conditions does music training work?"
### What this design can and cannot prove
**Can prove**: That, across the existing literature, music training is associated with a small but statistically reliable improvement in phonological awareness compared to control conditions. Because the included studies were randomised or quasi-experimental with pre-test equivalence, the meta-analysis provides stronger evidence than any single correlational study.
**Cannot prove**:
That music training *causes* improvements in reading fluency (the meta-analysis found no significant effect).
That the effect is due to music *specifically* rather than to any structured, engaging, group-based enrichment activity (many control groups received no alternative enrichment).
That the effect lasts beyond the training period (most studies tested immediately post-training only).
That the effect applies to children with learning disabilities or language impairments (all studies used typically developing children).
### Methodological weaknesses
**Small number of studies (k=13)**: This limits statistical power for moderator analyses and makes the overall estimate less precise.
**Heterogeneity in music training**: Studies used different instruments (piano, violin, voice, percussion), different durations (10–120 hours), and different pedagogical approaches. The meta-analysis treats "music training" as a single category, but the effects may vary by type.
**Control group variability**: Some control groups received no intervention, others received art or drama. The effect of music may be partly due to receiving any structured enrichment, not music per se.
**Publication bias**: The authors found some evidence of asymmetry in funnel plots, suggesting that small studies with null results may be missing from the literature.
**No blinding**: In most studies, teachers and assessors knew which children were in the music group, introducing potential expectancy effects.
**Short follow-up**: Most studies tested immediately after training ended, so durability is unknown.
Key findings
**Phonological awareness**: Music training produced a small but statistically significant improvement compared to control groups. The overall effect size was **Hedges' g = 0.20** (95% CI: 0.05 to 0.35, p = 0.009). This means the average child in the music training group scored about 0.2 standard deviations higher on phonological awareness tests than the average child in the control group.
**Reading fluency**: No significant aggregate effect was found. The overall effect size was **Hedges' g = 0.12** (95% CI: -0.08 to 0.32, p = 0.24). This means that, across studies, music training did not reliably improve reading fluency compared to control conditions.
**Moderator: Hours of training**: For phonological awareness, there was a **significant positive relationship** between total hours of music training and effect size (β = 0.004, p = 0.03). This means that studies with more training hours tended to show larger effects. Specifically, the effect on **rhyming skills** (a subcomponent of phonological awareness) grew stronger with increased hours: for every additional 10 hours of training, the effect size increased by approximately 0.04 standard deviations.
**Moderator: Age**: Age did not significantly moderate the effect (p = 0.45). The benefit of music training on phonological awareness was similar for preschoolers and elementary school children.
**Moderator: Control type**: The effect was larger when the control group received no intervention (g = 0.28) compared to when the control group received alternative enrichment like art or drama (g = 0.12), but this difference was not statistically significant (p = 0.18). This suggests that some of the benefit may be due to receiving any structured enrichment, not music specifically.
**Secondary outcomes**: Only two studies measured **spelling** and **vocabulary**, so no meta-analysis was possible. Individual studies reported mixed results.
**Publication bias**: Funnel plot asymmetry was detected for reading fluency outcomes, suggesting possible publication bias (small studies with null results may be missing). For phonological awareness, the evidence was less clear.
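One way to sanity-check the reported estimates is to back out the standard error from each 95% confidence interval and recompute the p-value. The sketch below does this under a normal approximation, which is an assumption: the paper may have used a different reference distribution. The helper name is hypothetical; the input numbers are the ones quoted above.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal

def p_from_ci(estimate, ci_low, ci_high):
    """Recover (SE, z, two-sided p) from a point estimate and its 95% CI."""
    se = (ci_high - ci_low) / (2 * Z_95)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return se, z, p

reported = [
    ("phonological awareness", 0.20, 0.05, 0.35),
    ("reading fluency", 0.12, -0.08, 0.32),
]
for label, g, low, high in reported:
    se, z, p = p_from_ci(g, low, high)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under this approximation the recovered p-values round to the reported 0.009 and 0.24, so the intervals and p-values quoted above are internally consistent.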
Effect magnitude
The overall effect on phonological awareness (g = 0.20) is considered **small** by conventional standards (Cohen's guidelines for standardised mean differences: 0.2 = small, 0.5 = medium, 0.8 = large). To put this in perspective:
If the average child in the control group scores at the 50th percentile on a phonological awareness test, the average child in the music training group would score at about the **58th percentile**—a modest but meaningful shift.
This is roughly equivalent to the gap in reading ability between otherwise similar children who are about **2–3 months** apart in development.
For comparison, explicit phonics instruction typically produces effect sizes of d = 0.4–0.6 for phonological awareness, so music training is roughly a third to a half as effective as direct teaching of sound-letter relationships.
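The percentile figure above follows from the standard normal distribution. A minimal sketch, assuming test scores are normally distributed with equal spread in both groups (the function name is hypothetical):

```python
from statistics import NormalDist

def percentile_of_average_treated_child(effect_size):
    """Percentile, within the control distribution, of the average treated child.

    Assumes normally distributed scores with equal variance in both groups.
    """
    return 100 * NormalDist().cdf(effect_size)

for g in (0.20, 0.50, 0.80):
    print(f"g = {g:.2f} -> about the "
          f"{percentile_of_average_treated_child(g):.0f}th percentile")
```

An effect size of zero maps to the 50th percentile, so the shift is measured relative to the average control child.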
The effect on rhyming skills grew with training hours. At 20 hours of training, the effect was negligible (d ≈ 0.05). At 60 hours, it reached d ≈ 0.25. At 120 hours (the maximum in the included studies), it reached d ≈ 0.45—a medium effect. This suggests that **rhyming skills may be particularly sensitive to music training**, but only with sustained practice.
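As a quick check on how these three values relate to the reported moderator slope, the sketch below fits an ordinary least-squares line through them. This is only an illustration using the figures quoted in this summary. It is not the authors' meta-regression, which would weight studies by their precision.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

hours = [20, 60, 120]          # total training hours quoted above
effect = [0.05, 0.25, 0.45]    # approximate rhyming effect sizes quoted above

slope, intercept = least_squares_line(hours, effect)
print(f"slope per hour = {slope:.4f}, per 10 hours = {10 * slope:.3f}")
```

The fitted slope comes out close to 0.004 per hour, in line with the reported coefficient of roughly 0.04 standard deviations per additional 10 hours.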
The non-significant effect on reading fluency (g = 0.12) means that, even if music training improves the ability to hear and manipulate sounds, this does not automatically translate into faster or more accurate reading. Reading fluency depends on many other skills (vocabulary, orthographic knowledge, comprehension) that music training may not directly address.
Limitations
**Acknowledged by authors:**
Small number of studies (k=13) limits statistical power and generalisability.
Heterogeneity in music training programs (different instruments, durations, pedagogies) makes it difficult to identify what specific aspect of music training drives the effect.
Many studies did not report SES, IQ, or baseline reading ability in sufficient detail to control for these confounds.
Possible publication bias, especially for reading fluency outcomes.
Most studies had short training periods (10–40 hours) and no long-term follow-up.
**Additional critical observations:**
**No active control in most studies**: Only 4 of 13 studies used an alternative enrichment control (e.g., art, drama). The rest used no-treatment controls. This means the "music effect" may partly reflect the benefits of any structured, engaging, adult-led group activity, not music specifically.
**Lack of blinding**: In school-based studies, teachers and testers typically know which children are in the music group. This can introduce expectancy effects (teachers may unconsciously give more attention or encouragement to music group children).
**No randomisation in some studies**: Several included studies used quasi-experimental designs (e.g., comparing children who chose music lessons to those who did not). Even with pre-test equivalence, unmeasured confounds (e.g., parental involvement, motivation, cognitive ability) may differ between groups.
**Limited outcome measures**: The meta-analysis could only analyse phonological awareness and reading fluency because these were the only outcomes measured consistently across studies. Other potentially relevant outcomes (spelling, vocabulary, listening comprehension, executive function) were not analysed.
**No dose-response analysis for reading fluency**: The authors did not report whether hours of training moderated reading fluency effects, leaving open the possibility that longer training might produce benefits.
**All studies used typically developing children**: Results cannot be generalised to children with dyslexia, language impairments, or other learning difficulties.
Practical takeaways
For someone running their own n=1 experiment (e.g., a parent wanting to test whether music lessons improve their child's reading skills):
### What to test
**Intervention**: Structured music training—ideally instrument lessons (piano, violin, or voice) that involve both rhythmic and melodic components. Group lessons may be as effective as individual lessons, but the key is regular, sustained practice.
**Dose**: At least **60–80 hours of total training** over the study period. The meta-analysis found that effects on rhyming skills only became meaningful after about 60 hours. This translates to roughly **30 minutes per day, 5 days per week, for about 6–7 months** (2.5 hours per week for 24–32 weeks), or **1 hour per week for about 14–18 months**.
**Control condition**: Ideally, compare music training to an alternative enrichment activity (e.g., art classes, drama, sports) of equal duration and intensity. This helps isolate the specific effect of music from the general effect of structured enrichment.
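To plan a schedule around a target dose, the total hours can be converted into weeks of practice. The helper below is a hypothetical sketch; the session lengths and frequencies are examples only, not recommendations from the paper.

```python
import math

def weeks_needed(target_hours, minutes_per_session, sessions_per_week):
    """Whole weeks required to accumulate target_hours of training."""
    hours_per_week = minutes_per_session * sessions_per_week / 60
    return math.ceil(target_hours / hours_per_week)

# Example schedules for the lower and upper ends of a 60-80 hour target
for minutes, sessions in [(30, 5), (60, 1)]:
    low = weeks_needed(60, minutes, sessions)
    high = weeks_needed(80, minutes, sessions)
    print(f"{minutes} min x {sessions}/week: {low}-{high} weeks")
```

Comparing schedules this way shows how strongly the study length depends on weekly practice time: the same dose takes far longer at one session per week than at five.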
### Minimum meaningful duration
**At least 6 months** of consistent training (minimum 60 total hours). Shorter periods are unlikely to produce detectable effects.
Test at **baseline, 3 months, and 6 months** to track the trajectory of change.
### What to measure
**Primary outcome**: Phonological awareness. Use a standardised test like the **Comprehensive Test of Phonological Processing (CTOPP-2)** or the **Phonological Awareness Test (PAT-2)**. Focus on subtests for:
- Rhyming (e.g., "Which word rhymes with 'cat'?")
- Phoneme segmentation (e.g., "What sounds do you hear in 'dog'?")
- Blending (e.g., "What word is /c/ /a/ /t/?")
**Secondary outcome**: Reading fluency. Use the **Test of Word Reading Efficiency (TOWRE-2)** or a curriculum-based measure of words read per minute from grade-level passages.
**Process measure**: Track **hours of practice** (not just time in lessons). Use a practice log to record daily minutes of instrument practice.
**Control measure**: Monitor **general cognitive development** using a brief IQ test (e.g., KBIT-2) at baseline and post-test, to check whether any gains reflect general cognitive growth rather than a music-specific effect.
### Key confounds to control for
**Maturation**: Children naturally improve in phonological awareness and reading over time. Use a control condition (alternative enrichment) to isolate the music effect.
**Expectancy effects**: If the child knows they are in a "special" program, they may try harder. If possible, keep the child blind to the hypothesis (e.g., "We're trying two different enrichment programs to see which is more fun").
**Practice effects**: Repeated testing can inflate scores. Use alternate forms of tests at each time point, and space tests at least 3 months apart.
**SES and parental involvement**: Children from higher-SES families may have more resources and parental support for music practice. If comparing siblings or using a within-family design, control for birth order and parental time investment.
**School reading instruction**: Ensure that the child's school reading curriculum does not change during the study period. Any changes could confound the results.
**Mot