
Virtual reality improves emotional but not cognitive empathy: A meta-analysis.

Authors: Alison Jane Martingano, Fernanda Hererra, Sara Konrath
Journal: Technology, Mind, and Behavior
Year: 2021
DOI: 10.1037/tmb0000034
Citations: 164

TL;DR

Virtual reality experiences reliably boost emotional empathy (feeling *with* someone) by a moderate amount, but they do not improve cognitive empathy (understanding someone's perspective). VR was also no more effective than simply reading a story or imagining someone else's experience, which suggests a book may deliver a similar empathy boost to a $400 headset.

What they tested

This is a meta-analysis, meaning the authors statistically combined results from 43 separate studies to answer one overarching question: Does experiencing a virtual reality (VR) simulation make people more empathic?

The intervention in each study was some form of VR experience designed to put the user "in someone else's shoes." Examples included:

A VR simulation of being a refugee fleeing war

A VR experience of being homeless on a city street

A VR simulation of being a person with colour-blindness

A VR experience of being a patient with Alzheimer's disease

A VR simulation of being a cow in a slaughterhouse

The comparators varied across studies. Some studies compared VR to a no-intervention control group (people who did nothing). Others compared VR to a "less technologically advanced" empathy intervention, such as:

Reading a written narrative about the same topic

Watching a 2D video of the same scenario

Imagining the experience from a first-person perspective (guided imagination exercise)

Reading a factual description of the condition

The outcome measures were all standardised tests of empathy, which the authors categorised into two types:

**Emotional empathy:** Feeling what another person feels (e.g., compassion, distress, concern). Measured with scales like the Interpersonal Reactivity Index (IRI) Empathic Concern subscale, or self-reported feelings of "compassion" or "sympathy" after the experience.

**Cognitive empathy:** Understanding what another person thinks or feels (e.g., perspective-taking, theory of mind). Measured with scales like the IRI Perspective-Taking subscale, or performance-based tasks like the "Reading the Mind in the Eyes" test.

Who was studied

The meta-analysis included 43 independent studies with a total of **5,644 participants**. The individual studies varied widely:

**Sample sizes per study:** Ranged from 20 to 1,200 participants.

**Population:** Mostly university students (undergraduate psychology students), but also included some community samples, adults recruited online (e.g., via Amazon Mechanical Turk), and a few studies with children or adolescents.

**Setting:** Laboratory settings (university labs), online studies (participants completed VR at home with cardboard headsets), and field settings (e.g., museums or public exhibitions).

**Demographics:** The authors note that most studies did not report detailed demographics, but where reported, samples were predominantly female (approximately 60–70%) and young (mean age ~20–30 years). Most participants were from Western, educated, industrialised, rich, and democratic (WEIRD) countries (USA, Canada, UK, Netherlands, Germany).

How they measured it

The authors extracted effect sizes from each study. The specific instruments varied, but all fell into validated categories:

**Emotional empathy measures:**

**Interpersonal Reactivity Index (IRI) – Empathic Concern subscale:** A 7-item self-report scale (e.g., "I often have tender, concerned feelings for people less fortunate than me"). Scores range from 0 to 28, higher = more emotional empathy.

**Positive and Negative Affect Schedule (PANAS) – Compassion subscale:** Self-reported feelings of "compassion," "sympathy," "moved," "tender" on a 1–5 scale.

**Single-item self-report:** "How much compassion do you feel for the person in the video?" on a 1–7 Likert scale.

**Behavioural measures:** In a few studies, participants were given the opportunity to donate money to a charity related to the VR topic, or to sign a petition — these were treated as behavioural proxies for emotional empathy.

**Cognitive empathy measures:**

**Interpersonal Reactivity Index (IRI) – Perspective-Taking subscale:** A 7-item self-report scale (e.g., "I sometimes try to understand my friends better by imagining how things look from their perspective"). Scores range from 0 to 28, higher = more cognitive empathy.

**Reading the Mind in the Eyes Test (RMET):** A performance-based test where participants look at 36 photographs of eyes and choose which of four words best describes what the person is thinking or feeling. Score = number correct out of 36.

**Empathic Accuracy Task:** Participants watch a video of a person telling a story and are asked to infer what the person is feeling at each moment. Accuracy is scored against the person's own self-report.

**Moderators (factors tested to see if they changed the effect):**

**Immersion level:** High (full VR headset with 360° view and head tracking) vs. low (cardboard headset or 360° video on a flat screen)

**Interactivity level:** High (user can move, pick up objects, make choices) vs. low (passive viewing)

**Comparison condition:** No intervention vs. reading/video/imagination

**Duration of VR experience:** Ranged from 2 minutes to 30 minutes

**Time of measurement:** Immediately after VR vs. delayed (hours to weeks later)

Methodology

**Design:** This is a **random-effects meta-analysis**. The authors systematically searched databases (PsycINFO, PubMed, Scopus, Google Scholar) for all published and unpublished studies that experimentally tested the effect of a VR experience on any measure of empathy. They included studies up to September 2020. They extracted effect sizes (Cohen's *d* or Hedges' *g*) from each study and combined them statistically.

**Why random-effects?** The authors assumed that the true effect of VR on empathy varies across studies (because of different VR experiences, different populations, different measures). A random-effects model is more conservative and generalisable than a fixed-effects model, which assumes one true effect.
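To make the difference between the two models concrete, here is a minimal sketch of random-effects pooling. It assumes the DerSimonian-Laird estimator, which is a common default, but the summary above does not say which estimator the authors used. The function name and the input numbers are hypothetical and are not data from the paper.

```python
import math

def pool_random_effects(effects, variances):
    """Pool per-study effect sizes (needs at least two studies).

    Assumption: DerSimonian-Laird estimator for the between-study variance.
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0

    # Random-effects weights add tau2 to every study's own variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / total_w),
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "q": q,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }

# Hypothetical effect sizes and variances, NOT taken from the paper.
example = pool_random_effects([0.2, 0.9, 0.5, 0.1], [0.04, 0.05, 0.03, 0.06])
print(example)
```

When the studies disagree more than sampling error alone would predict, `tau2` is positive and the random-effects standard error is wider than the fixed-effect one. That is the sense in which the random-effects model is the more conservative choice.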

**Subgroup analyses:** The authors tested whether the effect of VR differed depending on:

Type of empathy (emotional vs. cognitive)

Type of comparison condition (no intervention vs. active control like reading)

Level of immersion (high vs. low)

Level of interactivity (high vs. low)

Duration of VR experience (short vs. long)

Time of measurement (immediate vs. delayed)

**Publication bias check:** The authors tested for publication bias (the tendency for journals to publish only positive results) using funnel plots and Egger's regression test. They found no significant evidence of publication bias for emotional empathy outcomes, but there was some asymmetry for cognitive empathy outcomes (suggesting that studies showing VR *improving* cognitive empathy may be overrepresented in the literature — making the null finding even more robust).
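As a rough illustration of Egger's regression test, the sketch below regresses each study's standardised effect (effect divided by its standard error) on its precision (one divided by the standard error). An intercept far from zero signals funnel-plot asymmetry. This is the classic form of the test; whether the authors used exactly this variant is an assumption, and the input numbers are hypothetical.

```python
import math

def egger_intercept(effects, std_errors):
    """Return (intercept, intercept_se, t_statistic, df) for Egger's test."""
    x = [1.0 / s for s in std_errors]                  # precision
    y = [e / s for e, s in zip(effects, std_errors)]   # standardised effect
    k = len(x)
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Ordinary least squares fit of y on x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    resid_var = sum(r * r for r in residuals) / (k - 2)
    intercept_se = math.sqrt(resid_var * (1.0 / k + mean_x ** 2 / sxx))

    # Guard against a perfect fit, where the standard error is zero.
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")
    return intercept, intercept_se, t_stat, k - 2

# Hypothetical studies in which smaller studies (larger SE) report larger effects.
ses = [0.10, 0.15, 0.20, 0.25, 0.30]
noise = [0.01, -0.02, 0.015, -0.01, 0.005]
biased = [0.5 + 1.0 * s + n for s, n in zip(ses, noise)]
print(egger_intercept(biased, ses))
```

The t statistic is compared against a t distribution with k - 2 degrees of freedom. With only a dozen studies, as for the cognitive empathy outcomes, the test has little power, so the result is best read as a hint rather than a verdict.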

**What this design can and cannot prove:**

**Can prove:** That VR experiences, on average across all studies, cause a change in self-reported empathy compared to control conditions. Because the individual studies were mostly randomised experiments, the meta-analysis inherits their causal logic — VR *causes* a change in empathy.

**Cannot prove:** Why VR works (the mechanism). The authors can only speculate that emotional empathy is aroused automatically by evocative stimuli, while cognitive empathy requires effortful imagination. But the meta-analysis cannot test this mechanism directly.

**Cannot prove:** That the effect lasts. Most studies measured empathy immediately after the VR experience. Only a handful measured it hours to weeks later, and those suggested that the effect may fade, although the difference was not statistically significant.

**Cannot prove:** That VR is better than cheaper alternatives. The subgroup analysis comparing VR to reading/video/imagination found no significant difference — but this is a null finding, not proof of equivalence. It's possible that VR is slightly better or slightly worse, and the data are simply too noisy to tell.
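The gap between "no significant difference" and "equivalent" can be checked with a simple equivalence test. The sketch below takes the VR-versus-active-control estimate reported under Key findings (g = 0.04, 95% CI [-0.20, 0.28]), recovers the standard error from the confidence interval, and asks whether the 90% interval sits entirely inside an equivalence margin (the two one-sided tests logic). The margins of 0.2 and 0.3 are hypothetical choices for illustration; the paper does not report such a test.

```python
def se_from_ci95(lower, upper):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * 1.96)

def equivalence_check(estimate, ci95, margin):
    """Return the 90% CI and whether it lies fully inside (-margin, +margin)."""
    se = se_from_ci95(*ci95)
    lo90 = estimate - 1.645 * se
    hi90 = estimate + 1.645 * se
    return (lo90, hi90), (-margin < lo90 and hi90 < margin)

# Reported estimate for VR vs. reading/video/imagination.
ci90, within_small = equivalence_check(0.04, (-0.20, 0.28), margin=0.2)
_, within_medium = equivalence_check(0.04, (-0.20, 0.28), margin=0.3)
print(ci90, within_small, within_medium)
```

Under these assumptions the 90% interval runs from roughly -0.16 to 0.24. That is too wide to rule out a small advantage for VR (margin 0.2) but narrow enough to rule out anything larger than about 0.3.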

**Major methodological weaknesses:**

**Self-report bias:** Nearly all empathy measures were self-report. People may say they feel more compassionate after VR because they think they *should* feel more compassionate, or because the VR experience is novel and exciting (a "novelty effect"). The few studies that used behavioural measures (donations) showed smaller and less consistent effects.

**Heterogeneous study quality:** The studies varied widely in quality. Some were well-controlled lab experiments; others were field studies with no random assignment. The authors did not exclude low-quality studies, which could inflate or deflate the overall effect.

**Small number of cognitive empathy studies:** Only 12 of the 43 studies measured cognitive empathy. With such a small sample, the null finding for cognitive empathy could be due to low statistical power rather than a true absence of effect.

**Demand characteristics:** In many studies, participants knew they were in an empathy experiment. They may have guessed the hypothesis and responded accordingly.

Key findings

All effect sizes are reported as Hedges' *g* (a standardised mean difference). By convention, *g* = 0.2 is small, 0.5 is medium, 0.8 is large.
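For readers unfamiliar with Hedges' *g*, here is a minimal sketch of how it is computed from two groups' means and standard deviations. It uses the common approximation for the small-sample correction factor. The example scores are hypothetical values on the 0 to 28 Empathic Concern scale and are not taken from any study in the meta-analysis.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, approximate variance of g) for treatment vs. control."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd            # Cohen's d
    j = 1 - 3 / (4 * df - 1)                     # small-sample correction
    g = j * d
    # Common large-sample approximation for the variance of g.
    var_g = j ** 2 * ((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))
    return g, var_g

# Hypothetical study: VR group scores 21.0 (SD 4.0), control 19.0 (SD 4.0), 30 per arm.
g, var_g = hedges_g(21.0, 4.0, 30, 19.0, 4.0, 30)
print(round(g, 3), round(var_g, 3))  # prints 0.494 0.067
```

A two-point difference on a scale with a standard deviation of four is a Cohen's *d* of 0.5, and the correction shrinks it slightly to *g* of about 0.49, a medium effect by the convention above.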

**Primary outcome: Overall effect of VR on empathy (all types combined)**

VR experiences significantly increased empathy compared to control conditions: **g = 0.54, 95% CI [0.37, 0.71], p < 0.001**.

This is a **medium effect size**. In plain terms, the average person in the VR condition scored about half a standard deviation higher on empathy measures than the average person in the control condition.

**Secondary outcome: Emotional empathy vs. cognitive empathy**

**Emotional empathy:** VR significantly improved emotional empathy: **g = 0.60, 95% CI [0.41, 0.79], p < 0.001** (based on 41 studies).

**Cognitive empathy:** VR did NOT significantly improve cognitive empathy: **g = 0.08, 95% CI [-0.16, 0.32], p = 0.52** (based on 12 studies).

The difference between these two effects was statistically significant (p for subgroup difference < 0.001). This is the paper's headline finding: VR works for feeling, not for understanding.
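The subgroup comparison can be roughly reconstructed from the two reported estimates alone. The sketch below recovers each standard error from its 95% CI and runs a z-test on the difference. It assumes the two subgroup estimates are independent, which is not strictly true here because some studies measured both kinds of empathy, so treat it as a sanity check rather than the authors' actual test.

```python
import math
from statistics import NormalDist

def subgroup_difference(g1, ci1, g2, ci2):
    """z-test for the difference between two pooled estimates (assumed independent)."""
    se1 = (ci1[1] - ci1[0]) / (2 * 1.96)
    se2 = (ci2[1] - ci2[0]) / (2 * 1.96)
    z = (g1 - g2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided p-value
    return z, p

# Emotional empathy vs. cognitive empathy, as reported above.
z, p = subgroup_difference(0.60, (0.41, 0.79), 0.08, (-0.16, 0.32))
print(round(z, 2), round(p, 4))
```

This gives z of about 3.3 and a p-value below 0.001, in line with the reported subgroup difference.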

**Secondary outcome: VR vs. less technologically advanced interventions**

When VR was compared to reading a narrative, watching a 2D video, or imagining the experience: **g = 0.04, 95% CI [-0.20, 0.28], p = 0.74** (based on 12 studies).

VR was **no more effective** than these cheaper, simpler interventions. Its added benefit was essentially zero when the control group also received an empathy-inducing experience.

**Secondary outcome: Immersion level**

High-immersion VR (full headset, 360° view): **g = 0.55, 95% CI [0.35, 0.75]**

Low-immersion VR (cardboard headset, 360° video on flat screen): **g = 0.52, 95% CI [0.24, 0.80]**

The difference between these was not statistically significant (p = 0.84). **Expensive, high-end VR was no better than cheap cardboard headsets.**

**Secondary outcome: Interactivity level**

High-interactivity VR (user can move, make choices): **g = 0.56, 95% CI [0.31, 0.81]**

Low-interactivity VR (passive viewing): **g = 0.52, 95% CI [0.30, 0.74]**

The difference was not statistically significant (p = 0.79). **Interactive VR was no better than passive VR.**

**Secondary outcome: Duration of VR experience**

Short VR (less than 10 minutes): **g = 0.58, 95% CI [0.36, 0.80]**

Long VR (10 minutes or more): **g = 0.49, 95% CI [0.24, 0.74]**

The difference was not statistically significant (p = 0.60). **Longer VR experiences were not more effective.**

**Secondary outcome: Time of measurement**

Immediate measurement (right after VR): **g = 0.56, 95% CI [0.38, 0.74]**

Delayed measurement (hours to weeks later): **g = 0.22, 95% CI [-0.10, 0.54]**

The difference was not statistically significant (p = 0.07), but the trend suggests the effect may fade over time. Only 5 studies measured delayed effects, so this is very uncertain.

Effect magnitude

The overall effect of VR on emotional empathy (g = 0.60) means that if you take two random people, one who did a VR empathy experience and one who did nothing, the VR person will score higher on an emotional empathy scale about 65–70% of the time. This is a moderate boost — roughly equivalent to the difference in empathy between women and men (a well-established small-to-medium effect).
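The "65–70% of the time" figure follows from the common-language effect size, also called the probability of superiority. Assuming both groups' scores are normally distributed with equal variance, it is the standard normal CDF evaluated at g divided by the square root of 2. A minimal sketch, with a hypothetical function name:

```python
import math
from statistics import NormalDist

def probability_of_superiority(g):
    """Chance that a random treated person outscores a random control person.

    Assumes normal score distributions with equal variance in both groups.
    """
    return NormalDist().cdf(g / math.sqrt(2))

print(round(probability_of_superiority(0.60), 3))  # prints 0.664
print(round(probability_of_superiority(0.08), 3))
```

For g = 0.60 this gives about 66%, inside the range quoted above. For the cognitive empathy estimate of g = 0.08 the result is barely above a coin flip.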

However, the effect is **not larger than reading a story**. If you compare VR to reading a first-person narrative about the same topic, the advantage of VR disappears (g = 0.04). So the practical magnitude is: VR gives you a moderate empathy boost compared to doing nothing, but zero boost compared to reading a well-written story.

For cognitive empathy, the effect is essentially zero (g = 0.08). The data give no evidence that VR helps you understand another person's perspective any better than doing nothing, although with only 12 studies a small effect cannot be ruled out. If you want to improve your ability to take someone else's perspective, VR is unlikely to be the tool.

The lack of a detectable difference between high-end and low-end VR (g = 0.55 vs. 0.52) suggests that a $15 cardboard headset with a smartphone performs about as well as a $400 Oculus Quest. On this evidence, the content matters more than the hardware.

Limitations

**Acknowledged by authors:**

Most studies measured empathy immediately after the VR experience; long-term effects are unknown.

Most studies used self-report measures, which are susceptible to social desirability bias and demand characteristics.

The meta-analysis could not control for the quality of the VR content — some experiences may be poorly designed.

There was significant heterogeneity across studies (I² = 68%), meaning the effects varied widely. Some VR experiences may work much better than others.

The number of studies measuring cognitive empathy was small (k = 12), limiting statistical power for that analysis.

Most studies were conducted with WEIRD populations (Western, educated, industrialised, rich, democratic), limiting generalisability.
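To give a feel for what I² = 68% means, the sketch below shows the usual formula linking I² to Cochran's Q, and inverts it to find the Q that such an I² implies. It assumes the 68% refers to the overall analysis of all 43 studies, which the summary above does not state explicitly. The function names are hypothetical.

```python
def i_squared(q, k):
    """Share of total variation attributed to between-study differences."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)

def q_implied_by(i2, k):
    """Cochran's Q that would produce a given I-squared with k studies."""
    return (k - 1) / (1.0 - i2)

# Assumption: I-squared of 0.68 applies to the overall analysis with k = 43.
q = q_implied_by(0.68, 43)
print(round(q, 2))                 # prints 131.25
print(round(i_squared(q, 43), 2))  # prints 0.68
```

Under that assumption, Q would be about 131 against 42 degrees of freedom, roughly three times the spread expected from sampling error alone.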

**Additional critical limitations:**

**Novelty effect:** VR is still relatively novel for most people. The empathy boost may be partly due to the excitement of trying a new technology, not the content itself. As VR becomes commonplace, the effect may shrink.

**No blinding:** Participants knew they were in a VR experiment. They could not be blinded to the intervention. This is a fundamental limitation of all VR research.

**No active control for time/attention:** In many studies, the control group did nothing, while the VR group had a novel, engaging experience. The empathy boost could simply be due to receiving any engaging stimulus, not VR specifically. The subgroup analysis comparing VR to reading partially addresses this, but reading is also an empathy-inducing activity.

**Publication bias for cognitive empathy:** The funnel plot asymmetry suggests that studies showing VR improving cognitive empathy may be overrepresented. The true effect may be even closer to zero.

**No correction for multiple comparisons:** The authors ran many subgroup analyses without adjusting for multiple testing, which raises the chance of a spurious result among the moderator tests. Separately, selective reporting within individual studies may inflate the pooled estimates, including the overall effect.

Practical takeaways

For someone running their own n=1 experiment:

**What to test:**

**Intervention:** A 5–10 minute VR experience designed to evoke empathy (e.g., a 360° video of a refugee's journey, or a simulation of being homeless). Use a cheap cardboard headset with your smartphone — the hardware
