
Behavioural activation to mitigate the psychological impacts of COVID-19 restrictions on older people in England and Wales (BASIL+): a pragmatic randomised controlled trial

Authors: Simon Gilbody, Elizabeth Littlewood, Dean McMillan, Lucy Atha, Della Bailey, Kalpita Baird, Samantha Brady, Lauren Burke, Carolyn Chew‐Graham, Peter Coventry, Suzanne Crosland, Caroline Fairhurst, Andrew Henry, Kelly Hollingsworth, Elizabeth Newbronner, Eloise Ryde, Leanne Shearsmith, Han‐I Wang, Judith Webster, Rebecca Woodhouse, Andrew Clegg, Sarah Dexter‐Smith, Tom Gentry, Catherine Hewitt, Andrew J. Hill, Karina Lovell, Claire Sloan, Gemma Traviss‐Turner, Steven Pratt, David Ekers
Journal: The Lancet Healthy Longevity
Year: 2024
DOI: 10.1016/s2666-7568(23)00238-6
Citations: 45

TL;DR

A structured telephone-based behavioural activation programme (up to eight weekly sessions) reduced depression scores by about 1.7 points on the PHQ-9 scale (0–27) at three months in socially isolated older adults with multiple long-term conditions, compared to usual care plus wellbeing resources.

What they tested

**Intervention:** Behavioural activation (BA) delivered via telephone. Participants were offered up to eight weekly sessions (about 30–45 minutes each) with a trained support worker. The BA was adapted for social isolation during COVID-19: it focused on helping participants identify and schedule activities that provided a sense of achievement, pleasure, or social connection, while specifically encouraging activities that maintained or rebuilt social ties (e.g., phone calls with friends, gardening, walking with a neighbour at a distance). Sessions included activity monitoring, goal setting, and problem-solving barriers to engagement.

**Comparator:** "Usual care" – participants continued any existing medical or mental health care they were receiving. They were also given a signposting resource pack containing information about COVID-19 wellbeing resources (e.g., NHS mental health helplines, Age UK support services, online exercise classes). No additional structured psychological support was provided.

**Primary outcome:** Depression severity measured by the Patient Health Questionnaire-9 (PHQ-9) at 3 months after randomisation.

**Secondary outcomes (measured at 3 months and 6 months):** Loneliness (De Jong Gierveld Loneliness Scale – emotional and social subscales), anxiety (Generalised Anxiety Disorder-7, GAD-7), health-related quality of life (EQ-5D-5L), social participation, and self-reported activity levels.

Who was studied

**Sample size:** 435 participants (218 intervention, 217 control) recruited from 26 general practices in England and Wales.

**Population:** Adults aged 65 years and older (mean age 75.7 years, SD 6.7). All participants were socially isolated (defined as living alone OR having no weekly face-to-face contact with family/friends OR reporting feeling lonely often/always). All had a PHQ-9 score of 5 or higher (indicating at least mild depression) and had two or more long-term health conditions (e.g., diabetes, heart disease, arthritis, COPD). 62.1% were female, 96.1% were White. Participants were recruited during the COVID-19 pandemic (February 2021 to February 2022), when social distancing and lockdown restrictions were in place.

**Exclusion criteria:** Severe cognitive impairment (unable to consent or engage with telephone sessions), current psychosis, active suicidal ideation requiring immediate crisis care, already receiving a structured psychological therapy, or unable to communicate in English.

How they measured it

**Depression:** Patient Health Questionnaire-9 (PHQ-9). A 9-item self-report scale (0–27). Each item scored 0–3 (not at all to nearly every day). Higher scores = worse depression. A score of 5–9 = mild, 10–14 = moderate, 15–19 = moderately severe, 20+ = severe. The minimal clinically important difference (MCID) is generally considered 3–5 points.
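The scoring rule above is simple enough to sketch in a few lines. This is a minimal illustration, not a clinical tool: the function names are made up for this example, and the "minimal" label for scores of 0–4 is an assumption, since the text above only names the bands from 5 upwards.

```python
from typing import Sequence

# Severity bands from the paragraph above as (lowest score, label).
# The "minimal" label for 0-4 is an assumption; the text only names bands from 5 up.
PHQ9_BANDS = [
    (20, "severe"),
    (15, "moderately severe"),
    (10, "moderate"),
    (5, "mild"),
    (0, "minimal"),
]


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum the nine item scores (each 0-3) into a 0-27 total."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly 9 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2 or 3")
    return sum(item_scores)


def phq9_severity(total: int) -> str:
    """Map a 0-27 total onto the severity bands listed above."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    for floor, label in PHQ9_BANDS:
        if total >= floor:
            return label
    raise AssertionError("unreachable: every valid total is >= 0")


# Hypothetical answers: eight items at "several days" (1), one at 2.
example_total = phq9_total([1, 1, 1, 1, 1, 1, 1, 1, 2])
example_band = phq9_severity(example_total)
```

The trial's inclusion threshold (PHQ-9 of 5 or higher) corresponds to any band other than "minimal" in this sketch.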

**Loneliness:** De Jong Gierveld Loneliness Scale (11 items). Two subscales: emotional loneliness (6 items, e.g., "I experience a general sense of emptiness") and social loneliness (5 items, e.g., "There are plenty of people I can rely on when I have problems"). Each item scored 0–2 (no, more or less, yes). Higher scores = more loneliness.

**Anxiety:** Generalised Anxiety Disorder-7 (GAD-7). 7-item scale (0–21). Higher = worse anxiety. MCID ~3–4 points.

**Health-related quality of life:** EQ-5D-5L (5 dimensions: mobility, self-care, usual activities, pain/discomfort, anxiety/depression; plus a visual analogue scale 0–100).

**Social participation:** A single item asking how often participants engaged in social activities (e.g., visiting friends, attending clubs) in the past month.

**Activity levels:** Self-reported frequency of activities (e.g., walking, gardening, hobbies) over the past week.

All measures were collected via telephone interview by a researcher who was masked (blinded) to treatment allocation. Assessments occurred at baseline, 3 months (primary endpoint), and 6 months (follow-up).

Methodology

**Study design:** Pragmatic, parallel-group, randomised controlled trial (RCT). "Pragmatic" means it was designed to test whether the intervention works in real-world conditions (e.g., with existing NHS staff, minimal exclusion criteria, flexible delivery) rather than in a tightly controlled laboratory setting.

**Randomisation:** Simple (unstratified) 1:1 allocation using a remote web-based randomisation system. No stratification by age, sex, depression severity, or number of conditions. This is a weakness – stratification could have ensured balance on key prognostic variables. However, with 435 participants, simple randomisation usually produces comparable groups by chance.
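To make "simple (unstratified) 1:1 allocation" concrete, here is a rough sketch of the idea: each participant gets an independent fair coin flip, so group sizes are balanced only approximately and only by chance. This is not the trial's remote web-based system; the function name and the seed are assumptions chosen for illustration.

```python
import random
from collections import Counter


def simple_randomise(n_participants: int, seed: int) -> list[str]:
    """Allocate each participant by an independent fair coin flip (no stratification)."""
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    return [rng.choice(["intervention", "control"]) for _ in range(n_participants)]


# 435 matches the trial's sample size; the seed is arbitrary.
allocations = simple_randomise(435, seed=2021)
counts = Counter(allocations)
print(counts)
```

Rerunning with different seeds shows how much the group sizes drift under pure chance, which is the imbalance risk that stratified or blocked schemes are designed to reduce.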

**Blinding:** Outcome assessors (researchers conducting telephone interviews) were masked to allocation. Participants and support workers delivering BA were not blinded (impossible given the nature of the intervention). This is a common limitation in psychological therapy trials – participants know they are receiving a "treatment" and may report better outcomes due to expectation effects. Masking the assessors limits interviewer bias, but because the PHQ-9 is self-reported, it cannot remove participants' own expectation effects.

**Duration:** The intervention lasted up to 8 weeks (weekly sessions). The primary outcome was measured at 3 months post-randomisation (i.e., ~1 month after the last session for those who completed all 8). Secondary outcomes were also measured at 6 months (4 months after the intervention ended) to assess durability.

**Statistical approach:** Intention-to-treat (ITT) analysis – participants were analysed in the group they were randomised to, regardless of how many sessions they actually attended. This is the gold standard for RCTs because it preserves the benefits of randomisation and estimates the real-world effect of offering the intervention (including non-adherence). The primary analysis used a linear mixed model adjusting for baseline PHQ-9 score, age, sex, and number of long-term conditions. Missing data were handled using multiple imputation (a method that estimates missing values based on observed data).

**What this design can and cannot prove:**

**Can prove:** That offering telephone BA (vs. usual care + signposting) causes a reduction in depression scores at 3 months in this specific population. The RCT design, with random allocation and blinded outcome assessment, allows causal inference.

**Cannot prove:** That BA is better than other active psychological therapies (e.g., cognitive behavioural therapy, problem-solving therapy) – there was no active comparator. Cannot prove that the effect lasts beyond 6 months. Cannot prove that BA works in younger adults, people without multiple long-term conditions, or non-White populations (96% White sample). Cannot prove that telephone delivery is equivalent to in-person delivery. The lack of blinding of participants means some of the observed benefit could be due to placebo/expectation effects.

**Major methodological weaknesses:**

1. No active control group (e.g., "friendly phone calls" without BA content) – so we cannot separate the specific effects of BA from the non-specific effects of receiving weekly attention from a supportive person.

2. Attrition: 18% of participants (78/435) did not provide primary outcome data at 3 months. While multiple imputation was used, differential dropout between groups could still bias results.

3. The sample was overwhelmingly White (96%) and relatively healthy (despite having multiple conditions) – limits generalisability.

4. The intervention was delivered during a unique historical period (COVID-19 lockdowns) – it is unclear if effects would replicate in non-pandemic conditions.

5. The primary outcome (PHQ-9) is self-reported and susceptible to demand characteristics (participants who received BA may feel obliged to report improvement).

Key findings

**Primary outcome (depression at 3 months):**

Adjusted mean difference in PHQ-9 scores (intervention minus control): **-1.65 points** (95% CI -2.54 to -0.75, p=0.0003). This means the BA group scored about 1.7 points lower (less depressed) than the control group at 3 months.

The effect was statistically significant (p < 0.001) but below the conventional threshold for a "minimal clinically important difference" (MCID) of 3–5 points on the PHQ-9.
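As a sanity check, the reported p-value can be roughly recovered from the estimate and its confidence interval. The sketch below assumes a normal approximation (standard error taken as the CI width divided by 2 × 1.96); the trial's mixed model may have used a slightly different reference distribution, so treat the result as approximate. The function name is made up for this example.

```python
import math


def p_from_ci(estimate: float, lower: float, upper: float, z_crit: float = 1.96):
    """Back-calculate SE, z and a two-sided p-value from a 95% CI (normal approximation)."""
    se = (upper - lower) / (2 * z_crit)   # CI half-width divided by the critical value
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p


# Primary outcome as reported above: -1.65 (95% CI -2.54 to -0.75).
se, z, p = p_from_ci(-1.65, -2.54, -0.75)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

The back-calculated p-value lands close to the reported p=0.0003, which suggests the estimate, interval, and p-value quoted above are mutually consistent.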

**Secondary outcomes at 3 months:**

**Emotional loneliness:** Adjusted mean difference -0.48 (95% CI -0.88 to -0.08, p=0.019) – small but statistically significant reduction.

**Social loneliness:** Adjusted mean difference -0.22 (95% CI -0.60 to 0.16, p=0.25) – not statistically significant.

**Anxiety (GAD-7):** Adjusted mean difference -0.72 (95% CI -1.35 to -0.09, p=0.025) – small but significant reduction.

**Quality of life (EQ-5D-5L VAS):** Adjusted mean difference 2.86 (95% CI 0.07 to 5.65, p=0.045) – small improvement.

**Social participation:** No significant difference (p=0.12).

**Activity levels:** No significant difference (p=0.08).

**Secondary outcomes at 6 months (follow-up):**

**Depression (PHQ-9):** Adjusted mean difference -1.18 (95% CI -2.18 to -0.18, p=0.021) – still significant but attenuated.

**Emotional loneliness:** Adjusted mean difference -0.42 (95% CI -0.85 to 0.01, p=0.055) – borderline, no longer significant.

**Anxiety (GAD-7):** Adjusted mean difference -0.59 (95% CI -1.27 to 0.09, p=0.089) – no longer significant.

**Adherence:** Participants attended an average of 5.2 out of 8 sessions (SD 2.9). 38% attended all 8 sessions; 18% attended 0 sessions.

**Adverse events:** None reported as attributable to the intervention.

Effect magnitude

The primary effect was a **1.7-point reduction on the PHQ-9** (0–27 scale). To put this in context:

The average PHQ-9 score at baseline was about 10 (the bottom of the moderate range). A 1.7-point drop from that level would give about 8.3, which falls in the mild range.

This is roughly **one-third to one-half of the typical effect of antidepressant medication** (which usually produces a 3–5 point reduction on the PHQ-9 in clinical trials).

It is equivalent to about **one fewer depressive symptom** (e.g., going from "nearly every day" to "several days" on one item, or from "several days" to "not at all" on two items).

The effect on emotional loneliness was even smaller: a 0.48-point reduction on a 0–12 scale (about 4% of the scale range).
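The comparisons above are plain ratios, and a short sketch makes the arithmetic explicit. The helper name is invented for this example; the inputs are the adjusted mean differences and scale ranges quoted in this summary.

```python
def fraction_of_range(difference: float, scale_min: float, scale_max: float) -> float:
    """Express an absolute difference as a fraction of the full scale range."""
    return abs(difference) / (scale_max - scale_min)


# PHQ-9: -1.65 points on a 0-27 scale.
phq9_share = fraction_of_range(-1.65, 0, 27)

# Emotional loneliness: -0.48 points on a 0-12 scale.
loneliness_share = fraction_of_range(-0.48, 0, 12)

# PHQ-9 effect relative to the 3-5 point range cited above.
share_of_5_points = 1.65 / 5
share_of_3_points = 1.65 / 3

print(f"PHQ-9: {phq9_share:.1%} of scale range")
print(f"Emotional loneliness: {loneliness_share:.1%} of scale range")
print(f"Relative to a 3-5 point change: {share_of_5_points:.0%} to {share_of_3_points:.0%}")
```

Expressing effects as a share of the scale range is only a rough guide, since few participants sit near the extremes of either scale.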

In plain English: **For every 100 people offered telephone BA, roughly 8–12 more might experience a meaningful reduction in depression symptoms (e.g., moving from moderate to mild depression) than under usual care.** This is an estimate, not a trial result: the number needed to treat (NNT) was not reported, but based on the effect size it is likely around 8–12 (i.e., you need to treat 8–12 people to get one additional positive response compared to usual care), which corresponds to about 8–12 additional responders per 100.
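The link between an NNT and "people per 100" is a one-line conversion, sketched below. The NNT range of 8–12 is this summary's own estimate rather than a figure reported by the trial, and the function names are illustrative.

```python
def nnt_from_risk_difference(risk_difference: float) -> float:
    """NNT is the reciprocal of the absolute difference in response rates."""
    if risk_difference <= 0:
        raise ValueError("risk difference must be positive for a beneficial effect")
    return 1 / risk_difference


def extra_responders_per_100(nnt: float) -> float:
    """Additional responders per 100 people treated, compared with the comparator."""
    return 100 / nnt


# Estimated NNT range from the paragraph above (not reported by the trial).
per_100_at_nnt_8 = extra_responders_per_100(8)
per_100_at_nnt_12 = extra_responders_per_100(12)
print(f"NNT 8  -> {per_100_at_nnt_8:.1f} extra responders per 100")
print(f"NNT 12 -> {per_100_at_nnt_12:.1f} extra responders per 100")
```

Note that these are additional responders relative to usual care; some people in both groups would improve regardless.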

Limitations

**Acknowledged by authors:**

The trial was conducted during a unique period (COVID-19 pandemic) – generalisability to non-pandemic times is uncertain.

The sample was predominantly White (96%) – results may not apply to ethnic minority groups.

The control group received "usual care" plus signposting, not an active placebo – so non-specific effects of attention and support cannot be ruled out.

Attrition was moderate (18% at 3 months, higher at 6 months).

The primary outcome (PHQ-9) is self-reported and may be influenced by social desirability bias.

**Additional critical observations:**

The effect size (1.7 PHQ-9 points) is below the conventional MCID of 3–5 points. While statistically significant, it is unclear whether this translates to a noticeable improvement in daily life for most participants.

No objective measures of activity or social contact were used (e.g., accelerometry, call logs) – all outcomes were self-report.

The intervention was delivered by "support workers" with limited mental health training – fidelity to the BA model may have varied.

The 6-month follow-up showed attenuation of effects, suggesting the intervention may need booster sessions to maintain benefits.

The trial was funded by the UK National Institute for Health and Care Research (NIHR) – no obvious conflict of interest, but no independent replication.

Practical takeaways

For someone running their own n=1 experiment:

**What to test:**

A structured behavioural activation protocol: schedule 2–3 small, achievable activities per day that give you a sense of pleasure (e.g., listening to music, cooking a favourite meal) or mastery (e.g., tidying a drawer, paying a bill). Prioritise activities that involve social connection (e.g., phone call with a friend, walking with a neighbour, joining an online book club).

Dose: aim for 8 weekly sessions of 30–45 minutes of structured planning and review. In a self-experiment, you could do this yourself using a workbook (e.g., "Behavioural Activation for Depression" by Martell et al.) or with a friend/coach.

**Minimum meaningful duration:**

Run the experiment for at least **8 weeks** (the intervention duration in this trial). Measure outcomes at baseline, week 4, week 8 (end of intervention), and week 12 (1-month follow-up) to see if effects persist.

If you want to test durability, extend to 6 months (about 26 weeks) with monthly booster sessions.

**What to measure (specific metrics):**

**Primary:** PHQ-9 (free online) – complete weekly. A positive result would be a **sustained drop of 3+ points** from your baseline (the MCID). A drop of 1.7 points (the trial average) is possible but may not feel meaningful.

**Secondary:**

- Loneliness: De Jong Gierveld Loneliness Scale (11 items) – weekly.

- Anxiety: GAD-7 – weekly.

- Activity log: Each day, record the activities you completed and rate each one 0–10 for pleasure and 0–10 for mastery. Also track the number of social interactions (phone calls, in-person visits, video chats) per day.

- Quality of life: EQ-5D-5L (free) – weekly.

**Objective measure (optional):** Use a step counter or phone screen time log to see if your activity levels objectively increase.
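For the primary metric, one way to operationalise a "sustained drop of 3+ points" is to require that each of the most recent weekly scores sits at least 3 points below baseline. The 3-week window is an assumption made for this sketch (the text above does not define "sustained"), and the scores are made-up example values, not trial data.

```python
def sustained_drop(baseline: int, weekly_scores: list[int],
                   threshold: int = 3, window: int = 3) -> bool:
    """True if each of the last `window` weekly scores is >= `threshold` points below baseline."""
    if len(weekly_scores) < window:
        return False  # not enough data yet to call the drop sustained
    recent = weekly_scores[-window:]
    return all(baseline - score >= threshold for score in recent)


# Hypothetical self-tracking data: baseline PHQ-9 of 12, then eight weekly scores.
baseline_score = 12
scores = [12, 11, 10, 10, 9, 9, 8, 8]

result = sustained_drop(baseline_score, scores)
print("Sustained 3+ point drop:", result)
```

Requiring several consecutive weeks below the threshold guards against reading too much into a single good week, which matters because weekly PHQ-9 scores fluctuate.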

**Key confounds to control for:**

**Seasonal effects:** If you start in winter (when mood is typically lower), improvements may be due to spring/summer rather than BA. Keep your baseline and intervention periods within the same season where possible, or use a crossover design (8 weeks BA, 8 weeks washout, 8 weeks control).

**Life events:** Major stressors (bereavement, job loss, illness) can overwhelm any intervention. Note these in a diary and exclude weeks with major events from analysis.

**Medication changes:** If you start or stop antidepressants, sleep aids, or other psychotropic medications during the experiment, this confounds the results. Try to keep medication stable for the duration.

**Sleep quality:** Poor sleep can mimic depression. Track sleep duration and quality (e.g., using a sleep diary or wearable) as a potential confound.

**Expectation
