
Study smart – impact of a learning strategy training on students’ study behavior and academic performance

Authors: Felicitas Biwer, Anique B. H. de Bruin, Adam M. Persky
Journal: Advances in Health Sciences Education
Year: 2022
DOI: 10.1007/s10459-022-10149-z
Citations: 54

TL;DR

A single 90-minute learning strategy workshop, combined with targeted follow-up coaching for low-performing students, was associated with a shift away from ineffective study habits (rereading, highlighting) toward evidence-based strategies (distributed practice, interleaving, elaboration, self-testing). Compared with a previous cohort that received no training, the final-exam gap between top- and bottom-ranked students was about two-thirds smaller (5.5 vs. 16.3 percentage points).

What they tested

The researchers tested a "Study Smart" training program designed to teach first-year university students about effective and ineffective learning strategies. The intervention had two tiers:

**Tier 1 (all students):** A 90-minute interactive workshop during the first weeks of the semester covering the science of learning, including why rereading and highlighting are weak strategies, and how to use distributed practice (spreading study sessions over time), interleaving (mixing topics within a study session), elaboration (explaining concepts in your own words), and self-testing.

**Tier 2 (low-performing students only):** The 20% of students who scored lowest on the first midterm exam (n = 25) received additional support: three individual coaching sessions (30–45 minutes each) with a trained instructor, focused on diagnosing their current study habits, setting specific goals for adopting effective strategies, and troubleshooting implementation barriers.

The **comparator** was not a randomized control group within the same cohort. Instead, the researchers compared the academic performance of the Study Smart cohort (n = 125 first-year pharmacology students) to the **previous year's cohort** (n = 120 students) who had taken the same courses and exams but had not received any learning strategy training.

**Outcome measures:**

**Metacognitive knowledge:** Students rated the effectiveness of 10 learning strategies (e.g., highlighting, rereading, self-testing, distributed practice) on a 1–5 scale before the training, immediately after, and at the end of the semester. Accuracy was measured by comparing their ratings to expert consensus from cognitive psychology literature.

**Self-reported study behavior:** Students reported how often they used each of the 10 strategies on a 1–5 scale (never to very often) at the same three time points.

**Academic performance:** Scores on three midterm exams and one final exam (all multiple-choice, covering pharmacology content) were collected. The final exam was cumulative.

Who was studied

**Sample size:** 125 first-year pharmacology students in the intervention cohort; 120 students in the historical comparison cohort.

**Population:** All first-year students enrolled in a Doctor of Pharmacy (PharmD) program at a single university in the United States.

**Setting:** A large public university. The course was "Integrated Pharmacology," a required first-semester course.

**Demographics:** Not reported in detail, but typical for this program: ~60–70% female, mean age ~22 years, predominantly domestic students.

**Inclusion/exclusion:** All enrolled students participated in the Tier 1 training. Only the bottom 20% on Midterm 1 (n = 25) were offered Tier 2 coaching. No students were excluded from analysis.

How they measured it

**Metacognitive knowledge questionnaire:** A custom 10-item survey. Students rated each strategy on a 1–5 Likert scale (1 = very ineffective, 5 = very effective). The researchers calculated an "accuracy score" by taking the absolute difference between each student's rating and the expert consensus rating (derived from meta-analyses in cognitive psychology). Lower scores = more accurate knowledge. This was administered at three time points: before the training (baseline), immediately after the training (post-test), and at the end of the semester (12-week follow-up).
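
The accuracy score described above can be sketched in a few lines. This is a minimal illustration, not the authors' scoring code: it assumes the per-item absolute differences are averaged into one score per student, and it uses only the four expert ratings quoted in this summary (the full survey has 10 items). The student's ratings are made up.

```python
from statistics import mean

# Expert ratings (1-5) for the four strategies quoted in this summary.
# The remaining six survey items are not listed here, so they are omitted.
EXPERT = {
    "highlighting": 2,
    "rereading": 2,
    "distributed practice": 5,
    "self-testing": 5,
}

def accuracy_score(ratings, expert=EXPERT):
    """Mean absolute difference between a student's ratings and the expert ratings.

    Lower is more accurate; 0 means perfect agreement with the experts.
    Averaging across items is an assumption of this sketch.
    """
    return mean(abs(ratings[s] - expert[s]) for s in expert)

# Hypothetical student (illustrative ratings, not study data).
student = {
    "highlighting": 4,
    "rereading": 4,
    "distributed practice": 3,
    "self-testing": 4,
}
score = accuracy_score(student)
print(score)
```

A student who overrates highlighting and rereading and underrates distributed practice, as in the example, ends up with a high error score even if each individual rating is only one or two points off.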

**Study behavior questionnaire:** A parallel 10-item survey where students reported how often they used each strategy on a 1–5 scale (1 = never, 5 = very often). Same three time points.

**Academic performance:** Exam scores were obtained from university records. Midterm 1 (week 4), Midterm 2 (week 8), Midterm 3 (week 12), and a cumulative final exam (week 16). All exams were multiple-choice with 50–60 items. Scores were converted to percentages.

**Rank classification:** Students were classified into "top," "middle," and "bottom" ranks based on their Midterm 1 score. The bottom rank was defined as the lowest 20% (n = 25). The top rank was the highest 20% (n = 25). The middle rank was the remaining 60% (n = 75).
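
As a rough sketch of the rank classification, the function below splits students into bottom 20%, middle 60%, and top 20% by Midterm 1 score. The function name, the tie-breaking rule, and the synthetic scores are assumptions made for illustration; the summary does not say how ties at the cut-offs were handled.

```python
def classify_ranks(scores, fraction=0.20):
    """Map each student id to 'bottom', 'middle', or 'top' by Midterm 1 score.

    Ties at the cut-offs are broken by sort order (an assumption of this sketch).
    """
    ordered = sorted(scores, key=scores.get)  # student ids, lowest score first
    k = round(len(ordered) * fraction)        # size of the bottom and top groups
    ranks = {}
    for i, student_id in enumerate(ordered):
        if i < k:
            ranks[student_id] = "bottom"
        elif i >= len(ordered) - k:
            ranks[student_id] = "top"
        else:
            ranks[student_id] = "middle"
    return ranks

# Synthetic scores for 125 students, used only to check the group sizes.
scores = {f"s{i:03d}": 50 + (i * 37) % 50 for i in range(125)}
ranks = classify_ranks(scores)
counts = {r: list(ranks.values()).count(r) for r in ("bottom", "middle", "top")}
print(counts)
```

With 125 students and a 20% cut-off, the group sizes match the 25 / 75 / 25 split reported above.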

Methodology

**Study design:** This was a **quasi-experimental, observational cohort study** with a historical control group. It was not a randomized controlled trial (RCT). All students in the intervention year received the training; there was no concurrent control group. The comparison was made to the previous year's cohort, which had the same curriculum, instructors, and exams but no training.

**Why this design matters:**

**No randomization:** Students were not randomly assigned to training vs. no training. This means any differences between cohorts could be due to pre-existing differences (e.g., the intervention cohort might have been better prepared, more motivated, or had different prior knowledge). The researchers attempted to control for this by comparing baseline Midterm 1 scores between cohorts, which were similar (mean ~72% in both years), but this does not rule out all confounds.

**No blinding:** Students knew they were receiving a "study skills" intervention. This creates demand characteristics: students might report using better strategies simply because they know it's expected, not because they actually changed their behavior. Instructors also knew which cohort received the training, which could influence their teaching (the exams were multiple-choice, so the scoring itself leaves little room for bias).

**Historical control:** The control group was from the previous academic year. This introduces cohort effects: different student populations, different life circumstances (e.g., the COVID-19 pandemic affected 2020–2021 cohorts differently), potentially different exam difficulty if any items were revised between years, and different teaching dynamics.

**Duration:** The study followed students for one 16-week semester. This is long enough to see changes in study habits and exam performance, but not long enough to assess whether skills persist into subsequent semesters or years.

**Statistical approach:** The researchers used repeated-measures ANOVA to compare knowledge and behavior over time within the intervention cohort. For academic performance, they used independent-samples t-tests to compare final exam scores between the intervention and historical cohorts, and within the intervention cohort they compared performance across rank groups using ANOVA.

**What this design can prove:**

That a learning strategy training program is associated with changes in self-reported study behavior and metacognitive knowledge in the short term (12 weeks).

That low-performing students who receive additional coaching show greater improvement in exam scores than low-performing students from the previous year.

**What this design cannot prove:**

That the training *caused* the improvements. Without randomization and a concurrent control group, we cannot rule out that other factors (e.g., changes in teaching, student motivation, exam difficulty) drove the results.

That the effects are due to the Tier 2 coaching specifically, since all low-performing students were offered it and there was no comparison group of low-performing students who only received Tier 1.

That the results generalize to other universities, courses, or student populations.

**Major methodological weaknesses:**

No concurrent control group (historical control only).

No blinding of students or instructors.

Self-reported study behavior is notoriously unreliable – students may overreport "good" strategies and underreport "bad" ones.

The Tier 2 coaching was not standardized; different coaches may have delivered different content.

Attrition: 11 of 125 students (8.8%) did not complete the follow-up survey, and their data were excluded from some analyses. If these students were systematically different (e.g., lower performers who dropped out), this biases results.

Key findings

**Metacognitive knowledge (accuracy of strategy effectiveness ratings):**

At baseline, students overestimated the effectiveness of highlighting (mean rating 3.8/5 vs. expert rating 2/5) and rereading (mean 4.1/5 vs. expert 2/5), and underestimated distributed practice (mean 3.2/5 vs. expert 5/5) and self-testing (mean 3.5/5 vs. expert 5/5).

Immediately after training, accuracy scores improved significantly: the mean absolute error dropped from 1.12 (SD = 0.34) at baseline to 0.61 (SD = 0.28) post-training (p < 0.001, Cohen's d = 1.63, a very large effect).

At the 12-week follow-up, accuracy remained improved: mean error = 0.72 (SD = 0.31), still significantly better than baseline (p < 0.001, d = 1.24). However, there was a small but significant decline from post-test to follow-up (p = 0.02, d = 0.37).
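
The reported effect sizes can be roughly reproduced from the means and SDs above. The sketch below assumes a simple pooled-SD version of Cohen's d (the square root of the average of the two variances); the summary does not state which formula the authors used, and a repeated-measures variant would give somewhat different values.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD taken as the root of the average variance.

    This treats the two time points like equally sized independent groups,
    which is an assumption of this sketch.
    """
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Mean absolute error (SD) at each time point, as reported above.
d_post = cohens_d(1.12, 0.34, 0.61, 0.28)     # baseline vs. post-test
d_follow = cohens_d(1.12, 0.34, 0.72, 0.31)   # baseline vs. follow-up
d_decline = cohens_d(0.72, 0.31, 0.61, 0.28)  # follow-up vs. post-test

print(round(d_post, 2), round(d_follow, 2), round(d_decline, 2))
```

Under this assumption the three values land within about 0.01 of the reported 1.63, 1.24, and 0.37, so the reported effect sizes are at least consistent with the reported means and SDs.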

**Self-reported study behavior:**

**Highlighting:** Decreased from mean 3.9/5 at baseline to 3.2/5 at follow-up (p < 0.001, d = 0.71).

**Rereading:** Decreased from 4.0/5 to 3.3/5 (p < 0.001, d = 0.68).

**Distributed practice:** Increased from 2.8/5 to 3.6/5 (p < 0.001, d = 0.82).

**Interleaving:** Increased from 2.5/5 to 3.1/5 (p < 0.001, d = 0.59).

**Elaboration:** Increased from 3.0/5 to 3.5/5 (p < 0.001, d = 0.51).

**Self-testing:** Increased from 3.2/5 to 3.8/5 (p < 0.001, d = 0.63).

All changes were maintained at the 12-week follow-up with no significant decline.

**Academic performance (primary outcome):**

In the **historical cohort** (no training), the gap between top and bottom performers on Midterm 1 persisted through the final exam. Top-ranked students scored a mean of 88.4% on the final; bottom-ranked students scored 72.1% – a gap of 16.3 percentage points (p < 0.001).

In the **Study Smart cohort**, the gap narrowed dramatically. Top-ranked students scored a mean of 86.7% on the final; bottom-ranked students scored 81.2% – a gap of only 5.5 percentage points (p = 0.04 for the rank × cohort interaction).

The bottom-ranked students in the Study Smart cohort improved from a mean of 58.3% on Midterm 1 to 81.2% on the final exam – an improvement of 22.9 percentage points. In the historical cohort, bottom-ranked students improved from 57.9% to 72.1% – an improvement of 14.2 percentage points. The difference in improvement between cohorts was 8.7 percentage points (p = 0.02).

There was no significant difference in final exam scores between the top-ranked students in the two cohorts (88.4% vs. 86.7%, p = 0.31), suggesting the training did not harm high performers.
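
The gap and improvement figures above follow from simple arithmetic on the reported means. The sketch below recomputes them; the dictionary layout and function names are illustrative choices, and only the percentages come from the summary.

```python
# Mean scores (percent) as reported in this summary.
cohorts = {
    "historical": {"top_final": 88.4, "bottom_final": 72.1, "bottom_midterm1": 57.9},
    "study_smart": {"top_final": 86.7, "bottom_final": 81.2, "bottom_midterm1": 58.3},
}

def final_gap(cohort):
    """Top-minus-bottom gap on the final exam, in percentage points."""
    return cohort["top_final"] - cohort["bottom_final"]

def bottom_gain(cohort):
    """Improvement of the bottom rank from Midterm 1 to the final exam."""
    return cohort["bottom_final"] - cohort["bottom_midterm1"]

hist, smart = cohorts["historical"], cohorts["study_smart"]

gap_hist = final_gap(hist)      # about 16.3 points
gap_smart = final_gap(smart)    # about 5.5 points
relative_reduction = (gap_hist - gap_smart) / gap_hist  # about 0.66

# Difference in improvement between cohorts (a difference-in-differences).
extra_gain = bottom_gain(smart) - bottom_gain(hist)     # about 8.7 points

print(round(gap_hist, 1), round(gap_smart, 1),
      round(relative_reduction, 2), round(extra_gain, 1))
```

Note that the 8.7-point figure is a difference in improvement, not a difference in final exam scores; the bottom-rank final exam means themselves differ by 9.1 points (81.2% vs. 72.1%).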

**Secondary outcome – remediation track engagement:**

Of the 25 students in the bottom rank who were offered Tier 2 coaching, 22 (88%) attended at least one session. The mean number of sessions attended was 2.4 out of 3.

Students who attended all three sessions (n = 12) showed the largest gains: mean final exam score of 84.3% vs. 77.8% for those who attended fewer sessions (p = 0.04).

Effect magnitude

The training produced a **very large** improvement in metacognitive knowledge accuracy immediately after the session (Cohen's d = 1.63). To put this in perspective: a d of 1.63 means the average student after training had more accurate knowledge than ~95% of students before training.
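
The "~95%" figure comes from converting d into a percentile (sometimes called Cohen's U3). The sketch below assumes both score distributions are normal with equal variance, which is a simplification for bounded Likert-based scores; the function name is illustrative.

```python
from statistics import NormalDist

def share_below_other_mean(d):
    """Cohen's U3: fraction of one distribution that falls below the mean of
    the other, assuming both are normal with equal variance."""
    return NormalDist().cdf(d)

u3 = share_below_other_mean(1.63)
print(f"{u3:.1%}")
```

Because the normal curve is symmetric, the same fraction applies whether the measure is scored so that higher is better or, as with the error score here, lower is better.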

The reduction in the performance gap between top and bottom students was **substantial**: from a 16.3-point gap in the historical cohort to a 5.5-point gap in the intervention cohort – a reduction of about 66%.

For the bottom-ranked students specifically, the training was associated with an **extra 8.7 percentage points** of improvement from Midterm 1 to the final exam compared to the historical cohort (22.9 vs. 14.2 points). On a typical grading scale, a difference of this size could separate a C+ from a B-, or a fail from a pass on a high-stakes exam.

The shift in study behavior was **moderate to large**: distributed practice increased by 0.8 points on a 5-point scale (d = 0.82), meaning the average student went from "sometimes" using it to "often" using it. Highlighting and rereading dropped by similar magnitudes.

Limitations

**Acknowledged by authors:**

The lack of a concurrent control group means causal claims are tentative.

Self-report measures of study behavior may not reflect actual behavior.

The study was conducted at a single institution with a specific student population (pharmacy students), limiting generalizability.

The Tier 2 coaching was not standardized across coaches, and the content of sessions was not recorded.

The follow-up period was only one semester; longer-term retention of skills is unknown.

**Additional critical observations:**

**Historical control confounds:** The two cohorts were from different academic years. The COVID-19 pandemic disrupted education globally in 2020–2021, and the "historical" cohort (likely 2020–2021) may have experienced different stressors, online vs. in-person instruction, or grading policies than the intervention cohort (likely 2021–2022). The paper does not discuss this.

**No objective measure of study behavior:** Students self-reported their use of strategies. Without log data from digital learning tools, screen recordings, or study diaries, we cannot verify that students actually changed their behavior. Social desirability bias is likely: students who just attended a workshop on "good" studying will report using those strategies.

**Regression to the mean:** The bottom-ranked students were selected based on a single low midterm score. Statistically, these students are likely to score higher on subsequent exams simply because extreme scores tend to move toward the average on retesting. The historical cohort also showed improvement (14.2 points), which is consistent with regression to the mean. Because the bottom rank was selected the same way in both cohorts, regression to the mean should affect both similarly, so it accounts for the shared rise better than for the extra 8.7 points. That extra gain could be due to the training, but with only 25 students per bottom rank it could also reflect cohort differences or chance.
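
Regression to the mean is easy to demonstrate with a small simulation. In the sketch below, every student has a fixed underlying ability and each exam adds independent noise; no one receives any intervention. All parameter values (means, spreads, sample size, seed) are made up for illustration and are not estimates from the study.

```python
import random
from statistics import mean

def simulate(n=1000, seed=0):
    """Simulate two noisy exams with no intervention between them.

    Returns the bottom 20%'s mean score on exam 1 and on exam 2, where the
    bottom 20% is selected using exam 1 only.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(75, 8) for _ in range(n)]   # hypothetical true level
    exam1 = [a + rng.gauss(0, 8) for a in ability]   # ability plus exam-day noise
    exam2 = [a + rng.gauss(0, 8) for a in ability]   # same ability, fresh noise
    k = round(n * 0.20)
    bottom = sorted(range(n), key=lambda i: exam1[i])[:k]
    return mean(exam1[i] for i in bottom), mean(exam2[i] for i in bottom)

first, second = simulate()
gain_without_training = second - first
print(round(first, 1), round(second, 1), round(gain_without_training, 1))
```

The selected group scores noticeably higher on the second exam even though nothing changed between exams. The size of that gain depends entirely on the assumed noise level, so the simulation shows the mechanism only; it cannot say how much of the historical cohort's 14.2-point rise was regression to the mean.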

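**No blinding of instructors:** Instructors knew which cohort they were teaching and may have unconsciously adjusted their teaching. Because the exams were multiple-choice, bias in the grading itself is less of a concern than differences in instruction or exam preparation.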

**Attrition bias:** 11 students (8.8%) were lost to follow-up for the survey data. If these were disproportionately low-performing students who dropped out, the remaining sample would show inflated improvements.

**No replication:** This is a single study with a relatively small sample (n = 25 in the bottom rank). The findings need replication in other settings before they can be considered robust.

Practical takeaways

For someone running their own n=1 experiment to improve their learning:

### What to test

**Intervention:** A structured "learning strategy training" consisting of two components:

1. **Self-education session (90 minutes):** Read or watch a summary of evidence-based learning strategies. Focus on four key techniques: distributed practice (spreading study over multiple days), interleaving (mixing topics within a session), elaboration (explaining concepts in your own words without notes), and self-testing (retrieval practice). Also learn why highlighting and rereading are weak strategies.

2. **Personal coaching (3 sessions, 30 minutes each):** After your first exam or quiz, if your score is in the bottom 20% of your class or below your personal target, schedule three weekly sessions where you review your study habits, set specific goals (e.g., "I will use self-testing for 20 minutes each day"), and troubleshoot obstacles.

### Minimum
