
Economic Analysis of Social Interactions


Authors: Charles F. Manski
Journal: The Journal of Economic Perspectives
Year: 2000
DOI: 10.1257/jep.14.3.115
Citations: 2,083

TL;DR

Social interactions (how your behaviour is influenced by the people around you) are extremely difficult to study with observational data, because several different causal mechanisms can produce the same observed patterns. Without controlled experiments or subjective data on expectations, you cannot reliably tell whether your friends' actions caused your behaviour or whether you simply chose friends who already behaved like you.

What they tested

This is not an empirical study but a theoretical and methodological critique. Manski examines the entire field of empirical research on social interactions—how individuals' behaviours, beliefs, and outcomes are shaped by the groups they belong to (peers, neighbours, family, colleagues). He tests no single intervention. Instead, he analyses the logical structure of the problem: given observational data (e.g., survey data on test scores, income, crime, or health behaviours across neighbourhoods or schools), what can we actually conclude about social influence?

The paper identifies three distinct mechanisms that can produce observed correlations between group members' outcomes:

1. **Endogenous effects:** Your behaviour changes because your peers' behaviour changes (true social influence). Example: You start exercising because your roommate starts exercising.

2. **Exogenous (contextual) effects:** Your behaviour changes because of fixed characteristics of your group, not because of their current behaviour. Example: You exercise more because your neighbourhood has a gym, not because your neighbours exercise.

3. **Correlated effects:** Members of the same group behave similarly because they share similar environments or because they self-selected into that group. Example: You and your roommate both exercise because you both chose an apartment near a park—not because either of you influenced the other.

The outcome measure is the *interpretability* of empirical findings—specifically, whether a given correlation can be uniquely attributed to one of these three mechanisms. Manski shows that without additional data (experiments, subjective expectations, or longitudinal data on group formation), it is mathematically impossible to distinguish them.

Who was studied

No human subjects were studied. The paper analyses the logical structure of existing empirical research on social interactions. The "sample" is the body of published observational studies in economics and sociology up to the year 2000, including studies of:

- Neighbourhood effects on children's educational attainment and crime

- Peer effects on academic performance in schools

- Social interactions in labour markets (e.g., how job referrals spread)

- Family and household decision-making

The "setting" is the universe of observational datasets (e.g., the Panel Study of Income Dynamics, the National Longitudinal Survey of Youth, census data) that economists had used to try to measure social influence.

How they measured it

No instruments or scales were used. Manski uses mathematical modelling and logical deduction. He formalises the problem using a linear-in-means model—a standard econometric framework where an individual's outcome is modelled as a function of:

- Their own characteristics (age, income, education)

- The mean outcome of their reference group (e.g., average test score in their school)

- The mean characteristics of their reference group (e.g., average parental income in their school)

- Unobserved factors

He then shows that, without strong assumptions (e.g., that groups are formed randomly, or that you know exactly who influences whom), the parameters of this model are *not identified*—meaning that infinitely many different combinations of endogenous, exogenous, and correlated effects can produce the exact same observed data.
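The argument is easier to follow with the model written out. The notation below follows a common textbook presentation of the linear-in-means model and is an assumption of this summary, not a quotation from the paper: $y$ is the individual outcome, $x$ the attributes that define the reference group, $z$ the individual's own characteristics, and $u$ the unobserved factors. Here $\beta$ carries the endogenous effect, $\gamma$ the exogenous (contextual) effect, $\eta$ the effect of own characteristics, and $\delta$ the correlated effects.

```latex
\begin{align}
y &= \alpha + \beta\,E(y \mid x) + E(z \mid x)'\gamma + z'\eta + u,
  \qquad E(u \mid x, z) = x'\delta \\
E(y \mid x, z) &= \frac{\alpha}{1-\beta}
  + E(z \mid x)'\,\frac{\gamma + \beta\eta}{1-\beta}
  + x'\,\frac{\delta}{1-\beta}
  + z'\eta,
  \qquad \beta \neq 1
\end{align}
```

The second line is the reduced form obtained by taking expectations of the first line given $x$ and solving for $E(y \mid x)$. Data can reveal at most the coefficients of this reduced form. The coefficient on $z$ gives $\eta$, but $\beta$ and $\gamma$ appear only inside the composite $(\gamma + \beta\eta)/(1-\beta)$, so many different pairs are consistent with the same observed relationship.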

Methodology

**Study design:** This is a theoretical paper—a formal mathematical analysis of the identification problem in observational studies of social interactions. It is not an experiment, a meta-analysis, or a systematic review. It is a critique of an entire research programme.

**Key logical steps:**

1. Manski defines the three types of social effects (endogenous, exogenous, correlated).

2. He presents the linear-in-means model, which is the standard way economists have tried to estimate peer effects.

3. He proves that, in the most general case, the model's parameters are unidentified—you cannot separately estimate the strength of endogenous effects, exogenous effects, and correlated effects from observational data alone.

4. He discusses what additional data or assumptions could solve the problem: random assignment to groups (experiments), data on subjective expectations (what people think their peers will do), or data on group formation processes.

**Why this design matters:** The paper is not a study you would replicate. It is a foundational critique that every researcher in this field must confront. Its "methodology" is mathematical proof, not empirical observation. The strength of this approach is that it reveals a logical impossibility: no observational dataset, however large, can solve the identification problem without additional information. The weakness is that it does not provide an alternative empirical strategy; it only diagnoses the problem.

**What this design can and cannot prove:**

**Can prove:** That observational data alone cannot distinguish between causal social influence and non-causal correlation. This is a mathematical result, not a statistical one—it holds regardless of sample size.

**Cannot prove:** That social interactions do not exist, or that they are unimportant. The paper only shows that we cannot reliably measure them using standard observational methods.

**Major methodological weakness:** The paper is purely theoretical. It does not test its claims against any real-world data. It also assumes a specific model (linear-in-means) which, while standard, may not capture all forms of social interaction (e.g., non-linear effects, network structure, threshold effects where behaviour spreads only after a critical mass is reached).

Key findings

**The reflection problem:** Manski uses the term "reflection problem" for the difficulty of distinguishing whether a person's behaviour is influenced by the average behaviour of their group, or whether the group average merely aggregates its members' individual behaviours. The name comes from an analogy with a mirror: if you watch a person and their reflection move at the same moment, the observation alone cannot tell you whether the image causes the person's movements or reflects them. The two are observationally equivalent.

**Non-identification of endogenous effects:** In the standard linear-in-means model, the parameter measuring endogenous social effects (how much your outcome changes when your peers' outcomes change) is not identified unless the researcher makes an arbitrary assumption about which group-level characteristics affect outcomes and which do not. Specifically, if the researcher does not know exactly which exogenous characteristics of the group matter, the endogenous effect cannot be separated from the exogenous effect.
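A few lines of arithmetic illustrate the point. The sketch below assumes a scalar version of the linear-in-means model, `y = alpha + beta*E[y|group] + gamma*E[z|group] + eta*z + u`, in which the coefficient a researcher can observe on the group-mean characteristic is `(gamma + beta*eta) / (1 - beta)`. The function name and the three parameter sets are hypothetical, chosen only to show that very different mixes of endogenous (`beta`) and exogenous (`gamma`) effects imply the same observable number.

```python
def composite_coefficient(beta: float, gamma: float, eta: float) -> float:
    """Reduced-form coefficient on the group-mean characteristic.

    Assumes the scalar linear-in-means model
        y = alpha + beta*E[y|group] + gamma*E[z|group] + eta*z + u
    whose reduced form puts (gamma + beta*eta) / (1 - beta) on E[z|group].
    """
    if beta == 1:
        raise ValueError("beta must differ from 1 for the reduced form to exist")
    return (gamma + beta * eta) / (1 - beta)


# Three hypothetical "worlds", all with own-characteristic effect eta = 1.
# Each entry is (beta, gamma).
worlds = {
    "contextual effect only": (0.0, 0.5),
    "mix of both": (0.2, 0.2),
    "endogenous effect only": (1 / 3, 0.0),
}

for label, (beta, gamma) in worlds.items():
    coef = composite_coefficient(beta, gamma, eta=1.0)
    print(f"{label}: beta={beta:.3f}, gamma={gamma:.3f}, observed={coef:.3f}")
```

All three worlds give an observed coefficient of about 0.5, so a dataset that pins down that coefficient says nothing about which world generated it.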

**The role of self-selection:** Even if a researcher observes that students in high-achieving schools have higher test scores, this could be because:

- The school's peer environment raises achievement (endogenous effect)

- The school has better teachers and facilities (exogenous effect)

- High-achieving families choose to live in that school district (correlated effect)

Without random assignment, these three explanations are mathematically indistinguishable.
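A small simulation can make this concrete. The sketch below is illustrative only: the population size, group size, and noise levels are arbitrary assumptions, not values from the paper. Each simulated person's outcome depends on a private trait plus noise and on nothing else, so the true peer effect is zero by construction. The code then regresses each person's outcome on the mean outcome of their groupmates, once when people sort into groups by trait and once when groups are assigned at random.

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def peer_slope(outcomes, group_size):
    """Regress each person's outcome on the mean outcome of their groupmates.

    Consecutive blocks of `group_size` entries in `outcomes` form one group.
    """
    own, peers = [], []
    for start in range(0, len(outcomes), group_size):
        group = outcomes[start:start + group_size]
        total = sum(group)
        for y in group:
            own.append(y)
            peers.append((total - y) / (len(group) - 1))  # leave-one-out mean
    return ols_slope(peers, own)


rng = random.Random(0)
n_people, group_size = 4000, 20

# Private traits; nobody's outcome depends on anyone else's outcome.
traits = [rng.gauss(0, 1) for _ in range(n_people)]


def outcome(trait):
    return trait + rng.gauss(0, 1)


# Self-selection: people end up in groups with others of similar trait.
sorted_outcomes = [outcome(t) for t in sorted(traits)]

# Random assignment: group membership is unrelated to the trait.
shuffled = traits[:]
rng.shuffle(shuffled)
random_outcomes = [outcome(t) for t in shuffled]

slope_sorted = peer_slope(sorted_outcomes, group_size)
slope_random = peer_slope(random_outcomes, group_size)
print(f"apparent peer effect with self-selection:    {slope_sorted:.2f}")
print(f"apparent peer effect with random assignment: {slope_random:.2f}")
```

Under sorting the regression reports a large "peer effect" even though none exists, while under random assignment the estimate stays close to zero. This is a toy demonstration of why random assignment helps, not a reproduction of any analysis in the paper.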

**What would solve the problem:** Manski identifies three routes to identification:

1. **Random assignment** of individuals to groups (e.g., random assignment of students to classrooms, or random assignment of families to neighbourhoods via housing vouchers)

2. **Subjective data** on expectations (e.g., asking people what they think their peers will do, rather than observing what peers actually do)

3. **Non-linearities** in the social interaction process (e.g., if the effect of peers is non-linear—only mattering above a threshold—this can sometimes be identified)

**Empirical research has made little progress:** Manski argues that despite a surge in empirical work on social interactions, the fundamental identification problem remains unresolved. Most studies that claim to have found peer effects have done so by making untestable assumptions about which group characteristics matter and which do not.

Effect magnitude

This paper does not report effect sizes because it is not an empirical study. However, the practical magnitude of the problem it identifies is enormous:

**In education research:** Studies claiming that a 1-point increase in average peer test scores raises an individual's test score by 0.2–0.5 points (a typical finding) are equally consistent with the interpretation that students with higher test scores simply sort into schools with other high-scoring students. The "effect" could be entirely spurious.

**In health behaviour research:** Studies claiming that having obese friends increases your probability of becoming obese by 57% (a well-known finding from Framingham Heart Study data, published in 2007, after Manski's paper) suffer from the same identification problem. The correlation could reflect shared environment (same neighbourhood, same grocery stores) or selection (obese people befriend other obese people).

**In crime research:** Studies claiming that living in a high-crime neighbourhood increases an individual's probability of committing crime by 10–20 percentage points cannot distinguish whether this is due to peer influence, lack of policing, or the fact that families who commit crimes tend to live in the same neighbourhoods.

The key takeaway is not a specific number but a range: the true effect of social influence could be anywhere from zero to the full observed correlation, and observational data alone cannot tell you where in that range the truth lies.

Limitations

**What the author acknowledges:**

- The analysis is limited to the linear-in-means model, which may not capture all forms of social interaction.

- The paper does not provide a solution; it only diagnoses the problem.

- Some special cases (e.g., non-linear interactions, or data on who actually interacts with whom) can partially overcome the identification problem.

**What a critical reader would note:**

**No empirical test:** The paper is entirely theoretical. It does not test its claims against any real-world dataset to show that the identification problem actually matters in practice. A sceptic could argue that while the problem exists in theory, in practice the magnitude of the bias is small.

**Assumes rational expectations:** The linear-in-means model assumes that individuals know the mean outcome of their reference group. In reality, people may have biased perceptions of their peers' behaviour, which could actually help identification (because perceived peer behaviour differs from actual peer behaviour).

**Ignores network structure:** The paper treats groups as homogeneous (everyone in a school or neighbourhood influences everyone else equally). Real social networks are much more structured—you are influenced by your friends, not by the average person in your neighbourhood. Network data (who actually interacts with whom) can sometimes solve the identification problem, but Manski does not discuss this in depth.

**Dated context:** The paper was published in 2000. Since then, there has been substantial progress using randomised experiments (e.g., random assignment of students to roommates, random assignment of housing vouchers) and using network data. The critique remains valid for purely observational studies, but the field has moved partly beyond it.

**No discussion of measurement error:** The paper assumes that outcomes and group characteristics are measured without error. In practice, measurement error can either worsen or (counter-intuitively) help identification, but this is not addressed.

Practical takeaways

For someone running their own n=1 experiment on social influence:

### What to test

**Specific intervention:** Deliberately change the behaviour of one person in your social circle and measure whether your own behaviour changes. For example:

- Ask a friend to start exercising at a specific time each day, and measure whether you also start exercising more.

- Ask a friend to start studying in the same room as you, and measure your own study time.

- Ask a friend to start eating a specific food (e.g., vegetables at dinner), and measure your own consumption.

**Dose:** The intervention should be a clear, observable change in your peer's behaviour. The peer should change their behaviour by a specific amount (e.g., "exercise for 30 minutes at 6 PM every day" rather than "exercise more").

### Minimum meaningful duration

**At least 4 weeks:** Social influence may take time to develop. A minimum of 2 weeks of baseline measurement (your behaviour before the peer changes) and 2 weeks of intervention (after the peer changes) is needed to distinguish a trend from noise.

**Longer is better:** 8–12 weeks total (4 weeks baseline, 4–8 weeks intervention) would give more reliable results, especially if the behaviour is habitual (e.g., diet, exercise, sleep).

### What to measure (specific metrics)

**Your behaviour:** Measure the same behaviour daily. For example:

- Minutes of exercise per day (using a fitness tracker or log)

- Hours of study per day (using a timer or app)

- Calories consumed per day (using a food diary)

- Minutes of social media use per day (using screen time tracking)

**Peer's behaviour:** Also measure the peer's behaviour daily, to confirm they actually changed. If you cannot measure the peer directly, at least record whether they performed the behaviour each day (yes/no).

**Confounders:** Measure daily:

- Your mood (1–10 scale)

- Your energy level (1–10 scale)

- Your stress level (1–10 scale)

- Whether you spent time with the peer (yes/no, and duration)

- Whether you were exposed to other peers who also perform the behaviour
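One way to keep these daily measurements consistent is to record them in a fixed structure. The sketch below is a possible layout, not a prescribed one; the class name `DailyLog` and every field name are hypothetical, and the example uses exercise minutes as the tracked behaviour.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyLog:
    """One day's record; all field names are illustrative."""
    day: date
    own_minutes: float            # your behaviour, e.g. minutes of exercise
    peer_did_behaviour: bool      # did the peer perform the behaviour today?
    mood: int                     # 1-10 scale
    energy: int                   # 1-10 scale
    stress: int                   # 1-10 scale
    minutes_with_peer: float      # 0 if you did not see the peer
    saw_other_active_peers: bool  # exposure to other peers doing the behaviour

    def __post_init__(self):
        # Reject out-of-range ratings at entry time rather than at analysis time.
        for name in ("mood", "energy", "stress"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")
        if self.own_minutes < 0 or self.minutes_with_peer < 0:
            raise ValueError("durations cannot be negative")


entry = DailyLog(date(2024, 1, 1), 30, False, mood=6, energy=5, stress=4,
                 minutes_with_peer=0, saw_other_active_peers=False)
```

Validating the 1–10 scales when the record is created catches typos on the day they happen, which matters when the whole dataset is a few dozen rows.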

### Key confounds to control for

**Self-selection:** You may have chosen to be friends with this person because you already share similar behaviours. To control for this, the peer should change their behaviour *after* you have established a baseline. Do not choose a peer who already exercises if you also exercise—choose a peer who does not currently exercise and ask them to start.

**Shared environment:** If you and your peer live together, changes in your behaviour could be due to changes in your shared environment (e.g., you both start exercising because the weather improves, not because of social influence). Control for this by measuring environmental factors (weather, day of week, holidays) and including them in your analysis.

**Reverse causality:** Your behaviour might cause your peer's behaviour, not the other way around. To control for this, the peer should change their behaviour *first*, and you should measure whether your behaviour changes *after* their change. Use a time-lagged analysis (e.g., does peer's behaviour on day 1 predict your behaviour on day 2?).

**Expectation effects:** You might change your behaviour simply because you know you are being studied. You cannot blind yourself to your own hypothesis, but you can reduce this effect by having the peer choose when to change their behaviour without telling you the exact timing.
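The time-lagged check mentioned under reverse causality can be sketched in a few lines. The example below uses made-up daily minutes and hypothetical function names. It compares how well the peer's behaviour on one day lines up with yours on the next day against the reverse ordering.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leader, follower, lag=1):
    """Correlate leader[t] with follower[t + lag]."""
    if lag < 1 or lag >= len(leader):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return pearson(leader[:-lag], follower[lag:])


# Made-up daily minutes in which you copy the peer one day later.
peer = [0, 30, 0, 0, 30, 30, 0, 30, 0, 0, 30, 0, 30, 30]
you = [10, 10, 25, 10, 10, 25, 25, 10, 25, 10, 10, 25, 10, 25]

peer_leads = lagged_correlation(peer, you)  # peer today vs. you tomorrow
you_lead = lagged_correlation(you, peer)    # you today vs. peer tomorrow
print(f"peer leads you: {peer_leads:.2f}, you lead peer: {you_lead:.2f}")
```

A stronger correlation in the "peer leads" direction is consistent with the peer influencing you, but it only establishes ordering in time. A shared cause that reaches the peer a day earlier than it reaches you would produce the same pattern.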

### What a positive result would look like

**Clear temporal pattern:** Your behaviour shows no trend during the baseline period (2 weeks), then shows a clear increase or decrease within 1–2 weeks of the peer's behaviour change.

**Effect size:** A change of at least 20–30% from your baseline mean. For example, if you exercise 30 minutes/day at baseline, a positive result would be 36–39 minutes/day during the intervention.

**Consistency:** The change should be visible on most days, not just a few. If you exercise more on only 2 out of 14 intervention days, that is not a reliable effect.

**Statistical test:** Use a simple two-sample t-test comparing your baseline mean to your intervention mean. A p-value < 0.05 is suggestive, but daily measurements from one person are few and usually autocorrelated, which makes p-values overconfident, so focus on the effect size and consistency rather than p-values alone.
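The criteria above can be computed together. The sketch below uses only the Python standard library, so it substitutes a permutation test for the t-test (the standard library has no t distribution). The data are made up and the function name is hypothetical. Like the t-test, the permutation test treats days as interchangeable, which autocorrelated daily data violate, so its p-value should be read as a rough guide.

```python
import random
from statistics import mean


def summarise(baseline, intervention, n_perm=2000, seed=0):
    """Percent change, day-level consistency, and a permutation p-value."""
    base_mean, int_mean = mean(baseline), mean(intervention)
    observed = int_mean - base_mean
    pct_change = 100 * observed / base_mean
    # Share of intervention days that beat the baseline average.
    consistency = sum(x > base_mean for x in intervention) / len(intervention)

    # Permutation test: shuffle the phase labels and count how often a
    # difference at least as large as the observed one arises by chance.
    rng = random.Random(seed)
    pooled = list(baseline) + list(intervention)
    n_base = len(baseline)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[n_base:]) - mean(pooled[:n_base])
        if abs(diff) >= abs(observed):
            extreme += 1
    p_value = (extreme + 1) / (n_perm + 1)
    return pct_change, consistency, p_value


# Made-up minutes of exercise per day: 14 baseline days, 14 intervention days.
baseline = [30, 28, 32, 31, 29, 30, 30, 27, 33, 30, 29, 31, 30, 30]
intervention = [36, 38, 35, 40, 37, 39, 36, 38, 34, 41, 37, 38, 36, 40]

pct, consistency, p = summarise(baseline, intervention)
print(f"change: {pct:.0f}%  days above baseline mean: {consistency:.0%}  p = {p:.4f}")
```

In this invented dataset the baseline mean is 30 minutes and the intervention mean is 37.5 minutes, a 25% increase that shows up on every intervention day.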

### Key warning from Manski's paper

Even with a well-designed n=1 experiment, you cannot fully rule out that the change in your behaviour was caused by something else that happened at the same time as your peer's behaviour change. The only way to be confident is to repeat the experiment multiple times (e.g., have the peer start and stop the behaviour several times) and see if your behaviour tracks theirs each time. This is the closest you can get to a causal conclusion in an n=1 setting.
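The repeated start-and-stop check can be expressed as a simple rule: your average should rise in every phase where the peer is active and fall in every phase where they stop. The sketch below assumes you have split your daily log into alternating phases; the function name and the data are hypothetical.

```python
from statistics import mean


def tracks_peer(phases):
    """Check whether your behaviour rises in every 'on' phase and falls
    in every 'off' phase of a repeated start/stop design.

    `phases` is an ordered list of (peer_active, your_daily_values) pairs.
    """
    means = [(active, mean(values)) for active, values in phases]
    for (prev_active, prev_mean), (active, cur_mean) in zip(means, means[1:]):
        if active == prev_active:
            raise ValueError("consecutive phases must alternate on/off")
        rose = cur_mean > prev_mean
        if rose != active:  # should rise when the peer starts, fall when they stop
            return False
    return True


# Made-up data: four one-week phases, peer off / on / off / on.
phases = [
    (False, [30, 29, 31, 30, 28, 32, 30]),
    (True, [36, 38, 37, 35, 39, 36, 38]),
    (False, [31, 30, 29, 32, 30, 31, 30]),
    (True, [37, 36, 39, 38, 35, 37, 38]),
]

print(tracks_peer(phases))
```

Passing this check across several reversals makes a coincidental one-off cause less plausible, although a confound that happens to switch on and off with the peer's schedule would still pass it.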
