---
title: "PHI403 Lecture 19 — What RCTs Do Not Show"
type: source
tags: [philosophy-of-science, rct, ecological-fallacy, external-validity, negative-results, evidence-based-medicine]
authors: [Anjum, Rani Lill; Rocca, Elena]
year: 2023
venue: "PHI403 Causation in Science, NMBU"
kind: handout
raw_path: "raw/Philosophy of Science/PHI302 19 What RCTs Do Not Show.pdf"
created: 2026-04-20
updated: 2026-04-20
key_claims:
  - RCTs are the gold standard but systematically exclude severe effects, risk groups, and individual variation.
  - Inferring individual-level causation from RCT group averages is the ecological fallacy.
  - Negative results are rarely published; systematic reviews of RCTs are biased toward optimistic effect estimates and understate risks.
  - Policy cannot be derived automatically from RCT evidence — a further normative step is always required.
---

# PHI403 Lecture 19 — What RCTs Do Not Show

Centre-piece critique of the RCT as the gold standard of the **[[concepts/evidence-hierarchy|EBM evidence hierarchy]]**. An RCT tests a single causal intervention against a placebo: randomisation is meant to ensure representativeness, and blinding prevents knowledge of group assignment from contaminating the results. But the design **systematically excludes** large categories of causally relevant information:

- **Severe effects** — where the expected harm or benefit is severe, a randomised test is ethically ruled out (air pollution, seatbelts). The lecture cites the satirical *BMJ* paper "*Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials*".
- **Risk groups** — known-vulnerable subpopulations (allergies, children, elderly, pregnant women, very sick patients) are excluded from trials, so adverse effects in these groups are not evidenced.
- **Variations and marginals** — randomisation is designed to eliminate heterogeneity within the test sample; individual variation is treated as noise.
- **Individual propensities** — RCTs produce group-level frequencies; inferring from group to individual is the **ecological fallacy**.
- **Negative results** — rarely published; systematic reviews of RCTs are biased toward positive findings and understate risks.
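
The last two bullets can be made concrete with a small numerical sketch. Everything in it is invented for illustration: `individual_effects` and `trials` are hypothetical values, not data from the lecture, and the publication rule (only statistically significant trials get published) is a deliberately crude assumption.

```python
from statistics import mean

# --- Individual propensities vs. group average -----------------------------
# Hypothetical individual treatment effects for ten participants.
# A real RCT never observes these directly; it only sees group averages.
individual_effects = [4, 4, 4, 4, 4, 4, 4, -3, -3, -2]

average_effect = mean(individual_effects)
share_harmed = sum(1 for e in individual_effects if e < 0) / len(individual_effects)

print(f"group average effect: {average_effect:+.1f}")
print(f"share of individuals harmed: {share_harmed:.0%}")

# --- Negative results and optimistic pooled estimates ----------------------
# Hypothetical (effect estimate, statistically significant?) pairs for eight
# trials of the same intervention.
trials = [
    (0.50, True), (0.40, True), (0.45, True),
    (0.05, False), (-0.10, False), (0.00, False),
    (0.10, False), (-0.20, False),
]

# Pooled estimate if every trial were available to a reviewer.
all_mean = mean(e for e, _ in trials)
# Pooled estimate under the assumed rule that only significant trials
# are published (an illustrative simplification).
published_mean = mean(e for e, significant in trials if significant)

print(f"mean effect, all trials: {all_mean:.2f}")
print(f"mean effect, published trials only: {published_mean:.2f}")
```

With these made-up numbers the group average is positive although three of the ten individuals are harmed, and the pooled estimate from published trials alone is three times the estimate from all trials. Neither figure says anything about real trials; the point is only that an average, or a review of published averages, can be correct as arithmetic and still mislead about the individual case.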

**External validity**: once reported, RCT results are typically applied universally, including to the groups the trial excluded. Unless those exclusions are stated as caveats, this extrapolation is not scientifically supported.
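
A minimal sketch of why the extrapolation can fail, assuming a population made up of three strata of which only one was enrolled. The stratum names, population shares and effects are all hypothetical. In practice the effects in excluded strata are exactly what the trial leaves unknown, so the values here are placeholders chosen to show the mechanism.

```python
# Hypothetical strata: share of the real-world population, whether the trial
# enrolled them, and an assumed average effect within the stratum.
strata = {
    "healthy adults": {"share": 0.70, "in_trial": True, "effect": 3.0},
    "elderly": {"share": 0.20, "in_trial": False, "effect": 0.5},
    "pregnant women": {"share": 0.10, "in_trial": False, "effect": -2.0},
}


def weighted_effect(groups):
    """Population-share-weighted average effect over the given strata."""
    total_share = sum(g["share"] for g in groups)
    return sum(g["share"] * g["effect"] for g in groups) / total_share


# What the trial reports: the average over enrolled strata only.
trial_estimate = weighted_effect([g for g in strata.values() if g["in_trial"]])
# What applying the treatment to everyone would yield under these assumptions.
population_effect = weighted_effect(list(strata.values()))

print(f"trial estimate: {trial_estimate:.1f}")
print(f"population-wide effect: {population_effect:.1f}")
for name, g in strata.items():
    if not g["in_trial"]:
        print(f"not evidenced by the trial: {name} (assumed effect {g['effect']:+.1f})")
```

Under these assumptions the trial reports an effect of 3.0, the population-wide average is about 2.0, and one excluded group is harmed outright. The trial estimate is not wrong for the people it enrolled; it simply carries no information about the others.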

**Policy is not derivable from RCTs**: meta-analyses can rank interventions by effect size, but choosing a policy requires a further normative judgment. Treating the ranking itself as the policy conflates *is* with *ought*.

This is the lecture most directly relevant to evaluating **ML in practice**: the same critiques apply to benchmark-driven ML and to [[sources/2020-rama-maneiro-deep-learning-ppm-review|PPM benchmark contests]].

## Connections
Back-link: [[sources/2023-anjum-rocca-phi403-causation-in-science]]. Concepts: [[concepts/rct-limitations]] · [[concepts/evidence-hierarchy]] · [[concepts/probabilistic-causation]] · [[concepts/causation]] · [[concepts/methodological-pluralism]].