---
title: "PHI403 Lecture 19 — What RCTs Do Not Show"
type: source
tags: [philosophy-of-science, rct, ecological-fallacy, external-validity, negative-results, evidence-based-medicine]
authors: ["Anjum, Rani Lill", "Rocca, Elena"]
year: 2023
venue: "PHI403 Causation in Science, NMBU"
kind: handout
raw_path: "raw/Philosophy of Science/PHI302 19 What RCTs Do Not Show.pdf"
created: 2026-04-20
updated: 2026-04-20
key_claims:
  - RCTs are the gold standard but systematically exclude severe effects, risk groups, and individual variation.
  - Inferring individual-level causation from RCT group averages is the ecological fallacy.
  - Negative results are rarely published; systematic reviews of RCTs are biased toward optimistic effect estimates and understate risks.
  - Policy cannot be derived automatically from RCT evidence; a further normative step is always required.
---

# PHI403 Lecture 19 — What RCTs Do Not Show

Centrepiece critique of the gold standard in the **[[concepts/evidence-hierarchy|EBM evidence hierarchy]]**. RCTs test a single causal intervention against a placebo; randomisation is meant to ensure representativeness; blinding prevents knowledge of group assignment from contaminating results. But the design **systematically excludes** large categories of causally relevant information:

- **Severe effects**: ethically untestable (air pollution, seatbelts). The lecture cites the satirical *BMJ* paper "*Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials*".
- **Risk groups**: known-vulnerable subpopulations (people with allergies, children, the elderly, pregnant women, very sick patients) are excluded from trials, so the trials provide no evidence about adverse effects in these groups.
- **Variations and marginals**: randomisation is designed to eliminate heterogeneity within the test sample, so individual variation is treated as noise.
- **Individual propensities**: RCTs produce group-level frequencies; inferring individual-level causation from them is the **ecological fallacy**.
- **Negative results**: rarely published, so systematic reviews of RCTs are biased toward positive findings and understate risks.

**External validity**: once reported, RCT results are typically applied universally. Without caveats about who and what the trial excluded, this extrapolation is not scientifically supported.

**Policy is not derivable from RCTs**: meta-analyses can rank interventions by effect size, but choosing a policy requires a further normative judgment; the risk is conflating *is* with *ought*.

This is the lecture most directly relevant to evaluating **ML in practice**: the same critiques apply to benchmark-driven ML and to [[sources/2020-rama-maneiro-deep-learning-ppm-review|PPM benchmark contests]].

## Connections

Back-link: [[sources/2023-anjum-rocca-phi403-causation-in-science]].

Concepts: [[concepts/rct-limitations]] · [[concepts/evidence-hierarchy]] · [[concepts/probabilistic-causation]] · [[concepts/causation]] · [[concepts/methodological-pluralism]].
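
## Illustrative sketch

The ecological-fallacy point in this note can be made concrete with a toy calculation. This is a minimal sketch and is not taken from the lecture: the individual effects are invented for illustration (a real RCT never observes them, only an estimate of their average), and the helper `summarise` is a hypothetical name. It compares two made-up populations that share the same group average but differ completely at the individual level.

```python
from statistics import mean

def summarise(effects):
    """Return (average effect, share of individuals with a negative effect)."""
    harmed = sum(1 for e in effects if e < 0)
    return mean(effects), harmed / len(effects)

# Hypothetical individual treatment effects, invented for illustration.
# A real RCT never observes these directly; it only estimates the average.
mixed = [2.0] * 80 + [-3.0] * 20   # 80 helped by 2 units, 20 harmed by 3
uniform = [1.0] * 100              # everyone helped by 1 unit

mixed_avg, mixed_harmed = summarise(mixed)
uniform_avg, uniform_harmed = summarise(uniform)

print(f"mixed:   average {mixed_avg:+.2f}, harmed {mixed_harmed:.0%}")
print(f"uniform: average {uniform_avg:+.2f}, harmed {uniform_harmed:.0%}")
```

Both populations report an average effect of +1.00, yet in the mixed one a fifth of individuals are harmed. The group-level figure alone cannot tell the two cases apart, which is why reading it as a claim about any individual patient is unsupported.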