Proof, p-values, and hypothesis testing
- David M Rind, MD
- Section Editor — General Medicine
- Chief Medical Officer
- Institute for Clinical and Economic Review
- Assistant Professor of Medicine, Part-time
- Harvard Medical School
Biostatistical concepts frequently confuse clinicians. The meaning of a p-value in particular is commonly misunderstood, yet it is central to the way most clinicians interpret the results of scientific studies [1,2].
This review discusses the correct interpretation of p-values and confidence intervals, the idea of proof, and the role of power calculations in interpreting negative studies. A general discussion of the meaning of biostatistical terms is found elsewhere. (See "Glossary of common biostatistical and epidemiological terms".)
In scientific and medical endeavors, a common question is "what constitutes proof?" That is, how do we decide when the evidence for or against a hypothesis is adequate to consider the matter proven?
Certain clinical study designs are considered "stronger" than others. For instance, randomized clinical trials are generally considered better evidence than case-control studies. Proof, however, never resides in a single trial result or a single piece of evidence. Proof is a human concept, a product of rational judgment: the same information may be sufficient for one person to consider something proven while another remains unconvinced.
As an example, there is no evidence from clinical trials in humans that cigarette smoking causes lung cancer. However, evidence from epidemiologic studies overwhelmingly shows a relationship between smoking and lung cancer. A dose-response relationship in these studies, together with evidence from animal studies, provides strong support that the relationship is biologically plausible and causal (ie, smoking is not just associated with lung cancer but is a cause of lung cancer). Most people consider it proven that smoking causes lung cancer despite the absence of clinical trials in humans.
- Davidoff F. Standing statistics right side up. Ann Intern Med 1999; 130:1019.
- Goodman SN. Toward evidence-based medical statistics. 1: The P value fallacy. Ann Intern Med 1999; 130:995.
- Goodman SN, Berlin JA. The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med 1994; 121:200.