Do ultra-short screening instruments accurately detect depression in primary care? A pooled analysis and meta-analysis of 22 studies.
Mitchell AJ, Coyne JC
Br J Gen Pract. 2007;57(535):144.
BACKGROUND: Guidance from the National Institute for Health and Clinical Excellence recommends one or two questions as a possible screening method for depression. Ultra-short (one-, two-, three- or four-item) tests have appeal due to their simple administration but their accuracy has not been established.
AIM: To determine whether ultra-short screening instruments accurately detect depression in primary care.
DESIGN OF STUDY: Pooled analysis and meta-analysis.
METHOD: A literature search revealed 75 possible studies and from these, 22 STARD-compliant studies (Standards for Reporting of Diagnostic Accuracy) involving ultra-short tests were entered in the analysis.
RESULTS: Meta-analysis revealed a performance accuracy better than chance (P<0.001). More usefully for clinicians, pooled analysis of single-question tests revealed an overall sensitivity of 32.0% and specificity of 97.0% (positive predictive value [PPV] was 55.6% and negative predictive value [NPV] was 92.3%). For two- and three-item tests, overall sensitivity on pooled analysis was 73.7% and specificity was 74.7%, with a PPV of only 38.3% but a pooled NPV of 93.0%. The Youden index for single-item and multiple-item tests was 0.289 and 0.47 respectively, suggesting superiority of multiple-item tests. Re-analysis examining only 'either or' strategies improved the 'rule in' ability of two- and three-question tests (sensitivity 79.4% and NPV 94.7%) but at the expense of being able to rule out a possible diagnosis if the result was negative.
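The accuracy measures reported above can be illustrated with a minimal sketch. It assumes the standard definitions: the Youden index is sensitivity + specificity - 1, and PPV and NPV follow from sensitivity, specificity and prevalence. The function names and the 15% prevalence are hypothetical choices for illustration and are not taken from the study. Values recomputed from the rounded percentages need not match the reported indices exactly (0.289 and 0.47), presumably because the published figures were derived from unrounded pooled data.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1 (0 = chance, 1 = perfect)."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled sensitivity and specificity as reported (rounded percentages).
single_item = youden_index(0.320, 0.970)  # single-question tests
multi_item = youden_index(0.737, 0.747)   # two- and three-item tests

# Hypothetical prevalence of 15%; an assumption, not a figure from the study.
ppv, npv = predictive_values(0.737, 0.747, prevalence=0.15)

print(f"Youden index, single-item: {single_item:.3f}")
print(f"Youden index, multi-item:  {multi_item:.3f}")
print(f"At 15% prevalence: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Because PPV and NPV depend on prevalence, the same test will show a lower PPV and a higher NPV in settings where depression is less common.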
CONCLUSION: A one-question test identifies only three out of every 10 patients with depression in primary care and is thus unacceptable if relied on alone. Ultra-short two- or three-question tests perform better, identifying eight out of 10 cases, but at the expense of a high false-positive rate (only four out of 10 patients with a positive score are actually depressed). Ultra-short tests appear to be, at best, a method for ruling out a diagnosis and should only be used when there are sufficient resources for second-stage assessment of those who screen positive.
