The application of Rasch measurement theory to psychiatric clinical outcomes research: Commentary on … Screening for depression in primary care

Skye P. Barbic; Stefan J. Cano

doi:10.1192/pb.bp.115.052290

Published online by Cambridge University Press: 02 January 2018

Skye P. Barbic: University of British Columbia, Vancouver, Canada
Stefan J. Cano*: Modus Outcomes, Stotfold, UK
*Correspondence to Stefan Cano (stefan.cano@modusoutcomes.com)

Summary

This commentary argues for the importance of robust, meaningful assessment of clinical and functional outcomes in psychiatry. Outcome assessments should be fit for the purpose of measuring relevant concepts of interest in specific clinical settings. In addition, the measurement model selected to develop and test assessments can be critical for guiding care. Three types of measurement model are presented: classical test theory, item response theory and Rasch measurement theory. To optimise current diagnostic and treatment practices in psychiatry, careful consideration of these models is warranted.

Type: Original Papers
Information: BJPsych Bulletin, Volume 40, Issue 5, October 2016, pp. 243–244

DOI: https://doi.org/10.1192/pb.bp.115.052290
Creative Commons: This is an open-access article published by the Royal College of Psychiatrists and distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: Copyright © 2016 The Authors

Unlike in many fields of medicine, most clinical outcomes in psychiatry are not directly observable and cannot be captured with diagnostic tests such as blood work or imaging. In recent years, the importance of the routine use of clinical outcome assessments (patient-reported outcomes, clinician-reported outcomes, observer-reported outcomes and performance outcomes) for measuring the symptoms of disease and treatment outcomes has been increasingly emphasised.¹ Clinical outcome assessments such as the Patient Health Questionnaire-9 (PHQ-9)² are now commonly used in clinical research and practice to assess the severity of a patient's mood symptoms and improvement in response to treatment.³ More broadly, as demand increases for a wide range of mental health services to be patient-centred, clinical outcome assessments are used to capture outcomes such as sustained symptom reduction, return to full functioning and optimal patient well-being.⁴

To optimise mental healthcare, clinical outcome assessments used in psychiatry should be shown to be fit for purpose. They should appropriately capture the concept of interest (e.g. depression) in the context of use (e.g. patients attending primary care clinics reporting symptoms of depression).¹ They should also be underpinned by an appropriate measurement model; that is, there should be evidence that the summed score of their individual items is ‘psychometrically sound’.¹ To this end, there are three main psychometric approaches based on three types of measurement model: classical test theory (CTT), Rasch measurement theory (RMT) and item response theory (IRT).⁵

The current dominant paradigm in clinical outcomes research is CTT, the foundations of which were laid down by Charles Spearman at the turn of the twentieth century.⁶ CTT is associated with the psychometric properties most commonly recognised and understood by clinicians (e.g. reliability, validity and ability to detect change). However, there are four important limitations⁷ that prevent CTT methodology from fulfilling the requirements of scientific rigour demanded of high-stakes clinical decision-making: (a) measurements generated are ordinal rather than interval; (b) scores for persons and samples are scale dependent; (c) scale properties, such as reliability and validity, are sample dependent; (d) data can support group-level inferences but are not suitable for individual patient measurement.

Georg Rasch, a Danish mathematician, argued that the core requirement of social measurement should be the same as that in physical measurement, and developed the simple logistic model now known as the ‘Rasch model’.⁸ In essence, RMT methods assess the extent to which observed clinical outcome assessment data (e.g. patient ratings on the items of the PHQ-9) ‘fit’ with predictions of those ratings from the Rasch model (which defines how a set of items should perform to generate reliable and valid measurements).⁸ The difference between the expected and observed scores reveals the extent to which valid measurement is achieved. In turn, this gives rise to a range of potential investigations to better understand the extent to which the clinical outcome assessment under investigation is an appropriate measurement instrument (e.g. scale-to-sample targeting, adequacy of type and kind of response options, item and person fit, item dependency (or bias), stability between subgroups).⁷,⁹ Importantly, RMT addresses⁷ each of the four limitations of CTT described above: (a) linear measurements can be constructed from ordinal-level data; (b) item estimates provided are free from the sample distribution and person estimates are free from the scale distribution; (c) subsets of items from each scale rather than all items can be used (i.e. the foundation for item banking and computerised adaptive testing); (d) estimates are suitable for individual person analyses rather than only for group comparison studies.
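The comparison of expected and observed scores can be illustrated with a minimal sketch. It assumes the dichotomous form of the Rasch model, which is a simplification: PHQ-9 items have more than two response categories and would call for a polytomous extension. The function names, the person location `theta`, the item locations and the observed responses are all hypothetical values chosen for illustration, not data or methods from any study cited here.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of endorsing an item under the dichotomous Rasch model.

    theta: person location in logits (hypothetical symbol)
    b:     item location in logits (hypothetical symbol)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def standardised_residual(observed: int, theta: float, b: float) -> float:
    """(observed - expected) divided by the model standard deviation."""
    p = rasch_probability(theta, b)
    return (observed - p) / math.sqrt(p * (1.0 - p))


# Illustrative values only: one person, three items.
theta = 0.0
item_locations = [-1.0, 0.0, 1.5]
observed = [1, 1, 0]

expected = [rasch_probability(theta, b) for b in item_locations]
residuals = [
    standardised_residual(x, theta, b)
    for x, b in zip(observed, item_locations)
]

for b, x, e, r in zip(item_locations, observed, expected, residuals):
    print(f"item location {b:+.1f}: observed={x} expected={e:.3f} residual={r:+.3f}")
```

Large residuals, accumulated over persons or items, are the raw material for the fit investigations described above; a full analysis would first estimate the locations from data rather than assume them.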

IRT is another body of psychometric methodology that is used to ascertain the degree to which a given model and parameter estimates can account for the structure of and statistical patterns in a clinical outcome assessment dataset.¹⁰ The distinction between RMT and IRT is subtle but important. IRT models are statistical models used to explain data, and the aim of an IRT analysis is to find the statistical model that best explains the observed data.⁹ By contrast, the aim of RMT is to determine the extent to which observed clinical outcome assessment data satisfy the measurement model.⁸ When the data do not fit the model, they are examined to try to explain the misfit. This is the central tenet of the Rasch model and one that distinguishes it from IRT models. Specifically, its defining property is its mathematical embodiment of the principle of invariant comparison. Thus, the comparison of two people is independent of which items are used within a set of items assessing the same concept of interest. In this way, the Rasch model is taken as a criterion for the structure of the responses, rather than simply a statistical description of the responses from patients. This central tenet distinguishes the RMT diagnostic paradigm from the IRT modelling paradigm.⁹
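The principle of invariant comparison can be made concrete with a short derivation. This is a sketch under the assumption of the dichotomous Rasch model; the symbols $\theta_n$ (location of person $n$), $b_i$ (location of item $i$) and $P_{ni}$ (probability that person $n$ endorses item $i$) are notation introduced here for illustration.

```latex
% Dichotomous Rasch model (assumed form)
P_{ni} = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
\quad\Longrightarrow\quad
\ln\frac{P_{ni}}{1 - P_{ni}} = \theta_n - b_i

% Comparing two persons n and m on the same item i
\ln\frac{P_{ni}}{1 - P_{ni}} - \ln\frac{P_{mi}}{1 - P_{mi}}
  = (\theta_n - b_i) - (\theta_m - b_i)
  = \theta_n - \theta_m
```

The item location $b_i$ cancels, so under the model the comparison of the two persons does not depend on which item is used. By symmetry, the comparison of two items on the same person leaves only the difference in item locations.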

In this issue, Horton and Perry provide an example of diagnostic information that can be obtained using RMT methods but is not available from CTT or IRT methods.¹¹ The availability and increased application of RMT psychometric methods for developing and evaluating clinical outcome assessments in psychiatry have important implications for future research and practice. By better understanding the strengths, weaknesses and measurement potential of such assessments, we are able to build an evidence base towards optimising the organisation and delivery of healthcare in psychiatry.¹²

Footnotes

† See original paper, pp. 237–243, this issue.

Declaration of interest

None.

References

1 US Food and Drug Administration. Clinical Outcome Assessment Qualification Program. FDA, 2015. Available at: http://www.fda.gov/Drugs/DevelopmentApprovalProcess/DrugDevelopmentToolsQualificationProgram/ucm284077.htm (accessed 23 November 2015).

2 Kroenke, K, Spitzer, R, Williams, J. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16: 606–13.

3 Thase, M. Translating clinical science into effective therapies. J Clin Psychiatry 2014; 75: e11.

4 Thornicroft, G, Slade, M. New trends in assessing the outcomes of mental health interventions. World Psychiatry 2014; 13: 118–24.

5 Cano, S, Hobart, J. The problem with health measurement. Patient Pref Adher 2011; 5: 279–90.

6 Spearman, CE. The proof and measurement of association between two things. Am J Psychol 1904; 15: 72–101.

7 Hobart, J, Cano, S. Improving the evaluation of therapeutic intervention in MS: the role of new psychometric methods. UK Health Techn Assess Prog (Monograph) 2009; 13: 1–200.

8 Rasch, G. Probabilistic Models for Some Intelligence and Attainment Tests. Danish Institute for Education Research, reprinted: MESA Press, 1993.

9 Andrich, D. Rating scales and Rasch measurement. Expert Rev Pharmacoecon Outcomes Res 2011; 11: 571–5.

10 Lord, FM, Novick, MR. Statistical Theories of Mental Test Scores. Addison-Wesley, 1968.

11 Horton, M, Perry, A. Screening for depression in primary care: a Rasch analysis of the PHQ-9? BJPsych Bull 2016; doi: 10.1192/pb.bp.114.050294.

12 Barbic, S, Kidd, S, Davidson, L, McKenzie, K, O'Connell, M. Validation of the Brief Version of the Recovery Self-Assessment (RSA-B) using Rasch measurement theory. Psychiatr Rehabil J 2015; 38: 349–58.
