Measurement precision at the cut score in medical multiple choice exams: Theory matters


Perspect Med Educ https://doi.org/10.1007/s40037-020-00586-0

Felicitas-Maria Lahner · Stefan Schauber · Andrea Carolin Lörwald · Roger Kropf · Sissel Guttormsen · Martin R. Fischer · Sören Huwendiek

© The Author(s) 2020

Abstract

Introduction In high-stakes assessment, the measurement precision of pass-fail decisions is of great importance. A concept for analyzing the measurement precision at the cut score is conditional reliability, which describes measurement precision for every score achieved in an exam. We compared conditional reliabilities in Classical Test Theory (CTT) and Item Response Theory (IRT) with a special focus on the cut score and potential factors influencing conditional reliability at the cut score.

Methods We analyzed 32 multiple-choice exams from three Swiss medical schools, comparing conditional reliability at the cut score in IRT and CTT. Additionally, we used multiple regression to analyze potential influencing factors such as the range of examinees’ performance, year of study, and number of items.

F.-M. Lahner · A. C. Lörwald · S. Guttormsen · S. Huwendiek
Institute for Medical Education, University of Bern, Bern, Switzerland

F.-M. Lahner
Department of Health Professions, University of Applied Sciences, Bern, Switzerland

S. Schauber
Centre for Educational Measurement at the University of Oslo (CEMO) and Centre for Health Sciences Education, University of Oslo, Oslo, Norway

R. Kropf
Faculty of Medicine, University of Zurich, Zurich, Switzerland

M. R. Fischer
Institute for Medical Education, University Hospital, LMU Munich, Munich, Germany

Results In CTT, conditional reliability was highest for very low and very high scores, whereas examinees with medium scores showed low conditional reliabilities. In IRT, the maximum conditional reliability was in the middle of the scale. Therefore, conditional reliability at the cut score was significantly higher in IRT compared with CTT. It was influenced by the range of examinees’ performance and number of items. This influence was more pronounced in CTT.

Discussion We found that, depending on the theory used, conditional reliability shows inverse distributions and leads to opposite conclusions regarding the measurement precision at the cut score. As the use of IRT seems to be more appropriate for criterion-oriented standard setting in the framework of competency-based medical education, our findings might have practical implications for the design and quality assurance of medical education assessments.

Keywords Multiple choice exams · Measurement precision · Reliability · Conditional reliability
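
The inverse pattern described in the Results can be illustrated with a minimal sketch. It is not the authors' analysis: the text above does not state which estimators were used, so the sketch assumes Lord's binomial error model for the CTT conditional standard error of measurement (SEM) and a Rasch model for the IRT test information. The exam length (`N_ITEMS`) and the evenly spaced item difficulties (`DIFFICULTIES`) are hypothetical values chosen only for illustration. A smaller conditional SEM corresponds to higher conditional precision on the respective scale (number-correct score in CTT, latent ability theta in IRT).

```python
import math

# Hypothetical exam: 100 items with Rasch difficulties spread evenly over [-2, 2].
N_ITEMS = 100
DIFFICULTIES = [-2.0 + 4.0 * i / (N_ITEMS - 1) for i in range(N_ITEMS)]


def ctt_binomial_sem(score: int, n_items: int) -> float:
    """Conditional SEM of a number-correct score under Lord's binomial error model."""
    return math.sqrt(score * (n_items - score) / (n_items - 1))


def rasch_information(theta: float, difficulties: list[float]) -> float:
    """Test information at ability theta under the Rasch model: sum of p * (1 - p)."""
    total = 0.0
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))  # probability of a correct response
        total += p * (1.0 - p)
    return total


def rasch_sem(theta: float, difficulties: list[float]) -> float:
    """Conditional SEM of the ability estimate: 1 / sqrt(test information)."""
    return 1.0 / math.sqrt(rasch_information(theta, difficulties))


# CTT: the raw-score SEM is largest for medium scores and shrinks toward the extremes.
for score in (10, 50, 90):
    print(f"CTT  score={score:3d}  SEM={ctt_binomial_sem(score, N_ITEMS):.3f}")

# IRT: the theta SEM is smallest where the items are centred and grows toward the extremes.
for theta in (-3.0, 0.0, 3.0):
    print(f"IRT  theta={theta:+.1f}  SEM={rasch_sem(theta, DIFFICULTIES):.3f}")
```

Under these assumptions, a cut score near the middle of the scale falls where the CTT raw-score SEM is largest and where the IRT theta SEM is smallest, which mirrors the direction of the reported finding. The two SEMs are expressed on different scales and should not be compared numerically.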

Introduction This study examines the nature of measurement precision at the cut score as estimated according to Classical Test Theory (CTT) and Item Response Theory (IRT). In the following, we will begin by describing why it is important to determine the measurement precision at the cut score, and by introducing the concept of conditional reliability and its ma
