Using multilevel item response theory to explain differential item functioning on a medical maintenance of certification exam.


Bashkov BM, Cubbellotti S, Zhang Y. — American Board of Internal Medicine

Presented: American Educational Research Association Annual Meeting, April 2016

Abstract: Given the increasing number of internationally trained physicians entering U.S. residency and fellowship programs, it is important to examine whether the items on board certification exams are differentially more difficult for this group of examinees and, if so, which item and person characteristics help explain the difference. In this study, we used a series of multilevel item response theory models to examine the effects of item characteristics and person-level covariates on item difficulty for U.S. and international medical school graduates (USMGs and IMGs) on a Maintenance of Certification (MOC) exam in internal medicine. Although neither group of examinees was disadvantaged overall, for research purposes we examined several additional models to gain insight into which item and person characteristics make some items differentially more difficult for IMGs than for USMGs (or vice versa). Across all examinees, items containing images and judgment-based items were relatively easier, whereas items requiring synthesis of information and longer items were more difficult. IMGs and USMGs were affected differently, and diplomates' age and testing time moderated some of these effects. Image and judgment items were slightly more difficult for IMGs, although this effect was not statistically significant. As examinees' age increased, synthesis items became easier for USMGs, whereas longer items became easier for IMGs. Finally, controlling for total testing time, synthesis items were more difficult for IMGs, and the negative effect of item length was exacerbated for USMGs who took longer to complete the test.
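For readers unfamiliar with this family of models, the general idea can be sketched as a Rasch-type model in which item difficulty is a linear function of item characteristics, with extra interaction terms for one examinee group. This is only an illustrative sketch and not the authors' specification: the function names, the feature names, and every coefficient value below are hypothetical and are not taken from the study.

```python
import math

def item_difficulty(features, main_effects, img_interactions, is_img):
    """Difficulty as a weighted sum of item features.

    For IMG examinees, group-by-feature interaction terms are added,
    which is one simple way to represent differential item functioning.
    """
    difficulty = 0.0
    for name, value in features.items():
        difficulty += main_effects.get(name, 0.0) * value
        if is_img:
            difficulty += img_interactions.get(name, 0.0) * value
    return difficulty

def p_correct(theta, difficulty):
    """Rasch-type probability of a correct response for ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Hypothetical coefficients chosen for illustration only (not study estimates).
MAIN_EFFECTS = {"image": -0.3, "synthesis": 0.4}
IMG_INTERACTIONS = {"synthesis": 0.2}

# A hypothetical synthesis item without an image, answered by two
# examinees of equal ability (theta = 0), one from each group.
item = {"image": 0, "synthesis": 1}
p_usmg = p_correct(0.0, item_difficulty(item, MAIN_EFFECTS, IMG_INTERACTIONS, False))
p_img = p_correct(0.0, item_difficulty(item, MAIN_EFFECTS, IMG_INTERACTIONS, True))
print(f"USMG: {p_usmg:.3f}  IMG: {p_img:.3f}")
```

In a full multilevel analysis these coefficients would be estimated from response data, with examinee ability treated as a random effect and person-level covariates such as age and testing time entering as further interaction terms.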

For more information about this presentation, please contact Research@abim.org.