What's New in 2015
Starting with exams administered in Spring 2015, ABIM is now distributing exam results in a new, electronic format. This new Score Report was redesigned in collaboration with ABIM Board Certified physicians across various specialties. The result is an enhanced, user-friendly report that provides a detailed description of exam performance in a timely manner. To learn more about the features of the updated score report, visit the Transforming ABIM blog or watch this interactive video.
ABIM uses an Automated Test Assembly (ATA) program to build its exams. This program ensures a fair balance of content on each examination form, so that each form reflects the distribution of items specified by the blueprint and other content criteria. ATA also uses statistical criteria to ensure that examination forms are comparable with regard to difficulty, discrimination and other statistical constraints. The examination forms are built with the items that best meet the content and statistical criteria; the Exam Committee chair then reviews and approves each ATA-constructed form before it is administered to candidates.
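The general idea of assembling a form against content and statistical criteria can be sketched as follows. This is only an illustration, not ABIM's ATA program: the item fields (id, area, difficulty), the single target difficulty, and the greedy selection rule are all assumptions made for the example. Production test assembly tools typically solve a constrained optimization problem over many criteria at once.

```python
def assemble_form(items, blueprint, target_difficulty):
    """Pick items for one form (hypothetical greedy sketch).

    items: list of dicts with 'id', 'area' and 'difficulty' keys (assumed fields).
    blueprint: dict mapping content area -> number of items required.
    target_difficulty: difficulty value the form should stay close to (assumed).
    """
    form = []
    for area, needed in blueprint.items():
        # Content criterion: only items from this blueprint area are eligible.
        pool = [item for item in items if item["area"] == area]
        if len(pool) < needed:
            raise ValueError(f"not enough items for content area {area!r}")
        # Statistical criterion: prefer items closest to the target difficulty.
        pool.sort(key=lambda item: abs(item["difficulty"] - target_difficulty))
        form.extend(pool[:needed])
    return form


# Made-up item bank and blueprint, for illustration only.
bank = [
    {"id": "A1", "area": "cardiology", "difficulty": 0.2},
    {"id": "A2", "area": "cardiology", "difficulty": 1.5},
    {"id": "A3", "area": "cardiology", "difficulty": -0.1},
    {"id": "B1", "area": "nephrology", "difficulty": 0.9},
    {"id": "B2", "area": "nephrology", "difficulty": 0.0},
]
form = assemble_form(bank, {"cardiology": 2, "nephrology": 1}, target_difficulty=0.0)
print([item["id"] for item in form])
```

A sketch like this satisfies the blueprint counts exactly, but it does not balance discrimination or keep parallel forms comparable with one another, which the criteria described above also require.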
Standardized Score Scale
Your performance on the entire examination determines your examination pass-fail decision. Overall performance is reported on a standardized score scale ranging from 200 to 800 in increments of 1 point. The mean standardized score for first-time takers of this examination is 500, with a standard deviation of 100. For example, a score of 600 is one standard deviation above the mean of first-time takers. Candidates with equal ability will achieve the same standardized score.
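The relationship between a standardized score and the first-time-taker distribution can be illustrated with a short sketch. It assumes a simple linear transformation of an underlying ability estimate, rounded to whole points and limited to the 200 to 800 range; the function name and its inputs are hypothetical, and ABIM's actual scaling procedure is not described here.

```python
def to_standardized_score(estimate, ref_mean, ref_sd):
    """Map an ability estimate onto a 200-800 scale (assumed linear scaling).

    ref_mean and ref_sd are the mean and standard deviation of first-time
    takers on the same underlying metric (hypothetical inputs).
    """
    z = (estimate - ref_mean) / ref_sd   # distance from the mean in SD units
    scaled = round(500 + 100 * z)        # mean 500, SD 100, 1-point increments
    return max(200, min(800, scaled))    # keep within the reported range


print(to_standardized_score(1.0, ref_mean=0.0, ref_sd=1.0))  # one SD above the mean
```

Under these assumptions, an examinee one standard deviation above the first-time-taker mean receives 600, matching the example in the text.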
Standard Error of Measurement
The standard error of measurement (SEM) is a measure of your score's precision. The smaller the standard error of measurement, the more likely your score is to be reproduced across multiple retakes. For example, an average examinee would have a score of 500 and a standard error of 12 (i.e., 500 ±12). This means that if this examinee were to retake the exam without any additional preparation, the retake score would be expected to fall between 488 and 512. The standard error of measurement does not affect pass/fail decisions. Pass/fail decisions are based on a candidate’s actual test score (e.g., 500), without considering the range implied by the standard error of measurement (e.g., 488 to 512).
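The arithmetic in this example, and the fact that the SEM plays no part in the pass/fail decision, can be shown in a brief sketch. The function names and the passing score of 500 used below are hypothetical and chosen only for illustration.

```python
def score_band(score, sem):
    """Return the range one SEM below and above the reported score."""
    return (score - sem, score + sem)


def passes(score, passing_score):
    """Pass/fail uses only the actual score; the SEM is not considered."""
    return score >= passing_score


low, high = score_band(500, 12)
print(low, high)

# Hypothetical passing score of 500: the decision depends on the score alone,
# not on the lower or upper end of the band.
print(passes(500, passing_score=500))
```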
To pass the examination, your standardized score must equal or exceed the standardized passing score. The passing standard for each ABIM Subspecialty exam is established using standard-setting techniques that follow best practices in the testing industry. The standard for each certification exam is set by the designated ABIM Subspecialty Board or Test Committee. Members of the specialty boards and test committees are nationally recognized specialists whose combined expertise encompasses the breadth of clinical knowledge in the specialty area. Members include both clinical educators and practitioners, incorporating the perspectives of both the training and practice environments. In setting the passing standard, the committee considers several factors, including relevant changes to the knowledge base of the field as well as changes in the characteristics of minimally qualified candidates for certification.
The passing standard for an exam is based on a specified level of mastery of content in the specialty area. Therefore, no predetermined percentage of examinees will pass or fail the exam. The committee sets a content-based standard, using the modified-Angoff method. This evidence-based method asks raters to conceptualize and estimate what a specialist who is just barely qualified to merit certification would be able to do. For each question, the rater is asked, “What is the probability that this type of physician will correctly answer this question?” The raters' judgments are systematically combined to derive the passing standard on the standardized score scale.
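How rater judgments might be combined can be sketched with the textbook form of the Angoff calculation: average the raters' probability estimates for each question, then sum those averages across questions to get the expected number of correct answers for a just-qualified specialist. The ratings below are invented, and the step that converts this raw cut score to the standardized score scale is not shown because it is not described here.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Combine rater judgments into a raw cut score (textbook Angoff sketch).

    ratings[r][q] is rater r's estimate of the probability that a
    just-qualified specialist answers question q correctly.
    """
    n_questions = len(ratings[0])
    if any(len(row) != n_questions for row in ratings):
        raise ValueError("every rater must rate every question")
    # Average across raters for each question, then sum across questions.
    question_means = [mean(row[q] for row in ratings) for q in range(n_questions)]
    return sum(question_means)


# Two hypothetical raters, three hypothetical questions.
example_ratings = [
    [0.6, 0.8, 0.5],
    [0.7, 0.9, 0.4],
]
cut = angoff_cut_score(example_ratings)
print(round(cut, 2))  # expected number correct out of 3 questions
```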
Following best practices in the testing industry, standards are periodically reviewed for appropriateness and may be adjusted. If the committee determines that the current standard is no longer appropriate, based on its judgment of the cognitive expertise essential for certification, it will set a new standard using the process described above. This new standard will then be periodically reviewed to ensure its continuing appropriateness.
Reference Group Information
The ABIM score report includes reference group information to help physicians interpret their results. The reference group is defined as the group of first-time examinees who completed a similar exam during the current or a previous administration. Typically, the reference group on the ABIM score reports includes first-time takers of the exam from the current administration and up to two previous administrations.
The rationale for including a reference group is to provide stability when making comparisons with the performance of other examinees. Since the number of first-time takers completing the exam during a given administration may be small, the score report uses a reference group composed of first-time takers from multiple administrations so that your performance is compared with a more representative cohort. For specific pass rate information, please visit the Exam Pass Rate section.
Content Area Subscores
Content area scores provide feedback on your relative strengths and weaknesses in the content domains of your specialty. They are reported in standard deviation units and are on a different scale than your overall score. Therefore, these scores cannot be compared directly to your overall score.
Content area scores are calculated from fewer questions than the overall score, so they are less exact and less reproducible. As a result, the standard error of measurement (SEM) for the medical content areas is much larger than that of the overall test score, which limits the degree to which you can generalize from your performance on a content area to your specific strengths and weaknesses. For these reasons, you should be cautious in interpreting the content area scores that appear in your report.
Because each content area has fewer questions, the classical percent correct method is not used to report content area performance. Instead, content area scores are calculated using the Empirical Bayes (EB) method. This method is consistent with the procedures currently in place for estimating overall scores.
The EB method yields more reliable scores than the classical percent correct method. It incorporates ancillary information to enhance score precision, something that the percent correct method does not do. This has the effect of making each candidate's content area score profile more homogeneous and less susceptible to irregularities associated with small numbers of items. In addition, the scaled scores resulting from the EB procedure are not as test- and sample-dependent as percent correct scores and deciles. Percent correct scores depend on the specific items that were administered, and deciles depend on the group that took the exam. Although the EB procedures do rely on subscore reliability estimates and inter-subdomain correlation estimates (which depend, in part, on the exam and the administration group), the EB scores are not as test- or sample-dependent as the classical methods.
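The stabilizing effect described above can be illustrated with a simplified, single-subscore form of shrinkage, in which an observed content area score is pulled toward the group mean in proportion to how unreliable it is. This is only a sketch under stated assumptions: the function name and input values are hypothetical, and the procedure described in the text also uses correlations between content areas, which this version leaves out.

```python
def shrink_subscore(observed, group_mean, reliability):
    """Pull an observed subscore toward the group mean (simplified sketch).

    reliability is the estimated reliability of the subscore, between 0 and 1.
    A value of 1 leaves the observed score unchanged; a value of 0 replaces
    it with the group mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical content area score of 2.0 SD units, group mean 0.0,
# and a subscore reliability of 0.5.
print(shrink_subscore(2.0, group_mean=0.0, reliability=0.5))
```

Because short content areas tend to have low reliability, their scores are pulled further toward the mean, which is one way a score profile becomes less sensitive to a handful of items.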
An examination blueprint is a table of specifications that defines the content of each exam. It is developed by the Exam Committee and reviewed annually. The examination blueprint is based on analyses of current practices and an understanding of the relative importance of the clinical problems in the specialty area. The exam blueprints at the primary level for Internal Medicine and each subspecialty are published on the ABIM website. Select an exam blueprint for your specialty.