Cook RJ, Clauser J. — American Board of Internal Medicine
Presented: Association of Test Publishers, March 2015
The multiple-choice item is a fixture of the standardized testing landscape. This is so, in part, because of the relative ease with which multiple-choice items can be developed, administered, and scored, as well as the information that they provide relative to testing time. Despite these benefits, the multiple-choice item is not without limitations. It is often justly criticized for reducing the cognitive task to one of selection instead of generation, a reduction that can sacrifice fidelity to the construct being assessed. Additionally, its quality depends on subject matter experts' ability to produce compelling distractors. This challenge, when unmet, results in increased costs, as items must be revised or discarded.
The short-answer item, a common alternative to multiple choice, overcomes these limitations while introducing a few of its own. In its favor, the short-answer item allows examinees to generate their own responses, simultaneously maintaining fidelity to the construct and simplifying item development by eliminating the need for distractors. While a short-answer item may prove more challenging to an examinee, it is not for lack of understanding of the item type. The primary challenge comes in scoring. Because the universe of possible responses is not fixed, an item will frequently have multiple variations on a correct response, all of which must be identified in order to properly score the item. In practice, this challenge is met either by having humans manually score the items or by using an automated scoring procedure. Both are sources of measurement error and are expensive to implement.
This session will offer an alternative approach. By predefining a universe of plausible responses and employing simple search algorithms, it is possible to combine the best of both item types into a new one without the limitations of either. Examinees interact with an item by generating text. That text is matched against stored plausible responses, and the matches are then listed as options for the examinee to select. On a geography item, for instance, an examinee could type “guinea” and be presented with selection options of “Guinea”, “Guinea-Bissau”, “Equatorial Guinea”, and “Papua New Guinea”. This approach requires generative processes while allowing for instant, error-free scoring. This session will offer an interactive demonstration of this new item type and how it works.
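The matching step described above might be sketched as follows. This is a minimal illustration in Python, not the presenters' implementation: the function names, the case-insensitive substring rule, the three-character minimum, and the sample response pool are all assumptions made for the example.

```python
def normalize(text: str) -> str:
    """Casefold and collapse whitespace so matching ignores case and spacing."""
    return " ".join(text.casefold().split())


def find_matches(typed: str, responses: list[str], min_chars: int = 3) -> list[str]:
    """Return the stored responses that contain the examinee's typed text.

    The min_chars threshold (an assumed value) keeps very short input
    from exposing a large share of the response pool.
    """
    query = normalize(typed)
    if len(query) < min_chars:
        return []
    return [r for r in responses if query in normalize(r)]


def score(selected: str, key: str) -> int:
    """Score the selected option by exact comparison with the keyed response."""
    return int(selected == key)


# Hypothetical response pool for a geography item.
RESPONSES = [
    "France",
    "Guinea",
    "Guinea-Bissau",
    "Equatorial Guinea",
    "Papua New Guinea",
    "Guyana",
]

options = find_matches("guinea", RESPONSES)
print(options)
print(score("Guinea-Bissau", key="Guinea-Bissau"))
```

Because the examinee ultimately selects one of the stored responses, scoring reduces to an exact comparison with the key, with no need to interpret free text. A real item would likely need a richer matching rule (synonyms, misspellings, ranking of matches), which this sketch does not attempt.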
For more information about this presentation, please contact Research@abim.org.