Jae-Chun Ban, ACT, Inc.
Bradley A. Hanson, ACT, Inc.
Qing Yi, ACT, Inc.
Deborah J. Harris, ACT, Inc.
Paper presented at the Annual Meeting of the American Educational Research Association (Seattle, April 2001)
Revised: April 17, 2001
Abstract: This study compared three online pretest item calibration/scaling methods in terms of item parameter recovery when responses to the pretest items in the pool are sparse: marginal maximum likelihood estimation with one EM cycle (OEM), marginal maximum likelihood estimation with multiple EM cycles (MEM), and Stocking's Method B. Simulated computerized adaptive tests (CATs) were used to evaluate the three methods. The MEM method produced the smallest average total error in recovering the 240 pretest item characteristic curves, and Stocking's Method B produced the second smallest. The OEM method yielded a large average total error. In terms of scale maintenance, the MEM method and Stocking's Method B kept the pretest item parameter estimates on the same scale as the true parameters, whereas the estimates from the OEM method deviated from that scale.