The four-parameter logistic model (4PLM) assumes that even high-ability examinees can make mistakes (e.g. due to carelessness). This phenomenon is reflected by an upper asymptote (d-parameter) of the IRT logistic curve that lies below one. Research on the 4PLM has been hampered because the model has been considered conceptually and computationally complicated, and its usefulness has been questioned. After 25 years, following the introduction of appropriate software, the psychometric characteristics of the 4PLM and the model’s usefulness can be assessed more reliably. The aim of this article is to show whether the 4PLM can be used to detect item-writing flaws (which introduce construct-irrelevant variance into the measurement). The analysis was conducted in two steps: (a) qualitative: an assessment of whether the items comply with the chosen item-writing guidelines; (b) quantitative: fitting the 4PLM and comparing the results with the qualitative analysis to determine whether the same items were detected as flawed. Other IRT models (3PLM and 2PLM) were also fitted to check the validity of the results. Flawed items can be detected by means of qualitative analysis as well as by the 4PLM and simpler IRT models. The 4PLM is discussed from the perspective of practical use in educational research.
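
As an illustration of the role of the d-parameter, the following minimal Python sketch evaluates the standard 4PLM item response function, P(θ) = c + (d − c) / (1 + exp(−a(θ − b))). The function name `irf_4pl` and the item parameter values are hypothetical and chosen only for illustration; they are not taken from the analysis reported in the article.

```python
import math


def irf_4pl(theta, a, b, c, d):
    """Probability of a correct response under the 4PLM.

    theta: examinee ability
    a: discrimination, b: difficulty
    c: lower asymptote (guessing), d: upper asymptote (carelessness)
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, for illustration only.
item = {"a": 1.2, "b": 0.0, "c": 0.2, "d": 0.95}

# Even for very able examinees the probability stays below d.
for theta in (-3.0, 0.0, 3.0, 6.0):
    print(theta, round(irf_4pl(theta, **item), 3))
```

Setting d = 1 reduces the function to the 3PLM, and additionally setting c = 0 gives the 2PLM, which is why the simpler models can be fitted alongside the 4PLM for comparison.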

Karolina Świst, Instytut Badań Edukacyjnych


