Item analysis in a psycholinguistics course based on classical test theory using ITEMAN 4.0.2
DOI: https://doi.org/10.53873/culture.v12i2.779

Keywords: Item Analysis, Classical Test Theory, ITEMAN 4.0.2, Psycholinguistics, Reliability

Abstract
This study analyzes the quality of test items in a psycholinguistics course using Classical Test Theory. The data consisted of 30 students' responses to 20 multiple-choice items, examined against four indicators: difficulty level, discrimination power, item-total correlation, and instrument reliability. Most items fell into the moderate difficulty category (p = 0.46–0.56), with one item categorized as easy (p = 0.73) and two as difficult (p = 0.26). The discrimination power of the majority of items was in the very good category (87.5%–100%), while three items showed lower discrimination and required revision. Item-total correlations were generally very high (r = 0.88–0.99), indicating consistency among items, but several items with lower correlations (r < 0.70) suggested possible wording inaccuracies or content inconsistencies. The test's reliability reached 0.99, indicating very high internal consistency, although this value was influenced by the rather extreme contrast in response patterns between the upper and lower groups. Overall, the instrument was judged to be good, but several items need revision, particularly their distractors, difficulty level, and item functionality, to ensure more accurate and representative learning evaluation.
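The four indices reported in the abstract can all be computed directly from a 0/1 response matrix. The sketch below is a minimal plain-Python illustration (not the study's ITEMAN output; the response matrix is invented) of one conventional formulation: proportion-correct difficulty, upper/lower 27% discrimination, point-biserial item-total correlation, and KR-20 reliability for dichotomous items.

```python
def item_analysis(responses):
    """responses: list of lists; rows = examinees, columns = items, scored 1/0."""
    n = len(responses)          # number of examinees
    k = len(responses[0])       # number of items
    totals = [sum(row) for row in responses]

    # Difficulty index p: proportion of examinees answering each item correctly.
    p = [sum(row[j] for row in responses) / n for j in range(k)]

    # Discrimination index d: upper-group minus lower-group proportion correct,
    # using the top and bottom 27% of examinees ranked by total score.
    order = sorted(range(n), key=lambda i: totals[i])
    g = max(1, round(0.27 * n))
    lower, upper = order[:g], order[-g:]
    d = [sum(responses[i][j] for i in upper) / g
         - sum(responses[i][j] for i in lower) / g for j in range(k)]

    # Point-biserial item-total correlation r.
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    r = []
    for j in range(k):
        col = [row[j] for row in responses]
        mean_j = sum(col) / n
        cov = sum((col[i] - mean_j) * (totals[i] - mean_t) for i in range(n)) / n
        var_j = mean_j * (1 - mean_j)
        r.append(cov / (var_j * var_t) ** 0.5 if var_j * var_t > 0 else 0.0)

    # KR-20 reliability (Kuder-Richardson formula 20) for dichotomous items.
    kr20 = (k / (k - 1)) * (1 - sum(pj * (1 - pj) for pj in p) / var_t)
    return p, d, r, kr20

# Illustrative run on a tiny 4-examinee x 3-item matrix:
p, d, r, kr20 = item_analysis([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])
```

Note that with very small or sharply polarized samples (such as the 30-response data here), KR-20 and the point-biserial coefficients are sensitive to extreme upper/lower group differences, which is consistent with the abstract's caveat about the 0.99 reliability estimate.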
License
Copyright (c) 2025 Siska Adinda Prabowo Putri, Lucy Hariadi, Amin Khudlori

This work is licensed under a Creative Commons Attribution 4.0 International License.
