Item analysis in a psycholinguistics course based on classical test theory using ITEMAN 4.0.2

Authors

  • Siska Adinda Prabowo Putri, Universitas AKI
  • Lucy Hariadi, Universitas AKI
  • Amin Khudlori, Universitas AKI

DOI:

https://doi.org/10.53873/culture.v12i2.779

Keywords:

Item Analysis, Classical Test Theory, ITEMAN 4.0.2, Psycholinguistics, Reliability

Abstract

This study analyzes the quality of test items in a psycholinguistics course using Classical Test Theory (CTT). The data consisted of 30 students’ responses to 20 multiple-choice items, which were analyzed against four indicators: difficulty level, discrimination power, item-total correlation, and instrument reliability. The results showed that most items fell in the moderate difficulty category (p = 0.46–0.56), with one item categorized as easy (p = 0.73) and two as difficult (p = 0.26). The discrimination power of the majority of items was in the very good category (87.5%–100%), while three items showed lower discrimination power and required revision. Item-total correlations were generally very high (r = 0.88–0.99), indicating consistency among items, but several items with lower correlations (r < 0.70) suggested possible wording inaccuracies or content inconsistencies. The test’s reliability reached 0.99, indicating very high internal consistency, although this value was inflated by the rather extreme difference in response patterns between the upper and lower groups. Overall, the instrument was judged good, but several items need revision, particularly of their distractors, difficulty level, and item functionality, to ensure more accurate and representative learning evaluation.

Published

2025-11-30