The Use of UGM’s eLOK as a Student Learning Outcomes Evaluation Platform

Ramadhan Dwi Marvianto, Haryanti Mustika, Sri Suning Kusumawardani
(Submitted 14 November 2022)
(Published 31 May 2024)


Online testing platforms have become a necessity, especially in academic contexts. eLOK, the platform used at UGM, provides a testing facility that enables lecturers to evaluate both the test and its items directly from the platform. However, no studies have yet compared the evaluation results produced by UGM’s eLOK with other approaches. This study therefore compared evaluation results from UGM’s eLOK with those of the Classical Test Theory (CTT) approach and a two-parameter logistic Item Response Theory (IRT 2-PL) approach using the Graded Response Model (GRM). The sample comprised 22 active students who took a test on UGM’s eLOK platform in the Multivariate Statistics course during the even semester of the 2020/2021 academic year. The analysis showed that the evaluation produced by UGM’s eLOK was closely equivalent to the CTT approach, although the parameter values differed slightly. The IRT results, in contrast, differed substantially from those of the other two methods, but they only weakly reflect the true parameters because of the small sample size. These findings can serve as a reference for using UGM’s eLOK as a student academic testing platform on which lecturers can evaluate the quality of the tests and items they administer.
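The two CTT item statistics at the heart of the comparison are item difficulty (the proportion of test-takers answering an item correctly) and item discrimination (the corrected item-total correlation). The sketch below illustrates both on a hypothetical binary response matrix; it is not the authors' code (the study's references indicate the analysis was done in R with the CTT and mirt packages), and the data are invented for illustration only.

```python
def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical response matrix: rows = students, columns = items,
# 1 = correct, 0 = incorrect. Invented data, not the study's.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]

n_items = len(responses[0])
totals = [sum(row) for row in responses]

# CTT item difficulty: proportion of test-takers answering correctly.
difficulty = [sum(row[j] for row in responses) / len(responses)
              for j in range(n_items)]

# CTT item discrimination: corrected item-total correlation, i.e. each
# item correlated with the total score excluding that item.
discrimination = [
    pearson([row[j] for row in responses],
            [t - row[j] for t, row in zip(totals, responses)])
    for j in range(n_items)
]

print(difficulty)  # → [0.8, 0.8, 0.4, 0.6]
```

With only 22 examinees, statistics like these are stable enough for CTT-style screening, whereas IRT parameter estimation generally needs far larger samples, which is consistent with the divergent IRT results the abstract reports.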


Keywords: educational technology; UGM’s eLOK; psychology; psychometrics

DOI: 10.22146/gamajop.79149

Copyright (c) 2024 Gadjah Mada Journal of Psychology (GamaJoP)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.