Abstract
Essay assessment in economics education plays an important role in measuring students' conceptual understanding and higher-order thinking skills, but it is often hampered by subjectivity and by teachers' workload. This study analyzes the reliability of, and level of agreement between, teacher assessments and a ChatGPT-based automatic essay grading system (EsyGrade). The study uses a quantitative approach with a comparative reliability design involving 60 high school students in grades 10 and 11. Data were analyzed using the Intraclass Correlation Coefficient (ICC), Pearson's correlation, and a paired-samples t-test. The results show that EsyGrade has a very high level of reliability and is consistent with teacher assessments, both for total scores and for each dimension of essay assessment. These findings indicate that EsyGrade has the potential to be a reliable and objective essay assessment support tool in economics learning.
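The three statistics named in the abstract can be reproduced on paired teacher/AI scores. The sketch below is illustrative only: the data are simulated (the study's actual 60-student scores are not given here), and the ICC variant is assumed to be ICC(2,1), a two-way random-effects, absolute-agreement model commonly used for inter-rater reliability.

```python
import numpy as np
from scipy import stats

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: (n_subjects, k_raters) array of essay scores.
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)   # per-student means
    col_means = scores.mean(axis=0)   # per-rater means
    # Mean squares from the two-way ANOVA decomposition
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between subjects
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between raters
    sse = ((scores - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))                        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Simulated stand-in for 60 students scored by a teacher and by the AI grader
rng = np.random.default_rng(0)
teacher = rng.uniform(60, 95, size=60).round()
ai = np.clip(teacher + rng.normal(0, 2, size=60), 0, 100).round()

icc = icc_2_1(np.column_stack([teacher, ai]))  # agreement between the two raters
r, _ = stats.pearsonr(teacher, ai)             # linear association of scores
t, p = stats.ttest_rel(teacher, ai)            # test for a systematic score gap
```

With simulated data this close, ICC and r both land near 1, and a non-significant paired t-test would indicate no systematic difference between teacher and AI scoring, the pattern the abstract reports.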
Pages 81–86
Recommended Citation
Pratama, Ramadzan Defitri; Sangka, Khresna Bayu; and Indrawati, Cicilia Dyah Sulistyaningrum (2026) "Reliability of ChatGPT-Based Essay Scoring: A Teacher–AI Comparison in Economics Education," Jurnal Pendidikan: Teori, Penelitian, dan Pengembangan: Vol. 11, No. 3, Article 1.
DOI: https://doi.org/10.17977/2502-471X.1183
Available at: https://citeus.um.ac.id/jptpp/vol11/iss3/1
Supplementary file: The EsyGrade System Description & Data Set.pdf
Included in: Curriculum and Instruction Commons; Educational Assessment, Evaluation, and Research Commons; Educational Technology Commons; Education Economics Commons; Teacher Education and Professional Development Commons
