Automated Essay Grading Software Sustainability in Assessment: A Critical Review for Quality Feedback and Stakeholders' Involvement
DOI: https://doi.org/10.71291/jocatia.v2i.33

Keywords: automated essay grading, assessment, software development, artificial intelligence, stakeholders' involvement

Abstract
This paper presents a critical review of the literature on automated essay grading software and system development procedures through the lens of technology in assessment. Various techniques and methodologies used in essay grading software are identified, as well as different software tools that are valid and reliable in scoring both short-answer and extended essay test items, which stakeholders can leverage for cost-effectiveness, scoring consistency, objectivity, timely result delivery, and quick feedback. The software development stages required for building an automated scoring system are discussed. The state of the art in AES software is also examined, covering systems that require training on manually marked essays and those that do not, along with the advantages automated essay scoring exhibits over human scoring and the criticisms levelled against it. The evaluation metrics for validating an automated essay grading system against human raters are likewise identified. The review concludes that, with the development of artificial intelligence, reliable and valid scoring of short-answer and extended essays is viable and realisable, offering prompt feedback, reduced cost and time wastage, and a degree of objectivity and fairness in scoring that human expert scoring may not achieve. Finally, it is recommended that more automated essay grading software be developed and explored that does not require training with manually marked essays and can mark essays across different subjects.
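The abstract mentions evaluation metrics for validating automated scores against human raters. A metric commonly used for this purpose in the AES literature is quadratic weighted kappa (QWK), which measures chance-corrected agreement between two raters on an ordinal score scale. The sketch below is purely illustrative and is not drawn from the reviewed paper itself; the function name and score ranges are assumptions for the example.

```python
# Quadratic weighted kappa (QWK): agreement between human and machine
# scores on an ordinal scale, penalising large disagreements more
# heavily than small ones. Illustrative sketch only.

from collections import Counter

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Chance-corrected agreement between two score vectors."""
    assert len(human) == len(machine)
    n = max_score - min_score + 1
    total = len(human)

    observed = Counter(zip(human, machine))  # joint score-pair frequencies
    h_counts = Counter(human)                # human marginal distribution
    m_counts = Counter(machine)              # machine marginal distribution

    numerator = 0.0
    denominator = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = ((i - j) ** 2) / ((n - 1) ** 2)   # quadratic penalty
            expected = h_counts[i] * m_counts[j] / total  # chance frequency
            numerator += weight * observed[(i, j)]
            denominator += weight * expected
    return 1.0 - numerator / denominator

# Identical score vectors give perfect agreement (QWK = 1.0);
# chance-level agreement gives QWK near 0.
print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4))
```

A value of 1.0 indicates perfect agreement, 0 indicates agreement no better than chance, and studies in this literature often treat QWK comparable to human-human agreement as evidence of a valid automated scorer.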
References
Abdeljaber, H. A. (2021). Automatic Arabic short answers scoring using longest common subsequence and Arabic WordNet. IEEE Access, 9, 76433-76445.
Ade-Ibijola, A. O., Wakama, I., & Amadi, J. C. (2012). An Expert System for Automated Essay Scoring (AES) in Computing using Shallow NLP Techniques for Inferencing. International Journal of Computer Applications, 51(10), 37-45.
Al-Saqqa, S., Sawalha, S., & AbdelNabi, H. (2020). Agile software development: Methodologies and trends. International Journal of Interactive Mobile Technologies, 14(11), 246-269.
Anhar, F., Farookh, K. H., & Tharam, D. (2013). An intelligent approach for automatically grading spelling in essays using rubric-based scoring. Journal of Computer and System Sciences. https://doi.org/10.1016/j.jcss.2013.01.021
Anhar, F. (2013). A Robust Methodology for Automated Essay Grading (Doctoral dissertation, Curtin University).
Attali, Y., & Burstein, J. (2006). Automated Essay Scoring with e-rater® V.2. Journal of Technology, Learning, and Assessment, 4(3), 1-22.
Australian Curriculum Assessment and Reporting Authority (ACARA). (2015). An Evaluation of Automated Scoring of NAPLAN Persuasive Writing. 1-16. Retrieved from http://www.acara.edu.au/assessment/reseach.html
Balle, A. R., Oliveira, M., Curado, C., & Nodari, F. (2018). How do knowledge cycles happen in software development methodologies? Industrial and Commercial Training, 50(7/8), 380-392.
Bhuvaneswari, T., & Prabaharan, S. (2013). A survey on software development life cycle modes. International Journal of Computer Science and Mobile Computing, 2(5), 262-267.
Blood, I. (2017). Automated Essay Scoring: A Literature Review. Working Papers in Applied Linguistics & TESOL, 17(2), 40-64. Retrieved July 2022, from http://www.tc.columbia.edu/tesolalwbjournal
Bridgeman, B., Trapani, C., & Attali, Y. (2009). Considering fairness and validity in evaluating automated scoring. Paper presented at the Annual Meeting of the National Council on Measurement in Education, San Diego, CA.
Choudhary, B., & Rakesh, S. K. (2016). An approach using agile method for software development. 2016 International Conference on Innovation and Challenges in Cyber Security (ICICCS-INBUSH) (pp. 155-158). IEEE. https://doi.org/10.1109/iciccs.2016.7542304
Christie, J. R. (1999). Automated essay marking-for both style and content. In M. Danson (Ed.), Proceedings of the Third Annual Computer Assisted Assessment Conference, Loughborough University, Loughborough, UK.
Csapó, B., Ainley, J., Bennett, R. E., Latour, T. and Law, N. (2012). Technological Issues for Computer-Based Assessment. Assessment and Teaching of 21st century skills, 143-230.
Darwish, S. M., & Mohamed, S. K. (2020). Automated essay evaluation based on fusion of fuzzy ontology and latent semantic analysis. In Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R. F., & Tolba, M. (Eds.), The International Conference on Advanced Machine Learning Technologies and Applications.
Davenport, T. H. & Ronanki, R. (2018). Artificial intelligence for the real world, Harvard Business Review 96 (1) 108–116.
Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning, and Assessment 5(1):1-36.
Duwairi, R. M. (2006). A framework for the computerized assessment of university student essays. Computers in Human Behavior, 381-388. https://doi.org/10.1016/j.chb.2004.09.006
Dwivedi, Y., Hughes, K. L., Ismagilova, E., Aarts, G., Coombs, C., Crick, T. (2019). Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy, International Journal of Information Management. 101994.
Fiseha, M. G., Adeel, H. S., Muhammad I. K., & Basim, A. K. (2020). Challenges of remote assessment in higher education in the context of COVID-19: a case study of Middle East College. Educational Assessment, Evaluation and Accountability 32:519–535 https://doi.org/10.1007/s11092-020-09340-w
Hearst, M. (2000). The debate on automated essay grading. IEEE Intelligent Systems, 15(5), 22-37. IEEE CS Press.
Helingo, M., Purwandari, B., Satria, R., & Solichah, I. (2017). The Use of Analytic Hierarchy Process for Software Development Method Selection: A Perspective of e-Government in Indonesia. 4th Information Systems International Conference 2017 (ISICO 2017), Bali, Indonesia. Procedia Computer Science, 124, 405-414.
Huitt, W. (2011). Bloom et al.'s taxonomy of the cognitive domain. Educational psychology interactive, 22.
Hughes, L. J., Johnston, A. N., & Mitchell, M. L. (2018). Human influences impacting assessors' experiences of marginal student performances in clinical courses. Collegian, 25(5), 541-547.
Kakkonen, T., Myller, N., Sutinen, E., & Timonen, J. (2008). Comparison of Dimension Reduction Methods for Automated Essay Grading. Educational Technology & Society, 11(3), 2 288.
Kumaran, V. S., & Sankar, A. (2015). Towards an automated system for short-answer assessment using ontology mapping. International Arab Journal of e-Technology, 4(1), 17-24. Retrieved July 2017 from http://www.iajet.org/iajet_files/vol.4/no.1/3.pdf
Landauer, T. K., Laham, D., & Foltz, P. W. (2003). Automated scoring and annotation of essays with the Intelligent Essay Assessor. Automated Essay Scoring: A Cross-Disciplinary Perspective, 87–112.
Larman, C., & Basili, V. R. (2003). Iterative and incremental developments. A brief history. Computer, 36(6), 47-56.
LaVoie, N., Cianciolo, A., Martin, J. (2015). Automated assessment of diagnostic skill. Poster presented at the CGEA CGSA COSR conference, Columbus, OH.
Leacock, C., & Chodorow, M. (2004). Scoring free-responses automatically: A case study of a large-scale assessment.
Mahana, M., John, M., & Apte, A. (2012). Automated Essay Grading Using Machine Learning. California: Stanford University. Retrieved from http://cs229.stanford.edu/proj2012/AhanajohnsApteAutomatedEssayGradingUsin MachneLearning.pdf
Mikalef, P & Gupta, M. (2021). Artificial intelligence capability: Conceptualization, measurement calibration, and empirical study on its impact on organizational creativity and firm performance. Elsevier Journal of Information and Management. https://doi.org/10.1016/j.im.2021.103434
Mitchell, T., Russel, T., Broomhead, P., & Aldridge, N. (2002). Toward Robust Computerized Marking of Free-Text Responses. Proceedings of the 6th CAA Conference, pp. 231-249. Loughborough University. Retrieved from https://dspace.lboro.ac.uk/2134/1884
Nikitas, N. K. (2010). Computer Assisted Assessment (CAA) of Free-Text: Literature Review and the Specification of an Alternative CAA System. pp. 116-118.
Noelle, L., James, P., Peter, J. L., Sharon, A., & Robert, N. K. (2020). Using Latent Semantic Analysis to Score Short Answer Constructed Responses: Automated Scoring of the Consequences Test. Educational and Psychological Measurement, Vol. 80(2) 399–414.
Olaoye, D. D. (2024). Automated extended essay scoring of Senior School Certificate Examination economics items in South-West Nigeria, using semantic contextual similarity. Unpublished PhD thesis, University of Ilorin.
Øistein, E. A., Watson, R., Zheng, Y., & Cheung, F. K. Y. (2021). Benefits of alternative evaluation methods for Automated Essay Scoring. In Proceedings of the 14th International Conference on Educational Data Mining (EDM '21), June 29-July 2, 2021, Paris, France (pp. 856-864). International Educational Data Mining Society. https://educationaldatamining.org/edm2021/
Page, E. B. (2003). Project Essay Grade. In M. D. Shermis & J. C. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 43-54). Mahwah, NJ: Lawrence Erlbaum Associates.
Pearson Education, Inc. (2012). Intelligent Essay Assessor™ (IEA) fact sheet. Retrieved from http://kt.pearsonassessments.com/download/IEA-FactSheet-20100401.pdf
Perkusich, M., Soares, G., Almeida, H., & Perkusich, A. (2015). A procedure to detect problems of processes in software development projects using Bayesian networks. Expert Systems with Applications, 42(1), 437-450.
Ramachandran, L., Cheng, J., & Foltz, P. (2015). Identifying patterns for short answer scoring using graph-based lexico-semantic text matching. In Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 97–106).
Ramalingam, V. V., Pandian, A., Chatry, P., & Nigam, H. (2018). Automated Essay Grading Using Machine Learning Algorithm. Journal of Physics: Conference Series. https://doi.org/10.1088/1742-6596/1000/1/012030
Ramesh, D., & Sanampudi, S. K. (2021). An automated essay scoring system: A systematic literature review. Artificial Intelligence Review, 55, 2495-2527. https://doi.org/10.1007/s10462-021-10068-2
Ramnarain-Seetohul, V., Bassoo, V. & Rosunally, Y. (2022). Similarity measures in automated essay scoring systems: A ten-year review. Education Information Technologies https://doi.org/10.1007/s10639-021-10838-z
Rudner, L. M., & Liang, T. (2002). Automated essay scoring using Bayes’ Theorem. The Journal of Technology, Learning and Assessment, 1(2), 3-21.
Ruparelia, N. B. (2010). Software development lifecycle models. ACM SIGSOFT Software Engineering Notes. Hewlett-Packard Enterprise Services, 35(3), 8-13. https://doi.org/10.1145/1764810.1764814
Salim, Y., Stevanus, V., Barlian, E., Sari, A. C., & Suhartono, D. (2019). Automated English Digital Essay Grader Using Machine Learning. In 2019 IEEE International Conference on Engineering, Technology and Education (TALE) (pp. 1–6). IEEE.
Shermis, M. D. (2014). State-of-the-art automated essay scoring: Competition, results and future directions from a United States demonstration. Assessing Writing, 20, 53-76. Retrieved November 2021 from https://assets.documentcloud.org/documents/1094637/shermis-aw fina/.pdf
Siddiqi, R., & Harrison, J. (2006). On the Automated Assessment: Short-Free Responses. 1-11. Retrieved November 3, 2021 from http://www.iea.info/docuntes/paper_2b711df83.pdf
Sowunmi, E. T. (2021). Development and Validation of Essay Test Assessor for Senior School Certificate Examination in Nigeria. Unpublished PhD Thesis, University of Ilorin.
Steedle, J. T., & Elliot, S. (2012). The efficacy of automated essay scoring for evaluating student responses to complex critical thinking performance tasks. New York, NY: Council for Aid to Education
Sukkarieh, J. Z., Pulman, S. G., & Raikes, N. (2003). Auto-marking: Using computational linguistics to score free-text responses. 29th Annual Conference of the International Association for Educational Assessment (pp. 1-5). Manchester, UK.
Valenti, S., Neri, F., & Cucchiarelli, A. (2017). An overview of current research on automated essay grading. Journal of Information Technology Education: Research, 2, 319-330. https://doi.org/10.28945/331
Vantage Learning. (2002). A study of expert scoring, standard human scoring and IntelliMetric scoring accuracy for statewide eighth-grade writing responses (R-726). Newtown, PA: Vantage Learning.
Wang, Q. (2022). The use of semantic similarity tools in automated content scoring of fact-based essays written by EFL learners. Education and Information Technologies, 27(9), 13021-13049.
Weigle, S. C. (2010). Validation of automated scores of TOEFL IBT tasks against non-test indicators of writing ability. Language Testing, 27(3), 335-353.
Yu, J. (2018). Research process on software development model. In IOP Conference Series: Materials Science and Engineering, 394(3), 032045. IOP Publishing.
Zhu, W., & Sun, Y. (2020). Automated essay scoring system using multi-model machine learning. In D. C. Wyld et al. (Eds.), MLNLP, BDIOT, ITCCMA, CSITY, DTMN, AIFZ, SIGPRO.
License
Copyright (c) 2023 Damilola Olaoye, Henry Owolabi, Oluwaseun Tayo Olaoye

This work is licensed under a Creative Commons Attribution 4.0 International License.