Analysis of Student Output on the Use of ChatGPT: A Predictive Model Approach

© 2024 by IJETT Journal
Volume-72 Issue-10
Year of Publication : 2024
Author : Jovelin M. Lapates, Mark Daniel G. Dacer, Derren N. Gaylo
DOI : 10.14445/22315381/IJETT-V72I10P121

How to Cite?
Jovelin M. Lapates, Mark Daniel G. Dacer, Derren N. Gaylo, "Analysis of Student Output on the Use of ChatGPT: A Predictive Model Approach," International Journal of Engineering Trends and Technology, vol. 72, no. 10, pp. 216-224, 2024. Crossref, https://doi.org/10.14445/22315381/IJETT-V72I10P121

Abstract
Artificial Intelligence (AI) has significantly transformed many aspects of education, and AI-powered language models such as ChatGPT have gained popularity for their features and advantages. This study analyzes student outputs and develops a predictive model to assess whether essay-type answers, Dropbox submissions, and machine problems were generated using ChatGPT, employing the machine learning algorithms Naive Bayes (NB), Random Forest (RF), and K-Nearest Neighbors (KNN). Student outputs are evaluated using six AI detection tools: Contentatscale, Crossplag, GPTZero, KazanSEO, Sapling, and ZeroGPT. The results are then passed to NB, RF, and KNN for prediction; these algorithms were chosen for their strong performance in text classification, robustness, and ability to handle non-linear data. The analysis examines performance metrics, including Recall, Precision-Recall Curve (PRC) Area, and Class Accuracy, to provide insight into the predictive capabilities of the models. The findings reveal that NB outperformed the other algorithms in classification accuracy, achieving the highest share of correctly classified instances at 23.19% and a Kappa statistic of 0.1072, which indicates only slight agreement, while RF and KNN recorded 14.49% and 15.94% correctly classified instances, respectively. NB also had the highest true positive rate (0.232), with a PRC area of 0.466, whereas KNN achieved the best PRC area at 0.566, reflecting varied performance across models. Overall, while Naive Bayes showed superior accuracy, each model has strengths that can be leveraged to analyze student outputs and evaluate the use of tools like ChatGPT in educational settings.
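
To make the reported metrics concrete, the following is a minimal sketch of how the share of correctly classified instances and the Kappa statistic can be computed from true and predicted labels. It assumes Kappa here means Cohen's kappa; the function names, class labels, and toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that label and prediction coincide
    # if both were drawn independently from their marginal distributions.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Undefined (division by zero) when p_expected == 1, i.e. a single class.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels for illustration only (NOT data from the study).
y_true = ["ai", "ai", "human", "human", "mixed", "mixed"]
y_pred = ["ai", "human", "human", "mixed", "mixed", "ai"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, so a value near 0.1, like the one reported for NB, sits close to the chance level.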

Keywords
ChatGPT, KNN, Random Forest, Naïve Bayes, AI detector, Students’ output.
