A Multimodal Plagiarism Detection Framework Gemini AI for Text and Image Content Integrity

Anjali Naudiyal; Kapil Joshi; Rahul Mahala; Shivani Pant; Mohammed Ghouse Aleem

doi:https://doi.org/10.14445/22315381/IJETT-V74I5P124

Research Article | Open Access

Volume 74 | Issue 5 | Year 2026 | Article Id. IJETT-V74I5P124

Received: 19 Jan 2026 | Revised: 09 Mar 2026 | Accepted: 28 Mar 2026 | Published: 30 May 2026

Citation:

Anjali Naudiyal, Kapil Joshi, Rahul Mahala, Shivani Pant, Mohammed Ghouse Aleem, "A Multimodal Plagiarism Detection Framework Gemini AI for Text and Image Content Integrity," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 5, pp. 362-383, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I5P124

Abstract

With recent advances in technology, plagiarism in academia is increasing steadily. Plagiarism is no longer confined to text; it also appears in images, screenshots, figures, logos, watermarks, and diagrams. The proposed system introduces several technical innovations that set it apart from conventional text-based plagiarism detection tools. One of the most significant is its multimodal analysis capability, which combines visual and textual content understanding by integrating Google’s Gemini 1.5 Flash model (Gemini AI) with deep learning techniques. The proposed seven-stage framework unifies computer vision, natural language processing, and adaptive similarity analytics to move beyond conventional fingerprint or perceptual hash methods. First, textual extraction isolates embedded or overlaid text (captions, watermarks, OCR passages), supplying linguistic cues. Second, visual decomposition segments salient objects, layout structures, color palettes, and stylistic signatures. Third, authenticity assessment estimates manipulative edits (cropping, splicing, style transfer, generative fill) via anomaly and provenance signals. Fourth, source candidate retrieval uses multimodal embeddings to surface likely originals or semantically proximate precursors from reference corpora and web indices. Fifth, plagiarism indicator evaluation aggregates cross-image overlaps: localized patch similarity, reconstructed text alignment, stylistic congruence, and watermark inheritance. Sixth, web search recommendation dynamically composes discriminative keyword and visual-descriptor queries to expand external source discovery. Seventh, similarity fusion and scoring combines weighted textual, structural, and deep feature distances through a learned calibration layer, producing dual quantitative outputs: an Authenticity Integrity Score and a Plagiarism Likelihood Score.
Experiments on a curated benchmark mixing authentic, lightly edited, heavily manipulated, and synthetically generated images show robust discrimination across perturbations. Preliminary comparative analyses indicate improved recall of subtle derivative works while maintaining controlled false positives. The system targets practical applications in academic publishing, news verification, creative asset management, and legal evidence triage, while establishing an extensible foundation for future provenance standards and investigative journalism.
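
To make the seventh stage concrete, the following is a minimal sketch of weighted similarity fusion followed by a calibration step. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the fusion weights, the form of the calibration layer, or any function names, so `WEIGHTS`, `CALIB_SLOPE`, `CALIB_BIAS`, `fused_similarity`, and `plagiarism_likelihood` are all hypothetical, and a logistic function stands in for the learned calibration layer. Only the Plagiarism Likelihood Score is sketched; the Authenticity Integrity Score is not modelled here.

```python
import math
from dataclasses import dataclass


@dataclass
class Distances:
    """Per-channel distances in [0, 1]: 0 means identical, 1 means unrelated."""
    textual: float
    structural: float
    deep_feature: float


# Hypothetical values for illustration only; in a real system the weights
# and calibration parameters would be learned from labelled image pairs.
WEIGHTS = {"textual": 0.40, "structural": 0.25, "deep_feature": 0.35}
CALIB_SLOPE = 8.0
CALIB_BIAS = -4.0


def fused_similarity(d: Distances) -> float:
    """Weighted average of per-channel similarities (1 - distance)."""
    total = sum(WEIGHTS.values())
    score = (
        WEIGHTS["textual"] * (1.0 - d.textual)
        + WEIGHTS["structural"] * (1.0 - d.structural)
        + WEIGHTS["deep_feature"] * (1.0 - d.deep_feature)
    )
    return score / total


def plagiarism_likelihood(d: Distances) -> float:
    """Map the fused similarity to (0, 1) with a logistic calibration."""
    z = CALIB_SLOPE * fused_similarity(d) + CALIB_BIAS
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    near_duplicate = Distances(textual=0.05, structural=0.10, deep_feature=0.08)
    unrelated = Distances(textual=0.95, structural=0.90, deep_feature=0.85)
    print("near duplicate:", round(plagiarism_likelihood(near_duplicate), 3))
    print("unrelated:", round(plagiarism_likelihood(unrelated), 3))
```

With these assumed parameters, a pair whose three distances are all small receives a likelihood close to 1, and a pair whose distances are all large receives a likelihood close to 0. The logistic step is one simple way to turn a raw fused score into a calibrated probability-like value; other calibration methods would fit the same description.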

Keywords

Deep Learning, Image Plagiarism (IP), Text-Based Image Plagiarism (TBIP), Content Authenticity, Source Detection.
