Research Article | Open Access
Volume 74 | Issue 5 | Year 2026 | Article Id. IJETT-V74I5P124 | DOI : https://doi.org/10.14445/22315381/IJETT-V74I5P124
A Multimodal Plagiarism Detection Framework Gemini AI for Text and Image Content Integrity
Anjali Naudiyal, Kapil Joshi, Rahul Mahala, Shivani Pant, Mohammed Ghouse Aleem
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 19 Jan 2026 | 09 Mar 2026 | 28 Mar 2026 | 30 May 2026 |
Citation :
Anjali Naudiyal, Kapil Joshi, Rahul Mahala, Shivani Pant, Mohammed Ghouse Aleem, "A Multimodal Plagiarism Detection Framework Gemini AI for Text and Image Content Integrity," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 5, pp. 362-383, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I5P124
Abstract
With recent advances in technology, plagiarism in academia is increasing steadily. Plagiarism is no longer confined to text; it also appears in images, screenshots, figures, logos, watermarks, and diagrams. The proposed system introduces several technical innovations that set it apart from conventional text-based plagiarism detection tools. One of the most significant is its multimodal analysis capability, which combines visual and textual content understanding by integrating Google’s Gemini 1.5 Flash model (Gemini AI) with deep learning techniques. The proposed seven-stage framework unifies computer vision, natural language processing, and adaptive similarity analytics to move beyond conventional fingerprint or perceptual hash methods. First, textual extraction isolates embedded or overlaid text (captions, watermarks, OCR passages), supplying linguistic cues. Second, visual decomposition segments salient objects, layout structures, color palettes, and stylistic signatures. Third, authenticity assessment estimates manipulative edits (cropping, splicing, style transfer, generative fill) via anomaly and provenance signals. Fourth, source candidate retrieval uses multimodal embeddings to surface likely originals or semantically proximate precursors from reference corpora and web indices. Fifth, plagiarism indicator evaluation aggregates cross-image overlaps: localized patch similarity, reconstructed text alignment, stylistic congruence, and watermark inheritance. Sixth, web search recommendation dynamically composes discriminative keyword and visual-descriptor queries that can expand external source discovery. Seventh, similarity fusion and scoring combines weighted textual, structural, and deep feature distances through a learned calibration layer, producing dual quantitative outputs: an Authenticity Integrity Score and a Plagiarism Likelihood Score.
Experiments on a curated benchmark mixing authentic, lightly edited, heavily manipulated, and synthetically generated images show robust discrimination across perturbations. Preliminary comparative analyses indicate improved recall of subtle derivative works while maintaining controlled false positives. The system is implemented for practical applications in academic publishing, news verification, creative asset management, and legal evidence triage, while establishing an extensible foundation for future provenance standards and investigative journalism.
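The seventh stage described above (similarity fusion and scoring) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the weights, the form of the calibration layer, or how the Authenticity Integrity Score is derived. The weights, the logistic calibration, the `manipulation_evidence` input (assumed to come from the third stage), and all function names below are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical channel weights; the abstract does not publish the actual values.
WEIGHTS = {"textual": 0.4, "structural": 0.3, "deep": 0.3}

# Hypothetical calibration parameters standing in for a learned calibration layer.
CALIB_SLOPE = 8.0
CALIB_MIDPOINT = 0.5


def fuse_distances(distances, weights=WEIGHTS):
    """Return the weighted mean of per-channel distances, each assumed to lie in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[name] * distances[name] for name in weights) / total


def calibrate(value, slope=CALIB_SLOPE, midpoint=CALIB_MIDPOINT):
    """Map a raw value in [0, 1] to (0, 1) with a logistic curve (an assumed calibration)."""
    return 1.0 / (1.0 + math.exp(-slope * (value - midpoint)))


def score(distances, manipulation_evidence):
    """Return (authenticity_integrity, plagiarism_likelihood).

    distances: dict of textual, structural, and deep feature distances
        between a query image and its closest source candidate.
    manipulation_evidence: value in [0, 1]; assumed here to be the output
        of the authenticity assessment stage.
    """
    similarity = 1.0 - fuse_distances(distances)
    plagiarism_likelihood = calibrate(similarity)
    authenticity_integrity = 1.0 - calibrate(manipulation_evidence)
    return authenticity_integrity, plagiarism_likelihood


if __name__ == "__main__":
    # Illustrative inputs only, not data from the paper.
    near_copy = {"textual": 0.1, "structural": 0.2, "deep": 0.1}
    authenticity, plagiarism = score(near_copy, manipulation_evidence=0.9)
    print(f"authenticity={authenticity:.3f} plagiarism={plagiarism:.3f}")
```

In this sketch a query that sits close to a source candidate on all three channels and shows strong signs of editing receives a high Plagiarism Likelihood Score and a low Authenticity Integrity Score. Keeping the two outputs separate, as the abstract describes, lets a reviewer distinguish an untouched copy from a heavily edited derivative.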
Keywords
Deep Learning, Image Plagiarism (IP), Text-Based Image Plagiarism (TBIP), Content Authenticity, Source Detection.