Improving OCR Performance on Low-Quality Image Using Pre-processing and Post-processing Methods
© 2023 by IJETT Journal
Volume-71 Issue-6
Year of Publication : 2023
Author : Ivan Christian, Gede Putra Kusuma
DOI : 10.14445/22315381/IJETT-V71I6P239
How to Cite?
Ivan Christian, Gede Putra Kusuma, "Improving OCR Performance on Low-Quality Image Using Pre-processing and Post-processing Methods," International Journal of Engineering Trends and Technology, vol. 71, no. 6, pp. 396-405, 2023. Crossref, https://doi.org/10.14445/22315381/IJETT-V71I6P239
Abstract
Optical Character Recognition (OCR) is a technology for recognizing text in images. Image quality is one of the factors that affects the OCR success rate, so the image quality needs to be improved before OCR processing. In addition to pre-processing, post-processing is applied to improve the success rate of the OCR. In the pre-processing stage, the image is resized using bicubic interpolation, and the background is then removed. Bicubic interpolation was chosen because it produces a smoother enlarged image with fewer interpolation artifacts. A grayscale conversion using the luminance algorithm is also applied to optimize the process. OCR is performed using Tesseract. In the post-processing stage, the OCR output is corrected using an N-gram language model and the Levenshtein distance algorithm. The performance of the proposed method is assessed by comparing its success rate with that of plain OCR and of an existing OCR pre-processing or post-processing model. The best pre-processing method in this study combines shadow removal with custom grayscale conversion, giving a total error rate of 14.56%. Post-processing with a lookup table also improves the final OCR performance, giving a total error rate of 13.94%. It can therefore be concluded that combining shadow-removal pre-processing, custom grayscale conversion, and lookup-table post-processing improves OCR accuracy.
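As an illustration only, two of the building blocks named in the abstract can be sketched in plain Python: a luminance-based grayscale conversion and a Levenshtein-distance lookup against a word list. This is not the authors' implementation. The Rec. 601 luminance weights, the function names, the `max_distance` threshold, and the example vocabulary are all assumptions introduced here; the paper's "custom grayscale conversion" and lookup table may differ.

```python
from typing import Optional, Sequence, Tuple


def luminance_gray(pixel: Tuple[int, int, int]) -> int:
    """Convert one RGB pixel to a gray level with a weighted sum.

    Assumption: Rec. 601 luma weights; the paper may use other weights.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def correct_word(word: str, vocabulary: Sequence[str],
                 max_distance: int = 2) -> str:
    """Replace an OCR token with its nearest vocabulary entry.

    Hypothetical helper: the token is kept unchanged when no entry is
    within max_distance edits.
    """
    best: Optional[str] = min(
        vocabulary, key=lambda cand: levenshtein(word, cand), default=None
    )
    if best is None or levenshtein(word, best) > max_distance:
        return word
    return best


if __name__ == "__main__":
    vocab = ["recognition", "resolution", "character"]  # made-up word list
    print(luminance_gray((255, 255, 255)))
    print(correct_word("recogniti0n", vocab))
```

The distance threshold keeps the lookup from rewriting tokens that are far from every vocabulary entry, which matters for names and numbers that a small word list does not cover.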
Keywords
Luminance algorithm, N-gram language model, Optical Character Recognition, Post-processing, Pre-processing.