IMPLEMENTATION OF DEEP LEARNING ALGORITHM IN HANDWRITING TO TEXT DOCUMENT CONVERSION APPLICATION

Authors

  • Alif Lufti, Universitas Islam Sumatera Utara
  • Khairuddin Nasution, Universitas Islam Sumatera Utara
  • Tasliyah Haramaini, Universitas Islam Sumatera Utara

Keywords:

Handwriting recognition, Tesseract.js, OCR, Web Application, Digitalization

Abstract

The development of information and communication technology has driven the need for systems capable of efficiently converting handwritten text into digital text. This study aims to develop a web-based application capable of real-time handwriting recognition using Tesseract.js, a JavaScript library for optical character recognition (OCR). The application is designed to assist users in converting handwritten documents into editable text formats, thereby enhancing productivity and information accessibility.

The methods used in this study include uploading handwritten images, preprocessing the images to improve input quality, and applying OCR algorithms using Tesseract.js to recognize characters and words. The recognized results are then displayed on the user interface, with an option for manual correction if needed. The study also evaluates the accuracy of the text recognition produced by the application by comparing the recognition results with the original text.
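
The comparison of recognition results with the original text can be illustrated with a small sketch. The abstract does not state which metric the study used, so this example assumes a common choice: character-level accuracy derived from the Levenshtein edit distance between the OCR output and the reference transcription. The function names `editDistance` and `characterAccuracy` are hypothetical and are not part of Tesseract.js.

```typescript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, and substitutions needed to turn `a` into `b`.
function editDistance(a: string, b: string): number {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) {
    prev.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion from a
          curr[j - 1] + 1, // insertion into a
          prev[j - 1] + cost // substitution or match
        )
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Character-level accuracy in [0, 1], computed as 1 minus the character
// error rate (edit distance divided by the reference length).
// Assumption: this metric is illustrative; the study may have used another.
function characterAccuracy(recognized: string, reference: string): number {
  if (reference.length === 0) {
    return recognized.length === 0 ? 1 : 0;
  }
  const errorRate = editDistance(recognized, reference) / reference.length;
  return Math.max(0, 1 - errorRate); // clamp when the output is much longer than the reference
}

console.log(editDistance("kitten", "sitting")); // 3
```

In an application of this kind, `recognized` would be the text returned by the OCR step and `reference` a manually typed transcription of the same handwritten image. Averaging the score over several writers gives a rough picture of how robust recognition is to different handwriting styles.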

The results show that the application recognizes handwriting with a satisfactory level of accuracy despite variations in handwriting style. The application is expected to contribute to document digitization and data processing, and to serve as a reference for the development of similar systems.

Published

2025-06-27