Region-Based CNN for Segmenting Text in Epigraphical Images
Keywords: region-based CNN, text segmentation, support vector machine, ResNet50, Fast R-CNN, Faster R-CNN
Indian history is derived from ancient writings on inscriptions, palm leaves, copper plates, coins, and many other media. Epigraphers read these inscriptions and produce meaningful interpretations. Automating this reading process is the interest of our study, and in this paper, segmentation to detect text in digitized inscriptional images is dealt with in detail. Character segmentation from epigraphical images aids optical character recognition (OCR) in the training and recognition of old regional scripts. The epigraphical images are drawn from estampages containing scripts from various periods, from Brahmi in the 3rd century BC to the medieval period of the 15th century AD. The scripts and characters in these digitized images are often illegible and set against complex, noisy background textures. To achieve script/text segmentation, a region-based convolutional neural network (CNN) is employed to detect characters in the images. The proposed method uses selective search to identify candidate text regions and forwards them to trained CNN models to extract feature vectors. These feature vectors are fed to support vector machine (SVM) classifiers, which recognize text and localize it with bounding boxes based on confidence scores. AlexNet, VGG16, ResNet50, and InceptionV3 were used as the CNN models for experimentation, and InceptionV3 performed best. 197 images were used for experimentation: 70 printed denoised epigraphical images, 40 denoised estampage images, and 87 noisy estampage images. InceptionV3 recorded segmentation results of 74.79% for printed denoised epigraphical images, 71.53% for denoised estampage images, and 18.11% for noisy estampage images. The segmented characters are used in epigraphical applications such as period/era prediction and character recognition. Fast and Faster region-based (R-CNN) design approaches were also tested and are illustrated in this paper.
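The paper does not include code, but the final step it describes, keeping one bounding box per detected character based on classifier confidence scores, is commonly implemented with non-maximum suppression over the proposals returned by selective search. A minimal NumPy sketch under assumed conventions (boxes as `[x1, y1, x2, y2]` arrays, an IoU threshold of 0.5) might look like:

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, suppressing overlapping duplicates.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) SVM confidences.
    Returns indices of the kept boxes, ordered by descending score.
    """
    order = np.argsort(scores)[::-1]  # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection rectangle between box i and each remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        # Drop boxes that overlap box i too strongly; keep the rest.
        order = rest[iou < iou_threshold]
    return keep

# Two heavily overlapping proposals and one distant one: the lower-scoring
# duplicate of the first box is suppressed.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = non_max_suppression(boxes, scores)  # → [0, 2]
```

In the paper's pipeline this step would run after the SVM classifiers score each selective-search region, so that each character ends up with a single bounding box.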
Copyright (c) 2022 Authors
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.