A Study into the Limitations of CNN Recognition on Isolated Bengali Compound Characters

Tasnim Zia; Ankur Datta; Mohammad Raghib Noor; M Ashraful Amin; Amin Ahsan Ali; A K M Mahbubur Rahman

Published: Oct 5, 2023

Keywords:

ResNet, Grad-CAM, Grad-CAM++, Bangla Compound Characters, CNN.

Tasnim Zia

Center for Computational & Data Sciences, Department of Computer Science and Engineering, Independent University, Bangladesh

Abstract

Bangla has over 265 million native and non-native speakers; however, advances in
Bangla Optical Character Recognition lag behind those for other languages because of
a broader set of complex characters, multiple handwriting styles, and a lack of datasets.
Convolutional Neural Network (CNN) models have been highly successful in recognizing
handwritten alphabet scripts. However, we found that two-stage detectors such as CNN-RNN
models, encoder-decoders, and Vision Transformers now perform much better than pure CNNs
in pattern recognition and Bengali compound character recognition. To understand why, we
chose five commonly used pretrained CNN models from PyTorch (VGG-16, ResNet-50,
ResNet-101, Wide ResNet-50-2, and ResNeXt-50-32x4d) to classify the characters and
compare their performance. Grad-CAM and Grad-CAM++ were used to generate heatmaps showing
the key areas that the models focused on while classifying. We found problematic patterns
in Bangla compound characters, along with problematic perceptions in our fine-tuned CNNs,
which we list with detailed analysis.
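
To make the heatmap step concrete, the following is a minimal sketch of the core Grad-CAM combination, written in plain Python so it runs without a deep learning framework. It is not the authors' implementation. The function name `grad_cam`, the nested-list input format, and the final scaling to [0, 1] are assumptions made for illustration; in practice the activations and gradients would come from the last convolutional layer of a fine-tuned network. Grad-CAM++ computes the channel weights differently and is not shown.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps into one Grad-CAM heatmap (illustrative sketch).

    activations: list of K feature maps, each an H x W nested list of floats.
    gradients:   same shape; gradients[k][i][j] is the derivative of the
                 class score with respect to activations[k][i][j].
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight: global average of the gradient over all spatial positions.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum over channels, then ReLU to keep only positive evidence.
    heatmap = [
        [max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
         for j in range(width)]
        for i in range(height)
    ]

    # Assumption: scale to [0, 1] for display; not part of Grad-CAM itself.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy input: two 2 x 2 feature maps with constant gradients (made-up values).
toy_activations = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
toy_gradients = [[[1, 1], [1, 1]], [[-2, -2], [-2, -2]]]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

In the toy input, the second channel has a negative weight, so positions where only that channel is active are zeroed by the ReLU and the heatmap highlights the positions driven by the first channel.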

Issue

Vol. 22 No. 01 (2023)

Section

Articles
