Publications by Sovila Srun

ViTSTR-Transducer: Cross-Attention-Free Vision Transformer Transducer for Scene Text Recognition.

Rina Buoy, Masakazu Iwamura, Sovila Srun, Koichi Kise

J Imaging

December 2023

Attention-based encoder-decoder scene text recognition (STR) architectures have been proven effective in recognizing text in the real world, thanks to their ability to learn an internal language model. Nevertheless, the cross-attention operation that is used to align visual and linguistic features during decoding is computationally expensive, especially in low-resource environments. To address this bottleneck, we propose a cross-attention-free STR framework that still learns a language model.

Explainable Connectionist-Temporal-Classification-Based Scene Text Recognition.

Rina Buoy, Masakazu Iwamura, Sovila Srun, Koichi Kise

J Imaging

November 2023

Connectionist temporal classification (CTC) is a favored decoder in scene text recognition (STR) for its simplicity and efficiency. However, most CTC-based methods utilize one-dimensional (1D) vector sequences, usually derived from a recurrent neural network (RNN) encoder. This results in the absence of an explainable 2D spatial relationship between the predicted characters and the corresponding image regions, which is essential for model explainability.