Improving the performance of the image captioning systems using a pre-classification stage
Other Title(s)
تحسين أداء أنظمة وصف الصور باستخدام مرحلة التصنيف المسبق للصور
Source
Journal of Engineering Sciences and Information Technology
Issue
Vol. 6, Issue 1 (31 Mar. 2022), pp. 150-164, 15 p.
Publisher
Publication Date
2022-03-31
Country of Publication
Palestine (Gaza Strip)
No. of Pages
15
Abstract EN
In this research, we introduce a novel image classification and captioning system that adds a classification layer before the image captioning models.
The suggested approach consists of three main steps and is inspired by the state-of-the-art finding that generating image captions within small sub-class categories works better than captioning over a large unclassified dataset.
In the first step, we collected a dataset from two international datasets (MS-COCO and Flickr2k) comprising 10778 images, of which 80% are used for training and 20% for validation.
In the next step, the dataset images were classified into 11 classes (10 classes of indoor and outdoor categories and one "Null" class) and fed into a deep learning classifier.
The classifier is retrained on our classes and learns to assign each image to its corresponding category.
In the final step, each classified image is passed to the corresponding one of 11 pre-trained, class-specific image captioning models, and the final caption sentence is generated.
The experiments show that adding the pre-classification step before the image captioning stage improves performance significantly, by (8.15% and 8.44%) and (12.7407% and 16.7048%) for Top-1 and Top-5 of the English and Arabic systems, respectively.
The classification step achieves a true classification rate of 71.32% and 73.09% for the English and Arabic systems, respectively.
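The two-stage flow described in the abstract (classify first, then caption with a class-specific model) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the class labels, the function names, and the stub classifier and captioners are all hypothetical, since the abstract does not name the 11 classes or the models used.

```python
from typing import Callable, Dict, Tuple

# Hypothetical labels: the abstract only says there are 10 indoor/outdoor
# classes plus one "Null" class, without naming them.
CLASS_NAMES = [f"class_{i}" for i in range(10)] + ["Null"]

Classifier = Callable[[str], str]  # image path -> class label
Captioner = Callable[[str], str]   # image path -> caption sentence


def caption_with_preclassification(
    image_path: str,
    classifier: Classifier,
    captioners: Dict[str, Captioner],
) -> Tuple[str, str]:
    """Classify the image, then caption it with the model for that class."""
    label = classifier(image_path)
    if label not in captioners:
        raise KeyError(f"no captioning model registered for class {label!r}")
    return label, captioners[label](image_path)


# Stub stand-ins (assumptions) so the sketch runs without any trained models.
def stub_classifier(image_path: str) -> str:
    return "Null" if "unknown" in image_path else "class_0"


def make_stub_captioner(label: str) -> Captioner:
    return lambda image_path: f"a caption from the {label} model"


stub_captioners = {name: make_stub_captioner(name) for name in CLASS_NAMES}

label, caption = caption_with_preclassification(
    "kitchen.jpg", stub_classifier, stub_captioners
)
print(label, "->", caption)
```

In a real system the stubs would be replaced by the retrained classifier and the 11 per-class captioning models; the dispatch dictionary is just one simple way to express the routing step.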
American Psychological Association (APA)
Mualla, Rasha Muhammad & al-Khayr, Jafar & Sulayman, Samir. 2022. Improving the performance of the image captioning systems using a pre-classification stage. Journal of Engineering Sciences and Information Technology, Vol. 6, no. 1, pp. 150-164.
https://search.emarefa.net/detail/BIM-1408845
Modern Language Association (MLA)
Mualla, Rasha Muhammad…[et al.]. Improving the performance of the image captioning systems using a pre-classification stage. Journal of Engineering Sciences and Information Technology Vol. 6, no. 1 (Mar. 2022), pp. 150-164.
https://search.emarefa.net/detail/BIM-1408845
American Medical Association (AMA)
Mualla, Rasha Muhammad & al-Khayr, Jafar & Sulayman, Samir. Improving the performance of the image captioning systems using a pre-classification stage. Journal of Engineering Sciences and Information Technology. 2022. Vol. 6, no. 1, pp. 150-164.
https://search.emarefa.net/detail/BIM-1408845
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references : p. 162-164
Record ID
BIM-1408845