Improving the performance of the image captioning systems using a pre-classification stage

Other Title(s)

تحسين أداء أنظمة وصف الصور باستخدام مرحلة التصنيف المسبق للصور

Source

Journal of Engineering Sciences and Information Technology

Issue

Vol. 6, Issue 1 (31 Mar. 2022), pp. 150-164, 15 p.

Publisher

National Research Center

Publication Date

2022-03-31

Country of Publication

Palestine (Gaza Strip)

No. of Pages

15

Abstract EN

In this research, we introduce a novel image classification and captioning system by adding a classification layer before the image captioning models.

The suggested approach consists of three main steps and is inspired by the state-of-the-art finding that generating image captions within small sub-class categories works better than captioning over a large unclassified dataset.

In the first step, we collected a dataset from two international datasets (MS-COCO and Flickr2k) comprising 10778 images, of which 80% are used for training and 20% for validation.

In the next step, the dataset images are classified into 11 classes (10 indoor and outdoor categories and one "Null" category) and fed into a deep learning classifier.

The classifier is retrained on our classes and learns to assign each image to its corresponding category.

In the final step, each classified image is used as input to the 11 pre-trained, class-specific image captioning models, and the final caption is generated.

The experiments show that adding the pre-classification step before the image captioning stage improves performance significantly, by (8.15% and 8.44%) and (12.7407% and 16.7048%) for Top-1 and Top-5 of the English and Arabic systems, respectively.

The classification step achieves a true classification rate of 71.32% and 73.09% for the English and Arabic systems, respectively.
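
The classify-then-caption flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract names neither the ten indoor/outdoor categories nor the classifier or captioning architectures, so the labels (`class_0` to `class_9`), the `Image` placeholder type, and the toy classifier and captioners below are all hypothetical stand-ins.

```python
from typing import Callable, Dict

# Hypothetical label set: the abstract states 11 classes (10 indoor/outdoor
# categories plus "Null") but does not name the ten categories.
CLASS_LABELS = [f"class_{i}" for i in range(10)] + ["Null"]

Image = bytes  # placeholder image type, an assumption for this sketch
Classifier = Callable[[Image], str]
Captioner = Callable[[Image], str]


def caption_with_preclassification(
    image: Image,
    classify: Classifier,
    captioners: Dict[str, Captioner],
) -> str:
    """Classify the image first, then caption it with that class's model."""
    label = classify(image)
    if label not in captioners:
        raise KeyError(f"no captioning model registered for class {label!r}")
    return captioners[label](image)


# Toy stand-ins so the sketch runs without any trained models.
def toy_classifier(image: Image) -> str:
    # Assumption for illustration only: empty input maps to "Null".
    return "Null" if len(image) == 0 else "class_0"


# One toy captioner per class; `lbl=label` binds the label at definition time.
toy_captioners: Dict[str, Captioner] = {
    label: (lambda img, lbl=label: f"caption from {lbl} model")
    for label in CLASS_LABELS
}
```

In this sketch, the routing step is a plain dictionary lookup keyed by the predicted class, so swapping in a real classifier or real per-class captioning models would only require replacing the callables passed to `caption_with_preclassification`.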

American Psychological Association (APA)

Mualla, Rasha Muhammad & al-Khayr, Jafar & Sulayman, Samir. 2022. Improving the performance of the image captioning systems using a pre-classification stage. Journal of Engineering Sciences and Information Technology, Vol. 6, no. 1, pp. 150-164.
https://search.emarefa.net/detail/BIM-1408845

Modern Language Association (MLA)

Mualla, Rasha Muhammad… [et al.]. Improving the performance of the image captioning systems using a pre-classification stage. Journal of Engineering Sciences and Information Technology Vol. 6, no. 1 (Mar. 2022), pp. 150-164.
https://search.emarefa.net/detail/BIM-1408845

American Medical Association (AMA)

Mualla, Rasha Muhammad & al-Khayr, Jafar & Sulayman, Samir. Improving the performance of the image captioning systems using a pre-classification stage. Journal of Engineering Sciences and Information Technology. 2022. Vol. 6, no. 1, pp. 150-164.
https://search.emarefa.net/detail/BIM-1408845

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 162-164

Record ID

BIM-1408845