Visual Experience-Based Question Answering with Complex Multimodal Environments
Author
Kim, Incheol
Source
Mathematical Problems in Engineering
Issue
Vol. 2020, Issue 2020 (31 Dec. 2020), pp.1-18, 18 p.
Publisher
Hindawi Publishing Corporation
Publication Date
2020-11-19
Country of Publication
Egypt
No. of Pages
18
Main Subjects
Abstract EN
This paper proposes a novel visual experience-based question answering (VEQA) problem and a corresponding dataset for embodied intelligence research. The problem requires an agent to perform actions, understand 3D scenes from successive partial input images, and answer natural language questions about its visual experiences in real time.
Unlike the conventional visual question answering (VQA), the VEQA problem assumes both partial observability and dynamics of a complex multimodal environment.
To address this VEQA problem, we propose a hybrid visual question answering system, VQAS, integrating a deep neural network-based scene graph generation model and a rule-based knowledge reasoning system.
The proposed system can generate more accurate scene graphs for dynamic environments with some uncertainty.
Moreover, it can answer complex questions through knowledge reasoning with rich background knowledge.
Results of experiments using a photo-realistic 3D simulated environment, AI2-THOR, and the VEQA benchmark dataset demonstrate the high performance of the proposed system.
American Psychological Association (APA)
Kim, Incheol. 2020. Visual Experience-Based Question Answering with Complex Multimodal Environments. Mathematical Problems in Engineering, Vol. 2020, no. 2020, pp.1-18.
https://search.emarefa.net/detail/BIM-1201433
Modern Language Association (MLA)
Kim, Incheol. Visual Experience-Based Question Answering with Complex Multimodal Environments. Mathematical Problems in Engineering No. 2020 (2020), pp.1-18.
https://search.emarefa.net/detail/BIM-1201433
American Medical Association (AMA)
Kim, Incheol. Visual Experience-Based Question Answering with Complex Multimodal Environments. Mathematical Problems in Engineering. 2020. Vol. 2020, no. 2020, pp.1-18.
https://search.emarefa.net/detail/BIM-1201433
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references
Record ID
BIM-1201433