Visual Experience-Based Question Answering with Complex Multimodal Environments

Author

Kim, Incheol

Source

Mathematical Problems in Engineering

Issue

Vol. 2020, Issue 2020 (31 Dec. 2020), pp.1-18, 18 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-11-19

Country of Publication

Egypt

No. of Pages

18

Main Subjects

Civil Engineering

Abstract EN

This paper proposes a novel visual experience-based question answering (VEQA) problem and a corresponding dataset for embodied intelligence research. The problem requires an agent to perform actions, understand 3D scenes from successive partial input images, and answer natural language questions about its visual experiences in real time.

Unlike conventional visual question answering (VQA), the VEQA problem assumes both partial observability and dynamics in a complex multimodal environment.

To address this VEQA problem, we propose a hybrid visual question answering system, VQAS, integrating a deep neural network-based scene graph generation model and a rule-based knowledge reasoning system.

The proposed system can generate more accurate scene graphs for dynamic environments with some uncertainty.

Moreover, it can answer complex questions through knowledge reasoning with rich background knowledge.

Experimental results using AI2-THOR, a photo-realistic 3D simulated environment, and the VEQA benchmark dataset demonstrate the high performance of the proposed system.

American Psychological Association (APA)

Kim, Incheol. 2020. Visual Experience-Based Question Answering with Complex Multimodal Environments. Mathematical Problems in Engineering, Vol. 2020, no. 2020, pp.1-18.
https://search.emarefa.net/detail/BIM-1201433

Modern Language Association (MLA)

Kim, Incheol. Visual Experience-Based Question Answering with Complex Multimodal Environments. Mathematical Problems in Engineering No. 2020 (2020), pp.1-18.
https://search.emarefa.net/detail/BIM-1201433

American Medical Association (AMA)

Kim, Incheol. Visual Experience-Based Question Answering with Complex Multimodal Environments. Mathematical Problems in Engineering. 2020. Vol. 2020, no. 2020, pp.1-18.
https://search.emarefa.net/detail/BIM-1201433

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1201433