Visual Question Answering

Problem Statement

  • Given an image and a natural language question about that image, the task is to perform complex reasoning to produce an accurate natural language answer.
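To make the input and output of the task concrete, here is a minimal sketch in Python. The names `VQAExample`, `MostFrequentAnswerBaseline`, and `accuracy`, along with the toy image paths and questions, are hypothetical and are not taken from any of the datasets or papers listed below. The baseline deliberately ignores both the image and the question, so it only illustrates the task interface (image + question in, answer out), not the reasoning a real model would need.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class VQAExample:
    """One hypothetical VQA sample: image reference, question, and answer."""
    image_path: str  # placeholder path, not a real dataset file
    question: str
    answer: str


class MostFrequentAnswerBaseline:
    """Trivial baseline: always predicts the most common training answer."""

    def fit(self, examples: List[VQAExample]) -> "MostFrequentAnswerBaseline":
        # Count lower-cased answers and remember the most frequent one.
        counts = Counter(ex.answer.lower() for ex in examples)
        self.answer_ = counts.most_common(1)[0][0]
        return self

    def predict(self, image_path: str, question: str) -> str:
        # The image and question are ignored on purpose (assumption for illustration).
        return self.answer_


def accuracy(model: MostFrequentAnswerBaseline, examples: List[VQAExample]) -> float:
    """Fraction of examples whose predicted answer exactly matches the reference."""
    correct = sum(
        model.predict(ex.image_path, ex.question) == ex.answer.lower()
        for ex in examples
    )
    return correct / len(examples)


# Toy data invented for this sketch.
train = [
    VQAExample("img1.jpg", "Is there a dog?", "yes"),
    VQAExample("img2.jpg", "What color is the car?", "red"),
    VQAExample("img3.jpg", "Is it raining?", "yes"),
]

model = MostFrequentAnswerBaseline().fit(train)
print(model.predict("img4.jpg", "Is the door open?"))
print(accuracy(model, train))
```

An actual VQA model would replace `predict` with something that encodes the image and the question jointly; the exact-match accuracy used here is also a simplification chosen for the sketch.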

Dataset

  • Image dataset: MS-COCO; QA dataset: VQA

  • Image dataset: MS-COCO; QA dataset: COCO-QA

  • Image dataset: DAQUAR; QA dataset: DAQUAR

  • Image dataset: MS-COCO; QA dataset: Visual7W

Reference Papers for VQA

  • L. Ma, Z. Lu, and H. Li, "Learning to Answer Questions From Image Using Convolutional Neural Network", CoRR abs/1506.00333, Nov. 2015.

  • H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu, "Are you talking to a machine? Dataset and methods for multilingual image question answering", arXiv 1505.05612v3, Nov. 2015.

  • M. Ren, R. Kiros, and R. S. Zemel, "Exploring models and data for image question answering", arXiv 1505.02074, 2015.

  • M. Malinowski, M. Rohrbach, and M. Fritz, "Ask your neurons: A neural-based approach to answering questions about images", arXiv 1505.01121, Nov. 2015.