Welcome to the VQA Challenge 2017!


We are pleased to announce the Visual Question Answering (VQA) Challenge 2017. Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs both a detailed understanding of the image and the ability to perform complex reasoning about it.

The VQA v2.0 training and validation sets, containing more than 120,000 images and 650,000 questions, are available on the download page. Each question is annotated with 10 concise, open-ended answers. Annotations for the training and validation sets are publicly available.

The test set will be released soon. In view of the problems encountered with CodaLab during VQA Challenge 2016, we are looking into alternatives for hosting the test servers. Stay tuned for more details.

VQA Challenge 2017 is the second edition of the VQA Challenge. VQA Challenge 2016 was organized last year, and the results were announced at the VQA Challenge Workshop, CVPR 2016. More details about VQA Challenge 2016 can be found here.

Challenge Guidelines

Coming soon.

Tools and Instructions

We provide API support for the VQA annotations and evaluation code. To download the VQA API, please visit our GitHub repository. For an overview of how to use the API, please see the section entitled VQA API on the download page. For API support for COCO images, please visit the COCO download page. For API support for abstract scenes, please visit the GitHub repository.
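
As a rough illustration of how the 10 human answers per question can be used to score a prediction, the sketch below implements the commonly cited simplified form of the VQA accuracy metric: an answer counts as fully correct if at least 3 annotators gave it. This is not the official evaluation code. The function name `vqa_accuracy` and the lowercase/strip normalization are assumptions made for this example, and the official scripts apply more extensive answer normalization, so please use the evaluation code from the GitHub repository for reported results.

```python
from collections import Counter


def vqa_accuracy(predicted, human_answers):
    """Simplified VQA accuracy: min(number of matching human answers / 3, 1).

    `predicted` is the system's answer string; `human_answers` is the list of
    human-provided answers for the question (10 per question in VQA).
    Normalization here is only lowercase + strip (an assumption for this sketch).
    """
    counts = Counter(answer.strip().lower() for answer in human_answers)
    matches = counts[predicted.strip().lower()]
    return min(matches / 3.0, 1.0)


# Hypothetical annotations for a single question.
example_answers = ["yes"] * 8 + ["no"] * 2

print(vqa_accuracy("yes", example_answers))    # given by 8 annotators
print(vqa_accuracy("no", example_answers))     # given by 2 annotators
print(vqa_accuracy("maybe", example_answers))  # given by no annotator
```

With these example annotations, "yes" receives full credit, "no" receives partial credit (2 of the 3 required matches), and "maybe" receives none.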

For additional questions, please contact visualqa@gmail.com.