Visual Question Answering and Dialog Workshop
at CVPR 2019, June 17, Long Beach, California, USA
Location: Long Beach Convention & Entertainment Center




Introduction

The goal of this workshop is two-fold. The first is to benchmark progress in Visual Question Answering and Visual Dialog.

    Visual Question Answering
    There will be three tracks in the Visual Question Answering Challenge this year.

  • VQA 2.0: This track is the 4th edition of the VQA Challenge, held on the VQA v2.0 dataset introduced in Goyal et al., CVPR 2017. The 2nd and 3rd editions were organised at CVPR 2017 and CVPR 2018 on VQA v2.0, and the 1st edition was organised at CVPR 2016 on the VQA v1.0 dataset introduced in Antol et al., ICCV 2015. VQA v2.0 is more balanced, reducing the language biases of VQA v1.0, and is about twice its size.

    Challenge link: https://visualqa.org/challenge
    Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/163/overview
    Submission Deadline: May 10, 2019 23:59:59 GMT
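
    For reference, the VQA tracks are scored with the consensus accuracy metric from Antol et al., ICCV 2015: a predicted answer earns credit in proportion to how many of the 10 human annotators gave that answer, with 3 matches earning full credit. Below is a minimal sketch in Python; the official evaluation script additionally normalizes answers and averages over annotator subsets, which is omitted here.

    from collections import Counter

    def vqa_accuracy(predicted, human_answers):
        """Consensus VQA accuracy: min(# matching annotators / 3, 1).

        `human_answers` holds the 10 free-form annotator answers collected
        per question. The official script also normalizes answers (case,
        articles, punctuation, number words) and averages the score over
        annotator subsets; both steps are omitted in this sketch.
        """
        matches = Counter(human_answers)[predicted]
        return min(matches / 3.0, 1.0)

    # Example: 4 of 10 annotators said "tennis", so "tennis" gets full credit.
    answers = ["tennis"] * 4 + ["racquetball"] * 3 + ["ping pong"] * 3
    print(vqa_accuracy("tennis", answers))       # 1.0
    print(vqa_accuracy("racquetball", answers))  # 1.0 (3 matches)
    print(vqa_accuracy("soccer", answers))       # 0.0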

  • TextVQA: This track is the 1st challenge on the TextVQA dataset. TextVQA requires algorithms to look at an image, read text in the image, reason about it, and answer a given question.

    Challenge link: Coming soon!

  • GQA: This track is the 1st challenge on the GQA dataset, a new dataset that focuses on real-world compositional reasoning. It contains 20M image-question pairs, each of which comes with an underlying structured representation of its semantics. The dataset is complemented with a suite of new evaluation metrics that test consistency, validity, and grounding. To learn more about GQA, visit https://visualreasoning.org.

    Challenge link: https://cs.stanford.edu/people/dorarad/gqa/challenge.html
    Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/225/overview
    Submission Deadline: May 03, 2019 23:59:59 GMT
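
    As a rough illustration of what an "underlying structured representation" of a question's semantics might look like, the sketch below pairs a GQA-style question with a functional program over the image's scene graph. The field names and operation syntax are hypothetical placeholders, not the dataset's released schema.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class GQAExample:
        """Illustrative GQA-style record; field names and program syntax
        are hypothetical placeholders, not the released schema."""
        image_id: str
        question: str
        answer: str
        # Structured representation: a functional program whose steps
        # derive the answer from the image's scene graph.
        program: List[str] = field(default_factory=list)

    example = GQAExample(
        image_id="img_0001",
        question="Is the red apple to the left of the bowl?",
        answer="yes",
        program=[
            "select(bowl)",
            "relate(apple, left-of)",
            "filter(color, red)",
            "exist()",
        ],
    )
    print(example.question, "->", example.answer)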

    Visual Dialog
    The 2nd edition of the Visual Dialog Challenge will be hosted on the VisDial v1.0 dataset introduced in Das et al., CVPR 2017. The 1st edition was organised on VisDial v1.0 at ECCV 2018; its leaderboard and an analysis of entries are available on the challenge website. Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image and a dialog history (consisting of the image caption and a sequence of previous questions and answers), the agent has to answer a follow-up question in the dialog.

    Challenge link: https://visualdialog.org/challenge
    Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/161/overview
    Submission Deadline: May 18, 2019 23:59:59 GMT
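
    To make the task input concrete, each turn the agent must answer can be viewed as an image, its caption, the preceding question-answer rounds, and the current question. Below is a minimal sketch of that structure; the names are illustrative, not the actual VisDial v1.0 JSON format.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class VisDialTurn:
        """One Visual Dialog turn: the agent must answer `question` given
        the image, its caption, and the dialog so far. Names here are
        illustrative, not the actual VisDial v1.0 JSON format."""
        image_path: str
        caption: str
        # Previous (question, answer) rounds, oldest first.
        history: List[Tuple[str, str]] = field(default_factory=list)
        question: str = ""

    turn = VisDialTurn(
        image_path="beach.jpg",
        caption="A man walks his dog on the beach.",
        history=[
            ("Is the dog on a leash?", "No, it is running free."),
            ("What color is the dog?", "It looks golden."),
        ],
        question="Is it sunny out?",
    )

    For evaluation, VisDial v1.0 provides 100 candidate answers per turn that submitted models must rank, rather than generating free-form text.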

The second goal of this workshop is to continue to bring together researchers interested in visually-grounded question answering, dialog systems, and language in general to share state-of-the-art approaches, best practices, and future directions in multi-modal AI. In addition to invited talks from established researchers, we invite submissions of extended abstracts of at most 2 pages describing work in relevant areas, including: Visual Question Answering, Visual Dialog, (Textual) Question Answering, (Textual) Dialog Systems, Commonsense Knowledge, Vision + Language, etc. All accepted abstracts will be presented as posters at the workshop to disseminate ideas. The workshop will be held on June 17, 2019, at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019).

    Prizes
    The winning team of each track will receive Google Cloud Platform credits worth $10k!


Invited Speakers


Alex Schwing
University of Illinois at Urbana-Champaign


He He
New York University (previously Amazon Web Services)


Lisa Hendricks
University of California, Berkeley


Yoav Artzi
Cornell University


Layla El Asri
Microsoft Research


More speakers to be added.

Dates

Jan 2019: Challenge Announcement
Mid-May 2019: Challenge Submission Deadlines
May 15, 2019: Workshop Paper Submission
May 22, 2019: Notification to Authors
Jun 17, 2019: Workshop


Organizers


Abhishek Das
Georgia Tech


Karan Desai
Georgia Tech


Ayush Shrivastava
Georgia Tech


Yash Goyal
Georgia Tech


Aishwarya Agrawal
Georgia Tech


Amanpreet Singh
Facebook AI Research


Meet Shah
Facebook AI Research


Drew Hudson
Stanford


Satwik Kottur
Carnegie Mellon


Rishabh Jain
Georgia Tech


Vivek Natarajan
Facebook AI Research


Stefan Lee
Georgia Tech


Peter Anderson
Georgia Tech


Xinlei Chen
Facebook AI Research


Marcus Rohrbach
Facebook AI Research


Dhruv Batra
Georgia Tech / Facebook AI Research


Devi Parikh
Georgia Tech / Facebook AI Research


Sponsors

The workshop is supported by grants awarded to Dhruv Batra and Devi Parikh.


Contact: visualqa@gmail.com