Visual Question Answering and Dialog Workshop
Location: Seaside Ballroom B, Long Beach Convention & Entertainment Center
at CVPR 2019, June 17, Long Beach, California, USA

Introduction

The primary goal of this workshop is two-fold. First is to benchmark progress in Visual Question Answering and Visual Dialog.

Visual Question Answering

VQA 2.0: This track is the 4th challenge on the VQA v2.0 dataset introduced in Goyal et al., CVPR 2017. The 2nd and 3rd editions were organised at CVPR 2017 and CVPR 2018 on the VQA v2.0 dataset, and the 1st edition was organised at CVPR 2016 on the VQA v1.0 dataset introduced in Antol et al., ICCV 2015. VQA v2.0 is more balanced and reduces language biases over VQA v1.0, and is about twice the size of VQA v1.0.

Challenge link: https://visualqa.org/challenge
Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/163/overview
Submission Deadline: May 10, 2019 23:59:59 GMT

TextVQA: This track is the 1st challenge on the TextVQA dataset introduced in Singh et al., 2019. TextVQA requires algorithms to look at an image, read text in the image, reason about it, and answer a given question.

Challenge link: https://textvqa.org/challenge
Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/244/overview
Submission Deadline: May 27, 2019 23:59:59 GMT [Extended]

GQA: This track is the 1st challenge on the GQA dataset introduced in Hudson et al., 2019. GQA is a new dataset that focuses on real-world compositional reasoning. The dataset contains 20M image and question pairs, each of them comes with an underlying structured representation of their semantics. The dataset is complemented with a suite of new evaluation metrics to test consistency, validity and grounding. To know more about GQA, visit https://visualreasoning.org.

Challenge link: https://cs.stanford.edu/people/dorarad/gqa/challenge.html
Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/225/overview
Submission Deadline: May 15, 2019 23:59:59 GMT

Visual Dialog

The 2nd edition of the Visual Dialog Challenge will be hosted on the VisDial v1.0 dataset introduced in Das et al., CVPR 2017. The 1st edition of the Visual Dialog Challenge was organised on the VisDial v1.0 dataset at ECCV 2018. See leaderboard and analysis here. Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history (consisting of the image caption and a sequence of previous questions and answers), the agent has to answer a follow-up question in the dialog.

Challenge link: https://visualdialog.org/challenge
Evaluation Server: https://evalai.cloudcv.org/web/challenges/challenge-page/161/overview
Submission Deadline: May 18, 2019 23:59:59 GMT

IEEE Conference on Computer Vision and Pattern Recognition, 2019

Prizes

Google Cloud Platform

Invited Speakers

Alex Schwing
University of Illinois at Urbana-Champaign

Lisa Hendricks
University of California, Berkeley

Yoav Artzi
Cornell University

Layla El Asri
Microsoft Research

Christopher Manning
Stanford University

Sanja Fidler
University of Toronto / NVIDIA

Karl Moritz Hermann
Google DeepMind

(Tentative) Program (Venue: Seaside Ballroom B, Convention Center)

9:00 AM - 9:10 AM		Welcome Devi Parikh (Georgia Tech / Facebook AI Research) [Slides] [Video]
9:10 AM - 9:35 AM		Invited Talk Alex Schwing (University of Illinois at Urbana-Champaign) [Video]
9:35 AM - 10:00 AM		Invited Talk Lisa Hendricks (University of California, Berkeley) [Slides] [Video]
10:00 AM - 10:15 AM		VQA Challenge Talk (Overview, Analysis and Winner Announcement) Ayush Shrivastava (Georgia Tech) [Slides] [Video]
10:15 AM - 10:20 AM		VQA Challenge Runner-up Talk Team: MSM@MSRA Members: Bei Liu, Zhicheng Huang, Zhaoyang Zeng, Zheyu Chen and Jianlong Fu [Slides] [Video]
10:20 AM - 10:25 AM		VQA Challenge Winner Talk Team: MIL@HDU Members: Zhou Yu, Jun Yu, Yuhao Cui and Jing Li [Slides] [Video]
10:25 AM - 10:50 AM		Morning Break
10:50 AM - 11:15 AM		Invited Talk Christopher Manning (Stanford University) [Slides] [Video]
11:15 AM - 11:30 AM		GQA Challenge Talk (Overview, Analysis and Winner Announcement) Drew Hudson (Stanford University) [Slides] [Video]
11:30 AM - 11:40 AM		GQA Challenge Winner Talk Team: Kakao Brain Members: Eun-Sol Kim, Yu-Jung Heo and Woo-Young Kang [Slides] [Video]
11:40 AM - 11:55 AM		TextVQA Challenge Talk (Overview, Analysis and Winner Announcement) Amanpreet Singh (Facebook AI Research) [Slides] [Video]
11:55 AM - 12:00 PM		TextVQA Challenge Runner-up Talk Team: Team-Schwail Members: Harsh Agrawal, Jyoti Aneja, Maghav Kumar and Alex Schwing [Slides] [Video]
12:00 PM - 12:05 PM		TextVQA Challenge Winner Talk Team: DCD_ZJU Members: Yuetan Lin, Hongrui Zhao, Yanan Li and Donghui Wang [Slides] [Video]
12:05 PM - 1:35 PM		Lunch (On your own)
1:35 PM - 2:00 PM		Invited Talk Karl Moritz Hermann (Google DeepMind) [Slides] [Video]
2:00 PM - 2:25 PM		Invited Talk Layla El Asri (Microsoft Research) [Slides] [Video]
2:25 PM - 2:40 PM		Visual Dialog Challenge Talk (Overview, Analysis and Winner Announcement) Abhishek Das (Georgia Tech) [Slides] [Video]
2:40 PM - 2:50 PM	Alibaba DAMO Academy	Visual Dialog Challenge Winner Talk Team: MReaL - BDAI Members: Jiaxin Qi, Yulei Niu, Hanwang Zhang, Jianqiang Huang, Xian-Sheng Hua and Ji-Rong Wen [Slides] [Video]
2:50 PM - 4:05 PM		Poster session and Afternoon break Location: Pacific Arena Ballroom Allotted Poster Boards: #168 to #207
4:05 PM - 4:30 PM		Invited Talk Sanja Fidler (University of Toronto / NVIDIA) [Video]
4:30 PM - 4:55 PM		Invited Talk Yoav Artzi (Cornell University) [Slides] [Video]
4:55 PM - 5:40 PM		Panel: Future Directions [Video]
5:40 PM - 5:50 PM		Closing Remarks Devi Parikh (Georgia Tech / Facebook AI Research) [Slides] [Video]

Poster Presentation Instructions

The physical dimensions of the poster stands that will be available this year are 8 feet wide by 4 feet high. Please review the reference poster template for more details on how to prepare your poster. You do NOT have to use this template, but please read the instructions carefully and prepare your posters accordingly.

Submission Instructions

We invite submissions of extended abstracts of at most 2 pages describing work in areas such as: Visual Question Answering, Visual Dialog, (Textual) Question Answering, (Textual) Dialog Systems, Commonsense knowledge, Video Question Answering, Video Dialog, Vision + Language, and Vision + Language + Action (Embodied Agents). Accepted submissions will be presented as posters at the workshop. The extended abstract should follow the CVPR formatting guidelines and be emailed as a single PDF to the email id mentioned below. Please use the following LaTeX/Word templates.

LaTeX/Word Templates (zip): cvpr2019AuthorKit.zip

Dual Submissions

Where to Submit?

visualqa.workshop@gmail.com

Dates

Jan 2019 Challenge Announcement mid-May 2019 Challenge Submission May 24, 2019 Extended Workshop Paper Submission Jun 2, 2019 Notification to Authors Jun 17, 2019 Workshop