Track description


The focus of this track is to estimate the future 2D facial landmarks, hand, and upper body pose of a target individual in a dyadic interaction over 2 seconds (50 frames), given an observed time window of at least 4 seconds of both interlocutors, captured from two individual views. Participants are expected to exploit contextual information that may affect how individuals behave. The labels for this track are automatically generated, i.e., treated as soft labels, obtained with state-of-the-art methods for 3D facial landmark, hand, and upper body pose estimation. We therefore assume the training data may contain noisy labels due to occasional failures of these estimation methods. However, we will manually clean and correct erroneous automatic annotations in the validation and test sets to provide a fair evaluation. Challenge participants can also use context information and utterance-level transcriptions for this track.
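To make the input/output format of the task concrete, below is a minimal zero-velocity baseline sketch: it simply repeats the last observed frame for the 50-frame prediction horizon. The keypoint count (70) and array layout are hypothetical illustrations, not the challenge's actual data format; the 25 fps rate is inferred from "2 seconds (50 frames)".

```python
import numpy as np

def zero_velocity_baseline(observed, future_len=50):
    """Predict future poses by repeating the last observed frame.

    observed: array of shape (T_obs, K, 2) -- T_obs observed frames,
              K 2D keypoints (hypothetical layout combining facial
              landmarks, hand, and upper body joints).
    Returns an array of shape (future_len, K, 2).
    """
    last = observed[-1]                      # last observed frame, (K, 2)
    return np.repeat(last[None], future_len, axis=0)

# Example: 100 observed frames (4 s at an assumed 25 fps), 70 keypoints
obs = np.random.rand(100, 70, 2)
pred = zero_velocity_baseline(obs)           # shape: (50, 70, 2)
```

Such a static baseline is a common lower bound in pose-forecasting benchmarks; any learned model exploiting the interlocutor's behavior and context should improve on it.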

News


DYAD@ICCV2021 Dataset access rules updated

It is now possible to request dataset access using a digital certificate! Please check the updated instructions here.

DYAD@ICCV2021 Validation set released

The masked-out validation set is now available for download.

DYAD@ICCV2021 Dataset access rules

The dataset access rules have been updated! Please check them out here.

ICCV 2021 Challenge

The ChaLearn Looking at People Understanding Social Behavior in Dyadic and Small Group Interactions Challenge webpage has been released.