Track description

Input to the model

Similar to Track 2, participants in this track build their models from RGBA input data. We also provide the SMPL root joint distance to the camera in the validation/test sets to avoid scale ambiguity. Participants can also make use of rich metadata during training.

The RGB+Alpha data is compressed into a .mkv video for each sequence, and participants are free to use temporal data to build their models. You can follow the instructions provided in the starting kit to uncompress the videos.
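The starting kit contains the official uncompression instructions. As a rough sketch of the idea, the frames of an RGBA .mkv can be dumped to PNG (which preserves the alpha channel) with ffmpeg; the file and directory names below are placeholders, not part of the challenge data.

```python
import subprocess  # only needed if you actually run the command

def ffmpeg_extract_cmd(video_path, out_dir):
    """Build an ffmpeg command that decodes every frame of an RGBA .mkv
    into numbered PNG files. PNG is used because it stores the alpha
    channel; the paths here are hypothetical examples."""
    return [
        "ffmpeg", "-i", video_path,
        "-pix_fmt", "rgba",          # keep the alpha channel
        f"{out_dir}/%06d.png",       # 000001.png, 000002.png, ...
    ]

cmd = ffmpeg_extract_cmd("sequence1.mkv", "frames")
# To actually run it (requires ffmpeg installed):
# subprocess.run(cmd, check=True)
```

For real submissions, prefer the starting-kit procedure, since it matches the exact encoding used by the organizers.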



The outputs of the model are the 3D reconstructed garments per frame, texture image(s) and UV map(s). Note that ground truth garments are relative to the SMPL root joint. Participants are free to predict 3D garments as a mesh, point cloud or volumetric data. However, evaluation is done on mesh format, so in the case of a point cloud or volumetric data, participants must apply additional post-processing to convert the predicted garments to mesh format.
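Since ground truth garments are relative to the SMPL root joint, predictions made in another frame (e.g. camera coordinates) need to be re-expressed before submission. A minimal sketch of the translation step, assuming the root joint position is known in the same coordinate frame as the predicted vertices (check the starting kit for the exact convention):

```python
import numpy as np

def to_root_relative(verts_cam, root_cam):
    """Express garment vertices relative to the SMPL root joint.
    verts_cam: (N, 3) predicted vertices in camera coordinates.
    root_cam:  (3,) SMPL root joint position in the same frame
               (its distance to the camera is provided in val/test
               to resolve scale ambiguity)."""
    return verts_cam - root_cam[None, :]

# Toy example with made-up coordinates
verts = np.array([[0.1, 0.2, 3.0], [0.0, 0.0, 2.9]])
root = np.array([0.0, 0.0, 3.0])
rel = to_root_relative(verts, root)
```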


Providing a submission

A submission is a zip file with the following structure:

  • <sequence 1>
    • <garment 1>.pc16    <======== a file to store vertex locations for the whole sequence
    • <garment 1>.png      <======== texture image
    • <garment 1>.obj       <======== a file to store a mesh and its UV map. Mesh must be triangulated.
    • <garment 2>.pc16
    • <garment 2>.png
    • <garment 2>.obj
  • <sequence 2>
    • ....

where "sequence i" must have the same name as in the validation/test data, and "garment j" is the garment type from the set {Top, Tshirt, Trousers, Skirt, Jumpsuit, Dress}, which must be estimated from the RGB images. As can be seen, the garment topology, number of vertices, texture and UV map are fixed across the whole sequence. Participants must use the functions provided in the starting kit to write the "obj" and "pc16" files to ensure bug-free submissions. The "obj" file contains the data (V, F, Vt, Ft), where V and F are the mesh vertices and faces, and Vt and Ft are the UV map vertices and faces. In this track we do not evaluate V, so participants can fill it with zeros. Note that the mesh data must be triangulated.
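The starting-kit writers are the authoritative way to produce these files. Purely to illustrate the (V, F, Vt, Ft) layout described above, here is a hypothetical minimal OBJ writer that fills V with zeros (since V is not evaluated in this track) and pairs each triangle corner with its UV index:

```python
import os
import tempfile

def write_obj_sketch(path, F, Vt, Ft, n_verts):
    """Hypothetical sketch of the (V, F, Vt, Ft) OBJ layout; use the
    starting-kit writer for real submissions.
    F, Ft: 0-based triangle index lists for mesh and UV map.
    V is written as zeros because it is not evaluated in this track."""
    with open(path, "w") as f:
        for _ in range(n_verts):
            f.write("v 0 0 0\n")              # placeholder positions
        for u, v in Vt:
            f.write(f"vt {u} {v}\n")          # UV coordinates
        for (a, b, c), (ta, tb, tc) in zip(F, Ft):
            # OBJ is 1-based; each corner is vertex_index/uv_index
            f.write(f"f {a+1}/{ta+1} {b+1}/{tb+1} {c+1}/{tc+1}\n")

# Toy single-triangle example
F = [(0, 1, 2)]
Vt = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Ft = [(0, 1, 2)]
path = os.path.join(tempfile.gettempdir(), "garment_sketch.obj")
write_obj_sketch(path, F, Vt, Ft, n_verts=3)
lines = open(path).read().splitlines()
```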

Important: Participants can use the whole sequence in the val/test set to predict garments, but they must save only every 10th frame in their submissions. For instance, if <sequence 1> has 300 frames, the following frames must be written in the "pc16" file: [10, 20, 30, ..., 300].
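The subsampling rule above can be sketched as follows, assuming the 1-based frame numbering implied by the [10, 20, ..., 300] example:

```python
def frames_to_save(n_frames, step=10):
    """1-based indices of the frames to store in the pc16 file:
    every 10th frame, starting at frame 10 (per the track example)."""
    return list(range(step, n_frames + 1, step))

idx = frames_to_save(300)  # [10, 20, 30, ..., 300]
```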

Click here to enter the competition



Evaluation is done once, based on the qualitative metric defined here, after the end of the final phase. However, we show the surface-to-surface metric on the leaderboard during the development and test phases. To avoid heavy evaluation processing (which can block compute workers for hours), we limit the number of vertices for each outfit to a maximum of 20K. In the evaluation code, we penalize outfits with more than 20K vertices: specifically, we ignore them and assign them a high error.
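It is cheap to verify the vertex cap locally before submitting. A minimal sketch, assuming the 20K limit applies to the total vertex count of an outfit (check the evaluation code for the exact rule):

```python
MAX_VERTICES = 20_000  # per-outfit cap stated by the organizers

def outfit_within_limit(vertex_counts):
    """Return True if an outfit passes the 20K vertex cap.
    vertex_counts: per-garment vertex counts for one outfit; summing
    them is an assumption here, not the official definition."""
    return sum(vertex_counts) <= MAX_VERTICES

ok = outfit_within_limit([8_000, 9_000])    # 17K total -> passes
bad = outfit_within_limit([15_000, 9_000])  # 24K total -> penalized
```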

