Evaluation metrics


The challenge comprises three types of evaluation, each with its own assessment mechanism:

  • Quantitative evaluation (interview recommendation). Solutions will be evaluated on their performance in predicting the interview variable in the test data. In addition, participants are expected to submit predictions of the personality traits and to use them to improve their interview predictions.
  • Qualitative evaluation (explanatory mechanisms). Participants should provide a textual description that explains the "invite for an interview" decision made on the test data. Performance will be evaluated in terms of the creativity of participants and the explanatory effectiveness of their mechanisms and interface. For this evaluation we will invite a panel of experts in psychological behaviour analysis, recruitment, machine learning, and computer vision. Since the explainability component of the challenge requires qualitative evaluation, and hence human effort, participants will be scored on a small subset of the videos. Specifically, a small subset of the validation videos and a small subset of the test videos will be systematically selected to best represent the variability of the personality-trait and invite-for-interview values in the entire dataset. The jury will evaluate only a single validation and a single test phase submission per participant. A separate jury member will serve as a tiebreaker.
  • Coopetition evaluation (code sharing). Participants will be evaluated on the usefulness of their shared code within the collaborative competition scheme.

Both competition stages will be evaluated independently, and the top 3 ranked participants at each stage will be awarded:

-1st stage of the competition (quantitative). This stage will be evaluated in terms of the regression error on the continuous "job interview recommendation" variable; a minimal metric sketch follows. The top 3 ranked participants in this quantitative stage will be awarded. A quantitative sample prediction file can be found here.
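
The exact regression-error measure is specified on the challenge platform; purely as an illustration, the sketch below assumes mean absolute error over continuous interview scores in [0, 1] (the function name and score range are our assumptions, not the official definition):

    import numpy as np

    # Illustrative only: the official regression error is defined by the
    # organizers. We assume mean absolute error over scores in [0, 1].
    def regression_error(y_true, y_pred):
        """Mean absolute error between ground-truth and predicted scores."""
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        return float(np.mean(np.abs(y_true - y_pred)))

    # Lower is better; a perfect submission scores 0.0.
    print(regression_error([0.7, 0.4], [0.65, 0.5]))  # 0.075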

-2nd stage of the competition (qualitative coopetition). All participants entering this stage HAVE to provide their code from the 1st stage. Participants in this stage may use code from any of the other participants, fuse approaches, and provide the "explainable file of the recommendation" with the results improved by this code-sharing strategy. The jury will evaluate this second stage against the following criteria, each rated on a scale from 0 (worst) to 5 (best); a hypothetical scoring sketch follows the list:

  • Clarity: Is the text understandable / written in proper English?

  • Explainability: Does the text provide relevant explanations to the hiring decision made?

  • Soundness: Are the explanations rational and, in particular, do they seem scientific and/or related to behavioral cues commonly used in psychology?

  • Model interpretability: Are the explanations useful for understanding the functioning of the predictive model?

  • Creativity: How original / creative are the explanations?
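
How these 0-5 ratings are combined into a final qualitative score is up to the jury; purely as a sketch, and assuming an unweighted mean over the five criteria and over jurors (the aggregation rule and names below are our assumptions, not the organizers'), the computation could look like this:

    CRITERIA = ["clarity", "explainability", "soundness",
                "model interpretability", "creativity"]

    def qualitative_score(jury_ratings):
        """jury_ratings: one dict per juror, mapping each criterion to 0-5."""
        per_juror = [sum(r[c] for c in CRITERIA) / len(CRITERIA)
                     for r in jury_ratings]
        # Unweighted mean over jurors; per the rules above, ties are
        # broken by a separate jury member rather than by this formula.
        return sum(per_juror) / len(per_juror)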


Participants should submit 10 video descriptions in the qualitative validation phase. To make sure this is a representative sample, we used the following procedure (sketched in code after the list):

- We clustered the validation videos into 10 clusters based on the interview annotation as well as the five trait annotations.

- From each cluster, we picked the example that is closest to the center of that cluster.
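
The clustering method is not specified beyond the description above; the following is a minimal sketch of such a selection, assuming k-means over the six annotation values (interview plus the five traits) and a hypothetical select_representatives helper:

    import numpy as np
    from sklearn.cluster import KMeans

    def select_representatives(annotations, n_clusters=10, seed=0):
        """annotations: dict mapping video filename to its 6 scores
        (interview + five traits). Returns one video per cluster."""
        names = list(annotations)
        X = np.array([annotations[n] for n in names], dtype=float)
        km = KMeans(n_clusters=n_clusters, random_state=seed).fit(X)
        reps = []
        for c in range(n_clusters):
            members = np.where(km.labels_ == c)[0]
            # Pick the member closest to the cluster centroid.
            dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
            reps.append(names[members[np.argmin(dists)]])
        return reps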

Therefore, the following videos are considered in the evaluation of the validation set:

cT3oyHhUznw.000.mp4
sHVXhr7_EOs.000.mp4
ax8wm9K41og.002.mp4
B2riMsP8LD8.002.mp4
kSk-rf7a1Ig.004.mp4
o2wtRccAgjE.005.mp4
DVh_7dO2cWY.001.mp4
2SzC9dm4Yy4.001.mp4
7fOxteINSUg.002.mp4
EvZ0esZgPK4.005.mp4


For the test phase, participants have to submit predictions and explanations (as in the development stage) for the whole test set. The jury will evaluate a subset of those videos, selected according to variability criteria, to compute the final score.

News


January 10: CVPR 2017 competition started

The competition on explainable impressions has started, and participants can enter the competition through CodaLab here.