For the evaluation process, the output from your system should be a real-valued confidence so that a precision/recall curve can be drawn. Greater confidence values signify greater confidence that the image belongs to the class of interest. The classification task will be judged by the precision/recall curve, and the principal quantitative measure will be the average precision (AP). The AP is the area under this curve, computed by numerical integration.
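As an illustration of the measure described above, the following is a minimal sketch of an AP computation. The integration scheme used by the official script is not stated here, so the rectangle rule below (summing precision times the increase in recall at each rank) is an assumption, and the function name `average_precision` is hypothetical.

```python
def average_precision(confidences, labels):
    """Approximate the area under the precision/recall curve.

    confidences: list of real-valued scores (higher = more confident).
    labels: list of 0/1 ground-truth flags, same length as confidences.
    Assumption: rectangle-rule integration; the official script may differ.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0  # no positive images, so the curve is undefined; return 0

    # Rank images from most to least confident.
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)

    true_pos = 0
    prev_recall = 0.0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            true_pos += 1
        precision = true_pos / rank
        recall = true_pos / n_pos
        # Add the strip of the curve between the previous and current recall.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

For example, `average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])` ranks the two positive images first and third, which gives an AP of 5/6 under this scheme.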
Participants must submit their predictions in the following format. In this track, participants should submit one text file per category, e.g. 'La_Tomatina'. Each line should contain a single image identifier and the confidence output by the classifier, separated by a space, for example:
La_Tomatina.txt:
...
002234.jpg 0.056313
010256.jpg 0.127031
010987.jpg 0.287153
...
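The file format above can be produced and checked with a few lines of Python. This is only a sketch: the helper names `write_predictions` and `read_predictions`, the six-decimal formatting, and the sorted line order are assumptions for illustration, not requirements stated by the organizers.

```python
import os


def write_predictions(category, scores, out_dir="."):
    """Write '<category>.txt' with one 'identifier confidence' pair per line.

    scores: dict mapping an image identifier (e.g. '002234.jpg') to a float.
    Returns the path of the file written.
    """
    path = os.path.join(out_dir, category + ".txt")
    with open(path, "w") as f:
        for image_id in sorted(scores):
            # Single space between identifier and confidence.
            f.write("%s %.6f\n" % (image_id, scores[image_id]))
    return path


def read_predictions(path):
    """Parse a prediction file back into a dict, checking the line format."""
    scores = {}
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.split()
            if len(parts) != 2:
                raise ValueError("line %d: expected 'identifier confidence'"
                                 % line_no)
            scores[parts[0]] = float(parts[1])
    return scores
```

Calling `write_predictions("La_Tomatina", {"002234.jpg": 0.056313, "010256.jpg": 0.127031})` would create `La_Tomatina.txt` in the current directory; reading it back with `read_predictions` is a quick way to check a file before submitting it.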
Data access and evaluation scripts:
For this track, we provide a file, evaluate.py, which contains the methods for data access and evaluation. This script computes the final mean average precision (mAP) over all the categories. To obtain the mAP in the same way as on the Codalab platform, go to the main directory and run the following command:
>> python program/evaluate.py input output