Because the released gesture datasets contain few training samples, they are hard to apply to real applications. Therefore, we have built a large-scale gesture dataset: the Chalearn LAP RGB-D Continuous Gesture Dataset (ConGD). The challenge focuses on "large-scale" and "user-independent" learning: each gesture class has more than 200 RGB and depth videos, and training samples from the same person do not appear in the validation and testing sets. The Chalearn LAP ConGD dataset is derived from the Chalearn Gesture Dataset (CGD), which was designed for "one-shot learning". The CGD dataset contains more than 54,000 gestures in total, split into subtasks. To reuse it, we obtained 249 gesture labels and manually annotated the temporal segmentation, that is, the start and end frames of each gesture in the continuous videos from the CGD dataset.
This database includes 47,933 RGB-D gestures in 22,535 RGB-D gesture videos (about 4 GB). Each RGB-D video may contain one or more gestures, and there are 249 gesture labels performed by 21 different individuals.
The database has been divided into three mutually exclusive subsets for convenience.
Three .mat files ship with this database: train.mat, valid.mat and test.mat.
train.mat ==> Training Set. A structure array that includes train.video_name for the RGB-D video names, train.label for the label information, and train.temproal_segment for the start and end points of each gesture in the continuous videos.
valid.mat ==> Validation Set. A structure array that includes valid.video_name for the RGB-D video names, valid.label for the label information (an empty cell array), and valid.temproal_segment for the start and end points of each gesture in the continuous videos (an empty cell array).
test.mat ==> Testing Set. A structure array that includes test.video_name for the RGB-D video names, test.label for the label information (an empty cell array), and test.temproal_segment for the start and end points of each gesture in the continuous videos (an empty cell array).
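The fields described above could be used as in the following minimal Python sketch. It makes several assumptions that this description does not confirm: that the records have already been read into plain Python lists (for example with scipy.io.loadmat), that each temporal segment is a (start, end) pair of 1-based, inclusive frame indices, and that 0 can stand for "no gesture". The helper names frame_labels and shared_videos, and the sample values, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: a segment is (start_frame, end_frame), 1-based and inclusive.
Segment = Tuple[int, int]


def frame_labels(num_frames: int,
                 labels: Sequence[int],
                 segments: Sequence[Segment],
                 background: int = 0) -> List[int]:
    """Expand per-gesture segments into one label per video frame."""
    if len(labels) != len(segments):
        raise ValueError("labels and segments must have the same length")
    per_frame = [background] * num_frames
    for label, (start, end) in zip(labels, segments):
        if not 1 <= start <= end <= num_frames:
            raise ValueError(f"segment ({start}, {end}) outside 1..{num_frames}")
        for frame in range(start, end + 1):
            per_frame[frame - 1] = label  # convert 1-based frame to list index
    return per_frame


def shared_videos(splits: Dict[str, Sequence[str]]) -> Dict[Tuple[str, str], List[str]]:
    """Return video names found in more than one split (empty dict if disjoint)."""
    names = sorted(splits)
    overlaps: Dict[Tuple[str, str], List[str]] = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = sorted(set(splits[first]) & set(splits[second]))
            if common:
                overlaps[(first, second)] = common
    return overlaps


# Hypothetical records, not taken from the dataset: a 10-frame video
# containing two gestures with labels 5 and 12.
print(frame_labels(10, labels=[5, 12], segments=[(1, 4), (6, 10)]))
# → [5, 5, 5, 5, 0, 12, 12, 12, 12, 12]

# Hypothetical video names; an empty result means the splits are disjoint.
print(shared_videos({"train": ["a"], "valid": ["b"], "test": ["c"]}))
# → {}
```

For valid.mat and test.mat the label and segment fields are empty, so frame_labels only applies to the training set, while shared_videos can be run on the video_name field of all three subsets.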