The dataset contains the following directories:
- Train1: for training single-frame pose recovery. It contains 16-bit raw depth images, segmentation images labeled from 1 to 25 (0 for background), and txt files with the joint locations. The format of the txt files is as follows (a parsing sketch follows this list):
-1 <finger 1 status> <finger 2 status> <finger 3 status> <finger 4 status> <finger 5 status>
0 X Y Z U V
...
19 X Y Z U V
The finger status is a continuous value indicating how open the finger is. X, Y, and Z are the 3D joint locations along each axis in Blender coordinates (in meters), and U and V are the joint's image-plane coordinates.
- Train2: for training temporal pose recovery; it contains only txt files, in the same format as above. In this set, the global view (i.e., the palm joints) is fixed as a reference across all frames. The sequences were recorded with different finger deformations and movement speeds.
- Test: for evaluating the algorithm.
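For reference, here is a minimal MATLAB sketch for parsing one of these txt files. It assumes exactly the layout described above (a header line followed by one line per joint, 0 through 19); the filename and the variable names are placeholders of mine, not part of the dataset.
fid = fopen('frame_0001.txt', 'r'); % placeholder filename
header = sscanf(fgetl(fid), '%f'); % "-1 <finger 1 status> ... <finger 5 status>"
finger_status = header(2:end); % five continuous openness values
joints = fscanf(fid, '%f', [6, Inf])'; % one row per joint: id X Y Z U V
fclose(fid);
joint_id = joints(:, 1); % 0 .. 19
XYZ = joints(:, 2:4); % 3D joint locations in Blender coordinates (meters)
UV = joints(:, 5:6); % image-plane coordinates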
We fitted coefficients that convert the 16-bit raw depth images to real depth values, and a bias that aligns the depth values with the Kinect v2 camera:
a = 1.537713182783716e+03; % quadratic coefficient
b = -1.958992153351393e+02; % linear coefficient
c = 5.456122487076314e+02 - 10; % constant offset
kinect2 = -160; % bias (mm) aligning the depth values with the Kinect v2 camera
dim(pixel_idx) = kinect2 + c + b * dim(pixel_idx) / 65535 + a * (dim(pixel_idx) / 65535) .^ 2; % dim is a 16-bit raw depth image, normalized to [0, 1] before the quadratic mapping
Z = Z * 1000 + kinect2; % joint Z is in meters; multiply by 1000 to convert to mm, then apply the same bias
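As an illustration, this is how the conversion above might be applied to a whole image in one step; the placeholder filename and the assumption that the raw depth is read with imread are mine:
raw = double(imread('depth_0001.png')) / 65535; % placeholder filename; normalize the 16-bit values to [0, 1]
depth_mm = kinect2 + c + b * raw + a * raw .^ 2; % element-wise quadratic mapping to depth in mm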
We use the Kinect v2 camera intrinsics to reconstruct the point cloud:
intrinsics = [519.30   0      334.00;
              0      516.60   236.00;
              0        0        1.00];
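A minimal back-projection sketch using these intrinsics and the standard pinhole model; depth_mm is the converted depth image from the sketch above, and the variable names are placeholders:
fx = intrinsics(1, 1); fy = intrinsics(2, 2); % focal lengths (pixels)
cx = intrinsics(1, 3); cy = intrinsics(2, 3); % principal point (pixels)
[U, V] = meshgrid(1:size(depth_mm, 2), 1:size(depth_mm, 1)); % pixel grid
Xc = (U - cx) .* depth_mm / fx; % camera-frame X (mm)
Yc = (V - cy) .* depth_mm / fy; % camera-frame Y (mm)
points = [Xc(:), Yc(:), depth_mm(:)]; % N-by-3 point cloud in mm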
The dataset is available for download here (~6 GB).