Despite the significant progress that depth-based three-dimensional (3D) hand pose estimation methods have made in recent years, thanks to advancements in Deep Learning, they still require a large amount of labeled training data to achieve high accuracy. However, collecting such data is both costly and time-consuming. To tackle this issue, we propose a semi-supervised Deep Learning method to significantly reduce the dependence on labeled training data. The proposed method consists of two identical networks trained jointly: a teacher network and a student network. The teacher network is trained using both the available labeled and unlabeled samples. It leverages the unlabeled samples via a loss formulation that encourages estimation equivariance under a set of affine transformations. The student network is trained using the unlabeled samples with their pseudo-labels provided by the teacher network. For inference at test time, only the student network is used. Extensive experiments on challenging benchmarks (ICVL, NYU, MSRA) demonstrate the proposed method’s effectiveness. Notably, our approach significantly outperforms state-of-the-art semi-supervised methods across all datasets. Crucially, using only 25% of the available labeled data, our method achieves accuracy comparable to, and sometimes exceeding, fully supervised state-of-the-art methods trained on 100% of the labels. Even with just 1% of labels, our method surpasses prior semi-supervised techniques, achieving mean distance errors of 6.94 mm and 8.71 mm on ICVL and NYU, respectively. These results signify a substantial reduction in annotation requirements, making high-accuracy 3D hand pose estimation more practical and accessible.
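The equivariance idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' loss or network: it assumes a 2D point set as a stand-in for a depth image, two toy estimators in place of the teacher network, and hypothetical helper names (`affine_matrix`, `apply_affine`, `equivariance_loss`). It only shows the general principle that, for an affine transform T and an estimator f, the prediction on the transformed input, f(T(x)), should match the transformed prediction, T(f(x)), and that this check needs no labels.

```python
import numpy as np

def affine_matrix(angle_rad, scale, translation):
    """Build a 2x3 in-plane affine transform (rotation, uniform scale, shift)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    linear = scale * np.array([[c, -s], [s, c]])
    shift = np.asarray(translation, dtype=float).reshape(2, 1)
    return np.hstack([linear, shift])

def apply_affine(A, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    return points @ A[:, :2].T + A[:, 2]

def equivariance_loss(estimator, points, A):
    """Mean squared distance between f(T(x)) and T(f(x)); no labels needed."""
    transform_then_predict = estimator(apply_affine(A, points))
    predict_then_transform = apply_affine(A, estimator(points))
    diff = transform_then_predict - predict_then_transform
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# Toy estimators (assumptions for illustration, not real pose networks).
def centroid_estimator(points):
    """Returns one 'joint' at the centroid; equivariant under affine maps."""
    return points.mean(axis=0, keepdims=True)

def biased_estimator(points):
    """Centroid plus a fixed offset; breaks equivariance under rotation/scale."""
    return points.mean(axis=0, keepdims=True) + np.array([[1.0, 0.0]])

rng = np.random.default_rng(0)
points = rng.normal(size=(50, 2))          # unlabeled toy sample
A = affine_matrix(np.deg2rad(30.0), 1.2, [0.5, -0.3])

loss_equivariant = equivariance_loss(centroid_estimator, points, A)
loss_biased = equivariance_loss(biased_estimator, points, A)
```

In this toy setting the centroid estimator yields a loss at floating-point zero, while the biased estimator is penalized. In a semi-supervised setup, a term of this kind could be added to the supervised loss so that unlabeled samples still provide a training signal.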
Rezaei et al. studied this question.
www.synapsesocial.com/papers/6987eb5df6bacdd2fe8fca6c — DOI: https://doi.org/10.1007/s11042-026-21305-7
Mohammad Rezaei
Farnaz Farahanipad
Alex Dillhoff
Multimedia Tools and Applications