ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis
Summary
Estimating 3D hand-object poses from a single image is very difficult, and many datasets lack the diversity of poses and viewpoints needed to learn these configurations. ArtiBoost is a method that aims to estimate varied hand-object poses across multiple orientations by sampling from a Composited hand-object Configuration and Viewpoint space (CCV-space).
Objective
The objective of the research is efficient hand-object pose estimation (HOPE). A human hand has around 21 degrees of freedom (DoF), which makes hand-object poses very difficult to estimate. ArtiBoost is an online data enrichment method that aims to boost articulated hand-object pose estimation via exploration and synthesis.
Method
First, a discrete space called the CCV-space is constructed, with object types, hand poses, and viewpoints as its axes. In the exploration step, ArtiBoost samples hand-object-viewpoint triplets from the CCV-space. In the synthesis step, the hand and object of each triplet are rendered into a synthetic image from the sampled viewpoint. These synthetic images are then mixed with real-world source images in training batches for the HOPE model, and the training losses are fed back to the exploration step, as sketched below.
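The paper's exact sampling and re-weighting rules are not reproduced here; the following is a minimal Python sketch of the explore-synthesize-train feedback loop described above, under stated assumptions. The grid sizes, the weight-update rule, and the helpers `render_triplet` and `hope_model_loss` are all hypothetical placeholders for the renderer and the HOPE model's per-sample loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretization of the CCV-space: every
# (hand pose, object, viewpoint) triplet gets one cell.
N_POSES, N_OBJECTS, N_VIEWS = 100, 10, 50
n_cells = N_POSES * N_OBJECTS * N_VIEWS

# Exploration state: one sampling weight per CCV-space cell.
weights = np.ones(n_cells)

def render_triplet(cell_id):
    """Placeholder for the synthesis step: render the hand-object
    pair of this cell from its viewpoint into a synthetic image."""
    return {"cell": cell_id}  # stands in for an (image, label) pair

def hope_model_loss(sample):
    """Placeholder for one HOPE training step on one sample,
    returning that sample's loss."""
    return rng.random()

for step in range(1000):
    # Exploration: sample a batch of triplets, harder cells more often.
    probs = weights / weights.sum()
    batch_cells = rng.choice(n_cells, size=32, p=probs)

    # Synthesis + training: render each triplet, train, collect losses.
    # (In ArtiBoost these synthetic samples are mixed with real images.)
    losses = np.array([hope_model_loss(render_triplet(c)) for c in batch_cells])

    # Feedback: raise the weight of high-loss (hard) cells so the
    # explorer revisits them; low-loss cells decay in weight.
    weights[batch_cells] = 0.9 * weights[batch_cells] + 0.1 * losses
```

In this sketch the feedback is a simple exponential moving average of per-cell loss; the actual exploration policy in the paper may differ.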
Result
The results of the experiments conducted indicated that a model trained on 10% of the real-world poses (0.1 N) together with a full set (N) of synthetic poses achieved the highest accuracy among the data mixtures tested. One limitation is that the synthetic renderings are not photorealistic; however, the results suggest that diversity of poses helps train the model more than the visual realism of the hand-object inputs. A sketch of such a data mixture follows.
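To make the reported mixing ratio concrete, here is a hedged plain-Python sketch of composing a training pool from 10% of the real data plus all of the synthetic data. The `real_samples` and `synthetic_samples` lists are hypothetical stand-ins for the actual datasets.

```python
import random

random.seed(0)

# Hypothetical datasets: lists of (image, pose-label) pairs.
real_samples = [("real_img_%d" % i, "label") for i in range(10_000)]
synthetic_samples = [("synth_img_%d" % i, "label") for i in range(10_000)]

# Keep only 10% of the real-world poses, as in the reported experiment.
real_subset = random.sample(real_samples, k=len(real_samples) // 10)

# Train on the union; batches drawn from this pool mix both sources.
train_pool = real_subset + synthetic_samples
random.shuffle(train_pool)
```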
Inference
WARG’s video data could be pipelined into ArtiBoost to create a CCV-space that would help estimate hand-object poses. This could be useful for various applications, including hand gesture recognition. However, this model may not work when the UAV is in flight far from the object. Furthermore, the paper does not mention the computational time or resources needed to employ the model.