Home  | Publications | CDZ+24

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

MCML Authors

Abstract

Category-level object pose estimation, aiming to predict the 6D pose and 3D size of objects from known categories, typically struggles with large intra-class shape variation. Existing works utilizing mean shapes often fall short of cap-turing this variation. To address this issue, we present Sec-ondPose, a novel approach integrating object-specific ge-ometric features with semantic category priors from DI-NOv2. Leveraging the advantage of DINOv2 in providing SE(3)-consistent semantic features, we hierarchically extract two types of SE(3)-invariant geometric features to further encapsulate local-to-global object-specific information. These geometric features are then point-aligned with DINOv2 features to establish a consistent object represen-tation under SE(3) transformations, facilitating the map-ping from camera space to the pre-defined canonical space, thus further enhancing pose estimation. Extensive exper-iments on NOCS-REAL275 demonstrate that SecondPose achieves a 12.4% leap forward over the state-of-the-art. Moreover, on a more complex dataset HouseCat6D which provides photometrically challenging objects, SecondPose still surpasses other competitors by a large margin.

inproceedings


CVPR 2024

IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA, Jun 17-21, 2024.
Conference logo
A* Conference

Authors

Y. Chen • Y. Di • G. Zhai • F. Manhardt • C. Zhang • R. Zhang • F. TombariN. NavabB. Busam

Links

DOI

Research Areas

 B1 | Computer Vision

 C1 | Medicine

BibTeXKey: CDZ+24

Back to Top