Hyperbolic Cosine Transformer for LiDAR 3D Object Detection
Abstract:
Recently, the Transformer has achieved great success in computer vision. However, its use in 3D object detection is constrained because its space and time complexity grows quadratically with the large number of points. Previous point-wise methods suffer from high time consumption and limited receptive fields for capturing information among points. To address these limitations, we propose cosh-attention, which reduces the space and time complexity from quadratic to linear in the number of points. In cosh-attention, the traditional softmax operator is replaced by a non-negative ReLU activation and a hyperbolic-cosine-based operator with a re-weighting mechanism. Based on cosh-attention, we present a two-stage hyperbolic cosine transformer (ChTR3D) for 3D object detection from point clouds. It refines proposals by applying cosh-attention, at linear computational complexity, to encode rich contextual relationships among points. Extensive experiments on the widely used KITTI dataset demonstrate that, compared with vanilla attention, cosh-attention significantly improves inference speed while maintaining competitive performance. Experimental results show that, among state-of-the-art two-stage methods that use point-level features to refine proposals, the proposed ChTR3D is the fastest.
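The abstract does not give the cosh-attention formula, so the following is only a minimal sketch of how a ReLU feature map combined with a hyperbolic-cosine re-weighting can be evaluated in linear time. It assumes, purely for illustration, that the weight between points i and j is relu(q_i)·relu(k_j)·cosh((i − j)/scale) with row normalisation. The index-based re-weighting, the `scale` parameter, and both function names are hypothetical and may differ from the actual ChTR3D design. The key step is the identity cosh(a − b) = cosh(a)cosh(b) − sinh(a)sinh(b), which lets the key-value sums be accumulated once and reused for every query.

```python
import math

EPS = 1e-6  # guards against division by zero when all weights vanish


def relu(x):
    return max(0.0, x)


def quadratic_attention(Q, K, V, scale):
    """Reference version: builds every pairwise weight, O(N^2) in the number of points."""
    n, dv = len(Q), len(V[0])
    out = []
    for i in range(n):
        weights = []
        for j in range(n):
            sim = sum(relu(a) * relu(b) for a, b in zip(Q[i], K[j]))
            weights.append(sim * math.cosh((i - j) / scale))  # assumed re-weighting
        z = sum(weights) + EPS
        out.append([sum(w * V[j][c] for j, w in enumerate(weights)) / z
                    for c in range(dv)])
    return out


def linear_attention(Q, K, V, scale):
    """Same result without the N x N matrix, O(N * d * dv).

    Uses cosh(a - b) = cosh(a)cosh(b) - sinh(a)sinh(b) to split the
    re-weighting into a query-side factor and a key-side factor.
    """
    n, d, dv = len(Q), len(Q[0]), len(V[0])
    kv_c = [[0.0] * dv for _ in range(d)]  # sum_j relu(k_j) cosh(j/scale) v_j^T
    kv_s = [[0.0] * dv for _ in range(d)]  # sum_j relu(k_j) sinh(j/scale) v_j^T
    k_c = [0.0] * d                        # sum_j relu(k_j) cosh(j/scale)
    k_s = [0.0] * d                        # sum_j relu(k_j) sinh(j/scale)
    for j in range(n):
        c, s = math.cosh(j / scale), math.sinh(j / scale)
        for a in range(d):
            ka = relu(K[j][a])
            k_c[a] += ka * c
            k_s[a] += ka * s
            for b in range(dv):
                kv_c[a][b] += ka * c * V[j][b]
                kv_s[a][b] += ka * s * V[j][b]
    out = []
    for i in range(n):
        c, s = math.cosh(i / scale), math.sinh(i / scale)
        q = [relu(x) for x in Q[i]]
        z = sum(q[a] * (c * k_c[a] - s * k_s[a]) for a in range(d)) + EPS
        out.append([sum(q[a] * (c * kv_c[a][b] - s * kv_s[a][b])
                        for a in range(d)) / z
                    for b in range(dv)])
    return out
```

Because ReLU outputs are non-negative and cosh is always at least 1, every weight in this sketch is non-negative, so the row normalisation is well defined without softmax. Choosing `scale` on the order of the number of points keeps the cosh and sinh factors small enough to avoid overflow.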
Keywords:
Project Supported:
The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan)