Multi-scale feature integration remains a critical challenge in point cloud processing. Existing encoder-decoder frameworks extract multi-scale features through sequential downsampling and upsampling, but they fuse those features only late in the network, which limits feature interaction and degrades spatial accuracy. Inspired by high-resolution networks in 2D vision, we propose HRFN3D, a novel architecture that maintains parallel high-to-low resolution streams throughout the network. Our key innovation is early-stage feature fusion through two core components: a Local Sequence Operator that combines grouped vector attention with KNN-based local aggregation, and an Early Upward Feature Fusion module that enables cross-resolution interaction via learnable upsampling. Furthermore, we introduce a Channel-Spatial Attention Module to enhance global context modeling. Extensive experiments demonstrate state-of-the-art performance: HRFN3D achieves 86.5% instance mean Intersection over Union (mIoU) on ShapeNetPart and 91.5% class average accuracy on ModelNet40. The code will be released upon publication.
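To make the notion of KNN-based local aggregation concrete, the following is a minimal NumPy sketch and not the HRFN3D implementation. It assumes brute-force nearest-neighbor search and max pooling as the aggregation function; the function names `knn_indices` and `knn_max_aggregate` are hypothetical, and the grouped vector attention that the Local Sequence Operator combines with this step is omitted.

```python
import numpy as np

def knn_indices(points, k):
    """Return the indices of the k nearest neighbors of every point.

    points: (N, 3) array of coordinates. Each point is its own nearest
    neighbor (distance 0), so it is included in the result.
    Brute-force O(N^2) search, chosen here only for clarity.
    """
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3)
    dist2 = np.sum(diff * diff, axis=-1)             # (N, N) squared distances
    return np.argsort(dist2, axis=1)[:, :k]          # (N, k)

def knn_max_aggregate(points, feats, k):
    """Max-pool per-point features over each point's k-nearest neighborhood.

    Max pooling is an assumed aggregation choice for illustration.
    feats: (N, C) array; returns an (N, C) array.
    """
    idx = knn_indices(points, k)       # (N, k)
    neighbor_feats = feats[idx]        # (N, k, C) gathered neighbor features
    return neighbor_feats.max(axis=1)  # (N, C)

# Toy example: three points on a line, one feature channel each.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
feats = np.array([[1.0], [2.0], [5.0]])
aggregated = knn_max_aggregate(points, feats, k=2)
```

In this toy example each point pools over itself and its single closest neighbor, so the isolated third point keeps its own feature value while the first point picks up the larger value of its neighbor.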
Zhu et al. (Sat,) studied this question.