This paper presents a multimodal deep learning approach for precise 3D reconstruction of mechanical parts from images and textual descriptions, offering a cost-effective alternative to traditional methods. By combining high-resolution, multi-angle images with technical text data, the model generates accurate 3D representations. A ResNet-based CNN extracts visual features, while BERT encodes textual descriptions, with a depth estimation module enhancing spatial accuracy. The features are fused to produce a 3D point cloud and mesh. The results demonstrate good performance in capturing the overall shape of the mechanical parts; however, further improvements are needed to enhance the precision of the metric parameters.
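The fusion step described above can be illustrated with a minimal sketch. It assumes simple concatenation of the visual, textual, and depth feature vectors followed by a single linear projection to point coordinates; the abstract does not state the actual fusion mechanism or decoder. The feature vectors are random stand-ins for ResNet, BERT, and depth-module outputs, and all names and dimensions (`fuse_and_decode`, `VISUAL_DIM`, and so on) are hypothetical.

```python
import random


def init_linear(in_dim, out_dim, rng):
    """Create weights and bias for a dense layer (hypothetical helper)."""
    scale = 1.0 / in_dim ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, weights, bias):
    """Apply y = W x + b using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fuse_and_decode(visual, text, depth, weights, bias, num_points):
    """Concatenate the three feature vectors and project to (x, y, z) points.

    Assumption: concatenation fusion with a linear decoder; the paper's
    actual fusion and decoding layers are not specified in the abstract.
    """
    fused = visual + text + depth          # list concatenation
    flat = linear(fused, weights, bias)    # length 3 * num_points
    return [tuple(flat[3 * i:3 * i + 3]) for i in range(num_points)]


# Placeholder sizes, far smaller than real ResNet or BERT embeddings.
VISUAL_DIM, TEXT_DIM, DEPTH_DIM, NUM_POINTS = 8, 6, 4, 5

rng = random.Random(0)
visual = [rng.random() for _ in range(VISUAL_DIM)]   # stand-in for CNN features
text = [rng.random() for _ in range(TEXT_DIM)]       # stand-in for text features
depth = [rng.random() for _ in range(DEPTH_DIM)]     # stand-in for depth features

weights, bias = init_linear(VISUAL_DIM + TEXT_DIM + DEPTH_DIM,
                            3 * NUM_POINTS, rng)
points = fuse_and_decode(visual, text, depth, weights, bias, NUM_POINTS)
print(len(points), "points, each with", len(points[0]), "coordinates")
```

In a trained model the projection weights would be learned against ground-truth geometry; here they are random, so the output only demonstrates the data flow from fused features to a point cloud.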
Issam Dridi
Taher Haddad
Noureddine Ben Yahia
MATEC Web of Conferences
Dridi et al. studied this question.
www.synapsesocial.com/papers/68e040f3a99c246f578b3820 — DOI: https://doi.org/10.1051/matecconf/202541404007