This paper presents a multimodal deep learning approach for precise 3D reconstruction of mechanical parts from images and textual descriptions, offering a cost-effective alternative to traditional methods. By combining high-resolution, multi-angle images with technical text data, the model generates accurate 3D representations. A ResNet-based CNN extracts visual features, while BERT encodes textual descriptions, with a depth estimation module enhancing spatial accuracy. The features are fused to produce a 3D point cloud and mesh. The results demonstrate good performance in capturing the overall shape of the mechanical parts; however, further improvements are needed to enhance the precision of the metric parameters.
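As a rough illustration of the fusion step described above, the following minimal sketch concatenates three feature vectors and maps them to a small 3D point cloud with a linear decoder. It is not the authors' implementation. The abstract does not state the fusion operator, the feature dimensions, or the decoder design, so concatenation, the vector sizes, the untrained random weights, and the function names `fuse_features`, `make_decoder`, and `decode_point_cloud` are all assumptions made for illustration. The placeholder vectors stand in for the outputs of the ResNet-based CNN, BERT, and the depth estimation module.

```python
import random


def fuse_features(image_feat, text_feat, depth_feat):
    """Fuse modalities by concatenation (assumed; the source does not name the operator)."""
    return list(image_feat) + list(text_feat) + list(depth_feat)


def make_decoder(in_dim, num_points, seed=0):
    """Build untrained random linear weights with num_points * 3 output rows (hypothetical decoder)."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
        for _ in range(num_points * 3)
    ]


def decode_point_cloud(fused, weights):
    """Apply the linear map, then group the flat output into (x, y, z) tuples."""
    flat = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Placeholder features; real ones would come from the visual, text, and depth encoders.
image_feat = [0.5] * 8   # stand-in for ResNet visual features
text_feat = [0.1] * 4    # stand-in for BERT text features
depth_feat = [0.3] * 2   # stand-in for depth-module features

fused = fuse_features(image_feat, text_feat, depth_feat)
weights = make_decoder(len(fused), num_points=16)
points = decode_point_cloud(fused, weights)
```

In a trained system the decoder weights would be learned from data, and a separate surface reconstruction step would turn the point cloud into a mesh. Neither of those steps is shown here.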
Dridi et al. studied this question.
www.synapsesocial.com/papers/69843405f1d9ada3c1fb1a09 — DOI: https://doi.org/10.1051/matecconf/202541404007/pdf
Issam Dridi
Taher Haddad
Noureddine Ben Yahia