Creating interactive VR content typically requires extensive modeling effort, making scalable production difficult. We present an automated computational pipeline that converts a single image into an interactive 3D scene with plausible kinesthetic feedback. Our approach leverages a Large Language Model (LLM) to detect objects in the image and infer their haptic properties from visual and textual cues. The extracted results are then used to synthesize 3D models and to optimize the haptic properties for perceptual distinguishability, and the models and properties are combined into a complete interactive environment. A user study shows that the generated VR scenes provide compelling visuo-haptic experiences, highlighting the potential of our method for scalable multisensory world generation. To our knowledge, this is the first system to automatically produce VR scenes with force feedback from a single image, pointing toward a practical way to lower the barriers to creating haptically enriched VR content.
Jaejun Park
Soyeon Nam
Jeongwoo Kim
IEEE Transactions on Visualization and Computer Graphics
Pohang University of Science and Technology
www.synapsesocial.com/papers/69d893406c1944d70ce044a9 — DOI: https://doi.org/10.1109/tvcg.2026.3680620