May 24, 2024, Open Access

LM4LV: A Frozen Large Language Model for Low-level Vision Tasks

Abstract

The success of large language models (LLMs) has fostered a new research trend of multi-modality large language models (MLLMs), which is changing the paradigm of various fields in computer vision. Though MLLMs have shown promising results in numerous high-level vision and vision-language tasks such as VQA and text-to-image, no work has demonstrated how low-level vision tasks can benefit from MLLMs. We find that most current MLLMs are blind to low-level features due to the design of their vision modules, and are thus inherently incapable of solving low-level vision tasks. In this work, we propose LM4LV, a framework that enables a FROZEN LLM to solve a range of low-level vision tasks without any multi-modal data or prior. This showcases the LLM's strong potential in low-level vision and bridges the gap between MLLMs and low-level vision tasks. We hope this work can inspire new perspectives on LLMs and a deeper understanding of their mechanisms.

Cite this study

Zheng et al. (2024) studied this question.

www.synapsesocial.com/papers/68e68aacb6db64358761232e — DOI: https://doi.org/10.48550/arxiv.2405.15734

Authors

Boyang Zheng

Jinjin Gu

Shijun Li
