June 1, 2023

ImageBind: One Embedding Space To Bind Them All

Abstract

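We present ImageBind, an approach for learning a joint embedding across six different modalities: images, text, audio, depth, thermal, and IMU data. We show that training such a joint embedding does not require paired data for every combination of modalities; image-paired data alone is sufficient to bind the modalities together. ImageBind can leverage recent large-scale vision-language models and extends their zero-shot capabilities to new modalities simply by using their natural pairing with images. This enables novel emergent applications out of the box, including cross-modal retrieval, composing modalities with arithmetic, and cross-modal detection and generation. These emergent capabilities improve with the strength of the image encoder, and we set a new state of the art on emergent zero-shot recognition tasks across modalities, outperforming specialist supervised models. Finally, we show strong few-shot recognition results that outperform prior work, and that ImageBind serves as a new way to evaluate vision models for visual and non-visual tasks.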
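To make the ideas of cross-modal retrieval and composing modalities with arithmetic more concrete, here is a minimal sketch in plain Python. It is not the authors' code and does not use any ImageBind API. The 3-dimensional vectors, the gallery names, and the helper functions `normalize`, `cosine`, `retrieve`, and `compose` are all hypothetical and written by hand for illustration; in practice the embeddings would come from trained modality encoders that map into one shared space.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, gallery):
    """Return the gallery key whose embedding is most similar to the query."""
    return max(gallery, key=lambda name: cosine(query, gallery[name]))

def compose(a, b):
    """Combine two embeddings by summing their unit vectors and renormalizing."""
    return normalize([x + y for x, y in zip(normalize(a), normalize(b))])

# Hypothetical hand-written embeddings (assumption: all live in one shared space).
image_gallery = {
    "dog_on_beach": [0.7, 0.7, 0.0],
    "dog_indoors": [1.0, 0.0, 0.1],
    "empty_beach": [0.0, 1.0, 0.1],
}
audio_barking = [0.9, 0.1, 0.0]  # stands in for an audio embedding
image_beach = [0.1, 0.9, 0.0]    # stands in for an image embedding

# Cross-modal retrieval: an audio query ranked against image embeddings.
audio_only_match = retrieve(audio_barking, image_gallery)

# Embedding arithmetic: audio + image query ranked against the same gallery.
composed_match = retrieve(compose(audio_barking, image_beach), image_gallery)

print(audio_only_match, composed_match)
```

With these toy vectors, the audio query alone retrieves the indoor dog image, while the composed audio-plus-beach query retrieves the dog on the beach. This only illustrates the mechanics of nearest-neighbor search in a shared space; it says nothing about how well a real model performs.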

Authors

Rohit Girdhar

Alaaeldin El-Nouby

Zhuang Liu

Cite this study

Girdhar et al. (Thursday, June 1, 2023) studied this question.

www.synapsesocial.com/papers/69dab430615cc0c8eaa3d097 — DOI: https://doi.org/10.1109/cvpr52729.2023.01457

Also consider

Synapse has enriched 2 closely related papers on similar questions. Consider them for comparative context:

OmniMAE: Single Model Masked Pretraining on Images and Videos · 2023 · 67 citations
Representation Learning with Contrastive Predictive Coding · 2018 · 4,515 citations
