Concept Bottleneck Models (CBMs) are neural networks designed to combine high performance with ante-hoc interpretability. CBMs first map inputs (e.g., images) to high-level concepts (e.g., visible objects and their properties) and then use these concepts to solve a downstream task (e.g., tagging or scoring an image) in an interpretable manner. Their performance and interpretability, however, hinge on the quality of the concepts they learn. The go-to strategy for ensuring good-quality concepts is to leverage expert annotations, which are expensive to collect and seldom available in applications. Researchers have recently addressed this issue by introducing "VLM-CBM" architectures that replace manual annotations with weak supervision from foundation models. It is unclear, however, how this substitution affects the quality of the learned concepts. To answer this question, we put state-of-the-art VLM-CBMs to the test, analyzing their learned concepts empirically using a selection of relevant metrics. Our results show that, depending on the task, VLM supervision can differ substantially from expert annotations, and that concept accuracy and concept quality are not strongly correlated. Our code is available at https://github.com/debryu/CQA.
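The two-stage structure described above (inputs to concepts, then concepts to task prediction) can be illustrated with a minimal sketch. It is not the authors' implementation: the linear layers, the toy weights, and all names (`predict_concepts`, `predict_label`, `cbm_forward`, the concepts "has_wings" and "has_fur") are assumptions chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_concepts(x, W_c, b_c):
    """Concept encoder (assumed linear here): input features -> concept probabilities."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_c, b_c)]

def predict_label(c, W_y, b_y):
    """Task head: linear scores computed from the concepts only, so each
    class score decomposes into per-concept contributions."""
    scores = [sum(w * ci for w, ci in zip(row, c)) + b
              for row, b in zip(W_y, b_y)]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, scores

def cbm_forward(x, W_c, b_c, W_y, b_y):
    """Full bottleneck: the label depends on x only through the concepts."""
    c = predict_concepts(x, W_c, b_c)
    label, scores = predict_label(c, W_y, b_y)
    return c, label, scores

# Hypothetical toy setup: 3 input features, 2 concepts
# ("has_wings", "has_fur"), 2 classes (0 = bird, 1 = cat).
W_c = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
b_c = [-2.0, -2.0]
W_y = [[2.0, -2.0], [-2.0, 2.0]]
b_y = [0.0, 0.0]

x = [1.0, 0.0, 0.3]
concepts, label, scores = cbm_forward(x, W_c, b_c, W_y, b_y)
print(concepts, label, scores)
```

Because the task head sees only the concept vector, any error in the learned concepts propagates directly to the prediction and to its explanation, which is why the quality of the concept supervision (expert or VLM-derived) matters.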
Nicola Debole
Pietro Barbiero
Francesco Giannini
University of Trento
Scuola Normale Superiore
IBM Research - Zurich
Debole et al. studied this question.
www.synapsesocial.com/papers/69df2c50e4eeef8a2a6b1489 — DOI: https://doi.org/10.5281/zenodo.19398254