April 10, 2026 · Open Access

IDAP++: Advancing Divergence-Aware Pruning with Joint Filter and Layer Optimization

Key Points

The study aims to improve neural network compression by addressing redundancy at both the filter level and the architectural level.
The authors developed a two-stage optimization process that combines filter pruning with layer-level optimization.
Stage one applies iterative divergence-aware pruning to identify and remove redundant filters.
Stage two analyzes layer-wise contributions to information propagation and removes layers with minimal impact.
The method was tested on several architecture families, including convolutional networks and transformers.
It achieves substantial model compression while maintaining competitive accuracy across these architectures.
Parameter reduction is comparable to state-of-the-art methods overall and exceeds them on a range of architectures.
The results show that flow divergence can guide both filter-level and layer-level optimization.

Abstract

Neural networks increasingly encode knowledge drawn from large volumes of data, which makes simplifying their structure and reducing their parameter count especially relevant, both for efficiency and for deployment in resource-constrained environments. This paper presents a novel approach to neural network compression that addresses redundancy at both the filter and architectural levels through a unified framework grounded in information flow analysis. Building on the concept of tensor flow divergence, which quantifies how information transforms across network layers, we develop a two-stage optimization process. The first stage employs iterative divergence-aware pruning to identify and remove redundant filters while preserving critical information pathways. The second stage extends this principle to architecture-level optimization by analyzing layer-wise contributions to information propagation and selectively eliminating entire layers that have minimal impact on network performance. The method adapts to diverse architectures, including convolutional networks, transformers, and hybrid designs, and provides a consistent metric for comparing structural importance across different layer types. Experiments across multiple modern architectures and datasets show that the combined approach achieves substantial model compression while maintaining competitive accuracy. Its parameter reduction is comparable overall to state-of-the-art solutions and exceeds them on a wide range of modern architectures, from convolutional models to transformers. The results demonstrate that flow divergence is an effective guiding principle for both filter-level and layer-level optimization, offering practical benefits for deployment in resource-constrained environments.
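
The two-stage process described in the abstract can be illustrated with a minimal sketch. The abstract does not define tensor flow divergence, so the score below is only a placeholder (mean absolute activation per filter), not the paper's measure. All function names, the `evaluate` callback, and the thresholds are hypothetical and chosen for illustration.

```python
import numpy as np


def filter_scores(activations):
    """Placeholder importance score, NOT the paper's flow divergence.

    activations: array of shape (samples, filters).
    Returns the mean absolute activation of each filter.
    """
    return np.abs(activations).mean(axis=0)


def prune_filters_iteratively(activations, evaluate, max_drop, step=1):
    """Stage 1 sketch: drop the lowest-scoring filters, `step` at a time,
    until the quality reported by evaluate(keep_mask) falls more than
    `max_drop` below the unpruned baseline.

    In a real network the scores would be recomputed after each pruning
    round (and typically after fine-tuning); here they stay fixed because
    the toy activations do not change.
    """
    keep = np.ones(activations.shape[1], dtype=bool)
    baseline = evaluate(keep)
    scores = filter_scores(activations)
    while keep.sum() > step:
        # Already-pruned filters get an infinite score so they sort last.
        masked = np.where(keep, scores, np.inf)
        candidates = np.argsort(masked)[:step]
        trial = keep.copy()
        trial[candidates] = False
        if baseline - evaluate(trial) > max_drop:
            break  # this round would cost too much quality; stop here
        keep = trial
    return keep


def select_layers_to_remove(layer_scores, threshold):
    """Stage 2 sketch: flag layers whose aggregate score is below
    `threshold` times the largest layer score."""
    layer_scores = np.asarray(layer_scores, dtype=float)
    cutoff = threshold * layer_scores.max()
    return [i for i, s in enumerate(layer_scores) if s < cutoff]


# Toy usage: four filters, two samples, and a stand-in quality measure
# (the share of total score retained by the kept filters).
toy_activations = np.array([[0.1, 2.0, 0.05, 1.5],
                            [-0.1, -2.0, 0.05, 1.5]])
toy_scores = filter_scores(toy_activations)
keep_mask = prune_filters_iteratively(
    toy_activations,
    evaluate=lambda mask: toy_scores[mask].sum() / toy_scores.sum(),
    max_drop=0.05,
)
removable_layers = select_layers_to_remove([3.0, 0.2, 2.5, 0.1], threshold=0.1)
```

The sketch keeps the two stages separate so that each can be swapped out: a faithful implementation would replace `filter_scores` with the divergence measure defined in the paper and `evaluate` with validation accuracy of the pruned model.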

Authors

Aleksei Samarin

Artem Nazarenko

Egor Kotenko

Journals

Proceedings of the ACM on Management of Data

Institutions

St Petersburg University

ITMO University

United Way
