Full fine-tuning of pre-trained models sometimes requires inserting trainable layers into the middle of a pre-trained backbone, but such middle-layer insertion can severely degrade downstream performance. We hypothesize that this degradation arises because conventionally inserted layers, when randomly initialized and combined with output-side activation, perturb intermediate representations before the pre-trained model has adapted. We study this phenomenon across natural language processing and computer vision benchmarks by varying insertion locations, the number of inserted layers, and activation designs. To address this problem, we propose a practical stabilization method for middle-layer insertion under full fine-tuning: a bias-free inserted layer with unit initialization and weight-side activation. This design is intended to remain closer to an identity-like transformation at initialization, thereby reducing initialization-time perturbation rather than claiming exact preservation of the original representations. In the tested DeBERTa-v3, T5-base, and ViT-base settings, the proposed method substantially mitigates the severe degradation caused by naive middle-layer insertion and maintains performance close to the no-added-layer baseline, including settings with up to 24 inserted layers.
Kim et al. (Mon,) studied this question.
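As a rough illustration of the contrast drawn in the abstract, the following framework-free sketch compares a conventional inserted layer (random initialization, bias, activation applied to the output) with one possible reading of the proposed design (no bias, identity-matrix initialization, activation applied to the weights). The readings of "unit initialization" as an identity matrix and of "weight-side activation" as `act(W) x` are assumptions, as are the choice of ReLU, the initialization scale, and all function names; none of these details is specified in the text above.

```python
import math
import random


def relu(v):
    # Assumed activation; the abstract does not name one.
    return max(0.0, v)


def identity_weight(dim):
    # Assumed reading of "unit initialization": an identity matrix.
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]


def random_weight(dim, scale=0.02, seed=0):
    # Small Gaussian initialization, standing in for a conventional random init.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def matvec(weight, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def output_side_layer(weight, bias, x, act=relu):
    # Conventional form: activation applied to the layer output, act(W x + b).
    return [act(v + b) for v, b in zip(matvec(weight, x), bias)]


def weight_side_layer(weight, x, act=relu):
    # Assumed form: bias-free, activation applied to the weights, act(W) x.
    activated = [[act(w) for w in row] for row in weight]
    return matvec(activated, x)


def perturbation(x, y):
    # Euclidean distance between the incoming and outgoing representation.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


x = [0.5, -1.0, 2.0]  # toy intermediate representation
naive = output_side_layer(random_weight(3), [0.0, 0.0, 0.0], x)
stabilized = weight_side_layer(identity_weight(3), x)
print("naive perturbation:", perturbation(x, naive))
print("stabilized perturbation:", perturbation(x, stabilized))
```

Under these toy assumptions the weight-side layer happens to reproduce its input at initialization, because ReLU maps the identity matrix to itself. With a different activation, or under the method's actual formulation, the layer may only stay close to an identity-like transformation, which matches the more cautious claim made in the abstract.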