Improving convolutional neural networks: advanced regularization methods and architectural innovations

Date

2025

Abstract

In recent years, Convolutional Neural Networks (CNNs) have achieved significant advances in a variety of computer vision applications, such as image recognition, object detection, and image segmentation. Despite their success, these networks are prone to overfitting due to the limited size of training data and the high capacity of the models. Overfitting describes the phenomenon where a CNN achieves excellent performance on the training set but generalizes poorly to new, unseen data. Various regularization methods have been developed to address overfitting; however, there remains a need for strategies that further improve generalization and provide complementary benefits when combined with existing techniques. This work introduces three novel regularization methods, each uniquely designed to build upon existing approaches and enhance the generalization of CNNs in a different way: Weight Compander (WC), Spectral Batch Normalization (SBN), and Spectral Wavelet Dropout (SWD).

While regularization methods improve the generalization of CNNs, they do not increase the networks' capacity to process and represent more complex features, because CNNs are constrained by their fixed architectures. Advancements in network architecture are therefore needed to offer improvements that go beyond adjustments to training behavior. In particular, CNNs lack a mechanism for dynamic feature retention akin to the memory of the human brain. To address this, we propose the Squeeze-and-Remember (SR) Block, a new architectural unit for CNNs that allows them to store high-level features during training and recall those features during inference.

Despite their remarkable performance, CNNs often require substantial computational power and extensive memory. This poses considerable challenges when deploying parameter-heavy models on devices with limited computational resources.
To mitigate these deployment challenges, we finally introduce Mixture-of-Depths (MoD) for CNNs, a technique that enhances computational efficiency by selectively processing channels based on their relevance to the current prediction.
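The store-and-recall idea behind the Squeeze-and-Remember Block can be illustrated with a minimal NumPy sketch. This is a hypothetical reading of the abstract's description, not the dissertation's actual design: it assumes a bank of stored feature maps ("memories") and a selector that squeezes the input via global average pooling and softly mixes the memories back into the features. All names (`SqueezeAndRememberSketch`, `num_memories`, the selector weights) are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class SqueezeAndRememberSketch:
    """Hypothetical sketch of a Squeeze-and-Remember-style block.

    Holds a bank of stored feature maps (learned during training; random
    here). A squeezed descriptor of the input softly selects a mixture
    of the stored maps, which is added back to the input features.
    """

    def __init__(self, channels, height, width, num_memories=4, seed=0):
        rng = np.random.default_rng(seed)
        # Bank of stored feature maps, shape (P, C, H, W).
        self.memory = rng.normal(size=(num_memories, channels, height, width))
        # Selector mapping the squeezed (C,) vector to P memory scores.
        self.w = rng.normal(size=(channels, num_memories)) * 0.1

    def __call__(self, x):
        # Squeeze: global average pool over spatial dims -> (C,).
        z = x.mean(axis=(1, 2))
        # Remember: soft selection over the stored feature maps.
        scores = softmax(z @ self.w)                            # (P,)
        recalled = np.tensordot(scores, self.memory, axes=1)    # (C, H, W)
        return x + recalled

# Example: apply the block to an 8-channel 4x4 feature map.
blk = SqueezeAndRememberSketch(channels=8, height=4, width=4)
x = np.zeros((8, 4, 4))
y = blk(x)
```

In a real network the memory bank and selector would be trained jointly with the rest of the CNN; the sketch only shows the dataflow of squeezing, selecting, and recalling.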
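The relevance-based channel selection behind Mixture-of-Depths can likewise be sketched in NumPy. This is a plausible sketch under assumptions, not the dissertation's actual method: a lightweight router scores each channel from its pooled activation, only the top-k channels pass through the (costly) transform, and the remaining channels bypass it unchanged. The class name, router, and per-channel transform are illustrative stand-ins.

```python
import numpy as np

class MixtureOfDepthsSketch:
    """Hypothetical sketch of Mixture-of-Depths-style channel routing.

    A lightweight router scores each input channel; only the k most
    relevant channels are processed by an expensive transform, while
    the remaining channels skip the computation entirely.
    """

    def __init__(self, channels, k, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        # Router: one learned score weight per channel.
        self.router_w = rng.normal(size=(channels,)) * 0.1
        # Stand-in for an expensive per-channel transform (e.g. a conv).
        self.scale = rng.normal(size=(channels,)) * 0.1 + 1.0

    def __call__(self, x):
        # x: (C, H, W). Score channels by their pooled activation.
        scores = x.mean(axis=(1, 2)) * self.router_w
        kept = np.argsort(scores)[-self.k:]   # indices of channels to process
        out = x.copy()
        # Process only the selected channels; the rest pass through.
        out[kept] = x[kept] * self.scale[kept, None, None]
        return out, kept

# Example: route an 8-channel feature map, processing only 3 channels.
blk = MixtureOfDepthsSketch(channels=8, k=3)
x = np.random.default_rng(1).normal(size=(8, 5, 5))
out, kept = blk(x)
```

The compute saving comes from the skipped channels: with k of C channels routed through the transform, the block's cost scales with k rather than C.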
