Publication details
Bibliographic description
Enhancing convnets with pruning and symmetry-based filter augmentation / Igor Ratajczyk, Adrian HORZYK // In: Neural Information Processing : 31st International Conference, ICONIP 2024 : Auckland, New Zealand, December 2–6, 2024 : proceedings, Pt. 1 / eds. Mufti Mahmud [et al.]. — Singapore : Springer Nature Singapore, cop. 2025. — (Lecture Notes in Computer Science ; ISSN 0302-9743 ; LNCS 15286). — ISBN: 978-981-96-6575-4; e-ISBN: 978-981-96-6576-1. — pp. 396–409. — Bibliogr., Abstr. — Publication available online from: 2025-06-08
Authors (2)
Bibliometric data
| BaDAP ID | 163072 |
|---|---|
| Added to BaDAP | 2025-09-30 |
| DOI | 10.1007/978-981-96-6576-1_27 |
| Publication year | 2025 |
| Publication type | conference proceedings (auth.) |
| Open access | |
| Publisher | Springer |
| Conference | International Conference on Neural Information Processing 2024 |
| Journal/series | Lecture Notes in Computer Science |
Abstract
Contemporary neural networks usually use rigid or transferred architectures that are not adapted during the training process, which often causes underfitting or overfitting, or produces unnecessarily large architectures that require complex and long-lasting regularization or the implementation of attention blocks. Architecture optimization for a given dataset proceeds empirically: several architectures with different hyperparameters are trained and compared, and the one with the highest performance is chosen. When working with convnets, we set a priori the number of filters in each layer so that the network can learn to represent the most frequent training-data patterns and minimize underfitting and overfitting. Establishing a good enough network architecture is still a challenge for developers. In this paper, we propose a new method for augmenting filters during the training process: it removes poorly developed or very similar filters and adds new filters that better reproduce the frequent patterns occurring in the training data. The presented approaches remove unnecessary bias and harmful inferences produced by low-quality filters that develop automatically during ordinary training. Moreover, the method can also reduce the network size, utilize filters more efficiently, reduce computational costs, accelerate training, and achieve better generalization in the same number of epochs as models that do not use the presented approach.
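The abstract describes pruning near-duplicate or poorly developed filters and adding symmetry-based replacements, but does not spell out the criteria. The sketch below is therefore only an illustration under assumed choices, not the paper's method: cosine similarity between flattened filters as the redundancy measure, and horizontal/vertical flips as the symmetry-based augmentation. The function names `prune_similar_filters` and `augment_with_symmetries` are hypothetical.

```python
import numpy as np

def prune_similar_filters(filters, sim_threshold=0.95):
    """Keep only filters whose absolute cosine similarity to every
    previously kept filter stays below the threshold (an assumed
    redundancy criterion; filters has shape (n, k, k))."""
    kept = []
    for f in filters:
        v = f.ravel()
        v = v / (np.linalg.norm(v) + 1e-8)
        if all(abs(float(v @ (g.ravel() / (np.linalg.norm(g.ravel()) + 1e-8)))) < sim_threshold
               for g in kept):
            kept.append(f)
    return np.stack(kept)

def augment_with_symmetries(filters):
    """Append horizontally and vertically flipped copies of each kept
    filter -- one possible symmetry-based augmentation."""
    return np.concatenate([filters, filters[:, :, ::-1], filters[:, ::-1, :]], axis=0)
```

In a real convnet such a step would run between epochs on a layer's weight tensor, with pruned slots refilled by the flipped (or freshly initialized) filters before training continues.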