Publication details
Bibliographic description
Towards textual data augmentation for neural networks: synonyms and maximum loss / Michał JUNGIEWICZ, Aleksander SMYWIŃSKI-POHL // Computer Science; ISSN 1508-2806. 2019, vol. 20, no. 1, pp. 57–83. Bibliography: pp. 79–83; includes abstract.
Authors (2)
Keywords
Bibliometric data
| BaDAP ID | 122464 |
|---|---|
| Date added to BaDAP | 2019-06-28 |
| Source text | URL |
| DOI | 10.7494/csci.2019.20.1.3023 |
| Publication year | 2019 |
| Publication type | journal article |
| Open access | |
| Creative Commons | |
| Journal/series | Computer Science |
Abstract
Data augmentation is one of the ways to deal with labeled data scarcity and overfitting. Both of these problems are crucial for modern deep-learning algorithms, which require massive amounts of data. The problem is better explored in the context of image analysis than for text; this work is a step towards closing this gap. We propose a method for augmenting textual data when training convolutional neural networks for sentence classification. The augmentation is based on the substitution of words using a thesaurus as well as Princeton University's WordNet. Our method improves upon the baseline in most cases. In terms of accuracy, the best of the variants is 1.2 percentage points (pp.) better than the baseline. © 2019 AGH University of Science and Technology Press.
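
As a rough illustration of the thesaurus-based word substitution mentioned in the abstract, the following minimal sketch replaces words with listed synonyms at random. It is not the authors' implementation: the `TOY_THESAURUS` dictionary, the `augment` function, and the `rate` parameter are hypothetical names introduced here, and the tiny hand-written dictionary merely stands in for a real thesaurus or WordNet. The maximum-loss selection named in the title is not shown.

```python
import random

# Hypothetical toy thesaurus (assumption); the paper uses a real thesaurus and WordNet.
TOY_THESAURUS = {
    "good": ["fine", "great"],
    "movie": ["film", "picture"],
}


def augment(sentence, thesaurus, rate=0.5, rng=None):
    """Return a copy of `sentence` with some words swapped for listed synonyms.

    `rate` is the probability of replacing a word that has synonyms
    (an illustrative parameter, not taken from the paper).
    """
    rng = rng or random.Random()
    out = []
    for word in sentence.split():
        synonyms = thesaurus.get(word.lower())
        if synonyms and rng.random() < rate:
            # Pick one of the listed synonyms uniformly at random.
            out.append(rng.choice(synonyms))
        else:
            # Words without synonyms, or not selected, are kept unchanged.
            out.append(word)
    return " ".join(out)


# With rate=1.0 every word that has an entry in the thesaurus is replaced.
example = augment("a good movie", TOY_THESAURUS, rate=1.0, rng=random.Random(0))
```

Each augmented sentence keeps the label of the original, so a classifier can be trained on both the original and the substituted variants.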