Publication details
Bibliographic description
Molecular Topological Profile (MOLTOP) - simple and strong baseline for molecular graph classification / Jakub ADAMCZYK, Wojciech CZECH // In: ECAI 2024 [Electronic document] : 27th European Conference on Artificial Intelligence : 19-24 October 2024, Santiago de Compostela, Spain : including 13th conference on Prestigious Applications of Intelligent Systems (PAIS 2024) / ed. by U. Endriss, [et al.]. - Version for Windows. - Text data. - [Amsterdam] : IOS Press, cop. 2024. - (Frontiers in Artificial Intelligence and Applications ; ISSN 0922-6389 ; vol. 392). - e-ISBN: 978-1-64368-548-9. - pp. 1575-1582. - Bibliography pp. 1581-1582, abstract.
Authors (2)
Bibliometric data
| BaDAP ID | 156136 |
|---|---|
| Date added to BaDAP | 2024-11-14 |
| Source text | URL |
| DOI | 10.3233/FAIA240663 |
| Publication year | 2024 |
| Publication type | conference proceedings (authored) |
| Open access | |
| Creative Commons | |
| Publisher | IOS Press |
| Conference | European Conference on Artificial Intelligence 2024 |
| Journal/series | Frontiers in Artificial Intelligence and Applications |
Abstract
We revisit the effectiveness of topological descriptors for molecular graph classification and design a simple, yet strong baseline. We demonstrate that a simple approach to feature engineering - employing histogram aggregation of edge descriptors and one-hot encoding for atomic numbers and bond types - when combined with a Random Forest classifier, can establish a strong baseline for Graph Neural Networks (GNNs). The novel algorithm, Molecular Topological Profile (MOLTOP), integrates Edge Betweenness Centrality, Adjusted Rand Index and SCAN Structural Similarity score. This approach proves to be remarkably competitive when compared to modern GNNs, while also being simple, fast, low-variance and hyperparameter-free. Our approach is rigorously tested on MoleculeNet datasets using fair evaluation protocol provided by Open Graph Benchmark. We additionally show out-of-domain generation capabilities on peptide classification task from Long Range Graph Benchmark. The evaluations across eleven benchmark datasets reveal MOLTOP’s strong discriminative capabilities, surpassing the 1-WL test and even 3-WL test for some classes of graphs. Our conclusion is that descriptor-based baselines, such as the one we propose, are still crucial for accurately assessing advancements in the GNN domain.
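The feature-engineering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only one of the three edge descriptors named above (the SCAN structural similarity, computed on closed neighbourhoods), and the function names, the atom vocabulary `(6, 7, 8)`, the bin count, and the choice to sum one-hot atom encodings into per-molecule counts are all assumptions made for illustration. The abstract specifies none of these details.

```python
import math
from collections import Counter


def scan_similarity(adj, u, v):
    """SCAN structural similarity of edge (u, v) on closed neighbourhoods."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))


def histogram(values, bins, lo=0.0, hi=1.0):
    """Fixed-range histogram; a value equal to `hi` falls in the last bin."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in values:
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def featurize(atoms, bonds, atom_vocab=(6, 7, 8), bins=4):
    """Build one fixed-length vector per molecule (hypothetical helper).

    atoms: list of atomic numbers, one per atom.
    bonds: list of (i, j) atom-index pairs.
    atom_vocab and bins are illustrative values, not taken from the paper.
    """
    # Undirected adjacency sets built from the bond list.
    adj = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    # Edge descriptor values, aggregated into a histogram.
    sims = [scan_similarity(adj, i, j) for i, j in bonds]
    # Summed one-hot atom encodings, i.e. counts per vocabulary entry.
    atom_counts = Counter(atoms)
    atom_part = [atom_counts.get(z, 0) for z in atom_vocab]
    return atom_part + histogram(sims, bins)


# Heavy-atom skeleton of ethanol (C-C-O) and a three-carbon ring.
print(featurize([6, 6, 8], [(0, 1), (1, 2)]))
print(featurize([6, 6, 6], [(0, 1), (1, 2), (0, 2)]))
```

Because every molecule maps to a vector of the same length regardless of its size, such vectors could be passed directly to a tabular classifier such as the Random Forest mentioned in the abstract. Which library and settings the authors used is not stated here.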