Publication details
Bibliographic description
Deep reinforcement learning-based scheduling for Wi-Fi multi-access point coordination / David Nunez, Francesc Wilhelmi, Maksymilian WOJNAR, Katarzyna KOSEK-SZOTT, Szymon SZOTT, Boris Bellalta // IEEE Transactions on Machine Learning in Communications and Networking [Electronic document]. Electronic journal; ISSN 2831-316X. 2026, vol. 4, pp. 744–757. System requirements: Adobe Reader. Bibliography: p. 757; abstract. Publication available online since: 2026-04-09
Authors (6)
- Nunez David
- Wilhelmi Francesc
- Wojnar Maksymilian (AGH)
- Kosek-Szott Katarzyna (AGH)
- Szott Szymon (AGH)
- Bellalta Boris
Keywords
Bibliometric data
| Field | Value |
|---|---|
| BaDAP ID | 167780 |
| Date added to BaDAP | 2026-06-16 |
| Source text | URL |
| DOI | 10.1109/TMLCN.2026.3682239 |
| Publication year | 2026 |
| Publication type | journal article |
| Open access | |
| Creative Commons | |
| Journal/series | IEEE Transactions on Machine Learning in Communications and Networking |
Abstract
Multi-access point coordination (MAPC) is a key feature of IEEE 802.11bn, with a potential impact on future Wi-Fi networks. MAPC enables joint scheduling decisions across multiple access points (APs) to improve throughput, latency, and reliability in dense Wi-Fi deployments. However, implementing efficient scheduling policies under diverse traffic and interference conditions in overlapping basic service sets (OBSSs) remains a complex task. This paper presents a method to minimize the network-wide worst-case latency by formulating MAPC scheduling as a sequential decision-making problem and proposing a deep reinforcement learning (DRL) mechanism to minimize worst-case delays in OBSS deployments. Specifically, we train a DRL agent using proximal policy optimization (PPO) within an 802.11bn-compatible Gymnasium environment. This environment provides observations of queue states, delay metrics, and channel conditions, enabling the agent to schedule multiple AP-station pairs to transmit simultaneously by leveraging spatial reuse (SR) groups. Simulations demonstrate that our proposed solution outperforms state-of-the-art heuristic strategies across a wide range of network loads and traffic patterns. The trained machine learning (ML) models consistently achieve lower 99th-percentile delays, showing up to a 30% improvement over the best baseline.