Publication details
Bibliographic description
Reinforcement-learning-based fuzzy bipartite consensus for multiagent systems: a novel scaling off-policy learning scheme / Jing Wang, Qing Yang, Jinde Cao, Leszek Rutkowski, Hao Shen // IEEE Transactions on Cybernetics ; ISSN 2168-2267. — 2025, vol. 55, no. 9, pp. 4491–4501. — Bibliography pp. 4500–4501, Abstract. — Available online from: 2025-06-04. — L. Rutkowski, additional affiliation: Systems Research Institute of the Polish Academy of Sciences, Warsaw, Poland
Authors (5)
- Wang Jing
- Yang Qing
- Cao Jinde
- Rutkowski Leszek (AGH)
- Shen Hao
Keywords
Bibliometric data
| Field | Value |
|---|---|
| BaDAP ID | 165162 |
| Date added to BaDAP | 2025-12-22 |
| Source text | URL |
| DOI | 10.1109/TCYB.2025.3572104 |
| Year of publication | 2025 |
| Publication type | journal article |
| Open access | |
| Journal/series | IEEE Transactions on Cybernetics |
Abstract
The bipartite consensus (BC) problem for nonlinear multiagent systems (NMASs) with unknown system dynamics is investigated in this article. Initially, the dynamics of NMASs are represented using the Takagi-Sugeno (T-S) fuzzy model. Subsequently, to achieve distributed control, a min-max game policy is introduced, in which each agent aims to minimize its performance index while its neighbors attempt to maximize it. Consequently, the BC problem for NMASs is reformulated as a zero-sum game, transforming the controller design into solving a set of game algebraic Riccati equations (GAREs). To solve these equations, a novel scaling off-policy iteration (PI) algorithm is proposed. The key features of the proposed learning algorithm can be summarized in three aspects: 1) during the learning process, the reliance on system dynamics is relaxed; 2) compared with the standard PI method, the requirement for initially admissible control policies is eliminated; and 3) a faster convergence speed is achieved than with traditional value iteration. Finally, the effectiveness and advantages of the proposed method are validated through a simulation example and a series of comparative experiments. © 2025 IEEE.
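To make the zero-sum-game formulation concrete, the following is a minimal sketch of Kleinman-type policy iteration on a *scalar* game algebraic Riccati equation. This is an illustrative toy only: it is model-based and requires an initially stabilizing gain, which is exactly what the paper's scaling off-policy scheme avoids; the system parameters `a`, `b`, `d`, the weights `q`, `r`, and the attenuation level `gamma` below are assumed values, not taken from the paper.

```python
def solve_gare_pi(a, b, d, q, r, gamma, k0, iters=10):
    """Policy iteration for the scalar zero-sum LQ game
        dx = a*x + b*u + d*w,  cost = q*x^2 + r*u^2 - gamma^2*w^2,
    whose value P solves the GARE
        2*a*P + q - (b**2/r)*P**2 + (d**2/gamma**2)*P**2 = 0.
    k0 must stabilize a - b*k0 (admissible initial control policy)."""
    K, W = k0, 0.0  # control gain and worst-case disturbance gain
    P = 0.0
    for _ in range(iters):
        a_c = a - b * K + d * W  # closed-loop dynamics under both players
        # Policy evaluation: scalar Lyapunov equation
        #   2*a_c*P + q + r*K**2 - gamma**2*W**2 = 0
        P = (q + r * K**2 - gamma**2 * W**2) / (-2.0 * a_c)
        # Policy improvement for minimizer (K) and maximizer (W)
        K = b * P / r
        W = d * P / gamma**2
    return P, K, W

P, K, W = solve_gare_pi(a=1.0, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0, k0=2.0)
# GARE residual should vanish at convergence
residual = 2 * 1.0 * P + 1.0 - P**2 / 1.0 + (0.5**2 / 2.0**2) * P**2
print(P, residual)
```

Each iteration evaluates the current pair of policies via a Lyapunov equation and then improves both players' gains; the paper's contribution is to perform this evaluation from measured data (off-policy) with a scaling factor that removes both the model-knowledge and admissible-initialization requirements.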