Publication details

Bibliographic description

On explainability of reinforcement learning-based machine learning agents trained with proximal policy optimization that utilizes visual sensor data / Tomasz HACHAJ, Marcin PIEKARCZYK // Applied Sciences (Basel) [Electronic document]. — Electronic journal ; ISSN 2076-3417. — 2025 — vol. 15 iss. 2 art. no. 538, pp. 1–26. — System requirements: Adobe Reader. — Bibliogr. pp. 23–26, Abstr. — Published online: 2025-01-08

Authors (2)

Keywords

explainability; decision tree; reinforcement learning; semantic data; visual sensor; GradCAM; proximal policy optimization

Bibliometric data

BaDAP ID: 157752
Added to BaDAP: 2025-02-19
Source text: URL
DOI: 10.3390/app15020538
Publication year: 2025
Publication type: journal article
Open access: yes
Creative Commons
Journal/series: Applied Sciences (Basel)

Abstract

In this paper, we address the explainability of reinforcement learning-based machine learning agents trained with Proximal Policy Optimization (PPO) that utilize visual sensor data. We propose an algorithm that allows an effective and intuitive approximation of the PPO-trained neural network (NN). We conduct several experiments to confirm our method’s effectiveness. Our proposed method works well for scenarios where semantic clustering of the scene is possible. Our approach is based on the solid theoretical foundation of Gradient-weighted Class Activation Mapping (GradCAM) and the Classification and Regression Tree with additional proxy geometry heuristics. It excels in the explanation process in a virtual simulation system based on a video system with relatively low resolution. Depending on the convolutional feature extractor of the PPO-trained neural network, our method achieves an approximation accuracy of 0.945 to 0.968 with respect to the black-box model. The proposed method has important application aspects. Through its use, it is possible to estimate the causes of specific decisions made by the neural network given the current state of the observed environment. This estimation makes it possible to determine whether the network makes decisions as expected (decision-making is related to the model’s observation of objects belonging to different semantic classes in the environment) and to detect unexpected, seemingly chaotic behavior that might be, for example, the result of data bias, bad design of the reward function, or insufficient generalization abilities of the model. We publish all source code so that our experiments can be reproduced.
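The core idea described in the abstract, fitting an interpretable decision-tree surrogate to the actions of a black-box PPO policy over semantic scene features, can be illustrated with a minimal sketch. This is not the authors' code: the toy policy, the feature layout (per-frame fractions of pixels in three hypothetical semantic classes), and all names are illustrative assumptions standing in for the PPO network and the GradCAM-derived semantic clustering.

```python
# Minimal sketch: CART surrogate for a black-box policy.
# The "policy" below is a stand-in for a PPO-trained network;
# features stand in for semantic-class pixel fractions per frame.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy semantic features: fractions of pixels belonging to three
# hypothetical classes (obstacle, target, background), summing to 1.
X = rng.random((2000, 3))
X /= X.sum(axis=1, keepdims=True)

def black_box_policy(features):
    # Illustrative discrete action choice: avoid if obstacles dominate,
    # otherwise approach a visible target, otherwise explore.
    obstacle, target, _ = features
    if obstacle > 0.5:
        return 0  # avoid
    return 1 if target > 0.2 else 2  # approach / explore

y = np.array([black_box_policy(x) for x in X])

# Fit a shallow CART and measure its fidelity to the black box,
# analogous to the approximation accuracy reported in the paper.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
fidelity = tree.score(X, y)
print(f"surrogate fidelity: {fidelity:.3f}")
```

Because the tree's splits are thresholds on individual semantic features, the resulting rules (e.g., "if the obstacle fraction exceeds 0.5, the agent avoids") can be read off directly, which is what makes this kind of surrogate useful for detecting reward-function or data-bias pathologies in the trained agent.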

Publications that may interest you

book chapter
#129154 (added: 25.6.2020)
Automatic management of cloud applications with use of proximal policy optimization / Włodzimierz FUNIKA, Paweł Koperek, Jacek KITOWSKI // In: Computational Science - ICCS 2020 : 20th International Conference : Amsterdam, The Netherlands, June 3–5, 2020 : proceedings, Pt. 1 / eds. Valeria V. Krzhizhanovskaya, [et al.]. — Cham : Springer Nature Switzerland, cop. 2020. — (Lecture Notes in Computer Science ; ISSN 0302-9743 ; LNCS 12137. Theoretical Computer Science and General Issues ; ISSN 0302-9743). — ISBN: 978-3-030-50370-3; e-ISBN: 978-3-030-50371-0. — Pp. 73–87. — Bibliogr. pp. 85–87, Abstr. — Published online: 2020-06-15. — J. Kitowski - additional affiliation: ACC CYFRONET AGH
article
#151629 (added: 12.3.2024)
Application of reinforcement learning in decision systems: lift control case study / Mateusz WOJTULEWICZ, Tomasz SZMUC // Applied Sciences (Basel) [Electronic document]. — Electronic journal ; ISSN 2076-3417. — 2024 — vol. 14 iss. 2 art. no. 569, pp. 1–12. — System requirements: Adobe Reader. — Bibliogr. pp. 11–12, Abstr. — Published online: 2024-01-09