Publication details
Bibliographic description
Automating air pollution map analysis with multi-modal AI and visual context engineering / Szymon COGIEL, Mateusz ZARĘBA, Tomasz DANEK, Filip Arnaut // Atmosphere [Electronic document]. Electronic journal; ISSN 2073-4433. 2026, vol. 17, iss. 1, art. no. 2, pp. 1-14. System requirements: Adobe Reader. Bibliography: pp. 12-14; abstract. Publication available online from: 2025-12-19
Authors (4)
- Cogiel Szymon (AGH)
- Zaręba Mateusz (AGH)
- Danek Tomasz (AGH)
- Arnaut Filip
Bibliometric data
| Field | Value |
|---|---|
| BaDAP ID | 165587 |
| Date added to BaDAP | 2026-02-03 |
| Source text | URL |
| DOI | 10.3390/atmos17010002 |
| Publication year | 2026 |
| Publication type | journal article |
| Open access | |
| Creative Commons | |
| Journal/series | Atmosphere |
Abstract
The increasing volume of data from IoT sensors has made manual inspection time-consuming and prone to bias, particularly for spatiotemporal air pollution maps. While rule-based methods are adequate for simple datasets or individual maps, they are insufficient for interpreting multi-year time series data with 1 h timestamps, which require both domain-specific expertise and significant time investment. This limitation is especially critical in environmental monitoring, where analyzing long-term spatiotemporal PM2.5 maps derived from 52 low-cost sensors remains labor-intensive and susceptible to human error. This study investigates the potential of generative artificial intelligence, specifically multi-modal large language models (MLLMs), for interpreting spatiotemporal PM2.5 maps. Both open-source models (Janus-Pro and LLaVA-1.5) and commercial large language models (GPT-4o and Gemini 2.5 Pro) were evaluated. The initial results showed limited performance, highlighting the difficulty of extracting meaningful information directly from raw sensor-derived maps. To address this, a visual context engineering framework was introduced, comprising systematic optimization of colormaps, normalization of intensity ranges, and refinement of map layers and legends to improve clarity and interpretability for AI models. Evaluation using the GEval metric demonstrated that visual context engineering increased interpretation accuracy (defined as the detection of PM2.5 spatial extrema) by over 32.3% (relative improvement). These findings provide strong evidence that tailored visual preprocessing enables MLLMs to effectively interpret complex environmental time series data, representing a novel approach that bridges data-driven modeling with ecological monitoring and offers a scalable solution for automated, reliable, and reproducible analysis of high-resolution air quality datasets.
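
The abstract names two ideas that can be illustrated briefly: normalizing intensity ranges before color mapping, and reporting a relative improvement. The following is a minimal sketch, not the authors' implementation. The function names, the fixed 0-100 range, and the five color bins are hypothetical choices made here for illustration and do not come from the paper.

```python
def normalize(values, vmin=0.0, vmax=100.0):
    """Clip each value to [vmin, vmax] and rescale it to [0, 1].

    The default bounds are an assumed fixed range, so that every map
    in a time series shares the same color scale.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    span = vmax - vmin
    return [(min(max(v, vmin), vmax) - vmin) / span for v in values]


def to_bins(normalized, n_bins=5):
    """Map normalized values in [0, 1] to color-bin indices 0..n_bins-1."""
    return [min(int(x * n_bins), n_bins - 1) for x in normalized]


def relative_improvement(baseline, improved):
    """Relative change of a score with respect to its baseline, in percent."""
    return (improved - baseline) / baseline * 100.0
```

With a shared range, the same color always denotes the same concentration band across maps, which is one plausible reading of "normalization of intensity ranges". A relative improvement compares the gain to the baseline score rather than reporting an absolute difference in points.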