Publication details

Bibliographic description

Automating air pollution map analysis with multi-modal AI and visual context engineering / Szymon COGIEL, Mateusz ZARĘBA, Tomasz DANEK, Filip Arnaut // Atmosphere [Electronic document]. — Electronic journal ; ISSN 2073-4433. — 2026 vol. 17 iss. 1 art. no. 2, pp. 1-14. — System requirements: Adobe Reader. — Bibliogr. pp. 12-14, Abstr. — Published online: 2025-12-19

Authors (4)

Keywords

spatiotemporal air pollution mapping; visual context engineering; geospatial deep learning; multimodal large language models; automated map understanding; explainable environmental AI; generative artificial intelligence; AI-driven ecological monitoring; PM2.5 sensor networks; environmental data interpretation

Bibliometric data

BaDAP ID: 165587
Added to BaDAP: 2026-02-03
Source text: URL
DOI: 10.3390/atmos17010002
Publication year: 2026
Publication type: journal article
Open access: yes
License: Creative Commons
Journal/series: Atmosphere

Abstract

The increasing volume of data from IoT sensors has made manual inspection time-consuming and prone to bias, particularly for spatiotemporal air pollution maps. While rule-based methods are adequate for simple datasets or individual maps, they are insufficient for interpreting multi-year time series data with hourly timestamps, which require both domain-specific expertise and significant time investment. This limitation is especially critical in environmental monitoring, where analyzing long-term spatiotemporal PM2.5 maps derived from 52 low-cost sensors remains labor-intensive and susceptible to human error. This study investigates the potential of generative artificial intelligence, specifically multi-modal large language models (MLLMs), for interpreting spatiotemporal PM2.5 maps. Both open-source models (Janus-Pro and LLaVA-1.5) and commercial large language models (GPT-4o and Gemini 2.5 Pro) were evaluated. The initial results showed limited performance, highlighting the difficulty of extracting meaningful information directly from raw sensor-derived maps. To address this, a visual context engineering framework was introduced, comprising systematic optimization of colormaps, normalization of intensity ranges, and refinement of map layers and legends to improve clarity and interpretability for AI models. Evaluation using the GEval metric demonstrated that visual context engineering increased interpretation accuracy (defined as the detection of PM2.5 spatial extrema) by over 32.3% (relative improvement). These findings provide strong evidence that tailored visual preprocessing enables MLLMs to effectively interpret complex environmental time series data, representing a novel approach that bridges data-driven modeling with ecological monitoring and offers a scalable solution for automated, reliable, and reproducible analysis of high-resolution air quality datasets.