Vision Transformer-Based High-Resolution Satellite Road Extraction: Architecture and Performance Evaluation

Authors

  • Jerzy Baran Faculty of Computer Science and Telecommunications, Tadeusz Kościuszko Cracow University of Technology, Kraków 31-155, Poland
  • Konrad Pietrzak Faculty of Informatics, University of Białystok, Białystok 15-328, Poland
  • Łukasz Gajda Faculty of Informatics, University of Białystok, Białystok 15-328, Poland

DOI:

https://doi.org/10.64972/jiic.2026v4.179p12s:151-163

Keywords:

Vision Transformer, Road Extraction, Remote Sensing, Satellite Imagery

Abstract

Accurate extraction of road networks from high-resolution satellite imagery is essential for transportation management, urban planning, and the development of geographic information systems (GIS). To address the spatial fragmentation and continuity problems of remote-sensing-based road segmentation, this paper presents a novel Vision Transformer (ViT) framework. The model introduces a dedicated feature-level fusion structure and a modified loss function to improve delineation precision and the stability of road connectivity. Experiments used three well-known public datasets covering different locations and conditions, with over 10,000 annotated samples spanning urban, rural, and coastal environments. In every test, the ViT-based approach achieved a mean F1-score and Intersection over Union (IoU) consistently above 0.82 and 0.71, respectively, a notable improvement over the convolutional and transformer baselines. The experiments indicate that the proposed method preserves road connectivity and reduces false positives in dense, complex urban areas. Its segmentation accuracy and computational efficiency make the model suitable for large-scale mapping pipelines. The results show that attention-driven multi-scale representations improve the accuracy and spatial consistency of automated road extraction. The improved generalizability and accuracy of this approach provide a sound scientific basis for the future development of high-precision satellite image analysis systems.
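
The abstract reports results as F1-score and Intersection over Union. As a reading aid, the sketch below shows how these two metrics are commonly computed for a single pair of binary road masks. It is not the authors' evaluation code: the function names, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration only.

```python
from typing import Sequence

# Assumed mask format: rows of 0/1 pixels, where 1 marks a road pixel.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> tuple[int, int, int]:
    """Count true positives, false positives, and false negatives."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # road predicted, road present
            elif p:
                fp += 1  # road predicted, background present
            elif t:
                fn += 1  # background predicted, road present
    return tp, fp, fn


def f1_and_iou(pred: Mask, truth: Mask) -> tuple[float, float]:
    """Return (F1, IoU) for the road class of one mask pair."""
    tp, fp, fn = confusion_counts(pred, truth)
    f1_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_den if f1_den else 1.0
    iou = tp / iou_den if iou_den else 1.0
    return f1, iou


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    f1, iou = f1_and_iou(predicted, reference)
    print(f"F1 = {f1:.3f}, IoU = {iou:.3f}")
```

In this toy example there are two true positives, one false positive, and one false negative, so F1 is 4/6 and IoU is 2/4. A dataset-level mean, as reported in the abstract, would average such per-image values or pool the counts; which of the two the authors used is not stated here.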

Published

2026-02-13

How to Cite

Baran, J., Pietrzak, K., & Gajda, Ł. (2026). Vision Transformer-Based High-Resolution Satellite Road Extraction: Architecture and Performance Evaluation. Journal of Intelligent Information and Communication, 4, 12s:151–163. https://doi.org/10.64972/jiic.2026v4.179p12s:151-163

Section

Articles
