2SWUNet: small window SWinUNet based on transformer for building extraction from high-resolution remote sensing images

doi:https://doi.org/10.1007/s11801-024-3179-1

Author: YU Jiamin (1), CHAN Sixian (1, 2, 3), LEI Yanjing (1), WU Wei (1, 3), WANG Yuan (1), ZHOU Xiaolong (4)

Affiliation:
1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
2. Key Laboratory of Meteorological Disaster (KLME), Ministry of Education & Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters (CIC-FEMD), Nanjing University of Information Science & Technology, Nanjing 210044, China
3. College of Geographic Information Modern Industry, Zhejiang University of Technology, Hangzhou 310023, China
4. College of Electrical and Information Engineering, Quzhou University, Quzhou 324000, China

Abstract:

Models designed to capture long-range dependencies often perform worse when transferred to remote sensing images. The vision transformer (ViT) is a new paradigm in computer vision that uses multi-head self-attention (MSA) rather than convolution as its main computational module, which gives it global modeling capability. However, its performance on small datasets is usually far inferior to that of convolutional neural networks (CNNs). In this work, we propose a small window SWinUNet (2SWUNet) for building extraction from high-resolution remote sensing images. First, 2SWUNet is based on the Swin Transformer and designed as a fully symmetric encoder-decoder U-shaped architecture. Second, to construct a reasonable U-shaped architecture for this task, different forms of patch expansion are explored to simulate up-sampling operations and recover feature map resolution. Third, a small window-based multi-head self-attention (W-MSA) is designed to reduce the computational and memory burden, which better suits the characteristics of remote sensing images. In addition, the pre-training mechanism is improved to make up for the lack of decoder parameters. Finally, comparison experiments with mainstream CNNs and ViTs validate the superiority of the proposed model.
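
To make the computational argument for window-based attention concrete, the following is a minimal sketch, not the authors' implementation. It uses the cost estimates from the original Swin Transformer paper, 4hwC² + 2(hw)²C for global MSA and 4hwC² + 2M²hwC for W-MSA with M x M windows, and a plain-Python window partition. The function names, the 56 x 56 token grid, the 96 channels, and the window sizes 7 and 4 are illustrative assumptions; the abstract does not state which window size 2SWUNet uses.

```python
def msa_flops(h, w, c):
    """Approximate cost of global self-attention on an h x w token grid with c channels."""
    n = h * w
    # 4*n*c^2 covers the linear projections; 2*n^2*c covers the attention itself.
    return 4 * n * c * c + 2 * n * n * c


def wmsa_flops(h, w, c, m):
    """Approximate cost of self-attention restricted to non-overlapping m x m windows."""
    n = h * w
    # Each token attends to only the m*m tokens of its own window.
    return 4 * n * c * c + 2 * m * m * n * c


def window_partition(grid, m):
    """Split a 2D grid (a list of equal-length rows) into non-overlapping m x m windows."""
    h, w = len(grid), len(grid[0])
    if h % m or w % m:
        raise ValueError("grid height and width must be divisible by the window size")
    windows = []
    for top in range(0, h, m):
        for left in range(0, w, m):
            windows.append([row[left:left + m] for row in grid[top:top + m]])
    return windows


if __name__ == "__main__":
    h = w = 56  # assumed token grid, not taken from the paper
    c = 96      # assumed channel width, not taken from the paper
    print("global MSA:", msa_flops(h, w, c))
    for m in (7, 4):  # hypothetical window sizes for comparison
        print("W-MSA, window", m, ":", wmsa_flops(h, w, c, m))
```

Under these estimates the attention term grows linearly with the number of tokens for a fixed window size, and shrinking the window lowers it further, which is the effect the abstract attributes to the small-window design.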

Citation:

YU Jiamin, CHAN Sixian, LEI Yanjing, WU Wei, WANG Yuan, ZHOU Xiaolong. 2SWUNet: small window SWinUNet based on transformer for building extraction from high-resolution remote sensing images[J]. Optoelectronics Letters, 2024, 20(10): 599-605.

History

Received: August 29, 2023
Revised: April 03, 2024
Online: September 03, 2024