Abstract: To address the accuracy challenge in obstacle detection for autonomous driving, we propose an improved YOLOX-S obstacle detection model that detects multiple target classes, including people, cars, bicycles, motorcycles, and buses. The goal is a YOLOX-S-based model that surpasses the baseline while remaining capable of real-time detection. Our contributions are twofold: (1) We replace the existing YOLOX-S backbone with the Swin Transformer-Tiny backbone. This change aims to improve the local feature extraction capability, leading to more accurate detection of obstacles under real-world vehicle conditions. (2) We reduce the channel configuration between the Swin Transformer and the PA-FPN from [96, 192, 384, 768] to [192, 384, 768]. This lowers the computational cost and makes the Swin Transformer-Tiny more compatible with the PA-FPN. In summary, compared to YOLOX-S, our proposed method, YOLOX-S based on Swin Transformer-Tiny (ST-YOLOX-S), achieves a 6.1% improvement in accuracy on the COCO dataset. For the five obstacle types that appear in simulated real-world vehicle conditions, ST-YOLOX-S shows a marked improvement in accuracy over YOLOX-S. Its detection accuracy is also significantly higher than that of YOLOv3, which demonstrates the effectiveness of the proposed algorithm.
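As a rough illustration of contribution (2), the sketch below shows one way the channel configuration handed to the PA-FPN could be derived from the four Swin Transformer-Tiny stage widths. It assumes that the reduction is realized by dropping the shallowest stage output; the abstract does not state the mechanism, and the names `SWIN_TINY_STAGE_CHANNELS`, `select_neck_inputs`, and `pafpn_in_channels` are hypothetical and do not come from the authors' code.

```python
# Hypothetical sketch: all names are illustrative, not taken from the paper's code.
# Channel widths of the four Swin Transformer-Tiny stages, as listed in the abstract.
SWIN_TINY_STAGE_CHANNELS = [96, 192, 384, 768]


def select_neck_inputs(stage_channels, num_levels=3):
    """Keep the deepest `num_levels` stages as inputs to the neck.

    Assumption: the shallowest stage is the one that is dropped.
    Returns a list of (stage index, channel width) pairs.
    """
    if not 0 < num_levels <= len(stage_channels):
        raise ValueError("num_levels must be between 1 and the number of stages")
    start = len(stage_channels) - num_levels
    return [(i, stage_channels[i]) for i in range(start, len(stage_channels))]


neck_inputs = select_neck_inputs(SWIN_TINY_STAGE_CHANNELS)
# Channel list that a three-level PA-FPN would be configured with.
pafpn_in_channels = [channels for _, channels in neck_inputs]
print(pafpn_in_channels)
```

Under this assumption the neck receives three feature levels instead of four, which matches the three-level input that the PA-FPN in YOLOX-S expects.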