Abstract: Underwater target detection, a critical task in underwater robotics and ocean information processing, faces formidable challenges due to the distinctive imaging conditions beneath the water's surface. Noise interference, indistinct texture features, low contrast, and color distortion all demand advanced computer vision techniques. In this paper, we introduce MCR-YOLO, a deep neural network for underwater target detection built on a Multi-Color Spatial Feature framework. The model employs two feature extraction branches: the first converts the RGB image to the YCbCr color space and extracts non-color features from the luminance (Y) channel using an enhanced ResNet50 architecture, with the output features at three scales integrated for information exchange; the second operates on the three-channel RGB image to incorporate low-frequency information. The features obtained at the three scales of both branches are fused at the corresponding scales, enabling comprehensive feature integration. Finally, a PANet framework performs multi-scale feature fusion and robust target detection. This approach promises to significantly enhance the reliability and accuracy of underwater target detection in challenging underwater environments.
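The dual-branch idea described above can be sketched minimally as follows. This is an illustrative assumption, not the paper's implementation: the ResNet50 backbones are replaced by stand-in feature maps, the luminance extraction uses the standard BT.601 YCbCr conversion, and simple concatenation stands in for whatever scale-wise fusion the model actually uses.

```python
import numpy as np

def rgb_to_y(rgb):
    """Extract the luminance (Y) channel of an RGB image via the
    BT.601 YCbCr conversion, feeding the non-color feature branch."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def fuse_branches(y_feats, rgb_feats):
    """Fuse the two branches scale by scale; channel concatenation
    is a placeholder for the model's actual fusion operation."""
    return [np.concatenate([f1, f2], axis=-1)
            for f1, f2 in zip(y_feats, rgb_feats)]

img = np.random.rand(64, 64, 3)
y = rgb_to_y(img)  # single-channel luminance input for branch one

# Stand-in three-scale outputs for each branch (backbones omitted).
y_feats = [np.random.rand(s, s, 8) for s in (32, 16, 8)]
rgb_feats = [np.random.rand(s, s, 8) for s in (32, 16, 8)]
fused = fuse_branches(y_feats, rgb_feats)
print([f.shape for f in fused])  # one fused map per scale
```

The fused per-scale maps are what a PANet-style neck would then aggregate top-down and bottom-up before the detection heads.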