Abstract: For low-light image enhancement, RAW images are preferable to RGB images because they retain more information, yet their noise and single-channel nature make feature extraction difficult. Existing multi-stage CNN frameworks struggle to extract global features, while single-stage CNN-Transformer hybrids often leave residual noise. To overcome these limitations, this paper introduces a multi-stage RAW image enhancement network that combines CNN and Transformer components. Tailoring each stage to its task, we design a CNN-based denoising block for the denoising stage and incorporate wavelet information to enhance frequency features. For the color and white balance recovery stage, we design a Transformer-based correction block in which the white balance is adjusted dynamically using a signal-to-noise ratio map. With this design, our method outperforms other state-of-the-art models on all metrics on the Sony and Fuji subsets of the SID dataset, and achieves the best SSIM on the MCR dataset.
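To make the signal-to-noise ratio map mentioned above more concrete, the following is a minimal sketch of one common way such a map can be estimated from a single image: smooth the image, treat the residual as noise, and take the ratio. The abstract does not say how the map is actually computed, so the mean-filter denoiser, the function names `box_blur` and `snr_map`, and the parameters `k` and `eps` are all illustrative assumptions rather than the paper's method.

```python
import numpy as np


def box_blur(img, k=5):
    """Mean filter with reflect padding (assumed stand-in for a cheap denoiser)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    # Sum the k*k shifted copies of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def snr_map(img, k=5, eps=1e-6):
    """Per-pixel SNR estimate for a 2D image: smoothed signal / |residual|.

    Hypothetical helper: the formula is an assumption, not taken from the paper.
    """
    img = np.asarray(img, dtype=np.float64)
    smooth = box_blur(img, k)        # rough estimate of the clean signal
    noise = np.abs(img - smooth)     # residual treated as noise
    return smooth / (noise + eps)    # eps avoids division by zero


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.full((8, 8), 0.5)
    noisy = clean + rng.normal(0.0, 0.2, clean.shape)
    print("mean SNR (clean):", snr_map(clean).mean())
    print("mean SNR (noisy):", snr_map(noisy).mean())
```

Under these assumptions, regions with low values in the map are dominated by noise, which is the kind of spatial cue a correction stage could use to adjust its processing per region.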