Abstract: The goal of Multi-View Stereo (MVS) is to robustly recover an accurate 3D point cloud from multiple views. In this paper, we propose a novel Multi-View Stereo network with Regional consistency and Discrepancy cost volume, denoted as MVSRD. First, we present a full-feature interaction transformer that learns the regional consistency between the reference and source views, improving the robustness of reconstruction. Second, we design a discrepancy cost volume that emphasizes pixel-level differences between feature volumes, which facilitates the construction of a high-quality cost volume and enhances the accuracy of reconstruction. Extensive experiments on the DTU and Tanks & Temples datasets demonstrate that MVSRD achieves state-of-the-art performance.