Abstract:Crowd counting is a challenging task, which is partly due to the multiscale variation and perspective distortion of crowd images. To solve these problems, an improved deep multiscale crowd counting network with perspective awareness was proposed. This network contains two branches. One branch uses the improved ResNet50 network to extract multiscale features, and the other extracts perspective information using a perspective-aware network formed by fully convolutional networks. The proposed network structure improves the counting accuracy when the crowd scale changes, and reduce the influence of perspective distortion. To accommodate various crowd scenarios, data-driven approaches are used to fine-tune the trained convolutional neural networks (CNN) model of the target scenes. The extensive experiments on three public datasets demonstrate the validity and reliability of the proposed method.