Abstract:
Event-based cameras generate sparse event streams and capture high-speed motion information. However, as the temporal resolution increases, the spatial resolution decreases sharply. Although generative adversarial networks have achieved remarkable results in traditional image restoration, applying them directly to event inpainting obscures the fast-response characteristics of the event camera and does not fully exploit the sparsity of the event stream. To tackle these challenges, an event-inpainting network is proposed. The number and structure of the network layers are redesigned to adapt to the sparsity of events, and the dimensionality of the convolution is increased to retain more spatiotemporal information. To ensure the temporal consistency of the inpainted images, an event sequence discriminator is added. Experiments were performed on the DHP19 and MVSEC datasets. Compared with the state-of-the-art traditional image inpainting method, the proposed method reduces the number of parameters by 93.5% and runs inference 6 times faster, with only a small loss in restored image quality. In addition, a human pose estimation experiment shows that the model can fill in human motion information in high-frame-rate scenes.
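
The trade-off between temporal and spatial resolution can be illustrated with a minimal sketch. It assumes that events are accumulated into frames over fixed time windows, which is a common representation but is not confirmed here as this paper's pipeline. The events are synthetic and uniformly random, and the names `synthetic_events`, `accumulate`, and `fill_ratio` are hypothetical helpers introduced only for illustration.

```python
import random


def synthetic_events(n_events, width, height, duration_us, seed=0):
    """Generate hypothetical (x, y, t, polarity) events uniformly at random.

    Real event streams are driven by scene motion; uniform sampling is an
    assumption made only to keep the sketch self-contained.
    """
    rng = random.Random(seed)
    return [
        (rng.randrange(width), rng.randrange(height),
         rng.randrange(duration_us), rng.choice((-1, 1)))
        for _ in range(n_events)
    ]


def accumulate(events, width, height, t_start, window_us):
    """Count events per pixel within [t_start, t_start + window_us)."""
    frame = [[0] * width for _ in range(height)]
    for x, y, t, _polarity in events:
        if t_start <= t < t_start + window_us:
            frame[y][x] += 1
    return frame


def fill_ratio(frame):
    """Fraction of pixels that received at least one event."""
    active = sum(1 for row in frame for count in row if count > 0)
    return active / (len(frame) * len(frame[0]))


if __name__ == "__main__":
    events = synthetic_events(5000, 64, 64, 100_000, seed=0)
    # Shorter windows mean a higher frame rate but fewer events per frame.
    for window_us in (1_000, 10_000, 50_000):
        ratio = fill_ratio(accumulate(events, 64, 64, 0, window_us))
        print(f"window = {window_us:>6} us, active pixels = {ratio:.3f}")
```

Because every window starts at the same time, the events in a shorter window are a subset of those in a longer one, so the fraction of active pixels can only shrink as the window shortens. Under the stated assumptions, this shrinking coverage is the gap that an inpainting model would be asked to fill.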