Multi-frame super-resolution reconstruction based on global motion estimation using a novel CNN descriptor
Affiliation:

1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, China; 2. Guangdong Polytechnic Normal University, Guangzhou 510665, China

    Abstract:

    In this paper, we introduce a novel deep-learning-based feature descriptor for multi-frame super-resolution. A model is trained to match image patches of scenes captured under different viewpoints and lighting conditions. Matching patches across images that capture the same scene under varied circumstances and in diverse manners is challenging. We develop a model that maps a raw image patch to a low-dimensional feature vector. Our experiments show that the proposed approach outperforms state-of-the-art descriptors and can be considered a direct replacement for SURF. The results confirm that these techniques further improve the performance of the proposed descriptor. We then propose an improved Random Sample Consensus (RANSAC) algorithm for removing false matching points. Finally, we show that our neural-network-based descriptor for image patch matching outperforms state-of-the-art methods on a number of benchmark datasets and can be used for high-quality image registration in multi-frame super-resolution reconstruction.
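
    The matching and outlier-removal steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' method. It assumes descriptors are already available as plain feature vectors (the CNN that produces them is not shown), uses a textbook nearest-neighbour ratio test and baseline RANSAC rather than the improved variant the paper proposes, and simplifies the global motion model to a pure translation. The function names `match_descriptors` and `ransac_translation`, and all parameter values, are hypothetical.

```python
import math
import random


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match is kept only if the nearest distance is clearly smaller than
    the second-nearest one (ratio test). Distances are Euclidean.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def ransac_translation(pts_a, pts_b, threshold=1.0, iterations=100, seed=0):
    """Baseline RANSAC for a global translation between matched points.

    pts_a[k] and pts_b[k] are the (x, y) positions of the k-th putative
    match. Returns ((dx, dy), inlier_indices).
    """
    if not pts_a or len(pts_a) != len(pts_b):
        raise ValueError("need two non-empty point lists of equal length")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single match fully determines a translation.
        k = rng.randrange(len(pts_a))
        dx = pts_b[k][0] - pts_a[k][0]
        dy = pts_b[k][1] - pts_a[k][1]
        # Consensus set: matches whose displacement agrees with (dx, dy).
        inliers = [
            i for i, (a, b) in enumerate(zip(pts_a, pts_b))
            if math.hypot(b[0] - a[0] - dx, b[1] - a[1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set: the mean displacement of the inliers.
    n = len(best_inliers)
    dx = sum(pts_b[i][0] - pts_a[i][0] for i in best_inliers) / n
    dy = sum(pts_b[i][1] - pts_a[i][1] for i in best_inliers) / n
    return (dx, dy), best_inliers


if __name__ == "__main__":
    # Six matches follow a sub-pixel shift of (2.5, -1.0); two are false.
    a = [(0, 0), (10, 3), (4, 8), (7, 7), (1, 9), (6, 2), (3, 3), (8, 1)]
    b = [(x + 2.5, y - 1.0) for x, y in a[:6]] + [(20.0, 20.0), (0.0, 15.0)]
    shift, inliers = ransac_translation(a, b)
    print(shift, inliers)
```

    In a registration pipeline for multi-frame super-resolution, the translation would typically be replaced by an affine or projective model (with a correspondingly larger minimal sample), and the estimated motion would be used to align the low-resolution frames before fusion.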

Get Citation

GAO Hong-xia, XIE Wang, KANG Hui, LIN Guo-yuan. Multi-frame super-resolution reconstruction based on global motion estimation using a novel CNN descriptor[J]. Optoelectronics Letters, 2019, 15(6): 468-475.

History
  • Received: December 31, 2018
  • Revised: March 20, 2019
  • Online: May 01, 2020