Monocular 3D gaze estimation using feature discretization and attention mechanism
Authors: SHA Tong, SUN Jinglin, PUN Siohang, LIU Yu
Affiliation: 1. School of Microelectronics, Tianjin University, Tianjin 300072, China; 2. Institute of Microelectronics, University of Macau, Macau 999078, China

    Abstract:

    Gaze estimation has become an important field of image and information processing. Estimating gaze from full-face images with convolutional neural networks (CNNs) has achieved good accuracy. Estimating gaze from eye images alone, however, remains challenging, because eye images carry less information than full-face images; it is nonetheless an important problem, since eye-image-based methods have wider applications. In this paper, we propose the discretization-gaze network (DGaze-Net), which improves monocular three-dimensional (3D) gaze estimation accuracy through feature discretization and an attention mechanism. The gaze predictor of DGaze-Net is optimized with feature discretization: by discretizing the gaze angle into K bins, a classification constraint is added to the predictor, so that the gaze angle is first classified into a bin before being regressed against the ground-truth angle, improving estimation accuracy. In addition, an attention mechanism is applied to the backbone to strengthen the extraction of gaze-related eye features. The proposed method is validated on three gaze datasets and achieves encouraging gaze estimation accuracy.
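    The binned classification-then-regression idea in the abstract can be illustrated with a minimal sketch: a continuous gaze angle is mapped to one of K class labels for the classification constraint, and the predictor's bin scores are decoded back to a continuous angle as the softmax expectation over bin centers. The values of K, the angle range, and the bin width below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

# Illustrative settings (assumptions, not the paper's values):
K = 90                    # number of bins for the gaze angle
LOW, HIGH = -90.0, 90.0   # angle range in degrees
WIDTH = (HIGH - LOW) / K  # width of each bin (2 degrees here)

def angle_to_bin(angle):
    """Discretize a continuous gaze angle into one of K class labels,
    giving the target for the classification constraint."""
    return int(np.clip((angle - LOW) // WIDTH, 0, K - 1))

def logits_to_angle(logits):
    """Decode the predictor's K bin scores to a continuous angle as the
    softmax expectation over bin centers, which can then be regressed
    against the ground-truth angle."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    centers = LOW + WIDTH * (np.arange(K) + 0.5)
    return float((probs * centers).sum())
```

    In training, a cross-entropy loss on the bin label and a regression loss on the decoded angle would be combined; the expectation-based decoding keeps the classification branch differentiable end to end.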

Citation

SHA Tong, SUN Jinglin, PUN Siohang, LIU Yu. Monocular 3D gaze estimation using feature discretization and attention mechanism[J]. Optoelectronics Letters, 2023, 19(5): 301-306.

History
  • Received: October 09, 2022
  • Revised: January 08, 2023
  • Online: June 19, 2023