Global-local feature attention network with reranking strategy for image caption generation

doi:https://doi.org/10.1007/s11801-017-7185-4


Author: WU Jie (1), XIE Si-ya (1), SHI Xin-bao (1), CHEN Yao-wen (2)

Affiliation:
1. College of Engineering, Shantou University, Shantou 515063, China
2. Key Laboratory of Digital Signal and Image Processing of Guangdong, Shantou University, Shantou 515063, China

Abstract:

In this paper, a novel framework, named global-local feature attention network with reranking strategy (GLAN-RS), is presented for the image captioning task. Rather than adopting only a single kind of visual information, as classical models do, GLAN-RS uses an attention mechanism to capture salient local convolutional image maps. Furthermore, we adopt a reranking strategy to adjust the priority of the candidate captions and select the best one. The proposed model is evaluated on the Microsoft Common Objects in Context (MSCOCO) benchmark dataset across seven standard evaluation metrics. Experimental results show that GLAN-RS significantly outperforms state-of-the-art approaches such as the multimodal recurrent neural network (MRNN) and Google NIC, improving the BLEU4 score by 20% and the CIDEr score by 13 points.
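
The abstract does not give the model's equations, so the following is only a minimal sketch of the two general ideas it names, under stated assumptions: generic soft (softmax-weighted) attention over local feature vectors, fused with a global feature by concatenation, and reranking of candidate captions by a secondary score. The function names (`attend`, `fuse`, `rerank`), the dot-product relevance score, the concatenation step, and the length-normalised scorer are all hypothetical choices for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(local_feats, query):
    """Generic soft attention (assumed form, not the paper's exact layer).

    Each local feature vector is scored by its dot product with the query,
    the scores are normalised with softmax, and the context vector is the
    weighted sum of the local features.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in local_feats]
    weights = softmax(scores)
    dim = len(local_feats[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, local_feats))
        for d in range(dim)
    ]
    return weights, context


def fuse(global_feat, context):
    """Combine global and attended local features by concatenation (assumption)."""
    return list(global_feat) + list(context)


def rerank(candidates, scorer):
    """Reorder candidate captions by a secondary score, best first."""
    return sorted(candidates, key=scorer, reverse=True)


def length_normalised(candidate):
    """Hypothetical scorer: caption log-probability divided by word count."""
    caption, log_prob = candidate
    return log_prob / len(caption.split())


if __name__ == "__main__":
    # Two toy local feature vectors and a query standing in for a decoder state.
    weights, context = attend([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
    fused = fuse([0.5], context)
    print(weights, fused)

    # Toy candidates as (caption, log-probability) pairs, e.g. from beam search.
    candidates = [("a dog", -2.0), ("a dog runs on grass", -3.0)]
    print(rerank(candidates, length_normalised)[0][0])
```

In this sketch the reranking step is kept separate from caption generation, so any secondary scorer can be swapped in without changing how the candidates were produced.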

Get Citation

WU Jie, XIE Si-ya, SHI Xin-bao, CHEN Yao-wen. Global-local feature attention network with reranking strategy for image caption generation[J]. Optoelectronics Letters, 2017, 13(6): 448-451.

History

Received: August 10, 2017
Online: November 17, 2017
