Document Type

Conference Paper

Rights

Available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Licence

Disciplines

Computer Sciences

Publication Details

Artificial Neural Networks and Machine Learning – ICANN 2018: 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4-7, 2018, Proceedings, Part I. Springer International Publishing, 2018. LNCS 11139. ISBN 978-3-030-01418-6.

Abstract

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty.

We make our source code publicly available online: https://github.com/AnnikaLindh/Diverse_and_Specific_Image_Captioning
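To illustrate the general idea of a retrieval-based learning signal, the snippet below is a minimal, hypothetical sketch (not the authors' training code): generated-caption embeddings are scored against image embeddings with cosine similarity, and a contrastive loss over the similarity matrix rewards captions that retrieve their own image rather than other, similar images. The embedding sizes, the cosine-similarity retrieval model, and the cross-entropy contrastive loss are illustrative assumptions; the paper and repository linked above describe the actual approach.

```python
# Hypothetical sketch of a retrieval-based learning signal for captioning.
import torch
import torch.nn.functional as F

batch, dim = 4, 256

# Stand-ins: caption embeddings produced (differentiably) from the captioning
# model's generated captions, and fixed image embeddings from the retrieval
# model's image encoder.
caption_emb = torch.randn(batch, dim, requires_grad=True)
image_emb = torch.randn(batch, dim)

# Retrieval score: cosine similarity between every caption and every image.
sim = F.normalize(caption_emb, dim=1) @ F.normalize(image_emb, dim=1).T

# Contrastive retrieval loss: each generated caption should retrieve its own
# image (the diagonal of the similarity matrix), which pushes captions toward
# being specific to their image rather than generic.
targets = torch.arange(batch)
retrieval_loss = F.cross_entropy(sim / 0.07, targets)

retrieval_loss.backward()  # gradient flows back into the caption generator
print(float(retrieval_loss))
```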

DOI

https://doi.org/10.1007/978-3-030-01418-6_18

Funder

ADAPT Centre for Digital Content Technology

Creative Commons License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

