A complete human verified Turkish caption dataset for MS COCO and performance evaluation with well-known image caption models trained against it
Published in International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), 2022
Abstract: The procedure of generating natural language captions for an image is known as image captioning. Automatic image captioning is a particularly challenging task that stands at the junction of Computer Vision and Natural Language Processing. It has a variety of applications, including text-based image retrieval, assisting visually impaired users, and human-robot interaction. The majority of publications on the subject focus on the English language, which is an analytical language with characteristics differing from the agglutinative Turkish language. This work introduces the Turkish MS COCO dataset that extends the original MS COCO collection with captions in the Turkish language; experimental results surpass the current state-of-the-art for the Turkish image captioning field. Furthermore, the newly introduced database is also applicable for the study of machine translation. On the Turkish MS COCO dataset, the best performance has been achieved with the Meshed Memory Transformers with a Bleu-1 score of 0.72. The database is publicly available at https://github.com/BilgiAILAB/TurkishImageCaptioning. It is desired that the Turkish MS COCO dataset with the proposed benchmark will be an excellent resource for future studies on Turkish image captioning.
Keywords: Performance evaluation; Mechatronics; Databases; Neural networks; Image retrieval; Human-robot interaction; Transformers; Computer Vision; Natural Language Processing; Turkish image captioning; Turkish MS COCO database; CNN; RNN; LSTM; deep neural networks
URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9988025&isnumber=9987729
Recommended citation: S. B. Golech, S. B. Karacan, E. B. Sönmez and H. Ayral, "A complete human verified Turkish caption dataset for MS COCO and performance evaluation with well-known image caption models trained against it," 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), Maldives, Maldives, 2022, pp. 1-6, doi: 10.1109/ICECCME55909.2022.9988025.
Download Paper | Download Slides