Are These Birds Similar: Learning Branched Networks for Fine-grained Representations

Shah Nawaz; Alessandro Calefati; Moreno Caraffini; Nicola Landro; Ignazio Gallo
2019-01-01

Abstract

Fine-grained image classification is challenging because the classes in a dataset follow a hierarchical, coarse-to-fine distribution. Object parts are commonly used to discriminate between classes in fine-grained datasets; however, not all parts are beneficial or indispensable. In recent years, natural language descriptions have been used to obtain information about the discriminative parts of an object. This paper leverages natural language descriptions and proposes a strategy for learning a joint representation of descriptions and images, using a two-branch network with multiple layers, to improve fine-grained classification. Extensive experiments show that our approach yields significant accuracy improvements on the fine-grained image classification task. Furthermore, our method achieves new state-of-the-art results on the CUB-200-2011 dataset.
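The abstract gives no architectural details, so the following is only a minimal sketch of what a two-branch network over image and text features might look like. The layer sizes, the ReLU activations, the L2 normalisation, and the use of concatenation to form the joint representation are all assumptions made for illustration and are not taken from the paper; every function and variable name here is hypothetical.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create one fully connected layer as a (weights, bias) pair."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def make_branch(sizes, rng):
    """Stack layers so that consecutive entries of `sizes` are connected."""
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def linear(x, layer):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def l2_normalize(x):
    # Fall back to 1.0 so an all-zero vector does not cause a division by zero.
    norm = math.sqrt(sum(v * v for v in x)) or 1.0
    return [v / norm for v in x]


def branch_forward(x, layers):
    """Run one branch: ReLU between layers, L2-normalised final output."""
    for layer in layers[:-1]:
        x = relu(linear(x, layer))
    return l2_normalize(linear(x, layers[-1]))


rng = random.Random(0)

# Hypothetical sizes: real image and text features would be much larger.
IMAGE_DIM, TEXT_DIM, HIDDEN_DIM, JOINT_DIM = 8, 6, 5, 4

# Each modality gets its own multi-layer branch ending in a shared-size space.
image_branch = make_branch([IMAGE_DIM, HIDDEN_DIM, JOINT_DIM], rng)
text_branch = make_branch([TEXT_DIM, HIDDEN_DIM, JOINT_DIM], rng)

# Stand-ins for precomputed image and description features.
image_features = [rng.random() for _ in range(IMAGE_DIM)]
text_features = [rng.random() for _ in range(TEXT_DIM)]

image_embedding = branch_forward(image_features, image_branch)
text_embedding = branch_forward(text_features, text_branch)

# Assumed fusion: concatenate the two embeddings into one joint vector.
joint_representation = image_embedding + text_embedding

# Dot product of the normalised embeddings, a simple image-text agreement score.
similarity = sum(a * b for a, b in zip(image_embedding, text_embedding))
```

In a trained system the branch weights would be learned so that an image and its description map to nearby points; here they are random, so the similarity value carries no meaning beyond showing the data flow.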
2019
2019 International Conference on Image and Vision Computing New Zealand (IVCNZ 2019)
9781728141879
International Conference on Image and Vision Computing New Zealand
Dunedin, New Zealand
2-4 December, 2019

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2083822