Learning to Navigate in the Gaussian Mixture Surface

La Grassa R.; Gallo I.; Landro N.
2021-01-01

Abstract

In recent years, deep learning models have achieved remarkable generalization capability on computer vision tasks, obtaining excellent results in fine-grained classification problems. Sophisticated approaches based on discriminative feature learning via patches have been proposed in the literature, boosting model performance and achieving state-of-the-art results on well-known datasets. The Cross-Entropy (CE) loss function is commonly used to enhance the discriminative power of the deep learned features, encouraging separability between the classes. However, by observing the activation maps generated by these models in the hidden layers, we find that many image regions with low discriminative content have a high activation response, which could lead to misclassifications. To address this problem, we propose a loss function called Gaussian Mixture Centers (GMC) loss, leveraging the idea that data follow multiple unimodal distributions. We aim to reduce variance by considering many centers per class, using the information from the hidden layers of a deep model, and decreasing the high response from the unnecessary image areas observed in the baselines. By using the CE and GMC losses jointly, we improve the generalization of the learned model, outperforming the baselines in several use cases. We show the effectiveness of our approach through experiments on the CUB-200-2011, FGVC-Aircraft, and Stanford-Dogs benchmarks, using the most recent Convolutional Neural Network (CNN) architectures.
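
The abstract does not state the GMC formula, so the following is only an illustrative sketch of the general idea of pairing a CE term with a penalty that pulls hidden-layer features toward one of several centers per class. It is not the authors' implementation. The hard assignment to the nearest center, the squared Euclidean distance, the `weight` parameter, and all function names are assumptions made for this example.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def nearest_center_penalty(feature, class_centers):
    """Squared Euclidean distance from a feature to its closest center.

    Assumption: hard assignment to the nearest of the class's centers.
    """
    return min(
        sum((f - c) ** 2 for f, c in zip(feature, center))
        for center in class_centers
    )


def joint_loss(logits, feature, label, centers, weight=0.5):
    """CE plus a weighted multi-center penalty (hypothetical combination)."""
    ce = cross_entropy(logits, label)
    penalty = nearest_center_penalty(feature, centers[label])
    return ce + weight * penalty


# Toy usage: two classes, class 0 has two centers, class 1 has one.
centers = {0: [[0.0, 0.0], [1.0, 0.5]], 1: [[5.0, 5.0]]}
loss = joint_loss([0.0, 0.0], [1.0, 0.0], 0, centers)
```

In this toy setup, a feature that moves closer to any center of its own class lowers the penalty term, which is one simple way to express "many centers per class" without committing to a specific mixture model.
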
Year: 2021
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN: 978-3-030-89127-5; 978-3-030-89128-2
Conference: 19th International Conference on Computer Analysis of Images and Patterns, CAIP 2021

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2125883