Learning to Navigate in the Gaussian Mixture Surface
La Grassa R.;Gallo I.;Landro N.
2021-01-01
Abstract
In recent years, deep learning models have achieved remarkable generalization on computer vision tasks, obtaining excellent results in fine-grained classification problems. Sophisticated approaches based on discriminative feature learning via patches have been proposed in the literature, boosting model performance and achieving state-of-the-art results on well-known datasets. The Cross-Entropy (CE) loss function is commonly used to enhance the discriminative power of deeply learned features, encouraging separability between classes. However, by observing the activation maps generated in the hidden layers of these models, we find that many image regions with low discriminative content produce a high activation response, which can lead to misclassifications. To address this problem, we propose a loss function called Gaussian Mixture Centers (GMC) loss, which leverages the idea that the data follow multiple unimodal distributions. We aim to reduce intra-class variance by considering several centers per class, using information from the hidden layers of a deep model, and by decreasing the high responses from uninformative image areas detected in the baselines. By jointly using the CE and GMC losses, we improve model generalization and outperform the baselines in several use cases. We show the effectiveness of our approach through experiments on the CUB-200-2011, FGVC-Aircraft, and Stanford-Dogs benchmarks, considering the most recent Convolutional Neural Networks (CNNs).
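The abstract does not give the exact formulation of the GMC loss, but the description (several centers per class, combined with CE) suggests a multi-center generalization of the classic center loss. The following is a minimal, framework-free sketch under that assumption: each class owns K centers, each sample is penalized by its squared distance to the nearest center of its class, and the result is added to softmax cross-entropy with a weighting factor. All function names (`gmc_loss`, `joint_loss`) and the nearest-center assignment rule are illustrative assumptions, not the authors' implementation.

```python
import math

def gmc_loss(features, labels, centers):
    """Multi-center sketch of a center-style loss (assumed form): penalize each
    sample's squared distance to the NEAREST of its class's K centers.
    `centers[c]` is a list of K center vectors for class c."""
    total = 0.0
    for f, y in zip(features, labels):
        # Squared Euclidean distance to every center of the sample's class.
        dists = [sum((ci - fi) ** 2 for ci, fi in zip(c, f)) for c in centers[y]]
        total += min(dists)  # nearest-center (mixture-component) assignment
    return total / len(features)

def cross_entropy(logits, labels):
    """Standard softmax cross-entropy, averaged over the batch."""
    loss = 0.0
    for z, y in zip(logits, labels):
        m = max(z)  # subtract the max logit for numerical stability
        log_sum = m + math.log(sum(math.exp(v - m) for v in z))
        loss += log_sum - z[y]
    return loss / len(logits)

def joint_loss(logits, features, labels, centers, lam=0.1):
    # Joint objective as described in the abstract: CE encourages separability
    # between classes, GMC compactness around per-class centers. `lam` is a
    # hypothetical balancing weight.
    return cross_entropy(logits, labels) + lam * gmc_loss(features, labels, centers)
```

In a real training loop the centers would be learnable parameters updated alongside the network, and the features would come from a hidden layer of the CNN; this sketch only illustrates the shape of the objective.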