A Sequential Segmentation and Classification Learning Approach for Skin Lesion Images

Gallazzi, Mirco; Gallo, Ignazio; Corchs, Silvia
2025-01-01

Abstract

This study investigates how the learning order between segmentation and classification tasks influences performance and generalization in medical image analysis. We propose a Sequential Swin Transformer framework that reuses a shared Transformer backbone with alternating task-specific heads to compare two sequential strategies: (i) segmentation followed by classification and (ii) classification followed by segmentation. Unlike conventional multitask or preprocessing-based pipelines, the proposed framework isolates the impact of task ordering on feature transfer under an identical architecture. Evaluated on the HAM10000 skin lesion dataset, the segmentation-then-classification configuration achieves the highest multiclass accuracy (up to 86.9%) while maintaining strong segmentation performance (Jaccard index ≈ 86%). Statistical tests confirm its superiority in accuracy and macro F1 score, whereas Grad-CAM and t-distributed stochastic neighbor embedding (t-SNE) analyses reveal that segmentation-first training yields more lesion-centered attention and a more discriminative latent space. Cross-domain evaluation on gastrointestinal endoscopy images further demonstrates robust segmentation (Jaccard index ≈ 91%) and multiclass accuracy (≈ 94.5%), confirming the generalizability of the sequential paradigm. Overall, the proposed method provides a theoretically grounded, clinically interpretable, and reproducible alternative to joint multitask learning approaches, enhancing feature transfer and generalization in medical imaging.
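For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how a Jaccard index (for binary segmentation masks) and a macro F1 score (for multiclass labels) are commonly computed. It is not the authors' evaluation code: the function names `jaccard_index` and `macro_f1` are hypothetical, the empty-mask convention is an assumption, and the abstract does not state how the paper averages scores across images.

```python
from typing import Sequence


def jaccard_index(pred: Sequence[int], target: Sequence[int]) -> float:
    """Intersection over union of two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, target) if p == 1 or t == 1)
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); guard kept for safety.
        scores.append(0.0 if denom == 0 else 2 * tp / denom)
    return sum(scores) / len(scores)
```

On toy inputs (not data from the paper), `jaccard_index([1, 1, 0, 0], [1, 0, 1, 0])` has one overlapping pixel out of three in the union, so it evaluates to 1/3. Because macro F1 weights every class equally, it is a common complement to accuracy on class-imbalanced datasets.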
Keywords: deep learning; gastrointestinal disease detection; medical imaging; segmentation and classification; sequential learning; skin lesion analysis; Swin Transformer; transfer learning

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2202671