Multimodal representation and learning / Nawaz, Shah. - (2019).

Multimodal representation and learning

Nawaz, Shah
2019-01-01

Abstract

Recent years have seen an explosion of multimodal data on the web, which makes multimodal learning important for understanding web content. However, combining modalities is challenging because each modality has a different representation and correlational structure. In addition, different modalities generally carry different kinds of information that can enrich understanding; for example, the sight of a flower may evoke happiness even though its scent may not be pleasant. Multimodal information can therefore support more informed decisions. We focus on improving the representations of individual modalities in order to enhance multimodal representation and learning. In this doctoral thesis, we present techniques that enhance representations from individual and multiple modalities for multimodal applications, including classification, cross-modal retrieval, matching, and verification, on various benchmark datasets.
2019
Multimodal, cross-modal retrieval, cross-modal matching, cross-modal verification
Files in this item:
File: PhD_Thesis_NawazShah_completa.pdf
Access: open access
Description: full text of the thesis
Type: Doctoral thesis
License: Not specified
Size: 4.19 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2090709