Improved HIV-1 Subtyping Accuracy Using near Full-Length Sequencing: A Comparison of Common Tools

The extensive genetic diversity of HIV-1, also represented by the circulation of multiple subtypes and circulating recombinant forms (CRFs), poses significant challenges for accurate subtype classification, especially when sequencing is limited to partial genomic regions. This study evaluated the performance of four commonly used automated subtyping tools (Stanford HIVdb, COMET, REGA, and Geno2pheno) by comparing their outputs with molecular phylogenetic analysis (Mphy), considered the gold standard, using three NGS-derived sequence data sets: protease-reverse transcriptase (PR-RT), pol, and near full-length (NFL). One hundred plasma samples were processed to generate sequences of increasing length, which were analyzed to assess concordance, sensitivity, and specificity. NFL-based Mphy identified a higher proportion of circulating recombinant forms (51.6%) than PR-RT and pol (44.1%) and enabled the reclassification of 13 samples as more complex CRFs. Automated tools displayed good concordance with Mphy for PR-RT and pol, particularly for pure subtypes, whereas concordance decreased considerably for NFL sequences, especially among non-B subtypes and CRFs. Sensitivity varied substantially across tools and subtypes, while specificity remained consistently high. Overall, the findings indicate that whole genome or NFL sequencing enhances the detection of CRFs and that the accuracy of automated tools is strongly influenced by the completeness and updating of their reference databases.
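As a rough illustration of the three metrics named above (concordance, sensitivity, specificity), the sketch below compares one tool's subtype calls against reference calls using a standard one-vs-rest definition. The abstract does not describe how the study computed these values, so the function names and the toy labels are assumptions made for illustration only and are not data from the paper.

```python
def concordance(reference, predicted):
    """Fraction of samples where the tool's call equals the reference call."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    agree = sum(r == p for r, p in zip(reference, predicted))
    return agree / len(reference)


def per_subtype_metrics(reference, predicted, subtype):
    """One-vs-rest sensitivity and specificity for a single subtype.

    Returns (sensitivity, specificity); a value is None when its
    denominator is zero (no positives or no negatives in the reference).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fn = fp = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref == subtype:
            if pred == subtype:
                tp += 1
            else:
                fn += 1
        else:
            if pred == subtype:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical toy labels (NOT from the study): reference phylogenetic calls
# versus the calls of one automated tool for six samples.
mphy_calls = ["B", "B", "CRF02_AG", "F1", "B", "CRF02_AG"]
tool_calls = ["B", "B", "CRF02_AG", "B", "B", "G"]

overall = concordance(mphy_calls, tool_calls)
sens_b, spec_b = per_subtype_metrics(mphy_calls, tool_calls, "B")
sens_crf, spec_crf = per_subtype_metrics(mphy_calls, tool_calls, "CRF02_AG")
```

In this toy data the tool never misses a subtype B sample but calls one non-B sample B, so sensitivity for B is high while specificity drops; for the CRF it is the reverse. Computing the metrics per subtype is what lets that kind of difference show up.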

Smoquina F.; Berno G.; Forbici F.; Sberna G.; Rozera G.; Abbate I.; Lazzari E.; Amendola A.; Mazzotta V.; Gagliardini R.; Antinori A.; Girardi E.; Maggi F.; Fabeni L.

2025-01-01

Year: 2025
Journal: International Journal of Molecular Sciences
DOI: https://dx.doi.org/10.3390/ijms262311666
PubMed ID: 41373814
Web of Science ID: WOS:001634899500001
Scopus ID: 2-s2.0-105024641209
Keywords: circulating recombinant forms; genetic diversity; HIV-1; HIV-1 subtypes; molecular phylogeny; near full-length; next generation sequencing; subtyping automated tools; whole genome sequencing
All authors: Smoquina, F.; Berno, G.; Forbici, F.; Sberna, G.; Rozera, G.; Abbate, I.; Lazzari, E.; Amendola, A.; Mazzotta, V.; Gagliardini, R.; Antinori, A.; Gira...
Publication type: Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2208050
