Nanopore ReCappable sequencing maps SARS-CoV-2 5' capping sites and provides new insights into the structure of sgRNAs

Clementi, Massimo; Mancini, Nicasio
2022-01-01

Abstract

The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNAs (sgRNAs) used to express structural and accessory proteins. Long-read sequencing technologies such as nanopore direct RNA sequencing can recover full-length transcripts, greatly simplifying the assembly of structurally complex RNAs. However, these techniques do not detect the 5' cap, thus preventing reliable identification and quantification of full-length, coding transcript models. Here we used Nanopore ReCappable Sequencing (NRCeq), a new technique that can identify capped full-length RNAs, to assemble a complete annotation of SARS-CoV-2 sgRNAs and annotate the location of capping sites across the viral genome. We obtained robust estimates of sgRNA expression across cell lines and viral isolates and identified novel canonical and non-canonical sgRNAs, including one that uses a previously unannotated leader-to-body junction site. The data generated in this work constitute a useful resource for the scientific community and provide important insights into the mechanisms that regulate the transcription of SARS-CoV-2 sgRNAs.
2022
Ugolini, Camilla; Mulroney, Logan; Leger, Adrien; Castelli, Matteo; Criscuolo, Elena; Williamson, Maia Kavanagh; Davidson, Andrew D; Almuqrin, Abdulaziz; Giambruno, Roberto; Jain, Miten; Frigè, Gianmaria; Olsen, Hugh; Tzertzinis, George; Schildkraut, Ira; Wulf, Madalee G; Corrêa, Ivan R; Ettwiller, Laurence; Clementi, Nicola; Clementi, Massimo; Mancini, Nicasio; Birney, Ewan; Akeson, Mark; Nicassio, Francesco; Matthews, David A; Leonardi, Tommaso
Files in this item:
File: NAR2022.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 2.84 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2149022
Citations
  • PMC 10
  • Scopus 8
  • ISI 10