A versatile tool for precise variant calling in Mycobacterium tuberculosis genetic polymorphisms

Document Type

Article

Department

Paediatrics and Child Health; Pathology and Laboratory Medicine

Abstract

Background: Whole genome sequencing (WGS) facilitates the diagnosis of multidrug-resistant tuberculosis (MDR-TB) through the interpretation of sequence variations (SV) in Mycobacterium tuberculosis (MTB) genes. Because information on phenotypic and genotypic resistance associations continues to evolve, it is important to identify SV within genes of interest. We developed MTB-VCF, a variant calling pipeline that can compare sequences against the reference genome for any gene of interest. We demonstrate its utility for calling SV in genes associated with rifampicin (RIF), isoniazid (INH), ethambutol (EM), and streptomycin (SM) resistance.
Methods: MTB-VCF is a Python-based command-line variant calling pipeline designed to streamline batch processing of raw read (FASTQ) files. SV called by MTB-VCF were compared with those identified by the TB-Profiler, KvarQ, CASTB, Mykrobe Predictor, and PhyResSE pipelines. The sensitivity and specificity of MTB-VCF SV calling were calculated against the drug susceptibility testing (DST) phenotype.
Results: MTB-VCF identified 868 SV in 200 phenotypically resistant MDR-TB isolates, distributed across the rpsL, rrs, rpoB, inhA, katG, ahpC, gidB, and embCAB genes. Of these, 684 SV were known to be associated with a resistance genotype, leading to a specificity of 97.75%. The SV called by MTB-VCF were compared separately with resistance genotypes called by the TB-Profiler, KvarQ, CASTB, Mykrobe Predictor, and PhyResSE pipelines, demonstrating a sensitivity of 99.5%.
Conclusion: The MTB-VCF pipeline offers a rapid and accurate solution for identifying SV in target genes for subsequent interpretation. It can be run in large batches, providing flexible computing that allows customization of core bioinformatic pipelines and enables the analysis of WGS data from different technologies.
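
The sensitivity and specificity figures reported above are the standard confusion-matrix ratios. As an illustration only, the following minimal Python sketch shows how such values could be computed when genotypic calls are compared with a DST phenotype. It is not taken from the MTB-VCF code base: the `Confusion` class, the `tally` function, and the toy data are hypothetical, and the abstract does not describe the exact counting scheme the authors used.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of genotype-versus-phenotype agreement (hypothetical helper)."""
    tp: int = 0  # resistance variant called, phenotype resistant
    fp: int = 0  # resistance variant called, phenotype susceptible
    tn: int = 0  # no resistance variant, phenotype susceptible
    fn: int = 0  # no resistance variant, phenotype resistant

    @property
    def sensitivity(self) -> float:
        # TP / (TP + FN); NaN if there are no phenotypically resistant samples
        denom = self.tp + self.fn
        return self.tp / denom if denom else float("nan")

    @property
    def specificity(self) -> float:
        # TN / (TN + FP); NaN if there are no phenotypically susceptible samples
        denom = self.tn + self.fp
        return self.tn / denom if denom else float("nan")


def tally(calls):
    """Tally (genotype_resistant, phenotype_resistant) boolean pairs."""
    counts = Confusion()
    for genotype_resistant, phenotype_resistant in calls:
        if genotype_resistant and phenotype_resistant:
            counts.tp += 1
        elif genotype_resistant:
            counts.fp += 1
        elif phenotype_resistant:
            counts.fn += 1
        else:
            counts.tn += 1
    return counts


# Toy data, invented for illustration; not results from the study.
toy_calls = [
    (True, True), (True, True), (False, True),
    (False, False), (False, False), (False, False), (True, False),
]
result = tally(toy_calls)
print(f"sensitivity={result.sensitivity:.3f} specificity={result.specificity:.3f}")
```

Returning NaN for an empty denominator is a design choice in this sketch, so that a batch containing only resistant (or only susceptible) isolates does not raise an error.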

Comments

Volume, issue and pagination are not provided by the author/publisher.

Publication (Name of Journal)

BioRxiv

DOI

10.1101/2023.07.24.550283
