The MAGMA pipeline for comprehensive genomic analyses of clinical Mycobacterium tuberculosis samples
by Tim H. Heupink, Lennert Verboven, Abhinav Sharma, Vincent Rennie, Miguel de Diego Fuertes, Robin M. Warren, Annelies Van Rie
BackgroundWhole genome sequencing (WGS) holds great potential for the management and control of tuberculosis. Accurate analysis of samples with low mycobacterial burden, which are characterized by low (<20x) coverage and high (>40%) levels of contamination, is challenging. We created the MAGMA (Maximum Accessible Genome for Mtb Analysis) bioinformatics pipeline for analysis of clinical Mtb samples.
Methods and resultsHigh accuracy variant calling is achieved by using a long seedlength during read mapping to filter out contaminants, variant quality score recalibration with machine learning to identify genuine genomic variants, and joint variant calling for low Mtb coverage genomes. MAGMA automatically generates a standardized and comprehensive output of drug resistance information and resistance classification based on the WHO catalogue of Mtb mutations. MAGMA automatically generates phylogenetic trees with drug resistance annotations and trees that visualize the presence of clusters. Drug resistance and phylogeny outputs from sequencing data of 79 primary liquid cultures were compared between the MAGMA and MTBseq pipelines. The MTBseq pipeline reported only a proportion of the variants in candidate drug resistance genes that were reported by MAGMA. Notable differences were in structural variants, variants in highly conserved rrs and rrl genes, and variants in candidate resistance genes for bedaquiline, clofazmine, and delamanid. Phylogeny results were similar between pipelines but only MAGMA visualized clusters.
ConclusionThe MAGMA pipeline could facilitate the integration of WGS into clinical care as it generates clinically relevant data on drug resistance and phylogeny in an automated, standardized, and reproducible manner.