General files

International WGS pipeline validation
To validate our pipeline, we compared our results against standardised pipelines developed by public health TB institutions. These TB reference laboratories include the Research Center Borstel (Germany), Public Health England (UK), and the National Institute for Public Health and the Environment (RIVM, The Netherlands). In all comparisons, we obtained similar SNP calls and similar transmission cluster predictions. Furthermore, the parameters used in our pipeline have been discussed and approved by the TB community (Meehan, CJ. et al. 2019. Nat. Rev. Microbiol. doi:10.1038/s41579-019-0214-5).
Pipeline available at: https://gitlab.com/tbgenomicsunit/ThePipeline
For a detailed flowchart of the pipeline, please see this document.

MTB Inferred Ancestor Sequence
Sequence for the inferred MTB ancestor from Comas et al. 2010.

SNP panel for lineage classification
We use this SNP panel for lineage typing purposes. The list has been constructed from different sources. Initially, the SNPs came from the Coll et al. publication. Later, we modified the L2 classification using the SNPs from the work of Shitikov et al., while keeping the lineage nomenclature of Rutaihwa et al.
The L4 nomenclature and SNPs were also updated based on the ones proposed by Stucki et al.
Regarding the animal-adapted strains, we have calculated two lineage-defining SNPs for M. bovis and M. caprae from our own collection of samples. In addition, we have calculated one lineage-defining SNP for each of the A3, A2 and A1 lineages (as defined by Brites et al.).
Because we use the inferred MTBC ancestor sequence as the mapping reference in our analyses, the reference allele in our list is the allele found in the ancestral genome. Please bear this in mind when classifying lineage 4 and lineage 4.10 strains from genomic sequences mapped against the H37Rv reference genome.
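
The caveat about the mapping reference can be illustrated with a minimal Python sketch. It is not the pipeline's own code. The panel entries, positions, alleles and function names below are hypothetical placeholders, not values from the real SNP list. The sketch also assumes that H37Rv carries the derived allele at the lineage 4 and lineage 4.10 markers, that both references share the same coordinates, and that every marker position is covered by reads.

```python
from typing import Dict, List, NamedTuple, Set


class Marker(NamedTuple):
    position: int   # genome coordinate of the marker
    ancestral: str  # allele in the inferred ancestor (the "reference" in the panel)
    derived: str    # lineage-defining allele
    lineage: str


# Hypothetical panel: positions and alleles are placeholders for illustration only.
PANEL: List[Marker] = [
    Marker(1000, "G", "A", "L2"),
    Marker(2000, "C", "T", "L4"),
    Marker(3000, "A", "G", "L4.10"),
]

# Assumption: H37Rv already carries the derived allele for these lineages.
H37RV_LINEAGES: Set[str] = {"L4", "L4.10"}


def reference_allele(marker: Marker, reference: str) -> str:
    """Return the allele that the chosen mapping reference has at a marker."""
    if reference == "ancestor":
        return marker.ancestral
    if reference == "H37Rv":
        if marker.lineage in H37RV_LINEAGES:
            return marker.derived
        return marker.ancestral
    raise ValueError(f"unknown reference: {reference}")


def assign_lineages(variants: Dict[int, str], reference: str = "ancestor",
                    panel: List[Marker] = PANEL) -> Set[str]:
    """Return the lineages whose derived allele is present in the sample.

    `variants` maps position -> called allele for sites that differ from the
    mapping reference. Positions without a call are assumed to match the
    reference (this sketch does not handle uncovered positions).
    """
    found = set()
    for marker in panel:
        allele = variants.get(marker.position, reference_allele(marker, reference))
        if allele == marker.derived:
            found.add(marker.lineage)
    return found
```

Under these assumptions, the same lineage 4.10 strain shows two variant calls when mapped against the ancestor (`{2000: "T", 3000: "G"}`) and none at those positions when mapped against H37Rv (`{}`). Both inputs should give the same lineage assignment once the reference is taken into account.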



Sorted by author

Álvaro Chiner-Oms

Galo Goig Serrano