plosone-phylo
pone.0007815.g002.png
Maximum parsimony phylogeny based on concatenates of 89 gene sequences from 108 MTBC strains from global sources as previously reported [31].
Six main lineages can be observed within the human MTBC (numbered 1 to 6 and indicated in different colours). As shown previously, these lineages are highly congruent with the ones defined based on genomic deletions or large sequence polymorphisms (LSPs) [31], [33], [34]. Corresponding spoligotyping data for each strain are shown on the right, where black squares indicate the presence of a particular spacer and white squares its absence (see Figure 1 for details on the methodology). Because the various typing techniques have classified MTBC strains into several lineages and strain families using differing nomenclatures, some of the traditional names are also shown. Some of the traditional groupings defined by spoligotyping correlate with SNP-based lineages (see also Table S1). For example, EAI (East-African-Indian) corresponds to the pink lineage, AFR1 and AFR2 correspond to the green and brown lineages, respectively (these strains are also known as M. africanum), and CAS (Central-Asian) corresponds to the purple lineage. However, other strain groupings defined by spoligotyping should be regarded as sub-lineages within the main lineages. For example, the ‘Beijing’ strain family is part of the blue lineage, and the five spoligotyping groups ‘Cameroon’, ‘Uganda’, ‘X’, ‘Haarlem’, and ‘LAM (Latin-American-Mediterranean)’ are sub-lineages within the main red lineage. This highlights another limitation of spoligotyping: it cannot define phylogenetic relationships between strain groupings. In addition, asterisks indicate spoligotyping patterns that cannot be classified at all using standard ‘signature patterns’ [26]. PGG1, PGG2, and PGG3 indicate Principal Genetic Group 1, 2, and 3, respectively. The PGG nomenclature is based on two SNPs originally described by Sreevatsan et al. [7].
Comparison to the MLSA data shows that these groups are not phylogenetically equivalent: most of the MTBC diversity falls within PGG1, whereas PGG3 includes only a small subset of strains.