In a recent article published in Nature Medicine, researchers investigated the potential impact of single-nucleotide variations in gut-dwelling bacterial species (the microbiome) on human health.

Background
It is well recognized that the bacterial species that make up the gut microbiome influence human (host) health and contribute to diseases including inflammatory bowel disease (IBD) and obesity.
Previous genome-wide association studies (GWASs) have revealed that bacterial single-nucleotide polymorphisms (SNPs) can enable bacteria to infect new host species.
Furthermore, subspecies diversity, strain diversity, mobile gene composition, and copy number variation in the gut microbiome differ between hosts and affect each host's phenotypic characteristics differently.
Despite the relevance of SNP-level diversity in the gut microbiome in relation to host-microbiome interactions, many metagenomic-based studies have not systematically investigated the link between bacterial SNPs and human traits.
About the study
Thus, in the present study, researchers designed a framework for metagenome-wide association studies (MWASs) to identify SNP-level inter-species variability in gut-resident bacteria and uncover mechanistic links between individual bacterial SNPs and human traits or phenotypes, in this case, body mass index (BMI).
They obtained metagenomic samples from a cohort of 7,190 healthy individuals from Israel; however, they ultimately analyzed only the 7,056 participants whose records of age, sex, and BMI were complete.
First, they identified genomic sequences unique to a species using the unique relative abundance (URA) algorithm, which aligns sequence reads to a larger high-quality reference set of the species (read assignment step). Next, they compared all reads assigned to the same genomic location to find the global dominant allele.
Further, they filtered all genomic loci by their coverage (≥1,000 samples) and variability (major allele frequency ≤99%, on average). A population of gut bacteria can have any number of allele copies. Therefore, the team modeled the genotype of each sample as a continuous number (0 to 1) representing the ‘major allele frequency’.
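The genotype encoding and locus filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the per-sample read-count dictionaries, and the rule that any sample with at least one read "covers" a locus are all assumptions made for the example.

```python
from statistics import mean

def global_major_allele(samples):
    """Allele carried by the most reads across all samples at one locus."""
    totals = {}
    for counts in samples:
        for allele, n in counts.items():
            totals[allele] = totals.get(allele, 0) + n
    return max(totals, key=totals.get) if totals else None

def locus_genotypes(samples, min_samples=1000, max_mean_freq=0.99):
    """Per-sample major allele frequencies (0 to 1), or None if the locus is filtered out.

    samples: one dict per sample mapping allele -> read count (hypothetical format).
    """
    major = global_major_allele(samples)
    if major is None:
        return None
    freqs = []
    for counts in samples:
        depth = sum(counts.values())
        if depth > 0:  # assumption: a sample covers the locus if it has any read
            freqs.append(counts.get(major, 0) / depth)
    # Coverage filter (enough samples) and variability filter (not near-invariant)
    if not freqs or len(freqs) < min_samples or mean(freqs) > max_mean_freq:
        return None
    return freqs
```

With the default `min_samples=1000`, a locus is kept only if at least 1,000 samples cover it and the mean major allele frequency does not exceed 99%, mirroring the thresholds quoted above.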
In total, 12,686,191 genomic loci, spread across the genomes of 348 enteric bacterial species, were identified as SNPs. The average number of SNPs detected per genome was reported as 3,221.
Then, they built a linear regression model for each SNP, with major allele frequency as the independent (explanatory) variable and BMI as the dependent variable. To select SNPs within each species, they applied a clumping method ordered by the P value of association: the SNP with the smallest P value was selected first, and all SNPs linked to it were removed.
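A per-SNP regression of this kind might look like the sketch below. It is an illustration under stated assumptions, not the study's code: the function name is hypothetical, and the P value uses a normal approximation to the t distribution, which is only reasonable for cohorts of thousands of samples like the one described here.

```python
import math

def snp_association(freqs, bmi):
    """Ordinary least squares of BMI on major allele frequency for one SNP.

    Returns (slope, p_value). Assumes freqs is not constant, which the
    variability filter is meant to guarantee.
    """
    n = len(freqs)
    mx, my = sum(freqs) / n, sum(bmi) / n
    sxx = sum((x - mx) ** 2 for x in freqs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(freqs, bmi))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares and standard error of the slope
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(freqs, bmi))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:  # perfect fit
        return slope, 0.0
    z = slope / se
    # Two-sided P value, large-sample normal approximation (assumption)
    return slope, math.erfc(abs(z) / math.sqrt(2))
```

The slope is the estimated change in BMI between a sample carrying only the minor allele (frequency 0) and one carrying only the major allele (frequency 1).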
The researchers calculated the statistical significance of the association between each SNP-phenotype pair based on the P value estimated for the SNP. They corrected all P values using the Bonferroni method. Clumping with a correlation coefficient threshold of 0.3 yielded a filtered list of SNPs that were correlated with the phenotype but not with each other.
To distinguish SNPs genuinely associated with the host phenotype from associations confounded by differences in host diet, medication, and physical activity, they used a simple GWAS approach in which other host characteristics, such as age, served as covariates.
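The correction and clumping steps can be sketched together. This is a hedged illustration: the function names, the significance level of 0.05, and the use of the absolute Pearson correlation between major allele frequency vectors as the measure of linkage are assumptions, since the summary only gives the 0.3 threshold.

```python
import math

def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def clump(p_corrected, freqs, alpha=0.05, r_threshold=0.3):
    """Greedy clumping within one species.

    p_corrected: dict SNP name -> corrected P value.
    freqs: dict SNP name -> per-sample major allele frequencies.
    Visits significant SNPs from smallest to largest P value and keeps a SNP
    only if it is not correlated above the threshold with any SNP already kept.
    """
    order = sorted((s for s in p_corrected if p_corrected[s] < alpha),
                   key=lambda s: p_corrected[s])
    kept = []
    for snp in order:
        if all(abs(pearson(freqs[snp], freqs[k])) <= r_threshold for k in kept):
            kept.append(snp)
    return kept
```

Visiting SNPs in order of P value and skipping any that correlate with an already kept SNP gives the same result as repeatedly picking the top SNP and deleting everything linked to it.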
BMI-associated SNPs were identified in the genomes of 27 bacterial species. The researchers therefore investigated whether the relative abundance of these bacterial species was also associated with BMI. Using the relative abundance of each species as a covariate helps to prevent confounding between within-species (SNP-level) and between-species (abundance-level) effects.
The researchers then assessed the robustness and reproducibility of the observed SNP-phenotype associations using an independent cohort of 8,204 individuals from the Dutch Microbiome Project.
Results
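Adjusting for covariates amounts to fitting BMI on the SNP and the covariates jointly and reading off the SNP coefficient. The sketch below solves the normal equations in plain Python; the function name and the choice of covariates in the usage note are illustrative assumptions, and a real analysis would more likely use an established statistics library.

```python
def ols(columns, y):
    """Least-squares coefficients for y ~ intercept + columns.

    columns: list of predictor vectors (e.g. SNP frequency, age, abundance).
    Returns [intercept, coef_1, coef_2, ...]. Assumes the predictors are not
    collinear; a singular system raises ZeroDivisionError.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]
```

Calling `ols([snp_freqs, age, species_abundance], bmi)` returns the SNP effect as the second element, estimated with age and species abundance held fixed. If that coefficient stays clearly non-zero, the SNP carries information beyond what the abundance of its species explains.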
Of the 1,358 bacterial SNPs found to be associated with host BMI, only 40 showed independent associations.
When a similar MWAS analysis was run with different sample sizes to estimate statistical power, in 44% of cases a species carried an SNP associated with BMI even though the relative abundance of that species was not associated with BMI.
Thus, 12 of the 27 bacterial species harboring BMI-associated SNPs showed no association with BMI in the relative abundance analysis. For example, a BMI-associated SNP was found in an inflammatory pathway of Bilophila wadsworthia, and another group of SNPs was found in a region encoding energy metabolism functions in the Faecalibacterium prausnitzii genome.
Importantly, 52% of BMI-associated SNPs were discovered in species unrelated to BMI by their relative abundance.
In the geographically and technically distinct Dutch cohort, 17 of the 40 SNP-BMI associations were replicated (42.5%), and one more was significantly associated but in the opposite direction, suggesting that these associations were not random.
Furthermore, seven of the 14 species whose SNP-BMI associations were replicated in the second cohort did not have species-level relative abundance correlations with BMI, further validating the additional information found at the SNP level.
Additional MWAS analyses for the 40 SNPs, using diet, medication, and exercise as covariates in the regression, showed that none of these factors could explain most of the SNP-BMI associations. Diet and exercise confounded only two SNP-BMI associations, possibly by influencing bacterial genetics and host obesity status independently.
Conclusion
Genus- or species-level taxonomic characterization of the gut microbiome is insightful. However, on its own it does not provide a comprehensive understanding of how the gut microbiome and human health are interconnected. In contrast, a fine-resolution view of host-microbiome interactions, in particular at the level of SNPs, can help identify specific bacterial functions associated with host characteristics.
The MWAS framework used in this study overcomes the limitations of human GWAS and shows how individual SNPs in the microbiome are associated with host BMI.
It demonstrated how each observed association could be mapped to a specific bacterium, gene locus, and even protein domain and studied in its functional context, which helped to generate mechanistic hypotheses about the effect of the microbiome on host weight.
Importantly, some BMI-related SNPs may have a causal role and, once validated, may aid in personalized therapeutics. For example, the mean BMI difference between allele groups for some SNPs was >2 points, equivalent to a difference of about 5.8 kg for a 1.7-meter-tall individual. Thus, treatments targeting these SNPs, if they prove causal, may have large effect sizes. Similarly, some of the BMI-related SNPs discovered in this study were adaptive, which may help improve microbiome-based treatments.
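The weight figure follows directly from the definition of BMI as weight in kilograms divided by height in meters squared. A small check, with a hypothetical helper name:

```python
def bmi_to_kg(bmi_difference, height_m):
    """Weight difference (kg) implied by a BMI difference at a fixed height.

    BMI = weight / height**2, so weight = BMI * height**2.
    """
    return bmi_difference * height_m ** 2

# A 2-point BMI difference at 1.7 m: 2 * 1.7**2 = 5.78 kg, about 5.8 kg
print(round(bmi_to_kg(2, 1.7), 1))
```

Because the conversion scales with height squared, the same 2-point difference corresponds to a larger weight difference in taller individuals.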
Future research should improve this MWAS framework by developing methods accounting for bacterial population structure.