Presented by: Quang Nguyen
Background: High dimensionality and sparsity are challenging problems in the statistical analysis of microbiome relative abundance data. One approach is to aggregate taxa into sets, most commonly Linnaean taxonomic categories identified through classification of representative sequences. However, most researchers perform aggregation through the pairwise summation of counts, which prevents comparison across sets of different sizes.
Methods: We developed a taxa set enrichment method for microbiome relative abundance data based on the competitive isometric log-ratio (cILR) transformation. Our method generates sample-specific taxa set enrichment scores with a well-defined null hypothesis corresponding to the Q2 competitive null hypothesis in the gene set testing literature. Significance testing was performed by estimating the empirical null distribution while accounting for variance inflation due to inter-taxa correlation.
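As an illustration of the kind of score described above, the following is a minimal sketch of an isometric log-ratio balance that contrasts the taxa inside a set with all taxa outside it, computed per sample. It assumes the standard balance form, a scaled log ratio of geometric means, and is not taken from the poster itself. The function name `balance_score`, the `pseudocount` handling of zeros, and the toy data are hypothetical choices made for this example; the significance testing step is not shown.

```python
import numpy as np

def balance_score(abundances, in_set, pseudocount=1e-6):
    """ILR-style balance of taxa inside a set versus all taxa outside it.

    abundances: 1-D (taxa) or 2-D (samples x taxa) array of counts or proportions.
    in_set: boolean mask over taxa marking membership in the set.
    pseudocount: assumed zero-handling strategy, added before taking logs.
    """
    x = np.asarray(abundances, dtype=float) + pseudocount
    in_set = np.asarray(in_set, dtype=bool)
    n_in = int(in_set.sum())
    n_out = int((~in_set).sum())
    if n_in == 0 or n_out == 0:
        raise ValueError("the set and its complement must both be non-empty")
    log_x = np.log(x)
    # The difference of mean logs equals the log ratio of geometric means.
    log_ratio = log_x[..., in_set].mean(axis=-1) - log_x[..., ~in_set].mean(axis=-1)
    # Normalizing constant of an ILR balance; it depends on both group sizes.
    scale = np.sqrt(n_in * n_out / (n_in + n_out))
    return scale * log_ratio

# Toy data (made up for illustration): 3 samples x 5 taxa.
counts = np.array([
    [10, 20, 5, 0, 15],
    [3, 0, 40, 12, 8],
    [7, 7, 7, 7, 7],
])
membership = np.array([True, True, False, False, False])
scores = balance_score(counts, membership)  # one score per sample
```

A positive score indicates that, in that sample, the set's taxa are on average more abundant (in the geometric-mean sense) than the taxa outside the set; a sample with identical abundances for every taxon scores zero.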
Results: We demonstrated the performance of our method using both real data and parametric simulations across three microbiome analysis tasks: single-sample enrichment testing, differential abundance testing, and disease prediction.
Conclusions: The cILR method provides a flexible way to aggregate taxonomic variables to pre-defined sets, allowing for a comparison of enrichment across sets of different sizes. The statistic corresponds to a well-defined null hypothesis and is designed to address the compositional nature of microbiome data.
Quang Nguyen – Poster Description (Audio Clip)
If you have any questions regarding the poster, feel free to reach out to Quang Nguyen here.