Presented by: Yancong Zhang
Microbial communities are rich reservoirs of molecular functions that influence environmental and host-associated chemistry, with numerous roles in ecosystem maintenance, health, and disease. However, our knowledge of these molecular mechanisms is limited, because the range of microbial genetic material is massive compared with the throughput available for experimental characterization. Here, we used a novel method to systematically predict the functional capacity of uncharacterized proteins in the human microbiome by assessing high-dimensional community-wide data. We predicted potential functions for the large proportion of uncharacterized protein families (~70% of the total) in 1,595 metagenomes and 800 metatranscriptomes (MTX) from the Integrative Human Microbiome Project (HMP2). Using only MTX-based information, our approach achieved an average AUC of approximately 0.7 for Gene Ontology (GO) biological process term prediction. Aggregating predictions from other types of information further improved the AUC to about 0.88. Further evaluations showed that our method produces realistic prediction profiles for communities that are comparable to those of state-of-the-art single-organism tools. These results demonstrate the effectiveness of MTX-based evidence, which represents community-specific information independent of sequence similarity and can be used to predict functions for novel microbial community proteins (i.e., those that lack significant homologs among sequences from characterized microbial isolates). Our method is generalizable to any type of microbial community, providing a new approach to predicting microbial protein functions. We implemented it as an open-source tool, FUGAsseM (Uncharacterized Gene products by Assessing high-dimensional community data in Microbiome), with documentation available at http://huttenhower.sph.harvard.edu/fugassem.
This study expands our understanding of the functional capacity of uncharacterized proteins in the human microbiome and establishes a community-based methodology for inferring the functions of uncharacterized microbial proteins.
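For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how an average AUC across GO terms can be computed. It is not the FUGAsseM implementation: the function names, the term labels `TERM_A` and `TERM_B`, and all scores are hypothetical toy values chosen only to show the arithmetic.

```python
from typing import Dict, List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(per_term: Dict[str, Tuple[List[int], List[float]]]) -> float:
    """Unweighted average of per-term AUCs (one binary task per GO term)."""
    aucs = [roc_auc(labels, scores) for labels, scores in per_term.values()]
    return sum(aucs) / len(aucs)


# Hypothetical toy data: for each term, known annotations (1 = protein family
# is annotated with the term) and predicted confidence scores.
toy = {
    "TERM_A": ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    "TERM_B": ([1, 0, 0, 1], [0.8, 0.3, 0.2, 0.7]),
}

print(mean_auc(toy))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values of roughly 0.7 and 0.88 can be read on that scale. How per-term AUCs were weighted in the study is not stated here; the unweighted mean above is an assumption.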