The Microbiome Analysis Core at the Harvard T.H. Chan School of Public Health (HMAC) provides end-to-end support for microbial community and human microbiome research, from experimental design through data generation, bioinformatics, and statistics. This includes general consulting, power calculations, selection of data generation options, and analysis of data from amplicon (16S/18S/ITS) sequencing, shotgun metagenomic sequencing, metatranscriptomics, metabolomics, and other molecular assays. The HMAC has extensive experience with microbiome profiles in diverse populations, including taxonomic and functional profiles from large cohorts, quantitative ecology, multi'omics and meta-analysis, and microbial systems and human epidemiological analysis. By integrating microbial community profiles with host clinical and environmental information, we enable researchers to interpret molecular activities of the microbiota and assess its impact on human health.
Your meta’omic data is housed on dedicated and regularly backed-up network storage drives. With the written consent of the Investigator, data will be removed from our storage after six months of inactivity following completion of the collaboration.
Amplicon (16S rRNA / ITS) and shotgun (SG) metagenomic or metatranscriptomic sequencing data are passed through a quality control pipeline using the bioBakery workflows (http://huttenhower.sph.harvard.edu/biobakery_workflows).
16S / ITS: The amplicon sequence data pipelines offer two approaches, USEARCH / VSEARCH and DADA2 (https://bitbucket.org/biobakery/biobakery_workflows/wiki/Home#!16s-rrna-16s), which identify operational taxonomic units (OTUs) and amplicon sequence variants (ASVs), respectively. These taxonomic profiles are then passed to PICRUSt (http://picrust.github.io/picrust/index.html), which combines the inferred gene content of each taxon with its abundance to predict the metagenome composition of the 16S-resolved community. PICRUSt-predicted metagenomes are amenable to downstream analyses similar to those applied to metagenomes from shotgun sequencing data, but with taxonomic resolution limited by 16S. In tiered-design studies, MicroPita (http://huttenhower.sph.harvard.edu/micropita) takes results from 16S surveys as input to inform the selection of a sample subset for SG follow-up work, governed by user-specified features of interest (clinical/environmental metadata, diversity measures, etc.).
SG: Microbiome composition (bacteria, archaea, viruses, and eukaryotic microbes) is gleaned from SG sequencing data using MetaPhlAn2 (http://huttenhower.sph.harvard.edu/metaphlan2), which resolves taxonomic diversity and abundance at the subspecies level.
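As an illustration of the kind of diversity measure that can guide sample selection in a tiered design, the following minimal sketch computes the Shannon index from per-sample OTU counts and ranks samples by it. It is not MicroPita's implementation; the function names, sample names, and counts are hypothetical and chosen only for the example.

```python
import math


def relative_abundance(counts):
    """Convert raw per-taxon read counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero abundance."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical toy OTU tables (assumed values, not real data).
samples = {
    "sample_A": {"otu1": 50, "otu2": 50},
    "sample_B": {"otu1": 97, "otu2": 1, "otu3": 1, "otu4": 1},
    "sample_C": {"otu1": 25, "otu2": 25, "otu3": 25, "otu4": 25},
}

# Rank samples from most to least diverse, e.g. to shortlist for SG follow-up.
ranked = sorted(samples, key=lambda s: shannon_index(samples[s]), reverse=True)
for name in ranked:
    print(name, round(shannon_index(samples[name]), 3))
```

In practice the selection criteria would be combined with clinical or environmental metadata rather than diversity alone.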
Metagenomes, both PICRUSt-predicted and SG-sequenced, can further be passed through the HUMAnN2 (http://huttenhower.sph.harvard.edu/humann2) pipeline. HUMAnN2 determines conservation and abundance of gene modules (sets of genes related by sequence and function) and biochemical pathways to reveal the metabolic potential of the microbial community.
Data features derived with these algorithms, including gene/pathway presence and abundance, gene expression, microbiome composition, OTUs, ASVs, or peptide identifications from metaproteomics and compound tables from meta-metabolomics, can be integrated with clinical and environmental metadata using LEfSe (https://bitbucket.org/biobakery/biobakery/wiki/lefse) and MaAsLin2 (http://huttenhower.sph.harvard.edu/maaslin2) along with other packages for the R statistical environment. LEfSe identifies those data features that differ between two classes of a metadata variable (e.g. two sampling sites, two clinical outcomes, two biochemical markers, two modalities, etc.). MaAsLin2 extends the functionality of LEfSe to identify associations between data features and multiple metadata factors, which can be discrete and/or continuous and can include time series data.
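To make the two-class comparison concrete, the sketch below computes the Kruskal-Wallis H statistic for one feature measured at two sites. LEfSe as published combines a Kruskal-Wallis test with pairwise Wilcoxon tests and linear discriminant analysis; this example covers only the first statistic, omits the tie correction and the p-value, and is not the LEfSe implementation. The function names and abundance values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for group in groups for v in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical relative abundances of one taxon at two sampling sites.
site_a = [0.10, 0.12, 0.15]
site_b = [0.30, 0.28, 0.35]

h_stat = kruskal_wallis_h([site_a, site_b])
print(f"H = {h_stat:.3f}")
```

A larger H indicates stronger separation of the ranks between classes; a real analysis would also convert H to a p-value and correct for testing many features.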
For computing infrastructure, the Core uses the FAS Research Computing (FASRC) cluster.
Service models: fee-for-service ($150/hour)
This rate supports advanced consultation, analysis, administrative tasks, FASRC compute cluster cycles, and data storage.
An easy-to-use computing environment that provides Huttenhower Lab tools for the analysis of meta’omic data, compatible with Windows, Mac OS, and Linux.