The Eli and Edythe L. Broad Institute of Harvard and MIT is founded on two core beliefs:
1. This generation has a historic opportunity and responsibility to transform medicine by using systematic approaches in the biological sciences to dramatically accelerate the understanding and treatment of disease.
2. To fulfill this mission, we need new kinds of research institutions, with a deeply collaborative spirit across disciplines and organizations, and having the capacity to tackle ambitious challenges.
The Broad Institute is essentially an “experiment” in a new way of doing science, empowering this generation of researchers to:
* Act nimbly. Encouraging creativity often means moving quickly, and taking risks on new approaches and structures that often defy conventional wisdom.
* Work boldly. Meeting the biomedical challenges of this generation requires the capacity to mount projects at any scale — from a single individual to teams of hundreds of scientists.
* Share openly. Seizing scientific opportunities requires creating methods, tools and massive data sets — and making them available to the entire scientific community to rapidly accelerate biomedical advancement.
* Reach globally. Biomedicine should address the medical challenges of the entire world, not just advanced economies, and include scientists in developing countries as equal partners whose knowledge and experience are critical to driving progress.
ALLPATHS-LG is a whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads (~100 bp) such as those produced by the new generation of sequencers. The significant difference between ALLPATHS and traditional assemblers such as Arachne is that ALLPATHS assemblies are not necessarily linear, but instead are presented in the form of a graph. This graph representation retains ambiguities, such as those arising from polymorphism, uncorrected read errors, and unresolved repeats, thereby providing information that has been absent from previous genome assemblies.
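To make the idea of a graph-shaped assembly concrete, here is a minimal sketch of how a graph can keep an unresolved polymorphism as two parallel edges instead of collapsing it into one linear sequence. The node names, the sequences, and the `enumerate_paths` helper are all hypothetical and chosen for illustration; this is not the ALLPATHS data structure or file format.

```python
# Toy assembly graph (hypothetical, not the ALLPATHS representation).
# Each node maps to a list of (target_node, edge_sequence) pairs.
graph = {
    "n0": [("n1", "ACGT")],
    "n1": [("n2", "A"), ("n2", "G")],  # "bubble": two alleles kept side by side
    "n2": [("n3", "TTCA")],
    "n3": [],
}


def enumerate_paths(graph, start, end):
    """Return every sequence spelled by a path from start to end.

    Assumes the graph is acyclic; a graph with unresolved repeats
    (cycles) would need a bound on path length.
    """
    if start == end:
        return [""]
    sequences = []
    for target, edge_seq in graph[start]:
        for tail in enumerate_paths(graph, target, end):
            sequences.append(edge_seq + tail)
    return sequences


# A linear assembly would report only one of these sequences.
print(enumerate_paths(graph, "n0", "n3"))
# ['ACGTATTCA', 'ACGTGTTCA']
```

A linear assembler would have to pick one of the two sequences; the graph form keeps both, which is the kind of ambiguity the description says is retained.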
User manual webpage for the Birdsuite software package.
User manual for the VAAL genome comparison tool.
"(ALLPATHS-LG) works on both small and large (mammalian size) genomes. To use it, you should first generate ~100 base Illumina reads from two libraries: one from ~180 bp fragments, and one from ~3000 bp fragments, both at about 45x coverage. Sequence from longer fragments will enable longer-range continuity."
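As a rough worked example of what "about 45x coverage" with ~100-base reads implies, the sketch below estimates the number of reads per library. It assumes coverage means total sequenced bases divided by genome size; the two genome sizes and the `reads_needed` helper are illustrative assumptions, not values or code from the ALLPATHS-LG documentation.

```python
import math


def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Reads required so that reads * read_length = coverage * genome size."""
    return math.ceil(genome_size_bp * coverage / read_length_bp)


# Illustrative genome sizes (assumptions for this example only).
GENOMES = {
    "bacterial (~5 Mb)": 5_000_000,
    "mammalian (~3 Gb)": 3_000_000_000,
}

for name, size in GENOMES.items():
    per_library = reads_needed(size, coverage=45, read_length_bp=100)
    # Two libraries (~180 bp and ~3000 bp fragments), each at about 45x.
    print(f"{name}: {per_library:,} reads per library, "
          f"{2 * per_library:,} across both libraries")
```

Under these assumptions a ~5 Mb genome needs about 2.25 million reads per library, and a ~3 Gb genome about 1.35 billion.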
"Birdseed is a SNP genotyping algorithm that runs on the Affymetrix 500K, SNP5.0, and SNP6.0 platforms. Although Affymetrix officially supports Birdseed only for SNP6.0, we and others have found that it has excellent performance on the 500K and SNP5.0 platforms as well."
Birdseye is used as a component of Birdsuite to discover rare or de novo copy number variants (CNVs).
"The Birdsuite is a 4-part framework that first genotypes known CNPs (calls discrete copy number), then calls SNP genotypes (for samples and SNPs believed to have 2 copies of the locus) and builds batch- and locus-specific models of probe characteristics, then searches for novel CNVs, and finally integrates these together to provide mutually informed and consistent sequence- and copy-number-genotypes at each locus for each sample."
Fawkes takes common CNV and SNP calls from Canary and Birdseed, and rare CNV calls from Birdseye, and performs an integrated analysis to allow combined visualization of genome analyses.
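The four-part ordering described for Birdsuite can be summarized in a small table-like sketch. The tool names and roles come from the descriptions above; the `Stage` class and the `upstream_of` helper are hypothetical and only illustrate which outputs each step depends on. They are not part of Birdsuite's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    tool: str
    role: str


# Order and roles as described in the text; a descriptive table only,
# not Birdsuite's command-line or programming interface.
BIRDSUITE_STAGES = (
    Stage("Canary", "genotype known common CNPs"),
    Stage("Birdseed", "call SNP genotypes"),
    Stage("Birdseye", "search for rare or de novo CNVs"),
    Stage("Fawkes", "integrate SNP and CNV calls"),
)


def upstream_of(tool):
    """Return the tools that run before `tool` in the described order."""
    names = [stage.tool for stage in BIRDSUITE_STAGES]
    return names[: names.index(tool)]


for number, stage in enumerate(BIRDSUITE_STAGES, start=1):
    print(f"{number}. {stage.tool}: {stage.role}")

print(upstream_of("Fawkes"))
# ['Canary', 'Birdseed', 'Birdseye']
```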
The Genome Analysis Toolkit or GATK is a software package for analysis of high-throughput sequencing data, developed by the Data Science and Data Engineering group at the Broad Institute. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Used for VCF generation.
"VAAL is a variant ascertainment algorithm that can be used to detect SNPs, indels, and more complex genetic variants. On bacterial data sets, it achieves very high sensitivity, and near perfect specificity. VAAL can be used to compare reads from one strain to a reference sequence from another strain. It can also be used to compare reads from two strains to each other, using a third strain to determine homology. For example, we have used VAAL to find a single mutation responsible for bacterial resistance: the output of the program was that single mutation and no others. VAAL uses an assisted assembly algorithm that borrows from ALLPATHS."