We study the genetic basis of complex traits and common diseases in humans. Our group is based in the Departments of Medical Genetics and of Epidemiology at the University Medical Center Utrecht, and in the Division of Genetics at Brigham and Women's Hospital and Harvard Medical School. We are also affiliated with the Program in Medical and Population Genetics at the Broad Institute of Harvard and MIT.
A central goal of our lab is to develop computational tools and statistical approaches to analyze patterns of genetic variation in human populations, and to apply these methods to identify genetic determinants of disease susceptibility, disease progression, and drug response.
We pursue these activities in the context of immune-related disorders, including HIV/AIDS and autoimmune disease, as well as cerebrovascular and cardiovascular traits, working with many collaborators in Boston and elsewhere.
de Bakker, Paul, Ph.D.
Role: Visiting Professor of Medicine
"The HLA calling algorithm determines the most likely pair of HLA types at each locus by systematically evaluating all possible pairs of 4-digit HLA types. We use three key components to calculate the posterior probability for each HLA allele pair. First, we compare genotypes for each allele pair to the genotypes determined by the Genome Analysis Toolkit (GATK) based on sequence data. Second, we check the allelic phase of each HLA allele pair for consistency with the sequence data. Specifically, we calculate the binomial probability that the phase orientation for a specific HLA allele pair is consistent with the sequence data at a pair of adjacent polymorphic sites, and aggregate these probabilities across all pairs of polymorphic sites. Third, we use information about the expected allele frequency to determine the prior probability of observing each pair of HLA alleles in the population (if the ancestry is known). We then multiply the probabilities calculated from base genotypes, allelic phase information, and allele frequencies, rescale (to ensure all posteriors sum to 1), and output the posterior probability for each HLA allele pair. The pair with the highest posterior probability corresponds to the best-guess genotype for that DNA sample." The algorithm is part of the Genome Analysis Toolkit (GATK).
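The three-component calculation described in the quotation can be illustrated with a minimal sketch. This is not the GATK implementation: the function names (`phase_probability`, `pair_prior`, `hla_pair_posteriors`), the input dictionaries, the read error rate, and all numeric values are hypothetical. Two modeling choices are also assumptions, since the description does not specify them: the per-site-pair binomial probabilities are aggregated by multiplication, and the prior for an allele pair is derived from allele frequencies under Hardy-Weinberg proportions.

```python
from itertools import combinations_with_replacement
from math import comb, prod


def phase_probability(site_pair_counts, error_rate=0.01):
    """Aggregate binomial probabilities over adjacent polymorphic site pairs.

    site_pair_counts: list of (consistent_reads, total_reads) tuples.
    Assumptions: a read agrees with the true phase with probability
    1 - error_rate, and site pairs are combined by multiplication.
    """
    p = 1.0 - error_rate
    return prod(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k, n in site_pair_counts
    )


def pair_prior(a, b, freqs):
    """Prior for an unordered allele pair under Hardy-Weinberg (assumed)."""
    return freqs[a] ** 2 if a == b else 2.0 * freqs[a] * freqs[b]


def hla_pair_posteriors(alleles, genotype_prob, phase_counts, freqs):
    """Return {(a, b): posterior} over all unordered allele pairs."""
    unnormalized = {}
    for pair in combinations_with_replacement(sorted(alleles), 2):
        unnormalized[pair] = (
            genotype_prob[pair]                       # agreement with called genotypes
            * phase_probability(phase_counts[pair])   # allelic phase consistency
            * pair_prior(pair[0], pair[1], freqs)     # population allele frequencies
        )
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("all allele pairs have zero probability")
    # Rescale so that the posteriors sum to 1.
    return {pair: value / total for pair, value in unnormalized.items()}


# Hypothetical inputs for a single locus with two candidate 4-digit alleles.
alleles = ["A*01:01", "A*02:01"]
freqs = {"A*01:01": 0.6, "A*02:01": 0.4}
genotype_prob = {
    ("A*01:01", "A*01:01"): 0.05,
    ("A*01:01", "A*02:01"): 0.90,
    ("A*02:01", "A*02:01"): 0.05,
}
phase_counts = {
    ("A*01:01", "A*01:01"): [(8, 10)],
    ("A*01:01", "A*02:01"): [(10, 10)],
    ("A*02:01", "A*02:01"): [(7, 10)],
}

posteriors = hla_pair_posteriors(alleles, genotype_prob, phase_counts, freqs)
best = max(posteriors, key=posteriors.get)  # best-guess genotype
```

With these made-up inputs the heterozygous pair receives nearly all of the posterior mass, because it scores highest on all three components. When ancestry is unknown, a uniform prior could be substituted for `pair_prior`, which leaves the ranking to the genotype and phase terms alone.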
"We have developed MANTEL, a tool for performing a meta-analysis of GWAS data sets, including detailed QC checks for individual GWAS data sets and generation of regional association plots."