eagle-i Harvard University

Kraft Laboratory

Location: 655 Huntington Avenue, Building II Room 207, Boston, Massachusetts 02115


The study of the relative contribution of genes and environment to the risk of common diseases presents a number of statistical challenges, from study design to analysis. My research focus is statistical methodology in genetic epidemiology, including family-based and population-based case-control studies.

My current projects include methods to measure association between haplotypes of multiple tightly-linked markers and disease in matched case-control studies and to detect gene x gene and gene x environment interactions. I am also interested in using joint variation in DNA sequence and gene expression to better understand disease etiology.

I collaborate with colleagues in the Department of Epidemiology and the Channing Laboratory on a number of large-scale cohort studies, such as the Nurses' Health Study, as well as the international Cohort Consortium for Breast and Prostate Cancer.





  • FEXAT ( Algorithmic software component )

  • GECOR ( Algorithmic software component )

    "GECOR is a Windows program written in Java (i.e., a Java desktop GUI wrapper around an R function) for calculating sample sizes in matched case-control studies examining genetic and environmental factors and/or gene-environment interaction. It allows for sample size calculations for the main effects of gene and/or environment, as well as gene-environment interaction. Sample sizes for the main effects of gene and/or environment may also be calculated without gene-environment interaction. Environmental effects may be modelled as either dichotomous or categorical. Genetic models are restricted to an additive, dominant, or recessive mode of inheritance. Users may specify multiple controls per case. Additionally, the sample size or power calculations can accommodate scenarios with correlation between the case and control environmental exposure levels."

  • GEmis ( Algorithmic software component )

    "GEmis calculates power in case-control studies examining genetic factors in the presence of misclassification of an environmental factor E as well as dependence between the genetic variant G and the environmental exposure E. Three different tests are considered: the test for the marginal effect of the gene, the standard test for gene-environment interaction, and the joint test for a genetic marginal effect and gene-environment interaction. The environmental effect is modelled as dichotomous. The genetic model is restricted to dominant inheritance."

  • HAPPY ( Algorithmic software component )

    "HAPPY estimates haplotype-specific odds ratios from genotype data on unrelated cases and controls using unconditional logistic regression. It can adjust for the main effects of relevant covariates and estimate stratum-specific haplotype effects. Aside from confidence intervals around individual odds ratio estimates, HAPPY calculates omnibus tests of haplotype association and haplotype-environment interactions. HAPPY uses the 'expectation substitution' approach [1,2], which treats expected haplotype scores (calculated under a user-specified inheritance model) as observed covariates in a standard unconditional logistic analysis. The macro outputs these expected scores to an auxiliary data set; the scores can then be used in customized analyses."

  • MISOR ( Algorithmic software component )

    Odds ratios for phase-known haplotypes measured with error.

  • MOFET ( Algorithmic software component )

  • MULTIPOW 2.1 ( Algorithmic software component )

    "MULTIPOW calculates the power for both joint and replication-based analysis of general multi-stage genetic association studies. It differs from other packages that calculate the power for joint analysis in that: (a) it allows for an arbitrary number of stages (three, four or more instead of just two); (b) it can incorporate the efficiency of the genotyped marker panel into the power calculations; and (c) it is based on the 2 d.f. Pearson's chi-squared test statistic from the 2×3 disease-by-genotype table, rather than the Z test comparing allele frequencies between cases and controls."

    These functions are useful for designs with more than two stages.

  • PowerGxE ( Algorithmic software component )

  • WANDEL ( Algorithmic software component )

    Permutation adjustment for multiple testing.
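
To illustrate what a permutation adjustment for multiple testing can look like, here is a minimal sketch of a single-step max-statistic adjustment in Python. It is not WANDEL's implementation, which the listing does not describe. The function names, the choice of test statistic (absolute case-control difference in mean allele count), and the data layout (one row of 0/1/2 allele counts per subject) are all assumptions made for illustration.

```python
import random


def mean_diff_stats(genotypes, labels):
    """Absolute case-control difference in mean allele count, one value per marker.

    genotypes: list of rows, one per subject, each a list of allele counts (0/1/2).
    labels:    list of 1 (case) / 0 (control), same length as genotypes.
    (Hypothetical statistic chosen for simplicity.)
    """
    cases = [row for row, y in zip(genotypes, labels) if y == 1]
    controls = [row for row, y in zip(genotypes, labels) if y == 0]
    stats = []
    for j in range(len(genotypes[0])):
        case_mean = sum(row[j] for row in cases) / len(cases)
        control_mean = sum(row[j] for row in controls) / len(controls)
        stats.append(abs(case_mean - control_mean))
    return stats


def max_stat_adjusted_pvalues(genotypes, labels, n_perm=1000, seed=0):
    """Single-step max-statistic permutation adjustment (illustrative sketch).

    Case/control labels are shuffled; for each shuffle the largest statistic
    over all markers is recorded. A marker's adjusted p-value is the share of
    shuffles whose maximum reaches that marker's observed statistic.
    """
    rng = random.Random(seed)
    observed = mean_diff_stats(genotypes, labels)
    exceed = [0] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        perm_max = max(mean_diff_stats(genotypes, shuffled))
        for j, obs in enumerate(observed):
            if perm_max >= obs:
                exceed[j] += 1
    # Add-one correction keeps every estimated p-value strictly above zero.
    return [(1 + count) / (1 + n_perm) for count in exceed]
```

Because each shuffle keeps the genotype rows intact, correlation between tightly linked markers is preserved under the null, which is the usual reason for preferring permutation over a Bonferroni correction in this setting.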

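The MULTIPOW description above refers to the 2 d.f. Pearson's chi-squared statistic from the 2×3 disease-by-genotype table. As a hedged illustration of that statistic only (not of MULTIPOW's power calculations), the sketch below computes it from genotype counts in Python. The function name and argument layout are hypothetical; the p-value uses the closed form exp(-x/2) for the upper tail of a chi-squared distribution with 2 degrees of freedom.

```python
import math


def pearson_chi2_2x3(case_counts, control_counts):
    """Pearson chi-squared test on a 2x3 disease-by-genotype table.

    case_counts, control_counts: counts of the three genotypes
    (e.g. AA, Aa, aa) among cases and controls. Hypothetical helper.
    Returns (statistic, p_value) with 2 degrees of freedom.
    """
    table = [list(case_counts), list(control_counts)]
    if any(len(row) != 3 for row in table):
        raise ValueError("each group needs exactly three genotype counts")
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(3)]
    if min(row_totals) == 0 or min(col_totals) == 0:
        # An empty row or column changes the degrees of freedom.
        raise ValueError("every row and column must have a nonzero total")
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Upper tail of chi-squared with 2 d.f. has the closed form exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value
```

For example, `pearson_chi2_2x3([10, 20, 30], [30, 20, 10])` has every expected count equal to 20, giving a statistic of 20 and a p-value of exp(-10), roughly 4.5e-5. Unlike an allele-frequency Z test, this statistic makes no assumption about the mode of inheritance.
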
Last updated: 2012-11-08T11:15:13.099-05:00

Copyright © 2016 by the President and Fellows of Harvard College
The eagle-i Consortium is supported by NIH Grant #5U24RR029825-02 / Copyright 2016