Research Interests
* Statistical missing data problems, imputation methodology.
* Gibbs sampling and other MCMC methods, rate of convergence.
* Markov structure, graphical models (software BUGS), and genetics.
* Image reconstruction: PET, SPECT, etc.
* Bayesian methodology; even Bill Gates talks about Bayesian ideas!
* Nonparametric hierarchical models, model selection and testing.
* Large-scale computation and optimization, e.g., VLSI design; Dynamic systems; Computer vision.
* Monte Carlo filters, Sequential importance sampling and resampling.
"BACH requires R and GNU scientific library (GSL).
Bayesian 3D constructor for Hi-C data, or in short 'BACH', is a novel Bayesian probabilistic approach for analyzing Hi-C data. BACH takes the Hi-C contact matrix and local genomic features (restriction enzyme cutting frequencies, GC content and sequence uniqueness) as input and produces, via MCMC computation, the posterior distribution of three-dimensional (3D) chromosomal structure. In the BACH algorithm, we assume that there exists a consensus 3D chromosomal structure in a cell population (this assumption will be relaxed in the BACH-MIX algorithm). BACH can be used to reconstruct the consensus 3D chromosomal structure from the Hi-C contact matrix, and infer the uncertainties of the spatial distances between any two genomic loci from the corresponding posterior distribution."
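As a minimal sketch of how posterior draws of a 3D structure can be turned into a distance estimate with uncertainty, the Python below summarizes the distance between two loci by its posterior mean and an approximate equal-tailed interval. The function name `summarize_distance`, the list-of-tuples layout of the draws, and the toy data are all assumptions made for illustration; they are not BACH's interface or output format.

```python
import math
import random


def summarize_distance(samples, i, j, level=0.95):
    """Posterior mean and approximate equal-tailed interval of the distance
    between loci i and j.

    samples: list of posterior draws; each draw is a list of (x, y, z) tuples,
             one per locus (a hypothetical layout, not BACH's output format).
    """
    # Distance between the two loci in every draw, sorted for order statistics.
    d = sorted(math.dist(s[i], s[j]) for s in samples)
    n = len(d)
    mean = sum(d) / n
    # Round the interval endpoints outward to the nearest order statistics.
    lo = d[math.floor((1 - level) / 2 * (n - 1))]
    hi = d[math.ceil((1 + level) / 2 * (n - 1))]
    return mean, lo, hi


# Toy stand-in for MCMC output: a 4-locus chain jittered with Gaussian noise.
random.seed(0)
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 1.0)]
draws = [
    [tuple(c + random.gauss(0, 0.05) for c in p) for p in base]
    for _ in range(1000)
]

mean, lo, hi = summarize_distance(draws, 0, 3)
print(f"locus 0 to locus 3: mean {mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

Pairwise distances are unchanged by rotating, translating, or reflecting a structure, so this summary does not require the draws to be aligned to a common frame first.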
"BACH-MIX requires GNU scientific library (GSL).
In the BACH-MIX algorithm, we assume that the genomic region of interest is composed of two adjacent sub-regions, each with a rigid consensus 3D structure, but that the spatial arrangement of the two sub-structures can vary in a cell population. This arrangement is represented by a rotation matrix with three Euler angles. In addition, we take into consideration the mirror-symmetric structure, which cannot be explained by the rotation matrix. BACH-MIX models the uncertainty of the relative position between the two sub-structures by a mixture component model. The weight of each component represents the proportion of that component in a cell population. The BACH-MIX algorithm takes the 3D chromosomal structure predicted by BACH and local genomic features (restriction enzyme cutting frequencies, GC content and sequence uniqueness) as input and produces, via MCMC computation, the posterior distribution of the proportion of each mixture component in a cell population. The BACH-MIX algorithm is equivalent to a Poisson regression procedure with nonnegative constraints on all coefficients (proportions) of the mixture components."
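To illustrate what a Poisson regression with nonnegative coefficients looks like, here is a minimal sketch that fits an identity-link model, y_i ~ Poisson(sum_k w_k x_ik) with w_k >= 0, using the standard EM multiplicative update. It returns a maximum-likelihood point estimate only and is not the MCMC procedure that BACH-MIX uses. The function name `fit_nonneg_poisson`, and the reading of x_ik as the expected contact count for locus pair i under mixture component k, are assumptions made for illustration.

```python
def fit_nonneg_poisson(X, y, n_iter=500):
    """Fit y_i ~ Poisson(sum_k w_k * X[i][k]) subject to w_k >= 0 by EM.

    X: list of rows with nonnegative entries (hypothetical per-component
       expected counts); every column must have a positive sum.
    y: list of observed nonnegative counts.
    """
    n, k = len(X), len(X[0])
    col_sums = [sum(X[i][j] for i in range(n)) for j in range(k)]
    # Positive starting point; the update below never changes the sign of w.
    w = [sum(y) / sum(col_sums)] * k
    for _ in range(n_iter):
        # Current fitted means under the identity link.
        mu = [sum(w[j] * X[i][j] for j in range(k)) for i in range(n)]
        # EM step: rescale each weight by its share of the observed counts.
        w = [
            w[j]
            * sum(X[i][j] * y[i] / mu[i] for i in range(n) if mu[i] > 0)
            / col_sums[j]
            for j in range(k)
        ]
    return w


# Toy example: three observations, two components.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
w = fit_nonneg_poisson(X, y)
proportions = [v / sum(w) for v in w]  # normalize weights to proportions
print(w, proportions)
```

Because each step multiplies a weight by a nonnegative factor, the constraint holds automatically and no projection or clipping is needed; a component that the data do not support simply has its weight shrink toward zero.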
"We propose a parametric model, HiCNorm, to remove systematic biases in the raw Hi-C contact maps. It relates chromatin interactions and systemic biases at the desired resolution level, resulting in a simple, yet accurate normalization procedure. Compared to the exiting Hi-C normalization method, our model has only a few parameters, is much easier to implement, can be interpreted intuitively, and achieves higher reproducibility in real Hi-C data."
"PACO program uses EM algorithm to estimate the link-level delay distributions for tree-structured multicast networks, and also detects and estimates spatial dependent links (SDL)."
"Tmod is developed for the users of Windows operating systems, and has an easy-to-use graphical user interface (GUI). Tmod integrates twelve motif discovery programs: AlignACE (Roth et al., 1998), BioProspector (Liu et al., 2001), Consensus (Hertz and Stormo, 1999), MEME (Bailey and Elkan, 1994), MDscan (Liu et al., 2002), MotifSampler(Thijs, et al., 2001; Thijs, et al., 2002), SeSiMCMC (Favorov, et al., 2005), GLAM (Frith, et al., 2004), MotifRegressor(Conlon, et al., 2003), YMF(Sinha and Tompa, 2003), Weeder (Pavesi, et al., 2001; Pavesi, et al., 2004; Pavesi, et al., 2004) and Gibbs Motif Sampler (Liu, et al., 1995; Newberg, et al., 2007; Thompson, et al., 2004; Thompson, et al., 2003; Thompson, et al., 2007). The motif finding algorithms included in Tmod have all been widely cited and proved excellent methods in the research field of motif discovery."