Genetic association studies

Research Focus 

Next-generation sequencing (NGS) has revolutionized genomic and transcriptomic studies. Spurred by rapid advances from the International HapMap Project (URL: http://hapmap.ncbi.nlm.nih.gov/) and the 1000 Genomes Project (URL: http://www.1000genomes.org/), which have generated comprehensive catalogues of human DNA sequence variation, significant progress has been made in identifying genetic risk factors for human complex traits and their interactions with lifestyle, nutritional, and environmental exposures.

In particular, genome-wide association (GWA) studies, which assay more than 100,000 single nucleotide polymorphisms (SNPs) across large cohorts of individuals, have led to the discovery of variants predisposing to many common complex diseases, e.g., type 2 diabetes, obesity, osteoporosis, nicotine dependence, autism spectrum disorders, and breast cancer (http://www.genome.gov/gwastudies/). However, common variants (CVs) [defined as minor allele frequency (MAF) ≥ 1%] identified by GWA studies have only small-to-modest effects on complex traits, and rare variants (RVs) (i.e., MAF < 1%), which may have much larger effects and higher penetrance, can also contribute to common diseases. My research is primarily focused on identifying both CVs and RVs that contribute to complex phenotypes, using both GWA and sequencing-based studies. Furthermore, RNA sequencing (RNA-Seq), i.e., NGS of cDNAs, is transforming the characterization and quantification of transcriptomes, and can reveal a full repertoire of new genes and splice variants as well as non-coding and strand-specific transcripts. Therefore, another focus of my research is transcriptomic studies of human diseases and drug responses using both microarray and RNA-Seq technologies. Although conventional single-marker association analysis has been the predominant method because of its simplicity, pathway- and network-based analyses are more powerful than univariate analysis and provide deeper insight into fundamental biological mechanisms. However, there is no consensus on the optimal method for analyzing groups of genes (genetic markers). Thus, an emerging focus of my research is to develop novel algorithms for pathway and network analysis.
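The distinction between common and rare variants drawn above rests on a simple calculation. The following is a minimal sketch, assuming diploid genotypes coded as the number of copies of the alternate allele (0, 1, or 2) with `None` for missing calls. The function names `minor_allele_frequency` and `classify_variant` are hypothetical and chosen only for illustration; the 1% threshold is the one stated in the text.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF for one variant.

    genotypes: per-individual alternate-allele counts (0, 1, or 2),
    with None marking a missing call (an assumed coding, not from the text).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this variant")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def classify_variant(maf, threshold=0.01):
    """Label a variant as common (MAF >= 1%) or rare (MAF < 1%)."""
    return "common" if maf >= threshold else "rare"


if __name__ == "__main__":
    # One heterozygote among 200 individuals: 1 alternate allele out of 400.
    singleton = [0] * 199 + [1]
    maf = minor_allele_frequency(singleton)
    print(maf, classify_variant(maf))
```

In practice the same calculation would be applied per variant across a genotype matrix, and missing calls would typically be tracked separately as a quality-control measure.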

Genetic Association Studies of Complex Traits (e.g., BMD and BMI-Related Phenotypes)

The advent of NGS technologies has enabled the discovery of nearly all genetic variants (both CVs and RVs) in a genomic region of interest. Using NGS technology, the 1000 Genomes Project has generated sequence reads from a study population in a relatively unbiased manner, which offers an ideal reference panel for genotype imputation of untyped variants. Given the polygenic nature of complex traits, many more CVs and RVs remain to be identified for continuous traits (e.g., plasma lipid levels). Genotype imputation, a statistical technique that uses haplotype patterns in a reference panel to predict genotypes at unobserved loci in a study dataset, and meta-analysis, a technique that pools independent studies examining similar hypotheses, are increasingly applied to synthesize data from several GWA studies and to replicate the genetic variants that emerge from each of those studies. Currently, my major objectives are to identify genetic susceptibility loci for bone mineral density (BMD) and body mass index (BMI)-related phenotypes.
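To illustrate how meta-analysis pools evidence across GWA studies, the sketch below applies inverse-variance-weighted fixed-effect pooling to per-study effect estimates for a single variant. This is one widely used approach and is shown as an assumption; the text does not specify which meta-analysis model is used. The function name `fixed_effect_meta` and the example numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-study effect sizes for one variant.

    betas: effect estimates (e.g., regression coefficients), one per study.
    ses:   the corresponding standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical estimates for the same SNP from three studies.
    beta, se, z, p = fixed_effect_meta([0.10, 0.20, 0.15], [0.05, 0.05, 0.08])
    print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

A fixed-effect model assumes that all studies estimate the same underlying effect; when effects are heterogeneous across cohorts, a random-effects model may be more appropriate.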