Our work focuses broadly on asking questions about organismal function and evolution using genomic data. The huge amount of data currently being produced allows us to ask and answer questions on a genomic scale that have never been possible before. Our questions largely revolve around the relative roles of natural selection and genetic drift in shaping nucleotide, gene family, and gene expression variation both within and between species. Although most of the empirical work has been on systems such as humans and mosquitoes, members of the lab can work on topics and organisms that appeal to them. This page covers four of the major topics currently being studied.


The evolution of transcriptional regulation. Changes in the timing, level, and location of gene expression have been implicated in many phenotypic differences between individuals and species. Using both DNA sequence and gene expression data, we can address the origin of variation in gene expression and the evolutionary forces that affect this variation.

Our work on the origin of transcriptional variation within humans has revealed multiple instances of the creation and maintenance of novel transcription factor binding sites in human history. Natural selection appears to act locally to drive new variants to fixation in different human sub-populations. We have also examined the creation of such binding sites throughout whole genomes, and have found that many are removed by natural selection to avoid inappropriate binding by transcription factors. In ongoing work using Affymetrix microarrays, we are studying how sex-biased gene expression has evolved between the malaria mosquito, Anopheles gambiae, and the fruit fly, Drosophila melanogaster.


The evolution of gene families. Comparison of whole genomes has revealed large and frequent changes in the size of gene families. Comparative genomic analyses allow us to identify large-scale patterns of change in gene families and to make inferences regarding the role of natural selection in gene gain and loss. To make these analyses possible, we have developed a stochastic birth-and-death model for gene family evolution. This model allows for parameter estimation, inference of rates and magnitude of change, and a test for the action of adaptive natural selection in gene family diversification. Application of this method to data from multiple whole genomes of both yeast and mammals is revealing remarkable patterns of gene gain and loss.
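
To give a feel for what a stochastic birth-and-death process looks like, here is a minimal simulation sketch. It is not the model or software described above: it assumes a simple linear process in which every gene copy is duplicated and lost at the same per-copy rate, and the names `simulate_family_size`, `mean_family_size`, and the parameter `lam` are hypothetical choices made for this illustration.

```python
import random


def simulate_family_size(n0, lam, t, rng):
    """Simulate the size of one gene family after time t.

    Assumption of this sketch: each gene copy is gained (duplicated) at
    rate lam and lost at rate lam, independently of all other copies.
    """
    n = n0
    elapsed = 0.0
    while n > 0:
        # With n copies, the total event rate is 2 * lam * n.
        elapsed += rng.expovariate(2.0 * lam * n)
        if elapsed > t:
            break
        # Equal rates make a gain and a loss equally likely.
        n += 1 if rng.random() < 0.5 else -1
    # A family that reaches size 0 is extinct and stays at 0.
    return n


def mean_family_size(n0, lam, t, replicates, seed=0):
    """Average final family size over many independent simulations."""
    rng = random.Random(seed)
    sizes = [simulate_family_size(n0, lam, t, rng) for _ in range(replicates)]
    return sum(sizes) / len(sizes)


if __name__ == "__main__":
    # Start with 10 copies and simulate one unit of time.
    print(mean_family_size(n0=10, lam=0.5, t=1.0, replicates=5000))
```

With equal gain and loss rates the expected family size stays at its starting value while the variance grows with time, which is why individual families can expand or contract substantially by chance alone. A test for selection has to ask whether an observed change is larger than such a process would plausibly produce.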


Human population genomics. Selective, demographic, and random processes all determine the frequency of alleles in a population and the differences between species. One of the major goals of population genetics has been to uncover which of these processes are acting in natural populations through a combination of directed empirical studies and theoretical models that provide expectations under a variety of conditions. While most of the work in the field has involved single loci or limited multiple-locus studies and models, the availability of genomic-scale data requires new genomic-scale approaches.

The interaction between human demographic history (such as the migration out of Africa) and ongoing natural selection creates complex patterns of polymorphism and linkage disequilibrium. Our recent work on both regulatory and coding variation in humans has used large numbers of loci from across the genome to tease apart the effects of these potentially confounded forces. The development and application of coalescent methods to these data has revealed instances of natural selection acting across all human populations (at the ABO blood-type locus) and within single populations (at the F7 clotting factor locus). The use of thousands of loci currently being sequenced in multiple populations will allow for an even more fine-scale investigation of natural selection across the human genome.


Divergence in genetic networks. Proteins do not evolve in isolation, but rather as components of complex genetic networks. A protein's position in a network may therefore indicate how central it is to cellular function, and hence how constrained it is evolutionarily. We have examined the protein-protein interaction networks in yeast, worm, and fly, and have found that proteins with a more central position in all three networks, regardless of the number of direct interactors, evolve more slowly and are more likely to be essential for survival. By studying various types of genetic networks in a number of different genomes, we can begin to understand the determinants of sequence evolution, and therefore of phenotypic evolution.
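
The distinction between network centrality and the number of direct interactors can be illustrated with a small sketch. The text above does not say which centrality measure was used, so closeness centrality is an assumption here, and the toy network and protein names (`P1` to `P7`) are entirely hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) interactions."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def closeness_centrality(graph, source):
    """Closeness of `source`: reachable nodes divided by summed distances.

    Shortest-path distances are found with a breadth-first search.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical toy network: P1 has three interactors but sits at one end
# of a chain, while P2 has only two interactors and sits nearer the middle.
TOY_EDGES = [
    ("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"),
    ("P1", "P6"), ("P1", "P7"),
]

if __name__ == "__main__":
    graph = build_graph(TOY_EDGES)
    for protein in sorted(graph):
        degree = len(graph[protein])
        closeness = closeness_centrality(graph, protein)
        print(protein, degree, round(closeness, 3))
```

In this toy network `P2` has fewer direct interactors than `P1` yet a higher closeness score, showing how the two quantities can come apart.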