Katherine Pollard, PhD
Director and Senior Investigator, Gladstone Institute of Data Science and Biotechnology

Other Professional Titles

Professor, University of California, San Francisco, Epidemiology & Biostatistics


(415) 734-2711


(415) 355-0960


Audrey Le
(415) 734-2768

Areas of Investigation

Our laboratory develops statistical and computational methods for the analysis of massive genomic datasets. We are interested in genome evolution, in particular in identifying genome sequences that differ significantly between or within species and relating them to biomedical traits of interest. We pioneered the statistical phylogenetic approach for identifying Human Accelerated Regions (HARs), the fastest-evolving sequences in the human genome. Most HARs are non-coding elements, such as regulatory signals, structural sites, and RNA genes. One of our aims is to identify specific DNA alterations in HARs that are responsible for variation in gene expression.

We are also developing methods for characterizing microbial communities from metagenomic data, the pool of DNA from different microorganisms in a sample. We designed PhylOTU, the first computational tool for estimating the taxonomic composition of metagenomic samples from short, next-generation sequencing reads. Our current emphasis is on extending this approach to measure the functional composition of microbial communities. The goal of this project is to relate DNA-based measurements of microbial communities from the human gut and other body sites to patient health status through niche modeling.

Lab Focus

Can we identify more HARs using archaic hominin genomes (e.g., Neanderthal) and many human genomes?
When do mutations in HARs and other regulatory sequences have a functional impact?
Is the functional profile of a metagenomic community more informative about host health status than the taxonomic profile?


Discovered HARs by comparing the human genome to the genomes of chimpanzee and other mammals.
Characterized the role of biased gene conversion, a non-adaptive, recombination-associated process, in shaping the fastest-evolving regions of the human genome.
Solved the problem of clustering short metagenomic sequencing reads into taxonomic groups by leveraging sequenced genomes and phylogenetic triangulation.


  • American Society of Human Genetics
  • American Statistical Association
  • International Society for Computational Biology


  • Pomona College
  • University of California, Berkeley
  • University of California, Berkeley

Honors and Awards

2013 California Academy of Sciences (Fellow)
2013 Alumna of the Year, School of Public Health, University of California, Berkeley
2013 Best Scientific Visualizations of 2013, Wired Magazine
2008 Sloan Research Fellowship, Alfred P. Sloan Foundation
2007 Faculty Development Award, University of California, Davis
2003 Evelyn Fix Memorial Prize, Chin Long Chiang Biostatistics Student of the Year, University of California, Berkeley
1998 Berkeley Fellowship, University of California, Berkeley
1996 Thomas J. Watson Fellowship, Watson Foundation
1995 Valedictorian, High Scholarship Prize (4.0 GPA), Math Prize, Anthropology Prize, Phi Beta Kappa Award, Pomona College
1993 Sophomore Math Prize, Pomona College