Graduate student Shu Zhang uses machine learning to reveal how DNA variation shapes genome structure and drives disease.

Shu Zhang is a graduate student in bioinformatics, working in Katie Pollard’s lab at Gladstone. Zhang grew up in the suburbs of Chicago and received her undergraduate degree in biology from Cornell University. Prior to coming to Gladstone, she was a computational biologist at the Broad Institute. At Gladstone, she’s interested in using machine learning to understand how DNA sequence variation affects genome organization across disease and evolution.

What brought you to Gladstone?

I was a first-year student planning out my rotations in the fall quarter, and Katie was teaching our biostatistics course. I thought Katie was really cool and was doing really interesting work at the intersection of machine learning and regulatory genomics.

During my rotation, I found the lab to be a great place to work, and older graduate students consistently spoke of Katie as a thoughtful and supportive mentor. I was also struck by the quality of discussion in lab meetings and journal clubs—people asked insightful and curious questions, and it felt like a place where intellectually rigorous projects could flourish. I found these qualities across all the labs at Gladstone, which made me excited to do my PhD here.

What do you like about Gladstone?

I really appreciate how community-oriented Gladstone is. There’s a sense of belonging here. I think it’s because we all have this shared goal of fighting disease. I get to interact with kind and intelligent people every day! It’s a privilege to learn from them in lab meetings and journal clubs, and also to have silly discussions at the lunch table.

Who or what has been your biggest influence in your scientific career?

I’ve had a string of excellent mentors throughout my scientific career who have both shaped my research direction and inspired my long-term goal of leading a laboratory.

My undergraduate mentor pushed me to explore different fields and techniques, which is what led me to computational biology. As I started out as a computational researcher at the Broad Institute, I was given real intellectual freedom to shape the direction of my projects, which strengthened my scientific judgement and confidence to be an independent researcher.

Finally, having strong female mentors in the field has been important throughout my scientific journey. Watching them do rigorous and creative science while leading with clarity and genuine care for those around them—in spite of the various challenges that come from being a woman in the field—has given me a better picture of what my future career could look like.

What are the key areas of research you’re focused on?

My research focuses on using machine learning to better understand how DNA sequence variation shapes 3D genome folding—the way DNA is physically organized within a cell’s nucleus—and how this variation drives regulatory differences across disease and evolution.

I’m particularly interested in repetitive elements, which are regions of DNA with highly repeated patterns. Traditionally, studying their function would require perturbing each region in the lab, but that approach can be time-consuming and technically challenging.

Instead of relying on lab experiments, I use deep learning models that can predict genome folding patterns directly from DNA sequence. This allows me to perform large-scale experiments on the computer, where I can quantify how specific sequence features may contribute to changes in genome folding.

With this approach, we can investigate not only how repetitive elements contribute to genome organization in human cells, but also how differences in these elements across species, such as between humans and chimpanzees, may drive species-specific patterns of genome folding.

How does your research contribute to understanding or treating specific diseases?

Machine learning has become a really powerful tool for understanding the impact of genetic variants on a specific disease. In many diseases, like cancer, there are too many variants to feasibly test in the lab. We can instead model these variants on the computer, predict their effect on cells, and prioritize the variants of interest. Our lab has applied this approach to better understand how the genome is organized in congenital heart defects, autism, and cancer.

How do you collaborate with other researchers, both within Gladstone and externally?

Collaboration is really important for computational work. Progress comes from this iterative process, where modeling helps generate hypotheses, and experimental results both refine our understanding and help improve the models we build.

I’ve been involved with two collaborations while at Gladstone: one with the Bruneau lab, and another with a group at Weill Cornell. In both cases, we met routinely to exchange updates and ideas. I found it exciting to hear different perspectives on the same problem and to talk through how we designed our analyses. These collaborations highlight how computational and experimental approaches complement each other.

What do you do when you’re not working?

I can often be found at the pottery studio, on a run, or hanging out with my friends. More recently, I’ve been trying to knit a cardigan for the first time.

What is your hidden talent?

I asked my friends this question since I couldn’t come up with anything. According to them, my hidden talent is giving great food recommendations and sharing interesting facts (all my friends have heard about the genetics behind why humans lost their tails).

What advice would you give to young scientists or students interested in your field?

My advice would be to just start. In computational biology especially, it’s easier than ever to learn some coding basics and apply them to a real biological dataset, many of which are publicly available. Most of what I’ve learned has come from struggling through a problem, debugging my own code, and figuring things out step by step.

I’d also encourage curiosity about topics beyond the one you work in. A lot of interesting science comes out of combining ideas from seemingly unrelated fields, and new papers elsewhere might unexpectedly spark ideas in your own research. Your own interests will probably evolve, too! Five years ago, I had little experience in computation and almost no interest in evolutionary biology, and now both are central to my PhD.

Finally, asking questions can be intimidating, but it’s one of the best ways to deepen your understanding of the science and sharpen your ability to think critically about problems in your own research.
