Seeing More: Big Data and Machine Learning Provide New Insights into Human Cells
The union of biology, technology, medicine, and statistics has catapulted us into a new era of scientific research. Over the last decade, we have developed the ability to peer into the depths of a human cell and unlock the mysteries of how it functions. Stem cell technology allows us to grow neurons from the skin cells of patients with serious neurological conditions, such as ALS, Parkinson’s disease, and Huntington’s disease. And together, these innovations provide us with insights into how neurons cope with or die from the presence of certain genetic mutations. With this knowledge, we hope to develop better treatments to fight these devastating diseases.
At the same time, our ability to gather information has progressed rapidly, as new scientific imaging technology generates biological pictures with astonishing detail and at an extraordinary rate. Thus, we have the exciting “problem” of being awash in data.
Researchers in my lab at the Gladstone Institutes are embracing this challenge: we have developed innovative statistical methods to examine our “big data,” and, in collaboration with engineers from a giant in the tech industry, we are applying machine learning technology to extract even more information from the images we collect. These techniques allow us to improve cellular imaging analysis and better understand the inner workings of the brain.
To facilitate our pursuit of therapies, we invented a unique robotic microscope that tracks the development and degeneration of millions of individual brain cells in real time. We create neurons from the stem cells of patients with ALS, Parkinson’s disease, or Huntington’s disease, and tag every cell with an ID number. The computer-managed microscope then follows the fate of each cell over an extended period, returning to the same neuron over weeks or months to take pictures of it as it grows. This allows us to chart how the cells change and eventually die with disease and determine whether the changes are beneficial, harmful, or inconsequential.
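For readers who think in code, the bookkeeping behind this kind of tracking can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the lab's actual software: the cell IDs, imaging days, alive/dead flags, and function names are all invented. It groups repeated observations by cell ID, finds the last day each cell was seen alive, and reports what fraction of cells survived to a given day.

```python
from collections import defaultdict

# Hypothetical observations: (cell_id, day_imaged, alive_in_image).
# These values are invented for illustration, not real measurements.
observations = [
    ("cell-001", 0, True), ("cell-001", 7, True), ("cell-001", 14, False),
    ("cell-002", 0, True), ("cell-002", 7, True), ("cell-002", 14, True),
    ("cell-003", 0, True), ("cell-003", 7, False),
]

def last_day_seen_alive(observations):
    """Return, for each cell ID, the last imaging day on which it was alive."""
    by_cell = defaultdict(list)
    for cell_id, day, alive in observations:
        by_cell[cell_id].append((day, alive))
    result = {}
    for cell_id, timeline in by_cell.items():
        alive_days = [day for day, alive in sorted(timeline) if alive]
        result[cell_id] = max(alive_days) if alive_days else None
    return result

def fraction_surviving(observations, day):
    """Fraction of all tracked cells still seen alive on or after `day`."""
    last = last_day_seen_alive(observations)
    if not last:
        return None
    survivors = sum(1 for d in last.values() if d is not None and d >= day)
    return survivors / len(last)

print(last_day_seen_alive(observations))
print(fraction_surviving(observations, 14))
```

Because every measurement is tied to the same cell ID across time points, a survival curve can be built from individual cell histories rather than from population averages.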
This type of in-depth study produces enormous amounts of data—up to hundreds of thousands of images and terabytes of data per day. To handle this wealth of information, we use automated programs to analyze the pictures. However, the conventional approaches to image analysis are limited in their accuracy and the information they can extract. So we recently employed machine learning technology to design better examination techniques.
Playing With Patterns
Machine learning uses computer models that learn from examples to recognize and predict patterns in data. With supervised machine learning, our researchers provide feedback to the computer, teaching it to become an expert at mapping cellular features accurately. Over time, the computer can become better and faster than a human at detecting patterns in the data.
Alternatively, unsupervised learning, often built on “deep” neural networks running on powerful computers, works without human-provided labels to find patterns in data sets that are just too large for a person to comprehend.
It is this second tool that we hope will help us make connections between the mountains of data we collect in a dish and the mountains we collect from the patients whose cells we are analyzing. For with this technology, we are able to extract more information from data-rich images, making it possible to “see” features of human cells that eluded us previously and produce new insights that were beyond reach with old approaches. What’s more, these powerful computational methods are not limited to image analysis: they can also integrate data from molecular or genetic investigations of the same cells, as well as the patient’s clinical data.
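To make the difference between the two styles of learning concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not the methods used in the lab: each "cell" is described by two made-up numeric features, the labels are invented, and two simple textbook algorithms (a nearest-centroid classifier and k-means clustering) stand in for far larger models. The supervised function learns from examples that a person has labeled; the unsupervised function receives no labels and groups the cells on its own.

```python
import math

def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

# Supervised: a person supplies a label for every training example.
def fit_nearest_centroid(features, labels):
    groups = {}
    for f, y in zip(features, labels):
        groups.setdefault(y, []).append(f)
    return {y: centroid(pts) for y, pts in groups.items()}

def predict(model, point):
    names = list(model)
    return names[nearest(point, [model[n] for n in names])]

# Unsupervised: no labels; k-means groups similar cells together.
def k_means(features, k, iterations=10):
    centers = list(features[:k])  # simple deterministic start: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in features:
            clusters[nearest(f, centers)].append(f)
        # Keep the old center if a cluster happens to be empty.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return [nearest(f, centers) for f in features]

# Invented feature values and labels, for illustration only.
features = [(1.0, 9.0), (8.0, 2.0), (1.2, 8.5), (7.5, 1.5), (0.8, 9.4), (8.3, 2.2)]
labels = ["healthy", "degenerating", "healthy", "degenerating", "healthy", "degenerating"]

model = fit_nearest_centroid(features, labels)
print(predict(model, (1.1, 9.1)))   # classify a new, unlabeled cell
print(k_means(features, k=2))       # group cells without any labels
```

In the supervised case the computer's answer is one of the names a person taught it; in the unsupervised case it returns only group numbers, and it is up to researchers to work out what, if anything, each group means biologically.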
Seeing Into the Future
With these new approaches, my colleagues and I have resolved key differences between patterns of neurodegeneration caused by two types of Parkinson’s disease and one type of ALS. This advance offers hope that it might be possible to uncover a prognostic relationship between the changes in the cells in a dish and a patient’s progression in the clinic. This could lead to quicker diagnoses, more sensitive and successful clinical trials, and better therapeutic options. For example, if we had a way to predict a patient’s clinical course from their cells’ characteristics, we could apply a personalized medicine approach to improve their treatment. This would enable physicians to tailor a patient’s therapy to their specific form of disease or underlying genetic mutation.
Capturing and processing complex data is challenging, but our research has the potential to transform how we understand disease and, from that new vantage point, find new therapies. We are encouraged by our progress thus far and excited about the possibilities that will come from these groundbreaking approaches.
Steve Finkbeiner, MD, PhD, is the Associate Director of the Gladstone Institute for Neurological Disease and Director of the Taube/Koret Center for Neurodegenerative Disease Research. He is also a Professor of Neurology and Physiology at the University of California, San Francisco.