MAASTRICHT, THE NETHERLANDS and SAN FRANCISCO, CA, USA—Science has a data problem. Innovative new technologies have exponentially advanced biomedical research, making it easier than ever to collect essential information about human health and disease. But despite this wealth of knowledge, there are still many diseases that scientists do not understand in enough detail to develop effective treatments for them.
This problem stems in part from the difficulty of aggregating different types of experimental data. An overwhelming amount of information needs to be integrated, and the databases that hold it often store the data in ways that make it hard to reuse in another context. In a new study published in PLOS Computational Biology, researchers at Maastricht University and the Gladstone Institutes tackled this challenge to improve the integration of disparate sources and types of data and advance scientists’ understanding of disease.
“There is a wealth of information available to us, but we need a better way to bring it all together,” says co-senior author Alex Pico, PhD, interim director of the Gladstone Bioinformatics Core.
To solve this problem, the researchers converted WikiPathways, an open-source knowledge base of biological pathways, into a modern linkable format. They applied semantic web and ontology approaches to greatly simplify the integration of the data contained in WikiPathways with other knowledge bases. This allowed the scientists to combine information from WikiPathways with two other databases: DisGeNET, which contains information about the relationship between genes and diseases, and the EBI Expression Atlas, which contains information about the expression of genes under different conditions. The integration of these resources gives scientists greater insight into the biological processes behind diseases like diabetes mellitus and asthma.
“We think this development will help other scientists better utilize open data, hopefully leading to the discovery of new therapeutic targets for disease,” says first author Andra Waagmeester, PhD, formerly a researcher at Maastricht University and now CEO of the start-up company Micelio.
The researchers also worked with the Open PHACTS Foundation, which grew out of a drug discovery project funded by the European Innovative Medicines Initiative. Nick Lynch, PhD, CTO at Open PHACTS, says: “Twenty-five percent of the current requests to our programmable database include information from WikiPathways. Therefore, being able to integrate data from different pathways adds valuable, complementary information to our services in support of drug discovery.”
“There is tremendous potential for this technology to spur new connections and foster new collaborations,” adds co-senior author Chris Evelo, PhD, head of the Department of Bioinformatics (BiGCaT) at Maastricht University.
WikiPathways is a community-curated, wiki-based, open database that was started in 2008 as a collaboration between Maastricht University and the Gladstone Institutes. It is frequently used to analyze high-throughput studies involving gene expression and metabolism. All knowledge collected in this database is compatible with the European Open Science Cloud. WikiPathways collaborates with external projects, such as WormBase, through research portals, and has been cited more than 600 times.