Science cluster
Summary
Disordered biomolecules, such as lipids and intrinsically disordered proteins (IDPs), are crucial to understanding cellular membranes and protein behaviour, yet standardised and accessible training data for these molecules are lacking. The FAIRMD project builds on the success of the NMRlipids Databank, which provides open access to molecular dynamics (MD) simulations of lipid bilayers. By expanding the databank to include MD simulations of disordered proteins, and by improving its interoperability with existing public databases, FAIRMD aims to fill this gap and offer high-quality FAIR data to support AI model development. This will enhance research across biophysics, drug design, and materials science.
Challenge
The rapid advancement of artificial intelligence (AI) in scientific research is significantly hampered by the lack of standardised, accessible training data. While databases like the Protein Data Bank (PDB) have revolutionised biological research, no comparable resources exist for disordered biomolecules such as lipids and IDPs. This creates a barrier for AI-driven research and limits advances in areas such as drug development and biomaterial design.
Solution
FAIRMD aims to expand the NMRlipids Databank, a collaborative resource for MD simulations of lipid bilayers, to include disordered proteins, giving researchers programmatic access to quality-evaluated MD simulation models of these proteins. This advancement will enable the development of AI models that predict structural and dynamical properties of disordered biomolecular complexes from atomistic details. Linking quality-evaluated MD simulations to other databases that describe the biological functions and biophysical properties of molecules will benefit a wide range of fields with high societal impact, such as the design of sustainable materials, novel foods, and drugs.
Scientific Impact
By addressing practical challenges in data distribution with open collaboration and overlay-databank approaches, the project promotes FAIR data sharing and supports the development of AI-based tools. The open collaboration model, with shared authorship of the resulting publications, and the overlay-databank concept promoted by FAIRMD are not limited to biophysical chemistry and have the potential to advance other scientific domains.
Principal investigator
Markus Miettinen completed his PhD in Theoretical and Computational Physics at Aalto University, Finland, followed by postdoctoral positions in Germany (Institute for Biology and Biochemistry, Potsdam University, and Department of Physics, Freie Universität Berlin) funded by independent grants from the European Molecular Biology Organization and the Volkswagen Foundation. After leading a research group at the Department of Theory and Bio-Systems at the Max Planck Institute of Colloids and Interfaces, he became associate professor of Chemistry and group leader at the Computational Biology Unit at the University of Bergen, Norway, in 2022.