Two projects funded via the 1st OSCARS Open Call, HEFTIE and the Fragalysis Cloud project, have become the first in the OSCARS portfolio to complete their work.
HEFTIE and Fragalysis Cloud were both launched on 1st October 2024 and ran for 24 months. Their completion marks a milestone for the OSCARS project: the first concrete demonstration that the cascading grant mechanism is delivering finished, openly available results.
HEFTIE addressed the technical challenge of handling enormous imaging files in life sciences and photon and neutron science. The Fragalysis Cloud project aimed to enhance the transparency and accessibility of structure-based drug design (SBDD) data by making it fully open and compliant with FAIR principles, while further developing an existing specialised platform so that a broader community of researchers can use it for life sciences research.
What do these projects have in common? Both have made something that was previously closed, fragmented, or inaccessible into something open, documented, and reusable by the wider scientific community.
HEFTIE: a textbook, a benchmark, and a contribution to global software
X-ray tomography is an imaging technique used across a wide range of scientific fields, including materials science, geosciences, palaeontology, and life sciences. With current experiments routinely producing datasets so large they cannot be loaded into the memory of a standard computer, the development of "chunked" file formats - which break down massive files into smaller parts - has become essential for data processing. Many researchers, however, lack the tools and training to work with these formats effectively, which slows analysis and limits what can be discovered from the data.
HEFTIE - Handling Enormous Files from Tomographic Imaging Experiments - tackled this problem directly. Led by David Stansby from University College London in collaboration with Scalable Minds (an SME that builds image analysis tools and services for life scientists working in connectomics), the project developed new tools and educational resources to help scientists around the globe work with huge imaging datasets. In the project team's own words: "OSCARS has provided an invaluable opportunity to develop vital resources needed to enable science with big imaging data."
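To make the idea of a chunked format concrete, the following is a minimal sketch in plain Python, not the format or code used by HEFTIE. It assumes a small 2D image held as a list of rows; the names `write_chunks` and `read_pixel` and the chunk size of 4 are hypothetical choices for illustration. Each tile is compressed on its own, so reading one value only requires decompressing the tile that contains it.

```python
import zlib
from array import array

CHUNK = 4  # tile edge length; a hypothetical value chosen for illustration


def write_chunks(image, chunk=CHUNK):
    """Split a 2D list of ints into tiles and compress each tile separately."""
    store = {}
    rows, cols = len(image), len(image[0])
    for r0 in range(0, rows, chunk):
        for c0 in range(0, cols, chunk):
            tile = array("H")  # unsigned 16-bit values, stored row by row
            for r in range(r0, min(r0 + chunk, rows)):
                tile.extend(image[r][c0:min(c0 + chunk, cols)])
            store[(r0 // chunk, c0 // chunk)] = zlib.compress(tile.tobytes())
    return store


def read_pixel(store, shape, r, c, chunk=CHUNK):
    """Read one value by decompressing only the tile that contains it."""
    _, cols = shape
    key = (r // chunk, c // chunk)
    tile = array("H")
    tile.frombytes(zlib.decompress(store[key]))
    # Tiles on the right-hand edge may be narrower than `chunk`.
    tile_cols = min(chunk, cols - key[1] * chunk)
    return tile[(r % chunk) * tile_cols + (c % chunk)]


image = [[r * 10 + c for c in range(10)] for r in range(10)]
store = write_chunks(image)
value = read_pixel(store, (10, 10), 5, 6)  # touches one tile out of nine
```

Real chunked formats add metadata, multiple dimensions, and pluggable compressors on top of this basic layout, but the access pattern is the same: only the chunks that overlap the requested region are read.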
The project delivered five distinct outputs.
- The team wrote a digital textbook to teach scientists the theory and practice behind next-generation chunked file formats (heftie-textbook.readthedocs.io) - already viewed over 200 times since its launch.
- They created and ran in-depth tests to help scientists choose the best and fastest compression algorithms for storing their imaging data (heftieproject.github.io/zarr-benchmarks).
- They developed a new tool to allow scientists to easily migrate between old and newer versions of chunked file formats (zarr.readthedocs.io/en/stable/user-guide/cli.html).
- They improved existing software tools to make them better documented and easier to use, including contributions to the widely used zarr-python library.
- And they added new features to WebKnossos, a web-based image analysis platform, making it easier to work with 3D images and to annotate them.
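The compression benchmarks mentioned in the list above compare how well and how quickly different algorithms compress imaging data. A rough sketch of that kind of comparison is shown here, assuming only Python's standard-library codecs and a synthetic byte pattern; these are stand-ins, not the algorithms, data, or code that the HEFTIE benchmarks used, and the `benchmark` helper is hypothetical.

```python
import bz2
import lzma
import time
import zlib


def benchmark(data, codecs):
    """Return {name: (compression_ratio, seconds_to_compress)} for each codec."""
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Check the round trip before trusting the numbers.
        if decompress(packed) != data:
            raise ValueError(f"{name} did not round-trip the data")
        results[name] = (len(data) / len(packed), elapsed)
    return results


# Standard-library codecs used as stand-ins for illustration.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

# Synthetic stand-in for image data: a slowly varying repeating gradient.
sample = bytes((i // 64) % 256 for i in range(256 * 1024))

for name, (ratio, seconds) in benchmark(sample, CODECS).items():
    print(f"{name}: ratio {ratio:.1f}, {seconds * 1000:.2f} ms")
```

Results from a sketch like this depend heavily on the data and the machine, which is why benchmarks run on representative imaging datasets are the more useful guide when choosing a codec.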
All code and resources developed as part of the project are openly licensed, which will maximise their future use and allow others to build upon them.
The project also demonstrated the kind of adaptability that Open Science requires. Between writing the proposal and starting the project, a tool originally planned as part of the HEFTIE workplan was independently developed and openly released by an external team. Rather than duplicating work, the HEFTIE team redirected their time and resources to ensuring their own work was effectively communicated and disseminated into the wider research community.
The resources developed are already seeing community use. During the project, the team presented their work at the University of Cambridge RSE seminar series in October 2025, at the Crick Bioimage Analysis Symposium 2025 at the Francis Crick Institute in London in November 2025, and at the Global Bioimage Analyst (GLOBIAS) seminar series in December 2025. The project's final scientific article has been published on Zenodo (DOI: 10.5281/zenodo.18417416).
Fragalysis Cloud: opening drug design data - and feeding the next generation of AI models
There is strong consensus in the field of medicinal chemistry that existing datasets for exploring and improving structure-based drug design are highly inadequate. The state of the art for data availability has evolved into separate repositories - the Protein Data Bank (accessed through PDBe in ELIXIR) for 3D structural data, and ChEMBL for bioactivity information.
While these are individually well-developed and highly FAIR, they serve specialised informaticians and operate with necessarily constrained ontologies. Data from drug discovery projects is often fragmented across various tools and hidden in unstructured formats, making it difficult for researchers to access and build upon existing work. At Diamond Light Source's XChem facility, crystallographic fragment screening generates thousands of protein-ligand structures, but this valuable data has remained trapped in complex technical formats that only specialists can interpret.
The OSCARS-funded project Implementing FAIRness in structure-based drug design through Fragalysis Cloud - led by Warren Thompson and Frank von Delft at Diamond Light Source and the Research Complex at Harwell, together with Boris Kovar and Matej Vavrek at M2M Solutions, Tim Dudgeon and Alan Christie at Informatics Matters, and with affiliations to the University of Oxford and the University of Johannesburg - enhanced the Fragalysis Cloud platform to facilitate open and FAIR-compliant sharing of SBDD data.
Fragalysis Cloud is a collaboration platform developed at Diamond's XChem facility for curating, sharing, and disseminating 3D structural data, and implementing best-practice medicinal chemistry algorithms for progressing results from fragment screens for drug discovery.
Three key innovations were delivered.
- XChemAlign: software that automatically transforms complex crystallographic data into biologically meaningful formats, enabling researchers to directly compare how different molecules bind to their target proteins to fully exploit potential discovery avenues.
- A data sharing infrastructure: comprehensive API endpoints and Python tools that provide programmatic access to structural and activity data, enabling automated analysis, seamless integration with computational workflows, and streamlined data preparation for deposition to PDBe.
- An interactive analysis environment: the integration of ready-to-use computational tools - Fragmenstein, HIPPO, and Syndirella - directly within the platform through Jupyter notebooks, eliminating the technical barriers that typically prevent medicinal chemists from using advanced drug design algorithms.
The project successfully processed and disseminated data for nine viral protein targets through the ASAP (Antiviral Drug Discovery) consortium, supporting Open Science efforts against COVID-19, Zika, Dengue, Enterovirus, Chikungunya, and other emerging viral threats. A collaboration with PDBe was established to standardise fragment screening data deposition workflows, now formalised in the OpenBind partnership.
This work transforms how structural biology data is shared and used in drug discovery: medicinal chemists can explore and analyse structural data without specialised computational expertise; distributed research teams can collaborate effectively by sharing precise 3D views of molecular interactions via simple web links; AI and machine learning researchers gain access to high-quality, standardised datasets for developing better drug discovery algorithms; and the infrastructure provides a foundation for the future data-dissemination needs of large-scale collaborative initiatives, including the OpenBind consortium for AI co-folding model development.
Publications from the project include a peer-reviewed article in Nature Communications (Ni et al., 2025, DOI: 10.1038/s41467-025-63602-z) alongside preprint articles on fragment screens of Coxsackievirus A16, Enterovirus D68, and Zika virus. The Fragalysis Cloud platform is accessible at fragalysis.diamond.ac.uk. The project's poster is available on Zenodo (DOI: 10.5281/zenodo.18863555).
Two milestones, one shared commitment
Together, HEFTIE and Fragalysis Cloud demonstrate what the OSCARS cascading grant mechanism was designed for: giving focused, well-scoped Open Science projects the resources and the framework to deliver concrete, reusable results in a defined timeframe, and connecting those results to the wider European Open Science ecosystem.
Both projects' outputs are open and FAIR. Further details on each project can be found here.