Science clusters

LS RI - Life Sciences
PANOSC - Photon and Neutron Science

Summary

X-ray tomography is a technique used across diverse scientific fields, such as materials science, geosciences, palaeontology, and life sciences. With current experiments producing enormous datasets, ‘chunked’ file formats (which break massive files down into smaller parts) have become essential for data processing. HEFTIE aims to significantly improve tools and educational resources for handling such datasets by creating a comprehensive digital textbook and new visualisation software. These advancements will accelerate the analysis of 3D organ scans and unlock discoveries in the life sciences, while also benefiting other fields, such as neuroscience, archaeology, and the earth sciences.

Research domains:
Life sciences, Photon/neutron sources-based experimental research
Partner(s):
University College London
Project team member(s):
David Stansby, Norman Rzepka, Kimberly Meechan

Challenge

X-ray tomography produces extremely large datasets, often exceeding 1 TB, which are too large to fit into standard computer memory. This has led to the adoption of 'chunked' file formats, enabling scientists to process these datasets in manageable portions. However, many researchers lack the tools and training to work with such massive data effectively, which slows down analysis and limits the scientific discoveries that could be made from this data.
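
As a rough illustration of what processing a dataset "in manageable portions" means, the sketch below computes the mean voxel value of a 3D volume while holding only one chunk in memory at a time. It is a minimal, pure-Python sketch and not the project's tooling: `chunk_slices`, `chunked_mean`, and `toy_read_chunk` are hypothetical names introduced here, and the toy reader stands in for a real chunked store on disk.

```python
from itertools import product
from math import ceil


def chunk_slices(shape, chunk_shape):
    """Yield one tuple of slices per chunk, together covering `shape`."""
    counts = [ceil(s / c) for s, c in zip(shape, chunk_shape)]
    for index in product(*(range(n) for n in counts)):
        # Edge chunks are clipped so they never run past the array bounds.
        yield tuple(
            slice(i * c, min((i + 1) * c, s))
            for i, c, s in zip(index, chunk_shape, shape)
        )


def chunked_mean(read_chunk, shape, chunk_shape):
    """Mean of all voxels, reading a single chunk at a time.

    `read_chunk` is a hypothetical callback that returns the voxel values
    of one chunk as a flat list.
    """
    total = 0.0
    count = 0
    for slices in chunk_slices(shape, chunk_shape):
        values = read_chunk(slices)
        total += sum(values)
        count += len(values)
    return total / count


def toy_read_chunk(slices):
    """Toy stand-in for a dataset on disk: voxel value = z + y + x."""
    z, y, x = slices
    return [
        k + j + i
        for k in range(z.start, z.stop)
        for j in range(y.start, y.stop)
        for i in range(x.start, x.stop)
    ]


shape = (10, 8, 6)        # full volume (z, y, x); tiny here, terabytes in practice
chunk_shape = (4, 4, 4)   # size of one chunk

n_chunks = len(list(chunk_slices(shape, chunk_shape)))
mean_value = chunked_mean(toy_read_chunk, shape, chunk_shape)
print(n_chunks, mean_value)
```

The same pattern applies to any operation that can be accumulated chunk by chunk: peak memory use depends on the chunk size rather than on the size of the full dataset.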

Solution

HEFTIE aims to develop a comprehensive digital textbook and new software tools for working with chunked 3D imaging datasets. The textbook will provide clear guidance on setting up and running data analysis and visualisation pipelines, while the new tools, including visualisation software, will make these datasets easier to use. In connection with the Human Organ Atlas project, the project team will use the textbook and the new tools to analyse scans of human organs and understand how the human body works in health and disease. These scans come from a technique that images whole human organs in 3D down to the resolution of individual cells. By building on open-source technologies, such as the Zarr data format, Python, and the WEBKNOSSOS software, HEFTIE aims to ensure long-term accessibility and community engagement.

Scientific Impact

By adhering to FAIR data principles and integrating with the EOSC, HEFTIE will expand access to terabyte-scale data and foster open science collaboration across Europe. The digital textbook and the tools will be designed from the start to run on the Virtual Infrastructure for Scientific Analysis (VISA), the Virtual Research Environment (VRE) cloud computing platform developed as part of PaNOSC, so that they are accessible to a wide range of users across the PaNOSC cluster.

Improved training and tools for working with the data will allow the project team, and other researchers around the world, to speed up data analysis and make new discoveries from previously unused data. Beyond their direct applications in the life sciences, the tools and training materials will be applicable to other research fields, such as neuroscience, archaeology, and the earth sciences.


Keywords
X-ray tomography, Human Organ Atlas, VISA - Virtual Infrastructure for Scientific Analysis, 3D organ scans, 3D imaging datasets, chunked data
Project start date:
Project duration:
12 months

Principal investigator

David Stansby
University College London
BIO

David Stansby is the lead Data Scientist for the Human Organ Atlas, a public resource of 3D images of whole human organs that can be explored down to the scale of individual cells. He is based at University College London in the UK and has a background in enabling scientific research through the creation of open software and data.

QUOTE
"Our project will help scientists around the world work with huge bio-imaging datasets, unlocking the potential of existing open data to make new advances in healthcare."

Resources