
Science cluster

PANOSC - Photon and Neutron Science

Summary

Diffraction experiments generate vast amounts of raw data in diverse formats, complicating their reuse and interoperability across research domains. The MC-ReDD project responds to this challenge by focusing on the expanded imgCIF format, which allows the metadata associated with image data to be transferred and processed easily, without needing to transfer the images themselves. The project will create a publicly available, easy-to-use tool for semi- or fully automatic construction of imgCIF files from raw data sets. This will deliver the interoperability and reusability benefits offered by the imgCIF scheme, so that specialists from other domains can trust the raw data they reuse.

Research domains:
Photon/neutron sources-based experimental research
Partner(s):
European XFEL, International Union of Crystallography Journals
Project team member(s):
Dr. Loes M.J. Kroon-Batenburg, Fabio Dall’Antonia (EuXFEL, coordinator from the hosting RI), James Hester (ANSTO, technical advisor), Thomas Kluyver (EuXFEL, software developer), Alex Stanley (IUCr, advisor)

Challenge

Open Science project, Cross-domain/Cross-RI

Raw diffraction data are captured using numerous binary data formats and varying metadata structures, making it difficult to achieve interoperability or reusability. The large size of these datasets, often over 100 GB, further complicates data sharing and validation. Current methods for handling raw data lack standardisation, hindering transparent communication between researchers, and making it harder to trust and reuse data across disciplines.

Solution

MC-ReDD aims to build on the text-based imgCIF format to create a publicly available, easy-to-use tool for semi- or fully automatic construction of imgCIF files from raw data sets. The metadata associated with image data can then be transferred and processed easily, without necessarily transferring the entire dataset. This scheme offers, for the first time, a way to transparently communicate rich information about raw data in a standardised, robust, machine-readable fashion, allowing third-party raw data services to be provided on the open web.

Scientific Impact

Beyond enhancing the interoperability and reusability of raw diffraction data across scientific domains, the project aims to make the tools available as an Open Science Service within the EOSC Web of FAIR Data and Services or the PaNdata software catalogue, as well as an open service hosted on the IUCr Journals website. The imgCIF files can become part of open data collections hosted in metadata catalogues, enabling cross-searching with other EOSC components. The imgCIF file thus represents a FAIR Digital Object.


Keywords
diffraction data, powder diffraction, nano electron diffraction, imgCIF
Project start date:
Project duration:
18 months

Principal investigator

Dr. Loes M.J. Kroon-Batenburg
IUCr CommDat working group
BIO

Dr. Loes M.J. Kroon-Batenburg was an assistant professor at the Faculty of Science of Utrecht University from 2003 to 2022, working on accurate data processing methods in crystallography, with a focus on problematic and complicated diffraction patterns. She served on the International Union of Crystallography (IUCr) Diffraction Data Deposition Working Group (DDDWG) from 2011 to 2017, investigating the possibilities and practicalities of raw diffraction data archiving and its metadata aspects. She has been the Main Editor of the IUCrData journal section Raw Data Letters since 2022.

QUOTE
"MC-ReDD contributes to open science by providing a metadata extraction and conversion tool/service to the crystallography and diffraction communities, apt to streamline workflows that enrich raw data with complete and publication-compliant metadata from several raw data sources, including FELs."