Science cluster

PaNOSC - Photon and Neutron Science

Summary

The PaN-Finder project aims to enhance the capabilities of the existing PaNOSC Data Portal, which gives researchers access to a wealth of data from Europe’s Photon and Neutron (PaN) Research Infrastructures (RIs). Originally developed as part of the EU Horizon 2020 project PaNOSC, the portal has already improved the visibility and accessibility of scientific data. PaN-Finder will introduce an AI-powered search tool that simplifies user interaction by enabling intelligent, prompt-based searches. By leveraging state-of-the-art AI technologies, such as Large Language Models and Natural Language Processing, this tool will make it easier for researchers, journalists, and the general public to navigate and use valuable PaN data effectively.

Research domains:
Photon/neutron sources-based experimental research
Partner(s):
European Spallation Source
Project team member(s):
Massimiliano Novelli (European Spallation Source), Fredrik Bolmsten (European Spallation Source), Janos Babik (European Spallation Source)

Challenge

Open Science project, Open Science Service, Cross-domain/Cross-RI

As the scientific data pool grows, finding relevant and high-quality data becomes increasingly difficult. The existing PaNOSC Data Portal has enabled access to PaN research outputs, but there remains a need for more intuitive search mechanisms. The challenge lies in improving user interaction and data retrieval efficiency, enabling a wider range of users to navigate this expansive knowledge base effectively.

Solution

PaN-Finder is a prompt-based search system, similar to ChatGPT, comprising a user interface and the underlying infrastructure. It will adopt state-of-the-art AI technologies, including Large Language Models (LLMs), AutoEncoders, and Retrieval-Augmented Generation (RAG), and pair them with Information Retrieval (IR) and Natural Language Processing (NLP) methodologies. As a result, the PaN-Finder tool will be able to understand both summative and exploratory questions, giving meaningful responses that summarise a vast knowledge pool and reference the relevant resources. By embedding this feature in the existing PaNOSC Data Portal, the project aims to stimulate usage of the portal within the PaN community and accelerate its adoption.
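To make the retrieval-augmented idea concrete, the following is a minimal sketch of the general pattern, not the PaN-Finder implementation. It assumes a tiny, invented list of catalogue records (`RECORDS`) and hypothetical helper names (`retrieve`, `build_prompt`); it ranks records against a question using simple word-count cosine similarity, where a real system would more likely use learned embeddings, and then assembles the text that would be handed to a language model together with citable record identifiers. No language model is called.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue records; titles and identifiers are invented for illustration.
RECORDS = [
    {"id": "ds-001", "text": "Neutron diffraction of lithium battery cathode materials"},
    {"id": "ds-002", "text": "X-ray tomography of fossilised insect specimens"},
    {"id": "ds-003", "text": "Small-angle neutron scattering of protein solutions"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, records, k=2):
    """Return up to k records most similar to the question, skipping zero-score ones."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(r["text"]))), r) for r in records]
    scored = [(s, r) for s, r in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:k]]


def build_prompt(question, hits):
    """Assemble the text a language model would receive, with record ids it can cite."""
    context = "\n".join(f"[{r['id']}] {r['text']}" for r in hits)
    return (
        "Answer using only the records below and cite their ids.\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    question = "neutron scattering of proteins"
    print(build_prompt(question, retrieve(question, RECORDS)))
```

Keeping retrieval separate from prompt assembly means the answer can always point back to the records it was built from, which is the "references to relevant resources" behaviour described above.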

Scientific Impact

PaN-Finder will significantly improve data findability and accessibility, and will incentivise RIs to provide higher-quality data with well-defined metadata, resulting in higher reusability of data. Open Science research will thus be enhanced by promoting the adoption of FAIR principles across the research community and by enabling a wider audience to perform AI-assisted, intelligent, user-friendly searches across the PaN knowledge base. PaN-Finder will generate valuable insights into PaN data, which will be instrumental in developing comprehensive guidelines for data curation and in enhancing existing metadata. By facilitating better data discovery and curation practices, the project will accelerate scientific progress and collaboration, ensuring that high-quality data with enriched metadata are easily accessible for future research. The long-term impact will extend beyond the PaN community, fostering the adoption of AI innovations while enhancing the overall quality and accessibility of PaN data resources. As part of the project’s outreach activities, these insights will be shared with the community, fostering collaboration and strengthening data quality standards.


Keywords
PaN Data Portal, photon and neutron data, photon and neutron science, AI technologies, PaN data
Project start date:
Project duration:
24 months

Principal investigator

Massimiliano Novelli
European Spallation Source
BIO

Max Novelli is a software engineer, data architect, and data manager currently working as Senior Data Curation Scientist at the Copenhagen office of the European Spallation Source. He leads the data catalogue project, including its deployment and integration into the existing IT infrastructure. He is also working to establish a well-defined set of metadata for describing scientific datasets in line with the FAIR principles.
He is passionate about data and data visualisation. When not lost in "computer land", he enjoys his family and friends, exercising and travelling.

QUOTE
"By leveraging the state-of-the-art AI technologies, the PaN-Finder will become the tool to empower researchers, and the public, to intuitively engage with complex datasets, lowering the barriers to Open Science, and fostering a culture change toward the adoption of FAIR principles."