OSCARS in context: an introduction to Open Science, FAIR principles, EOSC and the SRIA


Read the article on Zenodo

 

Open Science is research based on open cooperative work that emphasises the sharing of data, knowledge, results and tools as early and widely as possible using digital and collaborative technology [1].

Notable Open Science practices include:

  • Early and open sharing of research:
    • Pre-registration, registered and open reports, data deposition in shared repositories, pre-prints.
    • Open collaboration within science and with other knowledge producers/users.
  • Providing immediate and unrestricted open access to scientific publications, research data, models, algorithms, software, protocols, notebooks, workflows, and all other research outputs.
  • Ensuring verifiability and reproducibility of research outputs through the provision of sufficient provenance information for all outputs.
  • Promoting public engagement in research and innovation, bolstering citizen science and enhancing public trust in science.
  • Responsible management of research outputs (publications, data, and other outputs) in line with the CARE principles, which are designed to ensure that scientific data are used in ways that are purposeful and oriented towards enhancing the wellbeing of people, and with the FAIR principles:
    • FINDABLE - The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
      • F1. (Meta)data are assigned a globally unique and persistent identifier
      • F2. Data are described with rich metadata (defined by R1 below)
      • F3. Metadata clearly and explicitly include the identifier of the data they describe
      • F4. (Meta)data are registered or indexed in a searchable resource
    • ACCESSIBLE - Once the user finds the required data, they need to know how they can be accessed.
      • A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
      • A1.1. The protocol is open, free, and universally implementable
      • A1.2. The protocol allows for an authentication and authorisation procedure, where necessary
      • A2. Metadata are accessible, even when the data are no longer available
    • INTEROPERABLE - The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
      • I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
      • I2. (Meta)data use vocabularies that follow FAIR principles
      • I3. (Meta)data include qualified references to other (meta)data
    • REUSABLE - The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
      • R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
      • R1.1. (Meta)data are released with a clear and accessible data usage license
      • R1.2. (Meta)data are associated with detailed provenance
      • R1.3. (Meta)data meet domain-relevant community standards
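
As an illustration of how a few of these principles can be turned into machine-checkable rules, the following minimal Python sketch tests a metadata record against four of them. It is a sketch only: the field names (`identifier`, `data_identifier`, `license`, `provenance`), the example DOI, and the idea of treating a DOI URL as a proxy for F1 are assumptions made for this example, not part of the FAIR principles or of any OSCARS tool.

```python
from typing import Any, Callable

# Illustrative checks only. All field names are hypothetical; real metadata
# schemas use their own element names and need richer validation.
CHECKS: dict[str, Callable[[dict[str, Any]], bool]] = {
    # F1: narrow proxy, accepts only identifiers expressed as DOI URLs
    "F1": lambda r: str(r.get("identifier", "")).startswith("https://doi.org/"),
    # F3: the metadata name the identifier of the data they describe
    "F3": lambda r: bool(r.get("data_identifier")),
    # R1.1: a usage license is stated
    "R1.1": lambda r: bool(r.get("license")),
    # R1.2: some provenance information is attached
    "R1.2": lambda r: bool(r.get("provenance")),
}


def fair_report(record: dict[str, Any]) -> dict[str, bool]:
    """Return a pass/fail flag for each illustrative check."""
    return {principle: check(record) for principle, check in CHECKS.items()}


# Placeholder record; the DOI below is not a real dataset.
record = {
    "identifier": "https://doi.org/10.1234/example",
    "data_identifier": "https://doi.org/10.1234/example",
    "license": "CC-BY-4.0",
}
report = fair_report(record)
missing = [principle for principle, ok in report.items() if not ok]
print(report)
print("Not satisfied:", missing)
```

Here the record has no provenance field, so R1.2 is the only check that fails. A real assessment would cover all fifteen principles and depend on the metadata standard chosen by the community.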

Open Science is also the key driver of the European Open Science Cloud (EOSC), which aims to provide researchers and innovators in Europe with an open and trusted multi-disciplinary environment where they can publish, find and reuse data, tools and services for research and innovation. EOSC will transform how researchers access and share digital knowledge throughout the research lifecycle, helping European scientific communities reap the full benefits of data-driven science and giving Europe a global lead in both research data management and scientific progress [2].

The EOSC Strategic Research and Innovation Agenda (SRIA) [3] provides a clear roadmap to achieve the EOSC vision and objectives, namely to deliver an operational “Web of FAIR data and services” for science. The roadmap was validated through wide consultation of EOSC stakeholders, including representatives of EU Member States and other countries associated to the EU R&I programme, research-performing and research-funding organisations, research infrastructures and e-infrastructures, research libraries and research associations.

The European research infrastructures have been enthusiastic supporters of EOSC and have received support from the European Commission to accelerate the implementation of Open Science practices across domains. Groups of thematically aligned research infrastructures came together to form “Science Clusters”, working together to achieve this goal.

Within this broad digital landscape, the OSCARS project aims to foster the uptake of Open Science in Europe by promoting FAIRification of data and tools, creating new FAIR data, developing software to analyse the data, and making them available to the European research community.

The project was designed and is implemented by five Science Clusters: ENVRI (environmental sciences), ESCAPE (astronomy and particle physics), LS-RI (life sciences), PaNOSC (photon and neutron science), and SSHOC (social sciences and humanities). The clusters aim to strengthen their role in the European Research Area by consolidating their past achievements into lasting interdisciplinary FAIR data services and working practices across scientific disciplines and communities, and by supporting the implementation of new Open Science projects and services.

In the 1st OSCARS Open Call, 58 projects were funded with a total of 13 million EUR; another 12-15 projects will be funded in the 2nd Open Call with the remaining 3 million EUR. In addition, the clusters offer training platforms, organise hackathons and community workshops, and have started to set up Community-based Competence Centres (CCCs) tailored to each domain. Their shared infrastructure, such as PaNOSC’s federated search Application Programming Interfaces (APIs), the ENVRI-Hub, the SSHOC Open Marketplace, LS-RI’s WorkflowHub, and Virtual Research Environments (VREs), enables scientists from different domains to use EOSC resources seamlessly, including AI-driven workflows, such as SSHOC’s workflows for AI-based models, or ESCAPE’s Data Lake.


List of abbreviations

  • API: Application Programming Interface
  • CCC: Community-based Competence Centre
  • CODAS: Composable Open Data and Analysis Services
  • CVMFS: CernVM File System
  • DCAT-AP: DCAT Application profile for data portals in Europe
  • DIOS: Data Infrastructure for Open Science
  • DOI: Digital Object Identifier
  • EOSC: European Open Science Cloud
  • FAIR: Findable, Accessible, Interoperable, Reusable
  • GDPR: General Data Protection Regulation
  • HDF: Hierarchical Data Format
  • RI: Research Infrastructure
  • SRIA: Strategic Research and Innovation Agenda
  • VISA: Virtual Infrastructure Scientific Analysis platform
  • VRE: Virtual Research Environment

Glossary

  • CVMFS – CernVM File System, a read-only file system designed for delivering software and data to large-scale computing environments.
  • EOSC Exchange is a catalogue and marketplace made up of components that include an onboarding workflow and data transfer services.
  • FAIR Implementation Profile (FIP) Wizard: a tool, built on the Data Stewardship Wizard software, that allows research communities to document their specific choices for implementing the FAIR Guiding Principles.
  • Galaxy is an open, web-based platform designed to make computational research FAIR, promoting Open Science practices.
  • HDF - Hierarchical Data Format is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data.
  • Jupyter Notebook: the original web application for creating and sharing computational documents. It offers a simple, streamlined, document-centric experience.
  • Metadata crosswalk: a mapping or table that translates metadata elements and their values from one schema (source) to another (target), facilitating interoperability and data exchange between different systems or standards.
  • Metadata standard: a requirement which is intended to establish a common understanding of the meaning or semantics of the data.
  • Research Object Crate - RO-Crate is a metadata packaging format for describing research objects in a machine-actionable way.
  • Science demonstrator: a physical or conceptual "thing" that communicates scientific principles, the feasibility of technology, or the workings of an experiment or process to an audience. It can be a tangible prototype, a set of instructions, or an interactive visual, used by educators to illustrate concepts, researchers to attract investment, or to fulfill a specific job role in an academic setting.
  • WorkflowHub: a registry for computational workflows.
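
To make the idea of a metadata crosswalk more concrete, the sketch below translates a record from one schema to another using a small mapping table. Both schemas, every field name, and the sample record are hypothetical and chosen only for illustration; this is not an official crosswalk between any real metadata standards.

```python
from typing import Any, Callable

# Each source field maps to (target field, value transform).
# All field names are hypothetical and do not follow a real standard.
Crosswalk = dict[str, tuple[str, Callable[[Any], Any]]]

CROSSWALK: Crosswalk = {
    "title": ("name", str.strip),
    "creator": ("authors", lambda v: [v] if isinstance(v, str) else list(v)),
    "date": ("publication_year", lambda v: int(str(v)[:4])),
}


def apply_crosswalk(
    source: dict[str, Any], crosswalk: Crosswalk
) -> tuple[dict[str, Any], list[str]]:
    """Translate a source record; also report fields with no mapping."""
    target: dict[str, Any] = {}
    unmapped: list[str] = []
    for field, value in source.items():
        if field in crosswalk:
            target_field, transform = crosswalk[field]
            target[target_field] = transform(value)
        else:
            unmapped.append(field)
    return target, unmapped


# Made-up sample record in the hypothetical source schema.
source_record = {
    "title": "  Beamline scan 42 ",
    "creator": "A. Researcher",
    "date": "2024-05-17",
    "format": "HDF5",
}
target_record, unmapped = apply_crosswalk(source_record, CROSSWALK)
print(target_record)
print("No mapping for:", unmapped)
```

Returning the list of unmapped fields is a deliberate choice in this sketch: a crosswalk rarely covers every element of the source schema, and making the gaps visible helps avoid silently losing metadata during exchange.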

Authors

Nicoletta Carboni (CERIC-ERIC), Romain David (ERINHA), Franciska de Jong (Utrecht University), Darja Fišer (CLARIN ERIC), Friederike Schmidt-Tremmel (Trust-IT).

References

[1] https://rea.ec.europa.eu/open-science_en ; https://research-and-innovation.ec.europa.eu/strategy/strategy-research-and-innovation/our-digital-future/open-science_en

[2] https://research-and-innovation.ec.europa.eu/strategy/strategy-research-and-innovation/our-digital-future/open-science/european-open-science-cloud-eosc_en

[3] https://op.europa.eu/en/publication-detail/-/publication/f9b12d1d-74ea-11ec-9136-01aa75ed71a1/language-en

DOI

http://doi.org/10.5281/zenodo.17193362
