Strengthening cross-disciplinary collaboration of RIs within the EOSC ecosystem
The CLARIN Annual Conference is the key event for users and developers of the CLARIN research infrastructure across Europe and beyond. CLARIN2025, held in Vienna, enabled the exchange of ideas and experiences within the community at large. CLARIN is one of the RIs participating in the OSCARS project.
Its role in OSCARS was highlighted throughout the conference in project presentations and a post-conference workshop on strengthening EOSC’s terminology infrastructures.
Earlier this year, grantees of the first round of the OSCARS open calls discussed their projects and needs with CLARIN representatives at the Strategy Days in Leuven (see blog post). CLARIN was thrilled to see three of the projects represented at the annual conference.
During the poster session, Hana Žižková presented an improved version of the Czech open-source spellchecker Opravidlo:
'We were delighted to present Opravidlo.cz, an online proofreading tool for the Czech language supported by the OSCARS project. The poster session format allowed us to talk to those who were interested in the project from the perspective of their research. It was interesting to look at the issue of automatic corrections from the perspective of other languages as speakers of most European languages were present at the conference, or to consider the use of methods other than those we currently use. We appreciated that during the informal parts of the conference we were able to talk to colleagues from other OSCARS projects, and share experiences with reporting milestones within our projects. Specifically for our team, it was inspiring to see how others work with the community connected to the project – an area we have decided to improve. The CLARIN Annual Conference 2025 was also a great opportunity for networking, and overall we consider our participation in the conference to have been beneficial and inspiring.'
Built on large language models (LLMs), Opravidlo will benefit public services and future linguistic research, serve native Czech speakers of all ages, and accelerate the integration of immigrants into Czechia.
See the poster here
Representing the DraCorOS project, Ingo Börner shared the ongoing developments within the community-driven drama corpora platform DraCor. Short for Drama Corpora, DraCor is a digital ecosystem dedicated to the study of drama from antiquity to the 20th century. The platform currently hosts over 4,400 theatre plays in 28 corpora from more than 22 languages.
Ingo Börner reflects:
‘Attending the CLARIN Annual Conference was an enriching and highly productive experience. The welcoming atmosphere of the CLARIN community stood out immediately – technical experts and infrastructure specialists were genuinely approachable, taking time during coffee breaks to provide detailed guidance and feedback. These informal conversations proved invaluable for our DraCorOS project goals.
The conference offered an important "reality check" for our integration plans. We're working to better connect DraCor with CLARIN infrastructure – particularly the VLO and Digital Object Gateway – viewing CLARIN as an accessible hub for onboarding our resources to EOSC. The discussions confirmed we're on the right track and provided concrete guidance: from examining code examples for integration to developing strategies for implementing persistent identifiers using handle links with fragment identifiers. I even received practical suggestions for test-deploying DraCor on the EOSC Cloud Container Platform – an idea I'm excited to pursue.
During the poster session, I presented our DTS (Distributed Text Services) implementation in DraCor, which enables fine-granular text access and improved citability. While DTS hasn't been widely adopted yet, the conversations generated valuable interest. It was particularly exciting to meet CLARIN-PL PhD student Alexandra Rys, who presented research based on data also available in DraCor's Polish Drama Corpus – discovering such usage directly supports our OSCARS goal of connecting research back to corpora through Linked Data technologies.
Meeting other OSCARS grantees provided helpful peer-to-peer exchanges about project expectations and challenges. Unlike typical humanities conferences, CLARIN's tech-affinity created space to genuinely "nerd out" on infrastructure questions – which, as builders of language-centered research infrastructure ourselves, we deeply appreciated. Overall, the conference provided both strategic validation and actionable next steps for advancing DraCor's integration into the CLARIN and EOSC ecosystems.’
Prior to CLARIN2025, the DraCorOS project co-organised the DraCor Summit, bringing together researchers in computational literary studies, cultural analytics, and adjacent fields to discuss their research and corpus projects related to DraCor. In line with its community focus, DraCor’s wide-reaching impact is reflected in its Zenodo library.
See the poster here
Building on CLARIN’s flagship project ParlaMint, ParlaCAP aims to make the existing parliamentary corpora more accessible to further enhance transparency in legislative discourse across Europe.
ParlaCAP’s Nikola Ljubešić presented the project during the Bazaar session:
‘Attending the CLARIN Annual Conference was an inspiring and rewarding experience. During the highly inclusive Bazaar session, I had the opportunity to present the current state of our work within the ParlaCAP OSCARS cascading grant project. The session generated substantial interest among participants, offering us valuable feedback and new perspectives on our approach to enriching and structuring parliamentary corpora. A particularly meaningful aspect of the event was the presence of the leader of the Croatian node of CESSDA, whose participation added great value to our discussions. As we recently published the first version of the ParlaCAP dataset in their CROSSDA repository, designed to serve social scientists through our transformation of the ParlaMint corpora into the structured ParlaCAP dataset, her engagement as a discussant was both timely and insightful. Our concept of a teacher-student LLM architecture within ParlaCAP also drew significant attention, especially its implementation using the IPTC news topic classification schema. We even had the opportunity to test it during the conference on historical newspaper datasets now feeding into PressMint – the next CLARIN flagship project, following in the footsteps of ParlaMint. Overall, it was an exceptional event – rich in ideas, collaborations, and conversations, far too many to fit into this short report.’
See the poster here
Following the main conference programme, the OSCARS Glossaries Task Force, coordinated by CLARIN, organised a post-conference workshop focused on the findability of existing glossaries, vocabularies, and thesauri – so-called semantic artefacts – within the EOSC landscape. These resources are essential for discovering research data and services. Representatives from the five clusters involved in the OSCARS project – ENVRI, ESCAPE, LSRI, PaNOSC, and SSHOC – were invited to share an overview of current solutions and community needs. The participants identified specific issues and opportunities in their scientific domains. They also exchanged best practices and gave recommendations for improving the ways in which vocabularies are used and sustained across the EOSC ecosystem. The workshop laid the groundwork for a roadmap towards a more unified landscape for semantic artefacts. The OSCARS workshop was followed by an event on 3 October at the Austrian Academy of Sciences, organised by the SSH Vocabulary Commons initiative.
Workshop attendee Susanna-Assunta Sansone, professor and director at the Oxford e-Research Centre, reflects: ‘The workshop was an extremely valuable experience that provided both a comprehensive overview of current challenges and a constructive space for collaboration across the clusters. I found the compilation of discussions around use stories for the terminology infrastructure particularly insightful. The mix of presentations and open discussions provided valuable insights into existing solutions and highlighted areas where collaboration can make a real difference. The roadmap and recommendations discussed will be a strong foundation for future work towards a more harmonised and FAIR terminology ecosystem within EOSC.’
With OSCARS’ goals in mind, the CLARIN Annual Conference provided opportunities for consolidating collaboration and discussion about common approaches in the five Science Clusters, while also enabling exchanges on thematic issues and embracing Open Science within the SSH cluster.