Science cluster

SSHOC - Social Sciences and Humanities

Summary

The AMIS project aims to develop an innovative web application specifically designed for humanities researchers, focusing on text analysis for metadata enrichment. It is expected to be a user-friendly tool for assessing metadata quality, enriching it with additional information, and facilitating text analysis. The web application seeks to streamline metadata creation by providing design assistance, improving academic discoverability, and offering personalised recommendations for scholarly data. By leveraging machine learning, AMIS enables a contextualised approach to metadata enrichment, enhancing research processes, particularly for heritage texts in the Social Sciences and Humanities (SSH).

Research domains:
Social Sciences and Humanities
Partner(s):
ARIANE Consortium of the Huma-Num infrastructure, CNRS, University of Poitiers, University of Sorbonne Nouvelle, University of Lyon, University of Madrid Complutense, University AL. I. Cuza
Project team member(s):
Prof. Ioana Galleron (University of Sorbonne Nouvelle, France), Prof. Fatiha Idmhand (University of Poitiers, France), Prof. Sabine Loudcher (University of Lyon 2, France), Prof. Amelia Sanz (University of Madrid Complutense, Spain), Prof. Simone Rebora (University of Verona, Italy), Dr. Roxana Patras (University AL. I. Cuza, Romania)

Challenge

Open Science Service

Metadata enrichment in digital humanities is often a manual, time-consuming task. Shared standards have been defined, but open services that support the large-scale creation of quality metadata are lacking. As a result, because expectations are interpreted differently, information extracted from the same field across a wide range of resources tends to be highly inconsistent. The challenge, therefore, is to strike a balance between speed and precision while taking the researcher’s expertise into account.

Solution

By developing a context-based web application, AMIS will provide an innovative environment in which scholars can create and refine their own metadata faster and more precisely, while also helping them identify best practices and shared vocabularies for describing text content. The service uses machine learning techniques, including text classification, named entity recognition (NER), named entity linking (NEL), sentiment analysis, topic modelling, text summarisation, sequence labelling, and dependency parsing, to analyse the uploaded files, compare them with data from international repositories, and suggest enhancements in the form of metadata. The core module of AMIS will be trained on a wide variety of resources hosted in repositories such as Zenodo, Nakala, Rossio, Recolecta, Docta, and several others.
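
To make the "analyse, compare, suggest" idea concrete, the following is a minimal sketch, not the AMIS implementation. It assumes plain term-frequency counting as a stand-in for the machine learning techniques listed above. The function name `suggest_keywords`, the sample record, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); a real service would use
# a fuller, language-specific list or a trained model instead.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on", "for", "with", "by"}


def suggest_keywords(text, existing_keywords, top_n=3):
    """Return up to top_n frequent terms in text that the record's keywords lack."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    known = {k.lower() for k in existing_keywords}
    # Sort by descending frequency, then alphabetically, so results are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked if term not in known][:top_n]


# Hypothetical metadata record and uploaded text, for illustration only.
record = {"title": "Letters from exile", "keywords": ["letters"]}
text = (
    "The manuscript gathers letters written in exile. "
    "Exile shapes the manuscript, and the letters trace the exile of the author."
)

suggestions = suggest_keywords(text, record["keywords"])
print(suggestions)
```

The sketch only proposes terms; in keeping with the challenge described above, the researcher would still decide which suggestions to accept into the record.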

Scientific Impact

By enhancing metadata quality and coherence, AMIS will contribute to the convergence of conceptual models in text description and analysis within cultural studies and other SSH disciplines. This will increase comparability among similar research conducted in different academic contexts and corpora. The project will also benefit researchers by improving their own metadata. Existing collections will be upgraded, while new digital corpora will be designed from the outset with enriched metadata perspectives. Heritage texts will not only be made available but will also be virtually linked through a multitude of information points to other existing resources, supporting further discoveries about the circulation of ideas and themes across cultural areas and over time. Finally, AMIS promotes better practices for metadata creation, improving the discoverability and reusability of scientific content.


Keywords
metadata enrichment, natural language processing, text analysis, machine learning, (meta)data quality assessment, digital humanities
Project start date:
Project duration:
24 months

Principal investigator

Fatiha Idmhand
University of Poitiers
BIO

Fatiha Idmhand is currently a Professor at the University of Poitiers and a Researcher at the Institut des Textes et Manuscrits Modernes (Archivos, UMR-8132, Paris). Her academic background is in Literature and Hispanic Studies (Spanish and Spanish American Studies), with a particular focus on creative processes (genetic criticism studies) and Digital Humanities.

Her scientific research focuses on archives and manuscripts, particularly in relation to key themes such as cultural transfers and the circulation of literature, arts, and ideas between Europe and the Americas during the major conflicts and crises of the 20th and 21st centuries.

She is currently working on cultural mediators, new typologies for manuscripts, and methods in literary computing.

QUOTE
"The AMIS project offers an innovative web tool that streamlines metadata creation, enriches research with relevant information, and enhances discoverability, advancing Open Science by promoting accessible and well-structured scholarly data in cultural and textual studies."