Science cluster
Summary
The AMIS project aims to develop an innovative web application specifically designed for humanities researchers, focusing on text analysis for metadata enrichment. It is expected to be a user-friendly tool for assessing metadata quality, enriching it with additional information, and facilitating text analysis. The web application seeks to streamline metadata creation by providing design assistance, improving academic discoverability, and offering personalised recommendations for scholarly data. By leveraging machine learning, AMIS enables a contextualised approach to metadata enrichment, enhancing research processes, particularly for heritage texts in the Social Sciences and Humanities (SSH).
Challenge
Open Science Service
Metadata enrichment in digital humanities is often a manual, time-consuming task. Shared standards have been defined, but there is a lack of open services that support the large-scale creation of quality metadata. As a result, because expectations are interpreted differently, information extracted from the same field across a wide range of resources tends to be highly inconsistent. The challenge, therefore, is to strike a balance between speed and precision while taking the researcher’s expertise into account.
Solution
By developing a context-based web application, AMIS will provide an innovative environment for scholars to create and refine their own metadata faster and more precisely, while also helping them identify best practices and shared vocabularies for describing text content. The service uses machine learning techniques, including text classification, named entity recognition (NER), named entity linking (NEL), sentiment analysis, topic modelling, text summarisation, sequence labelling, and dependency parsing, to analyse the uploaded files, compare them with data from international repositories, and suggest enhancements in the form of metadata. The core module of AMIS will be trained on a wide variety of resources hosted in repositories such as Zenodo, Nakala, Rossio, Recolecta, Docta, and several others.
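To make the idea of suggesting metadata enhancements concrete, the following is a minimal sketch, not the AMIS implementation. It replaces the machine learning techniques named above with simple term counting, and it assumes a hypothetical shared vocabulary, stopword list, and record structure that are invented here purely for illustration. Given a text and the keywords already present in its record, it proposes vocabulary terms that appear in the text but are missing from the metadata.

```python
import re
from collections import Counter

# Hypothetical stopword list and shared vocabulary (illustrative assumptions,
# not taken from AMIS or from any real repository).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "on", "for",
             "with", "by", "is", "was", "this"}
SHARED_VOCABULARY = {"manuscript", "letter", "exile", "poetry", "archive"}


def suggest_keywords(text, existing_keywords, top_n=5):
    """Suggest vocabulary terms found in the text but absent from the record."""
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    existing = {k.lower() for k in existing_keywords}
    # Keep only terms from the shared vocabulary that the record lacks.
    candidates = [
        (term, n) for term, n in counts.items()
        if term in SHARED_VOCABULARY and term not in existing
    ]
    # Most frequent first; ties are broken alphabetically.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in candidates[:top_n]]


# Hypothetical record and text, for demonstration only.
record = {"title": "Letters from exile", "keywords": ["letter"]}
text = ("This manuscript is a letter written in exile. "
        "The manuscript was kept in a private archive.")

print(suggest_keywords(text, record["keywords"]))
# prints ['manuscript', 'archive', 'exile']
```

A real service would replace the counting step with trained models such as NER or topic modelling, and would draw its vocabulary from the repositories it compares against. The researcher would still review each suggestion before accepting it.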
Scientific Impact
By enhancing metadata quality and coherence, AMIS will contribute to the convergence of conceptual models in text description and analysis within cultural studies and other SSH disciplines. This will increase comparability among similar research conducted in different academic contexts and corpora. The project will also benefit researchers by improving their own metadata. Existing collections will be upgraded, while new digital corpora will be designed from the outset with enriched metadata perspectives. Heritage texts will not only be made available but will also be virtually linked through a multitude of information points to other existing resources, supporting further discoveries about the circulation of ideas and themes across cultural areas and over time. Finally, AMIS promotes better practices for metadata creation, improving the discoverability and reusability of scientific content.
Principal investigator
Fatiha Idmhand is currently a Professor at the University of Poitiers and a Researcher at the Institut des Textes et Manuscrits Modernes (Archivos, UMR-8132, Paris). Her academic background is in Literature and Hispanic Studies (Spanish and Spanish American Studies), with a particular focus on creative processes (genetic criticism studies) and Digital Humanities.
Her scientific research focuses on archives and manuscripts, particularly in relation to key themes such as cultural transfers and the circulation of literature, arts, and ideas between Europe and the Americas during the major conflicts and crises of the 20th and 21st centuries.
She is currently working on cultural mediators, new typologies for manuscripts, and methods in literary computing.