Science cluster
Summary
The FAIRY project focuses on enhancing YEASTRACT+, a widely used open-access FAIRsharing knowledge base and repository developed by ELIXIR that supports research on yeast transcriptional regulation. Yeasts are vital organisms both in the bioeconomy, through the industrial production of food, energy and commodities, and in health research, as pathogens and as model organisms for studying genetics and disease. FAIRY aims to advance scientific understanding of yeast regulatory networks and to improve the platform by integrating advanced data analysis tools while ensuring compliance with the FAIR principles. By relying on the Galaxy Europe platform for raw data curation, the project will link with the EOSC, increasing the database’s scientific utility and promoting data sharing across disciplines.
Challenge
Open Science project, Open Science Service, Cross-domain/Cross-RI
Despite YEASTRACT+'s role as a leading resource for yeast transcriptional regulation, it currently lacks quantitative gene expression data. Moreover, many yeast species in the database remain poorly characterised, impeding the understanding of their regulatory mechanisms and of their broader bioeconomic applications. The FAIRY project aims to tackle this challenge by incorporating machine learning models trained on cross-species regulatory data, while ensuring that the data remains FAIR.
Solution
FAIRY will develop and integrate tools for quantitative transcriptomics analysis within YEASTRACT+, allowing researchers to identify regulatory interactions and assess their quantitative impact on gene expression levels. The project will rely on the Galaxy Europe platform for raw data curation, in particular for large-scale RNA-seq data processing, making quantitative transcriptomics analysis more efficient and scalable. Additionally, machine learning models trained on cross-species regulatory data will be employed to predict new transcriptional regulatory rules for transcription factor activity and evolution in poorly characterised yeasts. To align with the FAIR principles, the project will develop web services (APIs) enabling data exchange between YEASTRACT+ and major biological databases (such as GenBank, KEGG, MetaCyc, SGD/CGD, and FungiDB), thus enhancing data accessibility, interoperability, and reusability. Researchers will be able to access YEASTRACT+ data and combine it with information from these complementary resources, fostering a more comprehensive understanding of yeast regulatory networks.
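As a rough illustration of what assessing the quantitative impact of a regulatory interaction can mean in practice, the sketch below computes a log2 fold change per gene between a wild-type sample and a transcription factor deletion sample. This is a minimal example under stated assumptions, not the method FAIRY will implement: the function name, the gene identifiers, the expression values and the pseudocount of 1 are all hypothetical.

```python
import math


def log2_fold_change(wild_type, mutant, pseudocount=1.0):
    """Return log2((mutant + p) / (wild_type + p)) for genes present in both samples.

    The pseudocount p avoids division by zero for unexpressed genes
    (its value here is an assumption chosen for illustration).
    """
    return {
        gene: math.log2((mutant[gene] + pseudocount) / (wild_type[gene] + pseudocount))
        for gene in wild_type
        if gene in mutant
    }


# Hypothetical normalised expression values; not real measurements.
wild_type = {"GENE_A": 7.0, "GENE_B": 15.0, "GENE_C": 3.0}
tf_deletion = {"GENE_A": 1.0, "GENE_B": 15.0, "GENE_C": 15.0}

changes = log2_fold_change(wild_type, tf_deletion)
for gene, value in changes.items():
    print(f"{gene}: {value:+.1f}")
```

In this toy data, a negative value for a gene in the deletion strain would be consistent with the transcription factor acting as an activator of that gene, and a positive value with repression; a real analysis would also need replicates and a statistical test.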
Scientific Impact
By combining data on how genes are regulated and how yeast cells process nutrients, the FAIRY project will improve our understanding of how transcription factors control gene expression and the resulting phenotypic outcome. It will also allow scientists to predict how these regulatory events affect the cell’s metabolism, which is important for optimising yeast-based production processes towards a circular bioeconomy. The project will help researchers identify which genes are essential for the cell's survival, not just in metabolism but also in gene regulation. This will expand current tools used for identifying potential drug targets. Typically, each user has to perform these calculations themselves. The FAIRY project will pre-calculate these outcomes for a range of yeast species (currently limited to S. cerevisiae and C. albicans) and store them in a central database. By offering ready-made predictions, the project will make it easier for scientists to reuse the data, speeding up research on designing better yeast strains for industry and on finding new antifungal drug targets.
Principal investigator
Pedro T. Monteiro is an Associate Professor of Algorithms at the CS Department of IST-Universidade de Lisboa and a Researcher at INESC-ID Lisboa.
His current research interests include formal and static analysis of qualitative models of biological regulatory networks. In particular, he uses formal verification techniques to explore interesting dynamical behaviours of qualitative biological models.
He is also deeply involved in the continuous maintenance and development of http://yeastract-plus.org, a portal of repositories of regulatory associations of several yeast species.