Metabolomics services

Name Description ELIXIR Node
ELIXIR Netherlands

Cellular and molecular biology are fundamental to ELIXIR's mission. As part of our 2024–28 Programme, we are committed to advancing data services and software for research on nucleic acids, proteins and other biomolecules. This initiative will address new demands for multi-omics and multi-modal analyses, including imaging, by developing methods and partnerships. We will also expand expertise in reusable data and software to incorporate FAIR models, ensuring robust solutions for modelling at all scales. 

The following projects are key to connecting the latest developments with established data resources, unlocking the potential of cellular and molecular biology:

  • Advancing structural and functional ontologies of disordered proteins 
  • DBTLHub: Towards a one-stop shop for connecting databases, datasets and tools for the Design-Build-Test-Learn cycle in biotechnology 
  • Spatial2Galaxy: There is no Galaxy without Space 
  • Next level of reproducible, comparable and integrable Metabolomics

Advancing structural and functional ontologies of disordered proteins

This project addresses the limitations of current ontologies in capturing the dynamic nature of disordered protein regions by pursuing several primary objectives. Firstly, novel structural and functional ontologies will be developed to accurately represent the structural heterogeneity and dynamic functional annotations of proteins. These ontologies will incorporate timescales, annotating the kinetics of structural transformations to elucidate molecular mechanisms and regulatory pathways governing protein dynamics.

Collaborating with existing databases and consortia will ensure seamless integration of ontological resources and experimental data, fostering interoperability and accelerating discoveries. A standardised file format specification will also be developed in collaboration with the Human Proteome Organisation Proteomics Standards Initiative, facilitating the encoding of structural state transitions within disordered protein regions. This specification will enhance data interoperability and exchange among research groups and databases, providing a common language for describing structural transitions and advancing our understanding of the functional implications of protein dynamics in biological systems.

Nodes involved: ELIXIR Belgium, ELIXIR Hungary, ELIXIR Italy, EMBL-EBI
Communities: 3D BioInfo, Intrinsically Disordered Proteins

DBTLHub: Towards a one-stop shop for connecting databases, datasets and tools for the Design-Build-Test-Learn cycle in biotechnology

This project aims to strengthen the basis for a one-stop shop connecting databases, datasets and tools for the deployment of the engineering Design-Build-Test-Learn (DBTL) framework in biotechnology. It will do so by surveying the tools and data landscape, pinpointing gaps and opportunities, and establishing design patterns for task-specific workflows for analysis, integration and sharing of multimodal data.

It will provide a resource that will allow users to navigate the complex landscape of biotechnology tooling and data, as well as to establish solutions that fit their specific DBTL requirements. Use cases from ongoing programmes in various communities will be used to ascertain and establish the pragmatic value of the solutions. 

The work will be carried out through hands-on activities, dedicated workshops and hackathons, providing training and resources as well as fostering industrial engagement. The experience and industrial relations of the communities and platforms involved in systems biology, industrial biotechnology, metabolic modelling, metabolomics, enzymes, bioprospecting and data management will be particularly valuable in this respect. Accordingly, the project engages participants from seven ELIXIR nodes and connects researchers and their activities from six communities.

The project outcomes will contribute to advancing the ambition of connecting the latest developments and established data resources across ELIXIR to realise the potential of cellular and molecular biology, particularly in the fields of industrial biotechnology and biomanufacturing.

Nodes involved: ELIXIR Spain, ELIXIR Greece, ELIXIR France, ELIXIR Netherlands, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR UK
Communities: Biodiversity, Microbiome, Metabolomics, Microbial Biotechnology, Research Data Management, Systems Biology

Spatial2Galaxy: There is no Galaxy without Space

Spatial transcriptomics (ST) was named ‘Method of the Year 2020’ by Nature Methods and was more recently featured in Nature’s ‘Seven technologies to watch in 2024’. ST is now a prerequisite for researching transcriptional pathology at the cellular and molecular levels. It is widely applied across disease areas, including neurodegenerative disease, cancer, cardiomyopathy and nephrology, and there are emerging applications in plant and microbiome research. While there is a plethora of spatial analysis applications, these are not unified or easily manageable by research scientists, and they are poorly placed to deliver FAIR and reproducible results.

To address this challenge, we will implement Spatial2Galaxy (S2G), a self-contained, reproducible, scalable and FAIR spatial transcriptomics analysis platform for researchers and bioinformaticians alike. We will develop S2G based on our success with developing Galaxy workflows, training materials, and ST and single-cell analysis pipelines.

S2G will provide state-of-the-art ST tools and workflows with proven high performance in benchmarking studies, ensuring the uptake of best practices. These tools will be demonstrated on datasets that connect various ST databases. This will consolidate community guidelines for integrative multi-modal single-cell omics and imaging analysis. For non-spatial single-cell sequencing, named ‘Method of the Year 2013’ by Nature Methods, it took six years until practical training and workflows for its analysis were FAIRified and available in Galaxy, in 2019. S2G aims to shorten this gap between a technology becoming relevant and the provision of FAIR resources to the life science community for ST.

Nodes involved: ELIXIR Germany, ELIXIR France, ELIXIR Netherlands, ELIXIR UK
Communities: Cancer Data, Galaxy, Human Copy Number Variation, Single-Cell Omics

Next level of reproducible, comparable and integrable Metabolomics

The ELIXIR metabolomics community relies on the development and adoption of standards, formats and data treatment solutions. However, it remains challenging to ensure high-quality reported metadata, sufficiently contextualised results, and interoperable and reusable datasets, and to integrate metabolomics data with other omics data or studies.

This project is designed to address these issues and aims to connect key international standards with ELIXIR resources, as well as creating associated community guidelines and training materials. 

Based on the FAIRification framework, activities in the project will: i) increase interoperability and reuse of public metabolomics datasets and workflows through enhanced and extended open data standards, resources and new semantic annotations, ii) define, ensure and establish quality control for study baselines in Metabolomics and Exposomics, and iii) facilitate metabolomic data interpretation and meta-analysis integration with multi-omics and systems biology studies. 

As a first necessary step, the project will create a Semantic Metabolomics Data Model to standardise metadata, ensuring unambiguous reuse of metabolomics projects. This model will focus on integrating key ontologies, providing open training initiatives and enhancing the interoperability of metabolomics data through the production of open guidelines for annotation steps. By linking with ELIXIR’s Deposition databases, ISA Framework and other services, the project seeks to boost interconnection with ELIXIR platforms, other ELIXIR communities (Systems Biology, Food and Nutrition, Galaxy, Proteomics, Toxicology, Research Data Alliance Focus Group ...), and the FAIR Cookbook and BioSchemas.org communities. Project outcomes are expected to promote the emergence of ambitious and innovative semantic-based solutions for inter-comparison of studies in healthcare, clinical and plant domains.

Nodes involved: ELIXIR Czech Republic, ELIXIR Germany, ELIXIR Italy, ELIXIR Spain, ELIXIR France, ELIXIR Netherlands, ELIXIR Sweden, ELIXIR UK, EMBL-EBI
Communities: Food and Nutrition, Galaxy, Metabolomics, Proteomics, Research Data Management, Single-Cell Omics, Systems Biology, Toxicology

ELIXIR Belgium, ELIXIR Czech Republic, ELIXIR France, ELIXIR Greece, ELIXIR Hungary, ELIXIR Italy, ELIXIR Netherlands, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR UK, EMBL-EBI

This project will be led by the ELIXIR Proteomics Community in collaboration with members of the Metabolomics Community and three ELIXIR platforms. High-throughput proteomics has become a popular choice in biological, biomedical and clinical studies and has led to the development of hundreds of bioinformatics tools and data analysis pipelines. Given their large diversity, there is an urgent need to compare and benchmark different software pipelines over a large data spectrum.

This study aims to create a framework for benchmarking proteomics data analysis workflows. The framework will build upon and improve resources from the ELIXIR Tools, Data and Compute platforms by creating an interface between them that is linked with public proteomics data and open source stand-alone software and pipelines.

The data involved will be annotated with at least the EOSC minimum information according to ELIXIR metadata standards. Our benchmarking will identify robust workflows and thereby provide the proteomics community with the high quality standards required for reproducible research and clinical applications.

ELIXIR Denmark, EMBL-EBI, ELIXIR Netherlands, ELIXIR Spain, ELIXIR France, ELIXIR Sweden, ELIXIR Italy, ELIXIR Czech Republic, ELIXIR Germany

The aim of this Implementation Study is to determine the requirements for validation with ELIXIR partners and to build prototype open validation services for archetype archival databases and knowledge bases, in particular:

  • Content validation according to minimum information checklists.
  • Syntactic format validation according to a standard format in conjunction with the GA4GH file formats team as part of the Large Scale Genomics Workstream.
  • Syntactic format validation for Phenotyping data.
  • Semantic validation according to a publicly available ontology.
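
As a rough illustration of how the three kinds of validation listed above differ, the following Python sketch checks a single metadata record. The checklist fields, the date format rule, the small set of accepted terms and the function name `validate_record` are assumptions made for this example; they are not taken from the study or from any ELIXIR service.

```python
import re

# Hypothetical minimum information checklist: fields every record must carry.
REQUIRED_FIELDS = {"sample_id", "organism", "collection_date"}

# Hypothetical ontology subset: accepted term identifiers mapped to labels.
ALLOWED_ORGANISM_TERMS = {
    "NCBITaxon:9606": "Homo sapiens",
    "NCBITaxon:3702": "Arabidopsis thaliana",
}

# Syntactic rule assumed for this sketch: calendar dates written as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_record(record):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []

    # Content validation: every checklist field is present and non-empty.
    for field in sorted(REQUIRED_FIELDS):
        if not record.get(field):
            errors.append(f"content: missing required field '{field}'")

    # Syntactic validation: the value matches the expected format.
    date = record.get("collection_date")
    if date and not DATE_PATTERN.fullmatch(date):
        errors.append(f"syntax: collection_date '{date}' is not YYYY-MM-DD")

    # Semantic validation: the value is a term from the chosen ontology.
    organism = record.get("organism")
    if organism and organism not in ALLOWED_ORGANISM_TERMS:
        errors.append(f"semantics: organism '{organism}' is not a known term")

    return errors
```

A record can pass one level and fail another: a free-text organism such as "human" is present (content) and is a plain string (syntax), but it is not a term from the ontology (semantics).
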
ELIXIR Belgium, ELIXIR France, EMBL-EBI, ELIXIR UK
ELIXIR Belgium, ELIXIR Cyprus, ELIXIR Czech Republic, ELIXIR Denmark, ELIXIR Estonia, ELIXIR Finland, ELIXIR France, ELIXIR Germany, ELIXIR Greece, ELIXIR Hungary, ELIXIR Ireland, ELIXIR Israel, ELIXIR Italy, ELIXIR Luxembourg, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI
ELIXIR Greece, ELIXIR Netherlands, ELIXIR Spain

ELIXIR is about integration of diverse resources including tools, training materials and technical services. Within EXCELERATE, ELIXIR is building portals to collate information on tools and data services (bio.tools), training events and material (TeSS, WP11 e-learning environment), compute resources (WP4 technical service registry) and cross-linked policy, standards and databases (FAIRsharing, WP4). A focus of EXCELERATE is to set up these portals such that they can interoperate.

Currently, a scientist can use TeSS to find training events and materials and then, in a separate search, use bio.tools to find relevant tools, and FAIRsharing to find standards and databases. At the moment these ELIXIR portals provide a useful, but fragmented service. Ideally, linking TeSS and bio.tools to ELIXIR’s compute resources via common workflow diagrams would enable end-users to discover and learn about the prevalent bioinformatics workflows. In this implementation study, we want to achieve the first step: to link TeSS and bio.tools via the most prevalent bioinformatics workflows, and to lay the foundation for later incorporating other ELIXIR platforms, such as the compute resources, to provide an even more useful service for the researcher.

The goal of this implementation study is to provide the life-scientist end-user with a powerful tool to find and use ELIXIR resources - across the spectrum - based on intuitive graphical diagrams of the most prevalent scientific workflows.
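
One way to picture the linking described above is as a lookup keyed by workflow step. The Python sketch below is illustrative only: the step labels, the resource names and the `Resource` and `resources_by_step` identifiers are hypothetical, and it does not reflect how TeSS or bio.tools actually store or expose their records.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A tool or training entry, tagged with the workflow steps it covers."""
    name: str
    portal: str                       # "bio.tools" or "TeSS" in this sketch
    steps: set = field(default_factory=set)


# Hypothetical workflow, written as an ordered list of step labels.
WORKFLOW = ["quality control", "read mapping", "variant calling"]

# Hypothetical registry entries; none of these names is a real record.
RESOURCES = [
    Resource("ExampleAligner", "bio.tools", {"read mapping"}),
    Resource("ExampleQC", "bio.tools", {"quality control"}),
    Resource("Intro to mapping (course)", "TeSS", {"read mapping"}),
    Resource("Variant calling tutorial", "TeSS", {"variant calling"}),
]


def resources_by_step(workflow, resources):
    """For each workflow step, collect matching entries from both portals."""
    linked = {}
    for step in workflow:
        linked[step] = {
            portal: [r.name for r in resources
                     if r.portal == portal and step in r.steps]
            for portal in ("bio.tools", "TeSS")
        }
    return linked
```

Calling `resources_by_step(WORKFLOW, RESOURCES)` returns, for every step in the diagram, the tools and the training materials side by side, which is the single-search experience the study is aiming for. A step with an empty list shows where a registry has a gap.
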

ELIXIR UK, ELIXIR Estonia, ELIXIR Belgium, ELIXIR Denmark, ELIXIR Switzerland, EMBL-EBI, ELIXIR Norway, ELIXIR France

Metabolomics aims to provide novel insights into the biochemical reactions of organisms by characterising the presence and concentrations of low molecular weight compounds from biological samples. The primary analytical tools for such high-throughput data collection are mass spectrometry (MS), often preceded by chromatographic or electrophoretic separation technologies, and nuclear magnetic resonance spectroscopy (NMR).

These technologies produce relatively large and complex data sets that require bioinformaticians, cheminformaticians, biostatisticians, data scientists and computer scientists. Together they develop and apply a wide range of algorithms, software tools, repositories and computational resources to process, analyse, report and store the data and metadata.

Increasingly, insights from genomics, epigenomics, transcriptomics, proteomics/protein interactomics and metabolomics are combined to gain insights into the dynamics of biological processes. Metabolomics activities are well represented within Europe and ELIXIR nodes. The community believes that metabolite identification is the area where computational metabolomics and metabolomics data management will have maximal impact, which will benefit most from interactions with the five existing ELIXIR platforms, and where progress will contribute most to other ELIXIR communities.

Progress made through this integrative Implementation Study will benefit industry and academia alike, as metabolite identification is one of the major bottlenecks in metabolomics and resolving this challenge requires a community effort.

ELIXIR Netherlands, EMBL-EBI, ELIXIR France, ELIXIR UK, ELIXIR Germany, ELIXIR Spain, ELIXIR Sweden, ELIXIR Italy, ELIXIR Estonia, ELIXIR Switzerland, ELIXIR Belgium

This Metabolomics Community-led project on the standardization of fluxomics workflows aims at:

  • establishing standards for isotopic labeling data deposition, a major fluxomics input, and accordingly extending MetaboLights (EMBL-EBI), the reference database for quantitative metabolomic datasets;
  • establishing interoperability among widely used fluxomic tools, building upon the PhenoMeNal fluxomic tool inventory;
  • extending BioSchemas to metabolic reactions and their dynamics, using the metabolic reaction database Rhea (SIB) for metabolic model reconstruction;
  • containerizing the fluxomic workflow for use in a cloud-based environment; and
  • standardizing the fluxomics training.

The study is aligned with the Human Data and Plant Science Use Cases, the existing Proteomics and Galaxy Communities, and the suggested Toxicology, Nutrition and Microbial Biotechnology Communities.

Background

Metabolic reaction rates (fluxes) provide a measure of in vivo enzymatic activities that cannot be obtained directly from transcriptomic, proteomic or metabolomic data alone, even when these are extended with isotopic labeling measurements.

However, flux distribution maps, and through them metabolic network dynamics, can be revealed when these data are analyzed together with regulatory information using multi-level and multi-scale models. In this context, fluxomics is an integral part of the bioinformatics and systems biology toolbox. It has significant applications in industrial biotechnology, metabolic or protein engineering, nutritional systems biology, toxicology, precision agriculture and crop improvement, and network and systems medicine for the investigation of (patho)physiological mechanisms of complex diseases.

A successful fluxomic analysis depends on the accuracy of quantitative metabolomic data (extra- and intra-cellular) and isotopic labeling measurements, and on the reconstruction of metabolic networks that describe the stoichiometry (and, when available, the regulation) of metabolic reactions. To date, the community lacks standardized isotopic labeling data repositories, interoperability among the fluxomic tools and harmonized fluxomic training workflows. In this context, the Metabolomics Community decided to focus its second implementation study on the standardization of fluxomic workflows.
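
To make the role of stoichiometry concrete, here is a minimal sketch in Python of the standard steady-state mass balance used in flux analysis: for a stoichiometric matrix S and a flux vector v, every internal metabolite is balanced when S·v = 0. The toy network, the reaction names and the flux values are invented for illustration and do not come from the study.

```python
# Toy network (hypothetical): uptake -> A, A -> B, A -> C, B -> out, C -> out.
REACTIONS = ["uptake", "A_to_B", "A_to_C", "B_out", "C_out"]
METABOLITES = ["A", "B", "C"]

# Stoichiometric matrix S: one row per metabolite, one column per reaction.
# An entry is +1 where the reaction produces the metabolite, -1 where it
# consumes it, and 0 where the metabolite does not take part.
S = [
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0, -1,  0],   # B
    [0,  0,  1,  0, -1],   # C
]


def net_production(S, fluxes):
    """Net production rate of each metabolite, i.e. the product S * v."""
    return [sum(coeff * v for coeff, v in zip(row, fluxes)) for row in S]


def is_steady_state(S, fluxes, tol=1e-9):
    """True when no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(S, fluxes))
```

With fluxes `[10, 6, 4, 6, 4]` the network is balanced. Note that `[10, 3, 7, 3, 7]` is balanced as well: the mass balance alone does not fix how A is split between the two branches, which is why additional measurements such as isotopic labeling data are needed to pin down a flux distribution.
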

Goals

Standardization of the fluxomic workflow requires (a) standardization and FAIRification of the quantitative metabolomic and isotopic labeling data input, (b) standardized reconstruction of metabolic models, based on metabolic reaction databases and ontologies, extended with regulatory information, and (c) interoperability of the various fluxomics tools and the metabolomic and ontology databases. The standardized workflow could be containerized to work in a cloud-based environment. In this implementation study, these objectives will be pursued through the following specific aims:

  1. To establish standard rules for the deposition of isotopic labeling data and accordingly extend the MetaboLights database (EMBL-EBI), established as the reference repository for quantitative metabolomic data. This work will build upon ongoing efforts at EMBL-EBI and the ES node in collaboration with the Data, Tools and Interoperability Platforms.
  2. To identify widely used fluxomic tools and establish their interoperability, building upon the PhenoMeNal fluxomic tool inventory, in collaboration with the ELIXIR Tools and Interoperability platforms. This implementation study will focus exclusively on open source software, in alignment with the general recommendations of ELIXIR.
  3. To establish interoperability between the databases (quantitative metabolomic and isotopic labeling, metabolic reaction, ontologies, protein, signaling networks) and the fluxomic tools for the accurate reconstruction of relevant metabolic models, and to extend BioSchemas to metabolic reactions and their dynamics, in collaboration with the Interoperability platform. Rhea (SIB), to be linked to UniProt by the end of 2018, will be used as the reference metabolic reaction information resource.
  4. To containerize the standardized fluxomic workflow for use in a cloud-based environment, in interaction with the Compute platform.
  5. To standardize the fluxomic training workflow and organize webinars and a summer school based on the guidelines and best practices as described in the ELIXIR Training Toolkit, in close collaboration with the Training platform.

ELIXIR Greece, ELIXIR Spain, ELIXIR Netherlands, ELIXIR Belgium, ELIXIR Italy, ELIXIR Germany, ELIXIR France, ELIXIR UK, EMBL-EBI, ELIXIR Finland, ELIXIR Sweden, ELIXIR Estonia, ELIXIR Switzerland