ELIXIR

Bioschemas: Community Adoption and Training

Bioschemas (http://bioschemas.org) is a community initiative that aims to improve data discoverability in the life sciences. It seeks better exposure of our data repositories, including the ELIXIR Core and Node Data Resources, to generic search engines such as Google and to domain-specific registries such as Identifiers.org, FAIRsharing.org, and DataMed. It does this by encouraging content providers in the life sciences to use Schema.org markup to expose consistent structured data on their websites.
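As a rough illustration of what Schema.org markup on a web page can look like, the sketch below builds a JSON-LD description of a dataset and wraps it in the script tag that search engines read. The helper names (dataset_jsonld, as_script_tag) and all example values are hypothetical, and the sketch uses only generic Schema.org Dataset properties; it is not an implementation of any specific Bioschemas profile.

```python
import json


def dataset_jsonld(name, description, url, identifier):
    """Build a Schema.org Dataset description as a JSON-LD dictionary.

    Only generic Schema.org properties are used here; a real Bioschemas
    profile may require or recommend additional ones.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
    }


def as_script_tag(markup):
    """Wrap JSON-LD markup in the script element embedded in an HTML page."""
    body = json.dumps(markup, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = dataset_jsonld(
        name="Example protein dataset",
        description="A placeholder dataset record used to illustrate markup.",
        url="https://example.org/datasets/1",
        identifier="example:1",
    )
    print(as_script_tag(example))
```

Placing the resulting script element in a page's HTML lets a crawler read the same fields from every resource that follows the convention, which is the kind of consistency the initiative is after.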

Integrating reference taxonomic databases for metabarcoding and metagenomics identification

Comparing environmental sequences to reference sets from curated marker loci is a mainstay of taxonomic analysis of microbial communities. Microbial eukaryotic sequencing requires many distinct reference sets to cover diversity adequately. The groups producing reference sets follow different curation workflows, but they share the need to pass their data on to a common set of tools and services, such as EMG, Megan, MetaPIPE and BioMaS.

There are multiple inefficiencies:

Increasing Interoperability between ELIXIR Protein Structure and Sequence Resources and Expanding these Resources with 3D-Models of CATH Domains, built by SWISS-MODEL

This project will increase interoperability between four ELIXIR resources (CATH, SWISS-MODEL, InterPro and PDBe), three of which are Core Resources, by building APIs that facilitate the import and export of data between them.

Extending open proteomics data analysis pipelines in the cloud: Additional tools and focus on scalability, supporting the dramatic growth of public proteomics data

An ELIXIR implementation study started in February 2017 as a collaboration between EMBL-EBI and ELIXIR-DE. Its main objective is to develop open, robust, scalable and reproducible proteomics data analysis workflows based on OpenMS, directly connected to the PRIDE database (an ELIXIR Core Data Resource), and to deploy these pipelines in the EMBL-EBI "Embassy Cloud" as a proof of concept.

Building on this work, we here propose a follow-up project that has three objectives:

Integration and standardization of intrinsically disordered protein data (2018-IDPs)

Intrinsically disordered proteins (IDPs), characterized by high conformational variability, cover almost a third of the residues in eukaryotic proteomes. As major players in cellular regulation, IDPs are involved in numerous diseases.

Specialized IDP databases provide a starting point for analysis, yet their integration into core databases remains very limited. Here, we propose to start integrating IDP information into ELIXIR Core Data Resources.

FAIRness of the current ELIXIR Core resources: Application (and test) of newly available FAIR metrics, and identification of steps to increase interoperability (2018-FAIRCDR)

The FAIR (Findable, Accessible, Interoperable and Reusable) principles aim to maximize the discovery and reusability of digital resources. While the principles have enjoyed rapid uptake across communities (ELIXIR, G20, EOSC, H2020, NIH), the implementation details remain unclear.

Establishment of an ELIXIR Contextual Data Clearinghouse

The objective is to develop and deploy an “ELIXIR Contextual Data Clearinghouse (clearinghouse)” for extending, correcting and improving publicly available annotations on records in sample and sequencing data resources.

Apple as a Model for Genomic Information Exchange

Apple is one of the most famous fruits globally and occupies a central position in folklore, culture, and art. Apple cultivars have retained high genetic and phenotypic diversity, evidenced by the high number of apple varieties cultivated today. The economic and cultural importance of apple has driven efforts to catalogue and exploit this genetic diversity, but few of these data are currently integrated into ELIXIR resources.
