BeYond-COVID (BY-COVID) aims to provide comprehensive open data on SARS-CoV-2 and other infectious diseases across scientific, medical, public health and policy domains. It will strongly emphasise mobilising raw viral sequences, helping to identify and monitor the spread of SARS-CoV-2 variants. The project will further accelerate access to SARS-CoV-2 and COVID-19 data and the linking of patient and research data.
To ensure interoperability of national and global efforts, BY-COVID will enable federated data analysis compliant with data protection regulations, harmonise and manage metadata and sample identifiers, and facilitate long-term cataloguing.
BY-COVID will build on the One-Health approach, exploit and contribute to the European Open Science Cloud, and work closely with the ISIDORe project funded through Horizon Europe.
The project will integrate established national and European infrastructures with ELIXIR, BBMRI, ECRIN, PHIRI and CESSDA. It will build on existing efforts, such as the COVID-19 Data Platform and the Versatile Emerging infectious disease Observatory project (VEO), maximising efficiency. It will also develop synergies with the European Health Data Space.
In an unprecedented and unique interdisciplinary effort, BY-COVID will bring together 53 partners from 19 countries and stakeholders from the biomedical field, hospitals, public health, social sciences and humanities.
Ultimately, it will improve European readiness for future pandemics and enhance genomic surveillance and rapid-response capabilities. In addition, BY-COVID serves as a demonstrator of interdisciplinary work across country borders. The project's outputs will allow scientists across multiple domains, including SMEs and industry, to access varied data with the potential to generate new knowledge on infectious diseases.
BY-COVID in numbers
The project work is divided into 8 Work Packages (WPs):
The Work Packages will mobilise SARS-CoV-2 and other infectious disease data (i.e. make it easier to transfer to a data repository), connect and standardise the data (to make data searchable via the COVID-19 Data Portal and provide data management protocols), and expose and analyse the data (provide standardised analysis methods).
Work Package 1 will establish and improve SARS-CoV-2 Data Hubs that, with globally comprehensive viral sequence and normative variation data sets, provide the foundation for linking genomic surveillance with heterogeneous data across domains. An open call will allow additional Data Hubs to be established, and a packaged, rapid-deployment "preparedness" Data Hub will address future pathogen outbreaks.
WP leaders: Guy Cochrane (EMBL-EBI), Clara Amid (Erasmus MC)
Work Package 2 brings together data resources and catalogues across domains and captures data governance and access procedures. It will align metadata descriptions and other relevant semantic information first within domains (e.g., biomolecular and imaging, clinical and health, survey) and, in a second stage (in alignment with WP3 developments), expose a reference catalogue with harmonised metadata descriptions across domains.
WP leaders: Alfonso Valencia/Salvador Capella-Gutierrez (BSC), Antje Keppler (EuroBioImaging)
Work Package 3 is focussed on services for the discovery, integration and citation of COVID-19 data by delivering a flexible, tiered metadata discovery system across different domains, metadata standards, and maturity/robustness levels of data sources. This will enable the linking of FAIR data and metadata on SARS-CoV-2 and COVID-19, other infectious diseases and related data, and ultimately increase the potential for collaboration and exploitation of data.
WP leaders: Henning Hermjakob (EMBL-EBI), Mari Kleemola (CESSDA/TAU-FSD)
Work Package 4 will develop, aggregate and integrate tools for analysis and visualisation of data in the COVID-19 Data Platform. It will provide a provenance framework for tracking derived data and for supporting the transparency and trustworthiness of results, which will ultimately improve trust in science. Researchers will be able to exploit the large volumes of data for tasks such as identification of variants of concern.
WP leaders: Frederik Coppens (VIB), Petr Holub (BBMRI-ERIC)
Work Package 5 will demonstrate usability of BY-COVID services across disciplines and national borders through continuously evolving demonstrator projects. It will assess viral variants and disease outcomes using real-world data, assess the effectiveness of vaccines against new variants using retrospective clinical trial data, and improve the understanding of the mechanistic determinants of variant responses.
WP leaders: Nina Van Goethem (Sciensano), Enrique Bernal Delgado (IACS)
Work Package 6 will focus on stakeholder engagement, both bottom-up, by facilitating knowledge exchange on setting up surveillance systems for pathogens, and top-down, by shaping the policy landscape (EOSC, EHDS, and more broadly intergovernmental organisations and funders). Importantly, WP6 coordinates the training and capacity-building that takes place across WPs, ensuring alignment and visibility with stakeholders.
WP leaders: Patricia Palagi (SIB), Corinne S. Martin (ELIXIR Hub)
Work Package 7 will develop and implement the project-wide communications and outreach strategy, ensuring that key stakeholders, including scientists, industry and SMEs, policy makers and the public, are aware of the project and have opportunities to engage, use project outputs and provide feedback to partners.
WP leaders: Andy Smith (ELIXIR Hub), Katharina Lauer (ELIXIR Hub)
Work Package 8 will oversee the project execution. It will ensure effective coordination so that the project goals, benefits and expected impact are delivered within time, scope, and budget. It will also address the implementation of ELSI (ethical, legal and social implications) aspects, and the data management and sustainability plan.
WP leaders: Juan Arenas Marquez, Andrea Troncoso (ELIXIR Hub)
The European COVID-19 Data Platform
Rapid and open sharing of data greatly accelerates research and discovery, which is essential to respond to the COVID-19 pandemic. The European Commission, EMBL’s European Bioinformatics Institute (EMBL-EBI) and ELIXIR, together with EU Member States and other research partners, operate a dedicated European COVID-19 Data Platform. This Platform enables the rapid collection and comprehensive sharing of available research data from different sources for the European and global research communities.
The European COVID-19 Data Platform consists of three connected components:
SARS-CoV-2 Data Hubs: can be used to organise SARS-CoV-2 genomic sequence data at national, regional and institute levels. Upon release, this data can then be openly shared with the global research community via the COVID-19 Data Portal.
Federated European Genome-phenome Archive: provides secure, controlled-access sharing of sensitive patient and COVID-19 research data.
COVID-19 Data Portal: brings together tools and SARS-CoV-2 viral and host datasets spanning genomics, proteins, imaging, drug compounds and publications, enabling researchers to access and analyse integrated COVID-19 data easily.
The design of the BY-COVID project rests on this foundation, aiming to connect well-established data resources and deliver access to heterogeneous yet interlinked and organised data across domains and jurisdictions via the COVID-19 Data Platform components. Over the course of the project, emerging national data portals will be linked to the COVID-19 Data Platform, thus establishing a federated digital space for infectious disease data.