October 2021 Kick-off

October 2024

BY-COVID runs from October 2021 to October 2024. The ultimate outcome of the project is that SARS-CoV-2 and other infectious disease data will be easier to access, share and analyse. This will enable the world to respond more quickly to infectious disease outbreaks. During the project there will also be specific outputs such as publications and deliverables. Deliverables will include reports and best practice guidelines. These outputs will appear here as the project progresses.


26 May 2022 | Data and Policy

COVID-19: An exploration of consecutive systemic barriers to pathogen-related data sharing during a pandemic

In this paper we report the results of a study in which we interviewed data professionals working with COVID-19-relevant data types, including social media, mobility, viral genome, testing, infection, hospital admission, and deaths.

24 Mar 2022 | Molecular Biology and Evolution

Selection analysis identifies clusters of unusual mutational changes in Omicron lineage BA.1 that likely impact Spike function

This paper shows that, based on both the rarity of the 13 mutations in intrapatient sequencing reads and patterns of selection at the codon sites where the mutations occur in SARS-CoV-2 and related sarbecoviruses, prior to the emergence of Omicron the mutations would have been predicted to decrease the fitness of any virus within which they occurred.

24 Mar 2022 | PLOS Computational Biology

10 Simple Rules for making a software tool workflow-ready

Workflows have become a core part of computational scientific analysis in recent years. This paper presents 10 simple rules for how a software tool can be prepared for workflow use.

04 Jan 2022 | Data Science

Packaging research artefacts with RO-Crate

The aim of this paper is to introduce RO-Crate (an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine readable manner) and assess it as a strategy for making multiple types of research artefacts FAIR.

23 Dec 2021 | ResearchSquare

Unique SARS-CoV-2 variant found in public sequence data of Antarctic soil samples collected in 2018-2019

This paper investigates whether samples in the large public sequencing data archives, collected before the earliest known cases of the pandemic, might contain traces of SARS-CoV-2.

06 Dec 2021 | Figshare

The response of the scholarly communication system to the COVID-19 pandemic

This paper analyses how the scholarly communication system – involving the production, evaluation, and dissemination of research outputs – has responded to this crisis, focusing on the period until mid-2021.

03 Nov 2021 | Zenodo

FAIR, ethical, and coordinated data sharing for COVID-19 response

Data sharing is central to the rapid translation of research into advances in clinical medicine and public health practice. This paper is a review of COVID-19 data sharing platforms and registries.

29 Sep 2021 | Nature Biotechnology

Ready-to-use public infrastructure for global SARS-CoV-2 monitoring

This paper presents the COVID-19 effort by the Galaxy Project, which pools free worldwide public computational infrastructure, making the analysis of deep sequencing data accessible to anyone while also providing an analytical framework for global pathogen genomic surveillance based on raw sequencing-read data.

Deliverables and Milestones

ID | Date | Title | Work package
D8.1 | 11/21 | Project Handbook initial release and periodic updates | WP8
D9.4 | 12/21 | AI - Requirement No. 5 | WP8
D9.5 | 12/21 | OEI - Requirement No. 6 | WP8
D3.1 | 03/22 | Metadata standards | WP3
D7.1 | 03/22 | Dissemination, exploitation and communication Plan | WP7
D8.2.1 | 02/22 | Project Data Management Plan initial release and periodic updates | WP8
D9.1 | 03/22 | H - Requirement No. 1 | WP8
D9.2 | 02/22 | HCT - Requirement No. 2 | WP8
D9.3 | 03/22 | POPD - Requirement No. 3 | WP8
D2.1 | 06/22 | Initial data and metadata harmonisation at domain level to enable fast responses to COVID-19 | WP2
D1.1 | 09/22 | Extended workflows | WP1
D3.2 | 09/22 | Tiered indexing system | WP3
D8.2.2 | 12/22 | Project Data Management Plan initial release and periodic updates | WP8
D8.1.2 | 03/23 | Project Handbook initial release and periodic updates | WP8
D2.2 | 06/23 | Data Access and Transfer across research domains and jurisdictions | WP2
D1.2 | 09/23 | Preparedness Data Hub | WP1
D3.3.1 | 09/23 | COVID-19 Data Portal | WP3
D7.3 | 09/23 | Report on public engagement activities | WP7
D5.3 | 11/23 | Hot Spot detection, samples data collection and mechanistic analyses | WP5
D1.3 | 03/24 | Tracking and open analytics tools | WP1
D2.3 | 03/24 | Enabling data discovery at source using beacon-like mechanisms | WP2
D4.3 | 03/24 | Provenance model | WP4
D7.2 | 03/24 | Public report showcasing industry value from infectious disease data | WP7
D4.2 | 04/24 | Common analysis environment | WP4
D5.1 | 05/24 | Enriched report viral variants and health outcomes | WP5
D4.1 | 06/24 | Infectious diseases toolkit | WP4
D3.3.2 | 07/24 | COVID-19 Data Portal | WP3
D6.1 | 07/24 | Stakeholder engagement report | WP6
D6.2 | 07/24 | The training efforts report | WP6
D8.2.3 | 07/24 | Project Data Management Plan initial release and periodic updates | WP8
D8.3 | 07/24 | Report on sustainability plans | WP8
D2.4 | 09/24 | Report on data sources discovery and integration for enabling data use and re-use in response to future outbreaks | WP2
D5.2 | 09/24 | Secondary use of vaccine trial data and biosamples | WP5
D8.1.3 | 09/24 | Project Handbook initial release and periodic updates | WP8
M7.1 | 10/21 | Branding and communications guidelines | WP7
M7.2 | 11/21 | Launch of project website | WP7
M8.1 | 11/21 | Project mobilised. All governing boards and WPs established | WP8
M8.2 | 02/22 | DMP approved by the relevant project boards before submission | WP8
M1.1 | 03/22 | First support services in operation | WP1
M2.1 | 03/22 | Identified data sources have been registered in the BY-COVID reference catalogue | WP2
M5.1 | 02/22 | Compiled research questions and requirements Workshop 1 | WP5
M6.1 | 03/22 | Stakeholder engagement (initial scoping and draft monitoring approach) | WP6
M5.4 | 09/22 | FAIR open-source pipeline | WP5
M6.2 | 09/22 | Identified training needs and roadmap | WP6
M7.3 | 09/22 | Industry sector mapping report | WP7
M4.1 | 09/22 | Common analysis environment | WP4
M4.2 | 09/22 | Prototype Infectious diseases toolkit | WP4
M2.2 | 01/23 | Identified the preferred mechanisms for data access and use of Real-world data | WP2
M1.2 | 03/23 | First globally comprehensive data set | WP1
M3.1 | 03/23 | Initial set of resources metadata mapped, indexed, and discoverable in COVID-19 Data Portal | WP3
M5.2 | 03/23 | Compiled research questions and requirements Workshop 2 | WP5
M5.5 | 03/23 | Viral variant and health outcomes | WP5
M5.3 | 03/24 | Compiled research questions and requirements Workshop 3 | WP5
M2.3 | 07/24 | Report on upgrade of clinical trial data and metadata | WP2