The FAIRsharing BY-COVID Collection of data sources
22 June 2023
The COVID-19 Data Portal makes it possible for researchers to access and integrate a broad range of COVID-19 and SARS-CoV-2 data. BY-COVID has developed tools, workflows, documentation and training to support the incorporation of additional resources from many different research disciplines.
A key initial step involves registering data sources and uploading their key characteristics to FAIRsharing, a cross-disciplinary resource that maps and interlinks databases, standards and policies. FAIRsharing enables efficient onboarding of new data sources and provides a means of making them more discoverable in the European Open Science Cloud (EOSC) ecosystem.
The BY-COVID FAIRsharing Collection is a catalogue and knowledge graph of data sources and their characteristics, including access terms, protocols and the standards used to represent the data and metadata (Figure 1). The Collection currently contains 20 data sources (Table 1), developed by BY-COVID members, spanning social sciences and humanities, health and clinical data, images, genomic and phenotypic data, and chemical biology.
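To make the knowledge-graph idea concrete, here is a minimal Python sketch of how data sources and their characteristics could be linked as typed edges. It is an illustration only and does not reflect FAIRsharing's actual data model: the two resource names come from Table 1, while the relation names (`implements`, `follows`) and the standard and policy nodes are hypothetical placeholders.

```python
# Typed edges: (subject, relation, object).
# ASSUMPTION: only the resource names come from Table 1; the relations and
# the "example-..." nodes are hypothetical placeholders for illustration.
edges = [
    ("ChEMBL", "implements", "example-metadata-standard"),
    ("ChEMBL", "follows", "example-access-policy"),
    ("BioImage Archive", "implements", "example-metadata-standard"),
]


def build_incoming_index(triples):
    """Map each object node to the (relation, subject) pairs pointing at it."""
    incoming = {}
    for subject, relation, obj in triples:
        incoming.setdefault(obj, []).append((relation, subject))
    return incoming


incoming = build_incoming_index(edges)


def sources_using(standard):
    """Return the data sources linked to a standard by an 'implements' edge."""
    pairs = incoming.get(standard, [])
    return sorted(subject for relation, subject in pairs if relation == "implements")


print(sources_using("example-metadata-standard"))
```

Storing relationships as edges rather than as fields on each record is what lets a query start from a standard and find every data source that uses it.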
Making a range of infectious disease data sources widely discoverable, accessible and interoperable is important for research and innovation, which is increasingly multidisciplinary in nature. For example, pathogen research is accelerated by the availability of data from clinical trials, biobanks, and behavioural and socioeconomic studies, particularly if the data is combined with host and pathogen omics information. Many of these data types, for example clinical records or bioactivity data, may contain high-resolution images, the availability of which extends the range of research questions that can be explored.
Multidisciplinary data is also critical for public health decision-making, where policy questions are complex and evidence from biomolecular research, clinical studies and social sciences must be taken into account. One lesson from the COVID-19 pandemic was that data-driven decision-making needs high-quality, real-time data from many research disciplines and geographic areas in an integrated format. The BY-COVID project is building on these lessons and creating solutions for COVID-19 that can be extended to other pathogens. Resources like the COVID-19 Data Portal and FAIRsharing are pivotal to meeting these goals.
The launch of the FAIRsharing BY-COVID Collection, which will grow progressively, marks an important step in the maturation of the BY-COVID project. Datasets from these data sources will now be incorporated into the COVID-19 Data Portal, providing access to heterogeneous yet interlinked and organised data across domains. Over the course of the project, emerging national data portals will be registered in FAIRsharing and linked to the COVID-19 Data Platform, building a federated digital space for infectious disease data.
Find out more:
- The FAIRsharing BY-COVID Collection
- COVID-19 Data Portal
- BY-COVID D3.2: Implementation of cloud-based, high performance, scalable indexing system
- FAIRsharing Educational
- A simple explanation about metadata
- Graph of relationships within the BY-COVID Data Resources collection in FAIRsharing
Table 1: The BY-COVID FAIRsharing resource collection (as of 15 June 2023). View current status.
| Domain | Resource and record in FAIRsharing | Type of data |
|---|---|---|
| Clinical and health | Health Data Research Innovation Gateway; https://doi.org/10.25504/FAIRsharing.nh1DmP | UK health datasets |
| | European Health Information Portal (HIP); https://doi.org/10.25504/FAIRsharing.8690f1 | European health information |
| | ECRIN Clinical Research Metadata Repository; https://fairsharing.org/3067 | European clinical studies, trial registrations, results summaries, journal articles, protocols |
| | Dutch National Observational COVID-19 data portal; https://doi.org/10.25504/FAIRsharing.71bf06 | Observational data from Dutch health care providers |
| | BBMRI-ERIC Directory; https://doi.org/10.25504/FAIRsharing.q9VUYM | Aggregate information about biobanks across Europe |
| | Dutch National Observational COVID-19 data portal; https://doi.org/10.25504/FAIRsharing.71bf06 | Clinical data portal for the exploration and reuse of clinical data from Dutch university medical centres |
| | COVID-19 Data Portal; https://doi.org/10.25504/FAIRsharing.f3b7a9 | COVID-19 datasets and tools including SARS-CoV-2 sequence data |
| Genotypic and phenotypic | The European Genome-phenome Archive (EGA); https://doi.org/10.25504/FAIRsharing.mya1ff | Personally identifiable genetic, phenotypic, and clinical data |
| | European Mouse Mutant Archive (EMMA); https://doi.org/10.25504/FAIRsharing.g2fjt2 | Mutant mice strains essential for basic biomedical research |
| Social sciences and humanities | EUI COVID-19 social sciences and humanities (SSH) Data Portal; https://doi.org/10.25504/FAIRsharing.97367f | COVID-19-related research in the social sciences and humanities |
| | Consortium of European Social Science Data Archives (CESSDA) Data Catalogue; https://doi.org/10.25504/FAIRsharing.a12316 | European social science data |
| | European Social Survey (ESS) Data Portal; https://fairsharing.org/4838 | European cross-national survey data measuring the attitudes, beliefs and behaviour patterns of diverse populations in more than thirty nations |
| | Survey of Health, Ageing and Retirement in Europe (SHARE) Research Data Center; https://fairsharing.org/4839 | European survey data on the effects of health, social, economic and environmental policies over the life-course of European citizens and beyond |
| | Open Data Infrastructure for Social Science and Economic Innovations (ODISSEI) Portal; https://fairsharing.org/4841 | Metadata from most data collections relevant to the social science community in the Netherlands |
| Images | Electron Microscopy Public Image Archive (EMPIAR); https://doi.org/10.25504/FAIRsharing.dff3ef | Raw images underpinning 3D cryo-EM maps and tomograms |
| | Electron Microscopy Data Bank (EMDB); https://doi.org/10.25504/FAIRsharing.651n9j | Electron microscopy density maps of macromolecular complexes and subcellular structures |
| | Image Data Resource (IDR); https://doi.org/10.25504/FAIRsharing.6wf1zw | Image data from genetic, RNAi, chemical, localisation and geographic high content screens, super-resolution microscopy and digital pathology |
| | BioImage Archive; https://doi.org/10.25504/FAIRsharing.x38D2k | Biological images that are useful to life-science researchers |
| Chemical Biology | ChEMBL; https://doi.org/10.25504/FAIRsharing.m3jtpg | Chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs |
| | European Chemical Biology Database (ECBD); https://fairsharing.org/3717 | Experimental results from biological screening programs |
| | COVID 19-NMR; https://fairsharing.org/4850 | RNA and protein structural data for SARS-CoV-2 as well as other viruses |
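
The records in Table 1 are linked in two forms: a DOI under the `10.25504/FAIRsharing.` prefix, or a numeric `fairsharing.org` record URL. For readers who want to process the table programmatically, the following Python sketch splits a "Name; URL" cell and classifies the link. The regular expressions are inferred solely from the entries shown in Table 1 and are an assumption, not an official FAIRsharing identifier specification; the function names are illustrative.

```python
import re
from typing import Dict, Tuple

# ASSUMPTION: both patterns are inferred from the links in Table 1 only.
DOI_PATTERN = re.compile(r"^https://doi\.org/10\.25504/FAIRsharing\.([A-Za-z0-9]+)$")
RECORD_PATTERN = re.compile(r"^https://fairsharing\.org/(\d+)$")


def classify_identifier(url: str) -> Tuple[str, str]:
    """Return (kind, local_id); kind is 'doi', 'record' or 'unknown'."""
    doi_match = DOI_PATTERN.match(url)
    if doi_match:
        return ("doi", doi_match.group(1))
    record_match = RECORD_PATTERN.match(url)
    if record_match:
        return ("record", record_match.group(1))
    return ("unknown", url)


def parse_resource_cell(cell: str) -> Dict[str, str]:
    """Split a 'Name; URL' cell from Table 1 at its last semicolon."""
    name, _, url = cell.rpartition(";")
    url = url.strip()
    kind, local_id = classify_identifier(url)
    return {"name": name.strip(), "url": url, "kind": kind, "local_id": local_id}


# Two cells copied from Table 1, one per link form.
examples = [
    "ChEMBL; https://doi.org/10.25504/FAIRsharing.m3jtpg",
    "COVID 19-NMR; https://fairsharing.org/4850",
]
parsed = [parse_resource_cell(cell) for cell in examples]
for entry in parsed:
    print(entry["name"], entry["kind"], entry["local_id"])
```

Links that match neither pattern are returned as `unknown` rather than rejected, so the sketch keeps working if new link forms are added as the Collection grows.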