Data repositories

A data repository is an online platform for storing, publishing and preserving final research data once the research is completed. It collects, manages and makes available the data together with the associated metadata and documentation. Using a repository makes datasets easier to retrieve and preserve, and overall contributes to the credibility of scientific knowledge. Publishing a dataset in a repository ensures that the data will be accessible to other researchers and will remain available in the future.

If you need to upload your research data to a data repository, you have several options:

  • General data repositories
  • Subject-specific data repositories
  • Institutional data repositories

When a dataset is published in a data repository, it is usually publicly available. Some researchers cannot share a dataset publicly right away, for example because of intellectual property protection (e.g., a pending patent application). Therefore, some data repositories allow data to be published under a time embargo.

A time embargo is a period during which a published dataset remains unavailable to the public. During the embargo period, the data itself remains non-public and can only be accessed with authorization from the data originator.
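The embargo rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the dates are hypothetical, and it assumes the data become public on the embargo end date itself, which individual repositories may handle differently.

```python
from datetime import date


def is_publicly_accessible(embargo_end, today):
    """Return True if the data files can be accessed without authorization.

    embargo_end is None when the dataset was published without an embargo.
    Assumption: access opens on the embargo end date itself.
    """
    if embargo_end is None:
        return True
    return today >= embargo_end


embargo_end = date(2026, 1, 1)  # hypothetical embargo end date

print(is_publicly_accessible(embargo_end, date(2025, 6, 1)))  # False
print(is_publicly_accessible(embargo_end, date(2026, 1, 1)))  # True
```

Until the embargo ends, access to the data itself still requires authorization from the data originator.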

Trusted data repositories

Trusted data repositories meet the following characteristics:

  • They provide open access to the content of the repository
  • They assign persistent identifiers to the content for referencing and citing
  • They use standardized, machine-processable metadata
  • They allow for licensing of published datasets
  • They ensure the long-term preservation of repository content
  • They have obtained certification (e.g., CoreTrustSeal or ISO 16363)
  • They are specific to a scientific field, internationally recognised, widely used and endorsed by the scientific community

General data repositories

General data repositories allow you to publish and search for research data from any scientific field.

Zenodo

Zenodo is a general repository created by OpenAIRE and CERN. In addition to research data, articles, code, posters and presentations can also be stored in Zenodo.

  • files up to 50 GB (larger files require contacting the repository)
  • assigns a DOI identifier
  • option to choose a Creative Commons or other license
  • metadata description follows DataCite's Metadata Schema standard

The repository does not have a CoreTrustSeal certificate, because under current rules this certificate cannot be assigned to general repositories. Nevertheless, funders consider Zenodo a trustworthy repository.
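As a rough illustration of the DataCite-based metadata description mentioned in the list above, the following Python sketch checks a record for the six properties that the DataCite Metadata Schema (version 4.x) marks as mandatory. The key names are simplified for readability and are not the exact serialization any repository uses; the record values, including the DOI, are hypothetical.

```python
# The six mandatory properties of the DataCite Metadata Schema 4.x,
# written here with simplified (assumed) key names.
MANDATORY = [
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
]


def missing_mandatory(record):
    """Return the mandatory properties that are absent or empty in the record."""
    return [key for key in MANDATORY if not record.get(key)]


# Hypothetical dataset description; "resourceType" is left out on purpose.
record = {
    "identifier": "10.1234/example-dataset",  # hypothetical DOI
    "creators": ["Novák, Jana"],              # hypothetical author
    "titles": ["Example survey dataset"],
    "publisher": "Zenodo",
    "publicationYear": 2024,
}

print(missing_mandatory(record))  # ['resourceType']
```

In practice the repository's upload form or API performs this kind of validation; the sketch only shows which information you should have ready before depositing.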

National Repository

The National Research Data Repository is operated by CESNET and is still in pilot operation. In the future, it will be one of the main repositories of research data in the Czech Republic within the National Repository Platform. The National Technical Library is responsible for checking the stored records.

  • files up to 500 GB
  • assigns a DOI identifier
  • option to choose a Creative Commons license

Harvard Dataverse

Harvard Dataverse is a data repository that researchers in any scientific field can use, although most of its records come from the social sciences.

  • files up to 1 TB
  • assigns a DOI identifier
  • option to choose a Creative Commons license
  • metadata description follows DataCite's Metadata Schema standard

Figshare

Figshare is a data repository from Digital Science, part of the Springer Nature portfolio.

  • files up to 20 GB
  • assigns a DOI identifier
  • option to choose a Creative Commons license
  • metadata description follows DataCite's Metadata Schema standard

Figshare is considered a trusted repository and is ISO 27001 certified.

Dryad

The repository contains research data mainly from the natural and medical sciences.

  • files up to 300 GB (up to 50 GB free)
  • assigns a DOI identifier
  • option to choose a Creative Commons license
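The file size limits listed for the general repositories above can be compared with a short Python sketch. The limits are copied from this section and may change over time, so check each repository's current terms. The function name is hypothetical, 1 TB is taken as 1000 GB, and the sketch ignores conditions such as Dryad's 50 GB free tier or Zenodo's option to request more space.

```python
# Maximum file size in GB, as listed in this section (subject to change).
LIMITS_GB = {
    "Zenodo": 50,
    "National Repository": 500,
    "Harvard Dataverse": 1000,  # 1 TB, assuming 1 TB = 1000 GB
    "Figshare": 20,
    "Dryad": 300,
}


def repositories_accepting(file_size_gb):
    """Return the repositories whose listed limit covers the given file size."""
    return sorted(
        name for name, limit in LIMITS_GB.items() if file_size_gb <= limit
    )


print(repositories_accepting(100))
# ['Dryad', 'Harvard Dataverse', 'National Repository']
```

File size is only one criterion; licensing options, persistent identifiers and certification matter just as much when choosing a repository.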

Subject-specific data repositories

It is generally recommended to deposit final research data in a subject-specific repository. You can search for a disciplinary data repository in the following data repository directories:

Caution: always read the terms and conditions of the data repository carefully to check whether you are allowed to deposit data there.

Czech Social Science Data Archive

The archive provides long-term preservation of and access to social science research data. Data deposit is based on a contract between the data producer and the Institute of Sociology of the CAS (of which CSDA is a part).

  • holds a CoreTrustSeal certificate
  • assigns a DOI identifier

LINDAT/CLARIAH-CZ

Subject-specific repository for linguistic data and tools, built by the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University.

  • option to choose a license
  • assigns a Handle identifier

Data repositories recommended by journals and publishers