Sharing & Archiving Data
Why share your data?
- Enabling others to replicate and verify results as part of the scientific process
- Allowing researchers to ask new questions, conduct new analyses, and improve research methods
- Linking to research products like publications & presentations
- Creating a more complete understanding of a research study
- Meeting the expectations of publishers (e.g., the Nature Publishing Group, Science Journals, PLOS, PNAS)
- Meeting the expectations of government funding agencies
- Receiving credit for data creation for career advancement
- Making your papers more useful and citable by other scientists
- Reducing the costs of duplicating data collection
How to share your data
- Deposit it in a discipline-specific (i.e., subject-based) repository, a general repository, or an archive
- Deposit the final, publishable products of your research in LibraData, U.Va.’s data repository
- Disseminate through a project, personal, or department website
- Submit as supplemental material to a journal in support of an article
- Peer-to-peer exchange
Where to share your data
- LibraData: Additional information about LibraData
- re3data: Catalog of research data repositories
- NIH Data Sharing Repositories
- Bioinformatics, Biology, Medicine, and Chemistry Subject-Specific Repositories
- Astronomy, Astrophysics, Physics, Earth Sciences, and GIS/Geography Subject-Specific Repositories
- Humanities, Social Sciences, Education, Computer Science, and Source Code Subject-Specific Repositories
The best practice for sharing your research data is to deposit it in a data repository. Repositories are not just placeholders; many of them also preserve and curate the data. The re3data catalog can help you find the appropriate repository for your data. Funders may designate particular repositories for the research data produced by projects they fund. Publishers may require that the data supporting research they publish be deposited in a specific location. Not sure what your publisher’s policies are? The Sherpa/RoMEO tool can help you identify them. It provides information on publishers’ policies, including copyright and archiving, and can be searched by journal title, ISSN, or publisher name.
Advantages to using a data repository
- Persistent Identifiers — unique and citable
- Access controls
- Repository guidelines for deposit
- Data preservation — migrating to new formats or emulating old formats
- Professional backup & documentation
- Repository Standards ensure commitment and quality
Things to consider when Sharing and Archiving
File Formats for Long Term Access
The file format in which you keep your data is a primary factor in whether you and others will be able to use the data in the future. Plan for both hardware and software obsolescence. See the section Organizing Files and File Formats for details on preferable long-term storage file formats.
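As a rough illustration of planning for software obsolescence, the sketch below lists files in a project folder whose extensions fall outside a set of open, widely supported formats. The set `PREFERRED_EXTENSIONS` and the function name `flag_at_risk_files` are assumptions made for this example only; the authoritative list of preferred formats is the one given in the Organizing Files and File Formats section.

```python
from pathlib import Path

# Assumption: an illustrative, non-authoritative set of open formats.
# Replace it with the formats recommended for your discipline.
PREFERRED_EXTENSIONS = {".csv", ".txt", ".xml", ".json", ".pdf", ".tif", ".tiff", ".png"}


def flag_at_risk_files(root):
    """Return relative paths of files under `root` whose extension
    is not in PREFERRED_EXTENSIONS (comparison ignores case)."""
    root = Path(root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() not in PREFERRED_EXTENSIONS
    )
```

Calling `flag_at_risk_files("my_project")` on a hypothetical project folder returns the files that may be worth converting, or exporting a copy of, before deposit.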
Don’t Forget the Documentation
Document your research and data so that others can interpret the data. Start documenting your data at the very beginning of your research project and continue throughout the project.
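One lightweight way to start documenting early is to generate a skeleton data dictionary from a file's column names and fill it in as the project proceeds. The sketch below is a minimal example under stated assumptions: the function name `data_dictionary_skeleton`, the `description` and `units` fields, and the sample column names are all hypothetical, and the input is assumed to be CSV text with a header row.

```python
import csv
import io


def data_dictionary_skeleton(csv_text):
    """Build one placeholder entry per column named in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [
        # "TODO" marks fields the researcher still has to fill in by hand.
        {"column": name.strip(), "description": "TODO", "units": "TODO"}
        for name in header
    ]


# Hypothetical sample data with two columns.
entries = data_dictionary_skeleton("site_id, temp_c\n1,20.5\n")
for entry in entries:
    print(entry["column"], "-", entry["description"])
```

The resulting entries can be saved alongside the data and completed with real descriptions and units before deposit.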
U.Va. Data Retention Policy
University faculty and researchers have a responsibility to maintain research data and make that data available for preservation by the University, both as a matter of research integrity and because of the University’s ownership rights. Details on researchers’ responsibilities are found in the Laboratory Notebook and Recordkeeping Policy.
Ownership and Privacy
Make sure that you have considered the implications of sharing data in terms of copyright and IP ownership, as well as ethical requirements like privacy and confidentiality. U.Va. has tried to clarify the ownership rights and intellectual property rights of data generated by U.Va. researchers in this memo: Data Rights and Responsibility.
Data publishing as a standalone process, meaning the publishing of a dataset that is not tied to an article, is still relatively new. The current practice is to deposit your data in a repository and link it to a published article in a journal. The newer method is to publish your data as a “data paper,” which describes a dataset in detail, without analysis or discussion. Data papers can also be linked to published articles in traditional journals.
Advantages to Publishing Research Data
- Increased exposure of a dataset
- Validation – strengthens the credibility of the study relying on the data
- Element of peer-review of the dataset
- Academic credit for the researcher
- Sharing of datasets not tied to publications
- Increased citation counts for related articles
- Faster pace of scientific progress – maximizes opportunities for reuse
Data journals are publications whose primary purpose is to expose datasets. They enable the author to focus on the data itself, rather than on producing the extensive analysis of the data that the traditional journal model calls for.
Data papers are published in the “Data Papers” section of an established journal, or in a journal dedicated to data papers:
- Earth System Science Data – ESSD journal
- Geoscience Data Journal: Background on how the journal operates is part of the PREPARDE project.
- Scientific Data
- Journal of Open Archaeology Data
- Biodiversity Data Journal
- PANGAEA: Data Publisher: They manage an assortment of platforms and journals.
- The Australian National Data Service (ANDS) has a very informative page about Data and Journals.