Citing Data

Why should you cite data?  For the same reasons that you cite journal articles and books – to give the data producer appropriate credit for their work.

Citing data is important in order to:

  • Create a bibliographic trail that connects the publications and the data
  • Meet funder requirements
  • Support the scholarly record for research
  • Allow easier access to the data for re-purposing or re-use
  • Support the persistence of datasets
  • Enable others to verify your work by increasing transparency and visibility
  • Enable others to track and measure the impact of the data and to demonstrate the value of its reuse

Primary Elements to Include in all Data Citations

  • Creator or Contributor: Author(s) of the dataset
  • Title: Name of the dataset
  • Publisher (or Distributor): Repository name
  • Date of Publication: The date when the dataset was published or released (rather than the collection or coverage date)
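  • Version: The version or edition number, if the dataset exists in multiple versions or has been updated
  • Persistent Identifier (Unique Identifier): This is often a DOI, but can also be an ARK, a URN or a Handle.  If no persistent identifier exists, use the URL of the source
  • Date Accessed: The date you retrieved the data, if appropriate (for example, when the dataset is updated over time or is cited by URL)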
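
The elements above can be assembled into a citation string mechanically. The following is a minimal Python sketch, not a prescribed style: the class name DataCitation, the function format_citation, and the ordering and punctuation are all assumptions made for illustration, and your discipline's or publisher's style guide takes precedence. The sample values come from the "Dataset from a publication" entry in the Examples section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the primary citation elements."""
    creators: list[str]             # Creator or Contributor
    title: str                      # Title
    publisher: str                  # Publisher (or Distributor)
    year: str                       # Date of Publication
    identifier: str                 # DOI, ARK, URN, Handle, or a plain URL
    version: Optional[str] = None   # Version, if any
    accessed: Optional[str] = None  # Date accessed, e.g. "2008-12-02"


def format_citation(c: DataCitation) -> str:
    """Join the elements in one illustrative order; real styles differ."""
    parts = [f"{'; '.join(c.creators)} ({c.year}): {c.title}."]
    if c.version:
        parts.append(f"Version {c.version}.")
    parts.append(f"{c.publisher}.")
    parts.append(c.identifier)
    if c.accessed:
        parts.append(f"(Accessed on {c.accessed}).")
    return " ".join(parts)


example = DataCitation(
    creators=["Irino, T", "Tada, R"],
    title="Chemical and mineral compositions of sediments from ODP Site 127-797",
    publisher="Geological Institute, University of Tokyo",
    year="2009",
    identifier="http://dx.doi.org/10.1594/PANGAEA.726855",
)
print(format_citation(example))
```

Optional elements (version and access date) are left out of the output when they are not supplied, which mirrors the "if appropriate" wording in the list.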

Examples

  • Dataset: OECD (2008), “Social Expenditures aggregates”, OECD Social Expenditure Statistics (database). doi: 10.1787/000530172303 http://dx.doi.org/10.1787/000530172303 (Accessed on 2008-12-02).
  • Dataset package: Sidlauskas B (2007) Data from: Testing for unequal rates of morphological diversification in the absence of a detailed phylogeny: a case study from characiform fishes. Dryad Digital Repository. doi:10.5061/dryad.20
  • Table or figure from a publication:
    Smith, J. (2008), Figure 1.2. Broadband Penetration in OECD Countries, in OECD Communications Outlook 2008, OECD Publishing. doi: 10.1787/000530172303 http://dx.doi.org/10.1787/000530172303.
  • Dataset from a publication: Irino, T; Tada, R (2009): Chemical and mineral compositions of sediments from ODP Site 127‐797. Geological Institute, University of Tokyo. http://dx.doi.org/10.1594/PANGAEA.726855 
  • Updated Dataset:
    Cavalieri, D., C. Parkinson, P. Gloersen, and H. J. Zwally. 1996, updated 2006. Sea ice concentrations from Nimbus-7 SMMR and DMSP SSM/I passive microwave data, March 2002–Sept. 2003. Boulder, Colorado USA: National Snow and Ice Data Center. url: http://nsidc.org/data/nsidc-0051.html (Accessed on 2008-05-14).
  • Survey Dataset: Barnes, Samuel H. Italian Mass Election Survey, 1968. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 1992-02-16. https://doi.org/10.3886/ICPSR07953.v1 

Guidelines on Citing Data