The Pacific Northwest Aquatic Monitoring Partnership (PNAMP), National Marine Fisheries Service’s West Coast Region (NMFS), and StreamNet released a white paper (Olson et al. 2019) on August 15, 2019, that provides recommendations for citing online, multi-contributor, dynamic data sets such as those found in StreamNet. Besides the authors listed, the paper benefited from input from the US Geological Survey, the Northwest Power and Conservation Council, Bonneville Power Administration, the Yakama Nation, and the Nez Perce Tribe. The paper is meant to help data users properly credit data providers and cite data sets obtained online. It also includes metadata recommendations for multi-contributor data sets that will assist data end users.
The paper, as described in its abstract, focuses on the use of data generated from long-term monitoring efforts and the need for accurate, authoritative source citations of those data to ensure credit for data collected and accountability for data quality, and to enable repeated retrieval of a given data set. Data sets used in published reports and articles are increasingly considered objects that must themselves be published and cited. Aggregating data into open-access databases is becoming common and is the focus of the Coordinated Assessment for Salmon and Steelhead project (CA; https://www.pnamp.org/project/coordinated-assessments-for-salmon-and-steelhead; http://www.streamnet.org/data/coordinated-assessments/) and the National Marine Fisheries Service, National Oceanic and Atmospheric Administration Salmon Population Summary (SPS; https://www.webapps.nwfsc.noaa.gov/apex/f?p=261:home:0), among others. Guidelines are needed for citing these long-term dynamic data sets that have many contributors. We explore best practices and provide recommendations for including robust metadata attributes within data sets to enable data publication and citation, using the CA and SPS data repositories as case studies. From reviewing the citations currently possible from the CA and the SPS, we recommend, at minimum, that natural resource monitoring databases contain metadata identifying the organizations that generated the data and contact persons for each organization that contributes data to an aggregated data set, and that metadata be incorporated into databases to enable autogenerated citations that recognize all contributing organizations with time-stamped versions of the data delivered. Beyond those minimums, additional best practice recommendations include this suite of metadata elements that identify a given data set upon citation or publication: author(s); publication date; description of data; file format(s) of data (e.g., tiles, shapefile sets, images, text files); dates data were collected; locations where data were collected; producers/contributors to the data set version cited; date data set was downloaded; original data repository from which the data were obtained; version identifier to note significant change to a data set; and a persistent identifier that can be used to locate that version of the data.
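To make the recommended metadata elements concrete, the following is a minimal Python sketch of how a repository might store them and autogenerate a citation that names every contributing organization and the time-stamped version. It is an illustration only: the class name `DatasetMetadata`, its field names, the function `format_citation`, the citation layout, and all sample values are assumptions made for this example and are not taken from the white paper, the CA, or the SPS.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical record holding the metadata elements listed above."""

    authors: tuple[str, ...]
    publication_date: date
    description: str
    file_formats: tuple[str, ...]      # e.g. shapefile sets, images, text files
    collection_start: date             # dates data were collected
    collection_end: date
    locations: tuple[str, ...]         # where data were collected
    contributors: tuple[str, ...]      # producers/contributors to this version
    download_date: date                # date the data set was downloaded
    repository: str                    # original data repository
    version: str                       # identifier for this version
    persistent_id: str                 # locates this version of the data


def format_citation(meta: DatasetMetadata) -> str:
    """Build a citation string (assumed layout, not a published standard)."""
    authors = "; ".join(meta.authors)
    contributors = ", ".join(meta.contributors)
    return (
        f"{authors} ({meta.publication_date.year}). {meta.description}. "
        f"Version {meta.version}. Contributors: {contributors}. "
        f"{meta.repository}. {meta.persistent_id}. "
        f"Accessed {meta.download_date.isoformat()}."
    )


# Placeholder values only; none of these describe a real data set.
example = DatasetMetadata(
    authors=("Example Data Team",),
    publication_date=date(2019, 8, 15),
    description="Example abundance estimates",
    file_formats=("text files",),
    collection_start=date(2000, 1, 1),
    collection_end=date(2018, 12, 31),
    locations=("Example River basin",),
    contributors=("Agency A", "Agency B"),
    download_date=date(2019, 9, 1),
    repository="Example Repository",
    version="2019-09-01",
    persistent_id="example-id-0001",
)

print(format_citation(example))
```

Because the contributors, version, and download date are stored with the data rather than typed by hand, two users who download the same version would get the same citation, which is the repeatability the recommendations aim for.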