StreamNet compiles data at two levels: (1) high-level indicators of fish population health, and (2) supporting fish abundance estimates and indices at local scales. All compiled data pass a series of rigorous QA/QC validation steps, include supporting metadata, and, where applicable, link to reference documents in the StreamNet Library.
Data Exchange Standard
Data, including reference documents, in the StreamNet database must conform to StreamNet’s Data Exchange Standard (DES).
The DES precisely defines the data elements, their organization into tables, and their required formats, and it serves as the common denominator for the specific data types contained in the database.
Adherence to the DES ensures that data can be loaded into the database, can be queried accurately, and are comparable for further analysis by users. Converting agency data to the DES, and ensuring that they conform before submission, is the responsibility of the project's data stewards/compilers at the data source agencies. QA procedures are applied at the agency data steward level, and automated validation routines are run when the data are received at PSMFC. All validation failures are reported to the stewards for correction, and any errors are conveyed back to the original data collectors. Additions or changes to the DES follow a formal, documented procedure adopted by the Steering Committee.
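A steward-side conversion step of this kind might look like the following minimal sketch. The agency column names, DES field names, and species-code mappings below are hypothetical stand-ins for illustration, not the actual DES:

```python
# Hypothetical sketch: renaming an agency's raw fields and translating its
# codes so a record conforms to a DES-style layout before submission.
# All field names and code values here are invented for illustration.

AGENCY_TO_DES_FIELDS = {
    "river_name": "StreamName",
    "yr": "Year",
    "count": "AbundanceEstimate",
}

AGENCY_TO_DES_SPECIES = {
    "chin": "Chinook salmon",
    "coho": "Coho salmon",
}

def convert_to_des(agency_record):
    """Map an agency record into the (hypothetical) DES field names and codes."""
    des_record = {}
    for src, dest in AGENCY_TO_DES_FIELDS.items():
        des_record[dest] = agency_record[src]
    # Translate the agency's species code into the DES vocabulary.
    des_record["Species"] = AGENCY_TO_DES_SPECIES[agency_record["species_code"]]
    return des_record

des = convert_to_des(
    {"river_name": "Example Creek", "yr": 2020, "count": 1500, "species_code": "coho"}
)
```

Centralizing the mapping in lookup tables like this keeps the conversion auditable, which matters when validation failures must be traced back to the original data collectors.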
An essentially identical process is used for the "Coordinated Assessments" population-scale high-level indicator data.
Automated Data Validation
We use an automated data validation and loading system that provides real-time feedback on the success or failure of data validation. Data are submitted to the StreamNet or Coordinated Assessments database one record at a time, and real-time validation is run on each record at three levels. First, each field has its own set of rules; examples include ensuring that numeric fields do not contain text, that codes fall within the set of allowable values, and that text strings are within acceptable length ranges. The second level of validation ensures that values in the various fields within a data record are compatible with one another. For example, a record that claims to describe spring-run coho salmon is rejected because there is no such run of fish. The third level of validation looks for problems between rows of data within a table, which serves to prevent duplicate data. A useful feature of the automated validation routines is that data may be run against the validation rules and an error report obtained without actually submitting anything for inclusion in the database. This allows data submitters to check entire sets of data, fix all errors, and then submit the full data set once it is known to pass validation. The interface used for data submittals supports adding new records, changing existing records, and deleting existing records.
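The three levels of validation described above can be sketched roughly as follows. The field names, rules, and error messages are illustrative assumptions, not StreamNet's actual validation rules:

```python
# Rough sketch of three-level validation. All rules, field names, and
# allowable values below are hypothetical examples, not the real DES rules.

ALLOWED_RUNS = {
    "Chinook salmon": {"spring", "summer", "fall"},
    "Coho salmon": {"fall"},  # no "spring" run of coho exists
}

def validate_fields(rec):
    """Level 1: each field is checked on its own."""
    errors = []
    if not isinstance(rec.get("Year"), int):
        errors.append("Year must be an integer")
    if rec.get("Species") not in ALLOWED_RUNS:
        errors.append("Species code not in allowable values")
    return errors

def validate_record(rec):
    """Level 2: fields within one record must be compatible."""
    errors = []
    runs = ALLOWED_RUNS.get(rec.get("Species"), set())
    if rec.get("Run") not in runs:
        errors.append(f"No {rec.get('Run')} run for {rec.get('Species')}")
    return errors

def validate_table(records):
    """Level 3: checks across rows, e.g. rejecting duplicate keys."""
    errors, seen = [], set()
    for rec in records:
        key = (rec.get("Species"), rec.get("Run"), rec.get("Year"))
        if key in seen:
            errors.append(f"Duplicate record: {key}")
        seen.add(key)
    return errors

def dry_run(records):
    """Run all three levels and return an error report without loading data."""
    report = []
    for i, rec in enumerate(records):
        report += [(i, e) for e in validate_fields(rec) + validate_record(rec)]
    report += [(None, e) for e in validate_table(records)]
    return report
```

A `dry_run` over a whole data set, as in the last function, mirrors the feature that lets submitters obtain a full error report, fix everything, and only then submit.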
Database triggers apply time stamps to new and modified records, and stored procedures automate the creation of data in various internal tables that support filtering and rapid retrieval. Triggers also back up all records before they are deleted from the central StreamNet database. The central database thus functions as a backup for all data that have been submitted in the past, in the unlikely event that data are irretrievably lost at a submitting agency. Georeferencing tables allow the query systems to find data by HUC, Northwest Power and Conservation Council (NPCC) subbasin, or state/county. StreamNet also maintains a custom set of web services that allow the NPCC to retrieve, in an automated way, specific sets of detailed time series data and StreamNet Library reference documents (via PDF URL) for use in its dashboards and other webpages.
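The time-stamping and backup-before-delete trigger behavior can be illustrated with SQLite via Python's `sqlite3` module. The schema, table names, and choice of SQLite are assumptions for the sake of a runnable sketch; the production database and its actual triggers are not specified here:

```python
# Sketch of trigger behavior: stamp new/updated rows, and copy rows to a
# backup table before deletion. Schema and table names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE abundance (
    id INTEGER PRIMARY KEY,
    stream TEXT, year INTEGER, estimate REAL,
    updated_at TEXT
);
-- Backup table with the same columns, initially empty.
CREATE TABLE abundance_deleted AS SELECT * FROM abundance WHERE 0;

-- Stamp each new record with the current time.
CREATE TRIGGER stamp_insert AFTER INSERT ON abundance
BEGIN
    UPDATE abundance SET updated_at = datetime('now') WHERE id = NEW.id;
END;

-- Re-stamp a record when its data columns change.
CREATE TRIGGER stamp_update AFTER UPDATE OF stream, year, estimate ON abundance
BEGIN
    UPDATE abundance SET updated_at = datetime('now') WHERE id = NEW.id;
END;

-- Preserve every record in the backup table before it is deleted.
CREATE TRIGGER backup_delete BEFORE DELETE ON abundance
BEGIN
    INSERT INTO abundance_deleted
    VALUES (OLD.id, OLD.stream, OLD.year, OLD.estimate, OLD.updated_at);
END;
""")

conn.execute(
    "INSERT INTO abundance (stream, year, estimate) VALUES ('Example Creek', 2020, 1500.0)"
)
conn.execute("DELETE FROM abundance WHERE year = 2020")
# The deleted row, including its timestamp, now survives in abundance_deleted.
```

Doing this in triggers rather than application code means every client that touches the table gets the same timestamping and backup behavior automatically.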
The data query systems are designed so that users can quickly find and access the data they seek by using filters. Notable features include flexible and intuitive geographic filtering via maps, integrated output maps, the feel of a desktop data-explorer application rather than a "database website", interactive charts with multiple series for comparison, and integrated reference documentation.