From 525c93738f7966e392b07f32a4197328b16acee3 Mon Sep 17 00:00:00 2001
From: Bruno Carturan
Date: Fri, 29 Nov 2024 15:10:11 -0800
Subject: [PATCH] Final push before archiving

---
 README.md | 4 +-
 ...s_cuid_streamid_2024-11-25_definitions.csv | 64 -------------------
 data_input/README.md | 2 +
 ...s_cuid_streamid_2024-11-25_definitions.csv | 64 +++++++++++++++++++
 4 files changed, 69 insertions(+), 65 deletions(-)
 delete mode 100644 data_input/2_nuseds_cuid_streamid_2024-11-25_definitions.csv
 create mode 100644 data_input/nuseds_cuid_streamid_2024-11-25_definitions.csv

diff --git a/README.md b/README.md
index ab672ed..f1af049 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
-Code, datasets and figures associated to the manuscript **Monitoring for fisheries or for fish? Declines in monitoring of salmon spawners continue despite a conservation crisis**
+Code, datasets and figures associated to the manuscript:
+
+## Monitoring for fisheries or for fish? Declines in monitoring of salmon spawners continue despite a conservation crisis
 **Authorship:**
diff --git a/data_input/2_nuseds_cuid_streamid_2024-11-25_definitions.csv b/data_input/2_nuseds_cuid_streamid_2024-11-25_definitions.csv
deleted file mode 100644
index bfac2c8..0000000
--- a/data_input/2_nuseds_cuid_streamid_2024-11-25_definitions.csv
+++ /dev/null
@@ -1,64 +0,0 @@
-field,definition,source
-IndexId,,
-POP_ID,Population ID.,NuSEDS
-GFE_ID,Stream ID (Geo_Feature ID). ,NuSEDS
-SPECIES,Species of Fish.,NuSEDS
-WATERBODY,This is the name of the waterbody or portion of a waterbody that bounds the population as shown on any given SEN.,NuSEDS
-AREA,This is the subdistrict. In most cases subdistricts are the same as statistical areas. They mainly differ for streams that eventually drain into the Fraser and for large areas that have been split up and thus have a/b/c... designations. E.g. Statistical area 03 has two subdistricts 3A and 3B.,NuSEDS
-Year,This is the year that the estimate is for. Surveys may have continued into the following calendar year.,NuSEDS
-MAX_ESTIMATE,"The maximum value (excluding NAs) among the following fields in NuSEDS: NATURAL_ADULT_SPAWNERS, NATURAL_JACK_SPAWNERS, NATURAL_SPAWNERS_TOTAL, ADULT_BROODSTOCK_REMOVALS, JACK_BROODSTOCK_REMOVALS, TOTAL_BROODSTOCK_REMOVALS, OTHER_REMOVALS, TOTAL_RETURN_TO_RIVER",Calculated by PSF
-ENUMERATION_METHODS,"The enumeration method used to observe fish. The first method listed is the primary method. Values are: Bank Walk, Based on Angling Catch, Biologist/Working Group, Boat, Broodstock Removal, Dead Pitch, Electronic Counters, Electroshocking, Enumeration by Hatchery, Fence, Fixed Wing Aircraft, Float, Helicopter, Hydroacoustic Station, Other, Peak Live and Dead Count, Redd Counts, Snorkel, Spot Checks, Stream Walk, Strip Counts, Tag Recovery, Trap, Walk.",NuSEDS
-ESTIMATE_CLASSIFICATION,"This categorizes estimates based on their levels of accuracy and precision (Type-1 are the most accurate, Type-6 the least accurate). There are three other classifications that belong to SENs whose source data were migrated from the regional MSAccess SILBC16 database (definitions extracted from that user manual).
-RELATIVE: CONSTANT MULTI-YEAR METHODS and
-RELATIVE: VARYING MULTI-YEAR METHODS: ""This is the case with survey methods restricted to a fraction of the spawning habitat and/or a fraction of the spawning period. There are various types of relative abundance estimates depending on the survey method, the level of standardization of the methods, and the sampling effort. For our purpose we have retained one type based on between-year consistency of the method where there are two levels.""
-NO SURVEY THIS YEAR: ""stream was not inspected for that species this year"" ",NuSEDS
-ESTIMATE_METHOD,"There are several standard methods to chose from.
-Addition/Subtraction - simple addition or subtraction to provide an estimate. Should be used in conjunction with activity types Adjustment/Calibration and Summary observations. E.g. a population aggregate, the sum of two or more populations, would require the linking of two or more SENs and straight summation of the estimates.
-Multiplication/Division - simple multiplication or division to summary estimates. This method should be used in conjunction with activity type Adjustment/Calibration. E.g. E.g. An annual estimate that was arrived at by Peak Live Plus Dead analysis can be adjusted by some factor to make it equivalent to a Time Series estimate that uses AUC calculations.
-Area Under the Curve - Combining a series of point estimates for abundance to create an estimate for the annual abundance. This is done by determining the total area under a curve of abundance by time then dividing by the survey life (the average length of time that an individual is available to be observed alive i.e. is still within the survey area and is not dead).
-Peak Live Plus Dead - Examine point estimates for abundance, determine the survey when the maximum live count observed; sum the live and dead counts for that survey to create the annual estimate.
-Peak Live Plus Cumulative Dead - Examine point estimates for abundance, determine the survey when the maximum live count observed. Sum the live count for that survey with the cumulative total of the dead counts prior to and including that survey to create the annual estimate.
-Fixed Site Census - Combining one or more raw observations into a single estimate (e.g. add all daily fence observation SIL to create a single annual estimate).
-Mark and Recapture - Petersen - Use capture and re-capture SIL data to determine an abundance estimate with the Petersen calculation.
-Mark and Recapture - Jolly-Seber - Use capture and re-capture SIL data to determine an abundance estimate with the Jolly-Seber calculation.
-Redd Count - Using counts of redds from SILs and multiplied by a factor such as 2.
-Lake Expansion - expanding the dead recoveries by the recovery effort
-Cumulative New - N/A
-",NuSEDS
-GAZETTED_NAME,Provincially recognized name for the waterbody.,NuSEDS
-LOCAL_NAME_1,Commonly known name for the waterbody.,NuSEDS
-LOCAL_NAME_2,Second most commonly known name for the waterbody.,NuSEDS
-ADULT_PRESENCE,"Values are present if adults were observed, none observed if no adults were observed during the stream inspections, not inspected if adults were not looked for, unknown if it is not known whether adults were observed during inspections or not.",NuSEDS
-JACK_PRESENCE,"Values are present if jacks were observed, none observed if no jacks were observed during the stream inspections, not inspected if jacks were not looked for, unknown if it is not known whether jacks were observed during inspections or not.",NuSEDS
-SPECIES_QUALIFIED,"This is an Conservation Unit acronym used to describe the species of salmon for which the escapement estimate is for, eg: CK - Chinook Salmon CM - Chum Salmon CO - Coho Salmon PKE - Even Year Pink Salmon PKO - Odd Year Pink Salmon SEL - Lake Type Sockeye Salmon SER - River or Ocean Type Sockeye Salmon ",NuSEDS: conservation unit system sites
-CU_NAME,The assigned name of the Conservation Unit. Note that this name does not identify the species.,NuSEDS: conservation unit system sites
-CU_TYPE," There are currently six CU types, i.e.,Current, Bin, VREQ[Bin], VREQ[Current], VREQ[Extirpated] and Extirpated based upon Blair Holtby's Rev 4.0 Conservation Unit data refresh. Definitions for these CU Types are listed below: CU_TYPE DEFINITION Current CU is extant and is either accepted or has been proposed. Bin Not a CU but a category to hold sites that for some reason are not assigned to a CU. Reasonable uses of the bin category are: a) sites where migratory dropouts are counted that cannot be reliably assigned to CUs; b) sites where transplanted fish are enumerated under the pretense that DFO is recreating an extinct CU; and c) sites where transplanted fish are enumerated that are outside the ecotypical zone of the source fish and where no claim to recreating an extinct CU has been made. VREQ[Bin] VREQ[] : Indicates that there is some doubt about the nature of the CU and valididation is required. [Bin]-See definition for Bin. The most common use of the prefix is for sockeye CUs on the central and north coasts that were identified by presumed suitability rather than by actual verified records of persistent presence. The second most common use is for CUs that likely were valid but it is unknown if they persist. Again these are mostly sockeye CUs. VREQ[Current] VREQ[] : Indicates that there is some doubt about the nature of the CU and valididation is required. [Current]-See definition for Current. The most common use of the prefix is for sockeye CUs on the central and north coasts that were identified by presumed suitability rather than by actual verified records of persistent presence. The second most common use is for CUs that likely were valid but it is unknown if they persist. Again these are mostly sockeye CUs. VREQ[Extirpated] VREQ[] : Indicates that there is some doubt about the nature of the CU and valididation is required. [Extirpated]-See definition for Extirpated. The most common use of the prefix is for sockeye CUs on the central and north coasts that were identified by presumed suitability rather than by actual verified records of persistent presence. The second most common use is for CUs that likely were valid but it is unknown if they persist. Again these are mostly sockeye CUs. Extirpated There are no known sites with fish spawning successfully in the wild and there are no known hatchery sites. ",NuSEDS: conservation unit system sites
-FAZ_ACRO,Acronym of the freshwater adaptive zone,NuSEDS: conservation unit system sites
-JAZ_ACRO,Acronym of the joint adaptive zone,NuSEDS: conservation unit system sites
-MAZ_ACRO,Acronym of the marine adaptive zone,NuSEDS: conservation unit system sites
-FULL_CU_IN,"The full index of the CU including the species qualifier, e.g. CK-01",NuSEDS: conservation unit system sites
-SYSTEM_SITE,The name of the waterbody. Originally from NUSEDS but not necessarily the same. Name priority was BC gazette > BC provincial alias > DFO alias1 > DFO alias2. Same as Waterbody Name above,NuSEDS: conservation unit system sites
-Y_LAT,"Location of the mouth of the waterbody if flowing, or the centroid if not.",NuSEDS: conservation unit system sites
-X_LONGT,"Location of the mouth of the waterbody if flowing, or the centroid if not.",NuSEDS: conservation unit system sites
-IS_INDICATOR,"Is ""Y"" if this POP_ID has been identified by Area experts as an ""indicator"" population.",NuSEDS: conservation unit system sites
-CU_LAT,Centroid of the CU,NuSEDS: conservation unit system sites
-CU_LONGT,Centroid of the CU,NuSEDS: conservation unit system sites
-coordinates_changed,,
-FULL_CU_IN_PSF,,
-stream_survey_quality,The quality of the survey based on ESTIMATE_CLASSIFICATION; see https://www.salmonexplorer.ca/methods/analytical-approach.html#spawner-survey-method ,PSF
-cuid,"The unique numeric identifier for the CU, as used in PSF's databases",PSF
-cu_name_pse,"The display name of the CU used by PSF, including in the Pacific Salmon Explorer",PSF
-cu_name_dfo,The name of the CU used by DFO,PSF
-region,The broad-scale region that the CU is part of in the Pacific Salmon Explorer,PSF
-regionid,The unique numeric identifier for the region,PSF
-pointid,The unique numeric identifier for the site (lat/lon),PSF
-sys_nm,The site name as diaplyed in the Pacific Salmon Explorer (derived from SYSTEM_SITE),PSF
-latitude,,PSF
-longitude,,PSF
-distance,,PSF
-location_new,,PSF
-streamid,The unique numerid identifier for the population (site and CU combination),PSF
-longitude_final,Corrected longitude (decimal degrees) based on location name and river network,PSF
-latitude_final,Corrected latitude (decimal degrees) based on location name and river network,PSF
-sys_nm_final,,PSF
-survey_score,,
\ No newline at end of file
diff --git a/data_input/README.md b/data_input/README.md
index 9655f36..d870b0d 100644
--- a/data_input/README.md
+++ b/data_input/README.md
@@ -13,6 +13,8 @@
 - **nuseds_cuid_streamid_2024-11-25.csv**: the cleaned version of the New Salmon Escapement Database (NuSEDS)
+- **nuseds_cuid_streamid_2024-11-25_definitions.csv**: the definitions of the fields in nuseds_cuid_streamid_2024-11-25.csv
+
 *** IMPORTANT *** The **nuseds_cuid_streamid_2024-11-25.csv** file must be downloaded from https://zenodo.org/records/14194638 and placed in the **/data_input** repository to run 1_datasets.R
diff --git a/data_input/nuseds_cuid_streamid_2024-11-25_definitions.csv b/data_input/nuseds_cuid_streamid_2024-11-25_definitions.csv
new file mode 100644
index 0000000..03d962c
--- /dev/null
+++ b/data_input/nuseds_cuid_streamid_2024-11-25_definitions.csv
@@ -0,0 +1,64 @@
+field,source,definition
+IndexId,PSF,"Combination of the species acronyms ""CO"",""CN"",""PK"",""CM"",""SX"" with the NuSEDS field ""POP_ID"""
+POP_ID,NuSEDS,Population ID.
+GFE_ID,NuSEDS,Stream ID (Geo_Feature ID).
+SPECIES,NuSEDS,Species of Fish.
+WATERBODY,NuSEDS,This is the name of the waterbody or portion of a waterbody that bounds the population as shown on any given SEN.
+AREA,NuSEDS,This is the subdistrict. In most cases subdistricts are the same as statistical areas. They mainly differ for streams that eventually drain into the Fraser and for large areas that have been split up and thus have a/b/c... designations. E.g. Statistical area 03 has two subdistricts 3A and 3B.
+Year,NuSEDS,This is the year that the estimate is for. Surveys may have continued into the following calendar year.
+MAX_ESTIMATE,Calculated by PSF,"The maximum value (excluding NAs) among the following fields in NuSEDS: NATURAL_ADULT_SPAWNERS, NATURAL_JACK_SPAWNERS, NATURAL_SPAWNERS_TOTAL, ADULT_BROODSTOCK_REMOVALS, JACK_BROODSTOCK_REMOVALS, TOTAL_BROODSTOCK_REMOVALS, OTHER_REMOVALS, TOTAL_RETURN_TO_RIVER"
+ENUMERATION_METHODS,NuSEDS,"The enumeration method used to observe fish. The first method listed is the primary method. Values are: Bank Walk, Based on Angling Catch, Biologist/Working Group, Boat, Broodstock Removal, Dead Pitch, Electronic Counters, Electroshocking, Enumeration by Hatchery, Fence, Fixed Wing Aircraft, Float, Helicopter, Hydroacoustic Station, Other, Peak Live and Dead Count, Redd Counts, Snorkel, Spot Checks, Stream Walk, Strip Counts, Tag Recovery, Trap, Walk."
+ESTIMATE_CLASSIFICATION,NuSEDS,"This categorizes estimates based on their levels of accuracy and precision (Type-1 are the most accurate, Type-6 the least accurate). There are three other classifications that belong to SENs whose source data were migrated from the regional MSAccess SILBC16 database (definitions extracted from that user manual).
+RELATIVE: CONSTANT MULTI-YEAR METHODS and
+RELATIVE: VARYING MULTI-YEAR METHODS: ""This is the case with survey methods restricted to a fraction of the spawning habitat and/or a fraction of the spawning period. There are various types of relative abundance estimates depending on the survey method, the level of standardization of the methods, and the sampling effort. For our purpose we have retained one type based on between-year consistency of the method where there are two levels.""
+NO SURVEY THIS YEAR: ""stream was not inspected for that species this year"" "
+ESTIMATE_METHOD,NuSEDS,"There are several standard methods to choose from.
+Addition/Subtraction - simple addition or subtraction to provide an estimate. Should be used in conjunction with activity types Adjustment/Calibration and Summary observations. E.g. a population aggregate, the sum of two or more populations, would require the linking of two or more SENs and straight summation of the estimates.
+Multiplication/Division - simple multiplication or division to summary estimates. This method should be used in conjunction with activity type Adjustment/Calibration. E.g. An annual estimate that was arrived at by Peak Live Plus Dead analysis can be adjusted by some factor to make it equivalent to a Time Series estimate that uses AUC calculations.
+Area Under the Curve - Combining a series of point estimates for abundance to create an estimate for the annual abundance. This is done by determining the total area under a curve of abundance by time then dividing by the survey life (the average length of time that an individual is available to be observed alive i.e. is still within the survey area and is not dead).
+Peak Live Plus Dead - Examine point estimates for abundance, determine the survey when the maximum live count observed; sum the live and dead counts for that survey to create the annual estimate.
+Peak Live Plus Cumulative Dead - Examine point estimates for abundance, determine the survey when the maximum live count observed. Sum the live count for that survey with the cumulative total of the dead counts prior to and including that survey to create the annual estimate.
+Fixed Site Census - Combining one or more raw observations into a single estimate (e.g. add all daily fence observation SIL to create a single annual estimate).
+Mark and Recapture - Petersen - Use capture and re-capture SIL data to determine an abundance estimate with the Petersen calculation.
+Mark and Recapture - Jolly-Seber - Use capture and re-capture SIL data to determine an abundance estimate with the Jolly-Seber calculation.
+Redd Count - Using counts of redds from SILs and multiplied by a factor such as 2.
+Lake Expansion - expanding the dead recoveries by the recovery effort
+Cumulative New - N/A
+"
+GAZETTED_NAME,NuSEDS,Provincially recognized name for the waterbody.
+LOCAL_NAME_1,NuSEDS,Commonly known name for the waterbody.
+LOCAL_NAME_2,NuSEDS,Second most commonly known name for the waterbody.
+ADULT_PRESENCE,NuSEDS,"Values are present if adults were observed, none observed if no adults were observed during the stream inspections, not inspected if adults were not looked for, unknown if it is not known whether adults were observed during inspections or not."
+JACK_PRESENCE,NuSEDS,"Values are present if jacks were observed, none observed if no jacks were observed during the stream inspections, not inspected if jacks were not looked for, unknown if it is not known whether jacks were observed during inspections or not."
+SPECIES_QUALIFIED,NuSEDS: conservation unit system sites,"This is a Conservation Unit acronym used to describe the species of salmon for which the escapement estimate is for, eg: CK - Chinook Salmon CM - Chum Salmon CO - Coho Salmon PKE - Even Year Pink Salmon PKO - Odd Year Pink Salmon SEL - Lake Type Sockeye Salmon SER - River or Ocean Type Sockeye Salmon "
+CU_NAME,NuSEDS: conservation unit system sites,The assigned name of the Conservation Unit. Note that this name does not identify the species.
+CU_TYPE,NuSEDS: conservation unit system sites," There are currently six CU types, i.e.,Current, Bin, VREQ[Bin], VREQ[Current], VREQ[Extirpated] and Extirpated based upon Blair Holtby's Rev 4.0 Conservation Unit data refresh. Definitions for these CU Types are listed below: CU_TYPE DEFINITION Current CU is extant and is either accepted or has been proposed. Bin Not a CU but a category to hold sites that for some reason are not assigned to a CU. Reasonable uses of the bin category are: a) sites where migratory dropouts are counted that cannot be reliably assigned to CUs; b) sites where transplanted fish are enumerated under the pretense that DFO is recreating an extinct CU; and c) sites where transplanted fish are enumerated that are outside the ecotypical zone of the source fish and where no claim to recreating an extinct CU has been made. VREQ[Bin] VREQ[] : Indicates that there is some doubt about the nature of the CU and validation is required. [Bin]-See definition for Bin. The most common use of the prefix is for sockeye CUs on the central and north coasts that were identified by presumed suitability rather than by actual verified records of persistent presence. The second most common use is for CUs that likely were valid but it is unknown if they persist. Again these are mostly sockeye CUs. VREQ[Current] VREQ[] : Indicates that there is some doubt about the nature of the CU and validation is required. [Current]-See definition for Current. The most common use of the prefix is for sockeye CUs on the central and north coasts that were identified by presumed suitability rather than by actual verified records of persistent presence. The second most common use is for CUs that likely were valid but it is unknown if they persist. Again these are mostly sockeye CUs. VREQ[Extirpated] VREQ[] : Indicates that there is some doubt about the nature of the CU and validation is required. [Extirpated]-See definition for Extirpated. The most common use of the prefix is for sockeye CUs on the central and north coasts that were identified by presumed suitability rather than by actual verified records of persistent presence. The second most common use is for CUs that likely were valid but it is unknown if they persist. Again these are mostly sockeye CUs. Extirpated There are no known sites with fish spawning successfully in the wild and there are no known hatchery sites. "
+FAZ_ACRO,NuSEDS: conservation unit system sites,Acronym of the freshwater adaptive zone
+JAZ_ACRO,NuSEDS: conservation unit system sites,Acronym of the joint adaptive zone
+MAZ_ACRO,NuSEDS: conservation unit system sites,Acronym of the marine adaptive zone
+FULL_CU_IN,NuSEDS: conservation unit system sites,"The full index of the CU including the species qualifier, e.g. CK-01"
+SYSTEM_SITE,NuSEDS: conservation unit system sites,The name of the waterbody. Originally from NUSEDS but not necessarily the same. Name priority was BC gazette > BC provincial alias > DFO alias1 > DFO alias2. Same as Waterbody Name above
+Y_LAT,NuSEDS: conservation unit system sites,"Location of the mouth of the waterbody if flowing, or the centroid if not."
+X_LONGT,NuSEDS: conservation unit system sites,"Location of the mouth of the waterbody if flowing, or the centroid if not."
+IS_INDICATOR,NuSEDS: conservation unit system sites,"Is ""Y"" if this POP_ID has been identified by Area experts as an ""indicator"" population."
+CU_LAT,NuSEDS: conservation unit system sites,Centroid of the CU
+CU_LONGT,NuSEDS: conservation unit system sites,Centroid of the CU
+coordinates_changed,PSF,"Whether the coordinates were manually changed (i.e., ""Y_LAT"" and ""X_LONGT"" != ""latitude"" and ""longitude"")"
+FULL_CU_IN_PSF,PSF,"The updated NuSEDS field ""FULL_CU_IN"" according to PSF's definition of CUs"
+stream_survey_quality,PSF,The quality of the survey based on ESTIMATE_CLASSIFICATION; see https://www.salmonexplorer.ca/methods/analytical-approach.html#spawner-survey-method
+cuid,PSF,"The unique numeric identifier for the CU, as used in PSF's databases"
+cu_name_pse,PSF,"The display name of the CU used by PSF, including in the Pacific Salmon Explorer"
+cu_name_dfo,PSF,The name of the CU used by DFO
+region,PSF,The broad-scale region that the CU is part of in the Pacific Salmon Explorer
+regionid,PSF,The unique numeric identifier for the region
+pointid,PSF,The unique numeric identifier for the site (lat/lon)
+sys_nm,PSF,The site name as displayed in the Pacific Salmon Explorer (derived from SYSTEM_SITE)
+latitude,PSF,The PSF's latitude of spawning survey sites
+longitude,PSF,The PSF's longitude of spawning survey sites
+distance,PSF,Euclidean distance between the NuSEDS' SYSTEM_SITE with Y_LAT and X_LONGT and the PSF's sys_nm with latitude and longitude
+location_new,PSF,Whether SYSTEM_SITE and sys_nm differ
+streamid,PSF,The unique numeric identifier for the population (site and CU combination)
+longitude_final,PSF,Corrected longitude (decimal degrees) based on location name and river network
+latitude_final,PSF,Corrected latitude (decimal degrees) based on location name and river network
+sys_nm_final,PSF,The final sys_nm used
+survey_score,PSF,"Equivalent of ESTIMATE_CLASSIFICATION but with numbers (e.g. 1 for ""TRUE ABUNDANCE (TYPE-1)"")"