How to indicate multiple SENSOR or PARAMETER #81

Open
apswong opened this issue Nov 8, 2023 · 18 comments
Labels: admt approved, documentation, priority, R03, R25

Comments

@apswong

apswong commented Nov 8, 2023

This issue is somewhat related to R03 and R25:

When a float carries multiple sensors of the same type and thus generates multiple values for the same parameter, the current practice is to indicate the additional sensors/parameters by placing an integer "n" at the end of the char string. For example, if a float carries three oxygen sensors, then the three parameters are DOXY, DOXY2, DOXY3.

This practice is problematic when the char string ends in a numeral, e.g. DOWN_IRRADIANCE412.

What are the possible solutions? Many old files have already been written with the old practice.

@HCBScienceProducts

HCBScienceProducts commented Nov 8, 2023 via email

@SBS-EREHM

SBS-EREHM commented Nov 8, 2023

FYI, this issue is a duplicate of #115

@HCBScienceProducts' underscore trick works, I think, as long as we agree to never ever make a SENSOR (sensor type) that ends in _<nnn>, i.e., the _<nnn> suffix is reserved for identical sensor instance nnn >= 2.

@vpaba
Contributor

vpaba commented Apr 17, 2024

@HCBScienceProducts should this recommendation (for duplicate SENSOR) be added to the WIP manual, if not there already? @mscanderbeg

Otherwise, the issue can be closed

@apswong
Author

apswong commented Apr 17, 2024

@vpaba @tcarval
This recommendation (use an underscore to separate the integer n when the parameter string ends with a number) needs to be agreed on, then entered into the Argo User's Manual and the GDAC File Checker.

@richardsc

With full acknowledgment that what I'm about to suggest could have retroactive consequences for all previous parameters that didn't end with a numeral, for consistency of implementation going forward the proposal could be to always use an underscore to separate additional sensors, e.g. BBP700_2 or DOXY_2. This eliminates the uncertainty over whether a numeral is part of the parameter name or not (but would potentially require a lot of changing of old names ...).

@apswong
Author

apswong commented Apr 23, 2024

I agree with @richardsc. I prefer Clark's solution: going forward, the naming convention is to always use an underscore to separate additional sensors. Old files that contain the old naming convention (without the underscore) can remain on the GDACs, but new files should use the new naming convention.

@SBS-EREHM

SBS-EREHM commented Apr 23, 2024

With full acknowledgment that what I'm about to suggest could have retroactive consequences for all previous parameters that didn't end with a numeral, for consistency of implementation going forward the proposal could be to always use an underscore to separate additional sensors, e.g. BBP700_2 or DOXY_2. This eliminates the uncertainty over whether a numeral is part of the parameter name or not (but would potentially require a lot of changing of old names ...).

I completely agree with @richardsc. Currently, in the proposed JSON metadata schema, NVS controlled vocabulary terms (including those for SENSOR or PARAMETER) are specified by SDN URI, e.g., SDN:R25::CTD_PRES or SDN:R03::PRES. Parsing and validating controlled terms becomes problematic when there are multiple instances of the same SENSOR or PARAMETER. Imagine a float deployed with both MCOMS and RBR Tridente FLBBCD sensors, each with a BBP532 channel. In the current scheme we would have:

  • SDN:R25::BACKSCATTERINGMETER_BBP532, SDN:R03::BETA_BACKSCATTERING532
  • SDN:R25::BACKSCATTERINGMETER_BBP5322, SDN:R03::BETA_BACKSCATTERING5322

To properly validate the second sensor's or parameter's controlled term, we need to

  1. Understand that this is a sensor with a wavelength in its controlled term
  2. Parse the URI for the wavelength
  3. Have "inside knowledge" that the wavelength is only 3 digits
  4. Then remove the 4th digit
  5. Then validate the controlled term (SDN:R25::BACKSCATTERINGMETER_BBP532, SDN:R03::BETA_BACKSCATTERING532) using the NVS controlled vocabulary

@richardsc's proposal would make it a bit easier, assuming there is NEVER EVER an NVS controlled term that ends in _N where N=2,3,4,5,...

  • SDN:R25::BACKSCATTERINGMETER_BBP532_2, SDN:R03::BETA_BACKSCATTERING532_2

This is still problematic. Many R25 and R03 entries contain an underscore, so the parsing rule has to look for a final underscore followed by a (single?) digit, e.g., a messy regular expression match* with (in this case) the first capture group capturing the controlled term and the fourth capture group capturing the instance number:

^SDN:R25::((?:([A-Z]+[0-9]*)*(?:_[A-Z]+[0-9]*)))?(_(\d))?$

I propose something easier. We use a double underscore __N that easily and unambiguously delimits the NVS controlled term (SDN:R25::BACKSCATTERINGMETER_BBP532, SDN:R03::BETA_BACKSCATTERING532) from the instance number (N). So I can simply look for a double underscore.

If there is no double underscore, the NVS term is ready to go:

  • SDN:R25::BACKSCATTERINGMETER_BBP532, SDN:R03::BETA_BACKSCATTERING532

If a double underscore is found, then parse everything to the left as the NVS controlled term and everything to the right as the instance number:

  • SDN:R25::BACKSCATTERINGMETER_BBP532__2, SDN:R03::BETA_BACKSCATTERING532__2

Comments?

* I'm sure someone can write a more compact regex than this, but the example does work.
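For illustration, the double-underscore rule proposed above could be sketched in Python roughly as follows. This is only a sketch: the function name split_instance is hypothetical and not part of any Argo or NVS tool, and it assumes that no NVS controlled term ever contains "__" itself and that a term without a suffix counts as instance 1.

```python
def split_instance(term: str) -> tuple[str, int]:
    """Split 'TERM__N' into (TERM, N) under the proposed double-underscore rule.

    Assumptions (not confirmed by any specification):
    - a controlled term never contains '__' on its own;
    - a term without a '__N' suffix is the first instance (N = 1).
    """
    base, sep, suffix = term.rpartition("__")
    if not sep:
        # No double underscore: the whole string is the controlled term.
        return term, 1
    if not base or not (suffix.isascii() and suffix.isdigit()):
        raise ValueError(f"malformed instance suffix in {term!r}")
    return base, int(suffix)


if __name__ == "__main__":
    for name in ("BACKSCATTERINGMETER_BBP532", "BACKSCATTERINGMETER_BBP532__2"):
        print(name, "->", split_instance(name))
```

Because the function only looks for the last "__", it can be applied to a bare term or to a full SDN URI string; the left-hand part can then be validated against the NVS vocabulary unchanged.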

@HCBScienceProducts

@SBS-EREHM: Please check on the previous posts. The current scheme is to add an underscore if the name ends in a numeral, so your given examples aren't correct.

The current practice is that:

When a float carries multiple sensors of the same type and thus generates multiple values for the same parameter, the additional sensors/parameters are indicated by placing an integer "n" at the end of the char string. For example, if a float carries three oxygen sensors, then the three parameters are DOXY, DOXY2, DOXY3.

When the parameter string ends with a numeral because the parameter string contains a wavelength indication <nnn>, e.g., BBP<nnn>, then the integer "n" at the end is separated by an underscore. E.g. BBP700 for the first and BBP700_2 for a second BBP700 parameter.

(to be agreed on or modified by the current discussion)

Parsing the current practice is (unambiguously) feasible: By starting from the back :-)

  • How many numerals are at the end of the string?
    • 0 -> standard parameter name, e.g., DOXY
    • 1+ -> Are they preceded by a single underscore?
      • Yes -> numerals indicate the parameter number; everything before the underscore is a standard parameter name (with a wavelength <nnn> at the end), e.g., BBP700_2
      • No -> Are there fewer than 3 digits?
        • Yes -> numerals indicate the parameter number; everything before the numerals is a standard parameter name (without a wavelength <nnn> at the end), e.g., DOXY2
        • No -> numerals indicate a wavelength <nnn>; everything is part of a standard parameter name, e.g., BBP700

This parsing implicitly assumes for the current practice that:

  1. Parameter names end with numerals only to indicate a wavelength <nnn>, which is always 3(+) numerical digits
  2. There are at most 99 replicate parameter names for parameters that don't end with a numeral, i.e., DOXY, DOXY2, DOXY3, ... DOXY99

Number (1.) is presently true. Number (2.) seems a reasonable assumption to me.
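The decision tree above can be sketched in Python as follows. This is a minimal illustration, not an official parser: the function name parse_current_practice is hypothetical, and the code relies on exactly the two assumptions listed above (wavelengths have 3 or more digits, replicate numbers have at most 2). Names read from a meta file would need their padding spaces stripped first.

```python
def parse_current_practice(name: str) -> tuple[str, int]:
    """Return (standard_name, instance_number) for a PARAMETER or SENSOR string.

    Works from the back of the string, following the decision tree:
    - no trailing digits            -> standard name, instance 1 (DOXY)
    - digits after an underscore    -> instance number (BBP700_2)
    - 1-2 digits, no underscore     -> instance number (DOXY2)
    - 3+ digits, no underscore      -> wavelength, instance 1 (BBP700)
    """
    stripped = name.rstrip("0123456789")
    digits = name[len(stripped):]
    if not digits:
        return name, 1
    if stripped.endswith("_"):
        return stripped[:-1], int(digits)
    if len(digits) < 3:
        return stripped, int(digits)
    return name, 1


if __name__ == "__main__":
    for raw in ("DOXY ", "DOXY2 ", "BBP700 ", "BBP700_2 "):
        # Meta-file strings are space-padded, so strip before parsing.
        print(repr(raw), "->", parse_current_practice(raw.strip()))
```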

@apswong: While I see the merit of @richardsc's suggestion to simplify the decision tree sketched above by always adding a single underscore, I would prefer a solution that gives a consistent Argo data set throughout, not one with different practices for different floats over time.
So I'd be in favour of either formalizing and keeping the current practice, or opting for @richardsc's suggestion to always add an underscore but then requiring existing files to be modified accordingly.

I also want to point out that CSIRO floats 1901348 and 5905165/5905395 carry both ECO_BB3 and MCOMS_FLBBCD sensors, each with a BBP700 channel. They give an example of how things look (and work) at present.

Examples from the current 1901348_meta.nc file (highlights in bold are by me):

PARAMETER =
"TEMP ",
"PSAL ",
"PRES ",
"DOXY ",
"TEMP_DOXY ",
"PHASE_DELAY_DOXY ",
"TEMP_VOLTAGE_DOXY ",
"CHLA ",
"FLUORESCENCE_CHLA ",
"BBP700 ",
"BETA_BACKSCATTERING700 ",
"TEMP_CPU_CHLA ",
"FLUORESCENCE_CDOM ",
"CDOM ",
"TRANSMITTANCE_PARTICLE_BEAM_ATTENUATION660 ",
"CP660 ",
"BBP700_2 ",
"BETA_BACKSCATTERING700_2 ",
"BBP532 ",
"BETA_BACKSCATTERING532 ",
"BBP470 ",
"BETA_BACKSCATTERING470 ",
"DOWN_IRRADIANCE412 ",
"RAW_DOWNWELLING_IRRADIANCE412 ",
"DOWN_IRRADIANCE443 ",
"RAW_DOWNWELLING_IRRADIANCE443 ",
"DOWN_IRRADIANCE490 ",
"RAW_DOWNWELLING_IRRADIANCE490 ",
"DOWN_IRRADIANCE555 ",
"RAW_DOWNWELLING_IRRADIANCE555 ",
"UP_RADIANCE412 ",
"RAW_UPWELLING_RADIANCE412 ",
"UP_RADIANCE443 ",
"RAW_UPWELLING_RADIANCE443 ",
"UP_RADIANCE490 ",
"RAW_UPWELLING_RADIANCE490 ",
"UP_RADIANCE555 ",
"RAW_UPWELLING_RADIANCE555 " ;
}

and correspondingly

PARAMETER_SENSOR =
"CTD_TEMP ",
"CTD_CNDC ",
"CTD_PRES ",
"OPTODE_DOXY ",
"OPTODE_DOXY ",
"OPTODE_DOXY ",
"OPTODE_DOXY ",
"FLUOROMETER_CHLA ",
"FLUOROMETER_CHLA ",
"BACKSCATTERINGMETER_BBP700 ",
"BACKSCATTERINGMETER_BBP700 ",
"BACKSCATTERINGMETER_BBP700 ",
"FLUOROMETER_CDOM ",
"FLUOROMETER_CDOM ",
"TRANSMISSOMETER_CP660 ",
"TRANSMISSOMETER_CP660 ",
"BACKSCATTERINGMETER_BBP700_2 ",
"BACKSCATTERINGMETER_BBP700_2 ",
"BACKSCATTERINGMETER_BBP532 ",
"BACKSCATTERINGMETER_BBP532 ",
"BACKSCATTERINGMETER_BBP470 ",
"BACKSCATTERINGMETER_BBP470 ",
"RADIOMETER_DOWN_IRR412 ",
"RADIOMETER_DOWN_IRR412 ",
"RADIOMETER_DOWN_IRR443 ",
"RADIOMETER_DOWN_IRR443 ",
"RADIOMETER_DOWN_IRR490 ",
"RADIOMETER_DOWN_IRR490 ",
"RADIOMETER_DOWN_IRR555 ",
"RADIOMETER_DOWN_IRR555 ",
"RADIOMETER_UP_RAD ",
"RADIOMETER_UP_RAD412 ",
"RADIOMETER_UP_RAD443 ",
"RADIOMETER_UP_RAD443 ",
"RADIOMETER_UP_RAD490 ",
"RADIOMETER_UP_RAD490 ",
"RADIOMETER_UP_RAD555 ",
"RADIOMETER_UP_RAD555 " ;
}

(There's a cookie to win for the first one to find the one entry that's off in the 1901348 meta entries ;-) )

@catsch

catsch commented Sep 13, 2024

PARAMETER_SENSOR =
"CTD_TEMP ",
"CTD_CNDC ",
"CTD_PRES ",
"OPTODE_DOXY ",
"OPTODE_DOXY ",
"OPTODE_DOXY ",
"OPTODE_DOXY ",
"FLUOROMETER_CHLA ",
"FLUOROMETER_CHLA ",
"BACKSCATTERINGMETER_BBP700 ",
"BACKSCATTERINGMETER_BBP700 ",
"BACKSCATTERINGMETER_BBP700 ",
"FLUOROMETER_CDOM ",
"FLUOROMETER_CDOM ",
"TRANSMISSOMETER_CP660 ",
"TRANSMISSOMETER_CP660 ",
"BACKSCATTERINGMETER_BBP700_2 ",
"BACKSCATTERINGMETER_BBP700_2 ",
"BACKSCATTERINGMETER_BBP532 ",
"BACKSCATTERINGMETER_BBP532 ",
"BACKSCATTERINGMETER_BBP470 ",
"BACKSCATTERINGMETER_BBP470 ",
"RADIOMETER_DOWN_IRR412 ",
"RADIOMETER_DOWN_IRR412 ",
"RADIOMETER_DOWN_IRR443 ",
"RADIOMETER_DOWN_IRR443 ",
"RADIOMETER_DOWN_IRR490 ",
"RADIOMETER_DOWN_IRR490 ",
"RADIOMETER_DOWN_IRR555 ",
"RADIOMETER_DOWN_IRR555 ",
"RADIOMETER_UP_RAD ",
"RADIOMETER_UP_RAD412 ",
"RADIOMETER_UP_RAD443 ",
"RADIOMETER_UP_RAD443 ",
"RADIOMETER_UP_RAD490 ",
"RADIOMETER_UP_RAD490 ",
"RADIOMETER_UP_RAD555 ",
"RADIOMETER_UP_RAD555 " ;
}

I want the cookie,

So on my side, I'd be in favour of formalizing and keeping the current practice, as I am not really happy about modifying the existing files.

@tcarval
Contributor

tcarval commented Oct 11, 2024

Here is the current rule documented in Argo user's manual §3.3.1 Parameters from duplicate sensors.

3.3.1 Parameters from duplicate sensors
Some floats are equipped with 2 different sensors, measuring the same physical parameter. In that case, add the integer "2" at the end of the code of the duplicate parameter (e.g. DOXY2).
If more sensors that measure the same physical parameter are added, then the integer will simply increase by 1 (i.e. DOXY3, DOXY4, and so on).

Do I replace that chapter with:

When a float carries multiple sensors of the same type and thus generates multiple values for the same parameter, the additional sensors/parameters are indicated by placing an integer "n" at the end of the char string. For example, if a float carries three oxygen sensors, then the three parameters are DOXY, DOXY2, DOXY3.
When the parameter string ends with a numeral because the parameter string contains a wavelength indication <nnn>, e.g., BBP<nnn>, then the integer "n" at the end is separated by an underscore. E.g. BBP700 for the first and BBP700_2 for a second BBP700 parameter.

@catsch

catsch commented Oct 11, 2024

Yes, thanks Thierry

@danibodc danibodc added the non-NVS For issues not strictly related to NVS collections updates or new collections requests label Oct 11, 2024
@apswong
Author

apswong commented Oct 11, 2024

@tcarval @catsch The option suggested by @richardsc above (to always add an underscore) should be presented and discussed at the ADMT meeting. We can then record whatever the agreed option is in the Users Manual.

Option 1: Always add an underscore
Option 2: Only add an underscore when the PARAM char string ends with a numeral

@tcarval tcarval pinned this issue Oct 11, 2024
@tcarval
Contributor

tcarval commented Oct 11, 2024

@tcarval @catsch The option suggested by @richardsc above (to always add an underscore) should be presented and discussed at the ADMT meeting. We can then record whatever the agreed option is in the Users Manual.

Option 1: Always add an underscore
Option 2: Only add an underscore when the PARAM char string ends with a numeral

@richardsc's option to always add an underscore is for sure the best. But it implies reprocessing all floats that have duplicate sensors, which is not a small effort.

@catsch

catsch commented Oct 12, 2024

Yes, I think the clear benefits of changing how it works now should be highlighted in the discussion (because it means some reprocessing),
while just documenting the current behaviour in the manual is straightforward.

@delphinedobler delphinedobler unpinned this issue Oct 23, 2024
@tcarval tcarval added priority priority topic documentation Improvements or additions to documentation admt approved and removed non-NVS For issues not strictly related to NVS collections updates or new collections requests labels Oct 23, 2024
@tcarval
Contributor

tcarval commented Oct 25, 2024

ADMT-25 decision: Option 1: Always add an underscore

@HCBScienceProducts

  • Reprocess all DAC files with duplicate sensors so that they always use an underscore

@apswong
Author

apswong commented Oct 25, 2024

The reprocessing of DAC files should include meta files where SENSOR information is recorded.
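As a rough illustration of what the renaming step in such a reprocessing could look like, here is a Python sketch that maps a legacy name to the "always add an underscore" form. It is not an official tool: the function name to_always_underscore is hypothetical, and it assumes, as discussed earlier in this thread, that a trailing run of 1 or 2 digits without an underscore is a replicate number while 3 or more digits are a wavelength. Any name that breaks these assumptions would need manual review.

```python
def to_always_underscore(name: str) -> str:
    """Rename a legacy duplicate name (DOXY2) to the Option 1 form (DOXY_2).

    Assumptions (taken from the discussion, not from a specification):
    - 1-2 trailing digits with no underscore mark a replicate (DOXY2);
    - 3+ trailing digits are a wavelength and are left alone (BBP700);
    - names already using an underscore are left alone (BBP700_2).
    """
    stripped = name.rstrip("0123456789")
    digits = name[len(stripped):]
    if digits and len(digits) < 3 and not stripped.endswith("_"):
        return f"{stripped}_{digits}"
    return name


if __name__ == "__main__":
    for old in ("DOXY", "DOXY2", "DOXY3", "BBP700", "BBP700_2"):
        print(old, "->", to_always_underscore(old))
```

The same function could be applied to both PARAMETER and SENSOR strings, since the meta files record the duplicate suffix in both places.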

8 participants