... | ... | @@ -3,7 +3,7 @@ |
|
|
- The `irf` variable stands for intermediate reject flag. A value of 2 indicates that the data source or report lacks suitable quality. This data is rejected from the [ship data selection](Workflow/data-selection) in [split_by_type.R](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/split_by_type.R)
|
|
|
|
|
|
- A deck priority is assigned in [`add_dck_priority.R `](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R).
|
|
|
- A deck priority is assigned in [`add_dck_priority.R `](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_dck_priority.R).
|
|
|
|
|
|
A priority is assigned to decks that are known to contain good-quality data. The data expected to be of the best quality are assigned a priority of 1; data with larger priority numbers will be flagged as the worst duplicate if reports are identified as potential matches. These priority values are based on those from [ICOADS](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf).
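
As a rough illustration of how a priority table like this might be applied, the sketch below keeps the report with the lowest priority number in a group of potential matches and flags the rest as worst duplicates. It is written in Python for brevity and is not the logic of `add_dck_priority.R`; the deck numbers, priority values, fallback priority, and the `pick_best` helper are all hypothetical.

```python
# Hypothetical deck -> priority table; 1 marks the data expected to be best.
DCK_PRIORITY = {101: 1, 202: 2, 303: 3}
DEFAULT_PRIORITY = 99  # assumed fallback for decks without an assigned priority


def add_dck_priority(reports):
    """Return copies of the reports with a 'priority' field added."""
    return [
        {**r, "priority": DCK_PRIORITY.get(r["dck"], DEFAULT_PRIORITY)}
        for r in reports
    ]


def pick_best(matches):
    """Split potential matches into (best, worst) by ascending priority."""
    ranked = sorted(add_dck_priority(matches), key=lambda r: r["priority"])
    return ranked[0], ranked[1:]


# Three reports identified as potential matches (made-up values).
matches = [
    {"uid": "A", "dck": 303},
    {"uid": "B", "dck": 101},
    {"uid": "C", "dck": 555},
]
best, worst = pick_best(matches)
```

Here `best` is the report from the deck with priority 1, and the remaining reports would be flagged as worst duplicates.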
|
|
|
|
... | ... | @@ -17,7 +17,7 @@ year and/or `sid` can be found in Table 5 of the [technical report](https://git. |
|
|
|
|
|
Having precision criteria not only performs quality control on the data but also helps to set tolerances when comparing variables from suspected duplicates.
|
|
|
|
|
|
A comparison of climatological variables allows for a match between reports in the [duplicate identification](Workflow/duplicate-indentification) procedure done by [`new_get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_get_pairs.R).
|
|
|
A comparison of climatological variables allows for a match between reports in the [duplicate identification](Workflow/duplicate-indentification) procedure done by [`get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/get_pairs.R).
|
|
|
|
|
|
**Met Office track check**
|
|
|
-----------------------------------
|
... | ... | @@ -26,4 +26,4 @@ Several `dck` have `id`'s that indicate a logbook, sheet or other block of data |
|
|
|
|
|
The linked `id`'s are checked using the Met Office Quality Control track check (MOQC track check from [IMMA](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/How-to-install#install-dependencies-with-conda-all-platforms)) as well as for time duplicates.
|
|
|
|
|
|
Reports that fail the track check are flagged as the worst duplicate. Where positions (lat, lon) are similar, the best duplicate is selected by `dck` priority and the number of quality variables found by the climatological check. The track check is also performed in the same [`new_get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_get_pairs.R) script. |
|
|
\ No newline at end of file |
|
|
Reports that fail the track check are flagged as the worst duplicate. Where positions (lat, lon) are similar, the best duplicate is selected by `dck` priority and the number of quality variables found by the climatological check. The track check is also performed in the same [`get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/get_pairs.R) script. |
|
|
\ No newline at end of file |
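
The selection rule for linked reports described above could be sketched as a single sort key. This is a minimal Python illustration, not the code in the pairing script: the field names (`track_fail`, `priority`, `n_good_vars`) are hypothetical, and ranking deck priority ahead of the number of quality variables is an assumption, since the text does not say which criterion takes precedence.

```python
def rank_key(report):
    """Sort key: track-check passes first, then deck priority, then variable count."""
    return (
        report["track_fail"],    # False sorts before True
        report["priority"],      # 1 is the best deck priority
        -report["n_good_vars"],  # more variables passing the climatological check is better
    )


def select_best(duplicates):
    """Return (best, worst) for a group of duplicates at similar positions."""
    ranked = sorted(duplicates, key=rank_key)
    return ranked[0], ranked[1:]


# Made-up group of suspected duplicates.
group = [
    {"uid": "A", "track_fail": True, "priority": 1, "n_good_vars": 5},
    {"uid": "B", "track_fail": False, "priority": 2, "n_good_vars": 3},
    {"uid": "C", "track_fail": False, "priority": 2, "n_good_vars": 4},
]
best, worst = select_best(group)
```

In this example report `A` is ranked last despite its deck priority because it fails the track check, and `C` is preferred over `B` because it has more quality variables.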