... | ... | @@ -14,11 +14,11 @@ Script | Description |
|
|
[`split_by_type.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/split_by_type.R) | Splits the Rda versions of the ICOADS R3 Total files by platform type (ship, moored buoy, etc.), adding or correcting the PT flag when necessary. Records are separated into the different platform types following the [selection criteria](Workflow/data-selection).
|
|
|
[`simple_dup.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/simple_dup.R) | Identifies duplicate records between the ship data and the other platform types, i.e. records with a matching `date`, `time` and position `(lat, lon)`. The script finds reports in the ship output files that are also contained in the output files for platforms, coastal data, moored buoys or drifters.
|
|
|
[`ship2plat.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/ship2plat.R) | Assesses potential matches and identifies reports that are "not ship". Uses additional information from WMO Publication 47 to flag non-ship data ([Kent *et al.*, 2007](https://doi.org/10.1175/JTECH1949.1)). These reports are then written to the directory MFILES_NOTSHIP to be excluded from further SHIP processing.
|
|
|
[`process_ids.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/process_ids.R) | Adds various ID information and flags to the ICOADS ship data. Reformats ship `id`'s coming from different data sources (`sid`) to enable linking of data from the same ship (same ship name) across different `dck`'s, and to enable linking to metadata in WMO Publication No. 47. More information in [Processing of `id`'s](Workflow/processing-of-ids).
|
|
|
[`process_shipdata.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/process_shipdata.R) | Corrects date and time errors noticed in some `dck`'s. These largely arise from confusion over historical definitions of the marine day and conversions between local time and UTC.
|
|
|
[`new_get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_get_pairs.R) | Duplicate record identification within the ship data. Pairs the reports as duplicate if they have associated ship `id`'s. The candidate pairs are then selected according to i) the number of matching elements (similar content of variables within a specific tolerance), ii) the `dck`'s, and iii) a comparison of the `id`'s.
|
|
|
[`new_get_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_get_dups.R) | Counts the number of duplicated records and flags the best according to the [quality control criteria](Workflow/quality-control). Groups duplicated records by common `callsigns`.
|
|
|
[`new_merge_ids_year.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_merge_ids_year.R) | Links `id`'s into classes. Ship tracks of the linked `id`'s are then checked. Reports that fail the track check are flagged as the worst duplicate. Uses a ship tracking algorithm from [**r-imma**.](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/How-to-install#install-dependencies-with-conda-all-platforms)
|
|
|
[`process_ships.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/process_ships.R) | Adds various ID information and flags to the ICOADS ship data. Reformats ship `id`'s coming from different data sources (`sid`) to enable linking of data from the same ship (same ship name) across different `dck`'s, and to enable linking to metadata in WMO Publication No. 47. More information in [Processing of `id`'s](Workflow/processing-of-ids). Also corrects date and time errors noticed in some `dck`'s. These largely arise from confusion over historical definitions of the marine day and conversions between local time and UTC.
|
|
|
[`get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/get_pairs.R) | Duplicate record identification within the ship data. Pairs the reports as duplicate if they have associated ship `id`'s. The candidate pairs are then selected according to i) the number of matching elements (similar content of variables within a specific tolerance), ii) the `dck`'s, and iii) a comparison of the `id`'s.
|
|
|
[`get_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/get_dups.R) | Counts the number of duplicated records and flags the best according to the [quality control criteria](Workflow/quality-control). Groups duplicated records by common `callsigns`.
|
|
|
[`merge_ids_year.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/merge_ids_year.R) | Links `id`'s into classes. Ship tracks of the linked `id`'s are then checked. Reports that fail the track check are flagged as the worst duplicate. Uses a ship tracking algorithm from [**r-imma**.](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/How-to-install#install-dependencies-with-conda-all-platforms)
|
|
|
[`nrt_dup.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/nrt_dup.R) | ---.
|
|
|
[`clean_data.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/clean_data.R) | ---.
|
|
|
[`clean2track.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/clean2track.R) | Forms ship tracks for linked `id`'s. --.
|
|
|
[`remove_output_files.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/remove_output_files.R) | Cleans `output_data/output_files` before re-running.
|
... | ... | @@ -30,30 +30,28 @@ The following functions are in alphabetical order. |
|
|
|
|
|
Function | Description
|
|
|
-------- |:------------
|
|
|
[`add_date2.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_date2.R) | Adds a date variable to the data frame based on `yr`, `mo`, `dy`, `hr`. <br> Invalid values are set to missing. For reports with a valid date but a missing hour, the code generates a date variable with the hour set to local noon. A **date.flag** is also added: <br><br> `0 = valid date & time`<br> `1 = invalid date or time (time not missing)` <br> `2 = valid date, hr missing, 12 local added`
|
|
|
[`add_dck_priority.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R) | Function used to identify duplicates. During the identification procedure, priorities are assigned to the data from each `dck` (see [source](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R)). Data expected to be of best quality are assigned a priority of 1; data with larger priority numbers will be flagged as the worst duplicate if identified as a potential match. Priority values are based on those from [ICOADS.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf)
|
|
|
[`add_ID_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_ID_class.R) | Adds a specific `id` class to the report. Based on the [ICOADS doc.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf) and other earlier sources, the `id`'s have been classified according to their "`id` type" (logbook number, ship number, etc.) and validity. Where an [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign is found, the country for that callsign is also identified.
|
|
|
[`add_shipnames.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_shipnames.R) | Adds ship names according to `dck` numbers.
|
|
|
[`assess_match.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/assess_match.R) | Finds maximum differences between reports across variables of interest (`sst, slp, at, dpt, wbt, rh, w, d, ww, n, vv, wh, nh, w1`). <br> Flags delayed mode report(s) with an older [IMMT](https://icoads.noaa.gov/immt5.html) number as worst duplicate(s).
|
|
|
`find_gap_func.R` | -------
|
|
|
`fix_usmm_705to707cors_func.R` | Called in `add_shipnames.R` but never used. Should it be deleted from the repository?
|
|
|
[`flag_id_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/flag_id_dups.R) | Adds a flag to reports that fail the [IMMA](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/how-to-install#imma-toolbox) ship track check. Such reports are flagged as the worst duplicate.
|
|
|
`get_gap_pos.R` | -----
|
|
|
[`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_groups.R) | Assigns a group number to a specific type of `dck` or `dck`'s. The group is appended to the record `id` to avoid mixing data with the same `id` from different types of data (e.g. Japanese data, UK Navy).
|
|
|
[`get_id_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_id_class.R) | Used in `add_ID_class.R`. <br> Assigns an `id` class to callsigns listed in metadata of Pub 47 (**Not sure**).
|
|
|
`get_isolated.R` Should we only have gcd.slc alone? | Contains function [gcd.slc](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_isolated.R#L10): Calculates the geodesic distance between two points specified by radian latitude/longitude using the spherical law of cosines ([slc](https://en.wikipedia.org/wiki/Spherical_law_of_cosines)).
|
|
|
[`get_itu_country.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_itu_country.R) | Gets the [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign prefix associated with a country.
|
|
|
`get_matchedids.R` | ------
|
|
|
`get_mismatch.R` | -------
|
|
|
[`get_prec.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_prec.R) | Gets the `dck`/`sid` [precision of measurement](Workflow/quality-control) for selected variables (`sst`,`slp`,`at`,`dpt`,`w`,`d`,`n`,`ww`,`w1`,`vv`).
|
|
|
`get_speed.R` Do we need this? Only `gcd.slc` seems to get used. | Also contains function [gcd.slc](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_isolated.R#L10): Calculates the geodesic distance between two points specified by radian latitude/longitude using the spherical law of cosines ([slc](https://en.wikipedia.org/wiki/Spherical_law_of_cosines)).
|
|
|
[`id_group_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/id_group_func.R) | Makes sure that all `id`'s grouped in [`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_groups.R) are properly associated.
|
|
|
[`liz_merge.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/liz_merge.R) | Merges two data frames by specific columns.
|
|
|
[`new_add_match_id.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/new_add_match_id.R) | Tests whether paired `id`'s are allowed to match according to the [matching criteria](Workflow/matching-criteria) and the [Damerau–Levenshtein (DL) distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance).
|
|
|
[`new_homog_ids.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/new_homog_ids.R) | Uses standard linkages between `id`'s to add `id` [homogenisation](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/Workflow/processing-of-ids#homogenisation) information to each record.
|
|
|
[`print_id_match_info.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/print_id_match_info.R) | Prints ICOADS duplicates information and `id` matching results.
|
|
|
[`read_rdsfiles.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/read_rdsfiles.R) | A collection of functions to read the different types of data files (e.g. `.rds`, `.txt`) used throughout the code.
|
|
|
[`write_dup_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/write_dup_func.R) | Writes duplicate information in a pipe-delimited year-month output format. The report `uid` is followed by the report `id` then a flag with value 1 if the `id` has been changed, 0 if it remains the same. <br> An example of the format is: <br> ICOADS-30-0Y0HJK \| 32024 \| 0 <br> ICOADS-30-0Y0HJL \| 14 00117 \| 1
|
|
|
[`add_ID_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_ID_class.R) | Adds a specific `id` class to the report. Based on the [ICOADS doc.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf) and other earlier sources, the `id`'s have been classified according to their "`id` type" (logbook number, ship number, etc.) and validity. Where an [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign is found, the country for that callsign is also identified.
|
|
|
[`add_date2.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_date2.R) | Adds a date variable to the data frame based on `yr`, `mo`, `dy`, `hr`. <br> Invalid values are set to missing. For reports with a valid date but a missing hour, the code generates a date variable with the hour set to local noon. A **date.flag** is also added: <br><br> `0 = valid date & time`<br> `1 = invalid date or time (time not missing)` <br> `2 = valid date, hr missing, 12 local added`
|
|
|
[`add_dck_priority.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_dck_priority.R) | Function used to identify duplicates. During the identification procedure, priorities are assigned to the data from each `dck` (see [source](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R)). Data expected to be of best quality are assigned a priority of 1; data with larger priority numbers will be flagged as the worst duplicate if identified as a potential match. Priority values are based on those from [ICOADS.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf)
|
|
|
[`add_match_id.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_match_id.R) | Tests whether paired `id`'s are allowed to match according to the [matching criteria](Workflow/matching-criteria) and the [Damerau–Levenshtein (DL) distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance).
|
|
|
[`add_shipnames.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_shipnames.R) | Adds ship names according to `dck` numbers.
|
|
|
[`find_gap_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/find_gap_func.R) | -------
|
|
|
[`flag_id_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/flag_id_dups.R) | Adds a flag to reports that fail the [IMMA](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/how-to-install#imma-toolbox) ship track check. Such reports are flagged as the worst duplicate.
|
|
|
[`get_gap_pos.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_gap_pos.R) | -----
|
|
|
[`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_groups.R) | Assigns a group number to a specific type of `dck` or `dck`'s. The group is appended to the record `id` to avoid mixing data with the same `id` from different types of data (e.g. Japanese data, UK Navy).
|
|
|
[`get_id_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_id_class.R) | Used in `add_ID_class.R`. <br> Assigns an `id` class to callsigns listed in metadata of Pub 47 (**Not sure**).
|
|
|
[`get_itu_country.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_itu_country.R) | Gets the [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign prefix associated with a country.
|
|
|
[`get_matchedids.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_matched_ids.R) | ------
|
|
|
[`get_mis.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_mis.R) | Selects candidate pairs of duplicate reports according to the number of matching elements (similar content of variables within a specific tolerance).
|
|
|
[`get_prec.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_prec.R) | Gets the `dck`/`sid` [precision of measurement](Workflow/quality-control) for selected variables (`sst`,`slp`,`at`,`dpt`,`w`,`d`,`n`,`ww`,`w1`,`vv`).
|
|
|
[`get_pub47.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_pub47.R) | Function to process monthly files from Pub 47.
|
|
|
[`get_speed.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_speed.R) | Contains several utility functions to convert between radians and degrees, calculate time and speed from coordinates and calculate the geodesic distance between two points specified by radian latitude/longitude using the spherical law of cosines ([slc](https://en.wikipedia.org/wiki/Spherical_law_of_cosines)).
|
|
|
[`homog_ids.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/homog_ids.R) | Uses standard linkages between `id`'s to add `id` [homogenisation](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/Workflow/processing-of-ids#homogenisation) information to each record.
|
|
|
[`id_group_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/id_group_func.R) | Makes sure that all `id`'s grouped in [`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_groups.R) are properly associated.
|
|
|
[`liz_merge.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/liz_merge.R) | Merges two data frames by specific columns.
|
|
|
[`print_id_match_info.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/print_id_match_info.R) | Prints ICOADS duplicates information and `id` matching results.
|
|
|
[`read_rdsfiles.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/read_rdsfiles.R) | A collection of functions to read the different types of data files (e.g. `.rds`, `.txt`) used throughout the code.
|
|
|
[`write_dup_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/write_dup_func.R) | Writes duplicate information in a pipe-delimited year-month output format. The report `uid` is followed by the report `id` then a flag with value 1 if the `id` has been changed, 0 if it remains the same. <br> An example of the format is: <br> ICOADS-30-0Y0HJK \| 32024 \| 0 <br> ICOADS-30-0Y0HJL \| 14 00117 \| 1
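
As a rough illustration of the spherical law of cosines distance listed for `get_speed.R` in the table above, the calculation can be sketched as follows. This is not the code from the repository: the names `deg2rad_sketch` and `gcd_slc_sketch`, the argument order, and the mean Earth radius of 6371 km are assumptions made for this example.

```r
# Hypothetical helper: convert degrees to radians.
deg2rad_sketch <- function(deg) deg * pi / 180

# Great-circle distance (km) via the spherical law of cosines.
# All coordinates are in radians; radius_km = 6371 is an assumed
# mean Earth radius, not a value taken from the repository.
gcd_slc_sketch <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  cos_angle <- sin(lat1) * sin(lat2) +
    cos(lat1) * cos(lat2) * cos(lon2 - lon1)
  # Clamp to [-1, 1] so rounding error cannot make acos() return NaN.
  cos_angle <- pmin(pmax(cos_angle, -1), 1)
  acos(cos_angle) * radius_km
}

# One degree of longitude along the equator (roughly 111 km).
one_degree_km <- gcd_slc_sketch(deg2rad_sketch(0), 0, deg2rad_sketch(1), 0)
print(one_degree_km)
```

Clamping the cosine is a defensive choice for near-identical positions, where floating-point rounding can push the value slightly above 1.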
|
|
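
The `id` matching entry in the table above relies on the Damerau–Levenshtein distance. As a self-contained illustration only (not the code in `add_match_id.R`), the sketch below implements the restricted variant known as optimal string alignment, which counts insertions, deletions, substitutions and transpositions of adjacent characters. The function name `osa_distance` and the example callsigns are hypothetical.

```r
# Optimal string alignment distance (restricted Damerau-Levenshtein).
# Hypothetical illustration; the repository may use a different variant.
osa_distance <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] holds the distance between the first i characters
  # of a and the first j characters of b.
  d <- matrix(0L, n + 1, m + 1)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- as.integer(x[i] != y[j])
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1L,  # deletion
        d[i + 1, j] + 1L,  # insertion
        d[i, j] + cost     # substitution or match
      )
      # Transposition of two adjacent characters.
      if (i > 1 && j > 1 && x[i] == y[j - 1] && x[i - 1] == y[j]) {
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)
      }
    }
  }
  d[n + 1, m + 1]
}

# Two made-up callsigns that differ by one swapped pair of characters.
print(osa_distance("GBTT", "GTBT"))
```

A small distance between two `id`'s suggests a keying or transcription error rather than a different ship, which is why an edit distance is a natural ingredient in a matching criterion.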
|
|
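
The pipe-delimited layout described for `write_dup_func.R` in the table above (report `uid`, report `id`, then a 0/1 flag) can be sketched as below. The data frame, its column names, the helper `format_dup_lines` and the separator without surrounding spaces are all assumptions for illustration; only the example values come from the table.

```r
# Hypothetical input; column names are assumptions for this sketch.
dups <- data.frame(
  uid = c("ICOADS-30-0Y0HJK", "ICOADS-30-0Y0HJL"),
  id = c("32024", "14 00117"),
  id_changed = c(0L, 1L),
  stringsAsFactors = FALSE
)

# Build one pipe-delimited line per report: uid|id|flag.
format_dup_lines <- function(df) {
  paste(df$uid, df$id, df$id_changed, sep = "|")
}

dup_lines <- format_dup_lines(dups)

# Write to a temporary file so the sketch leaves no files behind.
out_file <- tempfile(fileext = ".txt")
writeLines(dup_lines, out_file)
print(readLines(out_file))
```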
|
|
**ICOADS variables used**
|
|
|
-------------------------
|
... | ... | |