... | @@ -14,11 +14,11 @@ Script | Description |
... | @@ -14,11 +14,11 @@ Script | Description |
|
[`split_by_type.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/split_by_type.R) | Splits the Rda versions of the ICOADS R3 Total files by platform type (ship, moored buoy, etc.), adding or correcting the PT flag when necessary. Records are separated into the different platform types following the [selection criteria](Workflow/data-selection).
|
|
[`split_by_type.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/split_by_type.R) | Splits the Rda versions of the ICOADS R3 Total files by platform type (ship, moored buoy, etc.), adding or correcting the PT flag when necessary. Records are separated into the different platform types following the [selection criteria](Workflow/data-selection).
|
|
[`simple_dup.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/simple_dup.R) | Identifies duplicate records between ship data and the other platform types, where the records have a matching `date`, `time` and position `(lat, lon)`. The script finds reports in the ship output files that are also contained in the output files for platforms, coastal data, moored buoys or drifters.
|
|
[`simple_dup.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/simple_dup.R) | Identifies duplicate records between ship data and the other platform types, where the records have a matching `date`, `time` and position `(lat, lon)`. The script finds reports in the ship output files that are also contained in the output files for platforms, coastal data, moored buoys or drifters.
|
|
[`ship2plat.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/ship2plat.R) | Assesses potential matches and identifies reports that are "not ship". Uses additional information from WMO Publication 47 to flag non-ship data ([Kent *et al.* (2007)](https://doi.org/10.1175/JTECH1949.1)). These reports are then written to the directory MFILES_NOTSHIP so that they are excluded from further SHIP processing.
|
|
[`ship2plat.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/ship2plat.R) | Assesses potential matches and identifies reports that are "not ship". Uses additional information from WMO Publication 47 to flag non-ship data ([Kent *et al.* (2007)](https://doi.org/10.1175/JTECH1949.1)). These reports are then written to the directory MFILES_NOTSHIP so that they are excluded from further SHIP processing.
|
|
[`process_ids.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/process_ids.R) | Adds various ID information and flags to the ICOADS ship data. Reformats ship `id`'s coming from different data sources (`sid`) to enable linking of data from the same ship (same ship name) across different `dck`'s, and to enable linking to metadata information in WMO Publication No. 47. More information in [Processing of `id`'s](Workflow/processing-of-ids).
|
|
[`process_ships.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/process_ships.R) | Adds various ID information and flags to the ICOADS ship data. Reformats ship `id`'s coming from different data sources (`sid`) to enable linking of data from the same ship (same ship name) across different `dck`'s, and to enable linking to metadata information in WMO Publication No. 47. More information in [Processing of `id`'s](Workflow/processing-of-ids). Also corrects date and time errors noticed in some `dck`'s. These errors largely arise from confusion over historical definitions of the marine day and conversions between local time and UTC.
|
|
[`process_shipdata.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/process_shipdata.R) | Corrects date and time errors noticed in some `dck`'s. These errors largely arise from confusion over historical definitions of the marine day and conversions between local time and UTC.
|
|
[`get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/get_pairs.R) | Identifies duplicate records within the ship data. Pairs reports as duplicates if they have associated ship `id`'s. The candidate pairs are then selected according to i) the number of matching elements (similar content of variables within a specific tolerance), ii) the `dck`'s, and iii) a comparison of the `id`'s.
|
|
[`new_get_pairs.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_get_pairs.R) | Identifies duplicate records within the ship data. Pairs reports as duplicates if they have associated ship `id`'s. The candidate pairs are then selected according to i) the number of matching elements (similar content of variables within a specific tolerance), ii) the `dck`'s, and iii) a comparison of the `id`'s.
|
|
[`get_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/get_dups.R) | Counts the number of duplicated records and flags the best one according to the [quality control criteria](Workflow/quality-control). Groups duplicated records by common `callsigns`.
|
|
[`new_get_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_get_dups.R) | Counts the number of duplicated records and flags the best one according to the [quality control criteria](Workflow/quality-control). Groups duplicated records by common `callsigns`.
|
|
[`merge_ids_year.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/merge_ids_year.R) | Links `id`'s into classes. Ship tracks of the linked `id`'s are then checked. Reports that fail the track check are flagged as the worst duplicate. Uses a ship tracking algorithm from [**r-imma**](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/How-to-install#install-dependencies-with-conda-all-platforms).
|
|
[`new_merge_ids_year.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/new_merge_ids_year.R) | Links `id`'s into classes. Ship tracks of the linked `id`'s are then checked. Reports that fail the track check are flagged as the worst duplicate. Uses a ship tracking algorithm from [**r-imma**](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/How-to-install#install-dependencies-with-conda-all-platforms).
|
|
[`nrt_dup.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/nrt_dup.R) | ---.
|
|
[`clean_data.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/clean_data.R) | ---.
|
|
[`clean_data.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/clean_data.R) | ---.
|
|
[`clean2track.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/clean2track.R) | Forms ship tracks for linked `id`'s. --.
|
|
[`clean2track.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/clean2track.R) | Forms ship tracks for linked `id`'s. --.
|
|
[`remove_output_files.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/remove_output_files.R) | Cleans `output_data/output_files` before re-running.
|
|
[`remove_output_files.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rscripts/remove_output_files.R) | Cleans `output_data/output_files` before re-running.
|
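
As a rough illustration of the date, time and position matching that `simple_dup.R` is described as performing, the sketch below joins two small data frames with base R. The data frames, the column names (`uid`, `date`, `hr`, `lat`, `lon`) and the use of exact equality on coordinates are assumptions made for this example. It is not the script's actual implementation.

```r
# Hypothetical example data; column names are assumptions for illustration.
ship <- data.frame(
  uid  = c("S1", "S2", "S3"),
  date = c("1990-01-01", "1990-01-01", "1990-01-02"),
  hr   = c(0, 6, 12),
  lat  = c(50.0, 51.5, 52.0),
  lon  = c(-10.0, -11.0, -12.0)
)
buoy <- data.frame(
  uid  = c("B1", "B2"),
  date = c("1990-01-01", "1990-01-03"),
  hr   = c(6, 12),
  lat  = c(51.5, 52.0),
  lon  = c(-11.0, -12.0)
)

# Inner join on the matching keys; the suffixes keep the two uid columns apart.
# Exact equality is assumed here; real data may need rounding or a tolerance.
keys <- c("date", "hr", "lat", "lon")
matches <- merge(ship, buoy, by = keys, suffixes = c(".ship", ".buoy"))

# Flag ship reports that also appear in the other platform's data.
ship$dup_other_platform <- ship$uid %in% matches$uid.ship
print(ship)
```

Only report `S2` shares its date, hour and position with a buoy report, so it is the only row flagged.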
... | @@ -30,30 +30,28 @@ The following functions are in alphabetical order. |
... | @@ -30,30 +30,28 @@ The following functions are in alphabetical order. |
|
|
|
|
|
Function | Description
|
|
Function | Description
|
|
-------- |:------------
|
|
-------- |:------------
|
|
[`add_date2.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_date2.R) | Adds a date variable to the data frame based on `yr`, `mo`, `dy`, `hr`. <br> Invalid values are set to missing. Where the hour is missing, the code generates the date variable with the hour set to local noon. A **date.flag** is added: <br><br> `0 = valid date & time`<br> `1 = invalid date or time (time not missing)` <br> `2 = valid date, hr missing, 12 local added`
|
|
[`add_ID_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_ID_class.R) | Adds a specific `id` class to the report. Based on the [ICOADS doc.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf) and other earlier sources, the `id`'s have been classified as to their "`id` type" (logbook number, ship number, etc.) and validity. Where an [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign is found, the country for that callsign is also identified.
|
|
[`add_dck_priority.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R) | Function used to identify duplicates. During the identification procedure, priorities are assigned to the data from each `dck` (see [source](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R)): data expected to be of the best quality are assigned a priority of 1, and data with larger priority numbers are flagged as the worst duplicate if identified as a potential match. Priority values are based on those from [ICOADS.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf)
|
|
[`add_date2.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_date2.R) | Adds a date variable to the data frame based on `yr`, `mo`, `dy`, `hr`. <br> Invalid values are set to missing. Where the hour is missing, the code generates the date variable with the hour set to local noon. A **date.flag** is added: <br><br> `0 = valid date & time`<br> `1 = invalid date or time (time not missing)` <br> `2 = valid date, hr missing, 12 local added`
|
|
[`add_ID_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_ID_class.R) | Adds a specific `id` class to the report. Based on the [ICOADS doc.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf) and other earlier sources, the `id`'s have been classified as to their "`id` type" (logbook number, ship number, etc.) and validity. Where an [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign is found, the country for that callsign is also identified.
|
|
[`add_dck_priority.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_dck_priority.R) | Function used to identify duplicates. During the identification procedure, priorities are assigned to the data from each `dck` (see [source](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_dck_priority.R)): data expected to be of the best quality are assigned a priority of 1, and data with larger priority numbers are flagged as the worst duplicate if identified as a potential match. Priority values are based on those from [ICOADS.](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf)
|
|
[`add_shipnames.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/add_shipnames.R) | Adds ship names according to `dck` numbers.
|
|
[`add_match_id.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_match_id.R) | Tests whether paired `id`'s are allowed to match according to the [matching criteria](Workflow/matching-criteria), using the [Damerau–Levenshtein (DL) distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance).
|
|
[`assess_match.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/assess_match.R) | Finds maximum differences between reports across variables of interest (`sst, slp, at, dpt, wbt, rh, w, d, ww, n, vv, wh, nh, w1`). <br> Flags delayed mode report(s) with an older [IMMT](https://icoads.noaa.gov/immt5.html) number as worst duplicate(s).
|
|
[`add_shipnames.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/add_shipnames.R) | Adds ship names according to `dck` numbers.
|
|
`find_gap_func.R` | -------
|
|
[`find_gap_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/find_gap_func.R) | -------
|
|
`fix_usmm_705to707cors_func.R` | Called in `add_shipnames.R` but never used. Should it be deleted from the repository?
|
|
[`flag_id_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/flag_id_dups.R) | Adds a flag to reports that fail the [IMMA](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/how-to-install#imma-toolbox) ship track check. Such reports are flagged as the worst duplicate.
|
|
[`flag_id_dups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/flag_id_dups.R) | Adds a flag to reports that fail the [IMMA](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/how-to-install#imma-toolbox) ship track check. Such reports are flagged as the worst duplicate.
|
|
[`get_gap_pos.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_gap_pos.R) | -----
|
|
`get_gap_pos.R` | -----
|
|
[`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_groups.R) | Assigns a group number to a specific type of `dck` or `dck`'s. The group is appended to the record `id` to avoid mixing data with the same `id` from different types of data (e.g. Japanese data, UK Navy).
|
|
[`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_groups.R) | Assigns a group number to a specific type of `dck` or `dck`'s. The group is appended to the record `id` to avoid mixing data with the same `id` from different types of data (e.g. Japanese data, UK Navy).
|
|
[`get_id_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_id_class.R) | Used in `add_ID_class.R`. <br> Assigns an `id` class to callsigns listed in metadata of Pub 47 (**Not sure**).
|
|
[`get_id_class.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_id_class.R) | Used in `add_ID_class.R`. <br> Assigns an `id` class to callsigns listed in metadata of Pub 47 (**Not sure**).
|
|
[`get_itu_country.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_itu_country.R) | Gets the [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign prefix associated with a country.
|
|
`get_isolated.R` Should we keep only gcd.slc? | Contains function [gcd.slc](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_isolated.R#L10): Calculates the geodesic distance between two points specified by radian latitude/longitude using the spherical law of cosines ([slc](https://en.wikipedia.org/wiki/Spherical_law_of_cosines)).
|
|
[`get_matchedids.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_matched_ids.R) | ------
|
|
[`get_itu_country.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_itu_country.R) | Gets the [ITU](https://en.wikipedia.org/wiki/ITU_prefix) callsign prefix associated with a country.
|
|
[`get_mis.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_mis.R) | Selects candidate pairs of duplicate reports according to the number of matching elements (similar content of variables within a specific tolerance).
|
|
`get_matchedids.R` | ------
|
|
[`get_prec.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_prec.R) | Gets the `dck`/`sid` [precision of measurement](Workflow/quality-control) for selected variables (`sst`,`slp`,`at`,`dpt`,`w`,`d`,`n`,`ww`,`w1`,`vv`).
|
|
`get_mismatch.R` | -------
|
|
[`get_pub47.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_pub47.R) | Function to process monthly files from Pub47.
|
|
[`get_prec.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_prec.R) | Gets the `dck`/`sid` [precision of measurement](Workflow/quality-control) for selected variables (`sst`,`slp`,`at`,`dpt`,`w`,`d`,`n`,`ww`,`w1`,`vv`).
|
|
[`get_speed.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_speed.R) | Contains several utility functions to convert between radians and degrees, calculate time and speed from coordinates and calculate the geodesic distance between two points specified by radian latitude/longitude using the spherical law of cosines ([slc](https://en.wikipedia.org/wiki/Spherical_law_of_cosines)).
|
|
`get_speed.R` Do we need this? Only gcd.slc seems to get used. | Also contains function [gcd.slc](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_isolated.R#L10): Calculates the geodesic distance between two points specified by radian latitude/longitude using the spherical law of cosines ([slc](https://en.wikipedia.org/wiki/Spherical_law_of_cosines)).
|
|
[`homog_ids.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/homog_ids.R) | Uses standard linkages between `id`'s to add `id` [homogenisation](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/Workflow/processing-of-ids#homogenisation) information to each record.
|
|
[`id_group_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/id_group_func.R) | Makes sure that all `id`'s grouped in [`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/get_groups.R) are properly associated.
|
|
[`id_group_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/id_group_func.R) | Makes sure that all `id`'s grouped in [`get_groups.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/get_groups.R) are properly associated.
|
|
[`liz_merge.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/liz_merge.R) | Merges two data frames by specific columns.
|
|
[`liz_merge.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/liz_merge.R) | Merges two data frames by specific columns.
|
|
[`new_add_match_id.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/new_add_match_id.R) | Tests whether paired `id`'s are allowed to match according to the [matching criteria](Workflow/matching-criteria), using the [Damerau–Levenshtein (DL) distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance).
|
|
[`print_id_match_info.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/print_id_match_info.R) | Prints ICOADS duplicates information and `id` matching results.
|
|
[`new_homog_ids.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/new_homog_ids.R) | Uses standard linkages between `id`'s to add `id` [homogenisation](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/Workflow/processing-of-ids#homogenisation) information to each record.
|
|
[`read_rdsfiles.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/read_rdsfiles.R) | A collection of functions to read the different types of data files (e.g. .rds, .txt) used throughout the code.
|
|
[`print_id_match_info.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/print_id_match_info.R) | Prints ICOADS duplicates information and `id` matching results.
|
|
[`write_dup_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads.utils/-/blob/master/R/write_dup_func.R) | Writes duplicate information in a pipe-delimited year-month output format. The report `uid` is followed by the report `id` then a flag with value 1 if the `id` has been changed, 0 if it remains the same. <br> An example of the format is: <br> ICOADS-30-0Y0HJK | 32024 | 0 <br> ICOADS-30-0Y0HJL | 14 00117 | 1
|
|
[`read_rdsfiles.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/read_rdsfiles.R) | A collection of functions to read the different types of data files (e.g. .rds, .txt) used throughout the code.
|
|
|
|
[`write_dup_func.R`](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/blob/master/rutils/write_dup_func.R) | Writes duplicate information in a pipe-delimited year-month output format. The report `uid` is followed by the report `id` then a flag with value 1 if the `id` has been changed, 0 if it remains the same. <br> An example of the format is: <br> ICOADS-30-0Y0HJK | 32024 | 0 <br> ICOADS-30-0Y0HJL | 14 00117 | 1
|
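
The `date.flag` values listed for `add_date2.R` can be illustrated with a small base R sketch. The function name `add_date_flag_sketch` is hypothetical, and the handling of edge cases (for example, an invalid date combined with a missing hour is given flag 1 here) is an assumption rather than the behaviour of the real function. The conversion of a missing hour to local noon is not attempted.

```r
# Hypothetical sketch of the date.flag logic; not the add_date2.R source.
add_date_flag_sketch <- function(df) {
  # ISOdate() returns NA for impossible calendar dates such as 30 February.
  valid_day  <- !is.na(ISOdate(df$yr, df$mo, df$dy))
  hr_missing <- is.na(df$hr)
  valid_hr   <- !hr_missing & df$hr >= 0 & df$hr < 24

  # 0 = valid date and time
  # 2 = valid date, hour missing (real code would insert local noon)
  # 1 = everything else (assumed to cover any invalid date or time)
  df$date.flag <- ifelse(valid_day & valid_hr, 0L,
                         ifelse(valid_day & hr_missing, 2L, 1L))
  df
}

reports <- data.frame(
  yr = c(1990, 1990, 1990),
  mo = c(1, 2, 1),
  dy = c(15, 30, 15),
  hr = c(6, 6, NA)
)
flagged <- add_date_flag_sketch(reports)
print(flagged)
```

The three example reports exercise one flag each: a valid date and hour, an impossible date, and a valid date with a missing hour.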
|
|
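
To give a feel for the edit distance referred to in the `add_match_id.R` entry, here is a minimal base R sketch of the restricted Damerau–Levenshtein distance (also called optimal string alignment). The function name `osa_distance` is hypothetical, the restricted variant is chosen here only for brevity, and the real function may rely on a different implementation. The matching criteria themselves are not reproduced.

```r
# Restricted Damerau-Levenshtein (optimal string alignment) distance.
# Counts insertions, deletions, substitutions and adjacent transpositions.
osa_distance <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)

  # d[i + 1, j + 1] holds the distance between the first i characters of a
  # and the first j characters of b.
  d <- matrix(0L, n + 1, m + 1)
  d[, 1] <- 0:n
  d[1, ] <- 0:m

  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (x[i] == y[j]) 0L else 1L
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1L,  # deletion
        d[i + 1, j] + 1L,  # insertion
        d[i, j] + cost     # substitution or match
      )
      # Adjacent transposition, e.g. "DC" versus "CD".
      if (i > 1 && j > 1 && x[i] == y[j - 1] && x[i - 1] == y[j]) {
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)
      }
    }
  }
  d[n + 1, m + 1]
}

# Two made-up callsign-like strings that differ by one swapped pair.
swapped <- osa_distance("ABCD", "ABDC")
print(swapped)
```

A swapped pair of adjacent characters counts as a single edit, which is what distinguishes this family of distances from the plain Levenshtein distance.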
|
|
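
The spherical law of cosines calculation described for `gcd.slc` can be sketched as follows. The names `gcd_slc_sketch` and `deg2rad`, the argument order and the Earth radius of 6371 km are assumptions for this example and may differ from the repository's function.

```r
# Great-circle distance via the spherical law of cosines.
# All angles are in radians; radius_km is an assumed mean Earth radius.
gcd_slc_sketch <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  cos_angle <- sin(lat1) * sin(lat2) +
    cos(lat1) * cos(lat2) * cos(lon2 - lon1)
  # Clamp to [-1, 1] so rounding error cannot push acos() out of its domain.
  cos_angle <- pmin(pmax(cos_angle, -1), 1)
  acos(cos_angle) * radius_km
}

deg2rad <- function(deg) deg * pi / 180

# One degree of longitude along the equator.
d_equator <- gcd_slc_sketch(deg2rad(0), deg2rad(0), deg2rad(1), deg2rad(0))
print(d_equator)
```

The clamp matters for nearly coincident points, where floating-point rounding can produce a cosine marginally above 1 and `acos()` would otherwise return `NaN`.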
|
|
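
The pipe-delimited layout described for `write_dup_func.R` can be reproduced with a few lines of base R. The column names, the output file name and the spaces around each pipe are assumptions taken from the example rows in the table. The real function may format its output differently.

```r
# Hypothetical duplicate information, using the example rows from the table.
dups <- data.frame(
  uid        = c("ICOADS-30-0Y0HJK", "ICOADS-30-0Y0HJL"),
  id         = c("32024", "14 00117"),
  id_changed = c(0L, 1L)
)

# Build one "uid | id | flag" line per report.
dup_lines <- paste(dups$uid, dups$id, dups$id_changed, sep = " | ")

# Write to a temporary year-month file (the file name is illustrative only).
out_file <- file.path(tempdir(), "dup_flags_1990_01.txt")
writeLines(dup_lines, out_file)

print(readLines(out_file))
```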
|
**ICOADS variables used**
|
|
**ICOADS variables used**
|
|
-------------------------
|
|
-------------------------
|
... | | ... | |