**Data input**
|
|
**Data input**
|
|
--------------
|
|
--------------
|
|
The following data is required by the scripts of this repository:
|
|
The following data is required to run the scripts in this repository:
|
|
|
|
|
|
- [ICOADS v3.0](https://icoads.noaa.gov/r3.html). [Freeman *et al.* (2017)](https://doi.org/10.1002/joc.4775).
|
|
- [ICOADS v3.0](https://icoads.noaa.gov/r3.html). [Freeman *et al.* (2017)](https://doi.org/10.1002/joc.4775).
|
|
- Metadata from WMO Publication 47.
|
|
- Metadata from WMO Publication 47.
|
|
[Kent *et al.* (2007)](https://doi.org/10.1175/JTECH1949.1)
|
|
[Kent *et al.* (2007)](https://doi.org/10.1175/JTECH1949.1)
|
|
- CLIWOC logbook IDs. (**needs a link**)
|
|
- [CLIWOC logbook IDs](https://stvno.github.io/page/cliwoc/)
|
|
- Inventory of ship names in the
|
|
- Inventory of ship names in the
|
|
[US Maury Collection](https://icoads.noaa.gov/software/transpec/maury/mauri_out)
|
|
[US Maury Collection](https://icoads.noaa.gov/software/transpec/maury/mauri_out)
|
|
- generate_id (**needs description**)
|
|
- generate_id (**needs description**)
|
|
- **Precision criteria file**. An estimate of the precision of each key variable (e.g. `sst, lat, lon`) per `dck`,
|
|
- **Precision criteria file**. An estimate of the precision of each key variable (e.g. `sst, lat, lon`) per `dck`,
|
|
`yr` and/or `sid`. These precision criteria are required in order to set tolerances
|
|
`yr` and/or `sid`. These precision criteria are required in order to set tolerances when comparing variables from ICOADS (see the [list of ICOADS variables](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/API-Reference#icoads-variables-used) used in this repository). Comparison of variables allows for
|
|
when comparing variables from ICOADS (see the [list of ICOADS variables](https://git.noc.ac.uk/brecinosrivas/icoads-r-hostace/-/wikis/API-Reference#icoads-variables-used) used in this repository). Comparison of variables allows for
|
|
a match between reports in the [duplicate identification](Workflow/duplicate-identification) procedure.
|
|
a match between reports in the duplicate identification procedure.
|
|
|
|
- **JSON files** containing ITU callsign prefixes associated with a country.
|
|
- **JSON files** containing ITU callsign prefixes associated with a country.
|
|
- **seq IDS.** (**needs description**)
|
|
- **seq IDS.** (**needs description**)
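
To make the role of the precision criteria more concrete, here is a minimal Python sketch of how per-`dck` precision estimates could be turned into matching tolerances. It is not the repository's R implementation: the table layout, the deck numbers `1` and `2`, the precision values and the rule of using the coarser of the two precisions are all assumptions made for illustration.

```python
# Hypothetical precision table keyed by (dck, variable).
# Deck numbers and values are illustrative, not taken from the real criteria file.
PRECISION = {
    (1, "sst"): 0.1, (1, "lat"): 0.01, (1, "lon"): 0.01,
    (2, "sst"): 1.0, (2, "lat"): 0.1, (2, "lon"): 0.1,
}


def tolerance(var, dck_a, dck_b):
    """Assumed rule: use the coarser precision of the two decks as the tolerance."""
    return max(PRECISION[(dck_a, var)], PRECISION[(dck_b, var)])


def reports_match(a, b, variables=("sst", "lat", "lon")):
    """Return True if every variable agrees within its tolerance."""
    for var in variables:
        if abs(a[var] - b[var]) > tolerance(var, a["dck"], b["dck"]):
            return False
    return True


report_a = {"dck": 1, "sst": 12.3, "lat": 45.12, "lon": -30.5}
report_b = {"dck": 2, "sst": 12.0, "lat": 45.1, "lon": -30.5}
report_c = {"dck": 1, "sst": 12.9, "lat": 45.12, "lon": -30.5}

print(reports_match(report_a, report_b))  # coarse deck 2 widens the tolerance
print(reports_match(report_a, report_c))  # both from deck 1, sst differs by 0.6
```

In this sketch a report from a coarsely reported deck can still match a finely reported one, because the tolerance follows the less precise source.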
|
|
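
The ITU callsign prefix files mentioned in the list above can be pictured with a short Python sketch. The layout of the repository's JSON files is not described on this page, so a flat `{prefix: country}` mapping is assumed, and `country_from_callsign` is a hypothetical helper name. Only a small excerpt of prefixes is shown.

```python
import json

# Assumed file content: a flat {prefix: country} mapping (layout is hypothetical).
EXAMPLE_JSON = """
{
  "G": "United Kingdom",
  "F": "France",
  "LA": "Norway",
  "PA": "Netherlands"
}
"""


def country_from_callsign(callsign, prefixes):
    """Return the country of the longest matching prefix, or None if unknown."""
    cs = callsign.strip().upper()
    # Try the longest candidate prefix first, then shorten one character at a time.
    for n in range(len(cs), 0, -1):
        country = prefixes.get(cs[:n])
        if country is not None:
            return country
    return None


prefixes = json.loads(EXAMPLE_JSON)
print(country_from_callsign("LAXY", prefixes))
print(country_from_callsign("gbtt", prefixes))
print(country_from_callsign("ZZZZ", prefixes))
```

Trying the longest prefix first keeps two-letter allocations from being shadowed by a one-letter entry, should both appear in the same file.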
|
|
|
|
**Processing stages**
|
|
**Processing stages**
|
|
--------------------
|
|
--------------------
|
|
|
|
|
|
The diagram below is a summary of the data processing workflow followed by
|
|
The diagram below is a summary of the data processing workflow followed by the shell scripts defined in `scr`. Each block
|
|
the shell scripts defined in `scr`. Each block
|
|
|
|
represents a main task done by one script in `rscripts`.
|
|
represents a main task done by one script in `rscripts`.
|
|
The corresponding `.R` file name has been added in grey between the blocks. For more information on each `.R` script, see the [API reference page](api-reference).
|
|
The corresponding `.R` file name has been added in grey between the blocks. For more information on each `.R` script, see the [API reference page](api-reference).
|
|
|
|
|
... | @@ -31,13 +29,12 @@ in order to: |
... | @@ -31,13 +29,12 @@ in order to: |
|
1. Select data taken only by commercial ships, excluding
|
|
1. Select data taken only by commercial ships, excluding
|
|
specialist ship data sources, such as research vessels
|
|
specialist ship data sources, such as research vessels
|
|
(For more information see the [selection criteria](Workflow/data-selection)).
|
|
(For more information see the [selection criteria](Workflow/data-selection)).
|
|
2. [Preprocess IDs](Workflow/processing-of-ids) to improve duplicate
|
|
2. [Preprocess IDs](Workflow/processing-of-ids) to improve [duplicate identification](Workflow/duplicate-identification) and the linking of `id`s between each pair of duplicate reports.
|
|
identification and the linking of `id`s between each pair of duplicate reports.
|
|
3. Perform [quality control](Workflow/quality-control) on the data to identify the best duplicate.
|
|
3. Perform [quality control](Workflow/quality-control) on the data to identify the best duplicate.
|
|
|
|
|
|
|
|
- The remaining blocks represent processing scripts that concentrate on duplicate
|
|
- The remaining blocks represent processing scripts that concentrate on duplicate identification and [matching of report IDs](Workflow/matching-criteria).
|
|
identification and [matching of report IDs](Workflow/matching-criteria).
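
The linking of `id`s between pairs of duplicate reports can be illustrated with a small Python sketch. It is not taken from the repository's R scripts: it assumes the matching stage yields a list of `(id_a, id_b)` pairs, and it uses a generic union-find grouping, with `group_duplicates` as a hypothetical function name.

```python
def group_duplicates(pairs):
    """Group report ids that are connected through duplicate pairs (union-find)."""
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())


# Hypothetical report ids: r1-r2 and r2-r3 were matched, as were r4-r5.
pairs = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
print(group_duplicates(pairs))
```

Grouping this way places `r1` and `r3` in the same set even though they were never compared directly, which is the situation a quality control step would then resolve by choosing one best report per group.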
|
|
|
|
|
|
More details on the data processing can be found in this [technical report]().
|
|
|
|
|
|
```mermaid
|
|
```mermaid
|
|
graph TB
|
|
graph TB
|
... | @@ -68,3 +65,4 @@ class id6,id7,id8,id9,id10,id11,id12 rest |
... | @@ -68,3 +65,4 @@ class id6,id7,id8,id9,id10,id11,id12 rest |
|
**Output data**
|
|
**Output data**
|
|
--------------
|
|
--------------
|
|
|
|
|
|
|
|
Maybe here we can add some of the plots that you created as output.