To be added soon.

**Data input**

- ICOADS v3.0: [Freeman *et al.* (2017)](https://doi.org/10.1002/joc.4775).
- Metadata from WMO Publication 47:
  [Kent *et al.* (2007)](https://doi.org/10.1175/JTECH1949.1).
- CLIWOC logbook IDs (link not found).
- Inventory of ship names in the
  [US Maury Collection](https://icoads.noaa.gov/software/transpec/maury/mauri_out).
- `generate_id` (by Dave; the source is unclear).
- Precision criteria:
  an estimate of the precision of each key variable (e.g. sst, lat, lon) per deck (DCK),
  year and/or source ID (SID). These criteria are required to set the tolerances that
  allow a match between reports in the duplicate identification procedure.
- JSON files.
- seq IDs.

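As an illustration of how the precision criteria could feed the matching step, the sketch below looks up a tolerance for each variable by deck and treats two reports as a candidate match when every key variable agrees within the coarser of the two tolerances. It is written in Python for brevity and is not the pipeline's R code; the deck numbers, precision values, field names and the `within_tolerance` helper are all hypothetical.

```python
from typing import Dict

# Hypothetical precision table: deck (DCK) -> variable -> precision.
# The deck numbers and values are made up for illustration only.
PRECISION: Dict[int, Dict[str, float]] = {
    1: {"sst": 0.1, "lat": 0.01, "lon": 0.01},
    2: {"sst": 1.0, "lat": 0.1, "lon": 0.1},
}


def within_tolerance(a: dict, b: dict, variables=("sst", "lat", "lon")) -> bool:
    """Return True if reports a and b agree on every variable, allowing
    for the coarser precision of the two decks they come from."""
    for var in variables:
        tol = max(PRECISION[a["dck"]][var], PRECISION[b["dck"]][var])
        # Small epsilon guards against floating-point round-off at the boundary.
        if abs(a[var] - b[var]) > tol + 1e-9:
            return False
    return True


r1 = {"dck": 1, "sst": 15.3, "lat": 50.12, "lon": -10.47}
r2 = {"dck": 2, "sst": 15.0, "lat": 50.1, "lon": -10.5}
r3 = {"dck": 1, "sst": 17.0, "lat": 50.12, "lon": -10.47}

print(within_tolerance(r1, r2))  # True: all differences fit the coarser deck 2 precision
print(within_tolerance(r1, r3))  # False: sst differs by far more than 0.1
```

Taking the coarser tolerance of the two decks is a design choice of this sketch: a report recorded to whole degrees cannot be expected to agree with a more precise one to better than its own precision.
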
**Processing stages**

The following diagram summarizes the data processing workflow followed by
the shell scripts defined in `scr`. Each block
represents a main task performed by one script in `rscripts`.
The corresponding `.R` file name is highlighted in grey between stages.

Orange blocks represent pre-processing tasks applied to the ICOADS database
in order to:

1. Select only data taken by commercial ships, excluding
   specialist data sources such as research vessels
   (for more information, see the [selection criteria](data-selection)).
2. [Pre-process the IDs](processing-of-ids) to improve duplicate
   identification and the linking of IDs between each pair of duplicate reports.
3. Perform [quality control](quality-control) on the data to identify the best duplicate.

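One way to picture the linking of IDs in step 2: if each pair of duplicate reports contributes a pair of ship IDs, then IDs connected through any chain of pairs can be grouped into one class. The sketch below does this with a small union-find structure in Python. It is an assumption about how such linking could work, not a description of the R scripts, and the `link_ids` function and the example IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def link_ids(pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group IDs into classes: two IDs share a class if a chain of
    duplicate pairs connects them (union-find with path halving)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    classes: Dict[str, List[str]] = defaultdict(list)
    for x in parent:
        classes[find(x)].append(x)
    return sorted(sorted(members) for members in classes.values())


# Hypothetical ID pairs taken from pairs of duplicate reports.
pairs = [("ABCD", "ABCD1"), ("ABCD1", "SHIP7"), ("XYZ", "XYZ9")]
print(link_ids(pairs))  # [['ABCD', 'ABCD1', 'SHIP7'], ['XYZ', 'XYZ9']]
```
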
```mermaid
graph TB
A1[rscripts]

id1[(ICOADS v3.0)] --> |split_by_type.R|id2[Separate records according <br> to the different platform types.]
id2 --> |simple_dup.R|id3[First duplicate identification between <br> ship data and the different platform types. <br> Considers the records as duplicates if they <br> show matching date, time & position.]
id3 --> |ship2plat.R|id4[Exclude non-ship data.]
id4 --> id5[(ICOADS SHIP data)]
id5 --> |process_ids.R|id6[Homogenize and re-format <br> ship IDs from different decks. <br> Links metadata from Pub 47 & logbooks <br> to form a plausible ship track.]
id6 --> |process_shipdata.R|id7[Process ship data: <br> correction of dates & times.]
id7 --> |new_get_pairs.R|id8[Second duplicate identification. <br> Pairs the reports as duplicates if <br> they have associated ship IDs. <br> Reports that fail the track check <br> are flagged as the worst.]
id8 --> |new_get_dups.R|id9[Count the number of duplicates and flag the best.]
id9 --> |new_merge_ids_year.R|id10[Links IDs into classes.]
id10 --> |clean_data.R|id11[Cleans ship data.]
id11 --> |clean2track.R|id12[Forms ship tracks for linked IDs.]
id12 --> id13[(Output data)]

classDef pre-processing fill:#fcc679,stroke:#333,stroke-width:1px
classDef scripts fill:#8C929D,stroke:#333,stroke-width:1px
classDef rest fill:#e8eaf6,stroke:#333,stroke-width:1px
class id2,id3,id4 pre-processing;
class A1,id1,id5,id13 scripts;
class id6,id7,id8,id9,id10,id11,id12 rest
```
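
The first duplicate identification in the diagram treats records with matching date, time and position as duplicates. A minimal Python sketch of that idea follows. It groups records on an exact key and does not distinguish ship from non-ship platforms, so the actual `simple_dup.R` logic may well differ; the `find_simple_dups` function, the field names and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_simple_dups(records: List[dict]) -> List[List[str]]:
    """Return groups of record IDs that share date, hour and position."""
    groups: Dict[Tuple, List[str]] = defaultdict(list)
    for rec in records:
        key = (rec["date"], rec["hour"], rec["lat"], rec["lon"])
        groups[key].append(rec["uid"])
    # Only keys seen more than once indicate duplicates.
    return [uids for uids in groups.values() if len(uids) > 1]


# Hypothetical records: one ship report duplicated by a buoy report.
records = [
    {"uid": "A1", "ptype": "ship", "date": "1950-01-01", "hour": 12, "lat": 50.1, "lon": -10.5},
    {"uid": "B7", "ptype": "buoy", "date": "1950-01-01", "hour": 12, "lat": 50.1, "lon": -10.5},
    {"uid": "A2", "ptype": "ship", "date": "1950-01-02", "hour": 6, "lat": 51.0, "lon": -11.0},
]
print(find_simple_dups(records))  # [['A1', 'B7']]
```

Exact matching like this is deliberately strict. The later, tolerance-based duplicate identification is where the precision criteria listed under the data input come into play.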