Data input
- ICOADS v3.0 (Freeman et al., 2017)
- Metadata from WMO Publication 47 (Kent et al., 2007)
- CLIWOC logbook IDs (link not found)
- Inventory of ship names in the US Maury Collection
- generate_id (by Dave...; source unclear)
- Precision criteria: an estimate of the precision of each key variable (e.g. sst, lat, lon) per deck (DCK), year and/or source ID (SID). These precision criteria are required to set the tolerances used when deciding whether two reports match in the duplicate identification procedure.
- json files.
- seq IDS.
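
As a rough illustration of how the precision criteria listed above could feed into matching tolerances, here is a minimal sketch. It is written in Python for brevity and is not the project's R code. The `PRECISION` table, its values, the deck numbers, and the function names `tolerance` and `is_candidate_duplicate` are all hypothetical. It also assumes the tolerance is the coarser precision of the two reports, which may differ from the rule the actual scripts apply.

```python
# Hypothetical precision table keyed by (deck, year).
# All numbers are illustrative; they are not taken from the project.
PRECISION = {
    (701, 1880): {"sst": 0.5, "lat": 0.1, "lon": 0.1},
    (720, 1880): {"sst": 1.0, "lat": 1.0, "lon": 1.0},
}
# Fallback used when a (deck, year) pair has no entry (also an assumption).
DEFAULT_PRECISION = {"sst": 0.1, "lat": 0.01, "lon": 0.01}


def tolerance(report_a, report_b, variable):
    """Return the coarser precision of the two reports for one variable."""
    prec_a = PRECISION.get((report_a["dck"], report_a["year"]), DEFAULT_PRECISION)
    prec_b = PRECISION.get((report_b["dck"], report_b["year"]), DEFAULT_PRECISION)
    return max(prec_a[variable], prec_b[variable])


def is_candidate_duplicate(report_a, report_b, variables=("sst", "lat", "lon")):
    """True if every variable agrees within its tolerance.

    Longitude wrap-around at +/-180 degrees is ignored in this sketch.
    """
    return all(
        abs(report_a[v] - report_b[v]) <= tolerance(report_a, report_b, v)
        for v in variables
    )


a = {"dck": 701, "year": 1880, "sst": 18.0, "lat": 40.2, "lon": -30.4}
b = {"dck": 720, "year": 1880, "sst": 18.5, "lat": 40.0, "lon": -30.0}
print(is_candidate_duplicate(a, b))  # True: deck 720's coarser precision sets the tolerance
```

Taking the coarser of the two precisions means a report recorded to whole degrees can still match one recorded to tenths of a degree, at the cost of more candidate pairs to screen in later stages.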
Processing stages
The following diagram summarizes the data processing workflow followed by the shell scripts defined in scr. Each block represents a main task performed by one script in rscripts. The corresponding .R file name is highlighted in grey between each stage.
Green blocks represent pre-processing tasks applied to the ICOADS database in order to:
- Select only data taken by commercial ships, excluding specialist data sources such as research vessels (for more information, see the selection criteria).
- Pre-process IDs to improve duplicate identification and the linking of IDs between each pair of duplicate reports.
- Perform quality control on the data to identify the best duplicate.