This repository is a collection of R-scripts to homogenise platform
identifier information and to identify duplicate observations in the
**International Comprehensive Ocean-Atmosphere Data Set** (ICOADS) marine data source. **BOLD** denotes an ICOADS variable name.

ICOADS is the world's most extensive surface marine meteorological data collection.
It contains ocean surface and atmospheric observations from the 1600s
to the present and is still receiving more data every year.
The database is made up of observation reports from many different sources;
there are several hundred combinations of the **DCK** (deck) and **SID** (source)
flags that indicate the origin of the data.
Typically, **DCK** indicates the type of data
(e.g. US Navy ships; Japanese Whaling Fleet) and **SID** provides more information
about the data system or format
(e.g. data stream extracted from the WMO Global Telecommunication System, GTS).
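
As a minimal sketch of how the **DCK**/**SID** combinations present in a set of reports might be tabulated, assume the reports have already been read into a data frame with `UID`, `DCK` and `SID` columns. The data frame name `reports` and the values below are arbitrary placeholders chosen for illustration, not real deck or source assignments, and this is not taken from the scripts in this repository.

```r
# Toy data frame standing in for ICOADS reports; all values are illustrative only.
reports <- data.frame(
  UID = c("A00001", "A00002", "A00003", "A00004", "A00005"),
  DCK = c(701, 701, 892, 892, 926),
  SID = c(125, 125, 103, 114, 114),
  stringsAsFactors = FALSE
)

# Cross-tabulate DCK against SID and keep only the combinations that occur.
counts <- as.data.frame(table(DCK = reports$DCK, SID = reports$SID),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# One row per DCK/SID combination, with the number of reports in Freq.
print(counts)
```

Listing the combinations this way gives a quick view of which data origins are present before any ID homogenisation or duplicate checks are attempted.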
... likely to have different formats, precision, conversions and metadata.

* Planned redundancy, for example the ingestion of several near real-time data streams.

The processing software used by ICOADS (https://icoads.noaa.gov/software/) is written in FORTRAN and includes code to translate data to the IMMA1 format [Smith *et al.* (2016)](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1_short.pdf), to apply QC and flags, and to identify (and in earlier releases remove) reports likely to be duplicates [Freeman *et al.* (2017)](https://doi.org/10.1002/joc.4775).

The code in this repository offers additional quality control on the data, homogenisation of ID information between different **DCK** and **SID**, and duplicate identification (DI), preserving information on reports associated by the DI through the use of ICOADS unique identifiers (**UID**).
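
The idea of linking duplicate reports through their **UID** can be sketched as follows. This is only an illustration under stated assumptions: the matching rule used here (same **YR**, **MO**, **DY**, **HR** and position rounded to 0.1 degree) is hypothetical and much simpler than a real duplicate check, the data frame `reports` and its values are made up, and none of this is the repository's own algorithm.

```r
# Toy reports using ICOADS variable names; all values are illustrative only.
reports <- data.frame(
  UID = c("U1", "U2", "U3", "U4"),
  YR  = c(1950, 1950, 1950, 1950),
  MO  = c(6, 6, 6, 6),
  DY  = c(15, 15, 15, 16),
  HR  = c(12, 12, 18, 12),
  LAT = c(45.23, 45.2, 45.2, 45.2),
  LON = c(-30.11, -30.1, -30.1, -30.1),
  DCK = c(116, 128, 128, 128),
  stringsAsFactors = FALSE
)

# Hypothetical matching key: same date and hour, position rounded to 0.1 degree.
reports$key <- paste(reports$YR, reports$MO, reports$DY, reports$HR,
                     round(reports$LAT, 1), round(reports$LON, 1), sep = "_")

# Pair every report with every report that shares its key (including itself).
pairs <- merge(reports[, c("key", "UID", "DCK")],
               reports[, c("key", "UID", "DCK")],
               by = "key", suffixes = c("_1", "_2"))

# Drop self-matches and keep each pair once; the UIDs link the two reports.
pairs <- pairs[pairs$UID_1 < pairs$UID_2, c("UID_1", "UID_2", "DCK_1", "DCK_2")]
print(pairs)
```

Keeping both **UID** values in each row means neither report has to be deleted: the association is recorded and a later step can decide which member of the pair to prefer.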

References
----------