Introduction · Changes

added version history authored Jun 10, 2020 by bearecinos
Showing with 9 additions and 9 deletions
Introduction.md
View page @ 1514ec8c
This repository is a collection of R-scripts to homogenise platform
identifier information and to identify duplicate observations in the
-**International Comprehensive Ocean-Atmosphere Data Set** (ICOADS) marine data source. **BOLD** denotes an ICOADS variable name.
+**International Comprehensive Ocean-Atmosphere Data Set** (ICOADS) marine data source. Text in this `format` denotes an ICOADS variable name (see [API-reference](api-reference) for variables information).
ICOADS is the world's most extensive surface marine meteorological data collection.
It contains ocean surface and atmospheric observations from the late 1600s
to present and is updated every month with observations from near-real-time data streams.
The database is made up of observation reports from many different sources;
-there are several hundred combinations of the **DCK** (deck) and **SID** (sources)
+there are several hundred combinations of the `dck` (deck) and `sid` (sources)
flags that indicate the origin of the data.
-Typically, **DCK** indicates the type of data
-(e.g. US Navy ships; Japanese Whaling Fleet) and **SID** provides more information
+Typically, `dck` indicates the type of data
+(e.g. US Navy ships; Japanese Whaling Fleet) and `sid` provides more information
about the data system or format
(e.g. a data stream extracted from the WMO Global Telecommunication System, GTS).
-Sometimes a single **DCK** is associated with a single **SID**,
-sometimes a single **DCK** will contain several **SID** and vice versa,
-not all of the **DCK** and **SID** are independent so there can be duplicated reports which need to be identified and flagged.
+Sometimes a single `dck` is associated with a single `sid`,
+sometimes a single `dck` will contain several `sid` and vice versa,
+not all of the `dck` and `sid` are independent so there can be duplicated reports which need to be identified and flagged.
Historically archives of marine data have been maintained by individual nations,
and often these were shared so that the same observations appear in the archives
@@ -23,7 +23,7 @@ of several nations. Truncated formats often did not contain sufficient informati
to identify the observations made by a particular ship or platform,
and these compact formats sometimes converted or encoded data in different ways.
For example, many observations do not have an identifier linking to the ship
-(**ID**) or platform (**PT**), and for those that do have such identifiers
+(`id`) or platform (`pt`), and for those that do have such identifiers
they may be different between data sources. The main types of duplicates are:
* Observations historically shared among national archives,
@@ -39,7 +39,7 @@ likely to have different formats, precision, conversions and metadata.
The processing software used by ICOADS (https://icoads.noaa.gov/software/) is written in FORTRAN and includes code to translate data to the IMMA1 format [Smith *et al.* (2016)](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1_short.pdf), to apply QC and flags, and to identify (and in earlier releases remove) reports likely to be duplicates [Freeman *et al.* (2017)](https://doi.org/10.1002/joc.4775).
-The code in this repository offers additional quality control on the data, homogenisation of ID information between different **DCK** and **SID** and duplicate identification (DI) preserving information on reports associated by the DI through the use of ICOADS unique identifiers (**UID**).
+The code in this repository offers additional quality control on the data, homogenisation of ID information between different `dck` and `sid` and duplicate identification (DI) preserving information on reports associated by the DI through the use of ICOADS unique identifiers (`uid`).
References
----------
......