Workflow · Changes

update in docs, authored May 15, 2020 by bearecinos
Showing with 57 additions and 1 deletion
Workflow.md
View page @ cc20d5ff
**Data input**
- ICOADS v3.0. [Freeman *et al.* (2017)](https://doi.org/10.1002/joc.4775)
- Metadata from WMO Publication 47.
[Kent *et al.* (2007)](https://doi.org/10.1175/JTECH1949.1)
- CLIWOC logbook IDs. (couldn't find the link)
- Inventory of ship names in the
[US Maury Collection](https://icoads.noaa.gov/software/transpec/maury/mauri_out)
- generate_id (by Dave... the source is unclear)
- Precision criteria:
An estimate of the precision of each key variable (e.g. sst, lat, lon) per DCK,
year and/or SID. These precision criteria are required to set the tolerances used
when matching reports in the duplicate identification procedure.
- JSON files.
- seq IDS.
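
As a rough illustration of how precision-based tolerances could gate a match between two reports, here is a minimal Python sketch. It is not taken from the project's R scripts: the `TOLERANCES` values, the `within_tolerance` helper and the dictionary layout of a report are all hypothetical, and longitude wrap-around is ignored for brevity.

```python
# Hypothetical tolerances per key variable. In practice these would be
# derived from the precision estimates per DCK, year and/or SID.
TOLERANCES = {"sst": 0.5, "lat": 0.1, "lon": 0.1}


def within_tolerance(report_a, report_b, tolerances=TOLERANCES):
    """Return True if every key variable agrees within its tolerance."""
    for name, tol in tolerances.items():
        a = report_a.get(name)
        b = report_b.get(name)
        if a is None or b is None:
            # Assumption of this sketch: a missing value cannot rule out a match.
            continue
        if abs(a - b) > tol:
            return False
    return True


# Made-up reports used only to exercise the helper.
report_1 = {"sst": 18.2, "lat": 45.10, "lon": -30.00}
report_2 = {"sst": 18.5, "lat": 45.15, "lon": -30.05}
report_3 = {"sst": 21.0, "lat": 45.10, "lon": -30.00}

print(within_tolerance(report_1, report_2))
print(within_tolerance(report_1, report_3))
```

With these made-up values, the first pair differs by less than every tolerance, while the second pair is rejected because the `sst` difference exceeds 0.5.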
**Processing stages**
The following diagram summarises the data processing workflow followed by
the shell scripts defined in ```scr```. Each block
represents a main task done by one script in ```rscripts```.
The corresponding `.R` file name is highlighted in grey between stages.
Green blocks represent pre-processing tasks applied to the ICOADS database,
in order to:
1. Select data taken only by commercial ships, excluding
specialist ship data sources, such as research vessels
(for more information see the [selection criteria](data-selection)).
2. Pre-process the IDs (see [processing of IDs](processing-of-ids)) to improve duplicate
identification and the linking of IDs between each pair of duplicate reports.
3. Perform [quality control](quality-control) on the data to identify the best duplicate.
```mermaid
graph TB
A1[rscripts]
id1[(ICOADS v3.0)] --> |split_by_type.R|id2[Separate records according <br> to the different platform types.]
id2 --> |simple_dup.R|id3[First duplicate identification between <br> ship data and the different platform types. <br> Considers the records as duplicates if they <br> show matching date, time & position.]
id3 --> |ship2plat.R|id4[Exclude non-ship data.]
id4 --> id5[(ICOADS SHIP data)]
id5 --> |process_ids.R|id6[Homogenize and re-format <br> ship IDs from different decks. <br> Links metadata from Pub 47 & logbooks <br> to form a plausible ship track.]
id6 --> |process_shipdata.R|id7[Process ship data: <br> correction of dates & times.]
id7 --> |new_get_pairs.R|id8[Second duplicate identification. <br> Pairs the reports as duplicates if <br> they have associated ship IDs. <br> Reports that fail the track check <br> are flagged as the worst.]
id8 --> |new_get_dups.R|id9[Count the number of duplicates and flag the best.]
id9 --> |new_merge_ids_year.R|id10[Links IDs into classes.]
id10 --> |clean_data.R|id11[Cleans ship data.]
id11 --> |clean2track.R|id12[Forms ship tracks for linked IDs.]
id12 --> id13[(Output data)]
classDef pre-processing fill:#fcc679,stroke:#333,stroke-width:1px
classDef scripts fill:#8C929D,stroke:#333,stroke-width:1px
classDef rest fill:#e8eaf6,stroke:#333,stroke-width:1px
class id2,id3,id4 pre-processing;
class A1,id1,id5,id13 scripts;
class id6,id7,id8,id9,id10,id11,id12 rest;
```
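
The first duplicate identification in the diagram treats records as duplicates when their date, time and position match. A minimal Python sketch of that idea follows. It is not the logic of `simple_dup.R`: the field names (`year`, `month`, `day`, `hour`, `lat`, `lon`), the `find_simple_duplicates` helper and the use of exact equality are assumptions made for illustration.

```python
from collections import defaultdict


def find_simple_duplicates(records):
    """Group record indices that share the same date, time and position."""
    groups = defaultdict(list)
    for index, rec in enumerate(records):
        # Hypothetical field names; exact equality is an assumption here.
        key = (rec["year"], rec["month"], rec["day"],
               rec["hour"], rec["lat"], rec["lon"])
        groups[key].append(index)
    # Keep only the keys shared by more than one record.
    return [indices for indices in groups.values() if len(indices) > 1]


# Made-up records: the first and third share date, time and position.
records = [
    {"year": 1950, "month": 3, "day": 14, "hour": 12, "lat": 45.1, "lon": -30.0},
    {"year": 1950, "month": 3, "day": 14, "hour": 18, "lat": 45.3, "lon": -30.4},
    {"year": 1950, "month": 3, "day": 14, "hour": 12, "lat": 45.1, "lon": -30.0},
]

print(find_simple_duplicates(records))
```

Grouping on a composite key keeps the check linear in the number of records. A tolerance-based comparison would need a different approach, such as rounding the values before building the key.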