Last edited by Beatriz Recinos Sep 15, 2020

Processing of IDs

The pre-processing tasks described here are carried out in process_ships.R.

Corrections

The following corrections to ship names are made in add_shipnames.R.

  • For the period 1878 to 1894, minor changes are made to ship names from dck 704 to correct typos and similar problems.

  • For dck 701 (1867-1899) and dck 711 (1889-1899) some ship names are corrected.

  • For the period 1663 to 1860, CLIWOC logbook IDs from dck 730 are converted to ship names using tables from an MS Access database that is no longer available online. Some information is available from:
    https://www.historicalclimatology.com/cliwoc.html

  • For the period 1663 to 1863, ship names from the US Maury collection (dck 701) are extended using information from this link. Data from dck 701 with a missing ID are also split into voyages by manual inspection.

  • Ship names from the German Maury collection (dck 721) are extended where they overlap with names from the US Maury collection (dck 701). Where a name appears in both dck 701 and dck 721 and it is not clear whether it refers to the same ship, the dck number is also appended to the name
    (e.g. AUSTRALIA, JAMESTOWN, SWORDFISH, ANN MARIA, ASHBURTON).

  • In dck 555 (1966-1973), North Pole and South Pole station IDs are corrected by prepending an "N" or "S" depending on latitude.
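
The last two corrections above can be sketched in a few lines of R. This is a minimal illustration, not the code in add_shipnames.R: the column names (dck, id, lat), the helper names, the example ID "POLE" and the "_" separator are all assumptions made for the example.

```r
# Hypothetical helper: prefix polar station IDs in dck 555 with the hemisphere.
# `polar_ids` is an assumed list of affected IDs; latitudes are assumed non-missing.
fix_polar_ids <- function(df, polar_ids) {
  is_polar <- df$dck == 555 & df$id %in% polar_ids
  prefix <- ifelse(df$lat >= 0, "N", "S")
  df$id[is_polar] <- paste0(prefix[is_polar], df$id[is_polar])
  df
}

# Hypothetical helper: append the dck number to names shared by dck 701 and 721.
# The "_" separator is an assumption, not the project's convention.
append_dck <- function(df, ambiguous_names) {
  hit <- df$dck %in% c(701, 721) & df$id %in% ambiguous_names
  df$id[hit] <- paste(df$id[hit], df$dck[hit], sep = "_")
  df
}

# Made-up example records
obs <- data.frame(
  dck = c(555, 555, 701, 721),
  id  = c("POLE", "POLE", "AUSTRALIA", "AUSTRALIA"),
  lat = c(89.5, -89.9, 10.0, 12.0),
  stringsAsFactors = FALSE
)

fixed <- append_dck(fix_polar_ids(obs, polar_ids = "POLE"),
                    ambiguous_names = "AUSTRALIA")
print(fixed$id)
```

With the made-up records above, the two polar IDs become "NPOLE" and "SPOLE", and the shared name becomes "AUSTRALIA_701" and "AUSTRALIA_721".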

Reformatting

  • Manual corrections are made to two IDs from dck 187 (1946-1956) so that they conform to the expected format.

  • For the period 1953 to 1961, IDs from dck 184 are truncated to remove the first digit, which indicates the ocean region, and are then reformatted to match the expected format.

  • Between 1962 and 1963, the ID "Eltanin" is added to dck 897, which contains only data from that ship and has a missing ID.

  • Between 1957 and 1961, a small number of IDs from dck 902 are reformatted to match the expected format. This is done by prepending the digit "3" to the truncated IDs.

  • Between 1930 and 1961, a small number of IDs from dck 118 and dck 119 are reformatted to match the expected format. This is done by inserting a 2-digit year.

  • For dck 720 and sid 135, each 8-character ID represents a single report. These IDs are truncated to the first 4 digits, and "-SEQ" is appended.
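
Two of the reformatting rules above are simple string operations and can be sketched as follows. This is an illustration under stated assumptions, not the code in process_ships.R: the function names and the example IDs are hypothetical, and the further reformatting applied to dck 184 after truncation is not shown.

```r
# Hypothetical helper for dck 184: drop the leading ocean-region digit.
strip_region_digit <- function(id) {
  sub("^[0-9]", "", id)
}

# Hypothetical helper for dck 720 / sid 135: keep the first four characters
# of each 8-character ID and append "-SEQ". Other IDs are left unchanged.
collapse_report_id <- function(id) {
  is_report <- !is.na(id) & nchar(id) == 8
  id[is_report] <- paste0(substr(id[is_report], 1, 4), "-SEQ")
  id
}

# Made-up example IDs
region_stripped <- strip_region_digit(c("312345", "712345"))
report_collapsed <- collapse_report_id(c("12345678", "1234", NA))

print(region_stripped)
print(report_collapsed)
```

In this sketch both dck 184 examples reduce to "12345", and only the 8-character ID is rewritten, to "1234-SEQ".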

Homogenisation

The following corrections are made in homog_ids.R.

  • IDs in dcks 194, 201, 202, 203 and 227 are all derived from the same 5-digit ship identifiers. Leading digits are removed where needed.

  • Some ship IDs are callsigns and can be used to link to metadata in WMO Publication No. 47. However, in some dcks the callsigns have been modified or corrupted, so the processing attempts to recover the original callsign in these cases. For more information see Kent et al. (2007) and Freeman et al. (2017).

  • Where an ID is identified as a callsign or as an identifier listed in WMO Publication No. 47, other IDs containing the same character string are flagged and their leading digits are removed. The purpose is to homogenise the callsigns across dcks.
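
The callsign homogenisation in the last bullet can be sketched as below. This is one possible reading of the rule, not the logic in homog_ids.R: the function name, the output column names and the example callsigns are hypothetical, and an ID is only treated as a variant when it consists of a known callsign preceded purely by digits.

```r
# Hypothetical helper: flag IDs that are a known callsign with extra leading
# digits, and replace them with the callsign itself.
homogenise_callsigns <- function(ids, known_callsigns) {
  out <- ids
  flagged <- rep(FALSE, length(ids))
  for (cs in known_callsigns) {
    # Part of the ID in front of the candidate callsign ("" if nothing is left)
    prefix <- substr(ids, 1, nchar(ids) - nchar(cs))
    hit <- endsWith(ids, cs) & grepl("^[0-9]+$", prefix)
    hit[is.na(hit)] <- FALSE  # missing IDs are never flagged
    out[hit] <- cs
    flagged[hit] <- TRUE
  }
  data.frame(id_original = ids, id_homog = out, flagged = flagged,
             stringsAsFactors = FALSE)
}

# Made-up IDs and callsigns; "3FXY9" shows a callsign that itself starts
# with a digit, which is why digits are not stripped blindly.
res <- homogenise_callsigns(
  ids = c("WXYZ", "1WXYZ", "22WXYZ", "12345", "13FXY9"),
  known_callsigns = c("WXYZ", "3FXY9")
)
print(res)
```

Matching against the list of known callsigns, rather than stripping every leading digit, keeps purely numeric IDs and callsigns that begin with a digit intact. If an ID could match more than one known callsign, this sketch keeps the last match, which the real processing may resolve differently.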
