Note: this page is still incomplete and should be updated soon.
This page lists all available functions and variables used in the ICOADS R HOSTACE toolbox.
Workflow
Scripts here follow the same order as in the Workflow.
Script | Description |
---|---|
split_by_type.R | Separates records by platform type, according to a set of selection criteria. |
simple_dup.R | Identifies duplicate records between ship data and the other platform types, where the records show a matching date, time and position (lat, lon). |
ship2plat.R | Excludes non-ship data. Uses additional information from WMO Publication No. 47 to flag non-ship data (Kent et al., 2007). |
process_ids.R | Reformats ship ids coming from different data sources (sid) so that data from the same ship (same ship name) can be linked across different decks (dck) and to metadata in WMO Publication No. 47. More information in Processing of id's. |
process_shipdata.R | Corrects date and time errors noticed in some decks (dck). These largely arise from confusion over historical definitions of the marine day and conversions between local time and UTC. |
new_get_pairs.R | Identifies duplicate records within the ship data. Reports are paired as duplicates if they have associated ship ids. The candidate pairs are then selected according to i) the number of matching elements (similar content of variables within a specific tolerance), ii) the decks (dck), and iii) a comparison of the ids. |
new_get_dups.R | Counts the number of duplicated records and flags the best according to quality control criteria. Groups duplicated records by common callsigns. |
new_merge_ids_year.R | Links ids into classes. Ship tracks of the linked ids are then checked. Reports that fail the track check are flagged as the worst duplicate. Uses a ship tracking algorithm from r-imma. |
clean_data.R | ---. |
clean2track.R | Forms ship tracks for linked ids. --. |
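The exact-match idea described for simple_dup.R above can be illustrated with a minimal base R sketch. It is not the toolbox's implementation: the toy data, the helper name `find_simple_dups` and the choice of `merge` are assumptions made only for illustration. The column names follow the ICOADS variables listed on this page.

```r
# Hypothetical toy data: three ship reports and two reports from other platforms.
ships <- data.frame(
  uid = c("S1", "S2", "S3"),
  yr = c(1990, 1990, 1990), mo = c(1, 1, 2), dy = c(5, 6, 7), hr = c(12, 0, 18),
  lat = c(10.5, -3.2, 45.0), lon = c(120.0, 15.7, 330.1),
  stringsAsFactors = FALSE
)
others <- data.frame(
  uid = c("B1", "B2"),
  yr = c(1990, 1990), mo = c(1, 2), dy = c(5, 7), hr = c(12, 6),
  lat = c(10.5, 45.0), lon = c(120.0, 330.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the toolbox): pair reports that share
# exactly the same date, time and position.
find_simple_dups <- function(a, b,
                             keys = c("yr", "mo", "dy", "hr", "lat", "lon")) {
  merge(a[, c("uid", keys)], b[, c("uid", keys)],
        by = keys, suffixes = c(".ship", ".other"))
}

pairs <- find_simple_dups(ships, others)
print(pairs[, c("uid.ship", "uid.other")])
```

In this toy case only S1 and B1 are paired: S3 and B2 share a date and position but differ in hour, so they are not treated as a match.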
Utils
Functions ordered alphabetically.
Function | Description |
---|---|
add_date2.R | Adds a date variable to the data frame based on yr, mo, dy, hr. Invalid values are set to missing. Where the hour is missing, the date variable is generated with the hour set to local noon. A date.flag is also added: 0 = valid date and time; 1 = invalid date or time (time not missing); 2 = valid date, hr missing, 12 local added. |
add_dck_priority.R | Function used to identify duplicates. During the identification procedure, priorities are assigned to the data from each deck (dck; see source). Data expected to be of the best quality are assigned a priority of 1; data with larger priority numbers will be flagged as the worst duplicate if identified as a potential match. Priority values are based on those from ICOADS. |
add_ID_class.R | Adds a specific id class to the report. Based on the ICOADS documentation and other earlier sources, the ids have been classified by their "id type" (logbook number, ship number, etc.) and validity. Where an ITU callsign is found, the country for that callsign is also identified. |
add_shipnames.R | Adds ship names according to deck (dck) numbers. |
assess_match.R | Finds maximum differences between reports across variables of interest (sst, slp, at, dpt, wbt, rh, w, d, ww, n, vv, wh, nh, w1). Flags delayed-mode report(s) with an older IMMT number as worst duplicate(s). |
find_gap_func.R | ------- |
fix_usmm_705to707cors_func.R | Called in add_shipnames.R but never used. Should it be deleted from the repository? |
flag_id_dups.R | Adds a flag to reports that fail the IMMA ship track check. Such reports are flagged as the worst duplicate. |
get_gap_pos.R | ----- |
get_groups.R | Assigns a group number to a specific type of deck or decks (dck). The group is appended to the record id to avoid mixing data with the same id from different types of data (e.g. Japanese data, UK Navy). |
get_id_class.R | Used in add_ID_class.R. Assigns an id class to callsigns listed in the metadata of Pub. 47 (not sure). |
get_isolated.R | Contains function gcd.slc, which calculates the geodesic distance between two points given as latitude/longitude in radians, using the spherical law of cosines (slc). Open question: should we only have gcd.slc alone? |
get_itu_country.R | Gets the ITU callsign prefix associated with a country. |
get_matchedids.R | ------ |
get_mismatch.R | ------- |
get_prec.R | Gets the dck/sid precision of measurement for selected variables (sst, slp, at, dpt, w, d, n, ww, w1, vv). |
get_speed.R | Also contains function gcd.slc, which calculates the geodesic distance between two points given as latitude/longitude in radians, using the spherical law of cosines (slc). Open question: do we need this? It does not seem to be used, only gcd.slc. |
id_group_func.R | Makes sure that all ids grouped in get_groups.R are properly associated. |
liz_merge.R | Merges two data frames by specific columns. |
new_add_match_id.R | Tests whether paired ids are allowed to match, according to a set of matching criteria and using the Damerau–Levenshtein (DL) distance. |
new_homog_ids.R | Uses standard linkages between ids to add id homogenisation information to each record. |
print_id_match_info.R | Prints ICOADS duplicate information and id matching results. |
read_rdsfiles.R | A collection of functions to read the different types of data files (e.g. .rds, .txt) used throughout the code. |
write_dup_func.R | Writes duplicate information in a pipe-delimited year-month output format. The report uid is followed by the report id, then a flag with value 1 if the id has been changed or 0 if it remains the same. Examples of the format: ICOADS-30-0Y0HJK \| 32024 \| 0 and ICOADS-30-0Y0HJL \| 14 00117 \| 1 |
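The gcd.slc entries in the table above describe a great-circle distance computed with the spherical law of cosines. The sketch below shows that formula in base R; it is not the toolbox's code. The function name `gcd_slc_sketch`, the argument order, the helper `deg2rad` and the Earth radius of 6371 km are all assumptions made for illustration.

```r
# Hypothetical helper: convert degrees to radians.
deg2rad <- function(deg) deg * pi / 180

# Sketch of the spherical law of cosines. Inputs are in radians; the result is
# in the units of `radius` (assumed mean Earth radius in km).
gcd_slc_sketch <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  cos_d <- sin(lat1) * sin(lat2) +
    cos(lat1) * cos(lat2) * cos(lon2 - lon1)
  # Clamp to [-1, 1] so rounding error cannot push acos() outside its domain.
  cos_d <- pmin(pmax(cos_d, -1), 1)
  acos(cos_d) * radius
}

# Two points on the equator, 90 degrees of longitude apart:
# a quarter of the circumference, about 10007.5 km.
d_quarter <- gcd_slc_sketch(deg2rad(0), deg2rad(0), deg2rad(90), deg2rad(0))

# Identical points give a distance of zero.
d_same <- gcd_slc_sketch(deg2rad(10), deg2rad(45), deg2rad(10), deg2rad(45))

print(c(d_quarter, d_same))
```

The clamp is a small design choice: for identical or nearly identical points the cosine can come out marginally above 1 in floating point, which would otherwise make `acos` return `NaN`.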
ICOADS variables used
For more details, visit the main ICOADS website or check the IMMA report.
Variable names ordered alphabetically.
Code | Description |
---|---|
ah | high cloud amount |
am | middle cloud amount |
at | air temperature |
c1 | country that recruited the ship |
c1m | recruiting country |
cce | change code |
che | high cloud type |
cle | low cloud type |
cme | middle cloud type |
d | wind direction (true) |
dck | deck |
dpt | dew-point temperature |
ds | ship course |
dups | duplicate status |
dy | day UTC |
eoh | exposure of hygrometer |
es | thickness of Is (ice accretion on ship) |
he | lower cloud base height |
hop | height of visual observation platform |
hr | hour UTC |
id | identification/callsign |
ii | ID indicator |
immv | IMMT version (International Maritime Meteorological Tape) |
ir | indicator for precipitation data |
irf | intermediate reject flag |
is | ice accretion on ship |
lat | latitude |
lon | longitude |
lz | 2°×2° landlocked flag |
mds | metadata source |
mo | month UTC |
n | cloud amount |
ne | total cloud amount |
nh | amount of lowest clouds |
nhe | lower cloud amount |
nid | national source indicator |
oav | alkalinity value |
oaz | alkalinity depth |
ocv | total chlorophyll value |
ocz | total chlorophyll depth |
onv | nitrate value |
onz | nitrate depth |
oov | dissolved oxygen |
ooz | dissolved oxygen depth |
ophv | pH value |
ophz | pH depth |
opv | phosphate value |
opz | phosphate depth |
osiv | silicate value |
osiz | silicate depth |
osv | salinity value |
osz | salinity depth |
pt | platform type |
qci | quality control indicator |
rh | relative humidity |
ri | relative lunar illuminance |
rrr | amount of precipitation |
rs | rate of Is (ice accretion on ship) |
sa | solar altitude |
sbi | sky-brightness indicator |
si | SST measurement method indicator |
sid | source ID |
sim | SST measurement method |
slp | sea level pressure |
sme | source metadata element |
smf | source metadata file |
smv | source format version |
sst | sea surface temperature |
sx | swell period indicator |
uh | NOL high amount |
uid | unique report ID |
um | NOL middle amount |
vs | ship's average speed |
vv | visibility |
w | wind speed |
w1 | past weather |
wbt | wet-bulb temperature |
wh | wave height |
wmi | indicator for wave measurement |
ww | present weather |
wwe | present weather |
wx | wave period indicator |
yr | year UTC |