Last edited by ricorne 3 years ago

How to install

The scripts in this repository form a pure R package, but it has several dependencies, which can be installed by following the instructions below.

All the required packages should work on any platform, including Linux-based systems. The code has been tested with R v3.5.1 and R v3.6.3.

Dependencies

Here is a list of all the dependencies needed to run the code. The code has been tested with the most recent versions of the following packages:

Development tools

  • devtools
  • pryr
  • config

Data processing tools

  • stringdist
  • geosphere
  • jsonlite
  • lubridate
  • igraph

External R software

  • icoadsQC
  • icoads.utils
  • maptools
  • reticulate
  • chron

Install dependencies with conda (all platforms)

This is the recommended way to install all the dependencies: whether the code is run on a laptop or on a cluster, you don't have to re-install the R packages for each new session.

Prerequisites

You should have a recent version of the conda package manager.

You can get conda by installing Miniconda, which is what we recommend here to keep track of your R environment.

For more information, see the blog post "Using the R language with Anaconda".

Conda environment

Once conda is installed on your system, you can create a fixed R environment to use in every run with:

conda create -n r_env r-essentials r-base

Then activate it:

conda activate r_env

To install the code dependencies, your environment must be activated. You will know it is activated once you see the name of the environment (e.g. r_env) in parentheses at the beginning of your bash prompt:

(r_env) [brecinos@jasmin-sci2 ~]$
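
If you prefer to check this from a script rather than by reading the prompt, a minimal sketch follows. It assumes that conda activate sets the CONDA_DEFAULT_ENV variable, which is standard conda behaviour. The name env_is_active is a hypothetical helper, not part of conda.

```shell
# Hypothetical helper: succeeds only if the named conda environment is active.
# Assumes `conda activate` exports CONDA_DEFAULT_ENV.
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active r_env; then
    echo "r_env is active"
else
    echo "r_env is not active; run: conda activate r_env"
fi
```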

To install dependencies simply do:

conda install -c r r-"package_name"

For example:

conda install -c r r-config

Note: Always search for the exact install command for each package, since some packages might require the conda-forge channel instead:

conda install -c conda-forge r-"package_name"
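
To avoid typing one command per package, the two channels can be tried in turn from a small loop. This is only a sketch under stated assumptions: it assumes each package in the dependency list is published as r-<lowercase name> on one of the two channels, which you should confirm for each package. The names conda_name, install_pkg and DRY_RUN are hypothetical. By default the commands are only printed, not run.

```shell
# R packages from the dependency list that are assumed to exist on a conda channel.
# icoadsQC and icoads.utils are left out because they are installed manually.
PACKAGES="devtools pryr config stringdist geosphere jsonlite lubridate igraph maptools reticulate chron"

# Map an R package name to its assumed conda name (r-<lowercase name>).
conda_name() {
    printf 'r-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Set DRY_RUN=0 to really call conda; by default the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Try the r channel first and fall back to conda-forge if that fails.
install_pkg() {
    pkg="$(conda_name "$1")"
    if [ "$DRY_RUN" = "1" ]; then
        echo "conda install -y -c r $pkg || conda install -y -c conda-forge $pkg"
    else
        conda install -y -c r "$pkg" || conda install -y -c conda-forge "$pkg"
    fi
}

for p in $PACKAGES; do
    install_pkg "$p"
done
```

Run it once with the default dry run to review the commands, then again with DRY_RUN=0 inside the activated environment.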

IMMA toolbox

The IMMA data format is used for the dissemination of the ICOADS marine data. The R package imma provides functions to apply quality control to the ICOADS data.

It must be installed manually: download the .tar.gz file from the package repository and run the following command once your conda environment has been activated:

conda install ./imma-master.tar.gz

If the above doesn't work, try:

R CMD INSTALL imma-master.tar.gz 

ICOADS-utils toolbox

The icoads.utils package is a collection of utility functions to assist with the homogenization of platform identifier information and the identification of duplicated records in the processing tasks carried out by the scripts in rscripts.

It must be installed manually: download the .tar.gz file from the package repository and run the following command once your conda environment has been activated:

conda install ./icoads.utils-master.tar.gz

If the above doesn't work, try:

R CMD INSTALL icoads.utils-master.tar.gz 
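
After either command, it can be worth confirming that R actually sees the package. The sketch below assumes that Rscript from the activated environment is on your PATH. The name r_pkg_installed is a hypothetical helper; it relies only on base R's requireNamespace and quit.

```shell
# Hypothetical helper: exit status 0 if R can find the named package, 1 otherwise.
r_pkg_installed() {
    Rscript -e "quit(status = as.integer(!requireNamespace('$1', quietly = TRUE)))"
}

if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found; is the conda environment active?"
elif r_pkg_installed icoads.utils; then
    echo "icoads.utils is installed"
else
    echo "icoads.utils is missing; try: R CMD INSTALL icoads.utils-master.tar.gz"
fi
```

The same helper can be called with any other package name from the dependency list.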

Install the repository itself

For this to work you need git installed on your system. Then clone the latest version of the repository:

git clone git@git.noc.ac.uk:brecinosrivas/icoads-r-hostace.git

If you are working on JASMIN, you may have to configure your GitLab SSH keys (under GitLab Profile > Settings > SSH keys) and add a JASMIN public key to your profile (generated from within one of JASMIN's sci servers). This enables access from the JASMIN sci servers to the NOC GitLab platform, so that you can clone the repository there.

For more information, see the GitLab documentation on SSH keys.

Now you can move into the repository with:

cd icoads-r-hostace

The output of ls in the repository should look like this:

~/icoads-r-hostace$ ls
config.yml  README.md  rscripts  rutils  scr
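
The same check can be scripted, for instance before submitting jobs on a cluster. This is a minimal sketch that only looks for the five top-level entries listed above; check_layout is a hypothetical helper name, not something shipped with the repository.

```shell
# Hypothetical helper: report any expected top-level entry missing from a clone.
# Returns 0 if everything is present, 1 otherwise.
check_layout() {
    repo_dir="$1"
    missing=0
    for entry in config.yml README.md rscripts rutils scr; do
        if [ ! -e "$repo_dir/$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}

# Run from inside the cloned repository.
check_layout . || echo "this does not look like a complete clone"
```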

Modify config.yml according to where you want your input/output data to reside. For example, I have added a new folder called output_data.

So my local copy of the repository looks like this:

~/icoads-r-hostace$ ls
config.yml  output_data  README.md  rutils  OPFILES  rscripts  scr

OPFILES is my logging directory.

Each script in rscripts will take input data from the output_data folder and write output to the same directory (see simple_dup.R).

~/icoads-r-hostace/output_data$ ls
CROSS_COAST   MFILES_MOORED    MFILES_SHIP         NEW_PAIRFILES
CROSS_DRIFT   MFILES_NOTSHIP   MFILES_SHIP_FINAL   NEW_TRACK_INPUT
CROSS_MOORED  MFILES_PLAT      MFILES_SHIP_IDPROC  SHIP_CLEAN
MFILES_COAST  MFILES_REJECT    MFILES_SHIP_PROC
MFILES_DRIFT  MFILES_RESEARCH  NEW_DUP_FILES