The scripts in this repository consist of a pure R package, but it has several dependencies, which can be installed by following the instructions below.
All the required packages should work on any platform, including Linux-based systems. The code has been tested in R v3.5.1 and R v3.6.3.
Dependencies
Here is a list of all the dependencies needed to run the code. The code has been tested with the most recent versions of the following packages:
Development tools:
- devtools
- pryr
- config
Data processing tools:
- stringdist
- geosphere
- jsonlite
- lubridate
- igraph
External R software:
- imma
Install dependencies with conda (all platforms)
This is the recommended way to install all the dependencies, so that when the code is run, either on a laptop or on a cluster, you don't have to re-install the R packages for each new session.
Prerequisites
You should have a recent version of the conda package manager.
You can get conda by installing miniconda, which is what we recommend here to keep track of your R environment.
See the following blog post: using the R language with Anaconda, for more information.
Conda environment
Once conda is installed on your system, you can easily create a fixed R environment to use in every run with:
conda create -n r_env r-essentials r-base
Then activate it:
conda activate r_env
To install the code dependencies, your environment must be activated. You will know it is activated when the name of the environment (e.g. r_env) appears in parentheses at the beginning of your shell prompt:
(r_env) [brecinos@jasmin-sci2 ~]$
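Reading the prompt works interactively, but inside a script the same check could be sketched as follows. It relies on the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated; expected_env and env_status are names chosen for this example only.

```shell
# Name of the environment created above.
expected_env="r_env"

# conda exports CONDA_DEFAULT_ENV with the name of the active environment.
# The variable is empty or unset when no environment is active.
if [ "${CONDA_DEFAULT_ENV:-}" = "$expected_env" ]; then
    env_status="active"
else
    env_status="inactive"
fi

echo "$expected_env is $env_status"
```

If the message reports the environment as inactive, run conda activate r_env before installing anything.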
To install dependencies simply do:
conda install -c conda-forge r-"package_name"
For example:
conda install -c conda-forge r-devtools
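To avoid typing one command per package, the list from the Dependencies section could be turned into a single command. This is a minimal sketch that assumes every package is published on conda-forge under the r- prefix, which is worth confirming first (for example with conda search -c conda-forge r-pryr). It only prints the command so that you can review it before running it; deps and pkgs are names chosen for this example.

```shell
# Dependencies listed in this README (imma is installed separately).
deps="devtools pryr config stringdist geosphere jsonlite lubridate igraph"

# Build the conda package names by adding the r- prefix to each one.
pkgs=""
for dep in $deps; do
    pkgs="$pkgs r-$dep"
done
pkgs="${pkgs# }"   # drop the leading space left by the loop

# Print the command for review; copy and run it once it looks right.
echo "conda install -c conda-forge $pkgs"
```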
IMMA toolbox
The IMMA data format is used for the dissemination of the ICOADS marine data. The imma package, also written in R, provides functions to read those files and to apply quality control to the data.
It must be installed manually by getting the .tar.gz file and running the following command once your conda environment is activated:
conda install ./imma_0.0.1.tar.gz
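Once everything is installed, a quick way to confirm that R can load each dependency is sketched below. It is an illustrative check and not part of the repository: r_pkg_loads is a hypothetical helper defined here, and the loop reports a package as missing if Rscript is not on the PATH.

```shell
# Hypothetical helper: succeeds only if R can load the package named in $1.
r_pkg_loads() {
    Rscript -e "suppressPackageStartupMessages(library('$1'))" >/dev/null 2>&1
}

ok=0
missing=0
for pkg in devtools pryr config stringdist geosphere jsonlite lubridate igraph imma; do
    if r_pkg_loads "$pkg"; then
        ok=$((ok + 1))
        echo "ok       $pkg"
    else
        missing=$((missing + 1))
        echo "MISSING  $pkg"
    fi
done

echo "$ok loaded, $missing missing"
```

Run it with the r_env environment activated, so that Rscript resolves to the R installation inside that environment.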
Install the repository itself
For this to work you'll need to have the git software installed on your system. Then, clone the latest repository version:
git clone git@git.noc.ac.uk:brecinosrivas/icoads-r-hostace.git
Now you can move into the repository with:
cd icoads-r-hostace
The output of ls in the repository should look like this:
~/icoads-r-hostace$ ls
config.yml README.md rscripts rutils scr
Modify config.yml according to where you want your input/output data to reside. For example, I have added a new folder called output_data.
So my local copy of the repository looks like this:
~/icoads-r-hostace$ ls
config.yml output_data rutils OPFILES rscripts scr
OPFILES is my logging directory.
Each script in rscripts (e.g. simple_dup.R) will take its input data from the output_data folder and write its output to this same folder.
~/icoads-r-hostace/output_data$ ls
CROSS_COAST MFILES_MOORED MFILES_SHIP NEW_PAIRFILES
CROSS_DRIFT MFILES_NOTSHIP MFILES_SHIP_FINAL NEW_TRACK_INPUT
CROSS_MOORED MFILES_PLAT MFILES_SHIP_IDPROC SHIP_CLEAN
MFILES_COAST MFILES_REJECT MFILES_SHIP_PROC
MFILES_DRIFT MFILES_RESEARCH NEW_DUP_FILES
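The top-level folders described above could be prepared with a couple of commands. This sketch assumes the folders need to exist before the scripts run, which this README does not state, and it does not create the subfolders shown in the listing. REPO_DIR is a variable invented for this example; output_data and OPFILES are simply the names used in my local copy.

```shell
# Run from the root of the cloned repository, or set REPO_DIR
# (a variable invented for this example) to point at it.
repo_dir="${REPO_DIR:-.}"

# Create the data and logging folders; use whatever paths you set in config.yml.
mkdir -p "$repo_dir/output_data" "$repo_dir/OPFILES"

ls "$repo_dir"
```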