discussion: test for icoads-r-hostace workflow
Hello,
Looking at the plots that Liz (@eck) sent, I was thinking we could write a testthat script for the entire workflow defined in the rscripts, probably using the very early years (a single year somewhere in 1800-1900), and set up GitLab's CI/CD feature so that @dyb's runner tests all rscripts on every change pushed to the repository. The test would check, say, that we get the same number of good reports, bad reports and duplicates for one year within the 1662-1899 period.
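A minimal sketch of what such a whole-workflow test could look like, assuming we have (or write) a small wrapper that runs the rscripts for a single year and returns the summary counts. The helper `run_year_workflow()` and the reference numbers are placeholders, not real values:

```r
# tests/testthat/test-workflow-1850.R
# Sketch: run the workflow for one early year and compare report counts
# against a frozen reference from a known-good JASMIN run.
library(testthat)

test_that("1850 workflow reproduces the reference report counts", {
  # Hypothetical wrapper around the rscripts for a single year
  counts <- run_year_workflow(year = 1850, out_dir = tempdir())

  # Placeholder reference counts; would be taken from a verified run
  expect_equal(counts$good_reports, 12345L)
  expect_equal(counts$bad_reports,   678L)
  expect_equal(counts$duplicates,     90L)
})
```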
This has several advantages:
- We don't have to run the code on JASMIN every time we make a change
- We ensure that the repository always stores a working version of the code
The LOTUS stats from Liz's runs show that they don't require much, so I think at least one year could be run on the runner. Or maybe we wait to see what the GitLab runner from the NOC will look like:
- very.early.bsub: runs all processing 1662-1849 (short-serial, memory 700 MB, 2 hr)
- early.bsub: runs all processing 1850-1899 (short-serial, memory 5 GB, 4 hr)
- emid_190dec.bsub: runs all processing 1900-1909 (short-serial, default memory 8 GB, 24 hr)
The only problem I see is that if we test icoads-r-hostace and icoads.utils in a single big test, we might not be able to pinpoint where errors occur. Ideally we would split this big test into tests for the different parts of the processing (e.g. a test just for split_by_type.R) and for each main icoads.utils function (e.g. add_date2.R).
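For the per-function tests, something small like the following could work. The behaviour assumed here for add_date2() (building a date column from year/month/day columns) is only a guess, as are the column names, so the assertions would need adjusting to the function's real signature:

```r
# tests/testthat/test-add_date2.R
# Sketch of a unit test for one icoads.utils function. The assumed
# behaviour and column names are guesses and should be replaced with
# the real contract of add_date2().
library(testthat)
library(icoads.utils)

test_that("add_date2 builds a date column from year/month/day", {
  obs <- data.frame(YR = 1850L, MO = 3L, DY = 14L)
  out <- add_date2(obs)

  expect_true("date" %in% names(out))
  expect_equal(as.Date(out$date[1]), as.Date("1850-03-14"))
})
```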
A possible compromise would be to only test code that we think we will revise or change in the future, so we don't write tests for code that might never change.
Let me know if you think this is a good idea. It will require some time, but it can also be done later.