Duplicate indentification · Changes

Page history
added track check part · authored Jun 09, 2020 by bearecinos
Showing with 4 additions and 2 deletions
Workflow/Duplicate-indentification.md
View page @ f5710e90
@@ -23,9 +23,11 @@ Second stage
Third stage
-----------
-At this stage we are able to count the number of duplicated records and flag the best according to a [quality control criteria](Workflow/quality-control). The duplicate pairs are also combine into groups. Each group of possible duplicates is then assessed for quality control. This process it is important to account for known differences between `dck`'s that are not captured in the precision information of previous processing stages.
+At this stage we are able to count the number of duplicated records and flag the best according to the [quality control criteria](Workflow/quality-control). The duplicate pairs are also combined into groups. Each group of possible duplicates is then assessed for quality control. This process is important to account for known differences between `dck`'s that are not captured in the precision information of previous processing stages.
Fourth stage
-----------
Once the date/time/location parameter value duplicates have been identified and flagged, the next stage of the processing considers together the data that have associated `id`'s. Sometimes the link between `id`'s can be used to homogenise the `id`'s beyond the individual pairs; sometimes the link is specific to a particular pair of reports, particularly if one of the matched `id`'s is generic. `id` matches are therefore only considered within-group. At the end of the processing the suffix “_gN” is appended to the `id`'s, where N is the group number. More information on the group assignments by `dck` and `id` can be found in the technical report (C3S_D311a_Lot2.dup_doc_v3.docx, tables 15 and 16).
The linked `id`'s are then checked using the [MOQC track check](Workflow/Quality-control#met-office-track-check) and for time duplicates. Reports that fail the track check are flagged as worst duplicates. Where positions are similar, the best duplicate is selected by `dck` priority and by the number of elements with similar variable content.
\ No newline at end of file
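
To make the third-stage logic above concrete, here is a minimal Python sketch, not the project's actual code: the function names, the union-find grouping, and the `score` callable are illustrative assumptions about how duplicate pairs can be merged into groups and one record per group flagged as best.

```python
from collections import defaultdict

def group_duplicate_pairs(pairs):
    """Merge duplicate pairs into groups via union-find (connected components)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for a, b in pairs:
        union(a, b)

    groups = defaultdict(list)
    for record in parent:
        groups[find(record)].append(record)
    return list(groups.values())

def flag_group(group, score):
    """Flag the highest-scoring record as 'best' and the rest as 'duplicate'."""
    best = max(group, key=score)
    return {record: ("best" if record == best else "duplicate") for record in group}

# Pairs (A, B) and (B, C) collapse into one group {A, B, C}; (D, E) stays separate.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
for group in group_duplicate_pairs(pairs):
    print(flag_group(group, score=lambda record: 1))  # stand-in score: all equal
```

In the real processing the score would come from the quality control criteria (for example `dck` priority and the number of reported elements), not from a toy callable.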
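Similarly, a hypothetical sketch of the fourth-stage `id` handling: appending the “_gN” group suffix and selecting the best duplicate by `dck` priority and number of matching elements. The data layout, the `dck_rank` mapping, and all ids and deck numbers below are assumptions made up for illustration.

```python
def tag_group_ids(groups):
    """Append the "_gN" suffix to every id in group N (1-based group numbering)."""
    return {
        report_id: f"{report_id}_g{n}"
        for n, group in enumerate(groups, start=1)
        for report_id in group
    }

def pick_best(reports, dck_rank):
    """Select the best duplicate: lowest dck rank first, then most matching elements."""
    return min(reports, key=lambda r: (dck_rank.get(r["dck"], 99), -r["n_matching"]))

# Two groups of linked ids; ids, deck numbers and ranks are illustrative only.
groups = [["SHIPA", "MASKED01"], ["GENERIC7"]]
print(tag_group_ids(groups))
# {'SHIPA': 'SHIPA_g1', 'MASKED01': 'MASKED01_g1', 'GENERIC7': 'GENERIC7_g2'}

reports = [
    {"id": "SHIPA_g1", "dck": 926, "n_matching": 5},
    {"id": "MASKED01_g1", "dck": 992, "n_matching": 7},
]
print(pick_best(reports, dck_rank={926: 1, 992: 2})["id"])  # -> 'SHIPA_g1'
```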