Matching criteria · Changes

"fixed matching page description", authored May 26, 2020 by bearecinos
Showing with 1 addition and 11 deletions
Workflow/Matching-criteria.md
View page @ f1703b0b
Removed:
A flag indicating whether an `id` match is allow, is added to the data frames by [INSERT-LINK-TO-SCRIPT](). Matches of reports where the `id`'s meet the criteria listed in the table below are allow.
Generic `id`'s (e.g. blank, "SHIP",
"MASKSTID") are allowed to match within a `dck`.
These criteria have been developed by inspection of the paired `id`'s and are therefore likely to be approximate.
[Damerau–Levenshtein (DL) distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance) is the
number of insertions, deletions and swaps necessary to convert one string to another. A substring is where one `id` is contained within the other.
*Italics* in the table below represents the *"`id` type"*.

Added:
A flag indicating whether an `id` match is allowed is added to each report by [INSERT-THE-LINK-TO-SCRIPT](). Generic `id`'s (e.g. blank, "SHIP", "MASKSTID") are allowed to match within a `dck`. The table below contains the information used to decide whether `id`'s in a pair are an allowed match. *Italics* in the table below represents the *“`id` type”*.

Unchanged context:
________________ ________________
DCK | ID DCK | ID
......
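
The description above mentions two string comparisons between paired `id`'s: Damerau–Levenshtein distance and a substring test. The sketch below illustrates both in Python. It is not the project's script (the link to that script is still a placeholder on this page), and the function names `osa_distance` and `is_substring` are hypothetical. It implements the restricted variant of Damerau–Levenshtein distance (optimal string alignment), which counts insertions, deletions, substitutions and swaps of adjacent characters. Treating a blank `id` as never being a substring is an assumption made here, not something the page states.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions and adjacent swaps.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Swap of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def is_substring(id_a: str, id_b: str) -> bool:
    """True if one id is contained within the other.

    Assumption: a blank id is never treated as a substring match.
    """
    if not id_a or not id_b:
        return False
    return id_a in id_b or id_b in id_a


if __name__ == "__main__":
    print(osa_distance("SHIP", "SHPI"))  # one adjacent swap
    print(is_substring("ABC", "XABCY"))
```

A distance threshold or substring rule would then be applied per `dck` pair according to the table; those thresholds are not reproduced here.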