At the moment, the duplication logic checks for identical entries in

'dup_cols': ['FINNGENID', 'APPROX_EVENT_DATETIME', 'harmonization_omop::OMOP_ID', 'cleaned::TEST_NAME_ABBREVIATION', 'source::MEASUREMENT_VALUE', 'TEST_OUTCOME', 'MEASUREMENT_FREE_TEXT'],

This causes duplicates to be preserved when:
- the numerical value is in the extracted data instead and everything else is the same
- the data is uploaded by another entity (regional vs. national) and the test outcome differs

Ideally, we should first collect all such cases and then work out a rule for choosing which entry to preserve over the other.
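
One way to collect such cases, sketched here under assumptions, is to group rows on a relaxed key (the `dup_cols` list minus the columns that are allowed to differ) and keep only the groups that the exact match would currently treat as distinct. The helper `collect_near_duplicates`, the choice of relaxed columns, and the sample rows are hypothetical and not taken from the pipeline; only the column names come from the config entry.

```python
from collections import defaultdict

# Column list copied from the 'dup_cols' config entry.
DUP_COLS = [
    "FINNGENID",
    "APPROX_EVENT_DATETIME",
    "harmonization_omop::OMOP_ID",
    "cleaned::TEST_NAME_ABBREVIATION",
    "source::MEASUREMENT_VALUE",
    "TEST_OUTCOME",
    "MEASUREMENT_FREE_TEXT",
]


def collect_near_duplicates(rows, relaxed_cols):
    """Group rows that match on every dup_cols column except relaxed_cols.

    Hypothetical helper, not part of the pipeline. Returns only groups in
    which at least two rows differ on the full key, i.e. cases that the
    exact match would keep as separate entries.
    """
    key_cols = [c for c in DUP_COLS if c not in relaxed_cols]
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row.get(c) for c in key_cols)].append(row)
    return [
        group
        for group in groups.values()
        if len({tuple(r.get(c) for c in DUP_COLS) for r in group}) > 1
    ]


# Made-up rows for illustration only.
base = {
    "FINNGENID": "FG0001",
    "APPROX_EVENT_DATETIME": "2020-01-01T10:00:00",
    "harmonization_omop::OMOP_ID": "3000963",
    "cleaned::TEST_NAME_ABBREVIATION": "b-hb",
    "source::MEASUREMENT_VALUE": "135",
    "TEST_OUTCOME": "N",
    "MEASUREMENT_FREE_TEXT": "",
}
rows = [
    base,
    {**base, "source::MEASUREMENT_VALUE": ""},  # value missing from the source column
    {**base, "FINNGENID": "FG0002"},            # different individual, not a duplicate
]

cases = collect_near_duplicates(
    rows, relaxed_cols={"source::MEASUREMENT_VALUE", "TEST_OUTCOME"}
)
```

Each returned group can then be inspected by hand to decide which entry should win, before any rule is added to the duplication logic itself.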
Referenced code: kanta_lab_preprocessing/core/magic_config.py, lines 56 to 57 in c86e505.