The algorithm we wrote without pandas breaks down when a cell contains a line break, because it assumes that each record occupies exactly one line of the file. We noticed this issue with ampscz_pps01_baseline.csv.
So the suggestion is to verify the line numbers in vim first. If line number+2 matches the reported lines, omit_rows.py can be used. Otherwise, the problematic lines need to be removed manually in vim. For files with line breaks inside cells, even identifying the problematic lines in vim is difficult.
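A minimal sketch of why the line numbers stop matching, using only the Python standard library. The sample data is made up for illustration and is not taken from ampscz_pps01_baseline.csv. It shows that a quoted cell containing a line break makes the number of physical lines (what vim counts) larger than the number of CSV records.

```python
import csv
import io

# Hypothetical sample: a header plus two records.
# The second record has a line break inside a quoted cell.
sample = 'id,notes\n1,"fine"\n2,"first line\nsecond line"\n'

# What an editor such as vim counts: physical lines.
physical_lines = sample.splitlines()

# What a CSV parser counts: records (quoted line breaks stay inside the cell).
records = list(csv.reader(io.StringIO(sample, newline='')))

print(len(physical_lines))  # 4
print(len(records))         # 3
```

Every record after the multi-line cell is shifted by one physical line, so a fixed offset between reported rows and editor line numbers no longer holds.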
FIX: use pandas to read the file and write an NDA-compatible CSV file.
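One way the pandas-based fix could look is sketched below. This is not the actual omit_rows.py. The function name omit_rows_with_pandas and its parameters are hypothetical, and the sketch assumes that the first line of the file is a structure line (for example a structure name and version), that the second line holds the column names, and that bad_records contains 0-based record positions counted after those two lines.

```python
import pandas as pd


def omit_rows_with_pandas(src, dst, bad_records):
    """Drop the given records from src and write the result to dst.

    Hypothetical helper. bad_records holds 0-based record positions,
    counted after the structure line and the column-name line.
    """
    # Assumption: line 1 is a structure line that must be copied unchanged.
    with open(src, newline='') as f:
        structure_line = f.readline().rstrip('\r\n')

    # Skip the structure line; pandas keeps quoted line breaks inside cells,
    # so each record is one row regardless of how many lines it spans.
    # dtype=str and keep_default_na=False avoid altering cell values.
    df = pd.read_csv(src, skiprows=1, dtype=str, keep_default_na=False)

    # Remove the problematic records by position in the parsed table.
    df = df.drop(index=bad_records)

    # Write the structure line back, followed by the remaining records.
    with open(dst, 'w', newline='') as f:
        f.write(structure_line + '\n')
        df.to_csv(f, index=False)
```

Because rows are addressed by record position rather than by physical line number, the result does not depend on whether any cell contains a line break.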