update coloc readme to work with R14 data by Lipastomies · Pull Request #646 · FINNGEN/pheweb

Lipastomies · 2026-06-10T08:55:26Z

Imported R14 coloc data, and in the process noticed that some things had been changed.

Added a preprocessing step to the coloc import and fixed columns where necessary.

tobtobtob · 2026-06-10T13:41:48Z

Read this through, looks OK to me. This data loading pipeline is quite complicated. In the future we should try to simplify it, for example by extracting the Python scripts into files so that they can be run directly from the command line, similar to what we do in the SQL repo. Maybe this could also be moved to the SQL repo? Not something to do in this PR, though: I get that this is just an update to the old pipeline, so there is no need to refactor it now.

(I'm not familiar with the colocalization stuff, so it might be good if someone else reviews this as well.)

juhaa · 2026-06-10T14:11:49Z

Looks OK to me as well. As mentioned, this is quite complicated and could be simplified quite a lot, but doing that properly needs time. I definitely agree that this should be moved to the SQL repo.

juhaa · 2026-06-10T13:49:13Z


 ## Optional : Data Fixes

 Check the trait2 column for any rows where the entry begins with “seq”


This was unknown to me. It needs to be fixed directly in the source rather than here.
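
Until the source is fixed, the check quoted above could be done along these lines. This is only a minimal sketch: it assumes the coloc data is tab-separated with a header row containing a `trait2` column, and the function name `find_seq_rows` and the sample values are made up for illustration.

```python
import csv
import io


def find_seq_rows(tsv_text, column="trait2", prefix="seq"):
    """Return (line_number, value) pairs where `column` starts with `prefix`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        value = row.get(column) or ""
        if value.startswith(prefix):
            hits.append((line_number, value))
    return hits


# Made-up sample rows; real files have more columns.
sample = "trait1\ttrait2\nT2D\tseq.1234.5\nT2D\tENSG00000001\n"
print(find_seq_rows(sample))
```

An empty result would mean there is nothing to fix for that file.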

Lipastomies · 2026-06-11T06:58:43Z

Agree with that. We could do the following:

- Move all SQL statements, such as the coloc and variant table creation and the index and view creation, into a `create_coloc.sql` file.
- Write one or more scripts that do the processing needed to make the data importable directly into the tables with the `gcloud sql` command.

That way it would be a four-step operation:

1. Create the SQL tables.
2. Process the data into an ingestable form.
3. Copy the data to the bucket.
4. Import the data into the SQL tables.
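
The "process data to ingestable form" step could look roughly like the sketch below. It is not the existing pipeline: the column names in `EXPECTED_COLUMNS` and the function `to_ingestable` are hypothetical, and the real list would have to match the table definition. The output is written without a header row, on the assumption that the import treats every line as data.

```python
import csv
import io

# Hypothetical column list; the real one must match the coloc table definition.
EXPECTED_COLUMNS = ["trait1", "trait2", "pp_h4"]


def to_ingestable(tsv_in, csv_out, columns=EXPECTED_COLUMNS):
    """Read tab-separated rows, keep `columns` in table order, write headerless CSV.

    Returns the number of data rows written.
    """
    reader = csv.DictReader(tsv_in, delimiter="\t")
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"input is missing columns: {missing}")
    writer = csv.writer(csv_out, lineterminator="\n")
    count = 0
    for row in reader:
        # Reorder to the table's column order and drop any extra columns.
        writer.writerow([row[c] for c in columns])
        count += 1
    return count


# Made-up input with shuffled columns and one extra column.
source = io.StringIO("trait2\ttrait1\tpp_h4\textra\nGENE_A\tT2D\t0.91\tx\n")
target = io.StringIO()
written = to_ingestable(source, target)
print(written, repr(target.getvalue()))
```

Because the function takes file objects, it could be called from a small command-line wrapper with real files, which would fit the suggestion above to make the scripts runnable directly.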

update coloc readme to work with R14 data

eeb9ba4

Lipastomies requested review from Fedja, juhaa, majorseitan and tobtobtob June 10, 2026 08:55

Lipastomies self-assigned this Jun 10, 2026

juhaa approved these changes Jun 10, 2026

Lipastomies wants to merge 1 commit into master from coloc_readme_update_r14.al
