Introducing coarse-grained molecular dynamics simulations and updates to training #9

Open: datagato wants to merge 4 commits into main from cg-experiments

datagato (Owner) commented Jun 27, 2026

High-level updates from this PR:

  • Introduces coarse-grained MD simulations using the Martini 3 force field (with solvent subtraction) for faster Proteogram generation (5-10x faster than atomistic)
  • Fixes to atomistic Proteogram creation for more consistent channel logic and stacking
  • New training script that performs the train/test split on the fly, with a seed argument for reproducibility (changing the seed yields a different split, which can later serve as a basis for k-fold-style cross-validation); it also adds focal loss as an optional training loss, enabled by a command-line flag
  • Script to download PDB entries by date cutoff, for future data pulls that go beyond SCOPe
  • Annotation script updates (the goal is a "GO slim" as a future Proteogram label, for classification by Gene Ontology annotations: biological process, molecular function, and cellular component)
  • Documentation updates
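
For reviewers unfamiliar with the two training changes, here is a minimal sketch of a seed-driven train/test split and a per-sample focal loss. It is an illustration only, not the code in this PR: the names `seeded_split` and `focal_loss`, the `test_fraction`, `gamma`, and `alpha` parameters, and their defaults are all assumptions, and the actual script may implement both differently.

```python
import math
import random


def seeded_split(items, test_fraction=0.2, seed=0):
    """Hypothetical helper: shuffle a copy of `items` and return (train, test).

    A private random.Random(seed) keeps the split reproducible without
    touching the global random state.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Hypothetical helper: focal loss for a single sample.

    Computes -alpha * (1 - p_t)**gamma * log(p_t), where p_t is the
    predicted probability of the true class. With gamma = 0 and alpha = 1
    this reduces to ordinary cross-entropy.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    ids = [f"proteogram_{i}" for i in range(10)]  # placeholder sample IDs
    train, test = seeded_split(ids, test_fraction=0.2, seed=42)
    print(len(train), "train samples,", len(test), "test samples")

    # A confidently correct prediction is down-weighted relative to
    # plain cross-entropy (gamma = 0).
    confident = [0.1, 0.9]
    print(focal_loss(confident, target=1, gamma=2.0))
    print(focal_loss(confident, target=1, gamma=0.0))
```

Running the split again with the same seed returns the same partition, while a different seed gives a different one. Note that independent random splits can overlap in their test sets, so strict k-fold cross-validation would need disjoint folds.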
