Core code to-do list #18

Description

@sam-may

Here is my current to-do list of improvements and features to add to the core code.

Legend:
‼️ highest priority
❗ higher priority
🐢 lower priority/solution is time-intensive

  • 1. DataFetcher
    • 🐢 1.1 Scale automatically when running over a very large number of histograms (avoid excessive memory usage)
  • 2. Algorithms/training
    • ‼️ 2.1 Train only with good runs by default
    • ‼️ 2.2 Implement flattening of 2D histograms for PCAs (merge Si's code)
    • 2.3 Autoencoders
      • ❗ 2.3.1 Make training a single AutoEncoder per histogram the default behavior
      • ❗ 2.3.2 Make algorithms and training configurable through JSON input (rather than only through CLI arguments)
  • 3. Assessment
    • ‼️ 3.1 Make SSE histograms for good/bad runs in addition to train/test set
    • ❗ 3.2 Make SSE histograms both in per-events set format and per-algorithm format
    • ❗ 3.3 Plotting of 2D histograms
    • ❗ 3.4 Function for ROC curve plots
    • ❗ 3.5 Function to make summary table of AUC and TPR/FPR values
    • 🐢 3.6 Switch from yahist to boost, mplhep
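
To make items 2.2, 3.1, 3.4 and 3.5 concrete, here is a minimal sketch of how the pieces could fit together: flatten 2D histograms, fit a PCA on good runs only, compute the SSE per run, and derive a ROC curve and AUC from the good/bad SSE distributions. Everything in it is an assumption for illustration: the function names (`flatten_histograms`, `fit_pca`, `sse`, `roc_curve`, `auc`) are hypothetical and not taken from the core code, the data is a toy example, and a plain NumPy SVD stands in for whatever PCA implementation the core code uses.

```python
import numpy as np


def flatten_histograms(hists):
    """Flatten a stack of 2D histograms, shape (n_runs, nx, ny), to (n_runs, nx * ny)."""
    hists = np.asarray(hists, dtype=float)
    return hists.reshape(hists.shape[0], -1)


def fit_pca(X, n_components):
    """Fit a PCA via SVD of the mean-centered data; returns (mean, components)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def sse(X, mean, components):
    """Sum of squared errors between each row of X and its PCA reconstruction."""
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return ((centered - reconstructed) ** 2).sum(axis=1)


def roc_curve(scores_good, scores_bad):
    """FPR and TPR arrays, treating bad runs as positives and high SSE as anomalous.

    Tied scores are not merged into a single point in this sketch.
    """
    scores = np.concatenate([scores_good, scores_bad])
    labels = np.concatenate([np.zeros(len(scores_good)), np.ones(len(scores_bad))])
    labels = labels[np.argsort(-scores, kind="stable")]
    tpr = np.concatenate([[0.0], np.cumsum(labels) / labels.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - labels) / (1 - labels).sum()])
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Toy data (hypothetical): 40 good runs and 10 bad runs of an 8x6 histogram.
rng = np.random.default_rng(0)
template = np.outer(np.hanning(8), np.hanning(6))
good = template + rng.normal(0.0, 0.01, size=(40, 8, 6))
bad = template + rng.normal(0.0, 0.01, size=(10, 8, 6))
bad[:, 2:4, 1:3] += 0.5  # inject a hot region into the bad runs

# Train on good runs only (item 2.1), then score held-out good runs and bad runs.
X_train = flatten_histograms(good[:30])
mean, components = fit_pca(X_train, n_components=2)
sse_good = sse(flatten_histograms(good[30:]), mean, components)
sse_bad = sse(flatten_histograms(bad), mean, components)

fpr, tpr = roc_curve(sse_good, sse_bad)
roc_auc = auc(fpr, tpr)
print(f"AUC = {roc_auc:.3f}")
```

The `sse_good` and `sse_bad` arrays are what would be histogrammed for item 3.1, and `fpr`, `tpr` and `roc_auc` are the inputs a ROC plotting function (3.4) and a summary table (3.5) would need.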

edit: test
