From 6c149133f3df071779d54c7f85f3b565522dd390 Mon Sep 17 00:00:00 2001 From: Romeo Kienzler Date: Fri, 12 Jun 2026 11:55:54 +0200 Subject: [PATCH 1/2] add documentation for --mp_context Signed-off-by: Romeo Kienzler --- docs/quick_start/quick_start.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/quick_start/quick_start.md b/docs/quick_start/quick_start.md index 319a7396..4725fd89 100644 --- a/docs/quick_start/quick_start.md +++ b/docs/quick_start/quick_start.md @@ -40,6 +40,7 @@ gridfm_graphkit train --config path/to/config.yaml | `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` | | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | | `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | ### Examples @@ -76,6 +77,7 @@ gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model | `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` | | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | | `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | --- @@ -107,6 +109,7 @@ gridfm_graphkit evaluate --config path/to/eval.yaml --model_path path/to/model.p | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | | `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` | | `--save_output` | `flag` | Save predictions as `_predictions.parquet` under MLflow artifacts (`.../artifacts/test`). | `False` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | ### Example with saved normalizer stats @@ -150,6 +153,7 @@ gridfm_graphkit predict --config path/to/config.yaml --model_path path/to/model. | `--bfloat16` | `flag` | Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`). | `False` | | `--tf32` | `flag` | Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`. | `False` | | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | --- @@ -170,6 +174,7 @@ gridfm_graphkit benchmark --config path/to/config.yaml | `--dataset_wrapper_cache_dir` | `str` | Directory for dataset wrapper disk cache. | `None` | | `--num_workers` | `int` | Override `data.workers` from YAML. | `None` | | `--plugins` | `list[str]` | Python packages to import for plugin registration. | `[]` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | Use built-in help for full command details: From 09f81574df2bc3002629bee0ee7b2fbfc0568a45 Mon Sep 17 00:00:00 2001 From: Romeo Kienzler Date: Fri, 12 Jun 2026 15:28:24 +0200 Subject: [PATCH 2/2] add documentation Signed-off-by: Romeo Kienzler --- README.md | 11 ++++++++--- docs/install/installation.md | 8 ++++---- 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index bf08fb43..7c9fdd02 100644 --- a/README.md +++ b/README.md @@ -28,9 +28,9 @@ python -m venv venv source venv/bin/activate ``` -Install gridfm-graphkit in editable mode +Install gridfm-graphkit from PyPI ```bash -pip install -e . +pip install gridfm-graphkit ``` **`torch-scatter` is a required dependency.** It cannot be bundled in `pyproject.toml` because the correct wheel depends on your PyTorch and CUDA versions, so it must be installed separately. @@ -49,7 +49,7 @@ pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION For documentation generation and unit testing, install with the optional `dev` and `test` extras: ```bash -pip install -e .[dev,test] +pip install "gridfm-graphkit[dev,test]" ``` @@ -94,6 +94,7 @@ gridfm_graphkit train --config path/to/config.yaml | `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` | | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | | `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | ### Examples @@ -130,6 +131,7 @@ gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model | `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` | | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | | `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | --- @@ -161,6 +163,7 @@ gridfm_graphkit evaluate --config path/to/eval.yaml --model_path path/to/model.p | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | | `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` | | `--save_output` | `flag` | Save predictions as `_predictions.parquet` under MLflow artifacts (`.../artifacts/test`). | `False` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | ### Example with saved normalizer stats @@ -204,6 +207,7 @@ gridfm_graphkit predict --config path/to/config.yaml --model_path path/to/model. | `--bfloat16` | `flag` | Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`). | `False` | | `--tf32` | `flag` | Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`. | `False` | | `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | --- @@ -224,6 +228,7 @@ gridfm_graphkit benchmark --config path/to/config.yaml | `--dataset_wrapper_cache_dir` | `str` | Directory for dataset wrapper disk cache. | `None` | | `--num_workers` | `int` | Override `data.workers` from YAML. | `None` | | `--plugins` | `list[str]` | Python packages to import for plugin registration. | `[]` | +| `--mp_context` | `str` | DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning. | `None` | Use built-in help for full command details: diff --git a/docs/install/installation.md b/docs/install/installation.md index 345091a9..d81a91c1 100644 --- a/docs/install/installation.md +++ b/docs/install/installation.md @@ -1,6 +1,6 @@ # Installation -The steps below mirror the [README](https://github.com/gridfm/gridfm-graphkit/blob/main/README.md#installation). Run them from the root of a local clone or source checkout of the repository. +The steps below mirror the [README](https://github.com/gridfm/gridfm-graphkit/blob/main/README.md#installation). Create and activate a virtual environment (make sure you use the right python version = 3.10, 3.11 or 3.12. I highly recommend 3.12) @@ -9,10 +9,10 @@ python -m venv venv source venv/bin/activate ``` -Install gridfm-graphkit in editable mode +Install gridfm-graphkit from PyPI ```bash -pip install -e . +pip install gridfm-graphkit ``` **`torch-scatter` is a required dependency.** It cannot be bundled in `pyproject.toml` because the correct wheel depends on your PyTorch and CUDA versions, so it must be installed separately. @@ -33,5 +33,5 @@ pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION For documentation generation and unit testing, install with the optional `dev` and `test` extras: ```bash -pip install -e .[dev,test] +pip install "gridfm-graphkit[dev,test]" ```