build: migrate extension build to scikit-build-core #230

Open
Eldergenix wants to merge 5 commits into deepgenomics:main from Eldergenix:issue-177-scikit-build-core

@Eldergenix Eldergenix commented May 13, 2026

Summary

  • migrate the C++ extension build from setuptools/distutils monkeypatching to scikit-build-core with CMake
  • move package metadata and build configuration into pyproject.toml, and remove the obsolete setup.py/MANIFEST.in path
  • update conda, wheel, docs, and development-environment build inputs for the new backend
  • harden the conda recipe patch helper and macOS SDK discovery used by the CMake build

Tests

  • python -m build --sdist --wheel
  • tar -tf dist/*.tar.gz | rg '(^|/)src/py_init.cpp$|(^|/)src/genome_kit.cpp$|(^|/)src/genome_kit.h$|(^|/)CMakeLists.txt$|(^|/)LICENSE$|(^|/)COPYRIGHT.txt$'
  • unzip -l dist/*.whl | rg 'genome_kit/_cxx|genomekit-7.4.4.dist-info/(METADATA|licenses/LICENSE|licenses/COPYRIGHT.txt)'
  • python -m pip install --force-reinstall dist/*.whl
  • cd /tmp && python -c "import genome_kit, genome_kit._cxx; print(genome_kit.__version__)"
  • CI=1 python -m unittest discover (500 tests, 50 skipped, 1 expected failure)
  • python -m compileall -q genome_kit tests
  • git diff --check
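
The sdist check in the list above can also be sketched in Python for environments without `rg`. This is a minimal sketch, not part of the PR: the helper names `missing_sdist_members` and `check_sdist` are hypothetical, and the expected paths are copied from the `rg` pattern above.

```python
import re
import tarfile

# Paths taken from the rg pattern in the test list above.
EXPECTED_SDIST_PATHS = (
    "src/py_init.cpp",
    "src/genome_kit.cpp",
    "src/genome_kit.h",
    "CMakeLists.txt",
    "LICENSE",
    "COPYRIGHT.txt",
)


def missing_sdist_members(names, expected=EXPECTED_SDIST_PATHS):
    """Return the expected paths that no archive member name ends with."""
    missing = []
    for path in expected:
        # Same anchoring as the rg pattern: start of the name or a '/' before the path.
        pattern = re.compile(r"(^|/)" + re.escape(path) + r"$")
        if not any(pattern.search(name) for name in names):
            missing.append(path)
    return missing


def check_sdist(archive_path):
    """Open a .tar.gz sdist and report which expected files are absent (hypothetical helper)."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return missing_sdist_members(archive.getnames())
```

Calling `check_sdist` on the built tarball returns an empty list when every expected file is present, which makes it easy to turn into a CI assertion.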

Notes

  • PR #169 ("feat: support python 313 (wip)") is still an open draft that touches the legacy build files; this branch is rebased on current main.
  • Maintainer edits are enabled for this fork PR.

Closes #177

@Eldergenix Eldergenix force-pushed the issue-177-scikit-build-core branch from 6f41d0c to 97a2594 Compare May 13, 2026 11:08
Contributor

ovesh commented May 13, 2026

This is a lot to take in.

  1. Why are we migrating to scikit-build-core and not vanilla python tooling?
  2. Can the change be made in smaller incremental PRs that would be easier to review?

@Eldergenix
Author

Thanks for calling this out. I agree this is a fairly large change. It is large for good reason, but I can split it up so that it is easier to review.

My reasoning for choosing scikit-build-core over “vanilla” Python tooling is that GenomeKit’s build is not just Python packaging. The main complexity is building the native C++ extension while preserving the existing compiler flags, linker behavior, NumPy include handling, zlib linkage, and other platform-specific behaviors.

Using only setuptools would likely still require custom build_ext logic or another layer of glue to keep that native build working correctly. That is still relatively close to the kind of custom build machinery this change is trying to remove.

scikit-build-core still uses the standard Python packaging entry points: pyproject.toml, PEP 517 builds, pip install ., editable installs, and normal wheel/sdist frontends. The difference is that the backend delegates the C++ build to CMake. Since this repository already has CMake, that seemed like the lowest-risk way to modernize the build.
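
To make the backend switch concrete, here is a minimal sketch of how a PEP 517 frontend sees it. The TOML fragment is illustrative only and is not copied from this PR's pyproject.toml; `scikit_build_core.build` is the backend name that scikit-build-core documents, and `build_backend` is a hypothetical helper.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older interpreters

# Illustrative fragment only; the real file in the PR may pin versions
# or list further build requirements.
PYPROJECT_SKETCH = """
[build-system]
requires = ["scikit-build-core"]
build-backend = "scikit_build_core.build"
"""


def build_backend(pyproject_text):
    """Return the PEP 517 backend named in a pyproject.toml document."""
    config = tomllib.loads(pyproject_text)
    return config["build-system"]["build-backend"]


print(build_backend(PYPROJECT_SKETCH))
```

Frontends such as pip and `python -m build` read this table, install the listed requirements into an isolated environment, and call the named backend, so the commands developers run stay the same after the migration.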

It also matches the direction suggested in issue #177, where scikit-build-core was called out as a good fit for existing CMake projects.

That said, I agree this would be easier to review as smaller PRs.

What do you think about the following split:

  1. Add or modernize the CMake target for the _cxx extension while keeping the current setuptools build path intact.
  2. Switch pyproject.toml to scikit-build-core and update the minimum CI/build changes needed to verify the new path.
  3. Remove obsolete setuptools/distutils monkeypatch code and update developer documentation once the new path is proven.

The backend switch itself probably needs the CMake extension target and pyproject.toml changes to land together, otherwise the package may not build in the intermediate state. But the surrounding cleanup and documentation changes can definitely be split out.

I bundled the work because it creates one complete migration path that can be tested end to end, but I’m fine with breaking it into smaller, reviewable PRs if that is the preferred approach.

Let me know if the split above is acceptable and I'll make the changes.

Development

Successfully merging this pull request may close these issues.

Replace parallel build monkeypatch in favor of a modern build framework

2 participants