[FR]: Sophisticated sdist_build support #860

@arrdem

What is the current behavior?

Today the sdist_build support can reasonably be accused of being a placeholder. We need a rule that goes from a source distribution archive to an installable, platform-specific wheel in the most generic manner possible.

Achieving this is trivial in the simple cases, especially legacy -none-any packages which just don't ship prebuilds, and that is about what's currently implemented. But the entire point of the uv project was to achieve generalized support for native extensions, such as those built with PyO3 and Cython, which require at a minimum a C toolchain and likely non-Python build-time dependencies in order to produce hermetic build results. Today there's a hat tip in this direction in the form of the sdist_native_build rule, which uses execution constraints to allow for RBE crossbuilds. However, there's no coherent mechanism in the current uv extension for native sdist builds to be autoconfigured, or for appropriate toolchains to be identified and injected as part of build orchestration.

Describe the feature

Long-time Bazel ecosystem participants will recognize this as just the latest incarnation of the seemingly never-ending struggle between Bazel's phasing model, which demands static dependencies, and the outside world, where dynamic dependencies are replete. The conventional solution is exemplified by the Gazelle and rules_go stack, which cheats the phasing model by running a prebuilt tool during the repository phase. That tool performs code and dependency analysis and drives BUILD.bazel generation from dependencies that are intractable to analyze statically.

https://pypackaging-native.github.io/key-issues/native-dependencies/ has a useful writeup of some of the challenges in the Python ecosystem. The Astral folks have mentioned interest in efforts to standardize a native dependency story, but as of today no such standard really exists. Pixi is a partial solution to this problem; astral-sh/uv#14735 and astral-sh/uv#2252 are, I believe, germane.

For more modern packages, there are conventions for declaring the packages which must be installed in order to perform a source build. I do not believe that such build-time transitive dependencies are included in uv's lock solutions at present. For older setuptools-based packages, interpretation of setup.py files is required. This is made worse by legacy setup.py systems which perform custom calls to CMake and the like.

An idea I've batted around before, and which is in many ways an extension of the current annotations.toml pattern, is to create a non-standardized, rules_py-specific "database" format noting the build-time dependencies for key Python ecosystem packages, as a possible input to partial solutions to this rat's nest of issues.

I highlight all these issues not to propose that we solve them comprehensively, but to impress that there is at present no reasonable, let alone comprehensive, solution to these problems. The best I think we can technically achieve is to adjust the design of the ruleset so that we provide an sdist build configuration interface which is substantially modular, precisely so that we do not have to be in the business of owning a comprehensive solution to this problem space. Users can bring their own, and if a more comprehensive ecosystem solution does become available, hopefully we can integrate it without fuss.


The sketch I would propose is that we combine the new PBS toolchain support and the prebuilt tool pattern so that we can invoke a "simple" Python script with a PBS-sourced interpreter during the repository phase of configuring sdist_build repos. Following the example of rules_go, we would replace at least most sdist_build buildfile generation with a Python script whose responsibility would be to generate a conventional build target, using whatever rules, rulesets, and dependencies it may require. This would satisfy @zbarsky-openai's ask that we have general support for sdist builds consuming toolchains such as cc and make: users could bring a configuration script which does so appropriately. We would not have to weigh down our baseline sdist_build rule with dependencies on toolchains which are irrelevant to the majority of wheel builds and, perhaps worse, inadequate to the configuration needs of those wheels which do have advanced builds.
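To give a feel for what such a user-supplied configuration script might look like, here is a rough sketch that renders BUILD file text for one sdist. It is not a proposed interface: the `@my_rules//:defs.bzl` load path, the `my_native_wheel` rule, and its `build_deps` attribute are all hypothetical stand-ins for whatever rule a user would actually bring, and how the script is invoked and receives its inputs is left open.

```python
import sys


def render_build_file(name: str, srcs_glob: str, build_deps: list[str]) -> str:
    """Render BUILD file text for one sdist using a user-chosen rule.

    The load path, rule name, and attribute names are hypothetical.
    """
    # Sort the labels so the generated file is deterministic.
    deps = "".join(f'        "{label}",\n' for label in sorted(build_deps))
    return (
        "# Generated by a hypothetical sdist configuration script.\n"
        'load("@my_rules//:defs.bzl", "my_native_wheel")  # hypothetical rule\n'
        "\n"
        "my_native_wheel(\n"
        f'    name = "{name}",\n'
        f'    srcs = glob(["{srcs_glob}"]),\n'
        "    build_deps = [\n"
        f"{deps}"
        "    ],\n"
        '    visibility = ["//visibility:public"],\n'
        ")\n"
    )


if __name__ == "__main__":
    # A repository rule would presumably capture this output as BUILD.bazel.
    sys.stdout.write(render_build_file("whl", "**/*", ["@hypothetical_deps//:zlib"]))
```

The point of the design is that the toolchain and dependency choices live entirely in the script's output, so the baseline sdist_build rule never needs to know about them.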
