Python bindings via PyO3 #76

@devxrachit

RFC: Python bindings via PyO3 (tracking issue)

Motivation

The README lists "Straightforward Python bindings using pyo3" as a roadmap goal (README.md, "Somewhere down the line"). This issue proposes a concrete plan to deliver that in a staged, reviewable way, and tracks the work across multiple PRs.

Exposing Rustframe to Python widens its audience within the educational use case the crate targets (notebooks, classroom demos, and quick numeric experiments) without compromising the pure-Rust experience for existing users.

Goals

  • Make the core Matrix<f64> API usable from Python with idiomatic ergonomics (constructors, indexing, arithmetic, shape).
  • Keep the default Rust build untouched: PyO3 must be opt-in via a Cargo feature so cargo build / cargo test / crates.io are unaffected.
  • Use maturin for building and packaging wheels.
  • Lay foundations that the frame, compute, and random modules can plug into later without rework.

Non-goals (for now)

  • Publishing to PyPI: wheel building in CI is deferred to a later PR.
  • numpy zero-copy interop: deferred to a follow-up so the first PR stays small.
  • Binding generic Matrix<T> over arbitrary types: we start with Matrix<f64> and add Matrix<bool> / Matrix<i64> as separate classes if/when needed.
  • Frame, compute::*, and random::* bindings: each gets its own follow-up PR.

Proposed architecture

1. Feature flag

In Cargo.toml:

[features]
python = ["dep:pyo3"]

[dependencies]
pyo3 = { version = "0.22", features = ["extension-module"], optional = true }

crate-type = ["cdylib", "lib"] is already set, so no change there.

2. Source layout

Add a src/python/ module, gated on the python feature:

src/
  lib.rs              # add: #[cfg(feature = "python")] mod python;
  python/
    mod.rs            # #[pymodule] fn rustframe(...)
    matrix.rs         # PyMatrix wrapping Matrix<f64>
    errors.rs         # map Rustframe panics to IndexError / ValueError

A separate workspace crate was considered, but a feature-gated module keeps the diff small and avoids workspace churn. We can extract later if the binding surface grows.
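
As a rough illustration of the gating, here is a minimal, std-only sketch. It assumes the feature is named python as proposed above; the module body and the python_bindings_enabled helper are hypothetical and exist only to show that nothing inside the gated module is compiled in a default build.

```rust
// Assumption: the Cargo feature is called "python", as proposed in this issue.
// With the feature off (the default), this module is not compiled at all,
// so a plain `cargo build` never sees any binding code.
#[cfg(feature = "python")]
mod python {
    // Placeholder for the real #[pymodule] definition.
    pub fn module_name() -> &'static str {
        "rustframe"
    }
}

/// Hypothetical helper: reports whether this build includes the bindings.
pub fn python_bindings_enabled() -> bool {
    cfg!(feature = "python")
}
```

In a default build the helper returns false; it would only return true when the crate is compiled with --features python.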

3. First binding surface (PyMatrix)

Wrap Matrix<f64> and expose:

  • Constructors: Matrix(data: list[list[float]]) (mirrors Matrix::from_cols), Matrix.from_rows(data), Matrix.zeros(rows, cols), Matrix.from_flat(data, rows, cols, order="col").
  • Properties: .shape, .rows, .cols.
  • Access: m[r, c] (__getitem__ / __setitem__).
  • Arithmetic: __add__ / __sub__ / __mul__ / __truediv__ for both scalar and PyMatrix operands (mirrors mat.rs operator impls).
  • Methods: .transpose(), .matmul(other) / .dot(other).
  • Dunder methods: __repr__, __eq__, __len__ (returns the row count).
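
To make the intended semantics concrete, here is a small std-only sketch of the wrapper's shape, length, indexing, and from_flat behavior. DemoMatrix is a hypothetical stand-in, not Rustframe's Matrix<f64> and not the final PyMatrix. It assumes column-major storage and assumes that "row" is the alternative value for the order argument; neither is confirmed by this issue beyond the order="col" default.

```rust
/// Hypothetical stand-in for the proposed PyMatrix; not Rustframe's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct DemoMatrix {
    data: Vec<f64>, // assumed column-major: element (r, c) lives at c * rows + r
    rows: usize,
    cols: usize,
}

impl DemoMatrix {
    /// Sketch of `Matrix.from_flat(data, rows, cols, order=...)`.
    /// The "row" option is an assumption made for illustration.
    pub fn from_flat(data: Vec<f64>, rows: usize, cols: usize, order: &str) -> Result<Self, String> {
        if data.len() != rows * cols {
            return Err(format!("expected {} values, got {}", rows * cols, data.len()));
        }
        match order {
            "col" => Ok(Self { data, rows, cols }),
            "row" => {
                // Re-pack row-major input into column-major storage.
                let mut out = Vec::with_capacity(data.len());
                for c in 0..cols {
                    for r in 0..rows {
                        out.push(data[r * cols + c]);
                    }
                }
                Ok(Self { data: out, rows, cols })
            }
            other => Err(format!("order must be \"col\" or \"row\", got {:?}", other)),
        }
    }

    /// `.shape` as (rows, cols).
    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    /// `__len__`: the row count.
    pub fn len(&self) -> usize {
        self.rows
    }

    /// `m[r, c]` with an explicit bounds check instead of a panic.
    pub fn get(&self, r: usize, c: usize) -> Option<f64> {
        if r < self.rows && c < self.cols {
            Some(self.data[c * self.rows + r])
        } else {
            None
        }
    }
}
```

Returning Option / Result here keeps the sketch panic-free; the real binding would instead translate failures into Python exceptions as described in the error-mapping section.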

4. Error mapping

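Rustframe's matrix code uses assert! / panic! (see src/matrix/mat.rs: out-of-bounds, shape mismatch, overlapping columns, etc.). Letting a panic unwind across the FFI boundary is undefined behavior; PyO3 guards against this by catching panics in its generated wrappers, but it surfaces them as pyo3.panic.PanicException, which derives from BaseException and so slips past ordinary except Exception handlers. To give Python callers idiomatic errors, we wrap fallible operations with std::panic::catch_unwind at the binding boundary and translate to PyValueError / PyIndexError.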

A cleaner long-term fix is to migrate panics in mat.rs to Result<_, Error>, but that's out of scope here. Happy to open a separate issue for it.
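
A minimal sketch of the catch_unwind wrapper, using only the standard library so it can be read without PyO3 in scope. BindingError, guard, and checked_get are hypothetical names; in the real binding the two variants would presumably become PyIndexError and PyValueError. Classifying panics by matching on the message text is an assumption made for illustration, not a description of how mat.rs words its panics.

```rust
use std::any::Any;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical error type standing in for PyIndexError / PyValueError.
#[derive(Debug, PartialEq)]
pub enum BindingError {
    Index(String),
    Value(String),
}

/// Extract a readable message from a panic payload (either &str or String).
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic".to_string()
    }
}

/// Run `op`, converting any panic into a BindingError.
/// Assumption: messages containing "out of bounds" map to Index, the rest to Value.
pub fn guard<T>(op: impl FnOnce() -> T) -> Result<T, BindingError> {
    catch_unwind(AssertUnwindSafe(op)).map_err(|payload| {
        let msg = panic_message(&*payload);
        if msg.contains("out of bounds") {
            BindingError::Index(msg)
        } else {
            BindingError::Value(msg)
        }
    })
}

/// Hypothetical panicking accessor, standing in for an assert! in mat.rs.
pub fn checked_get(data: &[f64], idx: usize) -> f64 {
    assert!(idx < data.len(), "index {} out of bounds for length {}", idx, data.len());
    data[idx]
}
```

Two caveats with this approach: catch_unwind only works when the crate is built with unwinding panics (not panic = "abort"), and the default panic hook still prints the message to stderr before the error reaches Python.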

5. Build & test

  • pyproject.toml at repo root with [build-system] requiring maturin >= 1.5 and [tool.maturin] features = ["python"].
  • python/tests/test_matrix.py: pytest smoke tests, parity-checked against equivalent Rust assertions where it makes sense.
  • New CI job python-bindings.yml: matrix of {py3.10, py3.12}, runs maturin develop --features python then pytest. Existing run-unit-tests.yml is left alone.

6. Docs

  • New section in README under "What it offers": "Python bindings (optional)" with a short install-from-source + usage snippet.
  • A python/README.md with build instructions.
  • User-guide page deferred until the API surface settles.

Staged PR plan

| # | PR | Scope |
| -- | -- | -- |
| 1 | python: scaffold PyO3 + bind Matrix | Feature flag, src/python/, PyMatrix, error mapping, pytest smoke tests, CI job, README section. This issue's first deliverable. |
| 2 | python: numpy interop | Optional numpy zero-copy as_array() / from_numpy() via PyReadonlyArray2. |
| 3 | python: bind Frame | PyFrame wrapping RowIndex (Range / Int / Date). |
| 4 | python: bind compute::stats | Descriptive stats, correlation, distributions. |
| 5 | python: bind compute::models | Linear regression, PCA, GNB, DNN. |
| 6 | python: bind random | PRNG + crypto generators. |
| 7 | python: publish to PyPI | maturin-action release workflow, version sync, trusted publishing. |
