mojo-bindgen

Warning

Alpha stage: this project is under heavy development and may change quickly.

C headers -> Mojo FFI. mojo-bindgen parses real C with libclang and emits Mojo bindings for external_call or owned_dl_handle workflows. It mirrors the spirit of rust-bindgen, which follows the same approach for Rust.

The goal is simple: make binding generation easy and as faithful as possible to the actual C surface, and fail conservatively when a declaration cannot be modeled correctly.

Requirements

  • Python 3.14+
  • a system libclang compatible with the libclang Python wheel
  • a Mojo (nightly) toolchain if you want to build or run the generated bindings

Installation

System dependencies

Install Clang and the shared libclang library first:

# Ubuntu / Debian
sudo apt update && sudo apt install -y clang libclang1

# Fedora
sudo dnf install -y clang llvm-libs

# macOS (Homebrew)
brew install llvm

If the shared library is not on the default loader path, set LIBCLANG_PATH to the directory containing libclang.so or libclang.dylib.
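If you are unsure which directory to use, a small script can search likely locations. The following is a minimal sketch and not part of mojo-bindgen: the function name `find_libclang_dirs` and the candidate roots are assumptions for illustration, and it also accepts versioned names such as `libclang.so.1`, which may or may not be what your setup needs.

```python
from pathlib import Path

# Hypothetical candidate roots; adjust for your distribution and LLVM version.
CANDIDATE_ROOTS = [
    "/usr/lib",
    "/usr/lib64",
    "/usr/local/lib",
    "/opt/homebrew/opt/llvm/lib",
]


def find_libclang_dirs(roots):
    """Return directories under `roots` that contain a libclang shared library."""
    found = []
    for root in roots:
        base = Path(root)
        if not base.is_dir():
            continue
        # libclang.so* covers libclang.so and versioned variants (assumption);
        # it deliberately does not match libclang-cpp.so.
        for pattern in ("libclang.so*", "libclang.dylib"):
            for lib in base.rglob(pattern):
                parent = str(lib.parent)
                if parent not in found:
                    found.append(parent)
    return found

# Usage: call find_libclang_dirs(CANDIDATE_ROOTS) and export one of the
# returned directories as LIBCLANG_PATH.
```

A recursive search of a large tree such as `/usr/lib` can take a few seconds, so narrowing the roots to where your LLVM install lives is usually worthwhile.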

Install from PyPI

pip install mojo-bindgen

PyPI package: mojo-bindgen

Install from source

git clone https://github.com/MoSafi2/mojo_bindgen
cd mojo_bindgen
pip install -e .

For development setup, checks, and Pixi workflows, see CONTRIBUTING.md.

Quick start

Generate bindings from a primary header:

mojo-bindgen path/to/header.h --output bindings.mojo --link-mode external-call

Pass common Clang inputs with structured flags:

mojo-bindgen include/mylib.h \
  --include-dir ./include \
  --define MYLIB_FEATURE=1 \
  --std c11 \
  --output mylib_bindings.mojo

Emit declarations from additional include headers with repeated --emit-header. Dependency headers are still parsed for type information and macro expansion, but are not emitted unless listed:

mojo-bindgen include/mylib.h \
  --emit-header include/mylib_extra.h \
  --include-dir ./include \
  --output mylib_bindings.mojo

By default the parser uses -std=gnu11 when no C standard is provided. Pin a standard explicitly if your header requires one:

mojo-bindgen include/mylib.h --std c99 --output mylib_bindings.mojo

Use --clang-arg for raw Clang flags that do not have a structured option:

mojo-bindgen include/mylib.h --clang-arg=-fms-extensions --output mylib_bindings.mojo

Linking modes

mojo-bindgen supports two output styles:

  • external_call: direct FFI wrappers. Use this when the target library is linked at Mojo build time.
  • owned_dl_handle: dynamic runtime symbol lookup via OwnedDLHandle, for loading a shared library (.so, .dylib) at run time.

Examples:

# default
mojo-bindgen include/mylib.h --link-mode external-call --output mylib_bindings.mojo

# runtime-loaded shared library
mojo-bindgen include/mylib.h \
  --link-mode owned-dl-handle \
  --library-path /usr/lib/libmylib.so \
  --output mylib_bindings_dl.mojo

What works today?

mojo-bindgen is still alpha and evolves quickly, but it already supports a useful slice of real C headers and is practical today as a starting point for generating bindings.

Current support includes:

  • Parsing and mapping: real C parsing through libclang, repeatable --clang-arg support, and a structured IR pipeline rather than text-only generation.
  • Primitive types: scalar types, typedef chains, pointers with const-aware mutability, fixed arrays, incomplete-array decay cases, complex values, vector extension types, and representable atomics.
  • Mojo-native numeric mapping: vector types map to SIMD[...], complex values map to ComplexSIMD[...], and representable atomics map to Atomic[...].
  • Records: structs, anonymous members, mixed layouts that combine plain fields and bitfields, synthesized padding, and custom alignment emission where Mojo can represent the layout faithfully.
  • Bitfields: bitfields are emitted through explicit storage fields plus synthesized getter and setter methods.
  • Unions: eligible unions map to UnsafeUnion[...]; unions that cannot be represented safely fall back to opaque InlineArray[...] storage with diagnostics to preserve layout.
  • Opaque and difficult layouts: incomplete records, packed layouts, and alignment-sensitive records are preserved conservatively as opaque byte storage when a faithful typed layout is not possible.
  • Callbacks and function pointers: callback typedefs, function-pointer fields, and function-pointer parameters and returns are preserved in Mojo via emitted comptime callback declarations and synthesized aliases when needed.
  • Functions: thin wrappers are generated for non-variadic functions under both external_call and owned_dl_handle link modes.
  • Globals and constants: because Mojo does not currently expose native C globals directly, supported globals map through generated GlobalVar / GlobalConst helper structs with synthesized load() / store() methods; constants and supported object-like macros map to comptime declarations.
  • Macros: integer, float, string, and char literal macros, foldable macro chains, supported casts, and sizeof(type) expressions are emitted as Mojo code.
  • JSON IR output: the CLI can emit serialized parser IR for debugging, testing, or downstream tooling.
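To make the bitfield item above concrete: a getter and setter over an explicit storage field reduce to shift-and-mask arithmetic. The sketch below illustrates that arithmetic in Python; it is not the Mojo code mojo-bindgen emits. The helper names and the example layout (a 3-bit field at bit offset 4) are assumptions, and real bit offsets depend on the target ABI.

```python
def get_bits(storage: int, offset: int, width: int) -> int:
    """Read an unsigned bitfield of `width` bits starting at bit `offset`."""
    mask = (1 << width) - 1
    return (storage >> offset) & mask


def set_bits(storage: int, offset: int, width: int, value: int) -> int:
    """Return `storage` with the bitfield replaced by `value` (truncated to `width` bits)."""
    mask = (1 << width) - 1
    cleared = storage & ~(mask << offset)
    return cleared | ((value & mask) << offset)


# Hypothetical layout: a 3-bit field at bit offset 4 inside an 8-bit storage unit.
storage = 0b1000_0001
storage = set_bits(storage, offset=4, width=3, value=0b101)
assert storage == 0b1101_0001
assert get_bits(storage, offset=4, width=3) == 0b101
```

Bits outside the field are left untouched by the setter, which is the property that lets several bitfields share one storage unit.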

Current limitations

Known gaps you may still hit in generated code. For ABI-sensitive surfaces, verify emitted layouts and symbols against your target toolchain.

  • Macros: function-like macros, predefined macros, and more complex preprocessor behavior are preserved but usually emitted as comments for end-user review.
  • Variadics: variadic C functions are not wrapped as callable thin-FFI bindings yet and are emitted as comment stubs.
  • Non-prototype / K&R-style functions: older C declaration styles are only partially modeled and should be treated with caution.
  • Records with hostile layouts: some packed, ABI-sensitive, or otherwise difficult record layouts cannot be emitted as fully typed Mojo structs and fall back to opaque storage; layout-sensitive declarations may still require manual verification.
  • Anonymous members: anonymous struct and union members are preserved structurally, but they are not automatically promoted into a flattened parent record surface.
  • Atomics: atomic support is conservative. Representable atomic fields and pointer-based usage work, but atomic globals are still emitted as stubs and some surfaces require manual handling.
  • Linkage and compiler edge cases: inline, compiler-specific linkage hints, and other extension-heavy cases can still require manual review and may lead to symbol mismatches at runtime.
  • Emit-header model: declarations are emitted from the primary header you pass to the tool plus any headers named with --emit-header. Other headers included by those files are parsed as dependencies but not emitted.
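For the manual layout verification suggested above, one option is to mirror a C record with Python's `ctypes`, which follows the C ABI of the platform the interpreter was built for, and compare the reported size, alignment, and field offsets with what the generated Mojo code assumes. The `Sample` struct and the `describe_layout` helper are hypothetical and not taken from any real header. A C program built with your actual target toolchain (using `sizeof` and `offsetof`) remains the authoritative check, especially when cross-compiling.

```python
import ctypes


class Sample(ctypes.Structure):
    # Hypothetical C record: struct sample { uint8_t tag; uint32_t value; };
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("value", ctypes.c_uint32),
    ]


def describe_layout(record):
    """Collect size, alignment, and per-field offsets for a ctypes record type."""
    return {
        "size": ctypes.sizeof(record),
        "align": ctypes.alignment(record),
        "offsets": {name: getattr(record, name).offset for name, *_ in record._fields_},
    }


layout = describe_layout(Sample)
print(layout)
```

On mainstream ABIs the padding after `tag` should show up as a non-zero offset for `value`; a mismatch between these numbers and the emitted Mojo struct is a sign that the record needs manual review.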

Real-world examples

The repository includes worked examples and smoke programs.

These examples do more than generate bindings: their generate.sh scripts also build smoke artifacts or run small functional tests to check the usability of the generated bindings. They also emit layout-test sidecars when file output is generated.

The test suite also has end-to-end runtime coverage for:

  • by-value records and enums
  • callbacks and function-pointer returns
  • globals and constants
  • vectors and complex values
  • atomic pointer-based APIs
  • opaque forward declarations
  • pointer-to-array and array-decay cases
  • both external_call and owned_dl_handle link modes

See tests/e2e/README.md for the current runtime case matrix.

Troubleshooting

The generated module is empty or missing declarations

mojo-bindgen emits declarations only from the primary header you pass in and from any headers listed with --emit-header. If the primary header is a thin wrapper that only includes another header whose declarations you want emitted, pass that included header with --emit-header or use it as the primary header directly.

Parsing fails on project headers

Most parser failures are missing include paths, target flags, or defines. Add the same flags your C build uses with --include-dir, --define, --undefine, --target, --sysroot, --std, or repeated --clang-arg.
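Carrying flags over from a C build is mostly mechanical. The sketch below translates a list of compiler flags into the mojo-bindgen options named above. The `translate_flags` helper is hypothetical, it handles only the joined forms (`-Ipath`, `-DNAME`, `-UNAME`, `-std=...`), and it assumes that `--undefine` takes a macro name the same way `--define` does and that `--std` accepts the value after `-std=`. Anything it does not recognize is forwarded with `--clang-arg`.

```python
def translate_flags(cflags):
    """Map common C compiler flags to mojo-bindgen command-line options."""
    args = []
    for flag in cflags:
        if flag.startswith("-I") and len(flag) > 2:
            args += ["--include-dir", flag[2:]]
        elif flag.startswith("-D") and len(flag) > 2:
            args += ["--define", flag[2:]]
        elif flag.startswith("-U") and len(flag) > 2:
            args += ["--undefine", flag[2:]]
        elif flag.startswith("-std="):
            args += ["--std", flag[len("-std="):]]
        else:
            # No structured option assumed here: forward the raw flag to Clang.
            args.append(f"--clang-arg={flag}")
    return args


command = ["mojo-bindgen", "include/mylib.h"]
command += translate_flags(["-I./include", "-DMYLIB_FEATURE=1", "-std=c11", "-fms-extensions"])
command += ["--output", "mylib_bindings.mojo"]
print(" ".join(command))
```

Flags such as `--target` and `--sysroot` are left out of the sketch; if your build sets them, pass them to mojo-bindgen directly.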

Debugging parser failures

Print the normalized Clang arguments that mojo-bindgen will use:

mojo-bindgen include/mylib.h --include-dir ./include --dump-clang-args

Write diagnostics as JSON while still generating normal output:

mojo-bindgen include/mylib.h \
  --diagnostics json \
  --diagnostics-output diagnostics.json \
  --output mylib_bindings.mojo

Dump the preprocessed input that Clang sees:

mojo-bindgen include/mylib.h \
  --include-dir ./include \
  --dump-preprocessed mylib.preprocessed.c \
  --output mylib_bindings.mojo

Build succeeds but symbols are missing at runtime

Double-check:

  • --library and --link-name
  • your Mojo link flags for external_call
  • your --library-path for owned_dl_handle
  • whether the original C declaration involves an inline function or an exotic layout that needs manual review
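One way to narrow down a missing-symbol failure is to check what the shared library actually exports before involving Mojo at all. The sketch below uses Python's `ctypes` for that. The `missing_symbols` helper is hypothetical, and it only tells you whether the dynamic loader can resolve a name, not whether the signature matches.

```python
import ctypes


def missing_symbols(library_path, names):
    """Return the names from `names` that cannot be resolved in the shared library."""
    lib = ctypes.CDLL(library_path)
    # ctypes resolves symbols lazily on attribute access and raises
    # AttributeError when the lookup fails, so hasattr() works as a probe.
    return [name for name in names if not hasattr(lib, name)]
```

For example, `missing_symbols("/usr/lib/libmylib.so", ["mylib_init"])` (a placeholder path and a placeholder symbol name) returns an empty list when the symbol is exported. A name that shows up as missing here often points to an inline-only function or a different link name rather than a problem in the generated bindings.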

License

Licensed under the MIT License. See LICENSE.


Contributing: CONTRIBUTING.md.
