Skip to content

Olympus-HPC/proteus

docs (gh-pages) Build and test codecov pre-commit License: Apache 2.0 with LLVM exceptions

Proteus

Proteus is a programmable runtime specialization and Just-In-Time (JIT) layer built on LLVM. It embeds into existing C++ codebases and accelerates host, CUDA, and HIP applications by using runtime context to specialize code and enable optimizations beyond static compilation.

Description

Standard ahead-of-time (AOT) compilation can only optimize a program with the information available at build time. Proteus goes further by embedding optimizing JIT compilation directly into C/C++ applications.

Runtime context, such as the actual values of variables during execution, lets it specialize code on the fly and apply advanced compiler optimizations that accelerate performance beyond what static compilation allows.

Several frontends are available, depending on how you want to describe JIT code:

Interface Input style Best for Specialization model Requires Clang AOT?
Code annotations Existing C/C++/CUDA/HIP code Incremental adoption in existing applications Values, arrays, objects, and launch configuration Yes
C++ frontend API C++ source strings Runtime-generated C++ and templates Values, arrays, objects, and launch configuration No
MLIR frontend API MLIR source strings Direct access to MLIR lowering Encoded in the provided MLIR source No
Embedded DSL API Programmatic builders Runtime code generation with high-level constructs Values, arrays, and launch configuration No

These frontends can target host, CUDA, and HIP execution paths, with backend support depending on how Proteus was configured:

Interface Host CUDA HIP Notes
Code annotations Yes Yes Yes Requires compiling with Clang and uses the Proteus LLVM pass
C++ frontend API Yes Yes Yes Uses Clang by default; CUDA paths can use NVCC
MLIR frontend API Yes Yes Yes Requires PROTEUS_ENABLE_MLIR=ON
Embedded DSL API Yes Yes Yes Uses the LLVM backend by default; MLIR backend requires PROTEUS_ENABLE_MLIR=ON

CUDA, HIP, and MLIR support are available when Proteus is built with the corresponding configuration options enabled.

Proteus includes both in-memory and persistent caching, ensuring that once code has been compiled and optimized, the cost of recompilation is avoided.

Proteus consists of an LLVM pass and a runtime library that implements JIT compilation and optimization using LLVM as a library.

  • The code annotation interface requires compiling your application with Clang so the Proteus LLVM pass can parse annotations.
  • The DSL, C++ frontend, and MLIR frontend APIs don’t depend on which AOT compiler you use.

In all cases, you link your application against the Proteus runtime library. Details are provided later.

Installation

Proteus can be installed from source or via spack.

Spack

We provide a packaging recipe for Spack in the subdirectory packaging/spack.

Assuming you have a Spack installation and preferably using an isolated Spack environment, you can add the spack repo by cloning Proteus and then install it by running:

git clone https://github.com/Olympus-HPC/proteus.git
spack repo add proteus/packaging/spack
spack install proteus

We provide several variants to match different configurations, including CUDA, ROCm, and MPI support. A complete list of variants and their descriptions is available in the Spack package file, or viewable through:

spack info proteus

Some typical examples:

# Install the latest version with CUDA support for sm_90 arch.
spack install proteus +cuda cuda_arch=90

# Install the latest version with ROCm support for gfx942 arch.
spack install proteus +rocm amdgpu_target=gfx942

# Install the latest version with MPI support.
spack install proteus +mpi

Building from source

The project uses cmake and requires an LLVM installation. CI tests currently cover LLVM 19, 20, 22 with CUDA versions 12.2, and AMD ROCm versions 6.4.3 (based on LLVM 19), 7.1.1 (based on LLVM 20), 7.2.0 (based on LLVM 22).

See the top-level CMakeLists.txt for the available build options. A typical build looks like this:

mkdir -p build && cd build
cmake -DLLVM_INSTALL_DIR=<llvm_install_path> -DCMAKE_INSTALL_PREFIX=<install_path> ..
make install

The scripts directory contains setup scripts for building on different targets (host-only, CUDA, ROCm) used on LLNL machines. They also serve as good starting points to adapt for other environments. Run them from the repository root:

source scripts/setup-<target>.sh

These scripts load environment modules (specific to LLNL systems) and create a build-<hostname>-<target>-<version> directory with a working configuration.

Integrating with your build system

CMake

To integrate Proteus with CMake, add the install prefix to CMAKE_PREFIX_PATH, or pass it explicitly with -Dproteus_DIR=<install_path>/<libdir>/cmake/proteus during configuration where <libdir> is typically lib or lib64. Then, in your project's CMakeLists.txt add:

find_package(proteus CONFIG REQUIRED)

add_proteus(<target>)

If you only need the DSL, C++ frontend, or MLIR frontend APIs, you can link directly against proteusFrontend. In this case, you don’t need to compile your target with Clang:

find_package(proteus CONFIG REQUIRED)

target_link_libraries(<target> ... proteusFrontend ...)

Make

With make, annotation-based integration requires adding compilation and linking flags, for example:

CXXFLAGS += -I<install_path>/include -fpass-plugin=<install_path>/<libdir>/libProteusPass.so

LDFLAGS += -L<install_path>/<libdir> -Wl,-rpath,<install_path>/<libdir> -lproteus $(llvm-config --libs) -lclang-cpp

If you don't use code annotations, you can omit the -fpass-plugin option, since the LLVM pass is only needed for processing annotations.

Using

Proteus's core optimization technique is runtime constant folding. It replaces runtime values with constants during JIT compilation, which in turn turbo-charges classical compiler optimizations such as loop unrolling, control-flow simplification, and constant propagation. Think of it as doing constexpr, but at runtime.

Values that can be folded include function or kernel arguments, kernel launch dimensions, launch bounds, and other runtime variables.

Choose the interface that matches how your application wants to describe JIT work:

Interface Detailed guide
Code annotations Code Annotations
C++ frontend API C++ Frontend API
MLIR frontend API MLIR Frontend API
DSL API DSL API

Proteus generates a unique specialization for each distinct set of runtime values and caches them in memory and on disk, so JIT overhead is minimized within and across runs.

Documentation

The Proteus documentation has more extensive information, including a user's guide and developer manual.

Contributing

We welcome contributions to Proteus in the form of pull requests targeting the main branch of the repo, as well as questions, feature requests, or bug reports via issues.

Code of Conduct

Please note that Proteus has a Code of Conduct. By participating in the Proteus community, you agree to abide by its rules.

Authors

Proteus was created by Giorgis Georgakoudis, georgakoudis1@llnl.gov.

Key contributors are:

License

Proteus is distributed under the terms of the Apache License (Version 2.0) with LLVM Exceptions.

All new contributions must be made under the Apache-2.0 with LLVM Exceptions license.

See LICENSE, COPYRIGHT, and NOTICE for details.

SPDX-License-Identifier: (Apache-2.0 WITH LLVM-exception)

LLNL-CODE-2000857

About

Programmable JIT Compilation and Optimization for C/C++ using LLVM

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors