This repository contains the SST-based simulation infrastructure for the IBEX compression architecture proposed in the paper:
IBEX: Internal Bandwidth-Efficient Compression Architecture for Scalable CXL Memory Expansion
Proceedings of the International Conference on Supercomputing (ICS 2026).
Where to find our implementation: most of our source code is implemented modularly in memHierarchy-memCompression.
Besides the compression source code, we also put considerable effort into workload instrumentation: we augmented the Ariel frontend so that memory states are compression-aware, because compression ratios affect the performance metrics.
The steps below assume a vanilla Ubuntu machine (tested on Ubuntu 22.04.4 LTS).
- Update apt list and reboot
$ sudo apt update
$ sudo apt upgrade
$ reboot now
- Install packages
$ sudo apt install libtool-bin build-essential cmake patch uuid-dev python3-dev autoconf automake autotools-dev curl python3 python3-pip libmpc-dev libmpfr-dev libgmp-dev gawk bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ninja-build git cmake libglib2.0-dev
- Download the code
$ git clone https://github.com/relacslab/ibex-ics26.git
$ cd ibex-ics26
$ wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.5.tar.gz
$ wget https://software.intel.com/sites/landingpage/pintool/downloads/pin-3.22-98547-g7a303a835-gcc-linux.tar.gz
$ tar -xzvf openmpi-4.0.5.tar.gz
$ tar -xzvf pin-3.22-98547-g7a303a835-gcc-linux.tar.gz
- Set up environment variables (Note: re-run this command whenever you build or run in a new terminal.)
$ source scripts/env.sh
- Build OpenMPI 4.0.5
$ mkdir -p $SST_HOME/build/openmpi-4.0.5
$ cd $SST_HOME/build/openmpi-4.0.5
$ ../../openmpi-4.0.5/configure --prefix=$MPIHOME
$ make all
$ make install
- Build SST-core
$ cd $SST_HOME/sst-core
$ ./autogen.sh
$ mkdir -p $SST_HOME/build/sst-core
$ cd $SST_HOME/build/sst-core
$ ../../sst-core/configure --prefix=$SST_HOME/install/sst-core CC=$CC CXX=$CXX MPICC=$MPICC MPICXX=$MPICXX
$ make
$ make install
- Build SST-elements (requires Step 4)
$ cd $SST_HOME/sst-elements
$ ./autogen.sh
$ mkdir -p $SST_HOME/build/sst-elements
$ cd $SST_HOME/build/sst-elements
$ ../../sst-elements/configure --prefix=$SST_HOME/install/sst-elements --with-sst-core=$SST_HOME/install/sst-core --with-pin=$INTEL_PIN_DIRECTORY
$ make
$ make install
- GAPBS
$ cd $SST_HOME
$ git submodule update --init benchmarks/gapbs
$ cd benchmarks/gapbs
$ make
$ make bench-graphs
- XSBench
$ cd $SST_HOME
$ git submodule update --init benchmarks/XSBench
$ cd benchmarks/XSBench/openmp-threading
$ make
$ ./XSBench -b write
- SPEC CPU2017
Due to licensing restrictions, SPEC CPU2017 cannot be redistributed with this repository. Users must obtain the benchmark suite from the official SPEC CPU2017 website and build it locally.
The execution configurations used in our experiments are provided in scripts/compressed_cxl.py.
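
The setup note asks you to re-run `source scripts/env.sh` in every new terminal, so a quick pre-flight check before launching a run may save a failed simulation. The sketch below is a hypothetical helper and not part of the repository. It only checks that the variables referenced by the build commands in this README (`SST_HOME`, `MPIHOME`, `INTEL_PIN_DIRECTORY`, `CC`, `CXX`, `MPICC`, `MPICXX`) are set; that scripts/env.sh exports all of them is an assumption.

```python
import os

# Variables referenced by the build commands in this README. Whether
# scripts/env.sh exports every one of them is an assumption.
REQUIRED_VARS = ("SST_HOME", "MPIHOME", "INTEL_PIN_DIRECTORY",
                 "CC", "CXX", "MPICC", "MPICXX")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Unset variables:", ", ".join(missing))
        print("Try: source scripts/env.sh")
    else:
        print("All expected variables are set.")
```
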
To run a workload with the default settings (4 cores, no compression), use the following command:
$ python $SST_HOME/scripts/run.py <app>
For non-default settings, use the following command-line options:
| Argument | Description | Options | Default |
|---|---|---|---|
| app | Target application (workload) | bwaves, mcf, parest, lbm, omnetpp, bfs, cc, pr, tc, XSBench | Required |
| --analyze | Run compression ratio analysis mode (single core, omit cycle-level simulation) | flag | disabled |
| --core | Number of simulated CPU cores | integer | 4 |
| --compression | Compression scheme | none, compresso, ibex, mxt, dmc, tmcc, dylect | none |
| --cxl_lat | CXL turnaround latency (ns) | integer | 70 |
| --decomp_lat | Decompression latency per 1KB (cycles) | integer | 64 |
| --pregion | Size of promoted region (MB) | integer | 512 |
| --disable_dTraffic | Disable background demotion tracking traffic | flag | disabled |
| --ibex_shadowed | Enable IBEX shadowed demotion | flag | disabled |
| --ibex_block | Enable IBEX block co-allocation | flag | disabled |
| --ibex_compaction | Enable IBEX metadata compaction | flag | disabled |
| --ibex_read_ratio | IBEX read weight for probabilistic read-to-write conversion (0 to disable) | integer | 0 |
| --ibex_write_ratio | IBEX write weight for probabilistic read-to-write conversion (0 to disable) | integer | 0 |
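
When sweeping several configurations, it can be convenient to assemble the run.py command lines programmatically and reject typos before a long simulation starts. The following is a minimal sketch of such a wrapper, not code from the repository: `build_command` is a hypothetical helper, and the accepted names and values are simply copied from the option table above. It assumes run.py takes the flags exactly as listed there.

```python
import os

# Values copied from the option table; anything else is rejected early.
APPS = ("bwaves", "mcf", "parest", "lbm", "omnetpp",
        "bfs", "cc", "pr", "tc", "XSBench")
SCHEMES = ("none", "compresso", "ibex", "mxt", "dmc", "tmcc", "dylect")
FLAGS = ("analyze", "disable_dTraffic",
         "ibex_shadowed", "ibex_block", "ibex_compaction")
INT_OPTIONS = ("core", "cxl_lat", "decomp_lat", "pregion",
               "ibex_read_ratio", "ibex_write_ratio")


def build_command(app, sst_home=None, compression=None, **options):
    """Build the argument list for one run.py invocation (hypothetical helper)."""
    if app not in APPS:
        raise ValueError(f"unknown app: {app}")
    sst_home = sst_home or os.environ.get("SST_HOME", ".")
    cmd = ["python", f"{sst_home}/scripts/run.py", app]
    if compression is not None:
        if compression not in SCHEMES:
            raise ValueError(f"unknown compression scheme: {compression}")
        cmd += ["--compression", compression]
    for name, value in options.items():
        if name in FLAGS:
            if value:  # flags take no value; emit them only when enabled
                cmd.append(f"--{name}")
        elif name in INT_OPTIONS:
            cmd += [f"--{name}", str(int(value))]
        else:
            raise ValueError(f"unknown option: {name}")
    return cmd


if __name__ == "__main__":
    # Print IBEX runs at the default (70 ns) and a longer (150 ns) CXL latency.
    for lat in (70, 150):
        print(" ".join(build_command("XSBench", compression="ibex",
                                     ibex_shadowed=True, cxl_lat=lat)))
```

Each returned list can be passed to `subprocess.run` as-is; printing the commands first makes it easy to inspect a sweep before committing machine time to it.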
Example runs:
# Run without any compression scheme
$ python $SST_HOME/scripts/run.py XSBench
# Run with modified CXL latency
$ python $SST_HOME/scripts/run.py XSBench --cxl_lat 150
# Run IBEX compression with all optimizations enabled
$ python $SST_HOME/scripts/run.py XSBench --compression ibex --ibex_shadowed --ibex_block --ibex_compaction
# Run compression ratio analysis mode
$ python $SST_HOME/scripts/run.py XSBench --analyze
Simulation outputs will be generated under $SST_HOME/outputs/<configuration>/<app>.
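
Since every run writes into its own `<configuration>/<app>` directory, collecting the results of one workload across configurations comes down to a one-level glob. The sketch below is a hypothetical helper, not part of the repository. It relies only on the directory layout stated above and makes no assumption about how `<configuration>` names are formed or which files each directory contains.

```python
import os
from pathlib import Path


def find_output_dirs(app, outputs_root=None):
    """Return the sorted <configuration>/<app> directories found for app."""
    if outputs_root is None:
        outputs_root = Path(os.environ.get("SST_HOME", ".")) / "outputs"
    root = Path(outputs_root)
    if not root.is_dir():
        return []
    # One directory level for <configuration>, then the application name.
    return sorted(p for p in root.glob(f"*/{app}") if p.is_dir())


if __name__ == "__main__":
    for path in find_output_dirs("XSBench"):
        print(path.parent.name, "->", path)
```
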
- Younghoon Ko [Lead] (ykhoon3810@snu.ac.kr)
- Hyemin Park
- Hyuk-Jae Lee
- Hyokeun Lee (hyokeun.lee.phd@gmail.com)
@inproceedings{ibex-ics2026,
author = {Ko, Younghoon and Park, Hyemin and Lee, Hyuk-Jae and Lee, Hyokeun},
title = {IBEX: Internal Bandwidth-Efficient Compression Architecture for Scalable CXL Memory Expansion},
booktitle = {Proceedings of the 40th ACM International Conference on Supercomputing},
year = {2026},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
isbn = {9798400725227},
doi = {10.1145/3797905.3800521},
url = {https://doi.org/10.1145/3797905.3800521},
series = {ICS '26}
}