Code, benchmarks, and trained models for the paper "Learned Control Layers for Combinatorial Local Search."
We propose a control-layer architecture for Dynamic Algorithm Configuration (DAC): an external RL controller adjusts solver parameters at runtime through a standardized checkpoint interface. The same PPO training pipeline is validated on two solver families without modification beyond domain-specific state--action definitions.
| Comparison | PPO Mean Cost | Baseline Mean Cost | Cost Reduction | Wilcoxon p |
|---|---|---|---|---|
| PPO vs Random | 372.1 | 459.6 | 19.0% | 2.4 x 10^-5 |
| PPO vs Best Static | 372.1 | 415.1 | 10.4% | 0.0069 |
| PPO vs BO Static (v500) | -- | -- | 7/10 wins | 0.037 |
Scale-dependent crossover: BO static wins at small scale, PPO overtakes at 500 variables.
PPO achieves the lowest mean unsatisfied clause count among all tested configurations, with statistically significant improvements over suboptimal baselines (Wilcoxon p < 0.01).
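The significance tests above use the paired Wilcoxon signed-rank test. A minimal sketch of how such a comparison can be run with `scipy.stats.wilcoxon` (the cost arrays here are illustrative placeholders, not the paper's data):

```python
from scipy.stats import wilcoxon

# Illustrative per-instance final unsat-clause counts on the same 10 instances
# (placeholder numbers, NOT the paper's measurements).
ppo_costs    = [371, 380, 365, 377, 369, 374, 368, 381, 370, 366]
static_costs = [410, 422, 401, 430, 415, 409, 418, 425, 412, 408]

# Two-sided paired Wilcoxon signed-rank test on per-instance differences.
stat, p = wilcoxon(ppo_costs, static_costs)
print(f"W={stat}, p={p:.4g}")
```

Pairing by instance matters: instance difficulty varies widely near the phase transition, so an unpaired test would lose most of the statistical power.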
- Exploration parameter dominance: `noise_prob` accounts for nearly all DAC benefit in both domains.
- Scale-dependent value of dynamic control: learned policies improve relative to static tuning as instance size grows.
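To make the role of `noise_prob` concrete: in WalkSAT-style solvers it is the probability of taking a random-walk step instead of a greedy flip. A minimal illustrative variable-selection rule (a sketch of the standard WalkSAT heuristic, not the repo's implementation):

```python
import random

def pick_variable(unsat_clause, break_counts, noise_prob, rng=random):
    """WalkSAT-style pick: greedy on break count, random walk with prob noise_prob.

    unsat_clause: variables in a randomly chosen unsatisfied clause.
    break_counts: dict mapping variable -> clauses its flip would break.
    """
    best = min(unsat_clause, key=lambda v: break_counts[v])
    if break_counts[best] == 0 or rng.random() >= noise_prob:
        return best                      # "freebie" or greedy best-break-count flip
    return rng.choice(unsat_clause)      # noisy random-walk step
```

Raising `noise_prob` at checkpoints where the search has stagnated, and lowering it while progress is steady, is exactly the kind of schedule the learned controller can express and a static configuration cannot.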
The checkpoint protocol decouples parameter scheduling from solver implementation:
- Solver pauses every N flips
- Reports search state features to the controller
- Receives updated parameter values
- Resumes with new parameters
Any solver exposing this interface can be governed by the same RL pipeline.
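A minimal sketch of what such an interface might look like; the class and field names below are illustrative, not the repo's actual API:

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    """Search-state snapshot reported to the controller (illustrative fields)."""
    flips: int
    unsat_clauses: int
    best_unsat: int

class CheckpointSolver:
    """Hypothetical solver skeleton following the pause/report/update/resume loop."""

    def __init__(self, checkpoint_every: int, noise_prob: float = 0.5):
        self.checkpoint_every = checkpoint_every
        self.noise_prob = noise_prob  # parameter the controller may adjust

    def run(self, controller, max_flips: int):
        flips, unsat, best = 0, 100, 100
        while flips < max_flips:
            # ... perform self.checkpoint_every local-search flips here ...
            flips += self.checkpoint_every
            unsat = max(0, unsat - 1)     # stand-in for actual search progress
            best = min(best, unsat)
            # Pause: report state, receive updated parameters, resume.
            new_params = controller(Checkpoint(flips, unsat, best))
            self.noise_prob = new_params.get("noise_prob", self.noise_prob)
        return best
```

In this framing a controller is just a callable mapping checkpoints to parameter dicts, which is what lets the same RL pipeline govern any conforming solver.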
```
maxsat-dac/
├── src/
│   ├── train_domain.py              # PPO training script
│   ├── eval_domain.py               # Evaluation with statistical tests
│   └── domains/sat/
│       ├── solver.py                # WalkSAT solver with checkpoint protocol
│       ├── gym_env.py               # Gymnasium DAC environment (8 features, 2 actions)
│       ├── baselines.py             # 5 static baseline configurations
│       └── generate_instances.py    # Phase-transition 3-SAT instance generator
├── data/
│   ├── benchmarks/sat/              # 90 CNF instances (v50/v100/v200) + split.json
│   └── results/                     # Trained models and evaluation results (both domains)
├── archive/                         # Original MaxSAT-only code (see archive/README.md)
├── configs/sat.json                 # SAT domain configuration reference
├── requirements.txt
└── REPRODUCE.md                     # Step-by-step reproduction guide
```
MaxSAT code is in archive/ (NuWLS C++ solver wrapper, Gymnasium env, all experiment scripts). MaxSAT results are in data/results/ (JSON files + trained PPO models in ppo_csolver_500k/).
SAT code is the active codebase in src/domains/sat/ (pure Python WalkSAT). SAT results are in data/results/sat_*/.
```shell
# Install dependencies
pip install -r requirements.txt

# Generate benchmark instances (or use the included 90 instances)
python src/domains/sat/generate_instances.py

# Train PPO controller (3 seeds as in the paper)
python src/train_domain.py --domain sat --timesteps 200000 --seed 42
python src/train_domain.py --domain sat --timesteps 200000 --seed 123
python src/train_domain.py --domain sat --timesteps 200000 --seed 999

# Evaluate against baselines
python src/eval_domain.py --domain sat --model data/results/sat_ppo_*/best_model/best_model.zip
```

No external solver compilation is needed for SAT -- the WalkSAT solver is pure Python.
For MaxSAT reproduction, see archive/ and the original training script archive/train_500k.py (requires NuWLS C++ compilation).
See REPRODUCE.md for detailed SAT reproduction instructions.
- Python 3.10+
- SAT domain: no C/C++ compiler needed (pure Python solver), ~1 hour training
- MaxSAT domain (archive): g++ or MSVC for NuWLS compilation, ~4 hours training
Learned Control Layers for Combinatorial Local Search. Alex Chengyu Li, 2026.
Submitted to the Journal of Heuristics (Springer).
```bibtex
@article{li2026controllayers,
  author  = {Li, Alex Chengyu},
  title   = {Learned Control Layers for Combinatorial Local Search},
  journal = {Journal of Heuristics},
  year    = {2026},
  note    = {Under review}
}
```

This project is licensed under the MIT License. See LICENSE for details.