Harmony Agent

First independent reproduction of OpenAI's published SWE Verified and AIME2025 scores for gpt-oss-20b with tools.

Harmony Agent encodes and decodes messages in gpt-oss's native Harmony format, bypassing the lossy Chat Completions conversion. It also provides the model's in-distribution tools (container.exec, repo_browser.*, and apply_patch), which we reverse-engineered from the model's training priors.

Results

Benchmark	Published	HarmonyAgent	95% CI
SWE Verified HIGH	60.7%	60.4%	[56.2%, 64.8%]
SWE Verified MEDIUM	53.2%	53.3%	[49.3%, 57.7%]
AIME 2025 MEDIUM w/ tools	90.4%	91.7%	[87.5%, 95.0%]

How to run

# Start vLLM server
docker run --ipc=host --gpus all --rm --memory 20g --cpus 6 -p 8000:8000 -v ~/.cache/:/root/.cache/ vllm/vllm-openai:v0.14.1-cu130 --model openai/gpt-oss-20b --tensor-parallel-size 1 --max-model-len 131072

# Set up environment
uv venv --python 3.12

# Run benchmarks
uv run python run_swe.py
uv run python run_aime2025.py

Paper

In harmony with gpt-oss arxiv

In harmony with gpt-oss

Citation

@misc{mavrin2026harmonygptoss,
      title={In harmony with gpt-oss}, 
      author={Borislav Mavrin},
      year={2026},
      eprint={2604.00362},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.00362}, 
}

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
evaluation/swe_medium		evaluation/swe_medium
src		src
tests		tests
tools/bin		tools/bin
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
bash-only-20.txt		bash-only-20.txt
paper.pdf		paper.pdf
pyproject.toml		pyproject.toml
run_aime2025.py		run_aime2025.py
run_swe.py		run_swe.py
uv.lock		uv.lock
yuiyang.png		yuiyang.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Harmony Agent

Results

How to run

Paper

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Harmony Agent

Results

How to run

Paper

Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages