Bayesian A/B Testing for Proportions

A Python package for Bayesian hypothesis testing of success-rate differences in any Bernoulli-like experiment, using analytic and approximate inference methods. It is lightweight and dependency-lean: no PyMC, Pyro, Stan, or other heavy probabilistic-programming framework is required. Input data can be binary (0/1) or real-valued on (0, 1); continuous scores are automatically binarized at a configurable threshold. Typical applications include comparing treatments, groups, items, model variants, or any two conditions whose outcomes can be expressed as proportions. See the Getting Started guide for installation and quick examples.
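
To make the binarization step concrete, here is a minimal NumPy sketch. The helper name `binarize`, the default threshold of 0.5, and the `>=` convention are assumptions for illustration; the package's own parameter name and tie-breaking rule may differ.

```python
import numpy as np

def binarize(scores, threshold=0.5):
    """Map scores in (0, 1) to 0/1 outcomes.

    Assumed convention: a score at or above the threshold counts as a success.
    """
    scores = np.asarray(scores, dtype=float)
    return (scores >= threshold).astype(int)

scores = np.array([0.12, 0.50, 0.73, 0.49, 0.91])
y = binarize(scores, threshold=0.5)
print(y)  # [0 1 1 0 1]
```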

Features

Effect-size inference for proportions — estimate and test the difference in success rates for both paired and non-paired samples
Hierarchical logistic regression — optionally place Inverse-Gamma hyperpriors on the prior variances so the model learns the prior scales from data, reducing sensitivity to prior choice (Jeffreys–Lindley robustness)
Savage–Dickey Bayes Factor — test a point-null hypothesis ('treatment effect / difference is zero') without fitting a separate null model
Posterior of the null & ROPE — quantify the posterior mass inside a Region of Practical Equivalence for nuanced decisions beyond simple reject/accept
Posterior predictive checks — assess model fit by comparing observed data to data simulated from the posterior
Bayes Factor Design Analysis (BFDA) — plan sample sizes to reach a target level of evidence before running the experiment
Sequential / streaming design — update the posterior batch-by-batch as data arrive and stop early once the Bayes factor crosses an upper or lower threshold (SequentialNonPairedBayesPropTest, SequentialPairedBayesPropTest)
Operating-characteristic analysis — calibrated-Bayes frequentist evaluation of the chosen decision rule: three-way decision rates (reject / accept / inconclusive), Type-I sweep over the baseline rate, 95 % credible-interval coverage, and the sequential stopping-time distribution, with matched-α Fisher's exact (non-paired) or McNemar exact (paired) baselines overlaid. Pre-built Monte-Carlo harness in bayesprop.utils.operation_characteristics and …_paired, plus turnkey notebooks for both designs
Publication-ready plots — posterior distributions, predictive checks, Savage–Dickey density-ratio plots, BFDA power curves, sequential BF₁₀ trajectories, and OC diagnostic plots (with Wilson Monte-Carlo bands) out of the box
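
The Savage–Dickey and ROPE ideas above can be sketched with SciPy alone for the simplest case of two independent Beta–Bernoulli arms. This is an illustration of the technique, not the package's implementation: the function name `delta_density_at_zero`, the counts, the Beta(1, 1) priors, and the ROPE half-width of 0.05 are all assumptions. The Savage–Dickey ratio compares the posterior and prior densities of Δ = θ_A - θ_B at Δ = 0, which is valid when the prior on the common rate under the null matches the prior conditional on Δ = 0 under the alternative.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def delta_density_at_zero(a_A, b_A, a_B, b_B):
    """Density of Δ = θ_A - θ_B at Δ = 0 for independent Beta-distributed rates.

    At Δ = 0 both rates share a value t, so the density is the integral of
    the product of the two Beta densities over t in (0, 1).
    """
    integrand = lambda t: stats.beta.pdf(t, a_A, b_A) * stats.beta.pdf(t, a_B, b_B)
    value, _ = quad(integrand, 0.0, 1.0)
    return value

# Hypothetical counts (illustration only): successes / trials per arm
s_A, n_A = 16, 20
s_B, n_B = 6, 20
a0, b0 = 1.0, 1.0  # assumed Beta(1, 1) prior on each rate

# Conjugate update: Beta(a0 + successes, b0 + failures)
post_A = (a0 + s_A, b0 + n_A - s_A)
post_B = (a0 + s_B, b0 + n_B - s_B)

# Savage-Dickey density ratio at the point null Δ = 0
prior_at_0 = delta_density_at_zero(a0, b0, a0, b0)
post_at_0 = delta_density_at_zero(*post_A, *post_B)
bf_01 = post_at_0 / prior_at_0
bf_10 = 1.0 / bf_01

# Posterior mass inside the ROPE |Δ| < 0.05, estimated by Monte Carlo
rng = np.random.default_rng(0)
delta = rng.beta(*post_A, size=100_000) - rng.beta(*post_B, size=100_000)
rope = 0.05
rope_mass = np.mean(np.abs(delta) < rope)

print(f"BF_10 = {bf_10:.1f}, P(|Δ| < {rope}) = {rope_mass:.4f}")
```

A large BF_10 together with a small ROPE mass points the same way: the data favour a non-zero difference that is also practically relevant.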

Models

All paired methods are accessible through a single unified facade — PairedBayesPropTest(method=…) — that dispatches to the chosen inference backend.

| Model | Class / `method` | Inference | When to use |
| --- | --- | --- | --- |
| Non-paired Beta–Bernoulli | `NonPairedBayesPropTest` | Conjugate Beta posteriors per arm; P(B>A) by quadrature, Δ summaries by Monte Carlo | Independent groups, exact & fast |
| Paired Logistic (Laplace) | `PairedBayesPropTest(method="laplace")` | MAP + Laplace approximation (fixed or hierarchical IG hyperpriors) | Paired scores, fast, default |
| Paired Logistic (Pólya–Gamma) | `PairedBayesPropTest(method="pg")` | Exact Gibbs sampling (fixed or hierarchical IG hyperpriors) | Paired scores, small n, exact posterior |
| Paired Bayesian Bootstrap | `PairedBayesPropTest(method="bootstrap")` | Nonparametric: Dirichlet weights on paired differences | Paired scores, no prior elicitation, ROPE-driven (no Savage–Dickey BF) |
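
As a rough illustration of the non-paired Beta–Bernoulli row, the sketch below computes P(B > A) by one-dimensional quadrature and summarises Δ by Monte Carlo using only NumPy and SciPy. It does not call the package; the function name `prob_b_greater_a`, the counts, and the Beta(1, 1) priors are assumptions made for the example.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def prob_b_greater_a(a_A, b_A, a_B, b_B):
    """P(θ_B > θ_A) for independent Beta posteriors.

    Uses P(θ_B > θ_A) = ∫ f_B(x) · F_A(x) dx over (0, 1), where f_B is the
    density of θ_B and F_A is the CDF of θ_A.
    """
    integrand = lambda x: stats.beta.pdf(x, a_B, b_B) * stats.beta.cdf(x, a_A, b_A)
    value, _ = quad(integrand, 0.0, 1.0)
    return value

# Hypothetical counts (illustration only)
s_A, n_A = 40, 100
s_B, n_B = 52, 100

# Conjugate update under assumed Beta(1, 1) priors
post_A = (1 + s_A, 1 + n_A - s_A)
post_B = (1 + s_B, 1 + n_B - s_B)

p_quad = prob_b_greater_a(*post_A, *post_B)

# Monte Carlo summaries of Δ = θ_B - θ_A from the same posteriors
rng = np.random.default_rng(0)
theta_A = rng.beta(*post_A, size=200_000)
theta_B = rng.beta(*post_B, size=200_000)
delta = theta_B - theta_A
p_mc = np.mean(delta > 0)
lo, hi = np.quantile(delta, [0.025, 0.975])

print(f"P(B > A): quadrature = {p_quad:.4f}, Monte Carlo = {p_mc:.4f}")
print(f"Mean Δ = {delta.mean():+.4f}, 95% interval = [{lo:+.4f}, {hi:+.4f}]")
```

The quadrature result is deterministic, so the Monte Carlo estimate mainly serves as a sanity check and as the source of interval summaries for Δ.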

Quick start

import numpy as np
from bayesprop.resources.bayes_paired import PairedBayesPropTest

# Paired binary data (y_A[i] and y_B[i] refer to the same item)
y_A = np.array([1,1,0,1,1,0,1,1,1,1,1,1,1,0,1,1,1,0,1,1])     # 16/20 = 0.80
y_B = np.array([0,1,0,0,1,0,0,1,0,0,1,0,1,0,0,1,0,0,0,0])     #  6/20 = 0.30

# Fit posterior & summarise
model = PairedBayesPropTest(seed=42).fit(y_A, y_B)

s = model.summary
print(f"θ_A = {s.theta_A_mean:.4f},  θ_B = {s.theta_B_mean:.4f}")
print(f"Mean Δ (θ_A − θ_B) = {s.mean_delta:+.4f}")
print(f"95% CI = [{s.ci_95.lower:.4f}, {s.ci_95.upper:.4f}]")
print(f"P(A > B) = {s.p_A_greater_B:.4f}")

# ── Unified decision ─────────────────────────────────────────────────
d = model.decide()
bf = d.bayes_factor

print("\n--- Unified Decision ---")
print(f"  Bayes Factor: BF_10 = {bf.BF_10:.2f}  → {bf.decision}")
print(f"  Posterior Null: P(H0|D) = {d.posterior_null.p_H0:.4f}  → {d.posterior_null.decision}")
print(f"  ROPE: {d.rope.decision} ({d.rope.pct_in_rope:.1%} in ROPE)")

# Plots
model.plot_posteriors()
model.plot_posterior_delta()
model.plot_savage_dickey()
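
For a dependency-light cross-check of the same data, the Bayesian bootstrap idea from the Models table (Dirichlet weights on paired differences) can be sketched in plain NumPy. This is an independent illustration rather than a call into the package; the number of draws, the seed, and the ROPE half-width of 0.05 are assumptions.

```python
import numpy as np

# Same paired data as in the quick start
y_A = np.array([1,1,0,1,1,0,1,1,1,1,1,1,1,0,1,1,1,0,1,1])
y_B = np.array([0,1,0,0,1,0,0,1,0,0,1,0,1,0,0,1,0,0,0,0])

d = y_A - y_B                     # per-item paired differences in {-1, 0, 1}
n = d.size

# Bayesian bootstrap (Rubin, 1981): draw flat Dirichlet weights over the
# observed pairs and take the weighted mean of the differences
rng = np.random.default_rng(42)
weights = rng.dirichlet(np.ones(n), size=10_000)   # shape (draws, n)
delta = weights @ d                                # posterior draws of Δ

lo, hi = np.quantile(delta, [0.025, 0.975])
rope_mass = np.mean(np.abs(delta) < 0.05)          # assumed ROPE: |Δ| < 0.05

print(f"Mean Δ = {delta.mean():+.4f}")
print(f"95% interval = [{lo:+.4f}, {hi:+.4f}]")
print(f"Share of draws in ROPE = {rope_mass:.1%}")
```

Because no pair in this data set has B succeeding where A fails, every paired difference is non-negative and so is every bootstrap draw of Δ; with other data the draws can fall on both sides of zero.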

Installation

pip install bayesprop

Or with uv:

uv add bayesprop

For development (from source):

git clone https://github.com/AVoss84/bayesProp.git
cd bayesProp
uv venv --python 3.13
uv sync
source .venv/bin/activate

Dependencies

Python ≥ 3.13
numpy, scipy, matplotlib, pandas
pydantic (v2)
polyagamma

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. & Rubin, D. B. (2013). Bayesian Data Analysis (3rd ed.). Chapman & Hall/CRC.
Kruschke, J. K. (2018). Rejecting or accepting parameter values in Bayesian estimation. Advances in Methods and Practices in Psychological Science, 1(2), 270–280.
Polson, N. G., Scott, J. G. & Windle, J. (2013). Bayesian inference for logistic models using Pólya–Gamma latent variables. Journal of the American Statistical Association, 108(504), 1339–1349.
Rubin, D. B. (1981). The Bayesian Bootstrap. The Annals of Statistics, 9(1), 130–134.
Schönbrodt, F. D. & Wagenmakers, E.-J. (2018). Bayes factor design analysis: Planning for compelling evidence. Psychonomic Bulletin & Review, 25(1), 128–142.
