Merged
24 commits
4708407
fix(sell inference): emit registration spec + honor operator overrides
bussyjd May 12, 2026
b92b78e
fix(sell inference): auto-register on-chain to match sell http parity
bussyjd May 12, 2026
dbf6c89
fix(sell): allow signer != payTo on ERC-8004 register (not a spec con…
bussyjd May 12, 2026
2d36745
feat(stack): resume sell-inference offers on stack up
bussyjd May 12, 2026
6efdf3f
feat(stack): zero-command resume — auto-start gateway + relax storefr…
bussyjd May 12, 2026
9947bb0
fix(stack): three cosmetic fixes on top of the zero-command resume
bussyjd May 12, 2026
702d7c6
feat(stack): resume obol sell http offers on stack up
bussyjd May 12, 2026
e81dcad
Persist ERC-8004 agent identity
bussyjd May 12, 2026
e7766a5
chore(x402): strip debug log statements left from release-smoke 2026-…
bussyjd May 13, 2026
7125662
Merge #487 — paid services survive stack down/up + storefront fixes
bussyjd May 13, 2026
8fb42c9
Merge #489 — persist ERC-8004 agent identity
bussyjd May 13, 2026
2408b8a
feat(flows): add buy-external.sh for arbitrary external x402 sellers
bussyjd May 13, 2026
d1f97e3
fix(flow-13): align prompt + assertion with flow-14
bussyjd May 13, 2026
7f29a02
chore: drop X402_FACILITATOR_SKIP_PULL knob (upstream fix landed)
bussyjd May 13, 2026
3d8e231
fix(buy-external): normalize chain ids before comparing 402 accepts[0]
bussyjd May 13, 2026
7554b5e
fix(buy-external): respect k3d 32-char cluster-name cap
bussyjd May 13, 2026
c2dddc1
fix(buy-x402): set a non-Python User-Agent on outbound HTTP
bussyjd May 13, 2026
9b795eb
docs(plans): inference.v1337.org live-buy report
bussyjd May 13, 2026
b749f95
feat(buy-external): add KEEP_CLUSTER_ON_FAIL knob + diagnostic snapsh…
bussyjd May 14, 2026
eb13055
fix(flows): pick freshest of .build/obol vs .workspace/bin/obol in bo…
bussyjd May 14, 2026
849cd93
docs(skill): document Cloudflare-WAF UA pitfall + Go-side follow-up
bussyjd May 14, 2026
82108c3
docs(plans): retract v1337 controller-gap hypothesis — controller works
bussyjd May 14, 2026
df5fcff
refactor(buy-external): green-only cleanup gate, drop KEEP_CLUSTER_ON…
bussyjd May 15, 2026
2a1c1b2
Merge pull request #493 from ObolNetwork/chore/buy-external-followups
bussyjd May 15, 2026
2 changes: 1 addition & 1 deletion .agents/skills/obol-stack-dev/SKILL.md
@@ -67,7 +67,7 @@ When the smoke gate goes red, check these first — each was a multi-hour debug:
| Public facilitator stuck on stale image | `x402.gcp.obol.tech` was on vanilla `1.4.5`, not the `prometheus-overlay` variant clients are paired with. | `obol-infrastructure#2612` — chart pivot to overlay 1.4.9 + Traefik HTTPRoute filter denies `/metrics` on the public hostname (matches existing `vmauth` idiom). |
| Free-tier RPC 408 on balance reads | `drpc.org`/`sepolia.base.org` rate-limit aggressively under release-smoke load. | Set `BASE_SEPOLIA_RPC` to a paid drpc lb URL or `ALCHEMY_BASE_SEPOLIA_API_KEY`. `flows/release-smoke.sh` runs `warn_unpaid_base_sepolia_rpc` preflight. `flows/lib.sh::scrub_secrets` collapses paid-RPC URLs to TLD-only in logs. |
| First request after fresh verifier deploy returns empty body | Traefik HTTPRoute is wired but verifier's serviceoffer-source watcher hasn't loaded the route yet. | `flows/flow-07-sell-verify.sh` + `flows/flow-08-buy.sh` — wrap 402-body fetch in 12×5s retry loop. |
-| facilitator arm64 image runs amd64 binary | `ObolNetwork/x402-rs` v1.4.9 prom-overlay arm64 manifest packaging bug. | Workaround: `X402_FACILITATOR_SKIP_PULL=true` + locally-built arm64 image on QA host. Upstream fix on `fix/multiarch-overlay-arm64`. |
+| facilitator arm64 image runs amd64 binary | Was an `ObolNetwork/x402-rs` prom-overlay arm64 manifest packaging bug. | **Fixed upstream**: `ObolNetwork/x402-rs#3` (merged 2026-05-13, `668b7bb`) dropped the redundant `--platform=$BUILDPLATFORM` pin from the prom-overlay builder stage. Registry image republished; arm64 manifest now ships an aarch64 ELF (digest `sha256:b209345c…`). The `X402_FACILITATOR_SKIP_PULL` knob has been removed from `flows/lib.sh`. |

**Diagnosis pattern**: a 503 from the verifier or 404 from a paid route almost never means the verifier is bad — it usually means the deployed image isn't what you think it is, the chain id form mismatched, or the upstream wasn't reachable. Confirm the running image first (`kubectl get deploy -n x402 x402-verifier -o jsonpath='{.spec.template.spec.containers[*].image}'`) before diving into x402 logic.

@@ -92,13 +92,20 @@ flow-07 step 9 + flow-08 step 3 POST once to a freshly-deployed verifier and JSO

- **Fix in repo**: `b46f5d9` — wrap both 402-body assertions in 12×5s retry loops that break the moment the response parses as JSON.

-### 9. `obol-infrastructure x402-rs` arm64 manifest contains amd64 binary
+### 9. `ObolNetwork/x402-rs` arm64 manifest contained amd64 binary (historical)

-`ObolNetwork/x402-rs` v1.4.9 prometheus-overlay's arm64 manifest variant ships a cross-built amd64 ELF (overlay Dockerfile pinned `--platform=$BUILDPLATFORM` on the builder stage).
+Earlier `1.4.9` of `ghcr.io/obolnetwork/x402-facilitator-prometheus-overlay` shipped a cross-built amd64 ELF inside the arm64 manifest variant (overlay Dockerfile pinned `--platform=$BUILDPLATFORM` on the builder stage). Facilitator crashlooped on arm64 hosts with `exec format error`.

-- **Symptom**: facilitator container on arm64 hosts crashloops with `exec format error` or comparable runtime errors.
-- **Workaround in repo**: `X402_FACILITATOR_SKIP_PULL=true` keeps `flow-10` from re-pulling the broken registry image; build the facilitator locally on arm64 hosts. Knob landed in `flows/lib.sh`.
-- **Upstream fix**: prepared but not pushed — `fix/multiarch-overlay-arm64` (drops the redundant `--platform` pin from the builder stage).
+- **Fixed upstream**: `ObolNetwork/x402-rs#3` (merged 2026-05-13, `668b7bb`) dropped the platform pin. The publish workflow republished `1.4.9` on push to `main`; arm64 digest is now `sha256:b209345c5e05415df36444b307213c61f9ca08db9f8131d0ebfebefc244ba4ec`.
+- **`X402_FACILITATOR_SKIP_PULL` knob removed** from `flows/lib.sh` once the republished image was validated against the release-smoke. If you encounter `exec format error` on an arm64 host now, the registry image is wrong, not the host — pull-fresh (`docker pull ghcr.io/obolnetwork/x402-facilitator-prometheus-overlay:1.4.9`) and check the manifest with `docker buildx imagetools inspect`.
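Besides inspecting the manifest, one way to confirm which architecture a binary was actually built for is to read the `e_machine` field of its ELF header. The sketch below is illustrative and not part of the repo: `elfMachine` and `fakeHeader` are hypothetical helpers, and the demo uses synthetic headers rather than a binary taken from the image.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// ELF e_machine values from the ELF specification.
const (
	emX86_64  = 62  // EM_X86_64
	emAArch64 = 183 // EM_AARCH64
)

// elfMachine reports the architecture recorded in an ELF header.
// It needs only the first 20 bytes of the binary.
func elfMachine(header []byte) (string, error) {
	if len(header) < 20 || string(header[:4]) != "\x7fELF" {
		return "", errors.New("not an ELF header")
	}
	// Byte 5 (EI_DATA) selects the byte order: 1 = little-endian, 2 = big-endian.
	var order binary.ByteOrder = binary.LittleEndian
	if header[5] == 2 {
		order = binary.BigEndian
	}
	// e_machine is a 16-bit field at offset 18.
	switch m := order.Uint16(header[18:20]); m {
	case emX86_64:
		return "amd64", nil
	case emAArch64:
		return "arm64", nil
	default:
		return fmt.Sprintf("unknown(%d)", m), nil
	}
}

// fakeHeader builds a synthetic little-endian 64-bit ELF header prefix.
// It exists only so the demo runs without a real binary.
func fakeHeader(machine uint16) []byte {
	h := make([]byte, 20)
	copy(h, "\x7fELF")
	h[4] = 2 // ELFCLASS64
	h[5] = 1 // ELFDATA2LSB
	binary.LittleEndian.PutUint16(h[18:], machine)
	return h
}

func main() {
	for _, m := range []uint16{emX86_64, emAArch64} {
		arch, err := elfMachine(fakeHeader(m))
		fmt.Println(arch, err)
	}
}
```

Against a real image, the input would be the first 20 bytes of the facilitator binary extracted from the arm64 variant; an `amd64` result there would reproduce the packaging bug described above.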

+### 10. Cloudflare WAF blocks default `Python-urllib` User-Agent on external sellers
+
+When buying from external x402 sellers (sellers running outside our k3d cluster — e.g. `https://inference.v1337.org/...`), some sit behind Cloudflare's managed WAF, which **blocks the default `Python-urllib/X.Y` UA with HTTP 403 + Cloudflare error 1010** ("the owner of this website has banned your access based on your browser's signature"). Both the unpaid 402 probe and the paid `X-PAYMENT` request fail; buyers see misleading auth/signing errors instead of the real cause.
+
+- **Symptom**: `buy.py probe` against an external seller fails with 403 (often surfaced as a JSON-decode error or "no accepts" downstream); `buy.py buy` against the same endpoint also fails before signature verification. Curl with default browser UA against the same URL returns 402 cleanly.
+- **Fix in repo**: `c2dddc1` — added module-level `USER_AGENT = os.environ.get("OBOL_BUYER_USER_AGENT", "obol-buy-x402/1.0 (+https://github.com/ObolNetwork/obol-stack)")` to `internal/embed/skills/buy-x402/scripts/buy.py`, applied in `_probe_endpoint` (kind=http), `_probe_endpoint` (kind=inference), and the paid `X-PAYMENT` request in `buy_paid_oneshot`. Tested four UAs against v1337 (`curl/*`, generic `Mozilla/*`, `Chrome/*`, custom `obol-buy-x402/*`) — all four returned 402 cleanly. The fix is "send anything that isn't `Python-urllib`", not "send a specific browser UA". Operator override: `OBOL_BUYER_USER_AGENT`.
+- **Follow-up (not yet confirmed)**: the same WAF block likely affects the Go-side controller probe at `internal/serviceoffercontroller/purchase.go:183`, since Go's `http.Client` defaults to `User-Agent: Go-http-client/1.1`. Verify against v1337 and apply the same UA override on the Go side if reproduced.

## Diagnostic Patterns

17 changes: 16 additions & 1 deletion cmd/obol/main.go
@@ -204,7 +204,22 @@ GLOBAL OPTIONS:{{template "visibleFlagTemplate" .}}{{end}}
 		},
 	},
 	Action: func(ctx context.Context, cmd *cli.Command) error {
-		return stack.Up(cfg, getUI(cmd), cmd.Bool("wildcard-dns"))
+		u := getUI(cmd)
+		if err := stack.Up(cfg, u, cmd.Bool("wildcard-dns")); err != nil {
+			return err
+		}
+		// Re-apply cluster-side state for locally-persisted
+		// `obol sell *` offers. ServiceOffer CRs and the
+		// Service/Endpoints that route to the host gateway
+		// live in etcd, which is destroyed by `obol stack
+		// down`, so a fresh `stack up` would otherwise come
+		// back with the descriptors still on disk but no
+		// matching cluster resources. Best-effort: a resume
+		// failure does not block stack-up.
+		if err := resumeSellOffers(ctx, cfg, u); err != nil {
+			u.Warnf("Could not resume sell offers: %v", err)
+		}
+		return nil
 	},
 },
 {