Skip to content

fix(sell inference): emit registration spec + honor operator overrides#485

Closed
bussyjd wants to merge 3 commits into
mainfrom
fix/sell-inference-registration
Closed

fix(sell inference): emit registration spec + honor operator overrides#485
bussyjd wants to merge 3 commits into
mainfrom
fix/sell-inference-registration

Conversation

@bussyjd
Copy link
Copy Markdown
Collaborator

@bussyjd bussyjd commented May 12, 2026

Summary

obol sell inference produced ServiceOffers that were silently missing the ERC-8004 registration block, so /.well-known/agent-registration.json was never routed for inference-typed sells. Surfaced today on spark2 while probing https://inference.v1337.org/.well-known/agent-registration.json — got the Traefik fall-through 404 with a Next.js HTML body. Manual kubectl patch enabled registration but exposed a second, controller-side defect where any operator-supplied description was clobbered on the way out.

Four related defects in one cohesive fix:

# Layer What
A cmd/obol/sell.go sellInferenceCommand No --register-* flags exposed
B cmd/obol/sell.go buildInferenceServiceOfferSpec Never wrote spec.registration; hardcoded spec.model.name="ollama" regardless of --model
C cmd/obol/sell.go buildInferenceServiceOfferSpec call site No plumbing from the operator's flags into the spec
D internal/serviceoffercontroller/render.go buildActiveRegistrationDocument unconditionally overwrote operator Spec.Registration.Description for inference offers

Test coverage that would have caught this

Adding tests was the asked-for half of the work — each one pins a specific defect so a regression has to fight every layer.

  • TestSellInference_Flags now requires --no-register, --register-name, --register-description, --register-image, --register-skills, --register-domains, --register-metadata. Their absence on main was defect A.
  • TestBuildInferenceServiceOfferSpec_RegistrationEnabledByDefault — defaults produce spec.registration.enabled = true and name = <offer name>.
  • TestBuildInferenceServiceOfferSpec_NoRegisterOmitsRegistration--no-register opt-out path.
  • TestBuildInferenceServiceOfferSpec_OperatorOverridesWin — operator-supplied name/description/image/skills/domains all survive into spec.registration verbatim.
  • TestBuildInferenceServiceOfferSpec_ModelNameNotHardcodedspec.model.name reflects --model, not the historical "ollama" literal.
  • TestBuildActiveRegistrationDocument_KeepsOperatorDescription — controller-side: operator description survives into the published AgentRegistration document (defect D).
  • TestBuildActiveRegistrationDocument_FallsBackToModelDescriptionForInference — the other branch: empty operator description on inference offers still gets the model-aware default, not the generic one.

Run:

$ go test ./cmd/obol ./internal/serviceoffercontroller -count=1
ok  cmd/obol  3.624s
ok  internal/serviceoffercontroller  10.259s

Reverse-validation that the tests catch the bugs

Each new test maps to a specific line that was wrong on main:

  • The flag-list assertion fails before adding cmd/obol/sell.go:188216 (the new flag block).
  • The _RegistrationEnabledByDefault / _OperatorOverridesWin tests fail before changing buildInferenceServiceOfferSpec's signature to accept the registration map.
  • The _ModelNameNotHardcoded test fails on the name: "ollama" literal at the old line.
  • The controller test fails on the if owner.IsInference() && owner.Spec.Model.Name != "" unconditional overwrite at render.go:589.

Drive-by

TestSellInference_Flags was already failing on main after #470 removed the --price default but didn't update the assertStringDefault(t, flags, "price", "0.001") assertion. Bumped to assert "" with a comment explaining the post-#470 contract. Not the headline fix but rolling it in keeps make test green.

Notes for the reviewer

  • The rename sellHTTPRegistrationInput → sellRegistrationInput (and the matching helper) is mechanical and contained: only sell.go + sell_test.go reference these names. Did it because the helper is now shared between inference and http call sites.
  • Live end-to-end validation on spark2 was attempted; the dev k3d cluster had wound down between checks. The unit-test coverage above pins the three layers independently; happy to re-do the live test against a refreshed cluster if reviewers want belt-and-suspenders.

Generated with Claude Code

bussyjd added 3 commits May 12, 2026 17:31
`obol sell inference` was producing ServiceOffers with three latent
defects that together stopped /.well-known/agent-registration.json
from ever being routed for inference-typed sells:

1. The inference subcommand never exposed `--register-*` flags
   (compare with `obol sell http` which does), even though the help
   text on `--register` says "registration is enabled by default".
2. `buildInferenceServiceOfferSpec` never wrote `spec.registration`
   onto the offer at all, so the controller's
   `reconcileRegistrationStatus` saw `Enabled=false` (zero value) and
   emitted `Registered=True Disabled` with no RegistrationRequest CR
   and no /.well-known HTTPRoute.
3. `buildInferenceServiceOfferSpec` hardcoded
   `spec.model.name = "ollama"` regardless of the actual `--model`
   value, so anything downstream that keyed off the model id (the
   controller's description default included) was looking at the
   wrong string.

Surfaced today on spark2 while trying to fetch
`https://inference.v1337.org/.well-known/agent-registration.json` —
got the Traefik fall-through 404. Manual `kubectl patch` enabled
registration, exposed a fourth defect:

4. `buildActiveRegistrationDocument` in the serviceoffer-controller
   unconditionally overwrote `Spec.Registration.Description` for
   inference offers with `"<model.name> inference via x402
   micropayments"`, even when the operator had supplied an explicit
   description at sell time.

This PR fixes all four:

- Add `--no-register`, `--register-name`, `--register-description`,
  `--register-image`, `--register-skills`, `--register-domains`,
  `--register-metadata` to `obol sell inference`.
- Rename `sellHTTPRegistrationInput` / `buildSellHTTPRegistrationConfig`
  to the unqualified `sellRegistrationInput` /
  `buildSellRegistrationConfig` since they now serve both inference
  and http call sites.
- Extend `buildInferenceServiceOfferSpec` to accept the resolved
  model name and the registration block, write
  `spec.model.name = <real model id>`, and merge the registration
  block into `spec.registration` when non-empty.
- In the controller's `buildActiveRegistrationDocument`, only fall
  back to the model-aware description string when the operator left
  `Spec.Registration.Description` empty.

Tests that would have caught the regression earlier:

- `TestSellInference_Flags` now requires the six registration flags;
  their absence on `main` was the bug.
- `TestBuildInferenceServiceOfferSpec_RegistrationEnabledByDefault`
  pins that defaults produce `spec.registration.enabled = true` and
  the offer name as `spec.registration.name`.
- `TestBuildInferenceServiceOfferSpec_NoRegisterOmitsRegistration`
  pins the `--no-register` opt-out.
- `TestBuildInferenceServiceOfferSpec_OperatorOverridesWin` pins
  that operator-supplied name/description/image/skills/domains all
  survive into the spec verbatim.
- `TestBuildInferenceServiceOfferSpec_ModelNameNotHardcoded` pins
  that `spec.model.name` reflects `--model`, not the historical
  "ollama" literal.
- `TestBuildActiveRegistrationDocument_KeepsOperatorDescription`
  pins the controller-side fix: an operator description survives
  the buildActiveRegistrationDocument pass.
- `TestBuildActiveRegistrationDocument_FallsBackToModelDescriptionForInference`
  pins the other branch — inference offers with no operator
  description still get the model-aware default, not the generic
  one.

Drive-by: `TestSellInference_Flags` was failing on `main` after #470
removed the `--price` default but didn't update the corresponding
assertion. Updated to assert `--price` default = "" with a comment
explaining the contract.
The first revision of #485 stopped at "produce a ServiceOffer with a
populated spec.registration block". That made
/.well-known/agent-registration.json get routed, but left the offer in
Registered=False AwaitingExternalRegistration until someone manually
ran `obol sell register`. The serviceoffer-controller's storefront
filter (buildServiceCatalogJSON in render.go) requires Ready=True,
which transitively requires Registered=True, so the offer was silently
excluded from /api/services.json — the very feed the operator's own
storefront UI consumes.

obol sell http already auto-registers on the same code path. Mirror
that here so the inference path reaches the same end state without a
follow-up obol sell register step.

Changes:
- Extract shouldAutoRegisterSell(spec, tunnelURL) as the shared
  decision predicate. The same gate now drives both the http and
  inference auto-register call sites; defensively returns false when
  the registration block is missing, disabled, malformed, or the
  tunnel URL is empty.
- sellInferenceCommand Action: after kubectlApply + EnsureTunnelForSell
  succeed, if shouldAutoRegisterSell says yes, call
  autoRegisterServiceOffer with the resolved name/description/wallet
  pulled from the spec. Surface failures as warnings + a re-run hint
  rather than aborting the gateway start, because the underlying call
  needs gas on the target chain and we do not want a one-off RPC
  hiccup to block local dev.

Test coverage that would have caught the regression:
- TestShouldAutoRegisterSell - table-driven over six scenarios
  including the defensive cases (registration not a map,
  registration.enabled not a bool). Both call sites use the helper.
- TestSellInferenceAction_InvokesAutoRegister - source-level guard
  that scans the sellInferenceCommand body for shouldAutoRegisterSell
  and autoRegisterServiceOffer. The bug we just fixed was "Action
  calls neither"; an innocent refactor could remove the calls without
  any unit-level signal otherwise.

Operational note: the agent's remote-signer wallet needs a small ETH
balance on the target chain (~0.20-0.50 USD typical) for the on-chain
register tx. If the wallet is unfunded the warning fires and the
offer stays in AwaitingExternalRegistration; the operator can fund
the wallet and re-run obol sell register to finish.
…straint)

The autoRegisterServiceOffer pre-flight check rejected any registration
whose signer didn't match the offer's payTo wallet:

  registration signer 0xA... does not match the payment wallet 0xB...
  Use a matching signer, omit --wallet so the remote-signer wallet is
  used, or pass --no-register

The error wording read like an ERC-8004 limitation but isn't. ERC-8004
treats the agent OWNER (msg.sender at register time) and the agent
WALLET (settable post-mint via setAgentWallet) as independent
addresses. x402 settlement honors the offer's spec.payment.payTo
directly — buyers pay that address regardless of what the registry's
getAgentWallet returns. The "hot signer, cold/multisig payee" split is
the canonical pattern.

The historic guard existed because the obol CLI never exposed
setAgentWallet, so a mismatched registration left operators with no
in-CLI recovery path. This change instead surfaces the split as an
informational note + adds `obol sell update <name> --pay-to <new>` as
the recovery surface (already in tree; just needed test coverage and
the connection wired into the diagnostic).

Changes:
- signerPayeeDelegationNote(signer, payTo) returns a human-readable
  note when the two diverge (case-insensitive, whitespace-tolerant,
  empty on either side) and "" otherwise. Used by
  autoRegisterServiceOffer instead of the previous early-return.
- buildSellUpdatePatch(payTo, chain, price) extracted from the inline
  sellUpdateCommand Action so the patch shape — the thing that
  actually hits the cluster — is testable without a live offer.
  Action calls the helper instead of inlining the same logic.

Tests:
- TestSignerPayeeDelegationNote — 6-case table: match,
  case-insensitive, whitespace, empty payTo, empty signer (defensive),
  true mismatch (assertions name the addresses + advise sell update).
- TestAutoRegister_AllowsSignerPayeeMismatch — source-level guard
  asserting the banned error wording is gone from
  autoRegisterServiceOffer and the soft-notice path is wired. Anyone
  re-introducing the check has to delete this test too, which forces
  them to read the rationale.
- TestBuildSellUpdatePatch_PayToOnly — `obol sell update <name>
  --pay-to 0xBooB` builds a patch that touches only
  spec.payment.payTo, not network or price.
- TestBuildSellUpdatePatch_PriceSwitchNullsOldKeys — table over
  perRequest/perMTok/perHour: the unused keys are explicitly null so
  a switch (e.g. perRequest → perMTok) doesn't leave the previous key
  fighting through merge semantics.
- TestBuildSellUpdatePatch_NoFieldsErrors — error fires when no
  fields are set, and the error names the flags the operator should
  pass.
- TestSellUpdate_PayToFlagSurface — `obol sell update` exposes
  --pay-to (with --wallet/--recipient/-w aliases via payToFlag), and
  --namespace is Required.
@bussyjd
Copy link
Copy Markdown
Collaborator Author

bussyjd commented May 12, 2026

Superseded by #487 — all three commits (4708407, b92b78e, dbf6c89) folded into the umbrella branch verbatim, plus the resume + storefront work that grew out of them. Reviewing #487 reviews these changes too.

@bussyjd bussyjd closed this May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant