fix(server,relay): bind dualstack wildcard for IPv6-only clusters#225

Merged: rg0now merged 1 commit into l7mp:main from ti:main on Jun 3, 2026

ti commented May 26, 2026

Contributor

Summary

StartServer and NewRelayGen hardcoded 0.0.0.0 together with the
IPv4-only network strings udp4/tcp4 (in DTLS as well). On IPv4-only
hosts this is fine, but on IPv6-only hosts the listener and relay
sockets cannot be reached at all and TURN allocations silently fail
with no obvious log line.

This change switches both call sites to the unspecified-address /
family-neutral network strings:

  • server.go, listener bind: "0.0.0.0:%d" → ":%d",
    socketPool.ListenPacket("udp4", ...) → "udp",
    net.ResolveUDPAddr("udp4", ...) and dtls.ListenWithOptions("udp4", ...)
    → "udp". The TCP/TLS branches were already family-neutral, so they
    just inherit the new :port form.
  • relay.go, RelayGen.Address default: "0.0.0.0" → "". The
    comment at AllocatePacketConn already noted that an empty address is
    the only correct value when the requested network can be either v4 or
    v6; this just makes the default match that comment.

On Linux with the default net.ipv6.bindv6only=0, this opens a single
IPv6 wildcard socket that accepts both IPv4 (mapped) and IPv6 peers; on
IPv4-only or IPv6-only hosts it still binds the available family. Net
result: IPv4-only deployments are unaffected, IPv6-only deployments
start working, and dual-stack deployments now actually carry both
families instead of silently dropping IPv6 traffic.

Verification

Verified end-to-end on an IPv6-only Linux host with a custom build of
stunnerd from this branch:

  • Before: all 16 UDP socket-pool sockets bound on 0.0.0.0:<port>,
    /proc/net/udp6 empty, TURN allocate from an IPv6-only client times
    out (no allocate response).
  • After: all 16 sockets bound on [::]:<port> (IPv6 wildcard,
    dual-stack via bindv6only=0), /proc/net/udp empty, TURN allocate
    from an IPv6-only client succeeds — coturn turnutils_uclient
    receives an Allocate Success and a relay address on the listener's
    IPv6 prefix.

Illustrative turnutils_uclient output (addresses elided / replaced
with documentation-prefix examples):

0: INFO: IPv6. Connected from: <client-ipv6>:<ephemeral>
0: INFO: IPv6. Connected to:   <listener-ipv6>:<turn-port>
0: INFO: allocate sent
0: INFO: allocate response received:
0: INFO: success
0: INFO: IPv6. Received relay addr: <relay-ipv6>:<relay-port>

Smoke-tested in a dual-stack environment with no regression.

Test plan

  • CI green on main
  • make test locally
  • Manual sanity on a dual-stack host (no behavior change)
  • Manual sanity on an IPv4-only host (no behavior change)
  • Manual verification on an IPv6-only host (TURN allocate now
    succeeds, was previously silently broken)

Listener and relay sockets hardcoded "0.0.0.0" + the IPv4-only network
strings "udp4"/"tcp4". On IPv4-only hosts that's fine, but on IPv6-only
hosts (e.g. IPv6-only EKS) the socket cannot be reached at all, so
TURN allocations silently fail. On dual-stack hosts (Linux default with
net.ipv6.bindv6only=0) it still works for IPv4 peers but breaks for
IPv6-only peers, including pod-to-pod traffic on IPv6-only Kubernetes.

Switch to the unspecified address (":port" / "") plus the family-neutral
network strings ("udp" / "tcp"). The kernel then opens an IPv6 wildcard
socket that accepts IPv4 (mapped) and IPv6 peers on dual-stack hosts and
the available family on single-family hosts, matching the behavior of
Go's net.Listen with no host. Verified end-to-end on IPv6-only EKS:

  - 16-thread UDP listener now bound on [::]:7006 (was 0.0.0.0:7006)
  - TURN allocate succeeds from an IPv6-only client
  - relay address returned to the client is on the listener's IPv6 addr

No semantic change for IPv4-only deployments: ":port" continues to bind
to the IPv4 wildcard when no IPv6 stack is present.

rg0now self-assigned this May 26, 2026

rg0now commented May 26, 2026

Member

Thanks, looks legit. Unfortunately, this is blocked on an upstream pion infra fix (see the failing tests): pion/transport#377. I'll try to push that PR through and then we can merge this right away.

ti commented May 28, 2026

Contributor Author

Confirmed the failing test job is fully resolved by pion/transport#377 — no fix needed in this PR.

Reproduction

The failures (TestStunnerDefaultServerVNet, TestStunnerConfigFileRoundTrip, TestStunnerURIParser, TestCredentialParser, TestStunnerAuthServerVNet, TestStunnerReconcile, TestStunnerReconcileWithVNetE2E, TestStunnerReconcileWithVNetRollback, TestPortRangePacketConn, TestTurncatLongterm) all stem from VNet's ResolveIPAddr("ip", "") returning "host not found" once the listener bind switches from "0.0.0.0:port" to the wildcard ":port". That's exactly the path pion/transport#377 fixes by short-circuiting the empty address to INADDR_ANY (and IPv6unspecified for ip6).

Verification on this branch

Applying pion/transport#377 via a temporary go.mod replace and re-running just the previously-failing tests:

$ go mod edit -replace=github.com/pion/transport/v4=github.com/rg0now/transport/v4@954fadaa518d
$ go mod tidy
$ go test -run "TestStunnerDefaultServerVNet|TestStunnerConfigFileRoundTrip|TestStunnerURIParser|TestCredentialParser|TestStunnerAuthServerVNet|TestStunnerReconcile|TestStunnerReconcileWithVNetE2E|TestStunnerReconcileWithVNetRollback|TestPortRangePacketConn|TestTurncatLongterm" -count=1 .
ok  	github.com/l7mp/stunner	80.978s

The replace was rolled back; this comment is informational only — adding the replace to this PR's go.mod would be the wrong fix because it would pin stunner to a fork.

Suggested path

Merge pion/transport#377 (already approved by @JoTurk) and cut a pion/transport/v4 patch release; then this PR's CI goes green on the next push without further changes here. Happy to help in any way that's useful — review/test-cases on #377, retesting #225 after the bump, etc.


Related work (cross-linked for indexing)

The four together close the loop: this PR fixes the listener bind, #226 fixes the relay socket bind, #566/#567 fix the protocol-level family check on a dual-stack relay, and #377 unblocks CI for this PR specifically.

rg0now commented May 29, 2026

Member

Can you please go-get the latest version of pion/transport (go get github.com/pion/transport/v4@v4.0.2), commit and re-push? Your patch must pick up the latest upstream changes, otherwise the integration tests will never stop failing.

rg0now merged commit a8ea3b0 into l7mp:main on Jun 3, 2026
2 of 3 checks passed

rg0now commented Jun 3, 2026

Member

Thanks, applied. I'll take care of the version bump separately.
