33 changes: 33 additions & 0 deletions Containerfile
@@ -153,6 +153,39 @@ RUN set -eux; \
rm -f "taplo-linux-${ARCH}"; \
taplo --version;

# Install Tailscale from the binary tarball. Tailscale ships static x86_64 and
# arm64 Linux binaries directly, so no apt repo setup is needed at build time.
# Bakes `tailscale` (CLI) into /usr/local/bin and `tailscaled` (daemon) into
# /usr/local/sbin so the container image can join a tailnet on first start
# without per-host install friction. Opt-in remains gated by the
# TAILSCALE_AUTHKEY env var at runtime: no key, no daemon (see
# setup-tailscale.sh).
RUN set -eux; \
case "${TARGETARCH}" in \
amd64) ARCH=amd64 ;; \
arm64) ARCH=arm64 ;; \
*) echo "Unsupported architecture: ${TARGETARCH}"; exit 1 ;; \
esac; \
TS_VERSION="$( \
curl -fsSL 'https://pkgs.tailscale.com/stable/?mode=json' \
| sed -n 's/.*"'"$ARCH"'":[[:space:]]*"tailscale_\([^_]*\)_'"$ARCH"'\.tgz".*/\1/p' \
)"; \
if [ -z "$TS_VERSION" ]; then \
echo "Failed to resolve Tailscale latest version for $ARCH"; \
exit 1; \
fi; \
URL="https://pkgs.tailscale.com/stable"; \
FILE="tailscale_${TS_VERSION}_${ARCH}.tgz"; \
curl -fsSL "${URL}/${FILE}" -o ts.tgz; \
curl -fsSL "${URL}/${FILE}.sha256" -o ts.tgz.sha256; \
EXPECTED_SHA="$(awk '{print $1}' ts.tgz.sha256)"; \
echo "${EXPECTED_SHA}  ts.tgz" | sha256sum -c -; \
tar -xzf ts.tgz; \
install -m 0755 "tailscale_${TS_VERSION}_${ARCH}/tailscale" /usr/local/bin/tailscale; \
install -m 0755 "tailscale_${TS_VERSION}_${ARCH}/tailscaled" /usr/local/sbin/tailscaled; \
rm -rf "tailscale_${TS_VERSION}_${ARCH}" ts.tgz ts.tgz.sha256; \
tailscale version; \
tailscaled --version;

# Install cursor-agent CLI (installs to ~/.local/bin)
ENV PATH="/root/.local/bin:${PATH}"
RUN set -eux; \
12 changes: 12 additions & 0 deletions assets/workspace/.devcontainer/docker-compose.local.yaml
@@ -22,4 +22,16 @@
# environment:
# - MY_API_KEY=secret123

# Optional: Tailscale SSH for direct mesh access (e.g. Cursor GUI workaround).
# Generate an auth key at https://login.tailscale.com/admin/settings/keys
# (Reusable + Ephemeral recommended). The base docker-compose.yml already
# ships /dev/net/tun + NET_ADMIN/NET_RAW + the tailscale-state volume; setting
# TAILSCALE_AUTHKEY here is the only opt-in step. See docs/designs/tailscale-ssh.md.
#
# services:
#   devcontainer:
#     environment:
#       - TAILSCALE_AUTHKEY=tskey-auth-XXXX
#       - TAILSCALE_HOSTNAME=myproject-devc-mybox  # optional override

services: {}
15 changes: 15 additions & 0 deletions assets/workspace/.devcontainer/docker-compose.yml
@@ -8,11 +8,23 @@ services:
      # Container socket for Docker-out-of-Docker (DooD) - enables sidecars
      # Path is set by initialize.sh in .env, falls back to Docker default
      - "${CONTAINER_SOCKET_PATH:-/var/run/docker.sock}:/var/run/docker.sock"
      # Tailscale state: persistent across container recreate so the same
      # node identity is reused (avoids ephemeral-key collisions on the
      # tailnet when the previous still-online ephemeral hasn't expired yet).
      # Idle when Tailscale isn't enabled. See setup-tailscale.sh.
      - tailscale-state:/var/lib/tailscale
    environment:
      # PRE_COMMIT_HOME, UV_PROJECT_ENVIRONMENT, VIRTUAL_ENV are set in the image (Containerfile)
      # Tell Podman/Docker to use the mounted socket (for sidecars and container builds)
      - CONTAINER_HOST=unix:///var/run/docker.sock
      - DOCKER_HOST=unix:///var/run/docker.sock
    # Required for Tailscale SSH (real TUN device, not userspace networking).
    # Idle when no TAILSCALE_AUTHKEY is set. See docs/designs/tailscale-ssh.md.
    devices:
      - /dev/net/tun:/dev/net/tun
    cap_add:
      - NET_ADMIN
      - NET_RAW
    # Keep container running for VS Code to attach
    command: sleep infinity
    # Use root user (default from image)
@@ -23,3 +35,6 @@ services:
    # Disable SELinux labeling to allow socket access (required for sidecars)
    security_opt:
      - label=disable

volumes:
  tailscale-state:
8 changes: 8 additions & 0 deletions assets/workspace/.devcontainer/scripts/post-start.sh
@@ -20,4 +20,12 @@ sudo chmod 666 /var/run/docker.sock 2>/dev/null || true
echo "Syncing dependencies..."
just --justfile "$PROJECT_ROOT/justfile" --working-directory "$PROJECT_ROOT" sync

# Bring container into tailnet if TAILSCALE_AUTHKEY is set (no-op otherwise).
# Image bake handles install; this script only handles runtime connect.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -x "$SCRIPT_DIR/setup-tailscale.sh" ]; then
"$SCRIPT_DIR/setup-tailscale.sh" connect || \
echo "Tailscale: connect failed but post-start continues (container still usable via devcontainer protocol)"
fi

echo "Post-start setup complete"
153 changes: 153 additions & 0 deletions assets/workspace/.devcontainer/scripts/setup-tailscale.sh
@@ -0,0 +1,153 @@
#!/bin/bash

# setup-tailscale.sh — bring the container into the user's tailnet.
#
# Tailscale binaries (tailscale + tailscaled) are baked into the image at
# build time, so this script only handles the runtime concerns:
# - opt-in via TAILSCALE_AUTHKEY env var (no key, no daemon)
# - daemon startup with state in a persistent compose volume
# - idempotent: re-runs are no-ops when already authed under the same hostname
# - fail loud (exit non-zero) when /dev/net/tun is missing; the silent
# userspace-networking fallback was removed because it cannot serve
# inbound SSH, which is the entire point of running Tailscale here
#
# Single subcommand `connect` (kept for forward-compat with potential `down`,
# `status` siblings later — see issue #545+ for `just tailscale-*` recipes).

set -euo pipefail

STATE_DIR="/var/lib/tailscale"
STATE_FILE="$STATE_DIR/tailscaled.state"
SOCKET="/var/run/tailscale/tailscaled.sock"
LOG_TAG="Tailscale:"

log() { printf '%s %s\n' "$LOG_TAG" "$*"; }
warn() { printf '%s WARNING: %s\n' "$LOG_TAG" "$*" >&2; }
die() { printf '%s ERROR: %s\n' "$LOG_TAG" "$*" >&2; exit 1; }

require_authkey() {
[ -n "${TAILSCALE_AUTHKEY:-}" ] && return 0
log "TAILSCALE_AUTHKEY not set, skipping (this is the documented opt-out path)."
return 1
}

require_tun() {
if [ ! -c /dev/net/tun ]; then
die "/dev/net/tun is not available inside the container — Tailscale SSH cannot work without a real TUN device.

The default workspace docker-compose.yml ships with the necessary device + caps:
    devices:
      - /dev/net/tun:/dev/net/tun
    cap_add:
      - NET_ADMIN
      - NET_RAW

If you've customized your compose file, restore those entries (or unset TAILSCALE_AUTHKEY to skip Tailscale entirely)."
fi
}

resolve_hostname() {
if [ -n "${TAILSCALE_HOSTNAME:-}" ]; then
printf '%s\n' "$TAILSCALE_HOSTNAME"
return
fi
local project="devc"
local devc_json
devc_json="$(dirname "${BASH_SOURCE[0]}")/../devcontainer.json"
if [ -f "$devc_json" ]; then
local name
name=$(python3 -c \
"import json,sys; print(json.load(sys.stdin).get('name',''))" \
< "$devc_json" 2>/dev/null || true)
if [ -n "$name" ]; then
project="${name%-devc}"
fi
fi
# DNS labels: lowercase + alphanumerics + hyphens. Replace anything else.
project="$(printf '%s' "$project" | tr '[:upper:]_' '[:lower:]-' | tr -cd 'a-z0-9-')"
printf '%s-devc-%s\n' "$project" "$(hostname -s)"
}

# Returns 0 if tailscaled is already running and "up" against our intended
# hostname; otherwise returns non-zero so cmd_connect proceeds with `up`.
already_connected_as() {
local want="$1"
pgrep -x tailscaled >/dev/null 2>&1 || return 1
[ -S "$SOCKET" ] || return 1
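# Query the daemon over the same socket used for `up`. Only report a match
# when the backend is actually Running, so a logged-out or stopped node
# still goes through `tailscale up`.
local got
got=$(tailscale --socket="$SOCKET" status --self --json 2>/dev/null \
| python3 -c \
"import json,sys; d=json.load(sys.stdin); print(d.get('Self',{}).get('HostName','') if d.get('BackendState')=='Running' else '')" \
2>/dev/null || true)
[ -n "$got" ] && [ "$got" = "$want" ]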
}

cmd_connect() {
require_authkey || return 0
require_tun

local hostname
hostname=$(resolve_hostname)
log "target hostname: $hostname"

if already_connected_as "$hostname"; then
log "already connected as $hostname — no-op"
return 0
fi

mkdir -p "$STATE_DIR" "$(dirname "$SOCKET")"

if ! pgrep -x tailscaled >/dev/null 2>&1; then
log "starting tailscaled (state=$STATE_FILE socket=$SOCKET)"
# setsid detaches the daemon from this shell's process group so it
# survives postStartCommand's exit. Output to a log file (overflowing
# to container stderr would spam compose logs).
setsid /usr/local/sbin/tailscaled \
--state="$STATE_FILE" \
--socket="$SOCKET" \
>/var/log/tailscaled.log 2>&1 &
# Wait briefly for the socket — gives a clean error if the daemon
# crashes immediately (TUN missing, perms wrong) rather than a vague
# "tailscale up failed" later.
local _
for _ in $(seq 1 20); do
[ -S "$SOCKET" ] && break
sleep 0.25
done
[ -S "$SOCKET" ] || die "tailscaled failed to create socket within 5s — check /var/log/tailscaled.log"
fi

log "tailscale up --ssh --hostname=$hostname"
if tailscale --socket="$SOCKET" up \
--ssh \
--authkey="$TAILSCALE_AUTHKEY" \
--hostname="$hostname" \
--accept-routes; then
log "connected as $hostname"
else
die "tailscale up failed — check /var/log/tailscaled.log + Tailscale ACLs (must allow SSH for autogroup:member -> autogroup:self)"
fi
}

case "${1:-}" in
connect) cmd_connect ;;
*)
cat <<EOF >&2
Usage: $(basename "$0") connect

Brings the container into the tailnet identified by \$TAILSCALE_AUTHKEY.
Logs a skip message and exits 0 when TAILSCALE_AUTHKEY is unset.

Env vars:
TAILSCALE_AUTHKEY required — opt-in. Without it, this is a no-op.
TAILSCALE_HOSTNAME optional — overrides the auto-derived <project>-devc-<host> name.

Required compose config (shipped by default in the workspace template):
devices: [/dev/net/tun:/dev/net/tun]
cap_add: [NET_ADMIN, NET_RAW]
volumes: [tailscale-state:/var/lib/tailscale]
EOF
exit 1
;;
esac
109 changes: 109 additions & 0 deletions docs/designs/tailscale-ssh.md
@@ -0,0 +1,109 @@
# Tailscale SSH for Devcontainers

Design document for opt-in Tailscale SSH access to vigOS devcontainers.

Refs: #85

## Problem

Cursor's devcontainer-protocol mode cannot route shell commands through the AI agent (Cursor IDE limitation; the agent's shell tool fails). VS Code's devcontainer protocol works fine, and Cursor's CLI/terminal mode also works — only Cursor GUI + devcontainer protocol is broken.

The workaround is to bypass the devcontainer protocol entirely: connect Cursor (and other IDEs) via SSH to the container, treating it as a regular host on the user's tailnet. No port forwarding, no jump hosts, no manual ssh-key juggling.

## Solution

Run Tailscale inside the devcontainer with SSH enabled. The user generates an auth key once (manually or via OAuth API), exposes it via `TAILSCALE_AUTHKEY` in `docker-compose.local.yaml`, and the post-start lifecycle hook brings the container into the tailnet on every start.

Connect with `ssh root@<hostname>` from anywhere on the tailnet.

## Architecture decisions

| Decision | Choice | Rationale |
|---|---|---|
| Networking mode | **Real TUN** (`/dev/net/tun` device + `NET_ADMIN`/`NET_RAW` caps) | Tailscale's `--ssh` server requires a real TUN device. `--tun=userspace-networking` works for outbound but cannot serve inbound SSH connections, which is the entire point of this feature. The compose template ships TUN + caps **by default** so this is never a footgun — they're idle when no `TAILSCALE_AUTHKEY` is set. |
| SSH server | Tailscale SSH (`tailscale up --ssh`) | No openssh-server needed. Auth handled by Tailscale ACLs. |
| Auth mechanism | `TAILSCALE_AUTHKEY` env var | Set in `docker-compose.local.yaml` (git-ignored). Recommended: reusable + ephemeral keys so stale containers auto-expire from the tailnet. |
| Opt-in strategy | No-op when `TAILSCALE_AUTHKEY` is unset | The connect step logs a one-line skip message and returns success in `post-start.sh`. Zero impact on users who don't set the key. |
| Install method | **Baked into image** at build time | Static binary tarball from `pkgs.tailscale.com/stable/`, sha256-verified. No apt-repo dance, no apt clock-skew workaround at start time. ~25MB delta. |
| Daemon lifecycle | `setsid /usr/local/sbin/tailscaled ... &` from `setup-tailscale.sh connect` | `setsid` detaches the daemon from the post-start shell so it survives the script's exit. State at `/var/lib/tailscale/tailscaled.state`. |
| State persistence | Named volume `tailscale-state` mounted at `/var/lib/tailscale` | Survives `compose down` + `compose up` cycles. Same node identity is re-used → no ephemeral-key collisions on the tailnet. Wiped only on `compose down -v`. |
| Hostname | `TAILSCALE_HOSTNAME` env var, default `<project>-devc-<host>` | Disambiguates the same repo on different machines. `<host>` is the output of `hostname -s` inside the container. Project name parsed from `devcontainer.json`'s `name` field (a trailing `-devc` is stripped), sanitized to a valid DNS label (lowercase, alphanumerics + hyphens). |
| Failure mode | **Fail loud** when `/dev/net/tun` is missing | Hard exit with actionable error pointing at the compose entries to restore. Previous design quietly fell back to userspace-networking; users would never see the warning, then wonder why SSH didn't work. |
| Idempotency | `setup-tailscale.sh connect` checks `tailscale status --self` before running `tailscale up` | Re-runs are no-ops when already authed under the same hostname. Avoids regenerating sessions on every container start. |
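
The hostname rule in the table can be tried outside a container with a small sketch. It mirrors the sanitization in `setup-tailscale.sh`, but the function names (`sanitize_label`, `derive_hostname`) and the argument-based interface are illustrative assumptions, not part of the shipped script.

```bash
#!/usr/bin/env bash
# Sketch only: takes the devcontainer name and host as arguments instead of
# reading devcontainer.json and `hostname -s`. Function names are hypothetical.

# Lowercase, map underscores to hyphens, then drop anything that is not
# a-z, 0-9, or a hyphen.
sanitize_label() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr '_' '-' | tr -cd 'a-z0-9-'
}

# Build <project>-devc-<host>. A trailing "-devc" on the name is stripped
# first so it is not duplicated; an empty name falls back to "devc".
derive_hostname() {
    local name="${1:-}" host="$2"
    local project="${name%-devc}"
    [ -n "$project" ] || project="devc"
    printf '%s-devc-%s\n' "$(sanitize_label "$project")" "$host"
}

derive_hostname "My_Project-devc" "mybox"
```

Run as-is, this prints `my-project-devc-mybox`. Setting `TAILSCALE_HOSTNAME` bypasses the derivation entirely.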

## Lifecycle hook placement

| Hook | Script | Tailscale action |
|------|--------|-----------------|
| `postCreateCommand` | `post-create.sh` | (no Tailscale work — image bake + state volume handle install) |
| `postStartCommand` | `post-start.sh` | `setup-tailscale.sh connect` — start daemon + connect to tailnet (idempotent) |

`postStartCommand` runs on every container start (create + restart), **before** the IDE attaches. This is critical — `postAttachCommand` runs in a transient shell tied to the IDE session, and background processes started there die when the shell exits.
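
Because the hook runs on every start, a Tailscale failure should not block the rest of post-start. A minimal sketch of that pattern follows; `run_optional` is a hypothetical helper (the actual `post-start.sh` inlines the same idea with `||`), and `false` stands in for a failing connect step.

```bash
#!/usr/bin/env bash
# Sketch only: `run_optional` is a hypothetical helper, not part of post-start.sh.
# It runs an optional step and reports a failure without failing the hook.
run_optional() {
    local label="$1"; shift
    if "$@"; then
        return 0
    fi
    printf '%s: step failed but post-start continues\n' "$label" >&2
    return 0
}

run_optional "Tailscale" false   # stands in for a failing `setup-tailscale.sh connect`
echo "Post-start setup complete"
```

The container stays reachable through the devcontainer protocol even when the tailnet connection fails.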

## Files

| File | Role |
|------|------|
| `Containerfile` | Bakes `tailscale` (CLI) + `tailscaled` (daemon) into `/usr/local/{bin,sbin}` |
| `assets/workspace/.devcontainer/docker-compose.yml` | Ships `/dev/net/tun` + `NET_ADMIN`/`NET_RAW` + `tailscale-state` volume by default |
| `assets/workspace/.devcontainer/scripts/setup-tailscale.sh` | Single `connect` subcommand; idempotent + state-aware + fail-loud on missing TUN |
| `assets/workspace/.devcontainer/scripts/post-start.sh` | Calls `setup-tailscale.sh connect` (no-op when `TAILSCALE_AUTHKEY` is unset) |
| `assets/workspace/.devcontainer/docker-compose.local.yaml` | Commented example showing where the user sets `TAILSCALE_AUTHKEY` |
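
The image bake relies on a download-then-verify step. The verification half can be sketched in isolation as below, assuming GNU coreutils `sha256sum` is available; `verify_sha256` is an illustrative helper name, and the temporary file stands in for the downloaded tarball.

```bash
#!/usr/bin/env bash
# Sketch only: `verify_sha256` is an illustrative helper name.
# Assumes GNU coreutils `sha256sum`, as used in the image build.

# Succeeds only when FILE hashes to EXPECTED (hex digest).
# `sha256sum -c` reads "<digest>  <path>" lines; note the two spaces.
verify_sha256() {
    local file="$1" expected="$2"
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c - >/dev/null 2>&1
}

tmp="$(mktemp)"
printf 'example payload\n' > "$tmp"          # stands in for the tarball
digest="$(sha256sum "$tmp" | awk '{print $1}')"
wrong="$(printf '%064d' 0)"                  # well-formed but incorrect digest

verify_sha256 "$tmp" "$digest" && echo "checksum ok"
verify_sha256 "$tmp" "$wrong" || echo "checksum mismatch detected"
rm -f "$tmp"
```

In the build, a mismatch aborts the `RUN` step because the shell runs under `set -e`.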

## User setup

### 1. Configure Tailscale SSH ACLs

The tailnet's ACL policy must allow SSH access. In the [Tailscale admin console](https://login.tailscale.com/admin/acls/file):

```jsonc
"ssh": [
  {
    "action": "accept",
    "src": ["autogroup:member"],
    "dst": ["autogroup:self"],
    "users": ["root", "autogroup:nonroot"]
  }
]
```

### 2. Generate a Tailscale auth key

Generate at https://login.tailscale.com/admin/settings/keys. **Reusable + Ephemeral** recommended — the container can re-register on recreate without manual key rotation, and stale ephemerals expire automatically from the tailnet.

### 3. Configure the devcontainer

Edit `.devcontainer/docker-compose.local.yaml` (git-ignored, your personal overrides):

```yaml
services:
  devcontainer:
    environment:
      - TAILSCALE_AUTHKEY=tskey-auth-XXXX
      - TAILSCALE_HOSTNAME=myproject-devc-mybox  # optional override
```

### 4. Rebuild

Rebuild (or recreate) the devcontainer. Post-start connects to the tailnet on every start — typically <2 seconds when the state volume is warm.
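
What happens on start depends on two inputs: whether the key is set and whether the TUN device exists. The sketch below condenses those two checks into one function. `preflight` is a hypothetical helper, and the TUN path is a parameter only so the logic can be exercised outside a container.

```bash
#!/usr/bin/env bash
# Sketch only: `preflight` is a hypothetical helper that condenses the two
# checks setup-tailscale.sh makes before starting the daemon.
preflight() {
    local tun="${1:-/dev/net/tun}"
    if [ -z "${TAILSCALE_AUTHKEY:-}" ]; then
        echo "skip"          # no key: Tailscale stays off
    elif [ ! -c "$tun" ]; then
        echo "missing-tun"   # key set but no TUN character device
    else
        echo "ok"            # both prerequisites present
    fi
}

preflight
```

In the shipped script the `missing-tun` case is a hard error rather than a status string, so a misconfigured compose file is noticed immediately.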

### 5. Connect

```bash
ssh root@<tailscale-hostname>
```

For Cursor: "Remote - SSH" → `root@<hostname>`.

## Programmatic auth key generation (devc-remote)

For unattended deploys, the `devc-remote.sh` orchestration script (separate issue) generates ephemeral auth keys via the Tailscale API using OAuth client credentials stored in macOS Keychain. See `docs/designs/devc-remote.md` (when added) for the OAuth client setup.

## What a manual recovery looks like

| Scenario | Action |
|----------|--------|
| Auth key expired between deploys | `compose exec <service> /workspace/.../scripts/setup-tailscale.sh connect` (with refreshed env) |
| `tailscaled` crashed | Same — script detects no daemon and starts one |
| Hostname changed | Update env, re-run script. Old hostname remains on tailnet until ephemeral expires. |
| Want to disconnect | `compose exec <service> tailscale logout` (manual; no `down` subcommand yet — see issue #545+ for `just tailscale-*` recipes) |
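
For the crashed-daemon row, the script restarts `tailscaled` and then waits a bounded time for its control socket before calling `tailscale up`. That wait can be sketched as below. `wait_for_path` is an illustrative name, and it checks `-e` rather than the script's `-S` so it can be tried against an ordinary file.

```bash
#!/usr/bin/env bash
# Sketch only: `wait_for_path` is an illustrative name. The shipped script
# polls for a Unix socket with `[ -S ... ]`; this version checks `-e`.
wait_for_path() {
    local path="$1" tries="${2:-20}" interval="${3:-0.25}"
    local i
    for ((i = 0; i < tries; i++)); do
        [ -e "$path" ] && return 0
        sleep "$interval"
    done
    [ -e "$path" ]   # final check decides the return status
}

marker="$(mktemp -u)"            # a path that does not exist yet
( sleep 0.3; : > "$marker" ) &   # stands in for the daemon creating its socket
if wait_for_path "$marker"; then
    echo "ready"
else
    echo "timed out; check the daemon log" >&2
fi
wait
rm -f "$marker"
```

The defaults (20 tries at 0.25 s) give the same 5-second budget the script uses before pointing at `/var/log/tailscaled.log`.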