fix(net): protocol fixes for Tailscale-compatible peer reachability

A pile of correctness bugs, each of which kept real Tailscale peers from
sending WireGuard packets back to us. Found while building out the e2e
test against the docker-compose stack.

1. WireGuard static key was wrong (lifecycle.rs)
   We were initializing the WgTunnel with `keys.wg_private`, a separate
   x25519 key from the one Tailscale advertises in netmaps. Peers know
   us by `node_public` and compute mac1 against it; signing handshakes
   with a different private key meant every init we sent was silently
   dropped. Use `keys.node_private` instead — node_key IS the WG static
   key in Tailscale.
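   Reduced to a sketch, the fix is just which secret backs the tunnel.
   Struct and field names below are assumed from this commit message, not
   the real types:

   ```rust
   // Hypothetical key material, mirroring the names in the commit message.
   struct Keys {
       node_private: [u8; 32], // secret behind the node_public peers see in netmaps
       wg_private: [u8; 32],   // leftover separate x25519 key; never advertised
   }

   /// The private key boringtun must use as the WireGuard static key. Peers
   /// compute mac1 against our advertised node_public, so signing with any
   /// other key makes every handshake init fail validation.
   fn wg_static_private(keys: &Keys) -> [u8; 32] {
       keys.node_private // not keys.wg_private
   }

   fn main() {
       let keys = Keys { node_private: [1; 32], wg_private: [2; 32] };
       assert_eq!(wg_static_private(&keys), keys.node_private);
       assert_ne!(wg_static_private(&keys), keys.wg_private);
   }
   ```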

2. DERP relay couldn't route packets to us (derp/client.rs)
   Our DerpClient was sealing the ClientInfo frame with a fresh
   ephemeral NaCl keypair and putting the ephemeral public in the frame
   prefix. Tailscale's protocol expects the *long-term* node public key
   in the prefix — that's how the relay knows where to forward packets
   addressed to our node_key. With the ephemeral key, the relay
   accepted the connection but never delivered our peers' responses.
   Now seal with the long-term node key.
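   The frame payload in question, sketched with the sealing step stubbed
   out (layout as this commit understands Tailscale's DERP protocol:
   32-byte key prefix, 24-byte nonce, then the NaCl box):

   ```rust
   // Sketch of assembling the DERP ClientInfo frame payload; the crypto_box
   // sealing is stubbed out. The point is only which key fills the prefix.
   fn client_info_payload(node_public: [u8; 32], nonce: [u8; 24], sealed: &[u8]) -> Vec<u8> {
       let mut buf = Vec::with_capacity(32 + 24 + sealed.len());
       // The relay indexes this connection by the key in the prefix, so it
       // must be the long-term node key. A fresh ephemeral key here let the
       // connection come up but left replies to our node_key undeliverable.
       buf.extend_from_slice(&node_public);
       buf.extend_from_slice(&nonce);
       buf.extend_from_slice(sealed);
       buf
   }

   fn main() {
       let payload = client_info_payload([3; 32], [0; 24], b"sealed-json");
       assert_eq!(&payload[..32], &[3u8; 32][..]);
       assert_eq!(payload.len(), 32 + 24 + 11);
   }
   ```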

3. Headscale never persisted our DiscoKey (proto/types.rs, control/*)
   The streaming /machine/map handler in Headscale ≥ capVer 68 doesn't
   update DiscoKey on the node record — only the "Lite endpoint update"
   path does, gated on Stream:false + OmitPeers:true + ReadOnly:false.
   Without DiscoKey our nodes appeared in `headscale nodes list` with
   `discokey:000…` and never propagated into peer netmaps. Add the
   DiscoKey field to RegisterRequest, add OmitPeers/ReadOnly fields to
   MapRequest, and call a new `lite_update` between register and the
   streaming map. Also add `post_json_no_response` for endpoints that
   reply with an empty body.
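   A sketch of the MapRequest body that hits the Lite path. The field set
   is assumed from this commit message; the real code serializes typed
   structs in proto/types.rs, and the JSON here is hand-rolled only to
   stay dependency-free:

   ```rust
   // Sketch of the MapRequest JSON that selects Headscale's Lite endpoint
   // update handler. Field names are from the commit message.
   fn lite_update_body(hostname: &str, disco_key_hex: &str) -> String {
       // Stream:false + OmitPeers:true + ReadOnly:false is the exact gate;
       // other combinations route to handlers that never persist DiscoKey.
       format!(
           "{{\"Hostinfo\":{{\"Hostname\":\"{hostname}\"}},\
            \"DiscoKey\":\"discokey:{disco_key_hex}\",\
            \"Stream\":false,\"OmitPeers\":true,\"ReadOnly\":false}}"
       )
   }

   fn main() {
       let body = lite_update_body("node-a", "00ab");
       assert!(body.contains("\"OmitPeers\":true"));
       assert!(body.contains("\"ReadOnly\":false"));
       assert!(body.contains("discokey:00ab"));
   }
   ```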

4. EncapAction is now a struct instead of an enum (wg/tunnel.rs)
   Routing was forced to either UDP or DERP. With a peer whose
   advertised UDP endpoint is on an unreachable RFC1918 network (e.g.
   docker bridge IPs), we'd send via UDP, get nothing, and never fall
   back. Send over every available transport — receivers dedupe via
   the WireGuard replay window — and let dispatch_encap forward each
   populated arm to its respective channel.
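   Roughly the shape change, with names taken from this commit message
   rather than the real wg/tunnel.rs:

   ```rust
   use std::net::SocketAddr;

   // Sketch of EncapAction as a struct: each arm is an independent Option,
   // so one encapsulated packet can ride every available transport at once.
   // (The old enum forced exactly one of SendUdp / SendDerp / Nothing.)
   struct EncapAction {
       udp: Option<(SocketAddr, Vec<u8>)>, // direct path; may be an unreachable RFC1918 addr
       derp: Option<([u8; 32], Vec<u8>)>,  // relay path, keyed by the peer's node key
   }

   impl EncapAction {
       /// Fan the same ciphertext out to both transports; the receiver's
       /// WireGuard replay window drops whichever copy arrives second.
       fn to_all(endpoint: SocketAddr, dest_key: [u8; 32], data: Vec<u8>) -> Self {
           EncapAction {
               udp: Some((endpoint, data.clone())),
               derp: Some((dest_key, data)),
           }
       }
   }

   fn main() {
       let ep: SocketAddr = "172.18.0.3:41641".parse().unwrap();
       let action = EncapAction::to_all(ep, [5; 32], vec![0xde, 0xad]);
       assert!(action.udp.is_some() && action.derp.is_some());
   }
   ```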

5. Drop the dead PacketRouter (wg/router.rs)
   Skeleton from an earlier design that never got wired up; it's been
   accumulating dead-code warnings.
2026-04-07 14:33:43 +01:00
parent 85d34bb035
commit dca8c3b643
9 changed files with 169 additions and 246 deletions


@@ -98,6 +98,12 @@ async fn run_session(
     set_status(status, DaemonStatus::Registering);
     let _reg = control.register(&config.auth_key, &config.hostname, keys).await?;
+    // 2a. Send a Lite endpoint update so Headscale persists our DiscoKey on
+    // the node record. The streaming /machine/map handler doesn't
+    // update DiscoKey at capability versions ≥ 68 — only the Lite path
+    // does, and without it our peers can't see us in their netmaps.
+    control.lite_update(keys, &config.hostname, None).await?;
     // 3. Start map stream
     let mut map_stream = control.map_stream(keys, &config.hostname).await?;
@@ -119,8 +125,12 @@ async fn run_session(
     let peer_count = peers.len();
-    // 5. Initialize WireGuard tunnel with our WG private key
-    let mut wg_tunnel = WgTunnel::new(keys.wg_private.clone());
+    // 5. Initialize WireGuard tunnel. Tailscale uses the node_key as the
+    // WireGuard static key — they are the same key, not separate. Peers
+    // only know our node_public from the netmap, so boringtun must be
+    // signing with the matching private key or peers will drop our
+    // handshakes for failing mac1 validation.
+    let mut wg_tunnel = WgTunnel::new(keys.node_private.clone());
     wg_tunnel.update_peers(&peers);
     // 6. Set up NetworkEngine with our VPN IP
@@ -342,6 +352,7 @@ async fn run_wg_loop(
         incoming = derp_in_rx.recv() => {
             match incoming {
                 Some((src_key, data)) => {
+                    tracing::trace!("WG ← DERP ({} bytes)", data.len());
                     let action = tunnel.decapsulate(&src_key, &data);
                     handle_decap(action, src_key, &to_engine, &derp_out_tx).await;
                 }
@@ -351,6 +362,7 @@ async fn run_wg_loop(
         incoming = udp_in_rx.recv() => {
             match incoming {
                 Some((src_addr, data)) => {
+                    tracing::trace!("WG ← UDP {src_addr} ({} bytes)", data.len());
                     let Some(peer_key) = identify_udp_peer(&tunnel, src_addr, &data) else {
                         tracing::trace!("UDP packet from {src_addr}: no peer match");
                         continue;
@@ -371,22 +383,21 @@ async fn run_wg_loop(
     }
 }
-/// Dispatch a WG encap action to the appropriate transport.
+/// Dispatch a WG encap action to whichever transports it carries. We send
+/// over both UDP and DERP when both are populated; the remote peer dedupes
+/// duplicate ciphertexts via the WireGuard replay window.
 async fn dispatch_encap(
     action: crate::wg::tunnel::EncapAction,
     derp_out_tx: &mpsc::Sender<([u8; 32], Vec<u8>)>,
     udp_out_tx: &mpsc::Sender<(std::net::SocketAddr, Vec<u8>)>,
 ) {
-    match action {
-        crate::wg::tunnel::EncapAction::SendUdp { endpoint, data } => {
-            tracing::trace!("WG → UDP {endpoint} ({} bytes)", data.len());
-            let _ = udp_out_tx.send((endpoint, data)).await;
-        }
-        crate::wg::tunnel::EncapAction::SendDerp { dest_key, data } => {
-            tracing::trace!("WG → DERP ({} bytes)", data.len());
-            let _ = derp_out_tx.send((dest_key, data)).await;
-        }
-        crate::wg::tunnel::EncapAction::Nothing => {}
-    }
+    if let Some((endpoint, data)) = action.udp {
+        tracing::trace!("WG → UDP {endpoint} ({} bytes)", data.len());
+        let _ = udp_out_tx.send((endpoint, data)).await;
+    }
+    if let Some((dest_key, data)) = action.derp {
+        tracing::trace!("WG → DERP ({} bytes)", data.len());
+        let _ = derp_out_tx.send((dest_key, data)).await;
+    }
 }
@@ -523,6 +534,7 @@ async fn run_derp_loop(
         incoming = client.recv_packet() => {
             match incoming {
                 Ok((src_key, data)) => {
+                    tracing::trace!("DERP recv ({} bytes)", data.len());
                     if in_tx.send((src_key, data)).await.is_err() {
                         return;
                     }