Tailscale connection fails in both Docker container and new LXC container on Proxmox #668

Closed
opened 2025-12-29 02:21:49 +01:00 by adam · 10 comments

Originally created by @adoolaard on GitHub (Mar 12, 2024).

Bug description

I have successfully installed Headscale in a Docker container running on a Proxmox LXC container. I opened ports 80, 443, and 8080 in the Proxmox firewall, forwarding them to port 8080 on the Headscale container.

I can successfully connect to Headscale using the Tailscale apps on my iPhone and Macbook. However, I am unable to connect from:

- A Tailscale Docker container running on the same LXC container as Headscale.
- A new LXC container where I installed Tailscale with `apt install tailscale` and ran `tailscale up --login-server https://headscale.mydomain.com:443`.

When attempting to connect from these containers, nothing happens for 15 minutes before the command times out. I have tried with and without the `--authkey` option.
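
For reference, `tailscale up` blocks indefinitely by default (the CLI documents the default `--timeout` as 0s, i.e. block forever, as noted later in this thread), so a shorter timeout makes repeated test runs fail fast instead of hanging. A minimal sketch, using the hostname from this report:

```
# fail after 60 seconds instead of blocking forever, to speed up debugging iterations
tailscale up \
  --login-server=https://headscale.mydomain.com:443 \
  --timeout=60s
```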

For the Docker container, I have some logs, but they are not helpful in understanding the issue. I have tried using both the stable version of Headscale and "v0.23.0-alpha5." My iPhone and Macbook connect successfully with both versions, but Linux and Docker connections fail.

Environment

What I have tried:

- Opened the necessary ports in the Proxmox firewall.
- Used both stable and alpha versions of Headscale.
- Tried connecting with and without the `--authkey` option.
- Checked the Docker container logs (limited information).

Docker Compose configuration:

services:
  tailscale:
    container_name: tailscale
    #image: tailscale/tailscale:stable
    image: tailscale/tailscale:v1.58.2
    hostname: headtailscale
    volumes:
      - ./data:/var/lib/tailscale
      - /dev/net/tun:/dev/net/tun
    network_mode: "host"
    cap_add:
      - NET_ADMIN
      - NET_RAW
    environment:
      - TS_STATE_DIR=/var/lib/tailscale
      - TS_EXTRA_ARGS=--login-server=https://headscale.mydomain.nl --advertise-exit-node --advertise-routes=192.168.1.0/24 --accept-dns=true
      - TS_NO_LOGS_NO_SUPPORT=true
      - TS_AUTHKEY=<my_generated_key>
    restart: unless-stopped
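
Because this container runs with `network_mode: "host"`, it shares the Docker host's network stack, so reachability of the login server can be checked both from the host and from inside the container. A hedged sketch (the `/health` endpoint is an assumption about this Headscale version; any URL that returns a valid response over the same hostname would do):

```
# from the Docker host (same network namespace as the tailscale container)
curl -fsS https://headscale.mydomain.nl/health

# from inside the container, to also rule out missing CA certificates in the image
docker exec tailscale wget -qO- https://headscale.mydomain.nl/health
```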

Docker logs:

docker compose up
[+] Running 4/4
 ✔ tailscale 3 layers [⣿⣿⣿]   0B/0B   Pulled                                                                             3.8s 
  ✔ c926b61bad3b Pull complete                                                                                       0.4s 
  ✔ 74bc9945fe25 Pull complete                                                                                       0.4s 
  ✔ 7726f8056532 Pull complete                                                                                       0.8s 
[+] Running 1/1
 ✔ Container tailscale Created                                                                                       7.6s 
Attaching to tailscale
tailscale | boot: 2024/03/12 18:13:34 Starting tailscaled
tailscale | boot: 2024/03/12 18:13:34 Waiting for tailscaled socket
tailscale | 2024/03/12 18:13:34 You have disabled logging. Tailscale will not be able to provide support.
tailscale | 2024/03/12 18:13:34 logtail started
tailscale | 2024/03/12 18:13:34 Program starting: v1.58.2-tb0e1bbb62, Go 1.21.5: []string{"tailscaled", "--socket=/tmp/tailscaled.sock", "--statedir=/var/lib/tailscale", "--tun=userspace-networking"}
tailscale | 2024/03/12 18:13:34 LogID: cc2ab974be4ad126eb5f7d816f99afa6b4c9055812fc865241444fb35aa137fa
tailscale | 2024/03/12 18:13:34 logpolicy: using system state directory "/var/lib/tailscale"
tailscale | 2024/03/12 18:13:34 wgengine.NewUserspaceEngine(tun "userspace-networking") ...
tailscale | 2024/03/12 18:13:34 dns: using dns.noopManager
tailscale | 2024/03/12 18:13:34 link state: interfaces.State{defaultRoute=eth0 ifs={br-249257de4702:[172.20.0.1/16 llu6] br-448c3b3b6366:[172.26.0.1/16 llu6] br-6de1989c1aba:[172.19.0.1/16 llu6] br-da5fa8b46807:[172.18.0.1/16 llu6] br-ffb655d9d88a:[172.21.0.1/16 llu6] docker0:[172.17.0.1/16] eth0:[192.168.1.4/24 llu6] wg0:[10.10.88.1/24]} v4=true v6=false}
tailscale | 2024/03/12 18:13:34 onPortUpdate(port=51777, network=udp6)
tailscale | 2024/03/12 18:13:34 magicsock: [warning] failed to force-set UDP read buffer size to 7340032: operation not permitted; using kernel default values (impacts throughput only)
tailscale | 2024/03/12 18:13:34 magicsock: [warning] failed to force-set UDP write buffer size to 7340032: operation not permitted; using kernel default values (impacts throughput only)
tailscale | 2024/03/12 18:13:34 onPortUpdate(port=46084, network=udp4)
tailscale | 2024/03/12 18:13:34 magicsock: [warning] failed to force-set UDP read buffer size to 7340032: operation not permitted; using kernel default values (impacts throughput only)
tailscale | 2024/03/12 18:13:34 magicsock: [warning] failed to force-set UDP write buffer size to 7340032: operation not permitted; using kernel default values (impacts throughput only)
tailscale | 2024/03/12 18:13:34 magicsock: disco key = d:ff5a60f30ec136bd
tailscale | 2024/03/12 18:13:34 Creating WireGuard device...
tailscale | 2024/03/12 18:13:34 Bringing WireGuard device up...
tailscale | 2024/03/12 18:13:34 Bringing router up...
tailscale | 2024/03/12 18:13:34 Clearing router settings...
tailscale | 2024/03/12 18:13:34 Starting network monitor...
tailscale | 2024/03/12 18:13:34 Engine created.
tailscale | 2024/03/12 18:13:34 pm: migrating "_daemon" profile to new format
tailscale | 2024/03/12 18:13:34 envknob: TS_NO_LOGS_NO_SUPPORT="true"
tailscale | 2024/03/12 18:13:34 logpolicy: using system state directory "/var/lib/tailscale"
tailscale | 2024/03/12 18:13:34 got LocalBackend in 18ms
tailscale | 2024/03/12 18:13:34 Start
tailscale | 2024/03/12 18:13:34 Backend: logs: be:cc2ab974be4ad126eb5f7d816f99afa6b4c9055812fc865241444fb35aa137fa fe:
tailscale | 2024/03/12 18:13:34 Switching ipn state NoState -> NeedsLogin (WantRunning=false, nm=false)
tailscale | 2024/03/12 18:13:34 blockEngineUpdates(true)
tailscale | 2024/03/12 18:13:34 health("overall"): error: state=NeedsLogin, wantRunning=false
tailscale | 2024/03/12 18:13:34 wgengine: Reconfig: configuring userspace WireGuard config (with 0/0 peers)
tailscale | 2024/03/12 18:13:34 wgengine: Reconfig: configuring router
tailscale | 2024/03/12 18:13:34 wgengine: Reconfig: configuring DNS
tailscale | 2024/03/12 18:13:34 dns: Set: {DefaultResolvers:[] Routes:{} SearchDomains:[] Hosts:0}
tailscale | 2024/03/12 18:13:34 dns: Resolvercfg: {Routes:{} Hosts:0 LocalDomains:[]}
tailscale | 2024/03/12 18:13:34 dns: OScfg: {}
tailscale | boot: 2024/03/12 18:13:34 Running 'tailscale up'
tailscale | 2024/03/12 18:13:34 Start
tailscale | 2024/03/12 18:13:34 control: client.Shutdown()
tailscale | 2024/03/12 18:13:34 control: client.Shutdown
tailscale | 2024/03/12 18:13:34 control: authRoutine: exiting
tailscale | 2024/03/12 18:13:34 control: mapRoutine: exiting
tailscale | 2024/03/12 18:13:34 control: updateRoutine: exiting
tailscale | 2024/03/12 18:13:34 control: Client.Shutdown done.
tailscale | 2024/03/12 18:13:34 Backend: logs: be:cc2ab974be4ad126eb5f7d816f99afa6b4c9055812fc865241444fb35aa137fa fe:
tailscale | 2024/03/12 18:13:34 Switching ipn state NoState -> NeedsLogin (WantRunning=true, nm=false)
tailscale | 2024/03/12 18:13:34 blockEngineUpdates(true)
tailscale | 2024/03/12 18:13:34 StartLoginInteractive: url=false
tailscale | 2024/03/12 18:13:34 control: client.Login(false, 2)
tailscale | 2024/03/12 18:13:34 control: LoginInteractive -> regen=true
tailscale | 2024/03/12 18:13:34 control: doLogin(regen=true, hasUrl=false)
tailscale | boot: 2024/03/12 18:14:34 failed to auth tailscale: failed to auth tailscale: tailscale up failed: signal: killed
tailscale exited with code 1
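
The `signal: killed` at the end appears to come from containerboot, which wraps `tailscale up` and kills it when it does not complete in time (see the containerboot link cited later in this thread). Running the login by hand inside the container can surface the underlying error; a hedged sketch, with the socket path taken from the log above and the auth key left as a placeholder:

```
docker exec -it tailscale \
  tailscale --socket=/tmp/tailscaled.sock up \
    --login-server=https://headscale.mydomain.nl \
    --authkey=<my_generated_key> \
    --timeout=120s
```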

I have searched for similar issues in the existing tickets and documentation but could not find a solution. Any help would be greatly appreciated!

adam added the stale and bug labels 2025-12-29 02:21:49 +01:00
adam closed this issue 2025-12-29 02:21:49 +01:00

@adoolaard commented on GitHub (Mar 12, 2024):

Update:

In the meantime, I have also installed Headscale on bare metal (in a Debian VM on Proxmox). I am experiencing the same issue there. I can connect my Mac and iPhone, but not Linux (via the `tailscale up` command or the Tailscale Docker container).


@pax0707 commented on GitHub (Apr 3, 2024):

Did you check this:

https://tailscale.com/kb/1130/lxc-unprivileged
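
For unprivileged Proxmox LXC containers, that article comes down to exposing `/dev/net/tun` to the container. A hedged sketch of the usual Proxmox-side change (the container ID `123` is a placeholder; take the exact lines from the linked KB article):

```
# on the Proxmox host: allow the TUN device and bind-mount it into the container
cat >> /etc/pve/lxc/123.conf <<'EOF'
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
EOF

# restart the container so the new config takes effect
pct stop 123 && pct start 123
```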


@sthomson-wyn commented on GitHub (Apr 4, 2024):

We see this occasionally as well.

Normally restarting the headscale instance a couple of times fixes it.

This only happens after we update the routes of a subnet router, and only subnet routers are affected. Other clients can connect fine. (We are running the subnet routers in docker containers as well)

The `tailscale up` command fails with no output; it just times out: https://github.com/tailscale/tailscale/blob/ac574d875c7bf6ce16e744b47ce94b74622d550b/cmd/containerboot/main.go#L704

We're unable to find any relevant logs in headscale indicating an error. In fact, headscale logs that it authenticates the node correctly.
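
One way to cross-check what Headscale itself thinks about the node and its routes (a hedged sketch; these are standard headscale CLI subcommands and should be run wherever the headscale binary and socket live):

```
# list registered nodes and their last-seen/online state
headscale nodes list

# list advertised vs. enabled routes for the subnet router
headscale routes list
```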

Our tailscale client containers are configured as follows (using the container declaration on GCP GCE):

  - name: test-container
    image: tailscale/tailscale:v1.56.1@sha256:196044d4d339f10bef9bdd639504fb359afbbb6486608f2bc9851aa1f2014e0b
    env:
    - name: TS_EXTRA_ARGS
      value: --login-server https://{headscale} --reset
    - name: TS_ROUTES
      value: {list of routes}
    - name: TS_USERSPACE
      value: 'false'
    - name: TS_STATE_DIR
      value: /var/headscale
    securityContext:
      privileged: true


@sthomson-wyn commented on GitHub (Apr 4, 2024):

![image](https://github.com/juanfont/headscale/assets/108886656/16310d26-8ea4-49c8-b7ea-aacefcc1fed2)

Here are the logs on headscale's side regarding the particular node.


@sthomson-wyn commented on GitHub (Apr 4, 2024):

I wonder if it's an issue of awkward timing, where a machine is declared to be offline while it is trying to authenticate.


@sthomson-wyn commented on GitHub (Apr 4, 2024):

Some info on timing:

- At 2024-04-04 10:14:50.000, headscale reports "Machine successfully authorized"
- At 2024-04-04 10:14:51.000, headscale reports "Machine successfully authorized"
- At 2024-04-04T14:14:51.078128612Z, the subnet router node reports "RegisterReq: got response; nodeKeyExpired=false, machineAuthorized=true; authURL=false"
- At 2024-04-04 10:15:49.845, the subnet router node reports "failed to auth tailscale: failed to auth tailscale: tailscale up failed: signal: killed"
- {subnet router docker container restarts}
- At 2024-04-04 10:15:50.000, headscale reports "Machine successfully authorized"
- At 2024-04-04 10:15:50.454, the subnet router node reports "RegisterReq: got response; nodeKeyExpired=false, machineAuthorized=true; authURL=false"
- At 2024-04-04 10:15:59.000, headscale reports "Machine successfully authorized"
- At 2024-04-04 10:16:50.106, the subnet router node reports "failed to auth tailscale: failed to auth tailscale: tailscale up failed: signal: killed"

This auth + timeout behaviour loops indefinitely until we restart headscale a couple of times.

It is kind of interesting that headscale reports "Machine successfully authorized" twice for each auth attempt.

Between that and the fact that this only happens to us intermittently, it feels like some kind of race condition.


@simonszu commented on GitHub (May 25, 2024):

I have the same problem as @adoolaard. Connecting from my Mac and iOS devices works fine, and a connection attempt from Linux also looks fine on the server side:

2024-05-25T08:58:05+02:00 DBG Registering machine from API/CLI or auth callback expiresAt=<nil> nodeKey=[iYXXZ] registrationMethod=cli userName=simonszu
2024-05-25T08:58:05+02:00 DBG Registering machine machine=naugol machine_key=b5416c5da860668ded90885d6d7a283aec8bf96dcb427f9f70f304f273babc24 node_key=8985d7673375cec652fb5956a3010419a5f6056cf9ac0dee63362a132ecf9204 user=simonszu
2024-05-25T08:58:05+02:00 INF unary dur=21.093901 md={":authority":"/var/run/headscale/headscale.sock","content-type":"application/grpc","user-agent":"grpc-go/1.54.0"} method=RegisterMachine req={"key":"nodekey:8985d7673375cec652fb5956a3010419a5f6056cf9ac0dee63362a132ecf9204","user":"simonszu"} service=headscale.v1.HeadscaleService
2024-05-25T08:58:05+02:00 DBG go/src/headscale/hscontrol/protocol_common.go:665 > Client is registered and we have the current NodeKey. All clear to /map machine=naugol noise=true
2024-05-25T08:58:05+02:00 INF go/src/headscale/hscontrol/protocol_common.go:703 > Machine successfully authorized machine=naugol noise=true
2024-05-25T08:58:05+02:00 DBG A machine is entering polling via the Noise protocol handler=NoisePollNetMap machine=naugol
2024-05-25T08:58:05+02:00 DBG Client map request processed handler=PollNetMap machine=naugol noise=true omitPeers=true readOnly=false stream=false
2024-05-25T08:58:05+02:00 INF Client sent endpoint update and is ok with a response without peer list handler=PollNetMap machine=naugol noise=true
2024-05-25T08:58:05+02:00 DBG A machine is entering polling via the Noise protocol handler=NoisePollNetMap machine=naugol
2024-05-25T08:58:05+02:00 DBG Client map request processed handler=PollNetMap machine=naugol noise=true omitPeers=false readOnly=false stream=true
2024-05-25T08:58:05+02:00 INF Client is ready to access the tailnet handler=PollNetMap machine=naugol noise=true
2024-05-25T08:58:05+02:00 INF Sending initial map handler=PollNetMap machine=naugol noise=true
2024-05-25T08:58:05+02:00 INF Notifying peers handler=PollNetMap machine=naugol noise=true

However, the client side does not seem to get the callback/response, and therefore the login command hangs indefinitely. No idea why, any help would be appreciated.
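
When the server side looks clean like this, it can help to confirm from the hanging Linux client that the control URL it is using is actually reachable end to end (DNS, TLS, reverse proxy). A hedged sketch; the hostname is a placeholder and the `/health` and `/key` endpoints are assumptions about this Headscale version:

```
# run on the Linux node whose `tailscale up` hangs
curl -v https://headscale.example.com/health   # basic DNS/TLS/reverse-proxy check
curl -v https://headscale.example.com/key      # key endpoint contacted by clients during registration
```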


@okfro commented on GitHub (Aug 18, 2024):

I had (what appears to be, at least to me) the same issue. Like the OP, I was able to set up all of my non-Linux nodes without an issue. None of my Linux nodes were able to authenticate against my `https://head.sca.le:443`. When I ran `tailscale up --login-server https://head.sca.le` (with or without `:443`), the Linux nodes would hang. Nothing in the headscale logs. Logs in general didn't seem useful. I added the timeout flag to aid troubleshooting, e.g. `tailscale up --login-server http://head.sca.le --timeout=360s` (the default is [0s, which blocks forever](https://tailscale.com/kb/1080/cli#login)).

What ended up solving the issue for me was to make sure that all my Linux nodes could resolve `head.sca.le` to `192.168.x.x` by adding DNS entries in my router. Then I changed the login call to `tailscale up --login-server http://head.sca.le:8080`. Bingo. This instantly returns the auth URL (as `https://head.sca.le:443`) to complete the chain. I cannot get preauth tokens to work; these still hang, but I can work around that.
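
Expressed as commands, the workaround above looks roughly like this (a hedged sketch: the IP address is a placeholder for the internal Headscale address, and a DNS record on the router, as described above, is preferable to an `/etc/hosts` entry):

```
# make the headscale hostname resolve to its LAN address on the Linux node
# (192.168.1.10 is a placeholder for the real internal address)
echo "192.168.1.10 head.sca.le" | sudo tee -a /etc/hosts

# log in against the plain-HTTP listener; the auth URL it returns points back at :443
sudo tailscale up --login-server=http://head.sca.le:8080
```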

In my environment, I have:

  1. Proxmox host with an LXC container running Headscale. Tailscale is not installed on this host or on any of its containers.
  2. Proxmox host with Tailscale and LXC container running docker with headscale-ui.
  3. Home Assistant OS with Tailscale plugin.
  4. Windows host with Tailscale.
  5. Android host with Tailscale.
  6. iPad host with Tailscale.
  7. Raspbian Lite OS with Tailscale.
  8. Debian virtual machines with Tailscale.
  9. Debian LXCs with Tailscale.
  10. OPNsense router with Tailscale. This runs AdGuard Home on :53, which routes my internal domains to resolve through Unbound > BIND internally. This also runs Caddy, which handles Headscale.

Version notes:

- Proxmox hosts are running Linux kernel 6.8.8-4-pve, PVE v8.2.4
- Headscale LXC from https://tteck.github.io/Proxmox/#headscale-lxc
- Tailscale: tried stable 1.7.0 and unstable 1.7.1 from https://pkgs.tailscale.com/unstable/debian/bookworm
- Headscale: tried both v0.23.0-beta1 and v0.22 stable
- Running all OSes at their latest versions as of this writing (2024-08-18)

> Did you check this:
>
> https://tailscale.com/kb/1130/lxc-unprivileged

I did this, but it did not seem to make a difference. I did not "undo" this fix to test before/after.


@github-actions[bot] commented on GitHub (Dec 27, 2024):

This issue is stale because it has been open for 90 days with no activity.


@github-actions[bot] commented on GitHub (Jan 4, 2025):

This issue was closed because it has been inactive for 14 days since being marked as stale.
